Linux Performance Benchmark Tools Guide

System Architecture Overview

The Linux system architecture can be benchmarked at multiple layers using specialized tools. Here's a breakdown of the architecture and associated benchmark tools:

Hardware Components

CPUs
GPUs/TPUs
DRAM
HBM (High Bandwidth Memory)
FPGAs
I/O Controllers
Network Controllers
SMC (System Management Controller)
Fans

Operating System Components

Device Drivers
Block Device
Network Device
Virtual Memory
File Systems
TCP/UDP
IP Stack
System Call Interface
Scheduler

Application Layer

Applications
System Libraries
VFS (Virtual File System)
Sockets

Benchmark Tools by Layer

System Benchmarks

sysbench

# CPU benchmark
sysbench cpu run

# Memory benchmark
sysbench memory run

# File I/O benchmark
sysbench fileio --file-test-mode=seqwr run

# MySQL benchmark
sysbench mysql-oltp-read-write run

# Common options
--threads=4              # Number of threads to use
--time=60               # Test duration in seconds
--events=1000           # Number of events to process

UnixBench

# Run complete benchmark suite
./Run

# Run specific tests
./Run dhry2reg
./Run whetstone
./Run execl
./Run file1
./Run shell1

# Multi-CPU tests
./Run -c 4              # Test with 4 copies

Application Level Benchmarks

ab (Apache Benchmark)

# Basic HTTP GET benchmark
ab -n 1000 -c 10 http://localhost/

# POST request benchmark
ab -n 1000 -c 10 -p post.data http://localhost/form

# Common options
-k                      # Use HTTP KeepAlive
-H "Header: value"      # Add custom header
-C "cookie=value"       # Add cookie
-T "content/type"       # Set content-type

wrk (HTTP Benchmarking Tool)

# Basic HTTP benchmark
wrk -t12 -c400 -d30s http://localhost

# With custom script
wrk -t2 -c100 -d30s -s script.lua http://localhost

# Common options
-t12                    # 12 threads
-c400                   # 400 connections
-d30s                   # 30 second duration
--timeout 30s           # 30 second timeout

Hardware Benchmarks

MLPerf (Machine Learning Performance)

# Run training benchmark
mlperf-train --config=resnet50 --framework=tensorflow

# Run inference benchmark
mlperf-inference --scenario=Server --model=resnet50

# Common options
--backend=gpu           # Use GPU backend
--precision=fp16        # Use FP16 precision
--dataset=imagenet      # Specify dataset

perf bench

# Scheduler benchmarks
perf bench sched pipe
perf bench sched messaging

# Memory benchmarks
perf bench mem memcpy
perf bench mem memset

# Futex benchmarks
perf bench futex wake
perf bench futex wake-parallel

Storage Benchmarks

fio (Flexible I/O Tester)

# Sequential read test
fio --name=seq-read --rw=read --size=4G

# Random write test
fio --name=rand-write --rw=randwrite --bs=4k --size=1G

# Mixed workload
fio --name=mixed --rw=randrw --bs=4k --size=1G

# Common options
--iodepth=16           # I/O queue depth
--direct=1             # Use O_DIRECT
--numjobs=4            # Number of parallel jobs
--runtime=60           # Test duration in seconds

dd (Disk Performance)

# Write speed test
dd if=/dev/zero of=test bs=1G count=1 oflag=direct

# Read speed test
dd if=test of=/dev/null bs=1G count=1 iflag=direct

# Common options
bs=1M                  # Block size
count=1000             # Number of blocks
conv=fdatasync         # Sync data before finishing

Network Benchmarks

iperf/iperf3

# Server mode
iperf3 -s

# Client mode TCP test
iperf3 -c server_ip

# UDP bandwidth test
iperf3 -c server_ip -u -b 100M

# Common options
-P 4                   # Number of parallel streams
-i 1                   # Report interval
-t 30                  # Test duration
-w 256K                # Window size

ttcp (Test TCP)

# Receiver mode
ttcp -r -s

# Transmitter mode
ttcp -t -s server_ip

# Common options
-l 8192                # Buffer size
-n 2048                # Number of buffers
-p 5001                # Port number

System Call Benchmarks

lmbench

# Run complete benchmark suite
lmbench-run

# Memory latency benchmark
lat_mem_rd 1024 512

# Context switch benchmark
lat_ctx -s 64K 2 4 8 16

# Common options
-P 1                   # Number of processes
-W 3                   # Warmup time
-N 5                   # Number of runs

System Tools

jmeter (Application Performance Testing)

# Run test plan
jmeter -n -t test.jmx -l results.jtl

# Generate HTML report
jmeter -g results.jtl -o report_directory

# Common options
-Jthreads=10           # Number of threads
-Jduration=300         # Test duration
-Jrampup=60            # Ramp-up period

Network Diagnostic Tools

hping3

# TCP SYN flood test
hping3 -S -p 80 --flood target_ip

# ICMP ping test
hping3 -1 target_ip

# Common options
--fast                 # Send packets fast
--rand-source         # Random source address
-i u100               # Wait 100 microseconds between packets

pchar (Network Performance)

# Measure path characteristics
pchar network_path

# Common options
-i interface           # Specify interface
-q                     # Quiet mode
-v                     # Verbose output

Hardware Parameter Tools

hdparm (Hard Drive Parameters)

# Disk read timing
hdparm -t /dev/sda

# Cache read timing
hdparm -T /dev/sda

# Common options
-i                     # Show device information
-I                     # Detailed device information
--direct               # Use O_DIRECT

Best Practices for Benchmarking

Preparation
Clean system state
Consistent environment
Minimal background processes
Representative workloads
Execution
Multiple iterations
Varying parameters
Record all conditions
Monitor system state
Analysis
Statistical analysis
Outlier detection
Performance regression
Bottleneck identification
Documentation
Hardware configuration
Software versions
Test parameters
Environmental factors

Common Benchmarking Scenarios

Web Server Performance

# Using ab
ab -n 10000 -c 100 http://localhost/
# Using wrk
wrk -t4 -c100 -d30s http://localhost/

Database Performance

# Using sysbench
sysbench oltp_read_write --table-size=1000000 prepare
sysbench oltp_read_write --table-size=1000000 run

Storage Performance

# Using fio
fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --size=4g

Network Performance

# Using iperf3
iperf3 -c server_ip -t 60 -P 4

Performance Metrics to Monitor

CPU Metrics
Utilization
Load average
Context switches
Cache hits/misses
Memory Metrics
Usage
Page faults
Swap usage
Cache efficiency
Disk Metrics
IOPS
Throughput
Latency
Queue depth
Network Metrics
Bandwidth
Latency
Packet loss
Connection states

Additional Resources

Documentation
Man pages for tools
Vendor documentation
Performance tuning guides
Benchmark specifications
Monitoring Tools
Grafana
Prometheus
Telegraf
Ganglia
Analysis Tools
R Statistical Software
Python with NumPy/Pandas
Jupyter Notebooks
Excel/LibreOffice Calc
Community Resources
Linux Performance website
Performance mailing lists
Tool-specific forums
Academic papers

📊 Analytics

View Dashboard →

📈 This page is being tracked with GoatCounter. Click here to view detailed analytics including page views, referrers, and visitor statistics.

Show Public Statistics

Live view counter for this page