Linux Performance Benchmark Tools Guide

Table of Contents

  1. System Architecture Overview
  2. Benchmark Tools by Layer
  3. Hardware Layer Benchmarks
  4. Operating System Layer Benchmarks
  5. Application Layer Benchmarks
  6. Network Performance Tools
  7. Detailed Tool Reference

System Architecture Overview

The Linux system architecture can be benchmarked at multiple layers using specialized tools. Here's a breakdown of the architecture and associated benchmark tools:

Hardware Components

  • CPUs
  • GPUs/TPUs
  • DRAM
  • HBM (High Bandwidth Memory)
  • FPGAs
  • I/O Controllers
  • Network Controllers
  • SMC (System Management Controller)
  • Fans

Operating System Components

  1. Device Drivers
  2. Block Device
  3. Network Device
  4. Virtual Memory
  5. File Systems
  6. TCP/UDP
  7. IP Stack
  8. System Call Interface
  9. Scheduler

Application Layer

  • Applications
  • System Libraries
  • VFS (Virtual File System)
  • Sockets

Benchmark Tools by Layer

System Benchmarks

sysbench

# CPU benchmark
sysbench cpu run

# Memory benchmark
sysbench memory run

# File I/O benchmark
sysbench fileio --file-test-mode=seqwr run

# MySQL OLTP benchmark (create the test tables with `prepare` first)
sysbench oltp_read_write prepare
sysbench oltp_read_write run

# Common options
--threads=4              # Number of threads to use
--time=60               # Test duration in seconds
--events=1000           # Number of events to process

UnixBench

# Run complete benchmark suite
./Run

# Run specific tests
./Run dhry2reg
./Run whetstone
./Run execl
./Run file1
./Run shell1

# Multi-CPU tests
./Run -c 4              # Test with 4 copies

Application Level Benchmarks

ab (Apache Benchmark)

# Basic HTTP GET benchmark
ab -n 1000 -c 10 http://localhost/

# POST request benchmark
ab -n 1000 -c 10 -p post.data http://localhost/form

# Common options
-k                      # Use HTTP KeepAlive
-H "Header: value"      # Add custom header
-C "cookie=value"       # Add cookie
-T "content/type"       # Set content-type

wrk (HTTP Benchmarking Tool)

# Basic HTTP benchmark
wrk -t12 -c400 -d30s http://localhost

# With custom script
wrk -t2 -c100 -d30s -s script.lua http://localhost

# Common options
-t12                    # 12 threads
-c400                   # 400 connections
-d30s                   # 30 second duration
--timeout 30s           # 30 second timeout

Hardware Benchmarks

MLPerf (Machine Learning Performance)

MLPerf has no single standard binary; each benchmark is distributed as a reference implementation with its own launcher. Treat the commands below as illustrative of the typical parameters rather than exact invocations.

# Run training benchmark (illustrative)
mlperf-train --config=resnet50 --framework=tensorflow

# Run inference benchmark (illustrative)
mlperf-inference --scenario=Server --model=resnet50

# Common options
--backend=gpu           # Use GPU backend
--precision=fp16        # Use FP16 precision
--dataset=imagenet      # Specify dataset

perf bench

# Scheduler benchmarks
perf bench sched pipe
perf bench sched messaging

# Memory benchmarks
perf bench mem memcpy
perf bench mem memset

# Futex benchmarks
perf bench futex wake
perf bench futex wake-parallel

Storage Benchmarks

fio (Flexible I/O Tester)

# Sequential read test
fio --name=seq-read --rw=read --size=4G

# Random write test
fio --name=rand-write --rw=randwrite --bs=4k --size=1G

# Mixed workload
fio --name=mixed --rw=randrw --bs=4k --size=1G

# Common options
--iodepth=16           # I/O queue depth
--direct=1             # Use O_DIRECT
--numjobs=4            # Number of parallel jobs
--runtime=60           # Test duration in seconds
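
fio can also read these parameters from a job file, which is easier to version and reuse than a long command line. A minimal sketch combining the options above (the section and file names are illustrative, not from this guide):

```ini
; mixed.fio -- hypothetical job file for the mixed random read/write workload
[global]
bs=4k
iodepth=16
direct=1
runtime=60
time_based

[mixed]
rw=randrw
size=1G
numjobs=4
```

Run it with `fio mixed.fio`; settings in a named section override those in [global].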

dd (Disk Performance)

# Write speed test
dd if=/dev/zero of=test bs=1G count=1 oflag=direct

# Read speed test
dd if=test of=/dev/null bs=1G count=1 iflag=direct

# Common options
bs=1M                  # Block size
count=1000             # Number of blocks
conv=fdatasync         # Sync data before finishing
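
Without O_DIRECT, the page cache can make the read test report cache speed rather than device speed. Dropping caches between the write and the read (root required) gives a truer figure. A minimal sketch, using a small illustrative file name and size:

```shell
#!/bin/sh
# Measure write then uncached read throughput of a small test file.
# File name and size are illustrative, not from the guide.
f=/tmp/dd-bench.bin

# Write test: conv=fdatasync forces data to the device before dd reports
dd if=/dev/zero of="$f" bs=1M count=4 conv=fdatasync

# Drop the page cache so the read hits the device
# (root only; silently skipped otherwise)
if [ "$(id -u)" -eq 0 ]; then
    sync
    echo 3 > /proc/sys/vm/drop_caches
fi

# Read test
dd if="$f" of=/dev/null bs=1M

rm -f "$f"
```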

Network Benchmarks

iperf/iperf3

# Server mode
iperf3 -s

# Client mode TCP test
iperf3 -c server_ip

# UDP bandwidth test
iperf3 -c server_ip -u -b 100M

# Common options
-P 4                   # Number of parallel streams
-i 1                   # Report interval
-t 30                  # Test duration
-w 256K                # Window size

ttcp (Test TCP)

# Receiver mode
ttcp -r -s

# Transmitter mode
ttcp -t -s server_ip

# Common options
-l 8192                # Buffer size
-n 2048                # Number of buffers
-p 5001                # Port number

System Call Benchmarks

lmbench

# Run complete benchmark suite
lmbench-run

# Memory latency benchmark
lat_mem_rd 1024 512

# Context switch benchmark
lat_ctx -s 64K 2 4 8 16

# Common options
-P 1                   # Number of processes
-W 3                   # Warmup time
-N 5                   # Number of runs

System Tools

jmeter (Application Performance Testing)

# Run test plan
jmeter -n -t test.jmx -l results.jtl

# Generate HTML report
jmeter -g results.jtl -o report_directory

# Common options
-Jthreads=10           # Property for number of threads
-Jduration=300         # Property for test duration
-Jrampup=60            # Property for ramp-up period
# Note: each -J flag sets a JMeter property; the test plan must
# reference it, e.g. via ${__P(threads)}

Network Diagnostic Tools

hping3

# TCP SYN flood test (run only against hosts you are authorized to test)
hping3 -S -p 80 --flood target_ip

# ICMP ping test
hping3 -1 target_ip

# Common options
--fast                 # Send packets fast
--rand-source          # Random source address
-i u100                # Wait 100 microseconds between packets

pchar (Network Performance)

# Measure bandwidth and latency of each hop along the path to a host
pchar target_host

# Common options
-i interface           # Specify interface
-q                     # Quiet mode
-v                     # Verbose output

Hardware Parameter Tools

hdparm (Hard Drive Parameters)

# Disk read timing
hdparm -t /dev/sda

# Cache read timing
hdparm -T /dev/sda

# Common options
-i                     # Show device information
-I                     # Detailed device information
--direct               # Use O_DIRECT

Best Practices for Benchmarking

  1. Preparation
      • Clean system state
      • Consistent environment
      • Minimal background processes
      • Representative workloads

  2. Execution
      • Multiple iterations
      • Varying parameters
      • Record all conditions
      • Monitor system state

  3. Analysis
      • Statistical analysis
      • Outlier detection
      • Performance regression
      • Bottleneck identification

  4. Documentation
      • Hardware configuration
      • Software versions
      • Test parameters
      • Environmental factors
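
The execution points above (multiple iterations, recording conditions) are easy to automate. A minimal sketch of a hypothetical wrapper, run_bench, that repeats any benchmark command and timestamps each run:

```shell
#!/bin/sh
# run_bench N CMD [ARGS...] -- run CMD N times, prefixing each run
# with an iteration number and UTC timestamp so results can be
# matched to recorded system conditions later.
run_bench() {
    n=$1; shift
    i=1
    while [ "$i" -le "$n" ]; do
        echo "## run $i of $n at $(date -u +%Y-%m-%dT%H:%M:%SZ)"
        "$@"                      # the benchmark command itself
        i=$((i + 1))
    done
}

# Example: three iterations of a placeholder command
run_bench 3 echo "benchmark output here"
```

In practice the placeholder would be replaced with any of the tools in this guide, e.g. `run_bench 5 sysbench cpu run`.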

Common Benchmarking Scenarios

  1. Web Server Performance

# Using ab
ab -n 10000 -c 100 http://localhost/
# Using wrk
wrk -t4 -c100 -d30s http://localhost/

  2. Database Performance

# Using sysbench
sysbench oltp_read_write --table-size=1000000 prepare
sysbench oltp_read_write --table-size=1000000 run

  3. Storage Performance

# Using fio
fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --size=4g

  4. Network Performance

# Using iperf3
iperf3 -c server_ip -t 60 -P 4

Performance Metrics to Monitor

  1. CPU Metrics
      • Utilization
      • Load average
      • Context switches
      • Cache hits/misses

  2. Memory Metrics
      • Usage
      • Page faults
      • Swap usage
      • Cache efficiency

  3. Disk Metrics
      • IOPS
      • Throughput
      • Latency
      • Queue depth

  4. Network Metrics
      • Bandwidth
      • Latency
      • Packet loss
      • Connection states
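
Several of these metrics can be sampled directly from /proc with no extra tooling. A minimal sketch of a one-shot snapshot (field positions per the proc(5) file formats):

```shell
#!/bin/sh
# One-shot snapshot of a few key metrics from /proc.

# CPU: 1/5/15-minute load averages (first three fields of /proc/loadavg)
read l1 l5 l15 _ < /proc/loadavg
echo "load: $l1 $l5 $l15"

# Memory: total and available, in kB (from /proc/meminfo)
awk '/^MemTotal:|^MemAvailable:/ {print tolower($1), $2, $3}' /proc/meminfo

# Context switches since boot (ctxt line of /proc/stat)
awk '/^ctxt/ {print "context-switches:", $2}' /proc/stat
```

Run in a loop (e.g. under `watch` or with `sleep`) during a benchmark to correlate these numbers with the tool's own results.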

Additional Resources

  1. Documentation
      • Man pages for tools
      • Vendor documentation
      • Performance tuning guides
      • Benchmark specifications

  2. Monitoring Tools
      • Grafana
      • Prometheus
      • Telegraf
      • Ganglia

  3. Analysis Tools
      • R Statistical Software
      • Python with NumPy/Pandas
      • Jupyter Notebooks
      • Excel/LibreOffice Calc

  4. Community Resources
      • Linux Performance website
      • Performance mailing lists
      • Tool-specific forums
      • Academic papers