Your server's load average is elevated. You run top and the CPU looks mostly idle. Processes are piling up in the b (blocked) state in vmstat. Something is waiting — and that something is almost certainly disk I/O. This guide walks you through confirming it, finding the saturated device, and pinpointing the exact process and file responsible.
Step 1: Confirm it's I/O, not CPU
Run vmstat first. It gives you the fastest system-wide snapshot and immediately tells you whether you're looking at a CPU problem or an I/O problem.
vmstat -w 1 5
Two columns tell the story:
- `b` — processes blocked waiting on I/O. Anything above 2–3 sustained is a signal worth following.
- `wa` — the percentage of CPU time spent waiting for I/O. Above 10–20% warrants investigation; above 40% is a confirmed problem.
If b is elevated and us/sy (user and system CPU) are low, you have an I/O bottleneck, not a CPU one. That's your fork in the road.
One caveat about `wa`: it only registers when a CPU has nothing else to run while waiting for I/O. On a busy server, CPUs may stay occupied with other work even during I/O pressure, so `wa` can look deceptively low. Don't rule out disk problems just because `wa` is small; always cross-check with iostat.
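If the sysstat package is installed, mpstat is a quick way to break iowait out per CPU, which helps when a single I/O-bound task pinned to one core gets averaged away in the system-wide numbers. A minimal check:

```
# Per-CPU breakdown of %iowait (part of sysstat), 1-second samples, 3 iterations
mpstat -P ALL 1 3
```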
Step 2: Find the saturated device with iostat
iostat is the right tool for identifying which physical device is under pressure. Run it with extended stats and a short interval:
iostat -xz 1 5
The columns that matter most:
| Column | What it means | Warning threshold |
|---|---|---|
| `r/s`, `w/s` | Read and write operations per second | Context-dependent on device type |
| `rkB/s`, `wkB/s` | Read/write throughput in KB/s | Near the device's rated maximum |
| `await` | Average I/O response time in milliseconds | > 20ms for HDDs, > 5ms for SSDs |
| `%util` | Percentage of time the device was busy | > 90% = saturated |
A device showing %util near 100% with high await is your bottleneck. Note the device name (e.g., sda, nvme0n1) — you'll need it in the next steps. The -z flag suppresses idle devices, so everything you see in the output is actively doing work.
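To see which filesystems actually live on the saturated device, lsblk maps it to its partitions and mount points (sda here is just the example device from above):

```
# Partitions, filesystems, and mount points on the suspect device
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT /dev/sda
```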
Step 3: Find which process is causing it
Now that you know the device, you need to find the process responsible. iotop is the most direct way:
# Show only processes actively doing I/O, non-interactive, 3 snapshots
iotop -o -b -n 3
The -o flag filters to processes currently doing I/O — without it, you're scrolling through hundreds of idle entries. You'll immediately see which PID, user, and process name is generating the most reads or writes.
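For a longer observation window, batch mode can be logged with timestamps; as a sketch (the log path is arbitrary), `-t` adds a timestamp to each line and repeating `-q` trims the headers that batch mode reprints every interval:

```
# Log timestamped per-process I/O for 60 one-second snapshots
iotop -o -b -t -qqq -n 60 >> /tmp/iotop.log
```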
If iotop isn't installed:
# Debian / Ubuntu
apt install iotop
# RHEL / CentOS / Fedora (dnf replaces yum on newer releases)
yum install iotop
An alternative that's usually pre-installed is pidstat, part of the sysstat package:
# Per-process disk read/write rates, 1-second intervals
pidstat -d 1 5
pidstat shows the same information in a slightly different format and is useful when you need output you can redirect to a file or parse in a script.
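As a sketch of that scripted use (assuming pidstat's default column layout, where kB_wr/s is the fifth field on the Average lines), you can capture a longer run and then rank the heaviest writers:

```
# Capture 60 one-second samples, then list the top writers from the averages
pidstat -d 1 60 > /tmp/pidstat.log
grep '^Average' /tmp/pidstat.log | sort -k5 -rn | head -10
```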
Step 4: Find which files that process is touching
Once you have the PID, lsof shows you exactly what it has open:
# All open files for a process
lsof -p <PID>
# Filter to regular files only (cuts out sockets, pipes, devices)
lsof -p <PID> | grep REG
The SIZE/OFF column tells you how large each file is. Look for large log files, database data directories, or temp files in /tmp or /var that are being written to continuously. That's usually where your I/O is going.
# Alternatively: watch in real time what files a process accesses
strace -p <PID> -e trace=read,write,open,openat 2>&1 | head -50
strace is slower due to syscall tracing overhead, but it shows you the actual operations as they happen — useful when lsof shows too many open files to make sense of.
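A lighter-weight alternative is the per-process I/O accounting the kernel exposes under /proc. The read_bytes and write_bytes counters reflect actual block-device I/O (unlike rchar/wchar, which count all read/write syscalls), so sampling the file twice shows how much really hit the disk in between:

```
# Cumulative I/O counters for the process; compare two samples to get a rate
cat /proc/<PID>/io; sleep 5; cat /proc/<PID>/io
```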
Step 5: Check for deleted files still held open
A common and easy-to-miss cause of unexpected I/O and disk exhaustion: a process has a file open that was deleted from the filesystem. The space isn't freed until the process closes it, but writes keep going to the invisible file. This often happens with log files that get rotated but not properly signaled to the writing process.
lsof | grep deleted
Pay attention to the SIZE/OFF column. A 10GB deleted file still being written to by nginx or java is a common surprise in full-disk incidents. The clean fix is restarting the responsible service, which forces it to close the file handle and release the space.
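If an immediate restart isn't an option, a commonly used workaround is to truncate the deleted file through the process's file descriptor in /proc, which releases the space without stopping the service. The descriptor number comes from the FD column of the lsof output:

```
# Truncate a deleted-but-still-open file via its file descriptor
: > /proc/<PID>/fd/<FD>
```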
Step 6: Check the I/O scheduler
On Linux, the I/O scheduler controls how read/write requests are queued and dispatched per block device. The default varies by kernel version and device type, but it's worth verifying — especially if you've recently migrated to NVMe or SSD storage where the old HDD-optimized schedulers hurt performance.
# Check current scheduler for a device
cat /sys/block/sda/queue/scheduler
# Example output (active scheduler is in brackets):
# [mq-deadline] none kyber bfq
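To check every block device at once, a short loop over sysfs works:

```
# List the active scheduler for every block device
for f in /sys/block/*/queue/scheduler; do
    echo "$f: $(cat "$f")"
done
```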
General guidance:
- NVMe / SSD: use `none` or `mq-deadline`. These devices have their own internal queuing and don't benefit from the kernel reordering requests.
- HDD (spinning disk): use `mq-deadline` or `bfq`. Reordering requests to minimize seek time still matters here.
To change it temporarily (the change lasts only until the next reboot):
echo mq-deadline > /sys/block/sda/queue/scheduler
For a permanent change, add a udev rule:
# Create /etc/udev/rules.d/60-ioschedulers.rules
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"
ACTION=="add|change", KERNEL=="nvme[0-9]*", ATTR{queue/scheduler}="none"
Step 7: Rule out hardware failure
High await isn't always a software problem. A failing drive will show the same symptoms as a saturated one — but tuning won't help. Check the kernel log for hardware errors before spending time on configuration:
# Disk errors, timeouts, resets in the kernel ring buffer
dmesg -T | grep -iE 'error|failed|timeout|reset|ata' | tail -30
# SMART health check (requires smartmontools)
smartctl -a /dev/sda | grep -E 'Reallocated|Pending|Uncorrectable|Temperature'
# Raw per-device I/O statistics from the kernel (requests completed, sectors, time spent in I/O)
cat /sys/block/sda/stat
If dmesg is full of ATA errors or timeout messages, you're looking at hardware — not a performance tuning problem. Get the disk replaced before data loss occurs.
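If the SMART attributes look suspicious, a short self-test gives the drive's own verdict. This also requires smartmontools; the test runs in the background on the drive and usually finishes within a couple of minutes:

```
# Kick off a short SMART self-test, then read the results once it completes
smartctl -t short /dev/sda
smartctl -l selftest /dev/sda
```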
Quick reference: the full sequence
# 1. Is it I/O? Check blocked processes and iowait
vmstat -w 1 5
# 2. Which device is saturated?
iostat -xz 1 5
# 3. Which process is responsible?
iotop -o -b -n 3
# or without iotop:
pidstat -d 1 5
# 4. What files is it touching?
lsof -p <PID> | grep REG
# 5. Any deleted files still held open (ghost writes)?
lsof | grep deleted
# 6. Check I/O scheduler
cat /sys/block/sda/queue/scheduler
# 7. Rule out hardware failure
dmesg -T | grep -iE 'error|failed|timeout|reset|ata' | tail -30
If iotop shows activity but no single process dominates, check for NFS mounts (`mount | grep nfs`); remote filesystems can time out silently, and all that waiting looks like local I/O. Also check /proc/sys/vm/dirty_ratio and dirty_background_ratio; aggressive write-back flushing can cause periodic I/O spikes that look like a sustained bottleneck.
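A few quick checks for those two cases (nfsiostat ships with nfs-utils on most distributions, so treat that last command as optional):

```
# Any NFS mounts that could be hiding remote latency?
mount | grep nfs

# Current write-back tuning and how much dirty data is queued for flushing
cat /proc/sys/vm/dirty_ratio /proc/sys/vm/dirty_background_ratio
grep -E 'Dirty|Writeback' /proc/meminfo

# Per-mount NFS latency, if nfs-utils is installed
nfsiostat 1 3
```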