Disk Benchmarking using FIO

Vineet Kumar
4 min read · Jul 28, 2022


To benchmark persistent disk performance, use FIO instead of other disk benchmarking tools such as dd. By default, dd uses a very low I/O queue depth, so it is difficult to ensure that the benchmark is generating a sufficient number of I/Os and bytes to accurately test disk performance.
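For contrast, this is roughly what a naive dd-based test looks like (the file path and sizes are arbitrary examples). With a single process and one block written at a time, dd keeps only one I/O in flight, which is exactly the low queue depth problem described above:

```shell
# A naive dd write test: a single synchronous stream, so the
# effective queue depth is 1. oflag=direct bypasses the page
# cache so we measure the disk rather than RAM.
dd if=/dev/zero of=/tmp/dd_testfile bs=1M count=1024 oflag=direct
rm /tmp/dd_testfile
```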

To measure IOPS and throughput of a disk in use on a running instance, benchmark the file system with its intended configuration. Use this option to test a realistic workload without losing the contents of your existing disk. Note that when you benchmark the file system on an existing disk, there are many factors specific to your development environment that may affect benchmarking results, and you may not reach the disk performance limits.

The following parameters are used in the fio command:

--name= is a required argument, but it’s basically human-friendly fluff — fio will create files based on that name to test with, inside the working directory you’re currently in.

--ioengine=posixaio sets the mode fio uses to interact with the filesystem. POSIX is a standard that Windows, macOS, Linux, and BSD all understand, so it's great for portability, although inside fio itself, Windows users need to invoke --ioengine=windowsaio, not --ioengine=posixaio, unfortunately. AIO stands for Asynchronous Input Output and means that we can queue up multiple operations to be completed in whatever order the OS decides to complete them. (In this particular example, later arguments effectively nullify this.)

--rw=randwrite means exactly what it looks like it means: we're going to do random write operations to our test files in the current working directory. Other options include read (sequential reads), write (sequential writes), randread, and randrw, all of which should hopefully be fairly self-explanatory.

--bs=4k sets the block size to 4K. These are very small individual operations. This is where the pain lives; it's hard on the disk, and it also means a ton of extra overhead in the SATA, USB, SAS, SMB, or whatever other command channel lies between us and the disks, since a separate operation has to be commanded for each 4K of data.

--size=4g our test file(s) will be 4GB in size apiece. (We’re only creating one, see next argument.)

--numjobs=1 we're only creating a single file, and running a single process commanding operations within that file. If we wanted to simulate multiple parallel processes, we'd do, e.g., --numjobs=16, which would create 16 separate test files of --size size, and 16 separate processes operating on them at the same time. (A common rule of thumb is to match this value to the number of CPUs; e.g., with 5 CPUs, use --numjobs=5.)

--iodepth=1 this is how deep we’re willing to try to stack commands in the OS’s queue. Since we set this to 1, this is effectively pretty much the same thing as the sync IO engine — we’re only asking for a single operation at a time, and the OS has to acknowledge receipt of every operation we ask for before we can ask for another. (It does not have to satisfy the request itself before we ask it to do more operations, it just has to acknowledge that we actually asked for it.)

--runtime=60 --time_based Run for sixty seconds, and even if we complete sooner, just start over again and keep going until 60 seconds is up.

--end_fsync=1 After all operations have been queued, keep the timer going until the OS reports that the very last one of them has been successfully completed, i.e., actually written to disk.

--output= saves fio's output to the named file instead of printing it to the terminal.
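Put together, the flags described above form a complete command along these lines (the job name and output filename here are arbitrary examples):

```shell
# Random 4K writes to one 4 GiB test file in the current directory,
# one I/O in flight at a time, looping for 60 seconds, with a final
# fsync included in the timing. Results are saved to random-write.log.
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k \
    --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --end_fsync=1 --output=random-write.log
```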

Benchmarking IOPS and throughput of a disk on a running instance

If you want to measure IOPS and throughput for a realistic workload on an active disk on a running instance without losing the contents of your disk, benchmark against a new directory on the existing file system.

  1. Connect to your instance.
  2. Install dependencies:
    sudo apt update
    sudo apt install -y fio
  3. In the terminal, list the disks that are attached to your VM and find the disk that you want to test. If your persistent disk is not yet formatted, format and mount the disk.
sudo lsblk

sda 8:0 0 10G 0 disk
└─sda1 8:1 0 10G 0 part /
sdb 8:32 0 2.5T 0 disk /mnt/disks/mnt_dir

In this example, we test a 2,500 GB SSD persistent disk with device ID sdb.
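If the disk in the listing had no filesystem yet, you would format and mount it first. A typical sequence looks like this, assuming the same device and mount point as above (double-check the device name with lsblk, since formatting destroys any existing data):

```shell
# DESTRUCTIVE: creates a new ext4 filesystem on /dev/sdb.
sudo mkfs.ext4 -m 0 /dev/sdb
# Create the mount point and mount the new filesystem.
sudo mkdir -p /mnt/disks/mnt_dir
sudo mount -o discard,defaults /dev/sdb /mnt/disks/mnt_dir
```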

Create a new directory, fiotest, on the disk and point a variable at it. In this example, the disk is mounted at /mnt/disks/mnt_dir:

TEST_DIR=/mnt/disks/mnt_dir/fiotest
sudo mkdir -p $TEST_DIR

4. Test write throughput by performing sequential writes with multiple parallel streams (8+), using an I/O block size of 1 MB and an I/O depth of at least 64:

sudo fio --name=write_throughput --directory=$TEST_DIR --numjobs=8 \
--size=10G --time_based --runtime=60s --ramp_time=2s --ioengine=libaio \
--direct=1 --verify=0 --bs=1M --iodepth=64 --rw=write --group_reporting=1

5. Test write IOPS by performing random writes, using an I/O block size of 4 KB and an I/O depth of at least 64:

sudo fio --name=write_iops --directory=$TEST_DIR --size=10G \
--time_based --runtime=60s --ramp_time=2s --ioengine=libaio --direct=1 \
--verify=0 --bs=4K --iodepth=64 --rw=randwrite --group_reporting=1

6. Test read throughput by performing sequential reads with multiple parallel streams (8+), using an I/O block size of 1 MB and an I/O depth of at least 64:

sudo fio --name=read_throughput --directory=$TEST_DIR --numjobs=8 \
--size=10G --time_based --runtime=60s --ramp_time=2s --ioengine=libaio \
--direct=1 --verify=0 --bs=1M --iodepth=64 --rw=read --group_reporting=1

7. Test read IOPS by performing random reads, using an I/O block size of 4 KB and an I/O depth of at least 64:

sudo fio --name=read_iops --directory=$TEST_DIR --size=10G \
--time_based --runtime=60s --ramp_time=2s --ioengine=libaio --direct=1 \
--verify=0 --bs=4K --iodepth=64 --rw=randread --group_reporting=1

Clean up:

sudo rm $TEST_DIR/write* $TEST_DIR/read*
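If you run these benchmarks often, the four tests above can be wrapped in a small script; this is a sketch using the same flags and the $TEST_DIR path from this example:

```shell
#!/bin/bash
# Run the four fio benchmarks back to back, one log file per test.
TEST_DIR=/mnt/disks/mnt_dir/fiotest

run_test() {  # run_test <name> <rw-mode> <block-size> <numjobs>
  sudo fio --name="$1" --directory="$TEST_DIR" --size=10G \
    --time_based --runtime=60s --ramp_time=2s --ioengine=libaio \
    --direct=1 --verify=0 --rw="$2" --bs="$3" --numjobs="$4" \
    --iodepth=64 --group_reporting=1 --output="$1.log"
}

run_test write_throughput write     1M 8
run_test write_iops       randwrite 4K 1
run_test read_throughput  read      1M 8
run_test read_iops        randread  4K 1

# Clean up the test files afterwards.
sudo rm "$TEST_DIR"/write* "$TEST_DIR"/read*
```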

OUTPUT Reading:

Run status group 0 (all jobs):
WRITE: bw=127MiB/s (133MB/s), 127MiB/s-127MiB/s (133MB/s-133MB/s), io=8192MiB (8590MB), run=64602-64602msec

Finally, we get the total I/O: 8192MiB written to disk in 64602 milliseconds. Divide 8192MiB by 64.602 seconds and, surprise surprise, you get 126.8MiB/sec. Round that up to 127MiB/sec, and that's just what fio told you at the start of the line for aggregate throughput.
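The arithmetic is easy to sanity-check yourself, for example with awk:

```shell
# Total I/O divided by elapsed time: 8192 MiB over 64.602 seconds.
awk 'BEGIN { printf "%.1f MiB/s\n", 8192 / 64.602 }'
# prints "126.8 MiB/s"
```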