Stuff changes; don't take things on faith, get the facts

Disk speeds

A number of years back I saw a mail thread about Java performance. One person claimed that Java was slow: if you wanted performant apps then you should write in a different language. This matched my experience, so I felt a little confirmation bias. However, the reply really made me think: this may have been true 10 years ago (and, indeed, it’d been over 10 years since I’d done any real Java), but modern techniques meant that Java was perfectly fast. The kicker was that, instead of believing rumour and ancient word-of-mouth results, you should do the tests yourself; measure and get the facts.

My home server disks

My home machine has mutated a few times over the years, and suffers some technical debt as a result.

Currently the disks are connected to a couple of SAS9211-8i controllers. In theory these are PCIe 8x controllers, but my motherboard only has a 16x slot and a 4x slot, so one card isn’t being used to its full potential. I’m using the controllers in JBOD mode, with Linux mdraid creating the various RAID volumes. I prefer this because it means I’m not tied to specific hardware: I could migrate the disks to another setup entirely, and the kernel should autodetect and bring up the RAID automatically.
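
As a rough sketch of what that looks like with mdadm (the device names and array numbers here are placeholders, not my actual layout):

  # Mirror two SSDs as RAID 1 (device names are illustrative)
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb

  # Eight disks as RAID 6
  mdadm --create /dev/md2 --level=6 --raid-devices=8 /dev/sd[c-j]

  # Record the arrays so they are assembled consistently at boot
  mdadm --detail --scan >> /etc/mdadm.conf

Because the RAID metadata lives on the member disks themselves, the arrays can be assembled on a different machine or controller without any special configuration.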

This means the setup is currently a CentOS 6 based system, with the disks distributed over the two cards:

  • 16x slot:
    • 2 Crucial CT512MX1 SSDs in a RAID 1
    • 4 Seagate ST2000 in a RAID 10
  • 4x slot:
    • 8 Seagate ST4000 in a RAID 6
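
If you want to see how a layout like this hangs together on a live system, the md subsystem will tell you (the array name below is illustrative):

  # List the active md arrays and their member disks
  cat /proc/mdstat

  # Detailed view of one array: level, layout, state, members
  mdadm --detail /dev/md2

  # Show which SCSI host (i.e. which controller) each disk is attached to
  lsscsi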

Now I’d been taking it as an article of faith that the SSDs would be fastest, with the RAID 10 being better than the RAID 6 for writes, and the RAID 6 being better for reads.

When I built this I ran some hdparm -t tests on the RAID volumes and was surprised at how well the RAID 6 performed:

The two SSDs  are in a RAID 1  and get 375 MB/sec
The 4*2Tbytes are in a RAID 10 and get 267 MB/sec
The 8*4Tbytes are in a RAID 6  and get 532 MB/sec
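
Those numbers come from nothing more sophisticated than hdparm’s built-in read timing, run against each md device in turn (the device names here are illustrative):

  # -t times buffered reads from the device, giving a rough sequential-read figure
  hdparm -t /dev/md0    # 2x SSD RAID 1
  hdparm -t /dev/md1    # 4x 2TB RAID 10
  hdparm -t /dev/md2    # 8x 4TB RAID 6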

This made me wonder how well they’d work in a more targeted test. Since EPEL has bonnie++ in its repository, I chose that. For each of the RAID volumes I picked an existing ext3 filesystem to run the tests on. This should give some form of “real world” feel, since it wouldn’t be on newly created filesystems.

I ran each test twice.
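
The invocations were along these lines, run once per array; the mount point, file size and user below are illustrative rather than my exact values (the file size should be well above RAM so the page cache doesn’t flatter the results):

  # Run against a directory on the filesystem under test; repeat for each array
  bonnie++ -d /data/raid6/bonnie -s 32g -u nobody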

The results (slightly edited for formatting) make interesting reading:

           ------Sequential Output------   --Sequential Input- --Random-
           -Per Chr- --Block-- -Rewrite-   -Per Chr- --Block-- --Seeks--
Machine   K/sec %CP  K/sec %CP  K/sec %CP K/sec %CP K/sec  %CP  /sec %CP
Raid6       800  92 108776  14 103820  11  3276  76 606148  23 170.3   8
Raid6       784  91 108436  13 105046  11  3373  76 606068  23 172.2   8

Raid10      753  94 175905  21  91956   9  2700  76 227793  10 409.2  15
Raid10      745  91 170551  20  93408   9  3209  78 229931  10 531.6   7

Raid1 SSD   786  96 182078  21  97995  10  3685  84 236123  10 460.2  17
Raid1 SSD   796  95 173306  20  98546  10  3012  75 233678  11 470.9  16

What seems clear is that, for reads, the ability to distribute the load over 8 disks gives a clear speed advantage on block reads, but at a CPU cost. Per-character sequential reads don’t differ much between the arrays, so there we may be hitting limits in the kernel, CPU and memory rather than in disk I/O. RAID 6 definitely loses for random seeks, though!

More interesting are the writes. RAID 6 is definitely slower, but surprisingly the SSDs weren’t noticeably faster than the spinning 2TB disks. These SSDs are meant to be able to do 500MB/s, but I appear to get real-world speeds of around 180MB/s, only a little faster than the 2TB Seagates (and those values seem reasonable, looking at this chart).

And why are the rewrites tipped the other way?

At this point I’d almost consider moving all my data onto the RAID 6, because I think that’d give the most balanced performance! However, performance isn’t the only design criterion. There’s data risk (if a 4TB disk failed, would the array survive long enough to rebuild onto a replacement?) and there’s data separation: I could turn off the RAID 6 and still keep 95% of my functionality (only long-term storage would be unavailable), so I could run on a smaller machine if this motherboard died.
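
On the data-risk point, the worry is the rebuild window: replacing a failed 4TB drive means hours of resync during which the array is degraded. With mdraid that recovery looks roughly like this (device names are placeholders):

  # Mark the dead disk as failed and remove it from the array
  mdadm --manage /dev/md2 --fail /dev/sdj --remove /dev/sdj

  # Add the replacement; the resync starts automatically
  mdadm --manage /dev/md2 --add /dev/sdk

  # Watch progress; a RAID 6 can still lose one more disk while this runs
  cat /proc/mdstat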

Summary

This “get the facts” approach can apply to many things. It becomes tempting to measure everything and try to apply changes based on those numbers. But we can see, from this simple disk speed test, that the numbers may not be so clear-cut, and that what you measure can give you different answers. If I only measured “block read” speed then I would just put everything on the RAID 6 disks; that’s so much quicker! But if I care about writes then maybe the RAID 10. Would I even care about the SSDs? And there are other factors (resiliency, recovery) to take into account. Over-optimising for one factor isn’t always the best idea, either!

Technology changes; configurations can have impacts; optimisations (JIT bytecode compilers) can add a whole new dimension. What was an article of faith 10 years ago may now hinder you and cause you to build slower, more complicated solutions.

If possible, you should try out the various options early in the design process and pick the one that gives the best results for your use case. You need the facts in order to make the best decision. But don’t over-optimise at the cost of other factors. Decisions are not made in a vacuum.