Do you really need ECC RAM with ZFS?
27 Feb 2014

In short, not really. But your life will be better if you do.

The thing to remember is that ZFS will absolutely refuse to give you data that it thinks is incorrect. If it detects an error, it will give you an error. It will never ever (barring a vanishingly tiny probability) give you wrong data.
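A toy sketch of that behaviour, not ZFS code: each block is stored with a checksum kept apart from the data, and a read that fails verification raises an error instead of returning the bytes. The names `write_block`, `read_block`, and `ChecksumError` are made up for illustration, and SHA-256 simply stands in for whichever checksum a real pool is configured to use.

```python
import hashlib


class ChecksumError(IOError):
    """Raised instead of handing back data that fails verification."""


def write_block(data: bytes):
    """Return (stored_bytes, checksum). The checksum is kept separately from the data."""
    return data, hashlib.sha256(data).digest()


def read_block(stored: bytes, checksum: bytes) -> bytes:
    """Return the stored bytes only if they still match the recorded checksum."""
    if hashlib.sha256(stored).digest() != checksum:
        raise ChecksumError("checksum mismatch: refusing to return data")
    return stored


def flip_bit(data: bytes, bit: int) -> bytes:
    """Simulate a single-bit error somewhere between write and read."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


if __name__ == "__main__":
    stored, csum = write_block(b"important data")
    print(read_block(stored, csum))          # intact block is returned
    try:
        read_block(flip_bit(stored, 5), csum)
    except ChecksumError as err:
        print("read failed:", err)           # damaged block is an error, not wrong data
```

The point of the model is the failure mode: one flipped bit turns a successful read into an error, never into silently different bytes.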

So, if your non-ECC RAM is already perfect, great! You won’t gain anything by getting ECC RAM.

The thing is, all hardware is imperfect. Modern hard drives are specced with an unrecoverable read error rate of about 1 bit in 10^14 to 10^15 bits read. And while this number should be taken with a large grain of salt, it’s worth mentioning that the capacity of hard drives is approaching it: 10^14 bits is only about 12.5 TB. That is, if you merely fill a modern hard drive with data and read it back, there is a real chance that the drive itself has introduced an error into your data.
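To see how close a given drive is to that figure, here is a rough sketch of the arithmetic. It assumes the spec means one unrecoverable error per N bits read and that errors are independent, which real drives only approximate. The 4 TB capacity and the two rates are illustrative inputs, not measurements.

```python
import math


def expected_bit_errors(capacity_bytes: float, bits_per_error: float) -> float:
    """Expected number of unrecoverable errors when reading the whole drive once."""
    return capacity_bytes * 8 / bits_per_error


def prob_at_least_one_error(capacity_bytes: float, bits_per_error: float) -> float:
    """Poisson approximation: P(at least one error) = 1 - exp(-expected)."""
    return -math.expm1(-expected_bit_errors(capacity_bytes, bits_per_error))


if __name__ == "__main__":
    capacity = 4e12  # hypothetical 4 TB drive
    for rate in (1e14, 1e15):
        print(
            f"1 in {rate:.0e} bits: "
            f"expected errors = {expected_bit_errors(capacity, rate):.3f}, "
            f"P(at least one) = {prob_at_least_one_error(capacity, rate):.3f}"
        )
```

Under these assumptions a full read of a 4 TB drive gives an expected error count of about 0.3 at a rate of 1 in 10^14 and about 0.03 at 1 in 10^15. The expected count reaches 1 at 12.5 TB and 125 TB respectively.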

Most filesystems trust the data that the hardware gives them, and they in turn will pass that data to you, the user. And if there’s an imperfection, you’ll get that imperfection. You almost certainly won’t notice; nowadays, most data is highly compressed video or audio or pictures, and humans are mostly forgiving of small flaws.

The thing that makes ZFS difficult to use with non-ECC RAM is that it won’t give you flawed data; it’ll give you no data. If you have a 20GB VM image on a ZFS volume and it develops a single uncorrectable bit out of place, the whole thing is marked ‘broken’ and ZFS won’t give it to you. Over a single bit. Which probably wasn’t important anyway.

Note that I said ‘uncorrectable’. If your data hit the disk intact and was damaged later, that error will almost certainly be correctable from one of the other members of the pool, such as the other side of a mirror.

If your data hits the disk incorrectly, such as if you have not-quite-perfect non-ECC RAM and it was written to all of the mirrors incorrectly, you’re in trouble. You now have redundantly incorrect data that ZFS won’t serve to you. Hope you have a backup.
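The difference between those two cases can be sketched with a toy two-way mirror. This is a simplified model, not how ZFS is implemented: `ToyMirror` and its `bad_ram` flag are hypothetical, SHA-256 stands in for the pool checksum, and the model assumes the bit flips in RAM after the checksum has been computed but before the data is written out.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


class ToyMirror:
    """Every disk holds a full copy; one checksum is recorded per write."""

    def __init__(self, copies: int = 2):
        self.disks = [b""] * copies
        self.expected = checksum(b"")

    def write(self, data: bytes, bad_ram: bool = False) -> None:
        self.expected = checksum(data)       # checksum of the intended data
        if bad_ram:
            data = flip_bit(data, 0)         # assumed: corrupted in RAM before hitting disk
        self.disks = [data] * len(self.disks)

    def read(self) -> bytes:
        for copy in self.disks:
            if checksum(copy) == self.expected:
                # Found a good copy: overwrite the others with it and return it.
                self.disks = [copy] * len(self.disks)
                return copy
        raise IOError("all copies fail checksum: uncorrectable")


if __name__ == "__main__":
    mirror = ToyMirror()

    # Case 1: data hit the disks intact, then one disk went bad.
    mirror.write(b"vm image contents")
    mirror.disks[0] = flip_bit(mirror.disks[0], 3)
    print(mirror.read())                     # served from the good copy and repaired

    # Case 2: the same wrong bytes were written to every disk.
    mirror.write(b"vm image contents", bad_ram=True)
    try:
        mirror.read()
    except IOError as err:
        print("read failed:", err)
```

In the first case redundancy does its job and the bad copy is quietly repaired. In the second, every copy is wrong in the same way, so there is nothing left to repair from.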

You don’t need ECC memory for ZFS. It won’t run any better or faster or clean your bathroom. What it will do is reduce the chance that your data becomes inaccessible because it’s slightly wrong; something which you didn’t know happened before, but which ZFS makes obvious.

