Showing posts from December, 2013

Finding a disk

Q: There are 6 SATA disks in the server. Based on SMART data, one of them (in the fourth bay) is going to fail. Does anyone know how the bays are numbered? Identification by “blinking the bay's LED” does not work.

A: Write down the serial number of the offending disk as shown in the controller configuration tool, turn off the server, pull the disks out one at a time, and find the disk whose serial number matches. Then put everything back and turn on the server. If everything is OK, turn off the server once again and replace the disk.
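The matching step can be sketched in a few lines of Python. This is only an illustration: the function names, the bay numbers, and the serial numbers are all made up, and the normalization (dropping whitespace, ignoring letter case) is an assumption about how the controller tool and the drive label might differ in formatting.

```python
def normalize(serial):
    """Drop all whitespace and ignore case so formatting differences do not matter."""
    return "".join(serial.split()).upper()


def find_bay(target_serial, serials_by_bay):
    """Return the bay whose recorded label serial matches target_serial, or None."""
    wanted = normalize(target_serial)
    for bay, serial in serials_by_bay.items():
        if normalize(serial) == wanted:
            return bay
    return None


# Hypothetical serials, written down while pulling the disks one at a time.
labels = {
    1: "WD-AAA111",
    2: "WD-BBB222",
    3: "WD-CCC333",
    4: "WD-DDD444",
    5: "WD-EEE555",
    6: "WD-FFF666",
}

# Hypothetical serial reported by the controller tool for the failing disk.
failing_bay = find_bay(" wd-ccc333 ", labels)
print("Failing disk is in physical bay", failing_bay)
```

With these made-up values the failing disk turns out to be in the third physical bay, which is the point of the exercise: the controller's bay numbering need not match the physical one.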

Transferring disks along with a controller between servers

Q: The motherboard in a server burned out. The server had a RAID5 of 5 disks on an LSI MegaRAID controller. There is reason to believe that the controller and the array were in working order. If I transfer the controller and drives to another server (different model/motherboard), will the array work?

A: A hardware controller with its disks is a self-contained unit, so if you install the required drivers on the other system, the array will work. Just in case, keep each drive on the same controller port it was on before, because in some cases the order is significant.

If wishes were horses...

RAID controllers should have a simplified configuration option that asks just a single question - how much does it cost to re-create your data from scratch? - and then proceeds accordingly. This should prevent people from storing useful data, or worse yet, data that cannot be re-created at all, on RAID0s.

Hardware and Storage Spaces

Sadly, the predictions about people using sub-par hardware to build enormous Storage Spaces configs are gradually coming true. Not that the hardware is bad or faulty per se. It is just not up to the task. Any large storage configuration, except maybe some network-based distributed systems (which are designed to be slow, by the way), requires stable hardware. Even a drive failure rate of one failure per year per drive would still be acceptable. On a disk set of ten or more drives, one failure per drive-year is an annoyance, but one can still expect the system to be able to cope. However, in what we are looking at now, with USB configurations of forty drives or larger, the failure rates are closer to one failure per drive per week. This results in systems where the lifetime expectancy is comparable to the time needed to populate the system with the initial data set. Once the data is copied over, the original copy deleted, and the original drives reused, whatever lifetime left in Storage Spac…
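To put the two failure rates quoted above side by side, here is a rough back-of-the-envelope sketch in Python. It assumes that drive failures are independent and occur at a constant rate, which is a simplification made for the illustration and not something measured; the function names are made up as well.

```python
import math

HOURS_PER_WEEK = 7 * 24
HOURS_PER_YEAR = 365 * 24


def expected_failures(n_drives, failures_per_drive_hour, hours):
    """Expected number of drive failures in the whole set over the given time."""
    return n_drives * failures_per_drive_hour * hours


def hours_between_failures(n_drives, failures_per_drive_hour):
    """Average time between failures anywhere in the set."""
    return 1.0 / (n_drives * failures_per_drive_hour)


def p_any_failure(n_drives, failures_per_drive_hour, hours):
    """Chance of at least one failure in the period (assumed constant-rate model)."""
    return 1.0 - math.exp(-expected_failures(n_drives, failures_per_drive_hour, hours))


# Ten drives at one failure per drive-year.
stable = hours_between_failures(10, 1 / HOURS_PER_YEAR)

# Forty USB drives at one failure per drive per week.
flaky = hours_between_failures(40, 1 / HOURS_PER_WEEK)

print(f"10 drives, 1 failure per drive-year: one failure every {stable:.0f} hours")
print(f"40 drives, 1 failure per drive-week: one failure every {flaky:.1f} hours")
```

Under these assumptions the ten-drive set sees a failure about every 876 hours (roughly 36.5 days), while the forty-drive USB set sees one about every 4.2 hours, so a failure is to be expected well before any multi-day initial copy completes.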