Monday, 29 November 2010

RAID Tips 4 of 10 - RAID 5 uncorrectable error probability

If you plan to build a RAID 5 with a total capacity of more than 10 TB, consider RAID 6 instead.

The problem with RAID 5 is that once a member disk has failed, the entire array must be read to complete the rebuild. Although the probability of a read error in any particular read operation is very low, the chance of hitting one grows with the array size. It has been widely speculated that the probability of encountering a read error during a rebuild becomes practically significant as the array size approaches 10 TB. Although this speculation relies on an assumption that is unlikely to hold (we'll have a writeup on that later), it is better to be safe than sorry.

RAID 6, being capable of correcting two simultaneous read errors, does not have this problem.
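
To see where the 10 TB figure comes from, here is a rough sketch of the usual back-of-the-envelope calculation. It assumes the commonly quoted datasheet rate of one unrecoverable read error per 10^14 bits and fully independent errors; both are assumptions of the speculation itself, not measurements from this post, and the function name is made up for illustration.

```python
import math

def rebuild_read_error_probability(bytes_to_read, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading
    bytes_to_read bytes, ASSUMING independent errors at ure_per_bit."""
    bits = bytes_to_read * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ure_per_bit))

TB = 10**12  # decimal terabyte, as used on drive labels

for size_tb in (1, 5, 10, 20):
    p = rebuild_read_error_probability(size_tb * TB)
    print(f"{size_tb:>2} TB read: {p:.0%} chance of at least one read error")
```

Under these assumptions, reading 10 TB gives roughly a 55% chance of hitting at least one read error, which is why the threshold is usually put around that size.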

Friday, 26 November 2010

RAID Tips 3 of 10 - RAID 0

If you are planning to build a RAID 0, consider using an SSD instead. Depending on your requirements, you may get more bang for your buck with just one SSD. Higher-RPM rotational drives (e.g. the WD VelociRaptor series) or hybrid drives (like the Seagate Momentus XT) may also be worth a look.

Tuesday, 23 November 2010

RAID Tips 2 of 10 - The RAID Triangle

The relationship between Speed, Price, and Fault Tolerance mostly determines the RAID level to use. Of these three parameters, pick any two.

  • Fast and Fault Tolerant - RAID 1+0

  • Fast and Cheap - RAID 0

  • Cheap and Fault Tolerant - RAID 5 or RAID 6.

Saturday, 20 November 2010

RAID Tips 1 of 10 - Requirements

When you are about to build a RAID, make sure you understand your storage requirements. The following points are significant:
  • Array capacity. Whatever your capacity requirements are, they are underestimated, most likely by a factor of two.
  • Budget limits.
  • Expected activity profile, especially read-to-write ratio. If there are mostly reads and few writes, then RAID 5 or RAID 6 would be OK. If significant random write activity is expected, consider RAID 10 instead.
  • Expected lifetime. Whatever the projected lifetime of the storage system is, it is underestimated.

For a quick estimation of capacities for various RAID levels, check the online RAID calculator.
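
If you just want the arithmetic, the sketch below uses the standard capacity formulas for arrays of identical disks. It is an illustration only, not the code behind the online calculator; the function name and the minimum disk counts are our own choices.

```python
def usable_capacity(level, disks, disk_size_tb):
    """Usable capacity (TB) of an array built from identical disks."""
    minimum = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == "10" and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")

    if level == "0":
        data_disks = disks        # no redundancy at all
    elif level == "1":
        data_disks = 1            # every disk holds the same copy
    elif level == "5":
        data_disks = disks - 1    # one disk's worth of parity
    elif level == "6":
        data_disks = disks - 2    # two disks' worth of parity
    else:
        data_disks = disks // 2   # RAID 10: mirrored pairs, striped
    return data_disks * disk_size_tb

for level in ("0", "5", "6", "10"):
    print(f"RAID {level}, 4 x 1 TB: {usable_capacity(level, 4, 1.0)} TB usable")
```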

Wednesday, 17 November 2010

The effect of S.M.A.R.T.-reported temperatures on failure rate

It was previously thought that there was a clear correlation between disk temperatures and failure rates; however, studies undertaken by Google Inc. on a large disk population have revealed that the correlation is not as strong as was assumed earlier. The studies analyzed S.M.A.R.T. data collected every few minutes over a nine-month observation window. Only average temperatures were taken into account. It was found that failures do not increase as the temperature increases. Moreover, higher failure rates were observed in the lower temperature ranges. A positive correlation was detected only for disks with temperatures above 50°C.

However, three- and four-year-old drives stand out. For such drives, the correlation between average temperatures and failure rates turned out to be more pronounced, probably due to the then-current HDD technology.

Thus, the studies show that disk temperature directly affects the failure rate only for old drives and high temperature ranges (above 50°C). At moderate temperatures, other factors affect failure rates much more strongly than temperature does.

Sunday, 14 November 2010

Why do we need as much information as possible?

There was once a discussion on one of the repair forums, and one poster said something along these lines:

The only information needed to recover a RAID are the RAID disks themselves. If the recovery lab asks something like controller model, they are not a professional outfit.

This guy has a point. If you can get your hands on the actual drives, you do not really need anything else to do the recovery. This is true for a recovery lab, which works with the actual disks (or images thereof). When we debug our RAID recovery freeware, though, we are at a significant disadvantage: the actual disk images are almost always prohibitively expensive to transfer, so we have to figure the problem out without them.

Lacking the images, we still have our test data sets, crash dumps, whatever, but the customer description of the problem becomes more important.

Consider the following problem report, just for entertainment purposes:

We were running XP the software RAID5 volume holding the data failed. The array is 4x 1TB WD whatever model hard drives. The hard drives were verified separately with WD Lifeguard and tests returned no errors. However, Windows refuses to mount the array and ReclaiMe Free RAID Recovery fails to produce proper output.

Now, what is the problem with the recovery?

There is a discrepancy between two statements: (1) running XP and (2) using RAID5. They must have been using RAID0, because XP does not support software RAID5.

This illustrates the importance of all the details perfectly.

Thursday, 11 November 2010

On RAID computational requirements

There is a widespread opinion that software RAID imposes a significant processing power penalty on the host system, thereby decreasing overall performance.

For a RAID 0, this is obviously not the case. The only overhead involved is to dispatch the sectors being read or written to their appropriate disk, requiring a fairly simple calculation once for every sector (512 bytes of data) written.
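
To show how simple that calculation is, here is a minimal sketch of the sector-to-disk mapping. The stripe size of 128 sectors (64 KB) and the function name are assumptions chosen for illustration; real implementations differ in stripe size and in how they number the disks.

```python
def raid0_locate(lba, disks, stripe_sectors=128):
    """Map a logical sector number to (disk index, sector on that disk).

    stripe_sectors is an ASSUMED stripe size; 128 sectors of 512 bytes = 64 KB.
    """
    stripe_index, offset = divmod(lba, stripe_sectors)  # which stripe, where in it
    row, disk = divmod(stripe_index, disks)             # stripes rotate across disks
    return disk, row * stripe_sectors + offset

# The first stripe lands on disk 0, the next one on disk 1, and so on.
for lba in (0, 127, 128, 256, 300):
    print(lba, "->", raid0_locate(lba, disks=2))
```

The whole mapping is two integer divisions per request, which is negligible next to the time the disk I/O itself takes.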

RAID 5 and RAID 6 are more complicated, because parity data must be computed for each write. However, the processing power requirements are modest and the resources are abundant. Given a 100 MB/sec write speed, we need, say, 1,000 MIPS (Million Instructions Per Second) to calculate the parity, which works out to about 10 instructions per byte written. There will also be an additional memory bandwidth requirement of, say, 200 MB/sec (100 MB/sec in and out). Properly designed caching would reduce the load even further. Meanwhile, a pretty modest CPU (made circa 2005) can provide about 15,000 MIPS and about 5,000 MB/sec of memory bandwidth. So, the requirements of a RAID performing a sustained write at a rate of 100 MB/sec do not seem very high compared to the resources available.
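
For reference, RAID 5 parity is a plain byte-wise XOR across the data blocks of a stripe. The sketch below is a deliberately naive illustration (real implementations work on machine words or use SIMD instructions), and the block contents are made up. RAID 6 adds a second, more expensive syndrome that is not shown here.

```python
def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(result):
            raise ValueError("all blocks must have the same size")
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

# Three data blocks of one stripe (made-up contents) and their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# If the disk holding data[1] fails, XOR of the survivors and the parity
# gives the missing block back.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

The same function serves both to compute the parity and to rebuild a missing block, which is why a RAID 5 rebuild has to read every surviving disk.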

On RAID diagnostic messages

Always check what your RAID controller says when a disk failure or other array failure occurs. You should verify by testing that such messages are difficult to miss.

Error messages displayed during the boot sequence are no longer useful, as uptime is now measured in weeks even for home PCs.

If the controller doesn't report error messages or for some reason you don't take prompt action to restore the array redundancy once the disk failure has occurred, then there is no point in running a RAID. Using hot spares alleviates the problem for disk failures, but not for a silent controller malfunction.

On one of the forums, someone told a story about a RAID 1 failure in which one of the member disks failed, and it was later discovered that the second member disk contained two-month-old data. Two months before the disk failure, the controller had lost the array for some reason but did not report the error, so nobody bothered to restore the redundancy. As a result, when the only remaining disk failed, there was no redundancy and all the data was lost.

Remember that RAID recovery may be difficult if the array is seriously out of sync.
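
If your controller or driver cannot be trusted to shout loudly enough, you can poll the array status yourself. The sketch below assumes a Linux software RAID (md) and parses text in the format of /proc/mdstat; the sample text is made up for illustration, and hardware controllers need their own vendor tools instead.

```python
import re

# Made-up sample in the /proc/mdstat format: md0 has lost a member, md1 is healthy.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1]
      976630336 blocks super 1.2 [2/1] [U_]

md1 : active raid1 sdd1[1] sdc1[0]
      976630336 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""

def degraded_arrays(mdstat_text):
    """Return the names of md arrays that are missing at least one member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name = re.match(r"^(md\S+)\s*:", line)
        if name:
            current = name.group(1)
            continue
        # "[2/1] [U_]" means 2 members expected, 1 active; "_" marks a missing one
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded

print(degraded_arrays(SAMPLE))
```

On a real system you would feed it the contents of /proc/mdstat from a scheduled job and send the result somewhere you actually look.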

Monday, 8 November 2010

Moving drives across ports in a RAID

Can I swap the disks in a RAID (connect them to different ports) without losing the RAID data?

There are two types of RAID implementation:

1. Configuration-on-disks (COD) - the information about the arrays, including which array a given disk belongs to, is stored on the disks themselves.
In this case, you can move the disks between ports and even between controllers of the same model. This scheme is implemented in modern software RAIDs (Windows LDM and Linux mdadm) and in most hardware controllers as well. Sometimes you can even move the array between different controller models, for example Intel ICH9R and Intel ICH10R.

2. RAID implementations in which the information about member disks is stored in the controller memory. Here, the controller actually tracks the ports rather than the disks, so you cannot swap the disks. If you do, you lose the array and then need a RAID recovery.

Wednesday, 3 November 2010

Installing XP on a hard drive larger than the 128/137 GB limit

If Windows XP doesn't see more than 137 (or 128) GB of disk space on the large disk, then you need to turn on BigLBA.
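
The two numbers are the same limit expressed in different units. It comes from 28-bit LBA addressing with 512-byte sectors, which a few lines of arithmetic confirm (the constant names are ours):

```python
SECTOR_BYTES = 512
LBA28_SECTORS = 2**28          # sectors addressable with a 28-bit LBA

limit_bytes = LBA28_SECTORS * SECTOR_BYTES

print(limit_bytes)             # 137438953472
print(limit_bytes / 10**9)     # about 137.4 decimal gigabytes
print(limit_bytes / 2**30)     # 128.0 binary gigabytes
```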

To enable BigLBA on an XP computer, you need to change a parameter in the registry. This approach works well when XP is already installed, but if you need to make a fresh installation on a large disk, you can't change the parameter because the registry does not exist yet.

To work around this issue, you can do one of the following:

  1. Include the latest Service Pack in the installation CD (this process is called "slipstreaming"), and then the full disk capacity will be available during the install.
  2. Install XP on a partition of, say, 100 GB, then install the latest Service Pack, enable BigLBA, and use a tool like Partition Magic to extend the partition onto the remaining space. Normally, we'd recommend that you back up before resizing a partition, but since this is a new install anyway, there is nothing useful to back up.

Tuesday, 2 November 2010

Hard disk sounds

What does spindle motor damage sound like?