Thursday, 29 December 2011

Stories


A computer with an Asus motherboard based on the ICH8R had worked flawlessly for about 1.5 years. Two 250 GB hard disks were in a RAID 1 divided into two partitions.
Two weeks ago I heard about a problem (a request to press a key during boot) which did not lead to any further problems in operation.

When the system starts up, the message "Primary HDD not found. Press F1 to resume" is displayed. Pressing F1 leads to a normal system start with the message "New device was found ...". When I started to sort out what had happened, I found out that the RAID controller had switched over to IDE mode. Disk Management displayed copies of the logical disks (originally there were C: and D:, but now E: and F: were added with the same sizes and labels). The most recent file dates on the copies coincided with the date when the startup problems arose.

When I switched the controller back to RAID, the RAID 1 appeared immediately with the name I had specified in the RAID settings. The state of the RAID 1 was shown as "Normal", which meant the controller didn't understand that the disks were not synchronized.


That's why it is important to monitor an array's state and to periodically synchronize disks in a mirror or recalculate parity in a RAID 5. There are situations when a controller loses an array and a user doesn't notice or just ignores it, as in "RAID seems to work and the rest doesn't matter". However, the RAID loses its fault tolerance, and instead of RAID 1 we get a single backup copy which will not be updated anymore.

Actually a good example for one of our RAID Tips here.
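One way to notice a mirror that has silently fallen out of sync, as in the story above, is to compare the two member disks (or images of them) block by block. What follows is only a rough sketch: the function name count_mismatched_blocks and the block size are made up for illustration, and a real controller may keep its own metadata areas which legitimately differ between the members.

```python
import io

def count_mismatched_blocks(disk_a, disk_b, block_size=64 * 1024):
    """Compare two binary streams block by block.

    Returns (blocks_compared, blocks_differing). A difference in length
    shows up as a mismatch in the trailing block(s).
    """
    compared = differing = 0
    while True:
        a = disk_a.read(block_size)
        b = disk_b.read(block_size)
        if not a and not b:
            break
        compared += 1
        if a != b:
            differing += 1
    return compared, differing

# Two tiny in-memory "disks": identical except for one byte in the second block.
good = bytes(range(256)) * 4
stale = bytearray(good)
stale[300] ^= 0xFF
print(count_mismatched_blocks(io.BytesIO(good), io.BytesIO(bytes(stale)), block_size=256))
```

Any non-zero count of differing blocks in the user data area means the "mirror" is really two diverging copies.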

Monday, 26 December 2011

File system reliability doesn't depend on the load

The idea that filesystem fragmentation degrades reliability, put forward by some of the defragmenter vendors, is not true.

A filesystem handles all the fragments in the same way.

The first significant difference arises between contiguous files, which have only one fragment, and files consisting of two fragments. The next significant difference appears when the list of fragments becomes large and the list itself is divided into several parts. On NTFS the mark is usually at about 100 fragments. Once it is tested that a filesystem properly handles one fragment, a list of fragments, and a list of lists, you can be sure that the filesystem handles any number of fragments.

In the same way, a heavy load on the data storage system hardware doesn't by itself decrease reliability. A high disk queue length doesn't lead to a disk failure. It definitely indicates an overload, but if you wait long enough, all the requests will be processed. Strictly speaking, if the load is very heavy, some of the requests will eventually be cancelled due to timeout. This doesn't affect the ability of the storage to keep working properly when the load decreases.

However, there are some factors which arise along with the overload that can lead to failures and data loss. In the hardware these are usually temperature and vibration. A disk system under load warms up and vibrates. If the cooling system is not good enough, this can lead to overheating and excessive wear.

In the software, race conditions and other bugs are more likely to come up under an overload. In addition, there can be errors in programs which work with the filesystem driver although they are not part of it.

For example, when multi-core processors were not yet invented and dual-CPU machines were expensive and rare, not many antivirus programs were tested on multiprocessor systems. Once you put an antivirus filter driver on a dual-CPU machine, blue screens were quite an ordinary thing.

However, filesystem errors are eventually found and fixed, and drivers of any mature filesystem are reliable and are already tested with almost any load you can imagine.

Saturday, 24 December 2011

On average, machine wins

If in RAID recovery the software says the block size is X, while the customer says it is Y, X and Y being different, the software is correct at least nine times out of ten. Sometimes that annoys the customer, but there is little we can do about it.

Monday, 12 December 2011

Recovering confidential data

When one deals with data recovery, one sometimes worries about the confidentiality of the recovered data. Look at the example below:

...a Western Digital HD that makes clicking noises. .... The HD has many customer credit card numbers and legal documents on it, so confidentiality is very important to us.

If automatic data recovery software works well enough, there is no problem. In this case you recover the data yourself and the data always stays on your own computer.

In all other cases, for example when a mechanical repair of the disk is required, a technician has full access to the disk and to whatever data he is able to recover.
Any respectable data recovery company will usually agree to sign either a non-disclosure agreement (NDA) or an agreement that data cannot be viewed at all during a disk repair.

The prohibition on reading data makes data recovery harder for two reasons:

  1. Quality control becomes more difficult, and often outright impossible. Some data recovery programs (e.g. ZAR) provide automatic integrity control of some recovered files. In addition, many data recovery companies use custom-made programs for this purpose. However, not all file types can be checked without reading them.

  2. In case the data recovery fails on the first try, adjusting the program settings for a second attempt becomes more complex.


On top of that, data recovery companies might worry that the recovered data is potentially illegal.

Thursday, 8 December 2011

Ah sh!t!... erm... press on.

If you are reconfiguring a RAID, or whatever other storage system, and something unexpected happens, or something happens that you do not fully understand, stop.

Pressing on in this situation would likely make things worse. Pressing on for long enough will eventually make things irreversibly bad.

This thread on Tom's presents a good example. When reading it, keep in mind two things:

  • Even if one does not initialize the RAID, each time a RAID 10 is reassembled and resynced with a different order of disks, there is a 50:50 chance of total data loss.

  • Repairing a filesystem or troubleshooting the boot process without having fixed the underlying RAID first is certainly useless and often damages the data.

Tuesday, 6 December 2011

XFS coming soon

Probably as soon as tomorrow. The only thing still missing is a comparison test against typical failure modes: file deletion, format, and bad blocks.

Friday, 25 November 2011

XFS recovery

We've been working for a couple of weeks now to implement an XFS recovery capability for our ReclaiMe data recovery software. The single most significant impression is that XFS is unnecessarily and exceedingly complex. Having... how many that would be, five? types of directories is actually OK, as long as these types utilize the same basic structures and design. Taking design commonalities into account, the number of distinct directory types is reduced to just two. Data storage comes in three forms (NTFS and ext4 are both OK having just two). The most interesting discovery of all to date was that each allocation group has two different sizes.

Monday, 14 November 2011

Data loss on JBODs

After a disastrous data loss caused by the raid system I have changed all to JBOD assuming that 1/8 data loss is more acceptable than a full disaster.

Wrong. If you have eight drives in JBOD, and one of them dies, there are several options:
  • On an NTFS filesystem, if the first drive dies, the entire array is lost. If any other drive dies, 7/8th of the data is lost.

  • On ext-whatever, you can salvage an unspecified amount of data because the superblocks are distributed more-or-less evenly across the volume. However, everything in block groups which span two disks is likely lost. So you can theoretically approach "1/8th of the data lost", but that involves using some data recovery software and is far from easy.

If you want to be sure that the loss is limited to 1/8th of the data in an eight-disk configuration, forget JBOD and create eight separate volumes instead.
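To see why a spanned volume does not fail in neat 1/8th slices, it helps to look at how a volume offset maps to a member disk. The sketch below assumes JBOD means plain concatenation (disk after disk); the function names locate and disks_touched are made up for illustration. The start of the volume, where NTFS keeps its boot sector, always lands on the first disk, and anything that crosses a disk boundary depends on two disks at once.

```python
from bisect import bisect_right
from itertools import accumulate

def locate(offset, disk_sizes):
    """Map a byte offset in a concatenated volume to (disk_index, offset_on_disk)."""
    ends = list(accumulate(disk_sizes))  # cumulative end offset of each disk
    if not 0 <= offset < ends[-1]:
        raise ValueError("offset outside the volume")
    disk = bisect_right(ends, offset)
    start = ends[disk] - disk_sizes[disk]
    return disk, offset - start

def disks_touched(start, length, disk_sizes):
    """Indices of all disks holding any byte of the range [start, start + length)."""
    first, _ = locate(start, disk_sizes)
    last, _ = locate(start + length - 1, disk_sizes)
    return list(range(first, last + 1))

sizes = [100] * 8                      # eight equal disks, sizes in arbitrary units
print(locate(0, sizes))                # the start of the volume
print(disks_touched(90, 20, sizes))    # a range that crosses a disk boundary
```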

Monday, 7 November 2011

Human vs. computer in RAID recovery.

Human vs. computer battle.

As far as I am aware there is no way to rebuild HFS+ RAID from file-system analysis. NTFS is simple because has good system counters to use, same as EXT. But HFS+ requires good knowledge of RAID distributions.

As you see, people rely on some property of the filesystem being recovered to produce a strong signal that we can use to determine the correct RAID configuration. NTFS provides plenty of those. FAT provides even more. With EXT, hddguy probably knows more than I do, because I'm not aware of any strong signal in EXT. By and large, humans prefer to find a small bit of data with a high signal-to-noise ratio, and use it. Understandable, because it limits the amount of effort involved.

The RAID recovery software, on the other hand, mostly works with weaker signals. Weaker signals have far worse SNR, but there are plenty of them. For example, you can calculate entropy values for any data. The obvious strength of a computer is to quickly process large arrays of data, which the software exploits in full. It just crunches through a sheer number of weak signals to produce the RAID layout.
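As an illustration of such a weak signal, here is how an entropy value can be computed for an arbitrary block of data. This is just the textbook Shannon entropy over byte frequencies, not a description of what any particular RAID recovery program does internally; the function name shannon_entropy is made up.

```python
import math
from collections import Counter

def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    # Sum of p * log2(1/p) over all byte values present in the block.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

print(shannon_entropy(bytes(4096)))             # a block of zeros
print(shannon_entropy(bytes(range(256)) * 16))  # every byte value equally frequent
```

A single value like this says very little about one block, but it can be computed for every block on every disk, which is exactly the kind of bulk work a computer is good at.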

Monday, 31 October 2011

Shopping comparison

Did a shopping comparison of ReclaiMe Free RAID Recovery vs Runtime's RAID Reconstructor the other day. Looks kind of grim for RAID Reconstructor.

Thursday, 27 October 2011

Can I has a write hole?..

in a RAID 6?

Actually, yes.

RAID 6 WRITE HOLE

All it takes is a large number of disks in the array, intensive I/O, a power failure, and some bad luck.

Monday, 17 October 2011

RAID 5 vs RAID 6

I'm getting tired of people advocating RAID 5 vs. RAID 6. They go on like oh, in a RAID 5 a bit error URE will get you one day! We spent 10,000s dollars sending our RAID5s to OnTrack. Yes, that's tens of thousands of Uncle Sam Dollars.

A single bit error in a RAID 5 is much cheaper to recover from than a RAID 6 controller failure or an operator accidentally deleting the array. Ever asked for a quote on RAID 6 recovery?

Tuesday, 11 October 2011

Non-standard configurations

Some do actually like non-standard hardware and software setups.

If we build a 16 TB RAID 5 (9x 2TB), can we then install Windows on it?

Probably yes, with some U/EFI trickery, but then troubleshooting this contraption if hardware ever dies would be a nightmare with 9 drives.

Now another try

We have a leftover of drives, like all sorts of 160GB to 2TB Parallel ATA, all sorts of Serial ATA, five RAID/HBA controllers, and a motherboard. We thought of putting it all together and deploying ZFS over it. Do you think it is a good idea?

Actually, no.

The complexity of the failure modes for the proposed design is just mind-boggling. First of all, when ZFS crashes, there is no reliable data recovery for it. Then, multiple HBA/RAID cards from different vendors in the same system are not going to work stably. Moreover, with drives of different sizes, no common RAID scheme can be applied. Should the RAID fail, the system is not recoverable. OK, you can go with the ZFS hybrid filesystem-RAID capability, but it is even less recoverable when it fails. On top of that, this borderline weird configuration was never tested. The symmetric configurations with md-raid and ext-whatever used in stock NAS units like QNAP are at least well tested and understood (and still even these have problems).

So, the takeaway: stick with simple and standard configurations. The increase in efficiency for a unique build is small and is not worth the problems you encounter when it fails.

Friday, 7 October 2011

Remote recoveries

Every once in a while we do a remote recovery session via TeamViewer. The most annoying thing in remote recoveries actually is not knowing who is in control. TeamViewer does not provide any feedback when the other party is going to take over by pulling a mouse cursor away.

The fact that someone is standing on the remote side watching what you are doing, and you cannot even tell if they are there or not, is not very comforting but acceptable. With remote recoveries, it is a part of the job, actually. Someone may choose to ride shotgun with you.

The real problem starts when they interfere with what you are doing and there is no way to stop it. This is not really because people on the remote side are specifically evil, just because there is no convenient way to establish who is in control, and how to request or how to relinquish it. Damn annoying still.

Thursday, 6 October 2011

Windows Update Error 80072ee2

This is unrelated to data recovery, but it still cost me about two hours of time.

If you have a Windows Update Error 80072ee2 on a freshly installed Vista, install Internet Explorer 9 manually. Btw, installing SP2 does not help.

Thursday, 29 September 2011

Omitting the key information is bad

A forum member goes to great lengths describing a hard drive problem, which sounds mechanical. Then follows a precise account of what they tried to fix the problem, including the fact they replaced the PCB with one from a drive of the same model, and so on. The story goes on for like a full page, still the actual model of the drive is never mentioned. In this case, the model number is probably the most important piece of data, because we can look up typical failure modes with it.

Tuesday, 20 September 2011

Tricks to determine the RAID type

If there is a set of disks, but the RAID type is not known, how do we determine what type of RAID is that? Most of the RAID recovery programs, including ours at www.FreeRaidRecovery.com, require the RAID type to be provided by the operator.

In the simplest case, where all disks are available, one can get an idea of the RAID type by just plugging in all the disks and looking at the Disk Management data.

The following cases are most typical:

1. One or multiple partitions on exactly one of the disks. This is a RAID 0 or a RAID 5, more likely RAID 0.
2. One or multiple partitions, with two identical sets of partitions on two disks. With three disks, this is a RAID 5. With four or more disks, this is either a RAID 5 or a RAID 10.

The above does not account for RAID 6 or exotics like RAID 3, and assumes MBR-style partitioning on the array, but nevertheless makes for a good start when working with an array of unknown type.
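The two typical cases above can be written down as a tiny lookup. This is only a restatement of the rule of thumb, with the same limitations (no RAID 6, no exotics, MBR-style partitioning assumed); the function name guess_raid_type and its parameters are made up for illustration.

```python
def guess_raid_type(num_disks, disks_with_partitions, sets_identical=False):
    """Return candidate RAID types, most likely first, for the two typical cases.

    num_disks: total number of member disks
    disks_with_partitions: how many disks show partitions in Disk Management
    sets_identical: True if those disks show the same set of partitions
    """
    if disks_with_partitions == 1:
        return ["RAID 0", "RAID 5"]
    if disks_with_partitions == 2 and sets_identical:
        if num_disks == 3:
            return ["RAID 5"]
        if num_disks >= 4:
            return ["RAID 5", "RAID 10"]
    return []  # not one of the two typical cases; investigate further

print(guess_raid_type(4, 1))
print(guess_raid_type(4, 2, sets_identical=True))
```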

Monday, 19 September 2011

Problem isolation in RAID recovery

A full, start-to-end RAID recovery is generally a three-part process.


  1. Determine the status of the member disks and make clones when required.
  2. Detect RAID parameters and perform destriping.
  3. If the destriped volume is not readily mountable, perform filesystem recovery on it to pump out the data.

Now, if the above three steps fail to produce correct data, the question is how do we tell whether it is the RAID recovery part or the filesystem analysis part that failed?

We tell if the RAID recovery is OK by looking at the sizes of the recovered files. If there are multiple good files recovered which are larger than twice the full row size (i.e. larger than 2 * block_size * num_disks), then the RAID recovery is almost certainly OK. However, if all good files are small, the RAID parameters should be investigated. This also applies to the files found by raw scan; however, keep in mind that file sizes produced by raw scan are not reliable.
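The size test is easy to put into code. A minimal sketch, assuming you already have a list of sizes of files that were verified to open correctly; the function names and the min_files=2 reading of "multiple" are my own choices here.

```python
def full_row_size(block_size, num_disks):
    """Row size as used in the text: block_size * num_disks, in bytes."""
    return block_size * num_disks

def raid_params_look_ok(good_file_sizes, block_size, num_disks, min_files=2):
    """True if at least min_files verified-good files exceed twice the full row size."""
    threshold = 2 * full_row_size(block_size, num_disks)
    large = [size for size in good_file_sizes if size > threshold]
    return len(large) >= min_files

# 64 KB blocks on 4 disks: the threshold is 2 * 65536 * 4 = 524288 bytes.
print(raid_params_look_ok([600_000, 700_000, 1_000], 64 * 1024, 4))
print(raid_params_look_ok([1_000, 2_000, 600_000], 64 * 1024, 4))
```

The idea behind the threshold is that a file this large necessarily spans more than one full row, so it can only come out intact if block size and disk order are right.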

Thursday, 1 September 2011

The most common problem with RAID5 is...

... that one does a rebuild with the wrong order of disks.

This is by far the most common scenario we at www.FreeRaidRecovery.com have for an unrecoverable RAID 5. Something bad happens and the configuration is lost. The operator then assembles the array in a way which looks correct, and does a rebuild on it. The configuration which looks correct is just not good enough. You need a configuration which actually is correct.

Doing a rebuild on a RAID 5 with a wrong block size or disk order effectively destroys the data on the array. Theoretically, the data can still be restored, but practically the complexity of having two sets of parameters (with unknown block sizes, disk orders, and such) precludes any recovery.

Wednesday, 31 August 2011

Customers' requests revisited

The customer walks in and says something along the lines of "you should have more input options for your RAID Recovery app".

Unfortunately, it just does not work that way. As you add more options and combinations thereof, people start to get lost among them fast. Interestingly, I recall once considering automatic software to detect all RAIDs attached to the system with just one click of a button. Something more along the lines of "All your RAID are belong to us", which would eliminate the requirements both to specify the array type and to select a disk set. Just do all the probing and produce all possible RAIDs. Unfortunately, this did not work out for technical reasons.

Actually, even providing a correct RAID type may prove difficult if the array was created five years ago, the person setting up the system retired four years ago, and no one even noticed the RAID until it failed. So now you have four disks, some of which may or may not work; so, RAID 0, RAID 10, RAID 5, or RAID 6? OK, RAID 6 can typically be ruled out based on the controller model alone, but the rest may not be that easy.

Sunday, 14 August 2011

Exotics

A customer walks in and says: We need to recover a FATX volume, can you do that? - Sorry, no. Various exotic filesystems, btrfs, logfs, and even ReiserFs have always been considered a job for a data recovery service, not automated software. Software is cheap, but only resolves common cases by applying typical solutions. A data recovery service is expensive, and applies its high fees toward the difficult cases, e.g. writing custom software to deal with just one specific case.

Lately, there is an influx of requests for something nonstandard. The latest hit was
- We have ReiserFs on the RAID5.
- Okay, no problem.
Turns out there was a problem. The RAID5 was using 512 bytes per block. JMB 393 controller. Oops.

So far we have delayed parity (from HP SmartArray), Promise RAID 6 with its non-standard Reed-Solomon, Promise 1E interleaved layout, exFAT filesystem recovery. The capability to recover RAID with a block size of 512 bytes is in the pipeline, currently undergoing testing.

So what's next? The spec for JMB 393 lists RAID 3 as a possible option, anyone actually ever used that?

Thursday, 11 August 2011

"Best guess" parameters in RAID recovery

Every once in a while, we get a feature request for our RAID recovery software (http://www.freeraidrecovery.com/) to implement the ability to interrupt the analysis midway and get a list of possible solutions, sorted by confidence level.

There is some strong reservation against this would-be feature.

Although it looks like a good idea, a very nice thing to have, it has some undesired consequences we cannot allow. The confidence thresholds are there for a reason, and we put in an effort to ensure they are balanced between faster analysis (lower thresholds) and reliability (higher thresholds). Once we make incomplete solutions accessible, people will start using these solutions on real data. Sooner or later, someone is going to destroy their RAID 5 by reassembling it on the controller using a wrong parameter set. In RAID 0, this would do no harm (just re-assemble again in the correct order), but with RAID 5, incorrect assembly (automatically followed by a rebuild) destroys the array beyond any practical repair. This is similar to consuming unbaked or half-baked food. Sooner or later someone will get poisoned. Determining if a RAID configuration is correct is even more difficult than telling baked food from raw meat.

Monday, 1 August 2011

RAID increases failure rate

Surprising, isn't it? Actually, RAID does indeed increase the failure rate. If you take MTBF, MTBF decreases with more disks. Even in a RAID 5, the mean time between disk failures decreases.

In a fault-tolerant storage, time between failures (MTBF) does not matter. What matters is time between data loss events. This is called either mean time to data loss (MTTDL) or mean time between data losses (MTBDL).

You know you can set up a three-way RAID 1 (three mirrored copies instead of two), i.e. the mirror can have more than two disks. So, let's imagine a RAID 1 of an infinite number of disks. This unit will have an MTBF of zero, because at any given moment one of the infinite number of disks is failing. It will also be continuously rebuilding while still delivering infinite linear read speed. Still, this imaginary device will have zero probability of losing data because of a disk failure, because the infinite number of disks cannot all fail at the same time.
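The two quantities pull in opposite directions, which a back-of-the-envelope calculation makes visible. The sketch below assumes independent disks with a constant failure rate and uses the common textbook approximation for RAID 5 data loss (a second disk failing during the rebuild of the first); the function names and the sample figures are hypothetical and only serve the illustration.

```python
def time_to_first_disk_failure(disk_mtbf_hours, num_disks):
    """Mean time until any one of num_disks independent disks fails."""
    return disk_mtbf_hours / num_disks

def raid5_mttdl(disk_mtbf_hours, num_disks, rebuild_hours):
    """Textbook approximation: MTTDL = MTBF^2 / (N * (N - 1) * rebuild time)."""
    return disk_mtbf_hours ** 2 / (num_disks * (num_disks - 1) * rebuild_hours)

mtbf = 1_000_000   # hypothetical per-disk MTBF, hours
rebuild = 24       # hypothetical rebuild time, hours
for n in (3, 4, 8):
    print(n, time_to_first_disk_failure(mtbf, n), raid5_mttdl(mtbf, n, rebuild))
```

With these sample figures the array sees a disk failure several times more often than a single disk would, yet the estimated time to actual data loss stays far above the MTBF of one disk. The approximation ignores read errors during rebuild, controller failures, and operator mistakes, so treat the absolute numbers as optimistic.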

Monday, 25 July 2011

Test runs

Did a RAID 6 test on an Intel SRCSASBB8I controller (LSI 1078 chipset, also used in Dell PERC 6/i), and it was successful save for a couple minor issues. First we got the R-Studio RAID matrix rendered wrong (data blocks are off by one), and also under the hood it looks like the analysis can use some more noise reduction. Still, the output image is OK. This applies to ReclaiMe Free RAID Recovery build 397 at www.freeraidrecovery.com.

Wednesday, 20 July 2011

RAID 1E analysis delayed

If you want some unusual RAID layouts, call Promise. So far, the Promise controller has the most bizarre RAID 6 variation we've seen, and it also gave us a nasty surprise with RAID 1E.

Typically, RAID 1E would be laid out as follows

1 1 2
2 3 3
4 4 5
5 6 6

(the above is so-called NEAR layout)

or
1 2 3
4 5 6
7 8 9
3 1 2
6 4 5
9 7 8

(the above is FAR layout).

Now Promise combines the best of both worlds to produce a third variation (Promise 1E layout)

1 2 3
3 1 2
4 5 6
6 4 5

Looks promising, doesn't it?

So, the implementation of RAID 1E recovery capability in ReclaiMe Free RAID Recovery (www.FreeRaidRecovery.com) is delayed until we figure out how to handle this one.
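For reference, the three tables above can be generated with a few lines of code. The sketch reproduces the layouts exactly as drawn for three disks; how each variant behaves with other disk counts is an assumption on my part, and the function names are made up.

```python
def rotate_right(row):
    """Shift a row one disk to the right, wrapping the last block to the front."""
    return row[-1:] + row[:-1]

def sequential_rows(num_disks, data_rows):
    """Plain striping: blocks 1, 2, 3, ... laid out row by row."""
    return [[r * num_disks + d + 1 for d in range(num_disks)] for r in range(data_rows)]

def near_layout(num_disks, rows):
    """NEAR: every block is written twice in a row before moving on."""
    return [[(r * num_disks + d) // 2 + 1 for d in range(num_disks)] for r in range(rows)]

def far_layout(num_disks, data_rows):
    """FAR: all data rows first, then all rotated copies."""
    data = sequential_rows(num_disks, data_rows)
    return data + [rotate_right(row) for row in data]

def promise_layout(num_disks, data_rows):
    """Promise 1E: each data row is immediately followed by its rotated copy."""
    table = []
    for row in sequential_rows(num_disks, data_rows):
        table += [row, rotate_right(row)]
    return table

for name, table in [("NEAR", near_layout(3, 4)),
                    ("FAR", far_layout(3, 3)),
                    ("Promise 1E", promise_layout(3, 2))]:
    print(name)
    for row in table:
        print(" ".join(str(block) for block in row))
```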

Thursday, 14 July 2011

Detecting RAID0 where it is supposed to be RAID5

Recently, a question arose about our RAID recovery software (www.freeraidrecovery.com), along the lines of "There is a RAID5 array, but I get some results if I try to recover RAID0 as well, what's wrong?"

Actually, there is nothing wrong. RAID Recovery requires the array type as outside input, along with the list of member disks. It has no way of determining if the provided array type is correct or not. Therefore, it tries to produce the closest possible layout for a given array type. There are certain cases (mostly involving RAID6) where nothing meaningful can be constructed, and it gives up with the appropriate error message. However, in most cases some layout will be produced.

If the original array type is not readily known and it is not possible to infer it from the number of disks and available capacity, the only solution is trial-and-error testing of all possible layouts (of which there are about four fundamentally different, RAID5/5E being practically equal).

Thursday, 7 July 2011

Mapping device identifiers to drive letters in Windows.

If you get an error message similar to "The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume \Device\HarddiskVolume1.", there may be trouble matching the identifier \Device\HarddiskVolume1 to a drive letter (is it C:?) or an actual physical device.

Most of the time, finding the corresponding drive letter (X:) is good enough because you can then look it up in the Disk Management console.

However, the trouble is that \Device\HarddiskVolume1 is not referenced outside the system event log.

To find out the list of mappings, download and run the Microsoft Product Support Reports utility from http://www.microsoft.com/download/en/details.aspx?DisplayLang=en&id=24745

When it asks you what to collect, you only need the most basic option, something like "Basic information" or "General information". When done, choose to "Open and view the result" - Windows Explorer will launch and open a folder with a file "Master Report.xml" and subfolder "General".

Go down to "General" folder and find the file "ComputerName-DOSDevices.TXT". Open it and you get your mapping list at the very top of the file.

Monday, 27 June 2011

Fragmentation on ext filesystems

Just had to increase the hard limit for a number of fragments per file on ext filesystem in our ReclaiMe data recovery software, from 64K to 256K. On NTFS, it is not uncommon to see 20,000 or so fragments per file, but ext just beats the hell out of NTFS as far as fragmentation is concerned.

Thursday, 23 June 2011

Difference between RAID 1+0 and RAID 0+1

RAID 10, RAID 1+0, and RAID 0+1 are all the same thing, except for implementation details.
All of these have the same data on disks; you cannot tell RAID 1+0 from RAID 0+1 if you just have the disks. Also, performance considerations and fault tolerance are the same for proper implementations of all of these, regardless of what Wikipedia says.

Monday, 13 June 2011

What happens if

What happens if you swap two disks in a RAID array while the system is powered down?
Say there are three disks A, B, and C on three channels 1, 2, and 3 respectively.
Change the configuration so that disk A is now on channel 2 and disk B is now on channel 1, then power on.

Most likely, nothing happens.

The outcome depends on the method the controller uses to identify the disks. Most if not all modern controllers identify their disks by writing some identification data onto the disks. This way, the controller can tell what the order of disks is by looking at the disks themselves. In earlier days, there were some controllers identifying the disks by ports the disks are connected to. These would be fooled by the swap. Modern controllers (in most cases) carry on just fine.

I just tested it with Silicon Image Sil 3114 and Promise SuperTrak EX4650, just in case.
Also, all modern software RAIDs (Linux md-raid and Windows LDM/Dynamic) will handle this just fine.

Sunday, 5 June 2011

Promise and RAID6

Promise RAID6 array, 4x WD EARS (Green) drives, NTFS-formatted. Copying files to the array maxes out at 2 MB/sec. Quite painful as we already lost about a week trying to create a meaningful test sample.

For reference, controller is Promise SuperTrak EX4650, flashed to the latest firmware.

Monday, 30 May 2011

Case-sensitive Filenames

Linux filenames are case-sensitive, and hence ext filesystems allow files which are only different by upper vs. lower case. Okay, this is to be expected. More interestingly, it is possible to have a directory Test and a file test within the same parent directory.

Does not look like we can resolve it gracefully, because under Windows we have to rename one of those. Obviously, it seems better to rename a file. However, this still breaks the program which used a file named test to contain some catalog of the folder named Test.

Wednesday, 25 May 2011

RAID6 progress revisited

The automatic RAID6 recovery algorithm currently in the works will most likely require at least N-1 disks for successful recovery. This is because the Reed-Solomon calculation depends on the order of disks, from which the chicken-and-egg problem results. If two disks are missing, the chicken-and-egg problem seemingly cannot be resolved automatically.

Even as it is with the N-1 requirement, the algorithm looks quite heavy CPU-wise. Looks like the RAID6 recovery implementation would be CPU-bound unlike all other RAID types, which are I/O-bound.
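The order dependence is easy to demonstrate. The sketch below assumes the common P+Q scheme (plain XOR for P, and Q computed in GF(2^8) with generator 2 and polynomial 0x11d, as used by Linux md); specific controllers may well use something different, and the function names are made up. One byte per disk is enough to show the effect.

```python
from functools import reduce

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result

def gf_pow(base, exponent):
    """Raise base to a non-negative integer power in GF(2^8)."""
    result = 1
    for _ in range(exponent):
        result = gf_mul(result, base)
    return result

def p_parity(data):
    """P: XOR of all data bytes; the order of disks does not matter."""
    return reduce(lambda x, y: x ^ y, data, 0)

def q_parity(data):
    """Q: XOR of g^i * D_i with g = 2; each disk is weighted by its position i."""
    q = 0
    for i, d in enumerate(data):
        q ^= gf_mul(gf_pow(2, i), d)
    return q

original = [0x11, 0x22, 0x33]
swapped = [0x22, 0x11, 0x33]   # the first two disks exchanged
print(hex(p_parity(original)), hex(p_parity(swapped)))
print(hex(q_parity(original)), hex(q_parity(swapped)))
```

P comes out the same for both orders, Q does not. So to use Q for reconstructing missing data you need the disk order, and to find the disk order you would like to have the data, hence the chicken-and-egg problem.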

Tuesday, 17 May 2011

Uh oh.

Current progress on RAID6 is that just now looking at the log files I found the latest 48-hour test run botched.

Monday, 9 May 2011

x64 data recovery

We have a 5TB raid 5 ... I think the MBR got messed up and windows server 2003 no longer is able to view the contents of the virtual disk... ... recovery software ... will fail due to the large amount of files we have, about 64 million files.
Some raid software will save the recovered files to another disk.. however 5TB is a large amount of data to save and we do not have any disks or other virtual disks that large.
(from ServerFault)

I suppose we can handle those 64 million files; should be interesting to test.

As far as pricing goes, 5 TB of swap space is just 3x 2TB WD Caviar Green in RAID 0. At a cost of 4x Caviar Greens you get the same swap space protected by RAID 5. At about $60 a pop, the entire lot would cost about $250. The initial cost of purchase is then further offset by putting the temporary drives to use as backup media.

Monday, 2 May 2011

Simple things first

What is the first troubleshooting step given the following situation?

A drive disappeared from Windows Explorer. In Disk Management the drive is being detected but Volume and Filesystem Type are blank. Everything else is OK (Simple, Basic, Healthy disk and Active, Primary partition).

Actually, some fancy suggestions were flying around, including, but not limited to, bad boot sectors, specific controller model issues, SATA 3 issues, and PSU problems.

The correct troubleshooting action was:

In Disk Management, use Change Drive Letters and Paths and assign the drive letter.


Monday, 25 April 2011

Difference between raw data and interpretation

One should prefer to look at and to work with raw data, if at all possible.

We got an example recently when an owner complained about a hard drive because SpeedFan reported Fitness around 25%. This does not sound good. However, closer examination revealed that SpeedFan's conclusion was based on an attribute with a value of 100 (raw value 0). There appeared to be no trouble at all from the raw SMART output. Upon further investigation I found that there are two versions of the firmware on the same model hard drive, one reporting a perfectly good condition as value 253 (raw 0), and the other reporting the same perfectly good condition as value 100 (raw 0). SpeedFan sees both values as originating from a single model, and decides 100 to be a failure indication.

The more complex the interpretation becomes, the more susceptible it is to all sorts of quirks and glitches. Unfortunately, the more complex the interpretation, the more human effort it takes to work from raw data instead, but that's another matter entirely.

Monday, 18 April 2011

Phantom drives in Disk Management

Occasionally, you can get a hard drive which is indicated as Missing in Windows Disk Management. The reason for this behavior is simple: the information about the dynamic disks is replicated across all hard drives on a PC. The dynamic disks are assigned to "disk groups", and every disk in a group holds a copy of information about all the other disks in that group.

This is useful in RAID setups: should a controller or channel fail, taking two or more RAID 5 members with it, the array is declared failed and cannot be mounted. Once the controller problem is resolved, the member disks are recognized again and the RAID can be brought back online.

The MBR-based (basic) disks, if removed, just disappear from the system, because these are not tracked.

Thursday, 7 April 2011

Moving RAID 1 between controllers

Is it possible to move a RAID 1 (mirror) set between controllers?

In RAID1, two disks are exact duplicates of each other as far as user data is concerned. The RAID controller metadata is probably disk-specific, but that does not practically matter.

Because the disks are identical, there is no point in moving the entire RAID 1. Move one disk, then make a mirror from it. The question should thus be restated as

Is it possible to use one member disk of RAID 1 with a different controller?

The answer depends on the controller in use. With most controllers, you can connect the RAID 1 member disk in whatever way you like and the data would be accessible. There are however some controllers which place their metadata before the user data on the member disks. With these, direct migration is not possible. Use either a controller of the same model or a compatible one, or use something like TestDisk or DiskPatch to do a partition recovery and make the filesystem mountable again.

Sunday, 3 April 2011

Models and feedback

The most significant problem we are facing now is the lack of feedback. As discussed earlier, the RAID recovery software works with models, not actual data. Now the difficulty is to determine how good these models are in real-world application. Relying on feedback from people does not look very promising. We'll consider incorporating some automatic dial-home system sending statistical info we can use to get a clearer view of things. The obvious alternative of buying a sample of all possible RAID controllers and NAS units for tests looks prohibitively expensive, both in terms of money and effort.

Thursday, 31 March 2011

Automatic recovery of RAID 5 with delayed parity

Recently we set out to develop an algorithm to automatically recover a RAID 5 array with delayed parity, as implemented by HP SmartArrays. Today we have mostly finalized the work. From this day on (build 306 onwards) ReclaiMe Free RAID Recovery is capable of recovering such arrays.

Monday, 28 March 2011

RAID 5 of two disks

The common wisdom says that RAID 5 requires a minimum of three disks.
If you think about it, that is not true.
You can create a RAID 5 of two disks. With even parity, the array becomes a mirror (RAID 1), and with odd parity (although nobody ever uses that), it becomes a RAID 4 of two disks.

And yes, the same applies to a RAID 4 of two disks.
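
The even-parity case can be checked with a minimal sketch, assuming parity is the plain byte-wise XOR of the data blocks in a row. The function name is made up for illustration. With two disks, each row holds a single data block, so the parity block comes out as a copy of it:

```python
from functools import reduce


def xor_parity(data_blocks):
    """Byte-wise XOR (even parity) of equally sized data blocks."""
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*data_blocks)
    )


# A row of a two-disk RAID 5: one data block plus one parity block.
data = b"some user data"
parity = xor_parity([data])
print(parity == data)  # the parity disk mirrors the data disk
```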

Saturday, 26 March 2011

More modeling issues

As discussed earlier, our ReclaiMe Free RAID Recovery software (and pretty much any generic RAID recovery software, like Runtime's RAID Reconstructor) does not actually work with RAIDs. It works with models of RAIDs instead. A data recovery service views a RAID5 more like "disks (by a specific vendor), controller (also by a specific vendor), and cables". The software sees it like

1 2 P
3 P 4
P 5 6

and pretty much nothing else.

The problem arises when the actual data does not match the model used in software. The synchronous array (below) cannot be described in terms of the asynchronous model (above).

1 2 P
4 P 3
P 5 6

So the software has to account for all possible models of the RAID 5, of which there are quite a number.
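
To make the difference between the two models concrete, here is a small sketch that reproduces both diagrams above. It assumes parity starts on the last disk and moves one disk to the left on every row. The function and parameter names are illustrative and not taken from any real tool:

```python
def raid5_layout(disks, rows, symmetric=False):
    """Return the block map of a RAID 5 as rows of labels.

    Assumption: parity starts on the last disk and moves one disk to
    the left on each row, as in the diagrams in the text.
    """
    layout = []
    block = 1
    for row in range(rows):
        parity_disk = (disks - 1 - row) % disks
        cells = ["P"] * disks
        if symmetric:
            # Data starts right after the parity disk and wraps around.
            order = [(parity_disk + 1 + i) % disks for i in range(disks - 1)]
        else:
            # Data fills the non-parity disks from left to right.
            order = [d for d in range(disks) if d != parity_disk]
        for disk in order:
            cells[disk] = str(block)
            block += 1
        layout.append(cells)
    return layout


for name, flag in (("asynchronous", False), ("synchronous", True)):
    print(name)
    for cells in raid5_layout(3, 3, symmetric=flag):
        print(" ".join(cells))
```

The two maps differ only in the order of data blocks within a row, which is why one model cannot stand in for the other.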

We're currently working on automatic analysis of the so-called delayed parity arrays, used in HP SmartArray controllers. This subtype of RAID5 does effectively have two distinct stripe sizes, one for data and the other (larger) one for parity, with the array looking like

1 2 P
3 4 P
5 P 6
7 P 8

Would be nice to account for that, because I'm not aware of anything capable of recovering this type of array automatically.
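
The delayed parity diagram above can be generated by a sketch like the following. It only reproduces the picture and makes no claim about the actual HP SmartArray on-disk format. The `delay` parameter (how many rows parity stays on the same disk) is a name introduced here for illustration:

```python
def delayed_parity_layout(disks, rows, delay):
    """Block map of a RAID 5 where parity stays put for `delay` rows.

    Assumption: parity starts on the last disk and moves one disk to
    the left after every `delay` rows; data fills the remaining disks
    from left to right.
    """
    layout = []
    block = 1
    for row in range(rows):
        parity_disk = (disks - 1 - row // delay) % disks
        cells = []
        for disk in range(disks):
            if disk == parity_disk:
                cells.append("P")
            else:
                cells.append(str(block))
                block += 1
        layout.append(cells)
    return layout


for cells in delayed_parity_layout(disks=3, rows=4, delay=2):
    print(" ".join(cells))
```

With `delay=1` the same function falls back to the ordinary layout with parity rotating on every row.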

Thursday, 24 March 2011

Even in 2011..

... people still believe you can get an electron microscope and recover overwritten data.
  1. It is not an electron microscope, it should be an MFM, Magnetic Force Microscope. Electron beams are no good against the hard drive platter. Probably no harm either, just useless.
  2. Even with MFM, no recovery of overwritten data on a modern hard drive is possible, because of various aspects of applied physics.

Sunday, 20 March 2011

"Write hole" in filesystems

Unsurprisingly, filesystems are also susceptible to damage when a power failure occurs during a write. The simplest example is a file being deleted. If the clusters are deallocated (marked free) first, and then a power failure occurs before the file record is removed, we get a file with its data stored in free clusters on the disk. If a new file is subsequently created and uses the same clusters, a cross-link situation occurs, potentially leading to data loss.

There are several ways around this problem.

Careful write ordering. The sequence of operations can be ordered in such a way that the damage due to the incomplete write is predictable, easy to repair, and confined to a single file. This is the cheapest option. It does not require any change to the on-disk structures if you want to implement it on the existing filesystem.

Multisector transfer protections (used e.g. in NTFS). If several sectors are to be written out as a group, each sector in a group stores a specific signature. When the group is later read, the driver verifies signatures in all sectors of the group. Should the signatures not match, the data is rejected as corrupt. This only allows for error detection, but not correction.

Intention logging is the most complex option, similar to database transaction logging. The filesystem driver ensures so-called atomicity of certain operations, meaning that the operation either completes entirely, or no change occurs at all. This option is implemented in most modern high-capacity filesystems, the most widespread being NTFS and ext3/ext4.
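
The multisector transfer protection described above can be sketched roughly as follows. This is a simplified illustration, not the actual NTFS on-disk format. It assumes each 512-byte sector reserves its last two bytes for a sequence number that is the same for all sectors written as one group, and all names are hypothetical:

```python
SECTOR = 512
PAYLOAD = SECTOR - 2  # assumption: the last two bytes hold the signature


def stamp_group(payloads, sequence):
    """Append the same 2-byte sequence number to every sector payload."""
    tag = sequence.to_bytes(2, "little")
    return [payload + tag for payload in payloads]


def group_is_consistent(sectors):
    """True if all sectors of the group carry the same signature."""
    return len({sector[-2:] for sector in sectors}) == 1


# Old contents of a two-sector group, then a new version of the same group.
old = stamp_group([bytes(PAYLOAD), bytes(PAYLOAD)], sequence=7)
new = stamp_group([b"\x01" * PAYLOAD, b"\x01" * PAYLOAD], sequence=8)

# Power fails after only the first sector of the new version reaches disk.
torn = [new[0], old[1]]
print(group_is_consistent(new), group_is_consistent(torn))
```

The check can tell that the group is torn, but it has no way to tell which version of the data should be restored, which is the detection-only limitation mentioned above.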

Wednesday, 16 March 2011

Write hole in RAID 1

Actually, RAID 1 has the same write hole problem as RAID 5 does. Should the power fail after one disk is updated, but the other is not yet updated, and then the first disk fails, there will be data corruption.

As usual, scheduled synchronizations of the array reduce the probability of this effect causing any practical trouble.

Sunday, 6 March 2011

Activities

Improved memory usage in ReclaiMe Free RAID Recovery (www.FreeRaidRecovery.com) by a factor of ten. The update is not yet live, but will be out shortly, probably no later than tomorrow. Surprisingly, there was not much loss of speed at all.

Saturday, 5 March 2011

Exotic RAID types

In RAID recovery, certain exotic RAID types can be recovered using the same basic algorithms, because these exotic RAID types can be reduced to one of the three basic types (RAID0, RAID1, and RAID5).
The list goes like this:
  • RAID 1+0 or 0+1 can either be reduced to RAID 0 by removing mirrors (near layout), or can be recovered as a RAID 0 straight away (far layout).
  • RAID 5E or RAID 6E can be recovered as a RAID 5 or RAID 6 respectively, because all the extra data is at the end of the array.
  • RAID 5EE can be recovered as RAID 6 with one of the sets of parity corrupt.
  • RAID 4 is actually a variation of RAID 5 where parity does not change position across rows.
The array types requiring special processing are RAID 1+0 using offset layout and RAID 1E.

Wednesday, 2 March 2011

Data recovery - probably cheaper than backup, ...

... but less than 100% reliable, though.

See this - http://www.technibble.com/forums/showthread.php?t=24345.
Given that you can literally burn the laptop in fire, and still recover all the significant data, why would anyone bother with backups at all?

Disclaimer: just kidding

Sunday, 27 February 2011

Hotspare vs. Hotswap

What is the difference between hotspare and hotswap?

Hotswap is the capability to replace a failed part with a spare without having to shut down the system. To resolve a problem, somebody still has to bring the spare part and do the actual work replacing it.

A hot spare is a spare part that is placed into the system at build time, in anticipation that something will eventually fail. The hot spare ties up a part which could otherwise be in use, but instead sits idle waiting for something to fail. On the bright side, when something fails, there is no need for human intervention because the spare part is already there and ready.

Thursday, 24 February 2011

On models in data recovery

Data recovery software (be it a filesystem parser or RAID recovery) does not work based on the actual data alone. The equally important ingredient is the model of the correct state of the device being recovered.

Take a RAID0 for example. The model of the RAID0 would include stripe size, disk order, and first disk. There are often some less than obvious requirements, like "a block size must be a power of two". This works just fine until someone decides to implement a software RAID with 3 sectors per block. The recovery software then fails because its internal model of a "correct" RAID does not match the reality any longer.
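
As a rough sketch of what such a model looks like in code, the following maps a sector of the array to a member disk using only the stripe size and the disk order (with the first disk at the head of the list). The names are illustrative and not taken from any real product:

```python
def raid0_locate(lba, block_sectors, disk_order):
    """Map an array sector number to (disk, sector on that disk).

    block_sectors is the stripe (block) size in sectors; disk_order
    lists the member disks in array order, starting with the first disk.
    """
    block, offset = divmod(lba, block_sectors)
    row, column = divmod(block, len(disk_order))
    return disk_order[column], row * block_sectors + offset


# A two-disk array where disk "B" is the first disk, 128 sectors per block.
print(raid0_locate(300, 128, ["B", "A"]))

# The arithmetic itself works with 3 sectors per block as well.
print(raid0_locate(7, 3, ["A", "B"]))
```

Nothing in the mapping needs a power-of-two block size. That requirement lives in the parameter search of the recovery software, which is exactly where the model and reality can part ways.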

Similarly with RAID5, the minimum practically useful model includes a notion of a possibly missing disk, to be reconstructed from the parity data. If you throw in a blank hot spare, the recovery fails because you just went outside of the design envelope: the model does not account for the possibility of a blank drive being included in the disk set for recovery.
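
A minimal sketch of the "possibly missing disk" part of the model, assuming plain XOR parity and one row of equally sized blocks, with `None` standing for the missing member. The function name is hypothetical:

```python
def rebuild_missing(blocks):
    """Rebuild the single missing block (None) in one RAID 5 row.

    In a healthy row the XOR of all blocks, data and parity alike, is
    zero, so the missing block equals the XOR of the blocks present.
    """
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("exactly one block must be missing")
    present = [block for block in blocks if block is not None]
    rebuilt = bytes(len(present[0]))
    for block in present:
        rebuilt = bytes(x ^ y for x, y in zip(rebuilt, block))
    result = list(blocks)
    result[missing[0]] = rebuilt
    return result


d1, d2 = b"\x01\x02", b"\x10\x20"
parity = bytes(x ^ y for x, y in zip(d1, d2))
print(rebuild_missing([d1, None, parity])[1] == d2)
```

Note that the sketch only knows "present" and "missing". A blank drive added to the set is present as far as this model is concerned, which is the design envelope problem described above.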

Monday, 21 February 2011

Seagate and Raw Read Error Rate

Seagate drives are known to report exaggerated S.M.A.R.T. data for Raw Read Error Rate. This is normal and should just be ignored.

Thursday, 17 February 2011

Images of disks in RAID recovery

In RAID recovery, if there is a controller failure, or a known software failure, there is no need to create images of the RAID member disks. In single-disk operations, it is often considered good practice to always make an image of the disk. With RAID, this may not be so easy, considering the sheer size of modern arrays.

Actually, if there is no reason to suspect physical damage to the RAID member disks, the imaging may be skipped altogether or put off until the decision is made to modify the data on the disks (possibly to revive the controller metadata).

Monday, 14 February 2011

Ratings instead of numbers

Circa 1998, AMD used a "Performance Rating" or "Pentium Rating" (PR) to indicate their CPUs' performance by comparing it to the then-current Intel Pentium. That was mostly because AMD could not deliver a CPU operating at frequencies matching those of Intel's, so they opted to move frequencies out of sight. Then, comparison shopping became a little messy. And btw that did not help AMD much.

Given this thread on AnandTech, looks like we might get a similar issue with SSD benchmarks. Not that I particularly care about SSD benchmarks.

Friday, 11 February 2011

Modern RAID recovery capabilities

Speaking of automatic RAID recovery software, there is still much to do.

In ReclaiMe Free RAID Recovery, we have the basics and classics pretty well covered, that includes RAID0, RAID5, and RAID 0+1/1+0 by reduction to RAID 0. There are a couple of other vendors out there who provide similar capabilities, so it is a done deal.

There are other RAID levels, for which neither we nor other vendors offer anything automatic. We could probably do something for E-series layouts (RAID 1E or 5E/EE), but we don't see a real demand for it. Automatic RAID 6 recovery looks more interesting and maybe we'd even give it a shot someday.

Also, all the current automatic RAID recovery tools rely on the RAID members having the same offset across all the physical disks. This works fine for hardware RAIDs, but can be a hindrance if you need a software RAID recovery. This requirement is not likely to go away in the near future because the computational power requirements to find offsets for array members exceed the available capabilities. Especially if we're talking about something practical like 6x 2TB hard drives in the array.

Tuesday, 8 February 2011

SSD, TRIM, and NTFS

Reading the article on NTCompatible as they test data recovery software and fail to recover data from TRIM-enabled SSD (which is pretty much the expected behavior), I see they're a little bit puzzled because some data would still remain even on a TRIM-enabled SSD.

Interestingly, some traces to specific data remained, and that's one oddity I don't quite understand.

The answer is actually pretty simple: these were NTFS resident files. On NTFS, when a file is deleted, its MFT record is marked "free, available for reuse", but never actually relinquished back to the free space, because NTFS uses MFT entry numbers internally to address parent-child relationships. Removing one entry would require the entire volume to be renumbered, which is cost-prohibitive.

So, the data outside the MFT is zero-filled immediately once the TRIM command is issued. The MFT entries, however, remain unchanged. This explains why they were able to get file names and correct file sizes, but the data was all zeros.

Now there is one special case called a resident file. If a file is small enough that the file name, attributes, and data all fit into a 1024-byte MFT record, the data is stored within the MFT record. This saves a little bit of disk space and, more importantly, saves one additional seek to get the data on a rotational hard drive.

Since the MFT entries are not relinquished into free space, and for a resident file the file data is stored within its MFT entry, it is possible to recover a resident file even on a TRIM-enabled SSD. However, this is of little practical use because only files smaller than approximately 800 bytes can become resident.

Saturday, 5 February 2011

Thursday, 3 February 2011

Intel's new chipset

Intel reports there is a flaw in the SATA controller on the Sandy Bridge chipset, causing functionality to degrade over time. I suppose "functionality" here actually means the ability to store data. They say
  • the flaw only affects 3 Gbps ports (SATA II), while 6 Gbps (SATA III) ports are OK, but I'd wait for further confirmation.
  • the revised chipset will hit the market April, 2011.
So we have quite a number of data-loss time bombs out there, and the number is still growing.

Wednesday, 2 February 2011

Even in 2011..

some people are still concerned about DoubleSpace and Stacker.



Do you still remember using MS DOS in production?

Bonus item: MFM hard drive interface in the same screenshot.

Sunday, 30 January 2011

How to enable root login in Ubuntu

Open a terminal, type sudo passwd root, enter your current password, then enter the new root password twice.
The system reports back "password updated successfully". Good.

Unfortunately, I did not figure out autologin.

Friday, 28 January 2011

Partition alignment

You can only align a partition (on a hardware block boundary) if the cluster size is an integral multiple of your hardware block size. This means you cannot align an HFS partition, because it would happily use 19 or 27 sectors per cluster. Most other filesystems would use a cluster size that is a power of two, matching the hardware block size.

Another thing is that if the boot sector and the data area use different alignments, it is the data area that you align. On FAT, there may be 1025 or whatever odd number of sectors between the boot sector and the first cluster of actual data. If you align the FAT partition's boot sector, the data would then be offset by one sector.
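
The data-area rule can be turned into a small calculation. The sketch below picks a partition start so that the first data cluster, not the boot sector, lands on a boundary. All names are hypothetical, sizes are in sectors, and it assumes the cluster size is itself a multiple of the alignment:

```python
def aligned_partition_start(min_start, data_offset, align):
    """Smallest start sector >= min_start that aligns the data area.

    data_offset is the number of sectors between the boot sector and
    the first cluster of data; align is the hardware block size in
    sectors.
    """
    remainder = (min_start + data_offset) % align
    if remainder == 0:
        return min_start
    return min_start + (align - remainder)


# FAT volume with 1025 sectors before the data area, 8-sector blocks.
start = aligned_partition_start(min_start=2048, data_offset=1025, align=8)
print(start, (start + 1025) % 8)
```

In this example the boot sector ends up on an odd boundary, which is fine, because it is the clusters that get read and written all the time.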

Tuesday, 25 January 2011

Multiple cluster sizes on a single volume?

it's been 10 years and I've yet to hear of a filesystem that uses multiple cluster sizes on the same volume. Otoh, I don't see what benefit that could possibly give.

With multiple cluster sizes on the same volume, you can do tailpacking.

Ext-series filesystems had a different block and fragment size, but never actually used it. Quite possible that was removed in ext4, although I'm not really sure.

OLE Structured Storage actually uses two different block sizes (small block and big block).

Saturday, 22 January 2011

The coincidences

The Skype website was slow to load today. Went to tweet about that only to find that "Twitter is over capacity". Kind of ironic.

Thursday, 20 January 2011

Is RAID5 faster?

Is RAID5 faster than a single disk?

On reads, yes.
On writes, sometimes.
On random writes, no.

Check the RAID Calculator if you need similar data on RAID 0 or RAID 6.

Monday, 17 January 2011

MBR/Boot in RAID

Which drive in a RAID stripe contains the MBR?

The first drive.

Bonus:

Which drive in a RAID stripe contains the boot sector?

On a hardware RAID, no specific location.
On a software RAID, the first drive.

RAID5:
In RAID5, if there are mostly zeros at the start of the volume, the parity block may contain an exact duplicate of the MBR/boot sector.

For other parameters, check the RAID 0 Recovery primer.

Friday, 14 January 2011

RAID is for servers only?

Not true.

As long as you do backups properly, you can use RAID on workstations (HTPCs, DIY servers, whatever) as you see fit. Depending on what RAID type you choose, it gives you a performance boost, more capacity, or some added convenience, or all of this at once.

If you do not do backups properly, it does not really matter whether you use RAID or not. Someday, you will end up evaluating our data recovery software. While it is rather good, admittedly not all cases of data loss are recoverable.

Tuesday, 11 January 2011

Mixing disks in RAID

Can I use one 10,000 RPM and one 7,200 RPM hard drive in RAID 1?

Yes. In RAID1, that would not cause any noticeable issues.

Saturday, 8 January 2011

Incompatibilities

Got a DVD burner under Vista that ejects the disc if a card reader with a memory card in it is plugged into a USB port. The burner is internal (SATA). The disc would not stay in (it gets ejected after about five seconds) until the card reader is disconnected. Go figure.

Friday, 7 January 2011

What they don’t tell you about all the new technology

Some new technologies have side effects which decrease data redundancy. Such technologies are:
  • NTFS compression
  • Windows Vista/7 vs. Windows NT/2000/XP complete format
  • TRIM on SSDs
  • ZFS deduplication
Initially, the effect of decreasing redundancy is unnoticeable, but once the technology becomes popular, the unintended consequences appear.
  • Volume-level NTFS compression never became widespread among home users.
  • As Windows Vista and then Windows 7 became widespread, the number of cases where data is lost irreversibly due to reinstallation increased. You can recover some data after an XP reinstallation, but not after a Vista/7 reinstallation with a complete format.
  • SSDs with TRIM are not widespread enough (yet) to notice that data recovery software does not properly work with them.
  • As for ZFS-based NASes, on the one hand they are not widespread among home users, and on the other hand even the first installations have not yet reached the end of their service life.

Monday, 3 January 2011

Deduplication

Data recovery software is based on the filesystem and user data redundancy.

In case of significant filesystem damage, data recovery software often has to rely on user data redundancy. Eliminating user data redundancy by compression (as in NTFS) or deduplication (as in ZFS) complicates automatic data recovery.