Tuesday, 31 July 2012

Hindsight in filesystem design


NT4 (NTFS 1.2) did not store MFT record numbers inside the records themselves. If the MFT became fragmented, you ended up with millions of file records linked into a tree by record number, but it was difficult to establish which number belonged to which record. The algorithms for detecting record numbers were complex, unstable, and resource-hungry. Starting with Windows XP, the MFT record number is stored inside the record itself, which, as planned, facilitates data recovery.
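The change is visible right in the FILE record header. A minimal sketch, assuming the commonly documented layout where the update sequence array starts at offset 0x2A in NT4-era records and at 0x30 from XP onward, with a 32-bit record number sitting at 0x2C:

```python
import struct

def mft_record_number(record: bytes):
    """Return the MFT record number stored in a FILE record,
    or None for the old NT4-style header that has no number.

    Sketch only: assumes the usual header layout, where the
    update sequence array offset (u16 at 0x04) is 0x2A on NT4
    and 0x30 on XP and later, leaving room for a u32 record
    number at offset 0x2C.
    """
    if record[:4] != b"FILE":
        raise ValueError("not an MFT FILE record")
    usa_offset = struct.unpack_from("<H", record, 0x04)[0]
    if usa_offset < 0x30:       # NT4-style header: no number stored
        return None
    return struct.unpack_from("<I", record, 0x2C)[0]
```

On a hybrid volume a scanner would get a mix of both results, which is exactly why the old detection algorithms had to stay around for years.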

When a volume was upgraded, for example when a volume created under NT4 was connected to Windows XP for the first time, all records remained in the old format, without numbers. A record was brought into the new format only when the file it referred to was changed. So for several years thereafter we kept encountering hybrid volumes. When users finally stopped formatting disks under NT4, all the records became numbered, and data recovery software eventually dropped support for the old NTFS version.

The old ext2 implementation had a similar quirk: a superblock copy did not contain the number of the block group it belonged to. Thus, the redundancy the designers had planned for could not be used. If the first superblock failed, it was still possible to find all the remaining superblock copies, but even knowing all of them it was impossible to tell which copy came from which group, and therefore where the volume, and the files on it, started. In newer versions of ext2, each superblock copy contains its group number, which simplifies data recovery significantly.
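With the group number in place, a single surviving copy pins down the start of the volume. A rough sketch, assuming the standard ext2 superblock layout (magic 0xEF53 at offset 0x38, s_first_data_block at 0x14, s_blocks_per_group at 0x20, s_block_group_nr at 0x5A) and that each backup copy sits in the first block of its group:

```python
import struct

EXT2_MAGIC = 0xEF53

def locate_volume_start(sb: bytes, sb_block: int) -> int:
    """Given a superblock copy and the absolute block number where
    it was found, return the block where the volume starts.

    Sketch of the modern-ext2 case: relies on s_block_group_nr
    (u16 at offset 0x5A), which older ext2 left unset in every copy,
    making this calculation impossible.
    """
    if struct.unpack_from("<H", sb, 0x38)[0] != EXT2_MAGIC:
        raise ValueError("not an ext2 superblock")
    first_data_block = struct.unpack_from("<I", sb, 0x14)[0]
    blocks_per_group = struct.unpack_from("<I", sb, 0x20)[0]
    group_nr = struct.unpack_from("<H", sb, 0x5A)[0]
    # the copy sits at the first block of group number group_nr
    return sb_block - (first_data_block + group_nr * blocks_per_group)
```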

Thursday, 26 July 2012

Storage Spaces

Been doing some research on Storage Spaces recovery lately and learned two important things:

  1. If you delete a Storage Spaces pool, the pool layout information is well and truly gone, overwritten with zeros.
  2. Once the pool layout is lost, if you've been using thin provisioning, you're almost certainly dead in the water; if not, recoverability depends on how the volumes and filesystems were created and deleted.

We are trying to make a working prototype to recover thin-provisioned volumes from a deleted pool, but it is still weeks away from any usable result.

Friday, 13 July 2012

24x 1TB RAID

RAID 10 or RAID 6? - from this topic at Hardware Analysis.

The question is which one I would choose for total data protection. Quite typically, the answer is neither, although that option never came up in the original thread.

What would be the correct setup, and why? The correct setup would be to split the drives into two systems with separate power supplies and a LAN connection between them. The first system is hardware RAID 5 (11+1 disks) or RAID 10 (2x6 disks), depending on what you need; the second is a backup machine with 12 drives on software RAID (because, per the original question, only one hardware controller is available).
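For reference, the usable-capacity arithmetic behind these choices can be tallied with a quick sketch (rough figures, ignoring formatting overhead and hot spares):

```python
def usable_tb(n_disks: int, disk_tb: float, level: str) -> float:
    """Approximate usable capacity for common RAID levels.
    Rough sketch: real arrays lose a little more to metadata."""
    if level == "raid5":
        return (n_disks - 1) * disk_tb   # one disk's worth of parity
    if level == "raid6":
        return (n_disks - 2) * disk_tb   # two disks' worth of parity
    if level == "raid10":
        return n_disks // 2 * disk_tb    # half the disks are mirrors
    raise ValueError(level)
```

So 12 drives give roughly 11 TB as RAID 5 against 6 TB as RAID 10, while a single 24-drive RAID 6 would give about 22 TB, with no backup at all.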

Now, if the backup system is normally shut down with its plug pulled from the wall, and is only connected once a week while the backup runs, the data is protected against most threats (save for fire and natural disasters).

And as far as boot time goes, no: you'd rather design the system so it boots from a RAID 1, or maybe even a plain vanilla drive.

Thursday, 5 July 2012

Reducing opportunity for human error

Based on the number of incidents where people mistake RAID 5 for RAID 10 and vice versa, which happens even to people who are supposed to know better, we will soon be adding automatic detection for these cases. This way, the possibility of confusion will be eliminated.

Sunday, 1 July 2012

Probably the first real-life ReFS recovery

We have probably got the first real-life (as in, this is not a drill) data recovery involving the ReFS filesystem. There are still some issues to work out as far as speed is concerned, but ReclaiMe did remarkably well. Remember that the filesystem itself has not yet reached production status; it is still a release candidate.