Thursday, 12 November 2015

Drobo recovery

We now have what I believe is the world's first commercial (for sale) Drobo recovery capability, included in our ReclaiMe Pro software. This comes in two parts,

1. read Drobo metadata when it is readable;
2. partially rebuild Drobo metadata when there is not enough of it left readable;

and we have it in for-sale software.

There was a recent announcement on the HddGuru forum by some fellows doing remote recoveries, but they are not selling their software; we sell software.

What does it work with? Drobo 5D/5N/FS (DAS and NAS variants). Not yet tested with the B800i iSCSI variant of the Drobo.


Wednesday, 24 June 2015

Repeating questions

There is a set of questions which customers seem to repeat perpetually:


We have a large fancy storage system which failed. Can we fix it in place, without copying data?

No, you cannot. You absolutely have to copy the data off. Yes, this means purchasing a disk set of the same size. Yes, we know full well that it is 50TB, meaning 13x 4TB drives. Yes, plus a RAID controller for it.

Can we have a RAID5 recovered when two disks failed?

No, that won't work.

OK, if we can get the RAID recovered, maybe we can just get 2/3rds of the files, since we're only missing a little bit?

No, that won't work either, because any file larger than the block size will be broken.

So we still can get some small files?

There are surprisingly few useful small files when the value of small is defined by the RAID controller.

OK maybe we can get our Word/Excel files and repair them using whatever DOC/XLS repair tool?

No, that won't work. The data is missing, and when the data is missing, no fancy repair tool can reconstruct it.



Saturday, 6 June 2015

Things to avoid in data recovery

This is a short list compiled from tech support experience, without much explanation.

Avoid
  • USB-to-SATA converters (for bad reliability);
  • Marvell chipsets (for bad handling of bad sectors);
  • Silicon Image controllers or any RAID cards based on these (after SIL_QUIRK_MOD15WRITE; even if SIL_QUIRK_MOD15WRITE does not apply to your controller, this is not an excuse to use Sil);
  • nVidia chipsets (for bad reliability of disk controllers under load)

Thursday, 28 May 2015

Timestamp and other metadata reliability

In forensics, it turns out to be important to know timestamps reliably. On older filesystems, like NTFS and FAT, you either have a timestamp (if the record is intact) or you don't (if the record is overwritten). Now, CoW filesystems like ReFS and BTRFS produce a whole lot of different versions of metadata records - do you want a generation 3 timestamp or a generation 8 timestamp? Considering that metadata generation numbers (as used for timestamps) do not necessarily match file pointer generation numbers, there seems to be no way to get forensically reliable timestamps on modern filesystems. This is probably something worth looking into.

Wednesday, 15 April 2015

USB

I was answering a support query recently, and mentioned to the client that USB is outright bad in all respects [for data recovery use].

Well, pretty much so,
  • if one of the drives has a bad block, the USB converter will quite likely lock up on hitting that block;
  • with USB 2.0, the speed is 15 MB/sec maximum, for all drives combined,
    • even if you have what appear to be different ports, they will be routed through the same root port or hub anyway;
  • devices advertised as USB 3.0 often work at 2.0 speeds, with no warning whatsoever;
  • power supply issues and limitations are difficult to control,
    • especially so if hubs are involved;
  • any setup with daisy-chained hubs is unstable,
    • especially so with USB 3.0.

So, think twice before starting a recovery with a laptop-based all-USB setup.

Tuesday, 14 April 2015

RAID block size limiters

If you are doing a RAID recovery and the software has the capability to limit the allowed block sizes for the search (which is quite common, actually; ReclaiMe Pro has it, Runtime has it, ZAR has it, and perhaps R-Studio does too), and if you happen to know the block size exactly, do not set the limiter to the exact block size.

If you know the block size is 128 in whatever units, set the limits to 64 low and 256 high (in the same units; repeat, the same units). Otherwise, if the automatic detection gets you a value at one of the edges of the range, you do not know if it is because the value is correct, or because detection hit the limit and was not able to change the block size further. The final block size must be inside the allowed range, not on the edge.
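The edge rule above can be stated as a one-line check. This is a hypothetical helper for illustration only (the function name and units are mine, not from any of the products mentioned): a detected block size is trustworthy only if it lands strictly inside the allowed range.

```python
def check_detected_block_size(detected_kb, low_kb, high_kb):
    """Return True if the detected block size is trustworthy.

    A value sitting exactly on either limit is suspect: it may be
    correct, or the search may simply have been clamped there.
    """
    return low_kb < detected_kb < high_kb

# Known block size: 128 KB. Bracket it with 64..256 as suggested above.
print(check_detected_block_size(128, 64, 256))  # inside the range: trustworthy
print(check_detected_block_size(64, 64, 256))   # on the edge: suspect
```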

Friday, 20 March 2015

Fault tolerance in storage systems

When someone says RAID5 is fault-tolerant, that alone is not meaningful enough.

  1. The specific implementation must be named.
  2. The set of anticipated failures must be listed.
  3. For each of the anticipated failures, the extent of degradation must be specified.

So, a generic implementation of RAID5 does not lose data when exactly one drive fails. This says nothing about performance or, more generally, data availability. Another example: a generic NAS does not lose data if its network connection fails; however, the data is unavailable until the connection is fixed in some way or other.

So, when talking fault tolerance, don't forget to include at least the set of anticipated failures.

Monday, 19 January 2015

Two different definitions of fragmentation

There are two distinct definitions of file fragmentation on a filesystem.

1. The file is fragmented if reading the entire file requires a head seek after the first data block has been read. 

This definition mostly concerns read performance on rotational drives. A head seek is slow, so it is something you want to avoid.

If the file is sparse, i.e. has a large area full of zeros in it, then the filesystem will not store the zeros. Instead, only the non-zero start and end of the file are stored. From the performance standpoint, that's fine. The filesystem driver will read the first part of the data, then generate as many zeros as required, and continue reading the last part of the data without ever making a head seek.
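You can see the "zeros are not stored" behavior on a live system. A minimal Python sketch, assuming a filesystem with sparse-file support (ext4, XFS, NTFS and most modern ones qualify): create a mostly-hole file and compare the apparent size with the actually allocated space.

```python
import os
import tempfile

# Create a ~1 MiB file that is mostly a hole: a short header, a long
# run of zeros the filesystem need not store, and a short footer.
path = os.path.join(tempfile.mkdtemp(), "sparse.bin")
with open(path, "wb") as f:
    f.write(b"HEADER")
    f.seek(1024 * 1024)        # skip ahead, leaving a hole
    f.write(b"FOOTER")

st = os.stat(path)
print("apparent size:", st.st_size)      # 1 MiB + 6 bytes
print("allocated:", st.st_blocks * 512)  # far less, if the hole is not stored
```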

If the file contains some metadata (e.g. a ReFS page table) embedded in its content, that is also fine from the performance standpoint. The ReFS driver will read the file data, and at some point the page table will be needed. It conveniently happens that the page table occupies the next cluster after the file data, so the driver will read the page table, analyze it, and continue reading the file data.

2. The file is fragmented if an extent of data, starting at the first data block and of the same size as the file, does not match the file content when read directly from the disk.

Now this definition is about recoverability. From the data recovery standpoint, the file is not fragmented if it can be recovered without any metadata, by finding its header and reading an extent of the appropriate size from the disk, starting at the header.

If the file is sparse, you cannot read it by just taking the required amount of disk data starting at the file header. You must insert some unspecified amount of zeros between the start and end of the file.

Also, if the metadata is embedded within the file content, as is often the case with ReFS, the recovered file will contain ReFS page tables, rendering the file useless.
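A toy simulation of definition 2 makes the point. This is purely illustrative - the layout and sizes are made up and are not actual ReFS structures: a metadata page sits between two halves of the file content, so a naive contiguous carve of file-sized length picks up the metadata page instead of the real tail of the file.

```python
# Toy "disk image": file content with a filesystem metadata page
# interleaved mid-stream. Naive carving (reading a contiguous extent
# of file-sized length starting at the header) fails.
content = b"A" * 4096 + b"B" * 4096          # the real 8 KiB file
metadata_page = b"\xEE" * 4096               # embedded "page table"
image = content[:4096] + metadata_page + content[4096:]

carved = image[0:len(content)]               # contiguous read per definition 2
print(carved == content)                     # False: fragmented by definition 2
```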

Sunday, 11 January 2015

X-RAID2 with two drives

Just thought I would clarify one misconception about X-RAID2.

If there are two identical drives, then the array is in fact RAID1.

No, not exactly. The array may be RAID1, but there are cases when there are two (or more) RAID1s combined by LVM. This happens when a two-disk set is upgraded by replacing both drives with larger ones. If the array never had its drives replaced, it is indeed a single RAID1.

Monday, 5 January 2015

NAS recovery training

What we see in the support requests is that people have difficulties recovering their (or their customers') NASes. During December 2014, more than half of the support queries associated with ReclaiMe software were in some way related to a NAS. So we decided to put off filesystems for a while and do a solid course on NAS recovery, starting with initial data collection, through partitioning schemes like FlexRAID or SHR (Synology Hybrid RAID), and on to file extraction. This should be available on our training site eventually (hopefully by the end of February).

Sunday, 4 January 2015

EXT3 undelete

As you probably know, there are three basic variants of the EXT filesystem with respect to undelete, depending on whether extents and journaling are used:
  • no extents and no journaling - EXT2;
  • journaling only, but no extents - EXT3;
  • both journaling and extents - EXT4.
Journaling, for whatever fancy filesystem development reasons, requires inodes to be cleared (overwritten with zeros) when files are deleted.

However, we now have an undelete capability for EXT3 (where extents are not used) in our ReclaiMe Pro software. The undelete requires a full scan of the disk to work, and file names cannot be recovered (because the inodes are gone), but otherwise it works fairly decently.
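To illustrate the cleared-inode problem, here is a minimal Python sketch that parses the classic 128-byte ext2/ext3 on-disk inode (offsets per the documented ext2 layout: deletion time at offset 20, fifteen 4-byte block pointers starting at offset 40) and checks whether a deleted inode still has usable block pointers. The helper name and the fabricated inode bytes are mine, for illustration only.

```python
import struct

def deleted_inode_recoverable(inode: bytes) -> bool:
    """Check a 128-byte ext2/ext3 inode: deleted, with block pointers intact?

    Under EXT2 a deleted inode usually keeps its i_block pointers;
    under EXT3 journaling zeroes them, forcing a full-disk scan instead.
    """
    dtime = struct.unpack_from("<I", inode, 20)[0]    # i_dtime, deletion time
    blocks = struct.unpack_from("<15I", inode, 40)    # i_block[15] pointers
    return dtime != 0 and any(blocks)

# EXT2-style deleted inode: dtime set, block pointers kept.
ext2_inode = bytearray(128)
struct.pack_into("<I", ext2_inode, 20, 1420000000)    # deletion timestamp
struct.pack_into("<I", ext2_inode, 40, 12345)         # first data block pointer
print(deleted_inode_recoverable(bytes(ext2_inode)))   # True

# EXT3-style deleted inode: dtime set, pointers zeroed by journaling.
ext3_inode = bytearray(128)
struct.pack_into("<I", ext3_inode, 20, 1420000000)
print(deleted_inode_recoverable(bytes(ext3_inode)))   # False
```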