Showing posts from 2015

Drobo recovery

We now have what I think is the world's first commercial (for sale) Drobo recovery capability, included in our ReclaiMe Pro software. It comes in two parts: 1. reading Drobo metadata when it is intact; 2. partially rebuilding Drobo metadata when there is not enough of it left to read directly. There was a recent announcement on the HddGuru forum by some fellows doing remote Drobo recovery, but they are not selling their software; we sell software. What it works with: Drobo 5D, 5N, and FS (both NAS and DAS variants). Not yet tested with the B800i iSCSI variant of the Drobo.

Repeating questions

There is a set of questions which customers seem to repeat perpetually. We have a large fancy storage system which failed. Can we fix it in place, without copying data? No, you cannot. You absolutely have to copy the data away. Yes, this means purchasing a disk set of the same size. Yes, we know full well that it is 50 TB, meaning 13x 4 TB drives. Yes, plus a RAID controller for them. Can we have a RAID5 recovered when two disks failed? No, that won't work. OK, if we can't get the RAID recovered, maybe we can just get 2/3rds of the files, since we're only missing a little bit? No, that won't work either, because any file larger than the block size will be broken. So we can still get some small files? There are surprisingly few useful small files when the value of small is defined by the RAID controller. OK, maybe we can get our Word/Excel files and repair them using whatever DOC/XLS repair tool? No, that won't work. The data is missing, and when the data…
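To see why any file larger than the block size comes out broken, here is a toy model (my own illustration, not ReclaiMe code, and the simple round-robin block mapping is an assumption; real RAID5 layouts rotate parity differently, but the conclusion is the same):

```python
# Illustration: with one RAID5 member unreadable and parity unusable (two
# failed disks), every block stored on that member is lost. A file survives
# intact only if none of its blocks landed on the bad disk.

def blocks_lost(file_start, file_size, block_size, n_disks, bad_disk):
    """Count blocks of a file that fall on the bad member.

    Assumes a simple rotation where data block number b maps to disk
    (b % n_disks). Real layouts differ, but the point stands: any file
    spanning >= n_disks blocks is guaranteed to touch every disk.
    """
    first = file_start // block_size
    last = (file_start + file_size - 1) // block_size
    return sum(1 for b in range(first, last + 1) if b % n_disks == bad_disk)

# A 4 KB file in a 64 KB-block, 4-disk array may get lucky and survive...
small = blocks_lost(file_start=0, file_size=4096, block_size=65536,
                    n_disks=4, bad_disk=2)
# ...but a 1 MB file spans 16 blocks across 4 disks and always loses some:
big = blocks_lost(file_start=0, file_size=1 << 20, block_size=65536,
                  n_disks=4, bad_disk=2)
print(small, big)  # 0 4
```

So "small" really means smaller than one controller block, which is why so few useful files qualify.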

Things to avoid in data recovery

This is a short list compiled based on tech support experience, without much explanation. Avoid: USB-to-SATA converters (bad reliability); Marvell chipsets (bad handling of bad sectors); Silicon Image controllers or any RAID cards based on them (after SIL_QUIRK_MOD15WRITE; even if SIL_QUIRK_MOD15WRITE does not apply to your particular controller, that is not an excuse to use Sil); nVidia chipsets (bad reliability of disk controllers under load).

Timestamp and other metadata reliability

In forensics, it turns out, it is important to know timestamps reliably. In older filesystems, like NTFS and FAT, you either have a timestamp (if the record is intact) or you don't (if the record is overwritten). Now, CoW filesystems like ReFS and BTRFS produce a whole lot of different versions of metadata records: do you want a generation 3 timestamp or a generation 8 timestamp? Considering that metadata generation numbers (as used for timestamps) do not necessarily match file pointer generation numbers, there seems to be no way to get forensically reliable timestamps on modern filesystems. This is probably something worth looking into.


Been answering a support query recently, and mentioned to the client that USB is outright bad in all respects [for data recovery use]. Well, pretty much so: if one of the drives has a bad block, the USB converter will quite likely lock up on hitting that block; with USB 2.0, speed is 15 MB/sec maximum for all drives combined, and even if you have what appear to be different ports, they will be routed through the same root port or hub anyway; devices advertised as USB 3.0 often work at 2.0 speeds, with no warning whatsoever; power supply issues and limitations are difficult to control, especially if hubs are involved; and any setup with daisy-chained hubs is unstable, especially with USB 3.0. So, think twice before starting a recovery with a laptop-based all-USB setup.
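A quick back-of-the-envelope calculation (my numbers, beyond the 15 MB/sec figure from the post) shows what that shared bandwidth means for imaging times:

```python
# Imaging time when all drives share one USB 2.0 root port at ~15 MB/s.

def imaging_hours(drive_tb, n_drives, shared_mb_per_s=15):
    """Hours to image n drives in parallel over one shared 15 MB/s link."""
    total_mb = drive_tb * n_drives * 1_000_000  # 1 TB = 10^6 MB (decimal)
    return total_mb / shared_mb_per_s / 3600

print(round(imaging_hours(4, 1)))   # one 4 TB drive: ~74 hours
print(round(imaging_hours(4, 4)))   # four drives, same shared link: ~296 hours
```

Three days for a single drive, and adding drives to "parallel" ports does not help, because they all funnel through the same root port.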

RAID block size limiters

If you are doing a RAID recovery and the software has the capability to limit the allowed block sizes for the search (which is quite common, actually: ReclaiMe Pro has it, Runtime has it, ZAR has it, and perhaps R-Studio does too), and if you happen to know the block size exactly, do not set the limiter to the exact block size. If you know the block size is 128 of whatever units, set the limits to 64 low and 256 high (in the same units, repeat, the same units). Otherwise, if the automatic detection gets you a value at one of the edges of the range, you do not know if it is because the value is correct, or because detection hit the limit and was not able to change the block size any further. The final block size must be inside the allowed range, not on the edge.
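The rule can be sketched as a simple check (hypothetical names, not any vendor's actual API):

```python
# After automatic block-size detection, trust the result only if it landed
# strictly inside the allowed search range, not on either edge.

def detection_trustworthy(detected, low, high):
    """True if the detected block size is strictly inside (low, high).

    A value equal to low or high is suspect: it may be correct, or the
    detector may simply have been stopped by the limiter.
    """
    return low < detected < high

# Known block size 128 (in whatever units): search 64..256, not 128..128.
print(detection_trustworthy(128, 64, 256))   # True  - safely inside
print(detection_trustworthy(256, 64, 256))   # False - hit the upper limit
print(detection_trustworthy(64, 64, 256))    # False - hit the lower limit
```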

Fault tolerance in storage systems

When someone says RAID5 is fault-tolerant, that is not meaningful enough. The specific implementation must be named. The set of anticipated failures must be listed. For each of the anticipated failures, the extent of degradation must be specified. So, a generic implementation of RAID5 does not lose data when exactly one drive fails. This does not say anything about performance and, more generally, data availability. Another example: a generic NAS does not lose data if its network connection fails. However, the data is unavailable until the connection is fixed in some way or other. So, when talking fault tolerance, don't forget to include at least the set of anticipated failures.

Two different definitions of fragmentation

There are two distinct definitions of file fragmentation on a filesystem. 1. The file is fragmented if reading the entire file requires a head seek after the first data block has been read. This definition mostly concerns read performance on rotational drives. A head seek is slow, so it is something you want to avoid. If the file is sparse, i.e. has a large area full of zeros in it, the filesystem will not store the zeros. Instead, only the non-zero start and end of the file are stored. From the performance standpoint, that's fine. The filesystem driver will read the first part of the data, then generate as many zeros as required, and continue reading the last part of the data without ever making a head seek. If the file contains some metadata (e.g. a ReFS page table) embedded in the content, that is also fine from the performance standpoint. The ReFS driver will read the file data, and at some point the page table will be needed. It conveniently happens that the page table occupies the nex…

X-RAID2 with two drives

Just thought I would clarify one misconception about X-RAID2: if there are two identical drives, then the array is in fact RAID1. No, not exactly. The array may be RAID1, but there are cases when there are two (or more) RAID1s, combined by LVM. This happens when a two-disk set is upgraded by replacing the two drives with larger ones. If the array never had its drives replaced, it is indeed a single RAID1.
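A worked example of the upgraded layout (my own simplified model of what is described above, not Netgear's actual code):

```python
# After both drives of a two-disk X-RAID2 set are replaced with larger
# ones, the usable space is two RAID1s concatenated by LVM: one sized like
# the old drives, one covering the newly added space.

def xraid2_two_disk_capacity(old_size, new_size):
    """Return (usable capacity, number of RAID1 segments), sizes in GB."""
    segments = [old_size]          # first RAID1, sized like the old disks
    if new_size > old_size:
        segments.append(new_size - old_size)  # second RAID1 on new space
    return sum(segments), len(segments)

# 2x 1 TB set upgraded to 2x 3 TB drives:
cap, n_raid1s = xraid2_two_disk_capacity(old_size=1000, new_size=3000)
print(cap, n_raid1s)  # 3000 2 - 3 TB usable, spread over two RAID1s
```

This is why a recovery tool pointed at such a set must reassemble LVM on top of the RAID1s, not just mount one mirror.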

NAS recovery training

What we see in the support requests is that people have difficulties recovering their (or their customers') NASes. For the duration of December 2014, more than half of the support queries associated with ReclaiMe software were in some way related to a NAS. So we decided to put off filesystems for a while and do a solid course on NAS recovery, starting with initial data collection, through partitioning schemes like FlexRAID or SHR (Synology Hybrid RAID), and on to file extraction. This should be available on our training site eventually (hopefully by the end of February).

EXT3 undelete

As you probably know, there are three basic variants of the EXT filesystem as far as undelete is concerned, depending on whether extents and journaling are used: no extents and no journaling - EXT2; journaling but no extents - EXT3; both journaling and extents - EXT4. Journaling, for whatever fancy filesystem development reasons, requires inodes to be cleared (overwritten with zeros) when files are deleted. However, we now have an undelete capability for EXT3 (where extents are not used) in our ReclaiMe Pro software. The undelete requires a full scan of the disk to work, and file names cannot be recovered (because the inodes are gone), but otherwise it works fairly decently.
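The three cases above can be summarized in a small helper (my own quick-reference sketch, not ReclaiMe code):

```python
# Classify an EXT variant by its journaling/extents features and note
# what that combination means for undelete.

def ext_undelete_notes(journaling, extents):
    if not journaling and not extents:
        return ("EXT2", "inodes survive deletion; classic undelete works")
    if journaling and not extents:
        return ("EXT3", "inodes zeroed on delete; full-scan undelete "
                        "possible, but no file names")
    if journaling and extents:
        return ("EXT4", "inodes and their extent trees zeroed on delete")
    return ("unknown", "extents without journaling is not a standard EXT")

print(ext_undelete_notes(journaling=True, extents=False)[0])  # EXT3
```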