Posts

Showing posts from February, 2011

Hotspare vs. Hotswap

What is the difference between hotspare and hotswap?

Hotswap is the capability to replace a failed part with a spare without having to shut down the system. To resolve a problem, somebody still has to bring the spare part and do the actual work of replacing the failed one.

A hot spare is a spare part that is placed into the system at build time, in anticipation that something will eventually fail. The hot spare ties up a part which could otherwise be put to use, but instead sits idle waiting for something to fail. On the bright side, when something fails, there is no need for human intervention because the spare part is already there and ready.

On models in data recovery

Data recovery software (be it a filesystem parser or RAID recovery) does not work based on the actual data alone. An equally important ingredient is the model of the correct state of the device being recovered.

Take a RAID0 for example. The model of a RAID0 would include stripe size, disk order, and first disk. There are often some less than obvious requirements, like "a block size must be a power of two". This works just fine until someone decides to implement a software RAID with 3 sectors per block. The recovery software then fails because its internal model of a "correct" RAID no longer matches reality.
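To make the idea of a model concrete, here is a minimal Python sketch of a RAID0 model. The class name, field names, and the `require_power_of_two` switch are hypothetical and only illustrate the point above; this is not how any particular recovery tool is implemented. The disk order tuple also encodes the first disk (it is simply the first element).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Raid0Model:
    """Hypothetical RAID0 model: the parameters recovery software must know."""
    stripe_sectors: int                 # sectors per block on a single disk
    disk_order: tuple                   # member disk IDs, first disk first
    require_power_of_two: bool = False  # the "less than obvious" requirement

    def __post_init__(self):
        s = self.stripe_sectors
        if s <= 0:
            raise ValueError("block size must be positive")
        # A power of two has exactly one bit set, so s & (s - 1) is zero.
        if self.require_power_of_two and s & (s - 1) != 0:
            raise ValueError("model only accepts power-of-two block sizes")

    def locate(self, lba):
        """Map an array sector number to (disk ID, sector on that disk)."""
        block, offset = divmod(lba, self.stripe_sectors)
        row, column = divmod(block, len(self.disk_order))
        return self.disk_order[column], row * self.stripe_sectors + offset


# A general model handles 3 sectors per block without trouble.
model = Raid0Model(stripe_sectors=3, disk_order=("B", "A"))
location = model.locate(7)  # third block, second sector within the block
```

With `require_power_of_two=True`, the same 3-sector array is rejected before any data is read, which is the kind of failure described above: the data is fine, the model is too narrow.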

Similarly with RAID5, the minimum practically useful model includes the notion of a possibly missing disk, to be reconstructed from the parity data. If you throw in a blank hot spare, the recovery fails because you just went outside of the design envelope - the model does not account for the possibility of a blank drive being included into the dis…
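The "possibly missing disk" part of a RAID5 model can be sketched in a few lines of Python. The function names are hypothetical, and the sketch covers a single stripe row only; it ignores parity rotation and disk order, which a real model also has to describe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe_row):
    """Rebuild the one missing block (marked None) in a RAID5 stripe row.

    In RAID5 the parity block is the XOR of the data blocks, so any single
    missing block equals the XOR of all the remaining blocks in the row.
    """
    missing = [i for i, block in enumerate(stripe_row) if block is None]
    if len(missing) != 1:
        raise ValueError("this model allows exactly one missing disk")
    present = [block for block in stripe_row if block is not None]
    return xor_blocks(present)


d0 = b"\x01\x02"
d1 = b"\x10\x20"
parity = xor_blocks([d0, d1])
recovered = rebuild_missing([d0, None, parity])  # gives back the content of d1
```

Note that the model only knows two kinds of member disks: present and missing. A disk that is present but was never part of the data layout does not fit either category.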

Seagate and Raw Read Error Rate

Seagate drives are known to report exaggerated S.M.A.R.T. values for Raw Read Error Rate. This is well-known, normal, and should just be ignored.

Images of disks in RAID recovery

In RAID recovery, if there is a controller failure, or a known software failure, there is no need to create images of the RAID member disks. In single-disk operations, it is often considered good practice to always make an image of a disk. With RAID, this may not be so easy, considering the sheer size of modern arrays.

Actually, if there is no reason to suspect physical damage to the RAID member disks, the imaging may be skipped altogether or put off until the decision is made to modify the data on the disks (possibly to revive the controller metadata).

Ratings instead of numbers

Circa 1998, AMD used a "Performance Rating" or "Pentium Rating" (PR) to indicate their CPU's performance by comparing it to the then-current Intel Pentium. That was mostly because AMD could not deliver a CPU operating at frequencies matching those of Intel's, so they opted to move frequencies out of sight. Then, comparison shopping became a little messy. And btw that did not help AMD much.

Given this thread on AnandTech, it looks like we might get a similar issue with SSD benchmarks. Not that I particularly care about SSD benchmarks.

Modern RAID recovery capabilities

Speaking of automatic RAID recovery software, there is still much to do.

In ReclaiMe Free RAID Recovery, we have the basics and classics pretty well covered; that includes RAID0, RAID5, and RAID 0+1/1+0 by reduction to RAID 0. There are a couple of other vendors out there who provide similar capabilities, so it is a done deal.

There are other RAID levels for which neither we nor other vendors offer anything automatic. We could probably do something for E-series layouts (RAID 1E or 5E/EE), but we don't see a real demand for it. Automatic RAID 6 recovery looks more interesting and maybe we'd even give it a shot someday.

Also, all the current automatic RAID recovery tools rely on the RAID members having the same offset across all the physical disks. This works fine for hardware RAIDs, but can be a hindrance if you need a software RAID recovery. This requirement is not likely to go away in the near future because the computational power requirements to find offsets for array members…

SSD, TRIM, and NTFS

Reading the article on NTCompatible, in which they test data recovery software and fail to recover data from a TRIM-enabled SSD (which is pretty much the expected behavior), I see they're a little bit puzzled because some data still remains even on a TRIM-enabled SSD.

Interestingly, some traces to specific data remained, and that's one oddity I don't quite understand.

The answer is actually pretty simple - these were NTFS resident files, that is, small files whose content is stored inside the MFT record itself. On NTFS, when a file is deleted, its MFT record is marked "free, available for reuse", but never actually relinquished back to the free space, because NTFS uses MFT entry numbers internally to address parent-child relationships. Removing one entry would require the entire volume to be renumbered, which is cost-prohibitive.

So, the data outside the MFT is zero-filled immediately once the TRIM command is issued. The MFT entries, however, remain unchanged. This explains why they were able to get file names and correct file sizes, but the data was a…
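The "free, available for reuse" mark can be illustrated with a small Python sketch that inspects the header of a single MFT record. The offsets used here (the "FILE" signature at offset 0 and a 16-bit flags field at offset 0x16, where bit 0 means "in use") follow the commonly documented NTFS record layout and are an assumption of this sketch, not something stated in the article; the function name is hypothetical.

```python
import struct

FLAG_IN_USE = 0x0001   # assumed: bit 0 of the flags field marks a live record
FLAGS_OFFSET = 0x16    # assumed: offset of the 16-bit flags field in the header


def mft_record_state(record: bytes) -> str:
    """Classify a raw MFT record by its signature and in-use flag."""
    if record[:4] != b"FILE":
        return "not an MFT record"
    (flags,) = struct.unpack_from("<H", record, FLAGS_OFFSET)
    if flags & FLAG_IN_USE:
        return "in use"
    # Deleting a file only clears the flag; the rest of the record, including
    # the file name, size, and any resident data, stays where it was.
    return "free, available for reuse"
```

Because a deleted record stays inside the MFT, which is itself an allocated file, nothing in it becomes free space that a TRIM command would cover.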

Computer Stress Syndrome

Came across a discussion of the Computer Stress Syndrome on TechNibble, that was just hilarious:

are they ever going to do a test study on end user unrealistic expectations causing computer technician stress syndrome??

Intel's new chipset

Intel reports there is a flaw in the SATA controller on the Sandy Bridge chipset, causing functionality to degrade over time. I suppose functionality goes as in functionality to store data, actually. They say the flaw only affects 3 Gbps ports (SATA II), while 6 Gbps (SATA III) ports are OK, but I'd wait for further confirmation. The revised chipset will hit the market in April 2011. So we got quite a number of data-loss time bombs somewhere, and the number still grows.

Even in 2011..

Image
some people are still concerned about DoubleSpace and Stacker.



Do you still remember using MS-DOS in production?

Bonus item: MFM hard drive interface in the same screenshot.