Monday, 26 December 2011

File system reliability doesn't depend on the load

The idea that filesystem fragmentation degrades reliability, put forward by some of the defragmenter vendors, is not true.

A filesystem handles all the fragments in the same way.

The first significant difference is between contiguous files, which have only one fragment, and files consisting of two fragments. The next significant difference appears when the list of fragments becomes so large that the list itself is divided into several parts. On NTFS this usually happens at about 100 fragments. Once a filesystem has been tested to properly handle one fragment, a list of fragments, and a list of lists, you can be sure that it handles any number of fragments.
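To make the argument concrete, here is a minimal sketch in Python of the three structural cases. The names `LIST_SPLIT_THRESHOLD`, `structural_case` and `representative_counts` are made up for illustration, and the threshold of 100 is only the approximate NTFS figure mentioned above, not an exact constant of any filesystem.

```python
# Hypothetical threshold: on NTFS the fragment list is usually split
# at about 100 fragments; the real value varies from file to file.
LIST_SPLIT_THRESHOLD = 100


def structural_case(fragment_count: int) -> str:
    """Return which of the three structural cases a file falls into."""
    if fragment_count < 1:
        raise ValueError("a file with data has at least one fragment")
    if fragment_count == 1:
        return "contiguous"        # a single fragment, no list needed
    if fragment_count <= LIST_SPLIT_THRESHOLD:
        return "fragment list"     # one list describing all fragments
    return "list of lists"         # the list itself is split into parts


def representative_counts(threshold: int = LIST_SPLIT_THRESHOLD) -> list[int]:
    """Fragment counts that sit on each side of every boundary."""
    return [1, 2, threshold, threshold + 1]


if __name__ == "__main__":
    for count in representative_counts() + [50_000]:
        print(count, "->", structural_case(count))
```

A file with 50,000 fragments lands in the same case as a file with 101, which is the point: beyond the last boundary, more fragments do not exercise any new code path in this model.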

In the same way, a heavy load on the storage hardware doesn't by itself decrease reliability. A high disk queue length doesn't lead to a disk failure. It definitely indicates an overload, but if you wait long enough, all the requests will be processed. Strictly speaking, if the load is very heavy, some of the requests will eventually be cancelled due to timeout. This doesn't affect the ability of the storage to keep working properly once the load decreases.
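The queue behaviour described above can be sketched with a toy simulation. This is not a model of any real disk or driver: the function `simulate` and the parameters `service_per_tick` and `timeout_ticks` are assumptions chosen only to show that an overloaded queue grows, sheds some requests by timeout, and then drains completely.

```python
from collections import deque


def simulate(arrivals, service_per_tick=2, timeout_ticks=5):
    """Toy request queue.

    arrivals[t] is the number of requests arriving at tick t.
    Returns (completed, cancelled, max_queue_length).
    """
    queue = deque()  # each entry is the tick at which the request arrived
    completed = cancelled = max_len = 0
    tick = 0
    # Keep running after the last arrival until the queue is empty.
    while tick < len(arrivals) or queue:
        if tick < len(arrivals):
            queue.extend([tick] * arrivals[tick])
        max_len = max(max_len, len(queue))
        # Requests that waited longer than the timeout are cancelled.
        while queue and tick - queue[0] > timeout_ticks:
            queue.popleft()
            cancelled += 1
        # The device serves a fixed number of requests per tick.
        for _ in range(min(service_per_tick, len(queue))):
            queue.popleft()
            completed += 1
        tick += 1
    return completed, cancelled, max_len


if __name__ == "__main__":
    print("overload:", simulate([10] * 5 + [1] * 5))
    print("light load:", simulate([1] * 10))
```

In the overloaded run some requests are cancelled, yet every request is accounted for and the queue ends up empty. Under light load the same queue cancels nothing.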

However, some factors that accompany an overload can lead to failures and data loss. In hardware, these are usually temperature and vibration. A disk system under load warms up and vibrates. If the cooling system is not good enough, this can lead to overheating and excessive wear.

In software, race conditions and other bugs are more likely to come up under overload. In addition, there can be errors in programs that work with the filesystem driver although they are not part of it.

For example, when multi-core processors had not yet been invented, and dual-CPU machines were expensive and rare, few antivirus programs were tested on multiprocessor systems. Once you put an antivirus filter driver on a dual-CPU machine, blue screens were quite an ordinary thing.

However, filesystem errors are eventually found and fixed, and the drivers of any mature filesystem are reliable and have already been tested under almost any load you can imagine.
