Jan 2, 2009

Anecdotal data on drive failure.

While cleaning up my old archived data, I realized I had a couple of backups of my old computer, Totoro. Totoro was my main machine during university, from 1997 to 2001. It never got to the point of failure; it just sat around mostly unused between 2001 and 2007, when I eventually recycled it. One backup was from around 2004, the other from around 2007. Since the machine was mostly unused during that period, I expected most of the files to be the same. Just for fun, I did a comparison to check. The interesting thing was that I had 19,294 identical files and 18 files that were different. Somehow, after sitting around for 3 years, about 0.093% of my files became corrupt.
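A comparison like the one described above can be sketched in a few lines of Python. This is not the tool that produced the numbers in this post; it is a minimal sketch that assumes both backups are ordinary directory trees, and the names `file_digest`, `compare_trees`, and `mismatch_percent` are made up for illustration. It only looks at files that exist in both trees and compares their SHA-256 digests.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(old_root, new_root):
    """Compare files present in both trees.

    Returns two lists of relative paths: (identical, different).
    Files that exist in only one tree are skipped.
    """
    old_root, new_root = Path(old_root), Path(new_root)
    identical, different = [], []
    for old_path in sorted(old_root.rglob("*")):
        if not old_path.is_file():
            continue
        rel = old_path.relative_to(old_root)
        new_path = new_root / rel
        if not new_path.is_file():
            continue
        if file_digest(old_path) == file_digest(new_path):
            identical.append(rel)
        else:
            different.append(rel)
    return identical, different


def mismatch_percent(identical_count, different_count):
    """Percentage of compared files whose contents differ."""
    total = identical_count + different_count
    return 100.0 * different_count / total if total else 0.0
```

With the counts from this post, `mismatch_percent(19294, 18)` works out to 18 / 19,312, which is roughly 0.093%. Note that a comparison like this only says the two copies differ. It cannot say which copy is the damaged one.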

Now, I have way too many uncontrolled variables to even determine where the corruption occurred. I had backed up that original disk twice - the first time was over the network to my archive server, the second time was after I had pulled it out of Totoro, stuck it into a different computer, and copied it onto a portable disk. My guess is that the disk became corrupt, although the errors could also have been introduced at some point when I was copying the files around.

In any case, this introduces a bit of paranoia. Just because I have something backed up, I have no guarantee that the backup copy is perfect. My current "backup process" is just to copy important files over to my archive server, which has mirrored disks, meaning that if one of the disks in the server dies, I still have a second copy. Mirroring protects against a dead disk, but it does nothing for a file that was already damaged when it was copied. This experience has shown that I ought to look for some specialized backup software that generates CRCs and can verify file integrity instead of simply using "copy".
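The idea of generating CRCs and verifying them later can be illustrated with a small sketch. It is not any particular backup product; it assumes Python's standard `zlib.crc32`, and the helper names `crc32_of`, `build_manifest`, and `verify` are hypothetical. The manifest is recorded when the backup is made, then checked against the files at any later date.

```python
import zlib
from pathlib import Path


def crc32_of(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file incrementally, chunk by chunk."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def build_manifest(root):
    """Map each file's relative path (POSIX style) to its CRC-32."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): crc32_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or whose CRC-32 changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or crc32_of(p) != expected:
            bad.append(rel)
    return bad
```

A CRC-32 is meant to catch accidental damage such as flipped bits or truncation. It is not a defense against deliberate tampering; swapping in a hash from `hashlib` would be the usual choice if that matters. Unlike comparing two copies against each other, a manifest recorded at backup time also tells you which copy is the bad one.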

Addendum: I found another partition from Totoro that I had backed up twice. This time the "newer" backup contained 285 truncated files out of 4,377, or about 6.5%.
