There’s only one thing worse than having no backup at all: thinking you have a backup, and finding out it doesn’t work when you go to restore.
It seems like common sense, but unfortunately backup validation is often overlooked. After establishing a backup system, it’s vital that you periodically check that it is working and that you are able to restore in the case of a disaster. A system administrator who has configured a nightly backup but has never performed a test restore hasn’t finished the job: without ever having tried a restore, there is no way to say with any degree of certainty that lost data can be recovered in the event of a disaster.
There are many reasons why a backup might turn out to be a dud, even if you thought it was functional. Here’s a quick list, including things that I’ve either seen personally, or that I’ve heard about second-hand:
- Your backup restores, but it turns out to be missing a few vital files or directories that you really need.
- The backup used to work, but at some point the media became full, and backups haven’t been working for some time.
- The backup used to work, but at some point the media became corrupted, and you can’t restore.
- The backup target is a remote network host, and the SSH key changed. The backup script has been failing to log in since then, and you don’t have any recent backups.
- Your backups are working, but you aren’t backing up often enough to save everything you need in the event of a disaster.
- The backup target is an NFS share, which at some point became unmounted, and you’ve just been making second copies of everything on the local hard disk — not very useful.
Notice that in all of these cases, the very first backup was completely successful, but over time something fell apart. Therefore, it’s not enough to check your backup once; you need to keep checking that everything is working and complete. Many of these cases can be caught by proper backup software (corrupt media or an unmounted NFS share, for example), but others require human intervention, such as making sure you’ve got all the files you need and that you back them up on a proper schedule.
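Several of the failures in the list above (full media, an unmounted share, backups that quietly stopped running) can be caught by a small scheduled check. What follows is only a rough sketch in Python, not a replacement for real backup software. The function name `check_backup_target`, the 26-hour freshness limit (chosen here on the assumption of a nightly backup), and the 1 GiB free-space floor are all made up for illustration, so adjust them to your own setup.

```python
import os
import shutil
import time
from pathlib import Path


def check_backup_target(target_dir, max_age_hours=26.0,
                        min_free_bytes=1 << 30, require_mount=True):
    """Return a list of problems found; an empty list means none were detected.

    All thresholds are illustrative assumptions, not recommendations.
    """
    target = Path(target_dir)
    if not target.is_dir():
        return [f"{target} does not exist or is not a directory"]

    problems = []

    # An unmounted NFS share leaves an ordinary local directory behind,
    # so the backup script keeps "working" while writing to the local disk.
    if require_mount and not os.path.ismount(target):
        problems.append(f"{target} is not a mount point")

    # Full media: warn before the backup starts failing for lack of space.
    free = shutil.disk_usage(target).free
    if free < min_free_bytes:
        problems.append(f"{target} has only {free} bytes free")

    # Stale backups: the newest file should be recent if the job still runs.
    files = [p for p in target.rglob("*") if p.is_file()]
    if not files:
        problems.append(f"{target} contains no backup files")
    else:
        newest = max(p.stat().st_mtime for p in files)
        age_hours = (time.time() - newest) / 3600
        if age_hours > max_age_hours:
            problems.append(
                f"newest backup in {target} is {age_hours:.0f} hours old"
            )

    return problems
```

Run something like this from cron and send yourself an alert whenever the returned list is not empty. It only tells you that backups are still arriving; it says nothing about whether they can actually be restored.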
Additionally, when you periodically perform a restore test, you gain:
- Confidence that the backups are good, and that they are dependable in the event that you should need them.
- A tested backup procedure.
Peace of mind is priceless.
Having a data-loss event on your hands and having to restore without a procedure to follow is nerve-racking. If you’re in charge of restoring data for a company, you should have a written procedure to follow. This way, not only will you know exactly what needs to be done, but it will be possible for others to restore in the event of an emergency. The procedure should be written in such a way that someone familiar with the backup system can follow it to restore the vital data if necessary.
So if you haven’t already, go now and make sure that your backups can be restored, and that you have everything you need.
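One way to make a test restore concrete is to restore into a scratch directory and compare it, file by file, against the data it is supposed to protect. Here is a minimal sketch in Python; the names `file_digest` and `compare_restore` are invented for this example, and it assumes you have already restored the backup to a separate directory with whatever tool you normally use.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored counterpart.

    Returns (missing, mismatched): relative paths that are absent from the
    restore, and relative paths whose contents differ.
    """
    source = Path(source_dir)
    restored = Path(restored_dir)
    missing, mismatched = [], []

    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = restored / rel
        if not dst.is_file():
            missing.append(str(rel))
        elif file_digest(src) != file_digest(dst):
            mismatched.append(str(rel))

    return missing, mismatched
```

Keep in mind that files changed since the last backup will show up as mismatched, and files created since then will show up as missing, so read the output with the backup time in mind. An unexpectedly long list of missing paths is often the first sign that a vital directory was never included in the backup at all.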
What’s your backup system? Do you periodically make sure that your backups are functional, and that you’re backing up everything you would need in the event of a disaster? Are you sure that you are safe in the event of system loss? Share your backup and restore method in the comments below.
Absolutely brilliant photo credit goes to Matalyn.