Designed to Fail: Apple Time Machine and MacOS X Server
I have a 1TB Apple Time Capsule which I say by way of introduction and not as a means to lord over you, dear reader, that I’ve got something that you might not have. I use this Time Capsule to backup my MacOS X Server.
According to the product page, here’s what Time Capsule can do:
Time Capsule is a revolutionary backup device that works wirelessly with Time Machine in Mac OS X Leopard. It automatically backs up everything, so you no longer have to worry about losing your digital life.
and
Backing up is something we all know we should do, but often don’t. And while disaster is a great motivator, now it doesn’t have to be. Because with Time Capsule, the nagging need to back up has been replaced by automatic, constant protection. And even better, it all happens wirelessly, saving everything important, including your sanity.
The only requirement for your Mac is that it is running MacOS X 10.5.2 or later.
That’s because MacOS X 10.5 introduces Time Machine like this:
For the initial backup, Time Machine copies the entire contents of the computer to your backup drive. It copies every file exactly (without compression), skipping caches and other files that aren’t required to restore your Mac to its original state.
and
By default, Time Machine backs up everything on your Mac.
(Emphasis mine.)
So when I hooked it up to my MacOS X Server 10.5.4 server and turned on Time Machine (one click!… and some typing to set the password, but we’ll overlook that), I was pleased to see that, indeed, my server was backed up. I could browse the backup and see it was good. Cool!
But then the other day, disaster struck only hours before we were to leave for the airport: my server had crashed in such a way as to be unrecoverable. Things were grim, and the only thing I could do was to restart from the Server installation DVD and restore the machine from the Time Capsule. Well, it did that in about 9 hours and with some remote system administration from my beautiful wife, the machine was back up and running.
Except it wasn’t.
E-mail didn’t work. Web services wouldn’t start. All hell was breaking loose in the system logs as various services which were trying to start up just plain wouldn’t start up. Things which previously had a quiet existence on this machine were suddenly vociferously complaining about a plethora of problems. While each had its own gripe, most were unhappy about the nonexistence of a log directory, /var/log. “Huh?” I thought to myself, “I thought I read that Time Machine backed up my entire machine to get it back to its original state. What’d I miss? And why do I feel like it’s my fault all of a sudden?”
A little searching on the web reveals that there’s a list of stuff Time Machine doesn’t back up which, on a normal MacOS client machine might be OK, but for a server is disastrous. The list, stored at
/System/Library/CoreServices/backupd.bundle/Contents/Resources/StdExclusions.plist
has, among other things, these items which are excluded
/private/var/log
and
/private/var/spool
OK, log files don’t necessarily need to be backed up, maybe only the last one so you can see what happened before the crash would be nice. Nonetheless, if there are various services which need log files or log directories to exist to run, then something, somewhere must recreate these logfiles or the system never gets up and running, and the backup has, in fact, failed. Apache is quite content to gripe that it can’t make logfiles. Amavis, too, can’t do anything unless the directory is there. Sorting out which logs need to be where and who owns them and what their permissions should be took me the better part of two hours, and I’m know I don’t have them all right. (And that’s only for the few services I’m running. God only knows what I’m missing for the others.)
But… and this one is inexcusable… not backing up /var/spool, which includes /var/spool/imap which is where my IMAP users’ E-mail is stored!! is insane and has my blood boiling. This is an oversight which is completely uncharacteristic of Apple but for which there is no excuse.
The next four hours I spent trying to recover my IMAP users and getting Postfix to run were maddening. I had lost nearly a gigabyte of E-mail. Multiple IMAP directories had to be “reconstructed,” sometimes a success and sometimes not, according to the many webpages out there. Even Apple’s own webpage describing this process failed. (Something about partition “/var/spool/imap/user” not existing.) My users were similarly peeved. “You mean we bought that expensive Time Capsule instead of a simple external hard drive for mirroring and it didn’t work?!”
Never mind the fact that it’s not the Time Capsule at fault, it’s Time Machine. They didn’t see it that way. They saw an expenditure that was unjustified because it simply didn’t work. I saw a maddening amount of work on a weekend, on my vacation, from 1000 miles away, because it simply didn’t work. And that’s just wrong, wrong, wrong.
Note to Steve Jobs You would be incensed, too.
My users and I are angry, and rightfully so. You, Apple, make a promise about a very important function and you don’t keep it. This is backup, for goodness’ sake! This is the kind of thing that has to work. If you say “backup,” it implies that it will, it shall, it must work.
And it didn’t.
Another note to Steve Jobs: If this had happened to you, you would have seen to it that somebody got yelled at and it would have been fixed immediately. Really. This is the kind of thing you hate. (I’ve read enough of Fake Steve Jobs to know how you think, man.)
Coming in an article “real soon now:” an article about how to recover from the various messes left behind by a Time Machine restore of MacOS X Server. (Just as soon as someone answers the question I pose here, that is.)
Oh man, this sux. Tell me you've filed a bug report with Apple about this. I mean i laughd it off when i restored my OS X client from a TM backup and noticed those log folders were not created but the same thing is happening on the SERVER ?! Jesus.. i wonder sometimes about Apple.
Adi