Designed to Fail: Apple Time Machine and MacOS X Server

I have a 1TB Apple Time Capsule, which I mention by way of introduction and not to lord it over you, dear reader, that I’ve got something you might not have. I use this Time Capsule to back up my MacOS X Server.

According to the product page, here’s what Time Capsule can do:

Time Capsule is a revolutionary backup device that works wirelessly with Time Machine in Mac OS X Leopard. It automatically backs up everything, so you no longer have to worry about losing your digital life.

and

Backing up is something we all know we should do, but often don’t. And while disaster is a great motivator, now it doesn’t have to be. Because with Time Capsule, the nagging need to back up has been replaced by automatic, constant protection. And even better, it all happens wirelessly, saving everything important, including your sanity.

The only requirement for your Mac is that it is running MacOS X 10.5.2 or later.

That’s because MacOS X 10.5 introduces Time Machine like this:

For the initial backup, Time Machine copies the entire contents of the computer to your backup drive. It copies every file exactly (without compression), skipping caches and other files that aren’t required to restore your Mac to its original state.

and

By default, Time Machine backs up everything on your Mac.

(Emphasis mine.)

So when I hooked it up to my MacOS X Server 10.5.4 server and turned on Time Machine (one click!… and some typing to set the password, but we’ll overlook that), I was pleased to see that, indeed, my server was backed up. I could browse the backup and see it was good. Cool!

But then the other day, disaster struck only hours before we were to leave for the airport: my server had crashed in such a way as to be unrecoverable. Things were grim, and the only thing I could do was to restart from the Server installation DVD and restore the machine from the Time Capsule. Well, it did that in about 9 hours and with some remote system administration from my beautiful wife, the machine was back up and running.

Except it wasn’t.

E-mail didn’t work. Web services wouldn’t start. All hell was breaking loose in the system logs as various services which were trying to start up just plain wouldn’t start up. Things which previously had a quiet existence on this machine were suddenly vociferously complaining about a plethora of problems. While each had its own gripe, most were unhappy about the nonexistence of a log directory, /var/log. “Huh?” I thought to myself, “I thought I read that Time Machine backed up my entire machine to get it back to its original state. What’d I miss? And why do I feel like it’s my fault all of a sudden?”

A little searching on the web reveals that there’s a list of stuff Time Machine doesn’t back up. On a normal MacOS client machine that might be OK, but for a server it is disastrous. The list, stored at

/System/Library/CoreServices/backupd.bundle/Contents/Resources/StdExclusions.plist

has, among other things, these items which are excluded

/private/var/log

and

/private/var/spool
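To see for yourself what is on that list, something like the following Python sketch could help. It makes no assumption about the key names inside StdExclusions.plist (I have not reproduced them here); it simply walks the whole property list and collects every string that looks like an absolute path. The helper names (`collect_paths`, `excluded_paths`, `is_excluded`, `load_std_exclusions`) are made up for illustration, and treating every listed path the same way is a simplification.

```python
import plistlib

# Path quoted in the post; it only exists on a Mac OS X system of that era.
STD_EXCLUSIONS = ("/System/Library/CoreServices/backupd.bundle"
                  "/Contents/Resources/StdExclusions.plist")


def collect_paths(node):
    """Recursively gather every string value that looks like an absolute path."""
    if isinstance(node, str):
        return [node] if node.startswith("/") else []
    if isinstance(node, dict):
        node = list(node.values())
    if isinstance(node, (list, tuple)):
        found = []
        for item in node:
            found.extend(collect_paths(item))
        return found
    return []


def excluded_paths(plist_bytes):
    """Parse plist data and return the sorted, de-duplicated path strings."""
    return sorted(set(collect_paths(plistlib.loads(plist_bytes))))


def is_excluded(path, exclusions):
    """True if path equals a listed exclusion or lies beneath one."""
    return any(path == e or path.startswith(e.rstrip("/") + "/")
               for e in exclusions)


def load_std_exclusions(plist_file=STD_EXCLUSIONS):
    """Read the exclusion list from disk (hypothetical convenience wrapper)."""
    with open(plist_file, "rb") as fh:
        return excluded_paths(fh.read())
```

On the server itself, `is_excluded("/private/var/spool/imap", load_std_exclusions())` would then tell you whether the mail store is at risk before you need the backup.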

OK, log files don’t necessarily need to be backed up, though keeping at least the most recent one, so you can see what happened before the crash, would be nice. Nonetheless, if services need log files or log directories to exist in order to run, then something, somewhere must recreate them, or the system never gets up and running and the backup has, in fact, failed. Apache is quite content to gripe that it can’t make logfiles. Amavis, too, can’t do anything unless the directory is there. Sorting out which logs need to be where, who owns them, and what their permissions should be took me the better part of two hours, and I know I don’t have them all right. (And that’s only for the few services I’m running. God only knows what I’m missing for the others.)
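The recreate-what-is-missing chore could be scripted roughly like this. It is only a sketch: the directory names and modes in `REQUIRED_DIRS` are hypothetical placeholders, not the real requirements of Apache or Amavis, and ownership is deliberately left out because the correct owners depend on the services installed.

```python
import os

# Hypothetical examples only. The real list, owners, and modes depend on
# which services run on the server and are not given in this post.
REQUIRED_DIRS = [
    ("/private/var/log/apache2", 0o755),
    ("/private/var/log/amavis", 0o750),
]


def ensure_dirs(entries, root="/"):
    """Create each missing directory under root and return those created."""
    created = []
    for path, mode in entries:
        target = os.path.join(root, path.lstrip("/"))
        if not os.path.isdir(target):
            os.makedirs(target, exist_ok=True)
            # Set the mode explicitly so the result does not depend on umask.
            os.chmod(target, mode)
            created.append(target)
    return created
```

The `root` parameter lets you rehearse against a scratch directory first; on a live server it would need to run as root, and a chown step for each service's user would have to be added by hand.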

But… and this one is inexcusable… not backing up /var/spool, which includes /var/spool/imap, where my IMAP users’ E-mail is stored, is insane and has my blood boiling. This is an oversight which is completely uncharacteristic of Apple and for which there is no excuse.

The next four hours I spent trying to recover my IMAP users and getting Postfix to run were maddening. I had lost nearly a gigabyte of E-mail. Multiple IMAP directories had to be “reconstructed,” sometimes successfully and sometimes not, following the many webpages out there. Even the procedure on Apple’s own webpage describing this process failed. (Something about partition “/var/spool/imap/user” not existing.) My users were similarly peeved. “You mean we bought that expensive Time Capsule instead of a simple external hard drive for mirroring and it didn’t work?!”

Never mind the fact that it’s not the Time Capsule at fault, it’s Time Machine. They didn’t see it that way. They saw an expenditure that was unjustified because it simply didn’t work. I saw a maddening amount of work on a weekend, on my vacation, from 1000 miles away, because it simply didn’t work. And that’s just wrong, wrong, wrong.

Note to Steve Jobs: You would be incensed, too.

My users and I are angry, and rightfully so. You, Apple, make a promise about a very important function and you don’t keep it. This is backup, for goodness’ sake! This is the kind of thing that has to work. If you say “backup,” it implies that it will, it shall, it must work.

And it didn’t.

Another note to Steve Jobs: If this had happened to you, you would have seen to it that somebody got yelled at and it would have been fixed immediately. Really. This is the kind of thing you hate. (I’ve read enough of Fake Steve Jobs to know how you think, man.)

Coming “real soon now”: an article about how to recover from the various messes left behind by a Time Machine restore of MacOS X Server. (Just as soon as someone answers the question I pose here, that is.)

5 Comments

Adi said:

Oh man, this sux. Tell me you've filed a bug report with Apple about this. I mean, I laughed it off when I restored my OS X client from a TM backup and noticed those log folders were not created, but the same thing is happening on the SERVER?! Jesus... I wonder sometimes about Apple.

Adi

Dave said:

I have found another flaw with Time Machine.
I have struggled for days to understand why my development directory, which happens to be in a directory called dev on a separate disk partition (called 'Data'), was not being backed up.

It seems that the standard exclusions list that you kindly pointed me to in this article is GLOBAL and applies to all disk partitions, and not tied to any particular partition.

So, the std exclusion of /dev, as you may well expect, excludes the /dev directory, since its contents are not real files and should not be backed up. HOWEVER, it also causes my development directory /dev on my data partition not to be backed up. (Its real path is actually '/Volumes/Data/dev', but Time Machine seems to see the Data partition as a root partition.)

OK, slapped wrists to me for having the cheek to create a directory called dev, but hell, Time Machine SHOULD be able to differentiate between /dev and /Volumes/Data/dev, especially when there is no OS on that partition!!

I will still continue with Time Machine as it is useful for basic file backups (once you work out what it is NOT copying!), but you could never trust it for a full restore. For that, only SuperDuper or CarbonCopyCloner will really do the job. Before Time Machine I used iBackup, which is free and very good, but without the whizzy zooming effects of Time Machine, and it does seem to back up whatever you ask it to back up!
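The behavior Dave describes can be modeled with a few lines of Python. To be clear, this is a sketch of the reported symptom, not Apple's actual matching logic: it assumes each exclusion is compared against the path relative to its own volume root, and the function names (`volume_relative`, `collides`) are invented for illustration.

```python
def volume_relative(path, mount_root="/Volumes"):
    """Strip '/Volumes/<name>' so the path is relative to its own volume root."""
    prefix = mount_root.rstrip("/") + "/"
    if path.startswith(prefix):
        rest = path[len(prefix):]
        # Drop the volume name and keep whatever follows it.
        _, _, tail = rest.partition("/")
        return "/" + tail
    return path


def collides(path, exclusions):
    """Return the exclusions that would match once the volume prefix is ignored."""
    rel = volume_relative(path)
    return [e for e in exclusions
            if rel == e or rel.startswith(e.rstrip("/") + "/")]
```

Under this model, `collides("/Volumes/Data/dev/project", ["/dev"])` reports a match on `/dev`, while a directory named `/Volumes/Data/development` does not, which suggests that renaming the directory is the simplest workaround.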

Bill Eccles said:

Hey, that's an excellent find, and certainly worthy of reporting to Apple with their feedback mechanism. You can certainly include the URL of this post in your feedback, and hopefully they'll get my message... again.

Heads up, folks--Dave found an interesting one!

(This post, by the way, is the most popular post on my website. This is a hot topic!)

Followed the link from /dev/why you left in the comments, I didn't know there were other things exempted from the backup like that! I also run a little mail server on OSX Server and would be very unhappy to recover my server only to discover that it wasn't recovered... Did you experiment with removing those exemptions from the plist file?

Bill Eccles said:

Actually, I did, but forgot to mention it. I did eliminate the exclusion with

<!--            /private/var/spool -->

which did it. After doing this, I rebooted (because I wanted to make sure it really did take effect), watched a backup, and verified that this directory was there and populated with real information.

So that works for me... YMMV.
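For anyone who would rather check the edit without waiting for a backup to finish, a small Python sketch like this one can confirm that a commented-out entry no longer appears in the parsed list. The sample XML and its key name `HypotheticalKey` are invented for the demonstration, and it assumes the entries are ordinary `<string>` elements; the real file's layout may differ.

```python
import plistlib

# Invented sample, NOT the real StdExclusions.plist. The second entry is
# wrapped in an XML comment, mimicking the edit described above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>HypotheticalKey</key>
    <array>
        <string>/private/var/log</string>
        <!-- <string>/private/var/spool</string> -->
    </array>
</dict>
</plist>
"""


def active_entries(plist_bytes, key):
    """Return the entries under key that survive parsing (comments are skipped)."""
    return list(plistlib.loads(plist_bytes).get(key, []))
```

Because the XML parser skips comments, `active_entries(SAMPLE, "HypotheticalKey")` contains `/private/var/log` but not `/private/var/spool`. Keep a copy of the original file before editing anything under /System.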