Leave It To Apple To Bring Revision Control To The Masses
On August 7, 2006, Apple previewed Mac OS X 10.5 Leopard at WWDC. As usual, some new features generated a lot of excitement, whereas others drew criticism. Time Machine, Apple’s automated backup solution, won acclaim from many people because Apple makes the process seem so easy and even manages to make it fun with eye candy and an intuitive interface. Others yawned at the app, claiming that it doesn’t deserve any attention because it is just a glorified backup app of the kind that has been available in Windows for years.
Summing it up as a glorified backup app is an obscene oversimplification. As a developer, I immediately suspected that Time Machine is running on top of a revision control system. That is what really fascinates me about it, and incidentally it is also what I feel many people misunderstand about it, leading them to discount it. I’m not alone in making this connection; really, any developer who uses and understands source code repositories probably had the thought cross his or her mind. One of the major advantages mentioned in discussions about revision control is that you can always retrieve a file as it existed at any point in the past when it was committed to the repository.
And it’s not a stretch to see Apple borrowing revision control software to implement Time Machine. Subversion is a very popular revision control project in the opensource community. Mac OS X itself sits on top of a BSD foundation, and Apple has incorporated many other opensource projects in it. For instance, OpenGL, Apache, SSH, FTP, CUPS, and Samba come to mind, just to name a few. Why not add Subversion to the list of excellent technologies to incorporate into an excellent OS?
If Time Machine is running on top of Subversion or a similar revision control system, that would address the concerns of some people who might not understand how revision control systems work. Some have worried that a backup drive would not be able to support Time Machine for very long before filling up. For instance, if you have a 1GB file (perhaps a video clip) and make a very small change to it, will Time Machine record that change by making another copy of the entire file? If Time Machine made its backups as full copies of the changed files, you can imagine how quickly a backup volume would fill up. But that simply isn’t how most revision control software works. Most revision control software, including Subversion, uses delta compression: it calculates only the changes made to a file and saves only those changes. Thus, a file can be committed to a repository a dozen times, and yet the drive space taken up would only amount to the original file plus the changes made, plus perhaps a very slight amount of overhead in the repository for tracking each change.
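To make the space argument concrete, here is a minimal sketch that uses the standard diff and patch tools as a stand-in for a repository's delta storage. It is an illustration only, not how Subversion or Time Machine actually store data, and every file name in it is made up for the demo.

```shell
#!/bin/sh
# Illustration only: diff/patch stand in for a repository's delta storage.
# All file names here are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Revision 1: a file with 10,000 lines.
seq 1 10000 > rev1.txt

# Revision 2: the same file with a single line changed.
sed '5000s/.*/changed line/' rev1.txt > rev2.txt

# Store only the difference between the two revisions.
# diff exits with status 1 when the files differ, so keep set -e from aborting.
diff rev1.txt rev2.txt > rev2.delta || true

# Compare the size of a full second copy with the size of the delta.
full_size=$(wc -c < rev2.txt | tr -d ' ')
delta_size=$(wc -c < rev2.delta | tr -d ' ')
echo "full copy: $full_size bytes, delta: $delta_size bytes"

# Rebuild revision 2 from revision 1 plus the delta, then verify it.
cp rev1.txt rebuilt.txt
patch -s rebuilt.txt rev2.delta
cmp rebuilt.txt rev2.txt && echo "revision 2 restored from delta"
```

The delta holds only the one changed line plus a little bookkeeping, yet it is enough to reconstruct the second revision exactly. A real revision control system uses a binary-aware delta format rather than a line-based diff, but the space-saving idea is the same.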
So that covers feasibility. With some software like Subversion, Time Machine could provide a backup of your entire volume and actually support it for a decent period of time thanks to efficient delta compression. What about the logistics of finding the changed files and saving those changes? This would be simple even with Subversion in its present state. Out of the box, Subversion provides the ability to search for files that have changed since the last “commit”, or the last time changes were saved. Time Machine would just have to ask Subversion to report all the files that have changed, then run the commit. This could be scheduled to execute at a given time, say, midnight. In the event that the computer was off at the scheduled time, the steps could be executed immediately upon startup.
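The two steps described above (ask for the changed files, then commit) can be sketched with the ordinary Subversion command-line client. This is only a guess at the general shape, not anything Apple has described; the function name snapshot and its argument are hypothetical, and the directory is assumed to already be a Subversion working copy.

```shell
#!/bin/sh
# Sketch only: assumes "$1" is already a Subversion working copy.
# snapshot is a hypothetical name used for this illustration.
snapshot() {
    dir=$1

    # Ask Subversion which files changed since the last commit.
    # Each line of output is a status code followed by a path.
    changes=$(svn status "$dir") || return 1
    if [ -z "$changes" ]; then
        echo "nothing changed; skipping commit"
        return 0
    fi

    # Put brand-new files under version control, then save everything.
    svn add --force --quiet "$dir" || return 1
    svn commit --quiet -m "Automatic snapshot $(date '+%Y-%m-%d %H:%M')" "$dir"
}
```

A script that ends with a call such as snapshot /path/to/working-copy (a placeholder path) could then be scheduled with cron to run at midnight. This sketch does not handle files that were deleted outside of Subversion; a real tool would need to deal with those as well.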
In the past, I’ve thought how nice it would be if all the files on my computer could go under revision control just like my source code when I’m working on an application. It would be just Apple’s style to make that possible not just for tech geeks who use Subversion, but to make it possible for anyone, even the guy who doesn’t care a lick about understanding revision control.
CrossOver: The Worst Way to Do the Wrong Thing
From the moment Intel Macs became available, running Windows apps on Macs has been a topic of interest in the Mac community. Suddenly, Macs became the ultimate machines for those of us interested in running multiple operating systems. Sure, Linux was always available for PowerPC Macs (at least several distros), but running Windows at satisfactory speeds was always a challenge. With the advent of Intel Macs, there are choices aplenty. We can (1) dual-boot with Boot Camp; we can (2) run Windows with virtualization software like Parallels or VMware, which run at near-native speeds, significantly faster than the emulation we did on our PowerPC Macs; and now we can (3) run Windows apps without Windows using CodeWeavers’ CrossOver.
Each of those solutions is more convenient than the one before it. Boot Camp requires an inconvenient reboot. Virtualization software allows us to boot up Windows without rebooting our Mac, and we can even use Mac apps and Windows apps simultaneously, albeit with Windows in an encapsulated environment. Finally, CrossOver aims to take the convenience to the next level by eliminating Windows and enabling Mac OS X to execute Windows apps!
One article at LinuxWorld called CrossOver Office “the best way to do the wrong thing”. I contend that CrossOver is the worst way to do the wrong thing. If I have to run a Windows app on my Mac, I certainly want it to run as smoothly as possible, just like the rest of my Mac experience, and I want it to run like the developer intended. Largely, dual-booting and virtualization don’t compromise the behavior of the app I’m running. When you run your app with CrossOver, however, you don’t know what kind of performance you’ll get. Your app may just die; it may run but be full of bugs. The reports coming from the web show that your experience will be very hit-or-miss.
Don’t get me wrong; I’m not knocking the WINE project, which is the basis for the CrossOver codebase. A project to port the Windows APIs to another platform, although ambitious, is fascinating and, frankly, perfect as an opensource project. The problem I have is with the attempt to commercialize this technology that is great as a free resource but destined to never deliver a level of quality that befits a commercial product.
Why? There are so many variables and pitfalls to porting an API that the technology will never be able to work for even a large percentage of Windows apps, let alone the majority or all of them. And the technology can continue to be refined, only to see a Windows upgrade completely shatter the compatibility of future apps, and the development process of tweaking the port starts all over again. And from an opensource perspective, that’s fine. That’s the strength of an opensource initiative. It’s not such a great model for a commercial product.
What would you rather do? Buy CrossOver for $59 and be able to run only a few apps with it, and perhaps with a few bugs at that, or buy Parallels for $79 and be able to run practically all Windows apps with it, nearly bug-free? Of course, the price differential increases if you need to purchase a copy of Windows. Nevertheless, your experience will be infinitely more reliable if you use a virtualization solution like Parallels.
For the tech geek who likes tinkering with new software, this solution is worth a gander, especially while the free public beta is available. However, if you just need to get down to business, I recommend sticking with virtualization.
The Mysterious Vanishing WordPress Posts
Three of my recent posts from August 2006 have been mysteriously cut off at the knees (two articles about SELinux and one article about Apple releasing the Mac Pro). The opening sentences remained, but then the article body was truncated mid-sentence, at a different length in each post. One of the posts was particularly lengthy, and naturally I didn’t have a backup of the article in any fashion. That is extremely disappointing.
To prevent the lamentable agony of this kind of loss, I could (a) set up a WordPress scheduled task (with plugins that provide such functionality) to back up the database on a regular basis, or (b) back up the database manually after I post an article. As a different approach, and the most fun because it involves programming, I could (c) set up a scheduled task on my server at home to pull the RSS feed from my site on a daily basis and save that.
My server at home is a Linux box (currently Fedora Core 4), so a quick little shell script is the best way to go. This is exceptionally easy, so let’s take a look:
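# Backup file path, stamped with today's date (yyyy-mm-dd)
fn=/backuppath/rss/nazin-`date --iso-8601=date`.xml
# URL of the RSS feed to save
url=http://blog.nazin.com/index.php/feed/
# Fetch the feed and write it to the backup file
curl -o "$fn" "$url"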
This obviously could be a one-liner, but to dumb it down, I put the backup file path and the URL of the RSS feed in script variables. The first line says, “Build a path inside the /backuppath/rss directory, with a file called ‘nazin-yyyy-mm-dd.xml’, using today’s date.” If you are new to Linux scripting, anything wrapped in ` symbols (backticks) is run as a command and replaced with its output. So “nazin-`date --iso-8601=date`.xml” will actually become “nazin-2006-09-07.xml” if that is today’s date. The second line just assigns the value of the url variable. The third line is then a basic curl call. It says, “Fetch $url and put the output in the file at $fn.”
I then just set up that script to run as a scheduled cron job, and we’re in business! A quick note about the path: you could use a relative path instead, and the script would work fine when you execute it at a shell prompt, but it may fail as a cron job because cron does not start in the directory you expect. To be safe, provide an absolute path so that it works in both places.
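For anyone who wants something a bit more defensive, here is a hedged sketch of the same idea wrapped in a function. The name backup_feed and its two arguments are hypothetical; the curl options used (--fail, --silent, --show-error, -o) are standard ones. It downloads to a temporary file first so that a failed fetch never leaves a truncated backup behind.

```shell
#!/bin/sh
# Sketch only: backup_feed and its arguments are hypothetical names.
backup_feed() {
    url=$1
    dir=$2    # should be an absolute path when the script runs from cron
    fn="$dir/nazin-$(date '+%Y-%m-%d').xml"

    # Create the backup directory if it does not exist yet.
    mkdir -p "$dir" || return 1

    # --fail makes curl report HTTP errors instead of saving the error page;
    # --silent with --show-error keeps it quiet unless something goes wrong.
    if curl --fail --silent --show-error -o "$fn.tmp" "$url"; then
        mv "$fn.tmp" "$fn"
    else
        rm -f "$fn.tmp"
        return 1
    fi
}
```

Ending the script with backup_feed http://blog.nazin.com/index.php/feed/ /backuppath/rss and scheduling it with a crontab entry such as 30 3 * * * /path/to/rss-backup.sh (the time and the script path are placeholders) would save one dated copy of the feed per day.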
Don’t leave backups to humans. We’re too unreliable. Leave it to your server to handle.

