Caveats on Using Snapshots for Server-less Backups

Whether the cost is disk I/O for reading the data from the disks, CPU for compressing or encrypting the data (or both; remember to compress first and then encrypt!), or network for transferring the data to a backup server, the added load of a backup on your production servers is unwelcome. For this reason, the period of time during which backups can be made, a.k.a. the backup window, may be limited, even severely.
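To see why the order matters, here is a minimal sketch in Python. It assumes that well-encrypted output looks like random bytes, so os.urandom stands in for a real cipher, and the sample payload is made up for illustration.

```python
import os
import zlib

# Hypothetical payload: highly repetitive, like many logs or database dumps.
plaintext = b"INFO backup job finished OK\n" * 10_000

# Stand-in for ciphertext: the output of a good cipher is indistinguishable
# from random bytes, so os.urandom is used here instead of real encryption.
ciphertext_like = os.urandom(len(plaintext))

compressed_plain = zlib.compress(plaintext)
compressed_cipher = zlib.compress(ciphertext_like)

print("original bytes:          ", len(plaintext))
print("compressed plaintext:    ", len(compressed_plain))
print("compressed 'ciphertext': ", len(compressed_cipher))
```

The repetitive plaintext shrinks dramatically, while the random-looking data does not shrink at all. Encrypting first would throw away the savings in network and storage that compression is supposed to buy.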

You may say, "It only takes me X hours to do a full backup of everything", but over time backup windows are notorious for becoming too small. Backups get split over multiple days, technologies get upgraded, etc. When planning a backup strategy, my approach is to eliminate the backup window altogether; that is, do whatever you can to take the backup off the production hardware.

Storage snapshots are one method for taking the production servers out of the backup equation. By creating a consistent, point-in-time snapshot on your storage and mounting it on your backup server, you can back up your data using your backup server's resources while your production servers continue as usual.

Caveats of this method in general are:

EMC Replication Manager in Solaris

UPDATE: No ZFS Support for Replication Manager in the near future

Storage-level snapshots can be used to run backups without directly requiring resources from the original host.

EMC Replication Manager coordinates the creation of application-consistent snapshots across all the hosts in your network. It handles scheduling the creation and expiration of snapshots, mounting and unmounting them on backup servers, etc. from a single console.

Although it is not tightly integrated into EMC NetWorker like the similar NetWorker PowerSnap module, it can be used to start a backup process after taking a new snapshot, and it can manage snapshots unrelated to backups from a GUI.

While the data sheet claims support for Solaris, there are several caveats which I have run into.

Webservd Default Home Directory

Someone currently building an internal development environment required some integration between servers using SSH and the webservd user.

He came to me when he saw that the default home directory for the webservd user is /. He didn't want to create a /.ssh/authorized_keys file, and I didn't blame him. My first reaction was to change the home directory, but I didn't want to break something, so I opened up Google and found something incredible.

DISCLAIMER: The following is quoted from documentation at docs.sun.com (emphasis is mine). I do not recommend that you actually follow its instructions:

Real Time Reporting Databases

Reporting projects are the kind of projects which never seem to end. After a couple of iterations I've come to the following conclusions:

  1. Absolutely no reports should run on a production database.
  2. Moving/aggregating data from a production database to a reporting database using ETL tools is prone to synchronization issues and pretty unreliable.
  3. The best option is to set up real time replication of the data and build additional views on that.
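The third point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the reporting replica, and the orders table and customer_totals view are hypothetical names. In a real setup the replication tool, not the script, would keep the table current.

```python
import sqlite3

# In-memory SQLite stands in for the reporting replica (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical replicated table; replication would normally populate it.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("acme", 100.0), ("acme", 50.0), ("globex", 75.0)],
)

# Reporting-only view, defined on the replica and never on production.
conn.execute(
    "CREATE VIEW customer_totals AS "
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
)

# Reports query the view; each row is a (customer, total) pair.
totals = dict(conn.execute("SELECT customer, total FROM customer_totals"))
print(totals)
```

Because the view lives only on the reporting side, report definitions can change as often as the business wants without touching the production schema.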

Unfortunately, if you need to get data from heterogeneous databases (e.g. Oracle, MySQL, SQL Server) into a single reporting database, replication is not a simple solution. If you are running expensive database software in production, it may not be cost-effective to run the same database for reporting.

Of course, there are cross-database replication solutions like GoldenGate or SharePlex, but they are very expensive. I had already given up on getting data from Oracle into MySQL for reports when I stumbled across Tungsten Replicator.

Sun's Predicament

I've been working with Unix for a fairly long time now: about 13 years.

I'll admit that I started with Linux and thought it was light years ahead of SunOS 4.x running on those old SPARC machines. I mean, who had heard of SPARC processors? I remember my boss trying to explain to me that even an older SPARC processor was more powerful than a newer Intel Pentium processor. I didn't really believe him. In time, I convinced them to get rid of most of their SPARC/Solaris in favor of the hip, free, and cheap Intel/Linux combination.

Now I see that I couldn't have been more wrong. I realize that SunOS 4.x probably still has features which I don't know how to use properly. When I look at Solaris 10, ZFS, Zones, LDoms, DTrace, etc., I'm not really sure you could pay me to work with Linux (that would be so depressing). And that isn't even mentioning the SPARC hardware it runs on. Can any Intel server compare to a T5140???