Aug 6 2007

The Linux effect on storage

cgibney


Linux is currently used in about 20% of the medium to large sized data centers, and according to some reports, it will be in some 33% of data centers before the end of the year. By 2011, it is expected that most data centers will have at least half of their environment running some flavor of the Linux OS. As this platform really begins to settle in, it is important to consider the ramifications that it will have on storage, data protection and disaster recovery.

When I look at how a supplier handles coverage of a platform, I compare it to the games of checkers and chess. When a supplier has “checker coverage”, that means they have just enough support of the platform to be able to get a check mark. When I say they have “chess coverage”, that means they have deep coverage, including support for the specific databases that are popular on the platform.

Looking at the foundation of data protection, backup software is a good place to start. Most of the major suppliers certainly have “checkers” type coverage of the Linux environment. Most have Red Hat and maybe SUSE variants covered, but some still only support Linux as a client, meaning that the Linux servers cannot have locally attached tape. As your Linux environment grows, this can be a real problem. A handful of the backup software suppliers have also ported over their Oracle hot backup modules, and while Oracle on Linux is significant and growing, the MySQL install base seems to be growing faster. And, while in the past the typical MySQL data set was not nearly as large as the typical Oracle data set, it seems to be catching up there as well. A little farther behind is PostgreSQL, but it still has a significant install base and it too seems to be growing. So, it is important that your backup application supports more than just Oracle and can do more than just hot database backup. For example, it should be granular down to the tablespace level to help with faster backups and recoveries.

There are backup applications that support Linux completely, and there is no longer a need to sacrifice. This may mean supporting two backup applications in the enterprise: one for Windows and one for Linux. But, as I have said in past articles, while that is not ideal, it is not unacceptable, especially if it means you significantly improve your level of data protection on the second platform. You may find that your new product provides support that is as good as or even better than your original one.

When looking at core storage the situation is equally interesting. For block-based storage or SAN storage, basic support or “checker coverage” seems to be there across the board. Most of the SAN vendors support attaching Linux servers to their SAN storage over Fibre Channel, and their support for iSCSI connections is growing. There is not much support beyond this basic connectivity though. For example, there is limited support for boot from SAN.

Interestingly, when it comes to SAN-based storage the manufacturers have created modules for specific applications that allow their SAN arrays to better interact with them. For example, they might have a module for Exchange that will quiesce the Exchange environment, take a clean snapshot and then mount that snapshot to a backup server for off-host backup. Despite the increased growth of the Linux install base, and especially the growth of MySQL and PostgreSQL in that environment, we have not seen many specific tools to protect these increasingly critical applications. You can write scripts to accomplish the above, and in many cases now you have to. But, it would be better to have this integrated into the storage solution, so you can avoid all the issues that surround homegrown scripts.
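To make the quiesce, snapshot and mount sequence concrete, here is a minimal Python sketch of the kind of homegrown script described above. It is an illustration under stated assumptions, not any vendor's module: the four callables (freeze, thaw, take_snapshot, mount_on_backup_host) are hypothetical hooks that you would wire to your own database and array commands. The one design point it shows is that the application is released again even if the snapshot fails.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def quiesced(freeze: Callable[[], None], thaw: Callable[[], None]) -> Iterator[None]:
    """Hold the application in a quiesced state for the duration of the block.

    `freeze` and `thaw` are hypothetical hooks supplied by the caller.
    `thaw` always runs, even if the body raises.
    """
    freeze()
    try:
        yield
    finally:
        thaw()


def snapshot_backup(
    freeze: Callable[[], None],
    thaw: Callable[[], None],
    take_snapshot: Callable[[], str],
    mount_on_backup_host: Callable[[str], None],
) -> str:
    """Quiesce, snapshot, release, then hand the snapshot to the backup server.

    All four callables are assumptions made for this sketch; they stand in
    for site-specific database and storage array commands.
    """
    # Keep the quiesced window as short as possible: only the snapshot
    # itself happens while the application is frozen.
    with quiesced(freeze, thaw):
        snapshot_id = take_snapshot()

    # Mounting happens after the application is running again, so the
    # backup proceeds off-host without holding up production.
    mount_on_backup_host(snapshot_id)
    return snapshot_id
```

Passing the steps in as callables keeps the ordering logic separate from the commands themselves, which is one way to limit the maintenance burden that comes with homegrown scripts.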

With Linux and NAS based storage, you have to be equally careful. Linux file systems follow Unix conventions, so working with a Windows Storage Server based NAS can often be problematic. In all fairness, a Linux based NAS often has problems with Windows clients. There are two options here. You could focus on the Tier 1 NAS providers that have the Unix and Windows file system differences mostly resolved. This has challenges in cost, but provides comfort and reliability. Another option is to use a virtualized network file management tool. With a network file management product you can have both a Windows NAS and a Linux NAS and have data directed to the appropriate NAS based on data type, allowing for seamless support of both file systems. Of course, a network file management product delivers far more than this. For example, it can enable a migration of data as it ages to a disk-based archive or it can help with migration to a new NAS platform altogether.
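As a rough illustration of directing data to a NAS based on data type, the following Python sketch routes a file by its extension. Everything in it is hypothetical: the target names "windows-nas" and "linux-nas", the extension table and the default are assumptions made for the example, and a real network file management product would apply far richer policies.

```python
from pathlib import PurePosixPath

# Hypothetical policy table: file extension -> NAS target name.
ROUTES = {
    ".docx": "windows-nas",
    ".xlsx": "windows-nas",
    ".pst": "windows-nas",
    ".sh": "linux-nas",
    ".conf": "linux-nas",
    ".log": "linux-nas",
}

# Assumed fallback for anything the policy table does not mention.
DEFAULT_TARGET = "linux-nas"


def route(path: str) -> str:
    """Return the NAS target for a file, chosen by its extension.

    Matching is case-insensitive; files with no extension or an unknown
    extension go to DEFAULT_TARGET.
    """
    suffix = PurePosixPath(path).suffix.lower()
    return ROUTES.get(suffix, DEFAULT_TARGET)
```

For instance, route("/share/reports/Q2.XLSX") selects the Windows target, while a shell script or an unrecognized file type falls through to the Linux target.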

Disaster recovery is another point of consideration. If you are replicating at the SAN level, then the SAN storage controller itself can cover most of this. But if not all of your Linux data is on a SAN, then you may have issues with replication of disaster recovery data. Among the available replication software applications, there are some very Linux focused products but not many that can cover the enterprise. Replication is an area where you don’t want to have too many different tools to monitor and manage. Focus on finding a solid multi-platform tool that can replicate Linux, Unix and Windows data.

Linux is going to be increasingly important in enterprises of all sizes and it seems that the traditional market leaders in storage are going to ignore the platform or give it just “checker” type coverage. The new players on the market are taking advantage of this and are moving quickly to fill the void. It is interesting to note that most of the manufacturers that have a strong Linux solution also have an equally strong Windows and Unix solution. So, in providing only the most basic support for Linux, the market leaders may end up ceding the entire enterprise.

    very interesting. i never thought Linux would be this popular.
  • MS
    Linux is such a Windows wannabe. Good job on using a "free" OS to save a bit of dollars and then you have to hire separate support people who don't work well with others and the OS has no reliability without relying on Windows wannabe drivers. LOL!
  • Jesse
    As a storage consultant I have to admit I'm seeing more and more Linux in the enterprise datacenters. A recent client, a well established telecom provider, used a 32 node SAN attached bladecenter environment running SUSE Linux as its primary OS. I was impressed by the ease of configuration and at the same time how deep the configuration detail could go. As far as the comment above about Linux being a "Windows Wannabe" - I beg to differ. I've been in a position to hire Linux engineers as well as Windows, and the one thing I can definitively say about the difference is that Windows engineers are really good at following procedures and whitepapers, but often fold when the time comes to troubleshoot/think creatively. This is the result of M$ engineers having a severely myopic view of the world. If it's not Microsoft, it's not worth knowing. Linux engineers on the other hand want to know as much as they can about as much as they can. When it comes time to hire a Windows person, I will often toss the resume of anyone who puts MCP/MCSE etc, directly into the trash, and instead look for a Linux/Unix engineer who also knows Windows. That way I'm getting someone who learns and grows beyond their original programming.
  • The author makes the common mistake of assuming that enterprises run only Windows and Linux. Of course this is false, and they will need backup applications that can handle the above, as well as the various vendor supplied UNIX systems, OpenVMS, Tandem, mainframe, AS/400, etc... I'm pretty sure there is no one 'enterprise backup solution' that can handle all of the above. If it's possible for an application to be quiesced/frozen while a storage array snapshot is taken, the snapshot can be backed up (a number of ways) block for block, which can be good for quick easy image backup/bare metal recovery. We'll still probably need more than one file-based/application-based backup, in addition to this.
