Storage Soup

Jun 4 2007   7:54AM GMT

Cross correlation engines reaching into primary storage

Nicole D'Amour

You have seen my writings on (and may even have heard me speak about) the Cross Correlation (CC) analytics engine as a necessary part of a Data Protection Management (DPM) product. DPM products make your backup and restore environment work more efficiently. Recently, I have seen CC techniques applied to solve problems on the primary storage side. And much to my pleasure, I have also seen the technique applied to manage application performance.

Several players are delivering products in the DPM market, including Aptare, Bocada, Illuminator, Servergraph, Tek-Tools and WysDM, and most recently, Symantec, with its NetBackup Reporter product. These products, as a category, are delivering real value, based on my conversations with many of you. EMC, which resells WysDM as Backup Advisor, is apparently shipping in large quantities. All the big data protection vendors have gotten religion on this recently, and they are all scrambling to add DPM functionality via in-house R&D or through a partnership.

To be sure, not all products are created equal in terms of the strength of the CC engine (or even the existence of one), which to me is the essence of the product. Without a sound CC engine, the best a product can do is rudimentary analysis and basically report on changes.

I have seen two new and interesting uses of CC recently. First, WysDM announced WysDM for File Servers. Essentially, that means the same CC engine is being used to look at NetApp filers (primary storage) to determine if the filer is behaving as it should. Much as before, the product gathers data from the application and from all the hardware and software layers that reside between it and the filer, and applies analytics to determine if the system is behaving within acceptable boundaries. Are response times to file requests deteriorating? Is capacity being utilized efficiently? Is a file system about to run out of storage? What needs to be done to solve the problem? Will an additional Gigabit Ethernet (GE) connection make a difference? You get the point.

I know you are probably saying to yourself, “Don’t I get some of that information from the filer’s integral management tool?” Of course you do. But, just like on the data protection side, the amount and type of information about the environment that was being delivered before this tool was available was rudimentary and static. Unless one steps outside the filer and looks at the entire picture from end to end, it is hard to determine the root cause of a problem that exists or is in the making. That can only be done with a sophisticated CC tool. And only a sophisticated tool will give you predictive information with a high degree of confidence.
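To make the end-to-end idea concrete, here is a minimal sketch in Python of one way metrics from different layers could be compared against a symptom. It is only an illustration under stated assumptions: it uses a plain statistical lagged correlation, which is not necessarily how any vendor's CC engine works, and every metric name and sample value in it is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def lagged_correlation(cause, effect, max_lag):
    """Correlate cause[t] with effect[t + lag] for each lag in 0..max_lag."""
    results = {}
    for lag in range(max_lag + 1):
        xs = cause[: len(cause) - lag]
        ys = effect[lag:]
        results[lag] = pearson(xs, ys)
    return results


def rank_suspects(layers, symptom, max_lag=3):
    """Rank each layer's metric by its strongest positive lagged correlation."""
    ranked = []
    for name, series in layers.items():
        by_lag = lagged_correlation(series, symptom, max_lag)
        best_lag = max(by_lag, key=lambda k: by_lag[k])
        ranked.append((name, best_lag, round(by_lag[best_lag], 3)))
    return sorted(ranked, key=lambda item: item[2], reverse=True)


# Hypothetical samples taken at the same fixed interval from different layers.
layers = {
    "network_link_util_pct": [30, 32, 31, 60, 85, 90, 88, 50, 35, 33],
    "filer_cpu_pct": [40, 42, 39, 41, 40, 43, 38, 41, 40, 42],
}
# Hypothetical symptom: file request response time in milliseconds.
file_response_ms = [5, 5, 6, 5, 12, 20, 22, 21, 10, 6]

ranked = rank_suspects(layers, file_response_ms)
for name, lag, score in ranked:
    print(f"{name}: best lag {lag} samples, correlation {score}")
```

In this made-up data the response time tracks the network link one sample later, so the link ranks first, while the filer's own CPU metric shows little relationship. A high correlation only points to where to look; it does not prove a cause, which is part of why a real engine needs far more context than this.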

Another company that has applied CC to primary storage is Illuminator Software, whose DPM product now includes functionality for snapshots and replication. But the product is still true to its data protection roots. In this case, the product provides information on the readiness of volumes from a data recoverability point of view. Whether the volume is protected using snapshots, replication, secondary disk or tape, its recoverability is established and reported on. The product also offers advice on the actions necessary to improve recoverability.

The third company, Akorri Networks, has applied a CC engine for an entirely different purpose: to provide insight into application performance. Of course, application recoverability is improved when application availability is improved, so there is an underlying connection here. But the overt focus is to provide insight into how storage resources are being used to deliver a certain level of performance at the application level. In other words, given a particular SLA for an application, does one have adequate or inadequate storage resources applied? Would extra resources (higher throughput storage, more storage, another pipe to storage, etc.) help to bring application performance back into SLA boundaries? Or would it be a waste? What would help the most? With this kind of information, the right type and quantity of resources can be applied, saving both time and resources.

The progress in these areas has been truly phenomenal in the last three years, and yet we are still in the infancy stages of utilizing these tools. Most of these technologies have become available from smaller companies, whose reach is limited. Given that your environment is only getting more complex, it behooves you to check these out! Send me an email if you need any help.
