Storage Channel Pipeline

Jun 13 2011   2:12PM GMT

Help for data hoarders, Part 2: Content indexing systems

Eric Slack

In the last post we talked about the propensity most of us have toward saving data, or at least not deleting it (I think there’s a difference), because we might need it someday. There are some hidden costs to saving too much data, beyond simple acquisition, power, cooling and floor space. These are “opportunity costs”: excess data makes finding the information you need take longer, reduces productivity and increases frustration. The idea is that we have a fixed number of hours in a day, and while we’re doing one activity, we’re not able to do another (hence the “opportunity”).


From a VAR’s perspective, these are the kinds of “pain” situations to look for. A data deluge problem can be solved in a number of ways and can involve multiple vendors’ products, making it an ideal integration opportunity. One solution is to use a content indexing application to create a searchable index of all this unstructured and unorganized data that people can’t seem to get rid of.
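To make the idea concrete, here is a minimal sketch of what a content index does under the hood: it walks a file tree, tokenizes each file, and builds an inverted index mapping each word to the files that contain it, so a search touches the index instead of re-reading every file. This is a hypothetical illustration, not how any of the products discussed here are implemented; real indexing engines also parse binary file formats, extract metadata, and rank results.

```python
import os
import re
from collections import defaultdict

def build_index(root):
    """Walk a directory tree and build a simple inverted index:
    each lowercase word maps to the set of file paths containing it.
    A minimal sketch; commercial indexers also parse file formats,
    extract metadata, and rank search results."""
    index = defaultdict(set)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    text = f.read()
            except OSError:
                continue  # skip unreadable files
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add(path)
    return index

def search(index, query):
    """Return the files containing every word in the query (AND search)."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Even this toy version shows the payoff: once the index exists, “where is the quarterly revenue report?” becomes a dictionary lookup rather than a scan of every share on the network.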


There are a number of ways to implement file indexing. One approach is to integrate archiving with other data applications, such as email and backup. Symantec’s Enterprise Vault and CommVault’s Simpana offer data archiving with storage tiering and content indexing for e-discovery and other requirements. They integrate archiving with backup, email and data management to help companies get their arms around their unstructured data across the enterprise.


Digitiliti takes a slightly different approach, combining active archiving storage and content indexing to provide what it calls a Universal Archive Platform (UAP). This solution consists of an “information director” and an “archive store,” with a software agent for each client computer or server. When first implemented, data stores — typically file servers, NAS appliances and email servers — are “ingested” into the archive, and a content-aware index is created. Subsequent changes and new files are then added to the archive as the files are saved, keeping the archive current without impacting users or performance, similar to the way a continuous data protection (CDP) process works. A scale-out, modular architecture with an object-based file system allows the “archive store” to be geographically dispersed or located in Digitiliti’s cloud. As data is moved off the primary storage locations, the UAP can free up this premium space, and moving data to the archive tier can take it out of the backup rotation, if so desired.
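The “ingest once, then capture only changes” pattern described above can be sketched in a few lines: after a full initial scan, each later pass compares content hashes against the previous scan and picks up only new or modified files. This is a hypothetical illustration of the general technique, not Digitiliti’s actual agent; shipping products typically subscribe to filesystem change notifications rather than rescanning.

```python
import hashlib
import os

def file_digest(path):
    """Content hash used to detect files that changed between scans."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def incremental_scan(root, seen):
    """Compare the current file tree against `seen` (a dict of
    path -> digest from the previous scan), update `seen` in place,
    and return only the paths that are new or modified. A sketch of
    the incremental-ingest idea; real agents hook filesystem change
    notifications instead of rescanning everything."""
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            digest = file_digest(path)
            if seen.get(path) != digest:
                seen[path] = digest
                changed.append(path)
    return changed
```

The first call with an empty `seen` dict is the initial ingest (every file comes back as changed); each later call returns only the delta, which is what keeps the archive current without a full re-index.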


A more specific use case is legal discovery, which relies on an application that crawls all file sources on the network and creates an index usable for e-discovery, as well as for data mining and overall organization and productivity. This type of solution can be implemented as a network-based application or a cloud-based service, or it can be offered on an engagement basis by service providers. We’ll take a look at some of these in a future post.


Follow me on Twitter: EricSSwiss
