Storage Soup

May 9 2013   12:09PM GMT

Keeping all data is a dangerous policy

Randy Kerns Randy Kerns Profile: Randy Kerns

There is a prevalent problem in Information Technology today – too much data.

Most of the data is in the form of files and called unstructured data. Unstructured data continues to increase at rates that average around 60% per year according to most of our IT clients.

Structured data is generally thought of as information in databases and this type of data is experiencing a much smaller increase in size than unstructured data. The unstructured data is produced internal to IT and from external sources. The external sources include sensor data, video information, and social media data. This type of growing data is alarming because there are so many sources and the information is used in data analytics that typically originate outside of IT.

The big issue is what to do with all that data that is being created. The data is stored while needed, which is during the processing for applications or analytics and while it may be required for reference, further processing, or the inevitable “re-run” in some cases. But what is to be done with the data later? Later in this case means when the probability of access drops to the point that it is unlikely to be accessed again. There is also cases when the processing is complete (or project is complete) and the data is to be “put on the shelf” much as we would in closing the books on some operation. Does the data still have value as new applications or potential usages develop? Will there be a potential legal case that will require the data to be produced?

The default decision for most operations is to save everything forever. This decision is usually made because there is no policy around the data. IT operations do not set the policies for data deletion. Because the different types of data have different value and the value changes over time, the business owners or data owners must set the policy. IT professionals generally understand the value but usually are not empowered to make those policy decisions. Sometimes the legal staff sets the policy, which absolves IT of the responsibility, but that may not be the best option. In a few companies, a blanket policy is used to delete data after a specific amount of time. This may not withstand a legal challenge in some liability cases.

Saving all the data has compounding cost issues. It requires buying more storage, adding products to migrate data to less expensive storage, and increasing operational expenses for managing the information, power, cooling, and space. Moving the data to a cloud storage location has some economic benefit, but that may be short-sighted. The charges for data that does not go away continue to compound. Storing data outside the immediate concern of IT staff takes away from the imperative to make a decision about what to do with it.

Besides the costs of storing and managing the data, the danger is that there may be some legal liability for keeping data for a long time. The potential for an adverse settlement based on old data is there and has been proven extremely costly. More impacting to IT operations is the discovery and legal hold required. Discovery requires searching through all the data, including backups, for requested information and legal hold means no deletions of almost anything – no recycling of backups. This causes even more operational expense.

Not establishing a deletion policy that can pass a legal challenge is a failing of a company and results in additional expense and liability. IT may the first responders on the retain-forever policy, but it is a company issue.

(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).

1  Comment on this Post

 
There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when other members comment.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
  • JimMcGann
    We just completed a data profile at a client site where they had a 100 TB user share and were confident most of the data was of no value.   They found that 22% was owned by ex-employees (inactive Active Directory users), 34% was not accessed in more that 3 years, 24% was duplicate, 4% was personal multimedia files (iTunes).  In the end they found that about 20% was active data required for the business and the balance was useless content that even legal had no interest in.  As you say storing 80% of abandoned and rotted data is not only a waste of money and resource but also a liability.  Jim McGannIndex Engines
    20 pointsBadges:
    report

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to: