Posted by: Brian Gracely
Big Data, Data Gravity, Data Science, Data Scientist, Database, DBaaS, Google, Storage
Earlier this week, Andi Mann (@andimann) posted a simple question on Twitter about how we should treat our data. It evolved out of the commonly used phrase that “you should treat servers like cattle and not pets”, a reference to more modern applications that are modular and designed around the failure characteristics of commodity hardware.
[NOTE: There is some debate over who originated or popularized this phrase]
While choosing how to treat your servers can be closely aligned to the types of applications being used, it’s a little more complicated when trying to align an analogy to data. First of all, some people think data (by itself) isn’t very valuable until you apply some context around it and turn it into information. Then there are the “Data Gravity” theories (podcast). Others focus primarily on how data is organized and manipulated within various databases as the critical element to address. Still others are focused on the complexity and variability of the storage mechanism of the data (lots of architectures and form factors).
So why would I say that we should treat the data like “grandparents”?
Let’s start with some basic analogies and comparisons.
Value of Data – Most people would agree that their data is important, and they are willing to spend large sums of money to keep it from being lost (or corrupted). This is not only true of the storage industry, but also of the human life industry. In data, just as in life, we will often spend 3-10x as much to retain and protect the data as we did to create it. Backups, Clones, Snapshots, Archives, Cloud Storage, Home Storage, etc. All to make sure that data is around for a long, long time.
Eliminating Data – Keeping track of data is a difficult task, whether it’s stored locally (on-premises facilities) or externally (Cloud, CoLo, Outsourced). And as much as storage can grow (in terms of cost, complexity, physical space), almost nobody is in a rush to eliminate the data, even if it might make economic sense. People are inherently “hoarders”, always believing that information will be needed again at some point. And as we’ve seen in recent years, even suggesting that things might be eliminated can become a highly emotional topic of discussion.
While we live more and more in a real-time, information-driven world, with more data being created each second, we still place a high value on non-real-time information. We learn from it. We use it to do research. We use it to support our arguments. We use it to learn from history so that maybe we won’t make the same mistakes again.
By no means is the “grandparents” analogy as clean as the pets vs cattle analogy, but I believe it has some strong correlations. We might not all love it, because at times it can be complex, messy and often frustrating. But that is almost always the case when trying to retain something of value and then trying to gain new value from it.
What analogy would you use to answer Andi’s original question?