December 21, 2008 5:14 PM
Posted by: Taylorallis
Jingle bell storage: What to buy a geek
A great list for the geek in your life. Best gift idea? Kiva.org! I’m the geek Beth mentions in the last paragraph. My buddy who gave me the gift? CBS News’ Hari Sreenivasan.
Economic Downturn Hits Storage Spending
Ok – so I think we all know that our IT budgets will be flat or cut next year. We also know that data will continue to grow (and grow fast) regardless of what we do with our budgets. So what to do? Optimize, optimize, optimize. If you haven’t run a formal efficiency and optimization program in your storage infrastructure environment, then you are overlooking a huge chunk of wasted capacity and space. The storage utilization and allocation rates are far worse than most vendors are telling us (I know because I came from one!). It’s not in their best interest to sell less storage, but it is in yours to buy less…
The state of data backup in 2009, Part 3
Good reading in Beth’s Data Backup series – this issue covers disaster recovery and the emergence of Cloud backup. Amazon was first to market with S3, EMC came out with Mozy, and Beth writes about Symantec’s offering…
Symantec adds change management to SRM
Symantec mentions making agentless options available too. When I work with IT admins this is at the top of their list – they don’t like agents crawling all over their environment. I understand – but agents will still be around, maybe minimized. ESG says that SRM tools aren’t viewed as a must-have – this is unfortunate. There is HUGE value in them – but you need to know (or have a partner that knows) how to deploy them, interpret them, and ACT upon the data. If you don’t take this step, you just bought shelf-ware. If you do, you can free up between 30% and 70% of your capacity – I’ve seen this done with multiple infrastructures and count THAT as a must-have.
Brocade Buys Foundry Networks for $3B
Brocade drops $3B to pick up Foundry – a good move on the surface. This will make them more competitive with Cisco – offering LAN and WAN equipment. 10GbE still looks to be a great bet, and networking companies are investing in it. See Stephen Foskett’s blog.
HDS embraces SSDs
HDS is a little late to the party (EMC led the charge, followed by Sun and others). But the USP is a great disk system, and SSD will make it better. STEC is emerging as the SSD partner to have. Again – if you have any apps that live or die by latency, you need to be researching SSD options.
“Despereaux” uses clustered storage
For you HPC junkies, the movie “Despereaux” had to churn through 1,700 shots and 90 million images. They did it the way most do: Linux clusters running an HPC filesystem – in this case Lustre. They stored 200TB of generated data on Infortrend’s EonStor RAID system (comment if you know anything about EonStor – I don’t). I’ll probably take my boy to the movie and bore him with the details of how it was made…
December 19, 2008 6:32 AM
Posted by: Taylorallis
Andrew Reichman has written a Forrester report titled “Do You Really Need A SAN Anymore?”
The title itself has generated some industry buzz for obvious reasons, and several blog posts from SAN providers. Check out Chuck Hollis’ (EMC), Hu Yoshida’s (HDS), Tony Asaro’s, and Chris Evans’.
My partner-in-crime Randy Chalfant has also commented on the report and blogs – but I’ll give you fair warning that my friend’s storage knowledge is only surpassed by his passion – and he pulls no punches!
In a nutshell, Andrew discusses the benefits and challenges of SANs. He concludes that SANs have not lived up to their expectations, and that a new approach should be evaluated. This new alternative is “Application-Centric Storage,” which he defines as storage infrastructure managed by applications like mail, database, or hypervisor apps. If data management functionality (snapshots, replication, provisioning, de-dup, etc.) lives in the application layer – then you only need commodity disk (JBOD or RAID) on the backend. (This is not unlike Sun’s “Open Storage” message, which I blogged about often when I was with them.)
What’s wrong with SAN?
The report lists four current-day SAN challenges: On the first issue with SANs (low utilization rates) I give Andrew kudos because he hits the nail on the head. The second issue he lists (limited workload-sharing) is really not a SAN issue at all. Same with his third issue (vendor heterogeneity) – not a SAN issue; in fact it’s an issue with ANY storage solution. His last point (block storage has limited information context) is also valid.
Poor Utilization: On low SAN utilization rates, a distinction and definition needs to be made clear:
Allocation = How efficiently you use what you buy – allocated storage space vs. how much storage you bought. BUT this rate can look good even if you over-allocate your storage or do not use the allocated space efficiently. If it is not measured the right way it can be misleading – you can overlook a significant amount of wasted space.
Utilization = How frequently data is re-referenced (i.e. utilized). We have found that up to 40% of a primary disk array’s capacity holds data that is inert – data that has not been referenced in 6 months or more. Does anyone want to pay tier 1 disk/SAN prices for data that is not referenced for over 6 months?
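To make the distinction concrete, here’s a minimal sketch with made-up numbers (the 85% allocated / 30% referenced split below is hypothetical, but roughly in line with the rates we see):

```python
# Hypothetical numbers for a single array, just to show how allocation
# and utilization can tell two different stories.
purchased_tb  = 100.0  # raw capacity bought
allocated_tb  = 85.0   # capacity carved out and assigned to hosts/apps
referenced_tb = 30.0   # capacity holding data actually touched in the last 6 months

allocation_rate  = allocated_tb / purchased_tb    # "how much of what we bought is assigned"
utilization_rate = referenced_tb / purchased_tb   # "how much of what we bought is actively used"

print(f"Allocation:  {allocation_rate:.0%}")   # 85% -- looks great on a vendor slide
print(f"Utilization: {utilization_rate:.0%}")  # 30% -- the number that should drive buying decisions
```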
Some storage vendors use the terms incorrectly or interchangeably – so if a storage manufacturer says their arrays offer great utilization rates, make sure they are not just talking about allocation. Mr. Reichman gets this right – and his numbers show utilization rates around 20% to 40%. This is absolutely what we see in the industry.
But I disagree on what he says the root cause is – SANs!?!? You can find utilization challenges on DAS and NAS too. Randy rants on the root causes here – a lack of good data management practices, poor storage technology implementation, and other process issues are the root cause – not SANs.
Limited workload-sharing: I commented on a good point Chuck Hollis makes – the Forrester report mixes technology issues with people issues. The report says SANs are challenged because individual departments silo their storage infrastructure and applications. Storage silos have to do with data management practices (or lack thereof) and little to nothing to do with SAN technology.
The New Solution?
Another kudos for Forrester – they proposed a solution. A pet peeve of mine is when someone critiques something without offering an alternative approach.
The new approach according to Forrester is Application-Centric Storage: Basically, embed data management software in an application and directly attach cheap RAID or JBOD to it. And if you wanted to get creative you could cluster these systems together at the file system level.
The Benefit: One of the largest benefits to this approach is that applications give data context and support business objectives directly – so they are in a better position to manage and tier data (i.e. ILM). I couldn’t agree more.
The Challenge: I don’t think Application-Centric Storage will disrupt the current SAN model anytime soon and here is why:
Storage as a Feature: First of all, Forrester gives a list of storage software that can make its way into applications like Oracle, Exchange, etc. (snapshots, replication, reporting, thin provisioning, deduplication, etc.). While some applications and file systems are starting to offer this functionality, I have a hard time believing they will offer all of these features at the level that storage manufacturers do currently. I came from StorageTek’s RD&E department and have seen my fair share of development roadmaps. The first items to work on are the software’s core value and offering. The first items to go are the ones that are “nice to haves” but don’t significantly enhance the core offering. Storage management technology will always be secondary to most app vendors and will receive fewer resources than other application features. This will also hamper storage innovation…
An application vendor didn’t come up with data deduplication, a storage start-up did!
App admins take on storage: A second point made is that application admins can manage storage better than storage admins because they know what the data is being used for. Same problem as above – storage is not their core competency nor do they want it to be (otherwise they’d be storage admins!). One needs to know storage process and technology – even if your storage is directly attached. There is a way to prove my point as well – take all the tasks of an overworked storage admin and give them to an overworked application admin and see what happens…
The solution is not to throw out SANs and their storage administrators with them.
The answer is for application managers and storage managers to work better together. And I put the responsibility on the storage people to do this – storage should support application requirements which support business requirements – so storage departments should be surveying application departments monthly on what their requirements are.
Application-Centric Storage Availability:
So what’s available today? Some approaches mentioned include:
Microsoft Exchange: They have a whitepaper on an Exchange + DAS solution.
Oracle Exadata: They use an application-based data volume manager. This is a great feature, but only good for your BI and data warehouse needs. What’s more, they use a proprietary InfiniBand network which doesn’t show much of a cost benefit over a FC SAN.
VMware: I am getting excited about the new storage features that are popping up here. Another point here was made by a brilliant colleague of mine at Sun:
Several customers have deployed SANs specifically to support growing VMware environments!
The Bottom Line:
Do You Really Need A SAN Anymore? YES.
If you are running Exchange or Oracle Exadata and want storage solely dedicated to these apps, then check out Application-Centric Storage. For everything else, look at networked storage as a viable option.
With that said, I can almost guarantee that the allocation and utilization rates for your storage systems (including SAN and everything else) are not where they need to be. The solution is NOT to throw out your SAN and deploy something else. The answer is best stated by Randy’s blog on this subject:
“The best answer for you is to better manage what you already have against a criterion that actually means something to the business. In other words – not technology for technology’s sake, but infrastructure for business’s sake.”
December 12, 2008 7:37 AM
Posted by: Taylorallis
I was recently talking about a Storage Magazine article, Dedupe moves beyond backup.
The conversation led me to look back on some of my past analysis around de-dup. I ended up looking 5 years into the past.
Global Compression at StorageTek
At StorageTek there used to be an engineering research and IP department called Advanced Technology Research or “AdTek.” My current business partner and boss, Randy Chalfant, used to run it. A brilliant engineer by the name of Chuck Milligan ran the group after Randy – Chuck is the one who hired me at StorageTek. I eventually ended up heading the department.
I was looking at an old list of Research Probes we were recommending to STK execs for productization – there were 11 cases we presented in 2003 (Grid Storage, Flash/SSD, Encryption, etc.). On the list was “Global Compression.” In our pitch to management, we stated that this yielded extremely high compression ratios and had the potential to disrupt tape. We recommended adding it as a feature to the backup disk products STK was looking to bring to market – we even recommended some companies to evaluate for investment. (Unfortunately, some other probes were picked for further research that year!)
Fast forward some years and my strategy team and I found ourselves briefing Sun executives (after the STK acquisition) on the future of de-duplication as it has come to be known. I remember saying two things:
1. De-duplication has officially moved from cutting edge to a must-have for disk backup, VTL, and secondary storage
2. Dedup will move from secondary storage to primary storage in the future (we backed up our claims with an excellent 451 Group report on the subject)
Dedup in Primary vs. Secondary Storage
Now we have dedup in primary storage. However, some think primary storage is not always the best place for dedup. The thinking is that de-dup works where there is a lot of…duplication. Primary storage tends to hold more transactional data, while secondary storage has more duplicate data. While this is true, there is more duplicate data on primary storage than users know.
I have moved from simply recommending storage strategies to actually implementing them in my new venture (which is much more fun!) Dedup is one of the steps we use with clients to get to a more efficient and optimized storage infrastructure.
We help storage users identify all of the inert data sitting on their primary storage – data that has not been referenced in more than 6 months. Users are almost always surprised about how much we find – around 40% on average.
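For the file-based portion of that data, the idea is simple enough to sketch in a few lines of Python. This is an illustration only – the mount point is hypothetical, real assessments use SRM tooling and array-level metadata, and access times (atime) are often disabled – but it shows what “inert” means in practice:

```python
# Illustration: walk a file tree and total up data that hasn't been read in
# ~6 months. Real assessments use SRM tools and array metadata; atime may be
# disabled on many filesystems, so treat the numbers with care.
import os
import time

SIX_MONTHS = 182 * 24 * 3600           # roughly six months, in seconds
cutoff = time.time() - SIX_MONTHS

total_bytes = 0
inert_bytes = 0
for root, _dirs, files in os.walk("/primary/storage/mount"):   # hypothetical mount point
    for name in files:
        path = os.path.join(root, name)
        try:
            st = os.stat(path)
        except OSError:
            continue                    # skip files that vanish or can't be read
        total_bytes += st.st_size
        if st.st_atime < cutoff:        # last access older than the cutoff
            inert_bytes += st.st_size

if total_bytes:
    print(f"Inert data: {inert_bytes / total_bytes:.0%} of {total_bytes / 1e12:.2f} TB scanned")
```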
The next question is what to do with this data – it needs to be cleaned up or moved in order to return that 40% to free pool capacity.
One clean up step is dedup – and in some instances a significant amount can be deduplicated. What are duplicates doing on primary storage? A lot of data management practices (or lack thereof) lead to this.
One example: In many cases application engineers will be testing new applications or updates. They need to run tests on real data – but obviously can’t run them on live, production data. So, they make a snap copy of the production data and run the tests against this data set. If they want to run another test, they’ll make another copy and so on. Do they remember to go back into the system and clean up their copies? Most often the answer is no – and this simple process (which is one of many) robs a primary disk system of its precious capacity.
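Here is a file-level sketch of how you might spot those forgotten copies – hash file contents and group identical files. (Array-based dedup works at the block/sub-file level; this is only meant to show the principle, and the directory path is hypothetical.)

```python
# Group files by content hash to find duplicate copies (e.g., stale test copies)
# and estimate how much capacity they hold. File-level only; real dedup engines
# work at the block/sub-file level.
import hashlib
import os
from collections import defaultdict

def sha256_of(path, chunk=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

copies = defaultdict(list)
for root, _dirs, files in os.walk("/primary/storage/test_copies"):   # hypothetical path
    for name in files:
        path = os.path.join(root, name)
        try:
            copies[sha256_of(path)].append(path)
        except OSError:
            continue

reclaimable = sum(
    os.path.getsize(paths[0]) * (len(paths) - 1)   # keep one copy; the rest is reclaimable
    for paths in copies.values()
    if len(paths) > 1
)
print(f"Roughly {reclaimable / 1e9:.1f} GB held by duplicate copies")
```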
So, deduplication can have a significant impact on primary storage in addition to secondary storage. But like any storage technology, the way in which it is implemented is the critical part of the equation.