Posted by: Eric Slack
big data, Eric Slack, scale-out storage, Storage Channel Pipeline
Scale-out storage solutions built on industry-standard hardware have some interesting capabilities, since each storage node includes significant processing power. For these grid-based architectures, having relatively abundant CPU physically distributed with the storage lets them maintain performance as they scale and support additional storage services (snapshots, replication, deduplication, etc.). They can also load a compute engine, such as Hadoop, into these storage nodes and perform distributed processing tasks.
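To see what "distributed processing in the storage nodes" means in practice, here is a minimal, single-process sketch of the MapReduce pattern that an engine like Hadoop spreads across nodes. The function names and sample data are illustrative only, not Hadoop or Cleversafe APIs:

```python
# Illustrative sketch of MapReduce (word count), the pattern a compute
# engine such as Hadoop distributes across storage nodes.
from collections import defaultdict

def map_phase(record: str):
    # Mapper: emit (word, 1) pairs for each word in a record.
    for word in record.split():
        yield word.lower(), 1

def shuffle(pairs):
    # Shuffle: group values by key. In a real cluster this step
    # moves data between nodes over the network.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts for each word.
    return {key: sum(values) for key, values in groups.items()}

records = ["big data on scale-out storage", "big compute with big data"]
pairs = [pair for record in records for pair in map_phase(record)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # 'big' appears 3 times, 'data' twice
```

The appeal of running this in the storage grid is data locality: the map step can execute on the node that already holds the data, rather than shipping the data to a separate compute cluster.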
Cleversafe has established itself in the scale-out storage market with its Distributed Storage Network (dsNet), an object-based storage system that provides secure, geographic data dispersion using proprietary erasure coding and data encryption. It has built a user base that includes some very large cloud companies, including the Shutterfly photo website.
Now Cleversafe is announcing a compute capability that will leverage available CPU and memory resources on each node to run an embedded Hadoop MapReduce application.
Hadoop environments have historically used the Hadoop Distributed File System (HDFS), which relies on a single metadata server that can become a bottleneck as the environment scales. HDFS also maintains multiple copies of the data (typically three) for availability, which adds storage overhead and decreases efficiency. Cleversafe's dsNet system replaces HDFS, reducing that data redundancy and improving metadata handling with its distributed storage architecture.
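The efficiency argument comes down to raw capacity consumed per byte of user data. A quick sketch compares HDFS-style three-way replication with a k-of-n erasure-coded dispersal; the 10-of-16 configuration below is a hypothetical example, not a documented Cleversafe setting:

```python
# Storage overhead: raw bytes stored per byte of user data.

def replication_overhead(copies: int) -> float:
    """N-way replication stores N full copies of the data."""
    return float(copies)

def erasure_overhead(data_slices: int, total_slices: int) -> float:
    """k-of-n erasure coding stores n slices, any k of which
    can rebuild the data."""
    return total_slices / data_slices

print(replication_overhead(3))   # 3.0 -> 200% overhead
print(erasure_overhead(10, 16))  # 1.6 -> 60% overhead
```

In other words, an erasure-coded grid can tolerate multiple slice losses while consuming roughly half the raw capacity of triple replication, which is where the efficiency gain over stock HDFS comes from.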
Resource availability in system design swings like a pendulum; what used to be a bottleneck can one day be in abundance. Leveraging the compute power that's now available in scale-out storage systems promises to bring some very interesting capabilities to this market. For VARs this can be a pretty specialized area, but one that can get them in the door.
Follow me on Twitter: EricSSwiss