This blog entry is Part 4 of a series on the top 10% of products and technologies reviewed by Storage Switzerland in 2011. See Part 3 here.
Unstructured (big) data management
Digital Reef’s File Governance Platform is a content indexing and unstructured-data management software solution that creates a grid of Linux servers running on commodity hardware. This high-performance runtime platform can scale processing power and storage capacity to index billions of files and petabytes of data, making it an excellent solution for legal discovery, data migration and data mining.
In the legal discovery market, corporations often face hundreds of terabytes or more of electronically stored information (ESI) that must be scanned and analyzed. At that scale, traditional e-discovery products, which typically crawl data stores for hours or days, can be a nonstarter.
Digital Reef can scan an enterprise’s entire data infrastructure, building a virtual warehouse that can be used to increase productivity or mined to provide additional business value. This technology can also migrate and deduplicate large data volumes in support of storage modernization projects and, in the process, provide visibility into unstructured data for compliance and policy enforcement.
Digital Reef has conducted several performance benchmarks on its software, in which its scalable grid architecture produced full-file content indexing performance of 17.3 TB per day and file metadata indexing of 273 TB per day. In addition to a traditional software implementation, Digital Reef also has a cloud-based, hosted e-discovery processing service, which conducts ingestion, early case analysis, review and production for customers at the rate of 10 TB per day.
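To put those throughput figures in perspective, here is a rough back-of-the-envelope sketch in Python. The three rates are the ones quoted above; the 500 TB corpus size and the function name are purely illustrative assumptions, and the calculation assumes the quoted rates are sustained for the whole job.

```python
# Rates quoted in the benchmarks above, in TB per day.
CONTENT_INDEX_TB_PER_DAY = 17.3    # full-file content indexing
METADATA_INDEX_TB_PER_DAY = 273.0  # file metadata indexing
HOSTED_SERVICE_TB_PER_DAY = 10.0   # hosted e-discovery processing service


def days_to_process(data_tb: float, rate_tb_per_day: float) -> float:
    """Return the days needed to process data_tb at a sustained daily rate."""
    if rate_tb_per_day <= 0:
        raise ValueError("rate_tb_per_day must be positive")
    return data_tb / rate_tb_per_day


if __name__ == "__main__":
    corpus_tb = 500.0  # hypothetical corpus size, not a figure from the benchmarks
    rates = [
        ("Full-file content indexing", CONTENT_INDEX_TB_PER_DAY),
        ("File metadata indexing", METADATA_INDEX_TB_PER_DAY),
        ("Hosted e-discovery service", HOSTED_SERVICE_TB_PER_DAY),
    ]
    for label, rate in rates:
        print(f"{label}: {days_to_process(corpus_tb, rate):.1f} days")
```

Real-world jobs will vary with file types, grid size and storage performance, so treat the result as an order-of-magnitude estimate only.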
WAN optimization between data centers
WAN optimization use cases typically involve bandwidth and data reduction between a data center and its users or remote offices. A different workload, driven by replication for DR and backup, large virtualization environments and the cloud, is now moving data between data centers, and that traffic is causing problems for existing WAN optimization solutions.
Infineta’s Data Mobility Switch (DMS) is designed to handle this entirely different kind of traffic that’s flowing between data centers. This new network environment, which the company is calling the “hyper-scale WAN,” sees machine-to-machine traffic with a high-capacity/low-latency profile and a relatively small number of high-capacity connections that could potentially scale to gigabits per second and beyond.
The DMS is a purpose-built device that can maintain 1 Gbps rates per connection and can fully saturate a 10 Gbps WAN segment with as few as 10 connections. The DMS can do this because its architecture is fully distributed, processing packets for each data stream independently in an FPGA instead of in software running on a server CPU. This processing includes network deduplication, carried out in a hardware subsystem that Infineta calls the Velocity Dedupe Engine (VDE). The VDE processes each data packet in dedicated hardware rather than through higher-level instructions on a shared CPU, which enables the DMS to achieve data deduplication rates of up to 90% while controlling latency through the system.
By comparison, to accommodate the data volumes that typically occur in these hyper-scale WANs, existing solutions must split an aggregated, high-speed data connection into multiple slower connections and feed each through a separate WAN optimization controller. Using the DMS, which is up to 10 times faster than existing solutions, results in fewer devices to buy, much less complexity to manage and a lower total cost.
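The connection and deduplication figures quoted above lend themselves to a quick sanity check. The Python sketch below is only arithmetic on those published numbers; the function names are hypothetical, the 100 TB replication job is an illustrative assumption, and it assumes the "up to 90%" deduplication rate means 90% of the bytes never cross the WAN.

```python
import math


def connections_to_saturate(link_gbps: float, per_connection_gbps: float) -> int:
    """Return the minimum number of connections needed to fill a link."""
    if link_gbps <= 0 or per_connection_gbps <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(link_gbps / per_connection_gbps)


def wire_volume_tb(logical_tb: float, reduction_ratio: float) -> float:
    """Return the data actually sent after removing reduction_ratio of it.

    Assumes reduction_ratio is the fraction of bytes eliminated (0.9 = 90%).
    """
    if not 0.0 <= reduction_ratio < 1.0:
        raise ValueError("reduction_ratio must be in [0, 1)")
    return logical_tb * (1.0 - reduction_ratio)


if __name__ == "__main__":
    # 10 Gbps WAN segment at 1 Gbps per connection, as quoted for the DMS.
    print("Connections needed:", connections_to_saturate(10.0, 1.0))

    # Hypothetical 100 TB replication job at the best-case 90% reduction.
    print(f"Data on the wire: {wire_volume_tb(100.0, 0.9):.1f} TB")
```

Because 90% is a best-case figure, actual savings depend on how repetitive the traffic is.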
Follow me on Twitter: EricSSwiss