THE NEXT WAVE OF HYPER-CONVERGENCE TAKES ON SECONDARY STORAGE
Date: Tuesday, May 03, 2016
Hyper-converged platforms using standardized appliances have completely changed architectural philosophies for enterprise data centers. These appliances are usually turnkey, in contrast to the piecemeal server, networking and enterprise storage products that needed to be orchestrated together in decades past. The result has been dramatic benefits for IT, including not just cost reduction but also a strategic shift in focus toward running business applications, since less time is spent managing the individual server and storage infrastructure pieces.
The company I co-founded, Nutanix, led the way in this hyper-convergence movement for primary storage used for virtualization, but I believe there is an even bigger opportunity to impact the world of secondary storage. For many decades, secondary storage consumption focused on data protection and DR, with tape as the secondary media. When data deduplication and SATA HDDs came on the scene, tape was largely swapped out, though it never went away completely. Customers benefited from better backup experiences in terms of speed of recovery, but secondary storage remained a separate silo in the data center that served as an expensive insurance policy.
The Bigger Challenges in Secondary Storage
Meanwhile, several other use cases in secondary storage are creating silos of their own, including test and development, which relies on additional copies of production data. Storage is often managed separately for this use case, rapidly creating data sprawl. File services are another secondary storage use case: they may not require high performance, but end users depend on their high resiliency for productivity. This broader issue of copy data proliferation has only gotten worse as organizations take on big data, creating yet another silo of infrastructure and copies of data for use with Hadoop and its own use of secondary data. All of this makes it increasingly difficult to make use of public cloud services because there isn't a consolidated view of all the on-premises data and of which workloads make sense for migration.
To better understand the depth of these challenges, our company conducted a survey of 74 IT professionals in late 2015 to help identify the top pain points of data storage. Our survey revealed the growing complexity of enterprise data storage, along with a widespread lack of visibility into stored data. The survey highlighted that most companies rely on three or more storage solutions to manage company data, and in turn respondents rated "managing complexity of different products" as a top concern, second only to growing storage costs.
Enterprises need to start thinking about a strategy for unifying how they manage their data to provide a clear view of data type, access, usage, and other important metrics, since not all data is created equal. Only 28 percent of respondents stated they have a good understanding of which users are accessing company data, while just 40 percent claimed they have good information about how many copies of data exist. Asked about the greatest challenges they face with data storage, 51 percent of respondents cited costs associated with high data growth and 45 percent cited the complexity of managing different storage products.
The inefficiency, redundancy, and chaos in the secondary storage space are exactly why I believe my company, Cohesity, can bring significant value to IT by leading the next wave of hyper-convergence to deliver a simpler, consolidated approach to all of secondary storage.
Architecting a Hyper-converged Secondary Storage Platform
We define Secondary Storage Hyper-convergence as a new category of storage that is based on the same technological foundation as hyper-converged primary storage, but which also tightly integrates secondary storage use cases that may include some or all of the following: Data Protection, Test/Dev, File Services and Analytics. Hyper-convergence must not only deliver simplified deployment and scaling of the underlying physical storage, but also transparently eliminate the redundant data stored today across the disparate silos of backup, test/dev, file shares, object storage, and analytics, all while seamlessly integrating with the public cloud as another repository for archive and active data tiering.
A hyper-converged secondary storage platform will require architectural elements similar to those required on the primary side, including a web-scale distributed file system and a scale-out architecture. Together, these elements enable a simplified model for creating and growing a private cloud of virtualized applications. However, for secondary storage, there is an additional architectural challenge because the workloads will be a heterogeneous mix of virtualized and non-virtualized applications that span a range of performance and resiliency requirements for backup, DR, test/dev, file shares and analytics. Being able to balance these disparate workloads and still maintain consistency and performance using commodity servers takes hyper-convergence to another level.
The Impact of Hyper-converged Secondary Storage
Cohesity offers a scale-out platform targeted specifically at secondary storage that deduplicates, compresses and indexes all data upon ingestion. This enables efficient storage of backups as well as a Google-like global search capability. Cohesity also has integrated Data Protection, File Services, Test/Dev and Analytics support along with tight public cloud integration for archive and tiering. There are also built-in applications for monitoring storage utilization trends and reporting on user, VM, and file data.
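To make the idea of deduplicating, compressing and indexing data at ingestion more concrete, here is a minimal Python sketch. It is purely illustrative and is not Cohesity's implementation: the IngestStore class, its method names, the fixed-size chunking, and the simple word index are all assumptions chosen for brevity. Production systems typically use more sophisticated chunking, distributed metadata, and far richer indexing.

```python
import hashlib
import re
import zlib
from collections import defaultdict


class IngestStore:
    """Hypothetical toy store: dedupe, compress, and index data on ingest."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}                # SHA-256 digest -> compressed chunk
        self.objects = {}               # object name -> ordered chunk digests
        self.index = defaultdict(set)   # lowercase word -> object names

    def ingest(self, name, data):
        """Split data into fixed-size chunks and store each unique chunk once."""
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only previously unseen content is compressed and stored.
                self.chunks[digest] = zlib.compress(chunk)
            digests.append(digest)
        self.objects[name] = digests

        # Build a simple inverted index over the words in the data.
        text = data.decode("utf-8", errors="ignore").lower()
        for word in re.findall(r"[a-z0-9]+", text):
            self.index[word].add(name)

    def read(self, name):
        """Reassemble an object from its (possibly shared) chunks."""
        return b"".join(
            zlib.decompress(self.chunks[d]) for d in self.objects[name]
        )

    def search(self, word):
        """Return the names of all objects containing the given word."""
        return sorted(self.index.get(word.lower(), set()))
```

In this sketch, a backup and a test/dev copy of the same data share their stored chunks, so the second copy consumes no additional chunk storage, and a single search call finds matching content across both. That is the consolidation argument in miniature, under the simplifying assumptions stated above.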
We built our Analytics capabilities as an integral element of the Cohesity platform, providing an open application integration framework along with an SDK. We took this approach because we want to allow third parties and end users to develop their own analytics applications to be deployed alongside the native Cohesity applications that are included with the platform. Combined, these applications will be able to shine a light on the ever-increasing 'dark data' that organizations are plagued with, to finally deliver key insights and business value while managing security and compliance.
For most organizations in the current financial climate, that strategic insight still requires a business justification. The use of hyper-convergence for secondary storage delivers massive cost savings to satisfy that requirement, including a single platform to manage, no separate silos of data or separate backup software vendors to deal with, and control over data sprawl and redundancy. Altogether, these benefits have a real and immediate impact on the bottom line, making it an easy choice to begin riding this next wave in hyper-convergence.