By Louis Imershein, VP Product
Wayne Salpietro, Director of Marketing
Permabit Technology Corp.
Cambridge, MA
www.permabit.com
The cloud continues to dominate IT as businesses make their infrastructure decisions based on cost and agility. Public cloud, where shared infrastructure is paid for and utilized only when needed, is the most popular model today. However, more and more organizations are addressing security concerns by creating their own private clouds. As businesses deploy private cloud infrastructure, they are adopting techniques used in the public cloud to control costs. Gone are the traditional arrays and network switches of the past, replaced with software-defined data centers running on industry-standard servers.
Efficiency features make the cloud model more effective by reducing costs and increasing data transfer speeds. One such feature, which is particularly effective in cloud environments, is inline data reduction: a technology that lowers the cost of data both in flight and at rest. In fact, data reduction delivers distinct benefits to each of the cloud deployment models.
Public Clouds
The public cloud's raison d'être is its ability to deliver IT business agility, deployment flexibility and elasticity. As a result, new workloads are increasingly deployed in public clouds. Worldwide public IT cloud service revenue in 2018 is predicted to be $127B.
Data reduction technology minimizes public cloud costs. For example, deduplication and compression can cut the capacity requirements of block storage in enterprise public cloud deployments by up to 6:1. These savings show up as reduced storage consumption and lower operating costs.
Consider AWS costs employing data reduction
If you provision 300 TB of EBS General Purpose SSD (gp2) storage for 12 hours per day over a 30-day month in a region that charges $0.10 per GB-month, you would be charged $15,000 for the storage.
With 6:1 data reduction, that monthly cost of $15,000 would be reduced to $2,500, a saving of $12,500 per month. Over a 12-month period you would save $150,000. Capacity planning is also a simpler problem when it is one-sixth its former size. Bottom line: data reduction increases the agility and reduces the cost of public clouds.
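The arithmetic behind these figures can be reproduced with a short sketch. It assumes decimal units (1 TB = 1,000 GB), billing prorated by the fraction of each day the storage is provisioned, and a flat 6:1 reduction ratio; the function names are illustrative and are not part of any AWS or VDO tooling.

```python
# Minimal sketch of the EBS cost example above.
# Assumptions: 1 TB = 1,000 GB, cost prorated by hours provisioned per day,
# and a flat 6:1 data reduction ratio. Function names are hypothetical.
GB_PER_TB = 1_000


def monthly_ebs_cost(capacity_tb, price_per_gb_month, hours_per_day=24.0):
    """Monthly cost of provisioned capacity, prorated by daily hours in use."""
    fraction_of_month = hours_per_day / 24.0
    return capacity_tb * GB_PER_TB * price_per_gb_month * fraction_of_month


def cost_after_reduction(cost, reduction_ratio):
    """Cost once the same data occupies 1/reduction_ratio of the capacity."""
    return cost / reduction_ratio


baseline = monthly_ebs_cost(300, 0.10, hours_per_day=12)
with_reduction = cost_after_reduction(baseline, 6)
annual_savings = (baseline - with_reduction) * 12

print(f"baseline monthly cost: ${baseline:,.0f}")
print(f"with 6:1 reduction:    ${with_reduction:,.0f}")
print(f"savings over 12 months: ${annual_savings:,.0f}")
```

Changing the reduction ratio or the hours per day shows how sensitive the savings are to the workload, since actual reduction rates depend on the data being stored.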
One data reduction application that can readily be applied in public cloud is Permabit's Virtual Data Optimizer (VDO), a pre-packaged software solution that installs and deploys in minutes on Red Hat Enterprise Linux and Ubuntu LTS Linux distributions. To deploy VDO in Amazon AWS, the administrator provisions Elastic Block Store (EBS) volumes, installs the VDO package into their VMs and applies VDO to the block devices representing their EBS volumes. Since VDO is implemented in the Linux device mapper, it is transparent to the applications installed above it.
As data is written out to block storage volumes, VDO applies three reduction techniques:
1. Zero-block elimination uses pattern matching techniques to eliminate 4 KB zero blocks
2. Inline Deduplication eliminates 4 KB duplicate blocks
3. HIOPS Compression compresses remaining blocks
This approach results in data reduction rates of up to 6:1 across a wide range of data sets.
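The three passes listed above can be illustrated with a toy sketch. This is not VDO's implementation: it assumes SHA-256 fingerprints held in an in-memory set for deduplication and uses zlib as a stand-in for HIOPS Compression, and it only counts how many bytes would be stored.

```python
# Toy illustration of zero-block elimination, deduplication and compression
# on 4 KB blocks. NOT VDO's implementation: the SHA-256 set and zlib are
# stand-ins chosen for this sketch.
import hashlib
import os
import zlib

BLOCK_SIZE = 4096  # 4 KB blocks, as in the list above
ZERO_BLOCK = bytes(BLOCK_SIZE)


def reduce_blocks(data):
    """Return (logical_bytes, stored_bytes) after the three reduction passes."""
    seen = set()  # fingerprints of blocks already stored
    stored = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        # Pad a short final block with zeros so every block is 4 KB.
        block = data[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
        # 1. Zero-block elimination: all-zero blocks consume no space.
        if block == ZERO_BLOCK:
            continue
        # 2. Deduplication: a block seen before is stored only once.
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            continue
        seen.add(digest)
        # 3. Compression: keep the compressed form only if it is smaller.
        stored += min(len(zlib.compress(block)), BLOCK_SIZE)
    return len(data), stored


# Sample data: four zero blocks, three copies of a log-like block,
# and one block of random (incompressible) bytes.
text_block = (b"2017-02-28 12:00:00 INFO request served\n" * 200)[:BLOCK_SIZE]
sample = ZERO_BLOCK * 4 + text_block * 3 + os.urandom(BLOCK_SIZE)

logical, stored = reduce_blocks(sample)
print(f"logical: {logical} bytes, stored: {stored} bytes")
```

The sample shows why results vary by data set: zero and duplicate blocks disappear entirely, repetitive text compresses well, and random data is barely reduced at all.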
Private Cloud
Organizations see similar benefits when they deploy data reduction in their private cloud environments. Organizations choose private cloud over public because it offers the flexibility of the public cloud model while keeping privacy and security under their own control. IDC predicts $17.2B in infrastructure spending for private cloud in 2017, including on-premises and hosted private clouds.
One problem that data reduction addresses is that organizations implementing private cloud can get hit with the double whammy of hardware infrastructure costs plus annual software licensing costs. For example, Software Defined Storage (SDS) solutions are typically licensed by capacity, so their costs are directly proportional to hardware infrastructure storage expenses. Data reduction decreases both costs because it reduces storage capacity consumption. For example, deduplication and compression can cut the capacity requirements of block storage in enterprise deployments by up to 6:1, or approximately 83%.
Consider a private cloud configuration with a 1 PB deployment of storage infrastructure and SDS. Assuming a current hardware cost of $500 per TB for commodity server-based storage infrastructure with datacenter-class SSDs and a cost of $56,000 per year per 512 TB for the SDS component, users would pay $612,000 in the first year ($500,000 for hardware plus $112,000 for software). Because software subscriptions are annual, over three years you would spend $836,000 for 1 PB of storage, and over five years, $1,060,000.
The same configuration with 6:1 data reduction would cost $176,667 for hardware and software over five years, a saving of $883,333. And that does not include the substantial additional savings in power, cooling and space. As businesses develop private cloud deployments, they should make sure those deployments include data reduction capabilities, because the cost savings are compelling.
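A short sketch makes the cost model explicit. To reproduce the figures above it assumes that 1 PB is counted as 1,000 TB for hardware and as two 512 TB units for the SDS subscription, that the subscription is paid every year, and that 6:1 reduction divides both hardware and software costs by six; the names are illustrative only.

```python
# Sketch of the private cloud cost example above.
# Assumptions: hardware sized at 1,000 TB, SDS licensed as two 512 TB units
# per year, and 6:1 reduction applied uniformly to hardware and software.
HARDWARE_COST_PER_TB = 500
SDS_COST_PER_UNIT_YEAR = 56_000  # one unit covers 512 TB


def total_cost(years, hardware_tb=1_000, sds_units=2):
    """One-time hardware cost plus the annual SDS subscription over `years`."""
    hardware = hardware_tb * HARDWARE_COST_PER_TB
    software = sds_units * SDS_COST_PER_UNIT_YEAR * years
    return hardware + software


for years in (1, 3, 5):
    print(f"{years} year(s): ${total_cost(years):,}")

five_year = total_cost(5)
reduced_five_year = five_year / 6  # 6:1 data reduction
savings = five_year - reduced_five_year
print(f"five years with 6:1 reduction: ${reduced_five_year:,.0f}")
print(f"five-year savings: ${savings:,.0f}")
```

In practice licenses and hardware are bought in whole units, so real savings would be rounded to the nearest purchasable increment rather than divided exactly by six.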
When implementing private cloud on Linux, the easiest way to include data reduction is with Permabit Virtual Data Optimizer (VDO). VDO operates in the Linux kernel as one of many core data management services. Because it is a device mapper target driver, it is transparent to persistent and ephemeral storage services, whether the storage layers above provide object, block, compute, or file-based access.
VDO - Seamless and Transparent Data Reduction
The same transparency applies to the applications running above the storage service level. Customers using VDO today realize savings up to 6:1 across a wide range of use cases.
Some workflows that benefit heavily from data reduction are:
- Logging: messaging, events, system and application logs
- Monitoring: alerting and tracing systems
- Database: databases with textual content, NoSQL approaches such as MongoDB and Hadoop
- User Data: home directories, development build environments
- Virtualization and containers: virtual server, VDI, and container system image storage
- Live system backups: used for rapid disaster recovery
With data reduction, cumulative cost savings can be achieved across a wide range of use cases, which is what makes it so attractive for private cloud deployments.
Reducing Hybrid Cloud's Highly Redundant Data
Storage is at the foundation of cloud services and, almost universally, data in the cloud must be replicated for data safety. Hybrid cloud architectures that combine on-premises resources (private cloud) with colocation, private and multiple public clouds result in highly redundant data environments. IDC’s FutureScape report finds “Over 80% of enterprise IT organizations will commit to hybrid cloud architectures, encompassing multiple public cloud services, as well as private clouds by the end of 2017.” (IDC 259840)
Depending on a single cloud provider for storage services can put SLA targets at risk. Consider the widespread AWS S3 storage errors that occurred on February 28, 2017, when data was unavailable to clients for several hours. Because of that loss of data access, businesses may have lost millions of dollars of revenue. As a result, more enterprises today are pursuing a “Cloud of Clouds” approach in which data is redundantly distributed across multiple clouds for data safety and accessibility. Unfortunately, because of the data redundancy, this approach increases storage capacity consumption and cost.
That’s where data reduction comes in. In hybrid cloud deployments where data is replicated to the participating clouds, data reduction multiplies capacity and cost savings. If 3 copies of the data are kept in 3 different clouds, 3 times as much is saved. Take the private cloud example above, where data reduction drove down the five-year cost of a 1 PB deployment from $1,060,000 to $176,667, a saving of $883,333. If that PB is replicated in 3 different clouds, and each copy costs the same to store, the savings would be multiplied by 3 for a total of $2,649,999.
Permabit’s Virtual Data Optimizer (VDO) addresses the multi-site storage capacity and bandwidth challenges faced in hybrid cloud environments. Its data reduction capabilities have the same impact on bandwidth consumption as they do on storage, which translates to a reduction of up to 6X in network bandwidth consumption and associated cost. Because VDO operates at the device level, it can sit above block-level replication products and optimize data before it is written out and replicated.
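The multiplication can be sketched as follows. It assumes every cloud holds a full copy of the same 1 PB, that each copy costs the same as the private cloud example, and that the same 6:1 ratio applies everywhere; the function and constant names are illustrative.

```python
# Sketch of how replication multiplies data reduction savings.
# Assumptions: each cloud stores a full copy, each copy costs the same as the
# private cloud example, and 6:1 reduction applies to every copy.
SAVINGS_PER_COPY = 883_333  # five-year savings for one 1 PB copy, from above


def hybrid_footprint(logical_pb, copies, reduction_ratio):
    """Return (raw_pb, reduced_pb) stored across all participating clouds."""
    raw_pb = logical_pb * copies
    return raw_pb, raw_pb / reduction_ratio


copies = 3
raw_pb, reduced_pb = hybrid_footprint(1, copies, 6)
total_savings = SAVINGS_PER_COPY * copies

print(f"capacity across {copies} clouds: {raw_pb} PB raw, {reduced_pb} PB reduced")
print(f"total five-year savings: ${total_savings:,}")
```

The total comes out one dollar below $2,650,000 only because the per-copy saving is rounded to whole dollars before it is multiplied.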
Summary
IT professionals are finding that the future of IT infrastructure lies in the cloud. Data reduction technologies enable public, private and hybrid clouds to deliver on their promise of safety, agility and elasticity at the lowest possible cost, making cloud the deployment model of choice for IT infrastructure going forward.