New Horizons in Remote Data Continuity
Rajeev Atluri
Monday, November 17, 2008
Remote Data Continuity (RDC) covers both Disaster Recovery (DR) measures that protect a business at the site level and the protection of data spread across an enterprise’s branches.

Until very recently, implementing DR or backup consolidation plans was considered affordable only for Fortune 500 class enterprises.

Driven by compliance requirements and the growing importance of business data in a globally connected economy, the management and operational staff of small and medium enterprises are demanding and deploying solutions that reduce business risk while balancing the cost of ownership. Even large enterprises are seeking alternatives to expensive solutions that force vendor lock-in and protect only a subset of the enterprise data. Businesses are also analyzing their distributed environments and looking for solutions that can consolidate DC infrastructure and management into a single data center or a reduced number of data centers.

The traditional business drivers for RDC include protection against
• Site level Software/Hardware Failures
• Extended Power Loss/Disruption
• Viruses, Hackers
• Loss of Key Personnel
• Natural and Man-made disasters (floods, earthquakes, civil strife, war etc.)

These take on even more importance due to the newer business drivers that we will examine below. The new drivers can be broadly classified under Business Risk and Cost of protecting against the business risk.

Business Risk
A stricter worldwide corporate climate, brought about by mandatory compliance with governmental regulations and standards, has made RDC an imperative at the topmost executive level. Sarbanes-Oxley (SOX), GLBA, HIPAA, the Patriot Act, OFAC, FCRA, NCLB, the Data Protection Act and ITAR are some of the regulations that now play a major role in determining enterprise data protection measures.

Globalization of the economy
Customers and partners now make contracts contingent on a Business Continuity Plan (BCP) that effectively addresses local risks. Typical examples are the software and call center outsourcing businesses. Another significant effect of globalization is that regulations now have an impact around the globe. For example, a regulation introduced in the U.S. or Europe has an immediate impact on Indian companies, since data sharing is essential for collaboration.

Cost of reducing the Business Risk
In most businesses RDC is treated as a cost center, not a revenue generator. This has been a significant impediment to securing adequate funding for implementing suitable solutions. Several changes in recent years have reduced the cost of effective solutions. Two of the most significant factors are cheaper bandwidth and lower-cost disk becoming viable as secondary storage.

Current Approaches to RDC
Recovery Point Objective (RPO) and Recovery Time Objective (RTO) are two useful metrics for defining and classifying RDC/DR solutions. RPO can be defined as the maximum data loss tolerated in the case of a disaster. RTO can be defined as the maximum downtime tolerated in the case of a disaster.
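
The two metrics can be made concrete with a little date arithmetic. The following is a minimal sketch, assuming RPO is expressed as a span of time (the changes made since the last usable recovery point) and that the timestamps are known after the fact. The names `Objectives`, `data_loss_window`, `downtime` and `meets_objectives` are illustrative and do not come from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Objectives:
    rpo: timedelta  # maximum tolerated data loss, expressed as a span of time
    rto: timedelta  # maximum tolerated downtime


def data_loss_window(last_recovery_point: datetime, disaster: datetime) -> timedelta:
    """Changes written after the last usable recovery point are lost."""
    return disaster - last_recovery_point


def downtime(disaster: datetime, service_restored: datetime) -> timedelta:
    """Time during which the business could not use the application."""
    return service_restored - disaster


def meets_objectives(obj: Objectives,
                     last_recovery_point: datetime,
                     disaster: datetime,
                     service_restored: datetime) -> bool:
    """True only if both the RPO and the RTO were respected."""
    return (data_loss_window(last_recovery_point, disaster) <= obj.rpo
            and downtime(disaster, service_restored) <= obj.rto)


if __name__ == "__main__":
    # Hypothetical scenario: nightly backup at 01:00, disaster at 15:00,
    # service restored at 09:00 the next morning.
    target = Objectives(rpo=timedelta(hours=4), rto=timedelta(hours=8))
    ok = meets_objectives(
        target,
        last_recovery_point=datetime(2008, 11, 17, 1, 0),
        disaster=datetime(2008, 11, 17, 15, 0),
        service_restored=datetime(2008, 11, 18, 9, 0),
    )
    print("objectives met:", ok)
```

In this hypothetical scenario the data loss window is 14 hours and the downtime is 18 hours, so a 4-hour RPO and an 8-hour RTO are both missed.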

The commonly used RDC solutions are:
• Tape Vaulting involves using backup software to take daily or weekly backups and shipping the tapes or their copies to a remote site. RPOs of less than a day are not achievable. Any failure in the backup chain directly affects the RPO. The failure rates for backup to tape implementations are high (some estimates reach 40-70% at restore time). The lack of automation and the impact on production (for full backups) usually make this form of RDC/DR less than ideal.
• Array Based Synchronous/Asynchronous Replication (ABR) is used by enterprises with high-end SAN storage subsystems. Storage subsystem-based software is used to replicate data changes in either a synchronous (RPO=0) or asynchronous (RPO>0) fashion to a remote storage subsystem. ABR is typically possible only between storage subsystems of the same kind. Given that the secondary data center is typically used only for DR purposes, this imposes a large expense for maintaining the DR site infrastructure. These solutions usually require very expensive equipment to translate the Fibre Channel (FC) protocol to IP and back. They also typically do not demonstrate acceptable failure tolerance in the face of the WAN link outages and bandwidth limitations that are commonplace.
• Host Based Replication (HBR) solutions are used to replicate data changes asynchronously to a second host in a remote data center over TCP/IP. These solutions have a significant performance impact on the production hosts, as they do all of the replication processing (such as compression and encryption) using the host CPU cycles. They also have limitations that require other software, such as volume management, to be in place before they can be used for replication.

Next Generation Solutions and their benefits
Newer solutions that take advantage of these technology trends are now becoming available. They address the limitations of the current approaches while being more cost effective. The following are a few guidelines that can help in choosing an effective RDC/DR solution.

Ease of Deployment
• Data migration to new volume manager/file-system formats or storage subsystems should not be necessary.
• Heterogeneous DAS, SAN and NAS storage, OS and server platforms should be supported.
• Accurate provisioning of the necessary bandwidth (with and without compression) should be provided.
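
As an illustration of the bandwidth provisioning point, the following is a rough back-of-the-envelope sketch, not a vendor sizing tool. It assumes that the daily volume of changed data is known, that changes may be shipped evenly over a replication window, and that only a fraction of the nominal link speed is usable. The function name, the parameters and the 0.8 link efficiency default are all assumptions made for the example.

```python
def required_bandwidth_mbps(changed_gb_per_day: float,
                            compression_ratio: float = 1.0,
                            replication_window_hours: float = 24.0,
                            link_efficiency: float = 0.8) -> float:
    """Estimate the WAN bandwidth (megabits per second) needed for replication.

    compression_ratio: 1.0 means no compression, 3.0 means 3:1.
    link_efficiency: usable fraction of the nominal link speed (assumed value).
    """
    if changed_gb_per_day < 0:
        raise ValueError("changed_gb_per_day must not be negative")
    if compression_ratio <= 0 or replication_window_hours <= 0:
        raise ValueError("compression_ratio and window must be positive")
    if not 0 < link_efficiency <= 1:
        raise ValueError("link_efficiency must be in (0, 1]")

    megabits = changed_gb_per_day * 8 * 1000          # GB -> megabits (decimal units)
    seconds = replication_window_hours * 3600
    return megabits / compression_ratio / seconds / link_efficiency


if __name__ == "__main__":
    # Hypothetical site with 100 GB of changed data per day.
    raw = required_bandwidth_mbps(100)
    compressed = required_bandwidth_mbps(100, compression_ratio=3.0)
    print(f"without compression: {raw:.2f} Mbps")
    print(f"with 3:1 compression: {compressed:.2f} Mbps")
```

An average figure like this understates the need when changes arrive in bursts, so a real sizing exercise would also look at peak change rates.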

Cost Effectiveness
• Should be possible to use different storage and server hardware at the remote data center.
• Should be highly WAN efficient and send only highly compressed data changes.
• Should be able to support bandwidth policies to allow for sharing of WAN lines for production and RDC/DR traffic without introducing the need for expensive networking equipment.

RPO enforcement
• Should support volume and file replication.
• Should offload compute-intensive tasks such as compression, encryption and data transmission from production hosts and storage subsystems.
• Should tolerate failures such as LAN, WAN and server outages and bottlenecks without resorting to data resynchronization.

RTO/Recoverability enforcement
• Should proactively enforce application consistency measures.
• Should allow for multiple recovery points.
• Should allow search-based, fine-grained recovery.
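
To show why multiple recovery points and application consistency matter together, here is a minimal sketch of choosing a point to restore from. It assumes each recovery point carries a timestamp and a flag saying whether it was taken in an application-consistent state. The `RecoveryPoint` class and the `pick_recovery_point` function are hypothetical and are not taken from any specific product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class RecoveryPoint:
    taken_at: datetime
    app_consistent: bool  # True if applications were quiesced when it was taken


def pick_recovery_point(points: Iterable[RecoveryPoint],
                        not_after: datetime,
                        require_consistent: bool = True) -> Optional[RecoveryPoint]:
    """Return the most recent point at or before `not_after`.

    When `require_consistent` is True, points that are not
    application-consistent are skipped. Returns None if nothing qualifies.
    """
    candidates = [
        p for p in points
        if p.taken_at <= not_after and (p.app_consistent or not require_consistent)
    ]
    return max(candidates, key=lambda p: p.taken_at, default=None)


if __name__ == "__main__":
    # Hypothetical history: a corruption is noticed to have started at 12:30.
    history = [
        RecoveryPoint(datetime(2008, 11, 17, 9, 0), app_consistent=True),
        RecoveryPoint(datetime(2008, 11, 17, 11, 0), app_consistent=True),
        RecoveryPoint(datetime(2008, 11, 17, 12, 0), app_consistent=False),
        RecoveryPoint(datetime(2008, 11, 17, 13, 0), app_consistent=True),
    ]
    chosen = pick_recovery_point(history, not_after=datetime(2008, 11, 17, 12, 30))
    print("restore from:", chosen)
```

With only a single nightly copy, the choice above would not exist: the more recovery points are retained, the closer the restore can get to the moment just before the problem occurred.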

Manageability
• Should be centralized and remotely accessible (for example, through a VPN).
• Should consolidate data dispersed among multiple satellite offices at a single site, so that it can be protected using a centralized tape infrastructure.
• Should integrate near-line (within the datacenter) protection of data to provide for fast Operational Recovery (OR).

Conclusions
While the bar for an ideal RDC/DR solution remains quite high, there are solutions already in the marketplace that surpass it at affordable prices. Most mid-tier enterprises should now be able to implement automated DR plans. Dedicated secondary storage service providers are also making it possible for enterprises that cannot afford their own secondary data centers or dedicated hosting sites to use their services and meet their RDC/DR needs.

Rajeev Atluri is the co-founder, VP of Engineering and CTO of InMage, which provides data continuity solutions.