Most discussions of moving to the cloud center on migrating on-premises applications to the compute cloud (e.g. Amazon EC2 or Google App Engine). In this article we focus on moving on-premises data to the storage cloud, specifically public storage clouds offered at the infrastructure layer (e.g. Amazon S3), and on the factors one should consider before making the switch.
Primacy of Secondary and Tertiary Use
Latency issues will keep the cloud from becoming the primary storage for most on-premises applications. On the other hand, the cloud is increasingly becoming a destination for secondary and tertiary storage (Fig 1). Instead of writing to tape and shipping the tape to a secure off-site location, many businesses now send their backup data and archives to a remote storage cloud. In the process, they acquire the risk profile of a much larger organization, such as Amazon, which spends millions of dollars to ensure the robustness and accessibility of its data centers.
Recent cloud outages have grabbed the headlines, but fortunately, most secondary and tertiary uses of storage don’t require always-on service level agreements. In other words, if the storage cloud is not accessible right now, backup software should be intelligent enough to catch up in the next backup run. Of course, you may run into Murphy’s Law if you happen to be recovering a file at the very moment your storage cloud vendor is down. Fortunately, sending data to multiple clouds is far easier than sending physical media to two different locations.
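The catch-up behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular backup product works: `run_backup`, the `Uploader` type, and the `uploaded` bookkeeping are hypothetical names, and each cloud is modeled as a plain function that raises `ConnectionError` when the vendor is unreachable.

```python
from typing import Callable, Dict, Set

# Hypothetical: an uploader takes (file name, contents) and raises
# ConnectionError when its storage cloud cannot be reached.
Uploader = Callable[[str, bytes], None]


def run_backup(
    files: Dict[str, bytes],
    targets: Dict[str, Uploader],
    uploaded: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Send every file that a target does not yet hold to that target.

    `uploaded` records, per target, the file names already stored there.
    Anything that fails in this run is simply absent from the record,
    so the next run picks it up again.
    """
    for target_name, upload in targets.items():
        done = uploaded.setdefault(target_name, set())
        for name in sorted(set(files) - done):
            try:
                upload(name, files[name])
            except ConnectionError:
                # This cloud is down: skip it for now and move on to
                # the next target; the remaining files wait for the next run.
                break
            done.add(name)
    return uploaded
```

Because each target keeps its own record, an outage at one vendor does not hold up the copy sent to the other, and a later run fills the gap without any manual intervention.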
Just as answering machines gave way to voicemail, tape libraries will eventually give way to cloud-based storage.
Over-provisioning: Expensive; Under-provisioning: Hara-kiri
A key reason to move to a storage cloud is provisioning, one of the hairiest problems for storage administrators. Storage capacity needs, especially for backup applications, can fluctuate widely over time, making it nearly impossible to provision physical storage accurately for a particular application. Since under-provisioning tends to have dire consequences, most storage subsystems are over-provisioned.
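To make the trade-off concrete, here is a small Python sketch that measures how much capacity sits idle, and how large the worst shortfall is, when a fixed amount of storage is provisioned against fluctuating demand. The function name `provisioning_gap` and the monthly demand figures are made up for illustration; they are not measurements from any real system.

```python
from typing import List, Tuple


def provisioning_gap(
    demand_tb: List[float], provisioned_tb: float
) -> Tuple[float, float]:
    """Return (average idle capacity, worst shortfall), both in TB."""
    idle = [max(provisioned_tb - d, 0.0) for d in demand_tb]
    shortfall = [max(d - provisioned_tb, 0.0) for d in demand_tb]
    return sum(idle) / len(idle), max(shortfall)


# Hypothetical monthly backup demand in TB, with one spike.
demand = [4.0, 6.0, 5.0, 12.0, 7.0, 5.0]

# Provisioning for the peak avoids any shortfall but leaves capacity idle.
peak_idle, peak_short = provisioning_gap(demand, max(demand))

# Provisioning for the average cuts idle capacity but fails during the spike.
mean_idle, mean_short = provisioning_gap(demand, sum(demand) / len(demand))
```

With these invented numbers, sizing for the peak leaves 5.5 TB idle on average with no shortfall, while sizing for the average leaves only 1.0 TB idle but falls 5.5 TB short in the spike month.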