Many in the IT industry seem to enjoy arguing about exactly what does and does not constitute a cloud service. As I mentioned in my post on the controversy over private cloud services, I do not feel that these arguments are productive. We should focus on results and business value instead of arguing about semantics. However, the current crop of cloud storage solutions differs from traditional SAN and NAS storage in many important ways, something that seems to surprise many end users I meet. Cloud storage capacity is not your father's blocks and files!
Primary, Secondary, and Tiered Storage
Most IT infrastructures contain a wide variety of storage devices, but these have traditionally been divided into two categories:
- Primary or production storage serves active applications and is accessed randomly. The primary category includes most familiar direct-attached disks (DAS), storage area networks (SAN), and network-attached storage (NAS). Newcomers in the primary category include content-addressable storage (CAS) and cloud storage services, including the Nirvanix Cloud Storage Service.
- Secondary storage is used for data protection and is normally accessed sequentially. Tape media and optical discs were the traditional secondary storage types, but disk-based systems including virtual tape libraries (VTL) have recently become popular. CAS and cloud systems are also often used for secondary storage due to their lower cost.
The performance and capability of primary storage systems vary greatly, as does the price. For this reason, many large organizations classify their primary storage into a number of tiers. Tier 1 storage typically boasts the highest performance, reliability, and cost. Fibre Channel SAN arrays from companies like EMC, HDS, and IBM have dominated this market for over a decade. Most organizations also offer less expensive lower-tier SAN, NAS, and DAS capacity in an effort to reduce their capital equipment cost.
Primary Storage Options
IT architects are faced with a dizzying variety of primary storage options. Dozens of companies build and sell storage devices, and these leverage a variety of connectivity protocols. Each type of storage presents trade-offs in a number of areas, from performance to cost. There is no intrinsic reason to reject one type or adopt another; the selection process must take into account the technical and business requirements of the application that will use it.
A large number of options are available for primary storage
Enterprise storage technology has evolved a great deal over four decades. The first great step was the separation of the disk from the server in the mid-1960s. Over the next 20 years, protocols were developed to share disk storage among multiple servers, creating the first storage networks. The introduction of RAID in the 1980s led to the development of more virtualized SAN storage systems in the next decade. At the same time, networking companies developed file sharing protocols, creating the NAS market. By the end of the 1990s, the enterprise storage market was divided between block-based SAN and file-based NAS.
The limitations of these block- and file-focused paradigms led to the development of content-addressable storage in the first half of this decade. CAS systems discarded traditional protocols and concepts in favor of application-focused APIs and a universal naming standard for unique objects. Many early applications treated CAS objects as simple files. But applications were soon developed to take advantage of the unique capabilities of these systems, especially in the document management and archiving space.
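To make the idea of content addressing concrete, here is a minimal in-memory sketch. It assumes that an object's name is simply the SHA-256 digest of its bytes, which is one common way to build such a naming scheme and not a description of any particular CAS product. The class and method names are hypothetical.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS: an object's address is the SHA-256 digest of its bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Identical content always yields the same address,
        # so a duplicate write does not store a second copy.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hash on read: the address doubles as an integrity check.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._objects)
```

Note that the application never chooses a path or a LUN; it hands over content and gets back a name. That property is what makes this style of storage attractive for archiving, where proving that a document has not changed matters more than where it lives.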
Enter The Cloud
Cloud storage was developed independently from all historical storage concepts, although it might appear to be an evolution of CAS. Both are object-based, use APIs rather than traditional storage protocols, and include per-object metadata. In fact, it is fairly straightforward to integrate today's cloud storage systems into applications developed to leverage CAS. But cloud storage goes further in terms of application integration and programmability (take a look at the Nirvanix API, for example). Vendors have added many features, from replication to indexing to media transcoding, each of which can be called by applications through custom APIs. Cloud storage also leverages the openness of the Internet and modern programming concepts, incorporating the Internet Protocol (IP), HTTP, SSL, REST, and SOAP.
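As a rough illustration of what "object-based, over HTTP, with per-object metadata" can look like from the application side, the sketch below builds (but does not send) a REST-style upload request using only the Python standard library. The endpoint `storage.example.com`, the `X-Meta-` header prefix, and the helper names are all assumptions made up for this example; they are not the Nirvanix API or any other vendor's.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://storage.example.com"


def metadata_headers(metadata):
    """Turn per-object metadata into HTTP headers (the prefix is an assumption)."""
    return {f"X-Meta-{key}": str(value) for key, value in metadata.items()}


def build_put_request(container, name, data, metadata):
    """Describe an object upload as a single HTTP PUT; nothing is sent here."""
    # quote() percent-encodes spaces and other unsafe characters but keeps "/".
    url = f"{BASE_URL}/{quote(container)}/{quote(name)}"
    headers = {"Content-Type": "application/octet-stream"}
    headers.update(metadata_headers(metadata))
    return Request(url, data=data, headers=headers, method="PUT")


request = build_put_request(
    "docs", "reports/q1 summary.pdf", b"%PDF-...", {"department": "finance"}
)
```

The whole "storage protocol" is a URL, a verb, a body, and some headers. There is no volume to mount and no block device to format, which is why any language with an HTTP library can talk to this kind of storage.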
This is not to say that cloud storage can only be used by specialized applications, however. Most cloud systems include basic web browser interfaces. More interestingly, many interface solutions have been developed to bridge traditional storage protocols to the cloud. One major contributor to the success of Amazon's S3 storage offering was Jungle Disk, a consumer-oriented application that allows users to automatically back up their files to the service. Nirvanix developed CloudNAS for enterprise users, which presents the cloud storage service as a Linux filesystem or Windows drive. And EMC and Emulex recently revealed that they are working on a bridge between block-based SANs and cloud storage.
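The bridging idea can be sketched in a few lines. The toy gateway below offers file-style calls on top of a flat key-to-bytes dictionary that stands in for a remote object store. It is purely illustrative: the class and method names are hypothetical, and it says nothing about how CloudNAS, Jungle Disk, or any real gateway is implemented.

```python
import posixpath


class FileGateway:
    """Toy bridge: file-style calls on top of a flat object namespace."""

    def __init__(self):
        # Stand-in for a remote object store: flat keys, no real directories.
        self._objects = {}

    @staticmethod
    def _key(path):
        # Normalize a file path into an object key, e.g. "/a/../b//c.txt" -> "b/c.txt".
        return posixpath.normpath("/" + path).lstrip("/")

    def write_file(self, path, data):
        self._objects[self._key(path)] = data

    def read_file(self, path):
        return self._objects[self._key(path)]

    def listdir(self, path):
        # Directories are emulated: they are just shared key prefixes.
        prefix = self._key(path)
        prefix = prefix + "/" if prefix else ""
        names = set()
        for key in self._objects:
            if key.startswith(prefix):
                names.add(key[len(prefix):].split("/", 1)[0])
        return sorted(names)
```

The design point is that the "filesystem" exists only in the gateway. The store underneath still sees nothing but named objects, so existing applications keep working while the data itself lives in the cloud.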
Although it can be leveraged by existing applications, often at lower cost, the real benefit from cloud storage comes when applications take advantage of its compelling distribution, collaboration, and programmability capabilities. The entire storage industry is moving toward greater levels of application awareness and integration. LUNs (fake disk drives) served up by SAN arrays are being hidden behind shared file systems in the server virtualization space. NAS is also being updated for greater integration with applications. This is a necessary step to bring about a real storage revolution that will see a transition from bulk management of capacity to granular management of data to integrated use of information.
That's why cloud storage is different, and why cloud storage matters!