Enterprise Storage Strategies

Deploying enterprise storage has never been more confusing, with a wide variety of technology choices available. On this blog, Nirvanix Director of Consulting, Stephen Foskett, presents proven strategies for building an internal storage service in the enterprise.

August 2009 - Posts

  • Why Isn't Storage Getting Cheaper? Part 4: The Glass Floor

    Storage capacity keeps growing, but unstructured data grows at least as fast. IT organizations have tried to contain costs, but tiered storage has not worked out that well. Although there are technical limits to the effectiveness of tiered storage, the biggest challenge is a business one: Disk drives are a very small component in the overall cost of storage.

    Why isn't storage getting cheaper? This series of articles attempts to answer this question:

    1. Too Cheap to Manage
    2. Too Much to Manage
    3. Tiered Storage
    4. The Glass Floor
    5. Storage as a Service

    The Glass Floor

    As I've mentioned in the past, there is a "glass floor" below which storage costs simply cannot drop for technical reasons. This seriously limits the effectiveness of cost-saving activities in general and tiered storage concepts in particular!

    Hardware Costs

    Consider an average enterprise storage array. For about $250,000 you get a rack-sized device with all sorts of software and hardware capabilities, plus about 50 disk drives. If we were just buying "spinning rust," as industry wags often claim, no one would consider buying an enterprise storage array. Instead, enterprise storage companies sell their kit for its value add over plain old disk capacity (JBOD): Networked enterprise storage offers tremendous advantages over JBOD. These advantages are apparently so compelling that businesses everywhere have been ready to spend serious amounts of money for comparatively little disk space.

    The disk drives themselves aren't run-of-the-mill consumer items, but they're not all that expensive compared to the total cost of a storage array. Each of those 50 drives probably lists for $2,000, so disk drives make up 40% of the list price of our example storage array. Much ink has been spilled arguing whether enterprise disk drives are worth their list price, which can be 10 times higher than consumer units. But disk drives are not manufactured by enterprise array companies anymore, and they are purchased at steep discounts in massive numbers. Most of that cost is profit to the array vendor, and they are normally willing to sacrifice margin there rather than on their core value-added software features. After negotiation, disk capacity probably makes up less than 10% of the purchase price of a storage solution. The disk drives themselves are simply not a major component of the cost of enterprise storage.
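    The list-price arithmetic above can be checked in a few lines. This is only a sketch using the example figures from this article (a $250,000 array with 50 drives listing at $2,000 each); the function and variable names are mine, chosen for illustration.

```python
# Example figures from the article: a $250,000 array holding
# 50 drives that each list for $2,000.
ARRAY_LIST_PRICE = 250_000
DRIVE_COUNT = 50
DRIVE_LIST_PRICE = 2_000


def disk_share(array_price, drive_count, drive_price):
    """Return the fraction of the array price attributable to disk drives."""
    return drive_count * drive_price / array_price


list_share = disk_share(ARRAY_LIST_PRICE, DRIVE_COUNT, DRIVE_LIST_PRICE)
print(f"Disk share of list price: {list_share:.0%}")  # prints 40%
```

    Plugging in your own quote is the useful part: the negotiated figures will differ from list, and the share attributable to disk usually shrinks further.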

    Let us now apply these basic facts. Organizations typically implement tiered storage in one of the following ways:

    1. Extra-array tiering involves the use of multiple independent storage systems. An organization might buy a smaller amount of capacity on expensive high-end arrays and a larger amount on cheaper midrange models.
    2. Intra-array tiering requires a storage array or virtualization controller that supports multiple disk drive types. Such systems can often also move all or some of a virtual drive, or LUN, between these internal tiers of storage.

    Adding an entirely new type of storage array presents a major challenge for IT. The initial purchase price is high, there is a capacity planning and product selection process to deal with, and administrators have to be trained in managing this new system. Then comes the process of migrating data from the existing tier to this new tier of storage. Most have found that the somewhat lower unit cost for capacity on low-end storage systems is offset by the added overhead involved in buying and managing these arrays. Since the price of a disk drive makes up little of the total cost of a storage solution based on even the least-expensive storage system, extra-array tiering often fails to deliver the expected savings.

    Many enterprise storage vendors are also touting the ability of their larger storage systems to support multiple disk types at different price points. This intra-array tiering approach is made more enticing with the promise of automated storage tiering. But the effectiveness of intra-array tiering is limited by the high cost of the storage systems involved. Let us imagine that disk drives made up 40% of the purchase price of such a solution; a 50% savings on 50% of the disks would represent an overall savings of about 10%. And even this cost savings might be eaten up by the additional cache memory, controllers, licenses, and maintenance cost required to support more disks. The automated storage tiering software features often have high additional price tags as well!
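    The savings estimate above is just three fractions multiplied together. Here is a minimal sketch of that calculation; the function name and parameters are my own labels, and the second call uses 10% as an upper bound for the post-negotiation disk share mentioned earlier in this article.

```python
def tiering_savings(disk_share, cheaper_fraction, discount):
    """Overall savings = disk share of price x fraction of disks on the
    cheaper tier x discount on those cheaper disks."""
    return disk_share * cheaper_fraction * discount


# The scenario from the paragraph above: disks are 40% of the price,
# half of them move to drives that cost half as much.
at_list = tiering_savings(disk_share=0.40, cheaper_fraction=0.50, discount=0.50)
print(f"Savings with a 40% disk share: {at_list:.0%}")  # prints 10%

# Same tiering plan, but with disks at 10% of the negotiated price
# (the article says "less than 10%", so this is a ceiling).
upper_bound = tiering_savings(disk_share=0.10, cheaper_fraction=0.50, discount=0.50)
print(f"Savings with a 10% disk share: {upper_bound:.1%}")  # prints 2.5%
```

    Neither figure accounts for the extra cache, controllers, licenses, or tiering software, all of which push the real number lower.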

    In truth, neither extra-array nor intra-array tiered storage delivers much benefit even in the best of circumstances. In typical scenarios, tiered storage often results in no cost savings whatsoever.

    Operational Cost

    As we have seen, the falling cost of disk drive capacity fails to deliver the savings needed to support the ever-growing volume of data in businesses, since it makes up such a small part of the cost of an enterprise storage system. But storage costs as a component of IT budgets are not falling. In fact, storage costs are escalating relative to other IT disciplines.

    The high cost of operating and managing enterprise storage systems tends to soak up any cost savings from less-expensive hardware. More data requires more equipment, more floor space, more electricity and cooling, more engineering, administration, and operational personnel, more backup issues, more compliance concerns, and more management oversight. In short, a given amount of storage requires the same operational expense regardless of the cost of the hardware it sits on.

    Operational expense hits mid-sized businesses hardest. The smallest organizations rely on just one or two IT staff, so hardware cost is a major concern. The largest businesses have such a huge footprint of data that the cost of storage gear dwarfs their operational expense. But no matter the size of the business, simple changes to storage hardware infrastructure have failed to deliver any overall IT budget savings.

    It is simply not enough to attack the hardware side of the equation with cheaper disk drives. Organizations must dramatically lower their aggregate storage infrastructure costs and the associated operational costs at the same time. The impact that managed storage services can have on this equation is tomorrow's topic for this series!

  • Why Isn't Storage Getting Cheaper? Part 3: Tiered Storage

    The growth of storage capacity led to an attitude that storage was too cheap to manage, but this didn't last long. Before we knew it, IT was faced with a flood of data, easily too much to manage. Faced with limits to their ability to control data growth, IT tried to get the business interested in information lifecycle management (ILM). When this didn't work, they opted for cheaper capacity. Today, we look into the outcome of this tiered storage adventure.

    Why isn't storage getting cheaper? This series of articles attempts to answer this question:

    1. Too Cheap to Manage
    2. Too Much to Manage
    3. Tiered Storage
    4. The Glass Floor
    5. Storage as a Service

    Tiered Storage

    The uncontrolled growth of unstructured data left IT managers trying to reduce storage costs at the wrong end: Frantically adding cheaper capacity and trying to lower overall cost with tiered storage. This was simply the only option they had after the business failed to focus on data classification, content management, and information lifecycle management concepts.

    Classical tiered storage assumed that there was flexibility in the relationship
    between capacity, performance, and cost

    Tiered storage was essentially information lifecycle management (ILM) without the lifecycle: It assumed that high-performance storage is scarce, expensive, and ought to be reserved for high-value applications. Since most enterprise storage was taken as high-performance, the implementation process generally revolved around adding cheap "bulk storage" and migrating data down the pyramid.

    Tiered storage was supposed to dramatically reduce cost, but it has not helped all that much. There are a few reasons for this:

    1. Automation has been something of a holy grail for tiered storage, with data classification engines algorithmically assigning value and data movers migrating content between tiers. But neither element worked all that well in practice, so most tiered storage architectures lack automation: Two or three tiers of storage were installed, and whole applications were manually placed where the administrators thought appropriate.
    2. A lack of automation implies a lack of granularity of data placement. Most implementations require entire LUNs (also called drives, drive letters, or volumes) to be manually placed on initial creation. This unit of management is simply too large: a LUN mixes active and idle data, so much of it inevitably ends up on the wrong storage tier.
    3. Some storage hardware offers automated internal tiering to various degrees, but these tend to be inherently expensive. If a high-end enterprise storage array is required to implement tiers, the difference in cost between one disk type and another is unlikely to make a significant difference.
    4. The expense and effort of migration and management further reduce the impact of even the best automated tiered storage approaches. This is something I will get into in the next article in this series.
    5. Some placed tape in the tiered storage pyramid, but the offline nature of content on tape causes serious issues. Unless an integrated hierarchical storage management (HSM) application is used (as is the case with many mainframe systems), tape really is not a tier of online or primary storage. It's an alternate location for data protection.
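    The granularity problem in the second point is easy to see with a toy model. Everything below is made up for illustration: the LUN names, the per-block access counts, and the "hot" threshold are assumptions, not measurements from any real array.

```python
# Hypothetical LUNs: each maps to a list of per-block access counts.
# All names and numbers here are invented for illustration.
luns = {
    "erp_db":     [900, 850, 5, 2, 0, 0, 1, 0],
    "file_share": [40, 0, 0, 0, 0, 0, 0, 0],
    "archive":    [0, 1, 0, 0, 0, 0, 0, 0],
}
HOT_THRESHOLD = 30  # assumed cutoff: this many accesses or more counts as "hot"


def lun_level_fast_blocks(luns, threshold):
    """Whole-LUN placement: a single hot block puts every block
    of that LUN on the fast tier."""
    return sum(
        len(blocks)
        for blocks in luns.values()
        if any(count >= threshold for count in blocks)
    )


def block_level_fast_blocks(luns, threshold):
    """Sub-LUN placement: only the hot blocks themselves
    go on the fast tier."""
    return sum(
        count >= threshold
        for blocks in luns.values()
        for count in blocks
    )


total = sum(len(blocks) for blocks in luns.values())
print(f"Whole-LUN placement: {lun_level_fast_blocks(luns, HOT_THRESHOLD)} of {total} blocks on the fast tier")
print(f"Sub-LUN placement:   {block_level_fast_blocks(luns, HOT_THRESHOLD)} of {total} blocks on the fast tier")
```

    With this sample data, whole-LUN placement puts 16 of 24 blocks on the expensive tier while only 3 of them are actually busy. Real workloads are messier, but the shape of the problem is the same.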

    Solid-state disk and cloud storage can complement traditional on-site disk,
    delivering real performance and cost advantages

    All is not lost, however. We can try to address each of these deficiencies, and many storage vendors are focused on doing just this. Hu Yoshida of HDS suggests that flash makes up the apex, and most of the pyramid will be taken up by disk. IBM's Barry Whyte even suggests that perhaps the tip of the pyramid might actually expand into an egg timer shape. Regardless, both agree that tiered storage is changing, especially with regard to what we can expect from conventional hard disk drives. But neither talks about where this storage will reside. Considering the massive cost of on-site disk storage, cloud storage looks like a great alternative for lower-tier data.

    The next article in this series takes a deeper look into the reason that reducing the cost of disk capacity has not impacted the overall cost of enterprise storage.

  • Why Isn't Storage Getting Cheaper? Part 2: Too Much to Manage

    As discussed yesterday, the incredible growth of storage capacity led to an attitude that storage was too cheap to manage. Excess data capacity seemed always to absorb any new demand. But the unchecked growth of data led to the serious issues that storage managers face today: Difficulties in protecting massive data sets, concerns about compliance and litigation, and storage budgets that refuse to shrink.

    Why isn't storage getting cheaper? This series of articles attempts to answer this question:

    1. Too Cheap to Manage
    2. Too Much to Manage
    3. Tiered Storage
    4. The Glass Floor
    5. Storage as a Service

    Too Much to Manage

    There is a name for the digital swamp we've found ourselves in: Unstructured data. That first word is the critical one: Open systems store files in a user-created directory tree, with each file residing in one location in that tree. This storage paradigm creates a maze of paths leading to pockets of data here and there. Although many organizations try to tame this mess with directory and file naming standards, there is only so much they can do.

    As most have already discovered, the filename/directory location concept inherently limits the organization and usability of file systems. This is the reason we call this type of data unstructured! IT has responded with "resource management" and search tools that try to index unstructured data and make it more usable, but these systems can only go so far. Another attempted solution is content management systems, which organize documents with version control, keywords, and much more flexible filing concepts than a simple tree.

    Although the proponents of organization have been successful in some locations, the majority of us still drown in unstructured data. It fills up our personal computers and file servers, hampering productivity and limiting the benefits of ever-increasing capacity and ever-dropping cost. But it's not just a lack of organization that is turning "too cheap to manage" into "too much to manage."

    Although office files have expanded, multimedia killed the disk quota. The new XML-based document formats are actually quite efficient, but the proliferation of embedded graphics, audio, and movies causes them to balloon from tens of kilobytes to tens of megabytes or more. With our multi-megapixel digital photos, iTunes libraries, and ripped movies, giant disk drives are like a drug for digital pack rats. A basic laptop now has a 160 GB hard disk, and terabyte laptop drives are on the way, enabling the problem rather than forcing a solution.

    We don't have storage problems; we've got usage issues. Quotas are passé, and storage resource management (SRM) and hierarchical storage management (HSM) software never took off. Hard system limits are about all that stops the flood of data, and vendors are working to eliminate these, too.

    This is one answer to the core question of this series: Storage isn't getting cheaper because even as capacity expands, our use of space keeps growing unchecked. But there are other reasons as well: The next entry in this series explores the extent to which tiered storage can address the issue of data storage cost.

  • Why Isn't Storage Getting Cheaper? Part 1: Too Cheap to Manage

    The application of Moore's Law may have led to incredible advances in computing, but the growth of storage capacity is even more impressive. Disk and tape storage density has doubled every year, driving out cost and bulk, a phenomenon sometimes called Kryder's law. $250 bought an amazing 20 GB of hard disk space in 1999, but this is nothing compared to the 2 TB that can be had for the same money today. Yet enterprise storage costs have not dropped. The real cost of an enterprise storage array has not fallen significantly in that same period. In fact, the total cost of maintaining storage has actually increased as a percentage of IT budgets.
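    For a sense of scale, the $250 price point above can be turned into a growth rate. This sketch assumes "today" means 2009, the year of these posts, and treats 2 TB as 2,000 GB; it measures capacity per dollar for this one retail example, which is not the same thing as the density figure.

```python
import math

START_GB = 20      # what $250 bought in 1999
END_GB = 2_000     # what $250 buys "today" (2 TB, taken as 2,000 GB)
YEARS = 10         # assumption: "today" is 2009

growth = END_GB / START_GB                 # total growth factor
annual_rate = growth ** (1 / YEARS) - 1    # compound annual growth rate
doubling_time = YEARS / math.log2(growth)  # years per doubling

print(f"Capacity per dollar grew {growth:.0f}x in {YEARS} years")
print(f"Compound annual growth: {annual_rate:.0%}")
print(f"Implied doubling time: {doubling_time:.1f} years")
```

    By this rough measure, a dollar bought 100 times more capacity after a decade, or a doubling about every year and a half.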

    Why isn't storage getting cheaper? This series of articles attempts to answer this question:

    1. Too Cheap to Manage
    2. Too Much to Manage
    3. Tiered Storage
    4. The Glass Floor
    5. Storage as a Service

    Too Cheap to Manage

    In the 1960s, the development of nuclear power promised to reduce the cost of electricity to zero. "Too cheap to meter" meant the rapid switch to electricity throughout the home, exemplified by the widespread application of electric baseboard heating. We all know how this story turned out: Rapid increases in power usage overtaxed the ancient power grid, nuclear power was halted by environmental and NIMBY protests, and the energy supply remains one of the most important issues facing humanity.

    My first hard disk drive boasted a generous 20 MB of capacity. With the addition of compression, this lasted me from 1988 through about 1993, when I finally ran out of space and had to upgrade. My next computer included a 100 MB drive, and I was very pleased indeed to be able to take along all of my old files and still have plenty of capacity for my new work. This process repeated every few years: I remember the joy that 200 MB, 840 MB, 1.2 GB, and 10 GB brought me through the decade. Each time I upgraded, the new capacity seemed limitless. More than I would ever need. Too cheap to manage.

    Disk drive capacity growth has been phenomenal for the last 20 years.
    (Public domain image from Wikimedia Commons)

    "Too cheap to manage" was the cop-out for corporate IT. I remember installing my first 10 GB storage array. I remember EMC announcing the amazing 1 TB Symmetrix 3000. I remember migrating 20 TB through a temporary Fibre Channel SAN. I had stopped being amazed by storage capacity. Instead, I was amazed by the size of the data sets I was working with. We once used quotas to limit users to 5 MB of storage on the corporate file server, but quotas went the way of the dodo once multi-TB NAS filers were installed.

    But data growth suddenly caught up with us. "Too cheap to manage" led to "too much (data) to manage."