Tape in the Age of Big Data Growth
Data growth and tape capacity growth are two converging trends in 2015. In its latest EMC-sponsored Digital Universe study, the International Data Corporation (IDC) projects that global data volume will grow over the next six years from 4.4 zettabytes to 44 zettabytes. Much of this growth will be driven by the Internet of Things, which will include everything from jet engines and cars to mobile phones and dog collars.
To generate value, this data must be analyzed. Some of the analysis will happen in near-real time, but some trends show up only in longer-term studies that call for older data. One of the greatest challenges of deriving value through the analysis of big data is finding a place to store it.
According to the IDC, data is outpacing storage. In 2013, the available storage capacity could hold just 33 percent of the digital universe; by 2020, it will be
able to store less than 15 percent.
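To put those figures in perspective, here is a rough back-of-the-envelope sketch of the annual growth rates they imply. The function name is ours, and the calculation assumes the six-year span quoted above applies to both the data volume and the storage coverage figures; the IDC study itself may define the period differently.

```python
def compound_annual_growth_rate(start: float, end: float, years: float) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1.0 / years) - 1.0


YEARS = 6  # assumption: the "next six years" quoted in the text

# Data volume: 4.4 ZB growing to 44 ZB.
data_rate = compound_annual_growth_rate(4.4, 44.0, YEARS)

# Storage capacity: 33 percent of 4.4 ZB at the start, and at most
# 15 percent of 44 ZB at the end (the text says "less than 15 percent").
capacity_start_zb = 0.33 * 4.4
capacity_end_zb = 0.15 * 44.0
capacity_rate = compound_annual_growth_rate(capacity_start_zb, capacity_end_zb, YEARS)

print(f"Implied data growth per year: {data_rate:.1%}")
print(f"Implied upper bound on capacity growth per year: {capacity_rate:.1%}")
```

Under these assumptions, data grows by roughly 47 percent a year while available capacity grows by less than about 29 percent a year, which is the gap the IDC figures describe.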
Fortunately, there may be an answer to that challenge: tape.
2014 was a big year for tape storage. IBM (in partnership with Fujifilm) and Sony each announced new tape technology that promises to increase the uncompressed capacity of a Linear Tape-Open (LTO) tape to 155–185 terabytes per cartridge. Vendors are working with the open Linear Tape File System (LTFS) standard to advance new technologies such as Spectra Logic's BlackPearl, which will significantly decrease the load and retrieval times for smaller files from LTO libraries. IBM is integrating the General Parallel File System, which is broadly used in large-scale cloud and supercomputing, with LTFS so that entire archives can be easily retrieved around the world.
The confluence of big data and new tape technologies has experts sheepishly admitting that perhaps tape is not really dead after all. In fact, it may even be the savior of big data. One need only look at some of the largest big data analytics implementations in the world: most of them rely on LTO- and LTFS-based tape archives to support their computing operations at a cost far below that of flash or spinning disk.
Big Data on Tape Case Studies
Consider two large supercomputing analytics users that rely on LTO- and LTFS-based storage architectures for their needs.
The National Center for Supercomputing Applications (NCSA), which runs the world's largest data archive, has more than 300 petabytes of LTFS-integrated tape library-based archives. As the center prepares to run simulations, it moves the required data out of the archives and onto spinning disk for analysis.
The NCSA's use of its tape archive for big data analytics mirrors one of the two major use cases for big data: big data at rest. The other, big data in motion, is the real-time analysis of streaming data. That use case is all about making actionable predictions about what will happen in the next hour, day or week based on what happened in the past hour, day or week. What makes LTFS valuable here is that it can leave a copy of the metadata describing the streaming data in an easily searchable flash or hard disk array, so that when longer-term or larger analyses of big data at rest are run, such as the NCSA's, the data can be easily found and restored to the in-memory or hard disk array needed for the analysis. Indeed, IBM is now experimenting with the use of predictive analytics linked to LTFS to assist in creating policies about which type of data should be stored where based on prior
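The arrangement described above, with searchable metadata on fast storage and the bulk data on tape, can be illustrated with a small sketch. This is not the LTFS API or any vendor's implementation: the class names, fields, cartridge labels and file paths are all hypothetical, and an in-memory list stands in for the flash or disk index.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata kept on fast storage for one file that lives on tape."""
    path: str        # file path as recorded at archive time (hypothetical)
    cartridge: str   # hypothetical cartridge label, e.g. "TAPE001"
    day: str         # ISO date (YYYY-MM-DD) on which the data was captured
    size_bytes: int


class TapeCatalog:
    """In-memory stand-in for a metadata index held on flash or disk."""

    def __init__(self) -> None:
        self._entries: list[CatalogEntry] = []

    def add(self, entry: CatalogEntry) -> None:
        self._entries.append(entry)

    def search(self, first_day: str, last_day: str) -> list[CatalogEntry]:
        """Return entries captured between the two days, inclusive.

        Only the index is consulted; no tape is mounted.
        """
        # ISO dates sort correctly as plain strings.
        return [e for e in self._entries if first_day <= e.day <= last_day]


def recall_plan(entries: list[CatalogEntry]) -> dict[str, list[str]]:
    """Group files by cartridge so each tape needs to be mounted only once."""
    plan: dict[str, list[str]] = defaultdict(list)
    for entry in entries:
        plan[entry.cartridge].append(entry.path)
    return {cartridge: sorted(paths) for cartridge, paths in sorted(plan.items())}


catalog = TapeCatalog()
catalog.add(CatalogEntry("/sensors/2015-01-03.dat", "TAPE002", "2015-01-03", 4_000_000))
catalog.add(CatalogEntry("/sensors/2015-01-01.dat", "TAPE001", "2015-01-01", 5_000_000))
catalog.add(CatalogEntry("/sensors/2015-01-02.dat", "TAPE001", "2015-01-02", 6_000_000))
catalog.add(CatalogEntry("/sensors/2015-03-09.dat", "TAPE003", "2015-03-09", 7_000_000))

january = catalog.search("2015-01-01", "2015-01-31")
print(recall_plan(january))
```

The point of the design is that the search runs entirely against the fast index, and tape is touched only for the cartridges that actually hold the data needed for the longer-term analysis.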
Another example of big data growth driving the use of LTO and LTFS is the physics research center at CERN, where data is growing by more than 50 petabytes annually. CERN has more than 250 petabytes of data from physics experiments stored on its LTO tapes. When it wants to run analyses of big data at rest, it transfers the data to Hadoop for processing.
Big data is one of the most pressing issues for businesses, which is why organizations
should better understand how a modern tape management architecture can help them meet their big data requirements.
Do you have questions about data management? Read additional Knowledge Center stories on this subject, or contact Iron Mountain's Data Management team. You'll be connected with a knowledgeable product and services specialist who can address your specific challenges.