The growing trend of secondary storage
Secondary data storage continues to evolve, taking on new roles and delivering new business benefits.
Context is essential for secondary storage, as the term can refer to different tiers, classes, categories and types of technology. There are also different applications (new and old) that are generating and consuming larger amounts of data. These include legacy structured, semi-structured and unstructured little data as well as big data-based applications. Data types include text, documents, email attachments, databases, videos, images, audio and event log telemetry streams, among others.
Some existing and new application workloads driving the need for more secondary storage include:
- Data protection and preservation (backup and archive)
- Video, audio and image analysis, as well as facial recognition
- Cognitive, artificial intelligence (AI) and machine learning (ML)
- Immersive augmented reality (AR) and virtual reality (VR)
- IoT, telemetry, batch and real-time clickstream analytics
Application workloads and secondary data storage trends include:
- There is more data (quantity and volume)
- Data is getting larger (size of data items)
- Faster rate of access (velocity or speed of creation and access)
- Dependence on information (need for durable data protection)
- Compliance, regulations, changing data value (longer immutable retention)
- Availability and accessibility when and where needed for an online, always-on world
So, what are the fundamentals? Secondary storage includes those solutions that are not considered high-performance, mission-critical, enterprise-class systems, such as large, expensive all-flash arrays (AFAs), hybrid and similar storage systems. Secondary storage systems tend to cost less, have a higher capacity and, in some cases, are lower-end systems that offer fewer data management features.
Be careful about judging or classifying a solution merely by its physical size, appearance, cost or vendor logo. After all, not everything is the same (remember, context matters).
An evolving trend within secondary storage is supporting changing life cycles as well as access patterns. Instead of data being active and then going dormant after it’s created (e.g. candidate for archiving), data is now being routinely accessed over time (candidate for online or active archive). This increased and continued access determines how secondary data storage is used, and drives supporting technology trends.
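To make the difference between a dormant archive and an active archive concrete, here is a minimal sketch that suggests a tier from how recently an item was last read. The function name, tier labels and day thresholds are all hypothetical, chosen only for illustration; a real policy would depend on the workload and the business.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds (assumptions, not taken from any product or standard).
ACTIVE_WINDOW = timedelta(days=30)
ACTIVE_ARCHIVE_WINDOW = timedelta(days=365)


def suggest_tier(last_access: datetime, now: datetime) -> str:
    """Suggest a storage tier from how recently an item was last read."""
    age = now - last_access
    if age <= ACTIVE_WINDOW:
        return "online"
    if age <= ACTIVE_ARCHIVE_WINDOW:
        return "active-archive"
    return "cold-archive"


now = datetime(2024, 1, 1)
print(suggest_tier(datetime(2023, 12, 20), now))  # read in the last month
print(suggest_tier(datetime(2023, 6, 1), now))    # read within the last year
print(suggest_tier(datetime(2020, 1, 1), now))    # dormant for years
```

Data that keeps being read stays in the online or active-archive tiers rather than aging straight into a cold archive, which is the shift in access patterns described above.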
Some applications use secondary storage as their central repository, storing data in data ponds, pools and lakes into which data streams flow. Applications and workloads include home directories and shares, media and entertainment, energy exploration and life sciences. Other applications that rely on secondary storage as a primary storage resource include security and surveillance video, big data analytics, telemetry, activity event logs, AI, ML, IoT and others.
Technologies that are part of secondary storage include:
- On-line, near-line, off-line as well as fixed and removable media (SSD, HDD, tape, optical)
- Solutions (systems, appliances, services) for bulk, low-cost, high-capacity scale-out
- Legacy and software-defined cloud, virtual, container and converged
- Data footprint reduction (DFR) such as compression, deduplication and tiering
- Management features such as WORM, immutable, quotas and analytics
- Data protection (availability, durability, security, mirroring, RAID and erasure codes)
- Direct attached and networked access to local, remote and cloud storage resources
- Access via database, key-value, API, object, file and block protocols
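As an illustration of the data footprint reduction item in the list above, the following sketch combines two of the named techniques: deduplication (storing each unique chunk once) and compression. It assumes fixed-size chunks, SHA-256 digests and zlib, which are simplifications picked for brevity; the function names are hypothetical and not drawn from any particular storage product.

```python
import hashlib
import zlib


def store_chunks(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep one compressed copy per unique chunk."""
    store = {}    # chunk digest -> compressed chunk bytes
    recipe = []   # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:          # deduplication: skip chunks already held
            store[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data by decompressing chunks in recipe order."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


# Highly repetitive sample data, so both techniques have something to remove.
data = (b"A" * 4096) * 8 + (b"B" * 4096) * 2
store, recipe = store_chunks(data)
stored_bytes = sum(len(chunk) for chunk in store.values())

print(len(recipe), "chunks referenced,", len(store), "chunks stored")
print("bytes before:", len(data), "bytes after:", stored_bytes)
assert restore(store, recipe) == data
```

The sample data is deliberately repetitive; real reduction ratios vary widely with the data type, and already-compressed content such as video often reduces very little.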
Other technologies found in secondary storage include:
- Policy-based management for availability, data protection, capacity and performance
- Data protection and security: access control, encryption (at rest, in-flight), versioning and logging
- Metadata management, application integration, indexing, search and tagging
- Namespace and end-point management (local, remote and cloud)
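Policy-based management with WORM and immutable retention can be pictured with a small sketch like the one below. The class names, policy names and retention periods are hypothetical and exist only to show the idea: a policy attached to each stored object decides whether it may be deleted or overwritten on a given date.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    name: str
    retain_days: int
    worm: bool  # write once, read many: no overwrite while retained


@dataclass
class StoredObject:
    key: str
    created: date
    policy: RetentionPolicy


def can_delete(obj: StoredObject, today: date) -> bool:
    """Allow deletion only once the retention period has passed."""
    expires = obj.created + timedelta(days=obj.policy.retain_days)
    return today >= expires


def can_overwrite(obj: StoredObject, today: date) -> bool:
    """WORM objects stay immutable for the whole retention period."""
    if not obj.policy.worm:
        return True
    return can_delete(obj, today)


# Hypothetical policies for illustration only.
compliance = RetentionPolicy("compliance-7y", retain_days=7 * 365, worm=True)
scratch = RetentionPolicy("scratch-30d", retain_days=30, worm=False)

today = date(2024, 1, 1)
audit_log = StoredObject("logs/audit-2020.gz", date(2020, 1, 1), compliance)
temp_file = StoredObject("tmp/render-cache.bin", date(2023, 11, 1), scratch)

print(can_delete(audit_log, today), can_overwrite(audit_log, today))
print(can_delete(temp_file, today), can_overwrite(temp_file, today))
```

Keeping the rules in a policy object rather than in each application is what lets the same storage serve both long-retention compliance data and short-lived working data.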
Some secondary storage solutions and services trends include:
- Enabling more data to be retained longer, more efficiently and cost-effectively
- Creation of data pools, ponds and lakes holding data of known and unknown value
- Providing fuel to AI, ML, AR, VR, big data analytics and other data-hungry application workloads
- Increased density of effective data stored per cubic foot or meter beyond raw device capacity
- Evolving from simple, low-cost, high-capacity but less reliable “cheap and deep” storage to durable and productive storage
- Being used as primary storage by some applications and environments for its value rather than just its low cost
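The trend toward more effective data stored than raw device capacity comes down to simple arithmetic, sketched below. It assumes that usable capacity is raw capacity multiplied by the efficiency of the protection scheme, and that effective capacity is usable capacity multiplied by the data footprint reduction ratio. The function names, the 1,000 TB raw figure, the 8+2 erasure code layout and the 2:1 reduction ratio are illustrative assumptions, not measurements.

```python
def mirror_efficiency(copies: int) -> float:
    """Fraction of raw capacity left for data when every block is kept `copies` times."""
    return 1 / copies


def erasure_code_efficiency(data_fragments: int, parity_fragments: int) -> float:
    """Fraction of raw capacity left for data in a data+parity fragment layout."""
    return data_fragments / (data_fragments + parity_fragments)


def effective_capacity_tb(raw_tb: float, protection_efficiency: float,
                          reduction_ratio: float) -> float:
    """Raw capacity, reduced by protection overhead, then scaled by the reduction ratio."""
    return raw_tb * protection_efficiency * reduction_ratio


raw_tb = 1000.0  # assumed raw capacity

# Two-way mirroring with no data footprint reduction.
mirrored = effective_capacity_tb(raw_tb, mirror_efficiency(2), 1.0)

# 8 data + 2 parity erasure coding with an assumed 2:1 reduction ratio.
coded = effective_capacity_tb(raw_tb, erasure_code_efficiency(8, 2), 2.0)

print(f"mirrored, no reduction:  {mirrored:.0f} TB effective")
print(f"8+2 erasure coded, 2:1:  {coded:.0f} TB effective")
```

Under these assumptions the same raw capacity holds roughly three times as much effective data in the second case, which is how density per cubic foot or meter can rise without larger devices.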
What does this all mean? There is no longer a one-to-one affinity between secondary applications and secondary storage. Keep the context of secondary storage in mind as a technology and tool as well as a business benefit.