Data storage is the containment of any type of information in a particular location. Though today it is typically used to describe storing applications, files and other computing resources, it has existed as long as humans have. Data has been commonly stored and managed by memorizing, carving, writing, recording sound and video, printing type, taping, programming, creating files and powering servers.
The world was estimated to create 44 zettabytes of data in 2020; that's 687 billion times the data contained in all the scrolls of the Great Library of Alexandria, the largest library of the ancient world. And that number grows every year. Storing, managing and securing all that data requires enormous computing power and physical storage devices such as hard drives, flash memory, solid state drives and data tapes, whether on laptops, mobile devices or on servers in a cloud or data center. It also makes issues such as data storage integrity, reliability and compatibility extremely important; nothing less than preserving the record of our civilization is at stake.
Historical data storage
Ancient data storage was both thorough and intricate. Ancient tribes memorized lengthy pieces of history and literature and handed them down through generations by regular recitation and practice. The Bible records data about the 12 tribes of Israel and head-counted certain tribe members. Thousands of years later, that data is preserved. Ancient people carved drawings, writing and numerical values on cave walls, stone tablets and pieces of clay, many of which still exist. The abacus and other calculation methods managed numerical data.
The Antikythera mechanism was an advanced time-tracking tool that used computing processes, dials and gears to track astronomical movement and calendar dates. Found in 1900 on a sunken ship near a Greek island, it is known as the first analog computer. It could produce data about the stars and the calendar years in advance, and its advanced design suggests it was not the first of its kind.
Medieval data storage is less notable (the years 500-1300 AD weren't called the Dark Ages for nothing), perhaps partly because ancient inventions sank into oblivion for many years and historical records from medieval times are fuzzier. (After the aforementioned Antikythera mechanism, similar machines don't appear to have been invented for a good 1,300 years.) However, the popularity of writing on parchment and the development of books marked an important step in storing data. During this period, as monks and scribes painstakingly created books filled with color and design, data storage became a work of art as well as a method of recording information.
In the 15th century, Gutenberg invented the printing press. Typesetting made information far more widely available, far more quickly. Though books remained the property of the extremely wealthy, or at least the well-to-do, for centuries more, the press put physical copies of data into many more people's hands. This not only accelerated the spread of learning but also gave people the opportunity to analyze governmental and philosophical ideas for themselves and challenge injustice.
During the industrial age, multiple inventors created machines that performed calculations and stored information; notably, Charles Babbage designed an early mechanical computer in the 19th century. The term business intelligence also came into use in the mid-19th century, describing the careful collection, storage and analysis of statistics. Computing machines became critically important in the world wars, in which they assisted in breaking codes, planning attacks and dropping bombs.
A side note regarding the most advanced kind of data storage: though the comparison may seem overly straightforward, the brain is far more advanced than any computer or network in its ability to process and use data (artificial intelligence, among the most advanced forms of technology, can still only hope to catch up with it). The human mind stores data through memorization (as mentioned earlier) and by naturally taking in information. The brain manages the inner workings of many different bodily systems through electrical signals, storing data through its own natural processing and analytics.
Pre-digital data, file and image storage
Before data storage providers went digital, a few providers specialized in safeguarding data and files on paper, on film, in images, on objects and in other formats. Most of these companies are still in business, because not everything that needs to be stored and protected lives in a computer system. Companies such as Iron Mountain (founded in 1951) and competitors Access Information Management, Hewlett Packard Enterprise, H3C and CoreSite Realty build and maintain highly secure storage facilities, both above and below ground, to safeguard valuable public and private information and artifacts. These storage providers still play an important role in real-world use cases.
For example, Iron Mountain protects a high percentage of Hollywood movie history–thousands of cans of physical film dating back to the late 19th century–in an underground vault in the West Los Angeles area. Iron Mountain and others also store a great deal of information and artifacts for the federal, state and local governments.
Computer data storage
In a modern computer, a central processing unit (CPU) is the control center, issuing the commands that the computer executes. It is connected to primary storage, or main memory. Random access memory (RAM), part of main memory, holds the data the CPU is actively working on, but its capacity is limited. Secondary storage holds data in the background, from which it can be fetched into primary storage, or RAM, for processing. Multiple types of hardware are available for storing and processing data. Hard disks store far more data than floppy disks and can read and write it much more quickly; floppy (soft) disks, though easier to transport and purchase, are far less durable and reliable.
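This division of labor can be sketched in a few lines of Python: a file on disk stands in for secondary storage, and an in-memory Python object stands in for data loaded into RAM for processing. The file name is hypothetical.

```python
import os
import tempfile

# Hypothetical file standing in for secondary storage (persists on disk).
path = os.path.join(tempfile.gettempdir(), "records.txt")

# Write data to secondary storage; it survives after the program exits.
with open(path, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

# Load the data into primary storage (RAM) so the CPU can work on it.
with open(path) as f:
    in_memory = f.read().splitlines()

# Processing happens against the in-memory copy, not the disk.
upper = [line.upper() for line in in_memory]
print(upper)  # ['ALPHA', 'BETA', 'GAMMA']
```

The point of the sketch is the round trip: data rests on disk, but all actual computation happens on the copy held in memory.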
Direct-attached storage refers to data storage attached directly to a computer or server rather than accessed over a network. This makes it readily available, which is beneficial if a network is down and a user needs the data. Internal solid state drives (SSDs) are one example; external drives are another: an external drive, whether SSD or hard disk drive, plugs into a computer and gives the user instant access to the data stored on it.
Software-defined storage (SDS) separates the software that manages storage from the underlying hardware, such as the servers in a data center. SDS can control multiple environments and allows data to be stored flexibly (on servers, hardware appliances, virtual machines, etc.). It's more abstract than traditional hardware-bound storage, but it also provides far greater flexibility and access to many more compute resources.
Data centers were initially developed in the mid-1900s (perhaps first modeled after ENIAC, one of the first computers), but their usage grew much more quickly in the late 1990s. As demand for computing skyrocketed, huge infrastructures were built to meet the need. Now data centers exist both physically and virtually. Google operated 11 physical data centers in the United States alone and 19 globally as of 2020. Data centers require enormous amounts of management, cooling and security monitoring; they must also be placed in locations with a low risk of natural disasters.
Modern data storage concerns
Though the flexibility and agility of data storage have improved through software-defined and hybrid cloud environments, this doesn't solve the problem of obsolete storage methods. Throughout history, storage methods have grown steadily less durable, even as they have become easier to use.
Storing data through technology is still relatively abstract compared to previous methods of storage, such as rock carving, which could be lost only if physically misplaced or worn down by hundreds of years of weather.
Digital technology, in contrast, becomes obsolete quickly (far faster than the paper that preceded it), and users risk losing their information if they can't find a new place to properly store and process it. Different formats and generations of technology quickly render old files obsolete, and occasionally unreadable, necessitating the migration of data from one generation of technology to the next. Video is one example: it's challenging to transfer between mediums, the technology that reads it (VHS and DVD players, for example) can fail, and the storage media themselves deteriorate over time.
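The migration problem can be illustrated with a toy sketch, assuming a legacy CSV record standing in for any aging format: the data is read with the old format's tooling and re-written as JSON so that newer software can still interpret it. The film titles here are illustrative only.

```python
import csv
import io
import json

# Hypothetical legacy data in an aging format (CSV stands in for any
# older generation of storage format).
legacy = "title,year\nMetropolis,1927\nCasablanca,1942\n"

# Read with the old format's tooling...
rows = list(csv.DictReader(io.StringIO(legacy)))

# ...and re-write in a newer, widely supported format.
migrated = json.dumps(rows, indent=2)
print(migrated)
```

Real migrations are rarely this clean (character encodings, lossy fields and proprietary formats all complicate matters), but the pattern of read-with-old, write-with-new is the same.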
Even with the significant expansion of the cloud, computing processes still have to run on servers, and if technology shifts further, users may struggle to save all of their important data. Error rates during storage and transmission are also a threat to data integrity: if enough bits flip from 0 to 1 or vice versa, a file may become unreadable. While quantum computing is an attempt to move beyond the limits of modern data storage and computing, at its most basic level data storage remains a digital process, defined by just two binary values, 1 and 0.
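One common defense against flipped bits is a checksum. A minimal sketch, assuming SHA-256 digests: store a digest alongside the data, then recompute it later; even a single flipped bit produces a completely different digest.

```python
import hashlib

# Store a digest of the data at write time.
data = bytearray(b"important archival record")
stored_digest = hashlib.sha256(data).hexdigest()

# Simulate a single bit flipping in storage or transmission.
data[0] ^= 0b00000001

# At read time, recompute the digest and compare.
intact = hashlib.sha256(data).hexdigest() == stored_digest
print(intact)  # False: even one flipped bit changes the digest
```

Checksums detect corruption; recovering from it requires redundancy, which is what schemes such as RAID provide.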
One of the most common types of enterprise data storage, RAID (redundant array of independent disks), limits the risk of disk failures by spreading data across multiple disks and duplicating it. Backup is an essential data protection strategy and can even help fight security threats such as ransomware. The more data people store, the more information they risk losing, and the more they need strategies to protect and preserve it.
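The parity idea behind RAID levels such as RAID 5 can be shown with a toy XOR example; this is a sketch of the principle only, not a real RAID implementation. The parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors.

```python
# Two data blocks that would live on separate disks.
block_a = bytes([0x12, 0x34])
block_b = bytes([0xAB, 0xCD])

# The parity block is the byte-wise XOR of the data blocks.
parity = bytes(a ^ b for a, b in zip(block_a, block_b))

# Suppose the disk holding block_a fails: rebuild it from parity + block_b,
# since a ^ b ^ b == a.
recovered = bytes(p ^ b for p, b in zip(parity, block_b))
print(recovered == block_a)  # True
```

Real RAID controllers stripe data and rotate parity across many disks, but the recovery math is exactly this XOR relationship.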
The importance of backing up data has increased as users rely more heavily on technology. Accessing data in the cloud (using Google Drive to create documents, for example) is one helpful method, but it's also important to save files on an external device such as a hard drive. The most important files should ideally be kept physically outside a computer network (in print form). You could also attempt carving them into a rock, depending on the relative importance of the file.
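A minimal backup-and-verify routine might look like the following sketch, with a temporary directory standing in for an external drive; the file names are hypothetical.

```python
import hashlib
import os
import shutil
import tempfile

def sha256_of(path):
    """Checksum a file so the backup can be verified byte-for-byte."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# Hypothetical source file and backup destination (the temp directory
# stands in for an external drive).
src = os.path.join(tempfile.gettempdir(), "thesis.txt")
dst = os.path.join(tempfile.gettempdir(), "thesis_backup.txt")

with open(src, "w") as f:
    f.write("the record of our civilization\n")

shutil.copy2(src, dst)  # copy the file, preserving metadata
verified = sha256_of(src) == sha256_of(dst)
print(verified)  # True
```

Verifying the copy matters as much as making it: a backup that was silently corrupted in transit offers no protection when the original is lost.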