Metadata describes how, when, and by whom a particular set of data was collected, and how the data is formatted. Metadata is essential for understanding information stored in data warehouses and has become increasingly important in XML-based Web applications.
What does metadata mean?
Metadata is data that provides information about other data. However, metadata does not provide the content of data, such as the image of a photograph or the text composed in a message. Instead, it provides information about the content, use, management, or other context of the data content.
There are many types of metadata. Some main categories include:
- Descriptive: Information that describes a data resource and that is used for discovery and identification. This may include title, author, date, keywords, abstract, and video runtime.
- Structural: Information indicating how compound objects are put together into containers or units. Some examples include how pages are ordered to form a chapter or how chapters or sections are structured into the format of a book or DVD.
- Preservation: Provides information that enhances the maintenance procedures for a digital object or file. Examples include records of actions taken on the file or the rights attached to it.
- Administrative: Information that helps manage data, such as rules, restrictions, and handling instructions.
- Reference: Information on the statistical data related to the file or object as well as the quality of that data.
- Provenance: Information about the history of data. This is mostly used for digital data, which are often amended or duplicated. It records the origin of the digital file and tracks alterations, duplication, users, methodologies, and other data about events over the data’s lifecycle.
- Use: Data that is recorded and sorted each time a user accesses or uses a piece of data. The purpose of this data is to help make predictions about a user’s future behavior. One common application of use metadata is predicting consumer behavior to decide which promotions to serve or which goods to stock in a store.
- Statistical or process: Describes the processes of collecting, processing, or producing statistical data.
- Legal: Provides information about the author, rights holder, usage and licensing, and other legal context of the data.
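To make the categories above concrete, the following is a minimal sketch of a metadata record for a single e-book, grouped by category. Every field name and value is a hypothetical placeholder chosen for illustration, not part of any standard. Note that the record contains no book text: it only describes the book.

```python
# Hypothetical metadata record for one e-book, grouped by category.
# All field names and values are illustrative assumptions.
book_metadata = {
    "descriptive": {"title": "Example Title", "author": "A. Writer", "keywords": ["sample"]},
    "structural": {"chapters": ["Introduction", "Methods", "Results"]},
    "administrative": {"access": "staff-only"},
    "legal": {"license": "CC BY 4.0", "rights_holder": "A. Writer"},
    "provenance": [
        {"event": "created", "by": "A. Writer"},
        {"event": "edited", "by": "B. Editor"},
    ],
}


def categories_present(record):
    """Return the metadata categories a record contains, in alphabetical order."""
    return sorted(record)


print(categories_present(book_metadata))
```

Grouping fields by category is only one possible layout. Real systems often flatten the fields or follow a published schema instead.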
What purpose does metadata serve?
Metadata enhances the management, use, and searchability of data. Some common applications include:
- Photographs: Metadata used for keywording, identification, searchability, and post-production organization may be written into a digital photo file. This includes photo ownership, contact information, copyright, timestamp, camera specs, exposure information, and tags that describe or categorize photo content.
- Video: Metadata that makes video content searchable may be recorded, as may tags useful for automatic number plate recognition, vehicle recognition, and facial recognition.
- Telecommunications: Metadata records the origin, destination, time, duration, and other aspects of electronic correspondence. This data is used for personal records, market analysis, law enforcement, and intelligence.
- Libraries, research, and databases: Metadata is crucial for cataloging the vast holdings of libraries in both digital and analog formats for the purposes of storage, search, retrieval, associated holdings, citations, and more. The same applies to the sciences, museums, litigation, legislation, healthcare, biomedical research, data warehousing, the internet, the broadcast industry, ecological and environmental study, digital music, cloud applications, and more.
- Geospatial: Allows location-based data in geographic information system (GIS) files, maps, and images to be searched and utilized in a variety of scenarios. Metadata may include developer identification, processing methods, available formats, and time of collection.
How does metadata work?
Creation and Recording
Metadata can be created manually or through automated information processing. An example of manual metadata creation is recording information about title, author, call number, date of publication, and copyright for a library book on an index card. Digital information systems now often automatically capture file metadata such as object creation date, creator identification, last update, file size, file extension, and much more.
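As a rough illustration of automatic capture, the sketch below reads metadata that the file system already tracks for a file, using Python's standard `os.stat`. The function name `capture_file_metadata` and the dictionary keys are assumptions made for this example.

```python
import os
import tempfile
from datetime import datetime, timezone


def capture_file_metadata(path):
    """Collect basic metadata that the file system already tracks for a file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "extension": os.path.splitext(path)[1],
        "size_bytes": info.st_size,
        "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }


# Usage: create a small temporary file and inspect its metadata.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello")
    sample_path = handle.name

metadata = capture_file_metadata(sample_path)
print(metadata)
os.remove(sample_path)
```

None of these values were typed in by a person. The operating system recorded them as a side effect of creating and writing the file.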
Further metadata can be generated through analysis of files and automatic input. For example, a digital image can be analyzed by AI, and the objects recognized could be recorded as metadata to tag the photo for later retrieval. This can also be done manually, such as human users tagging people in photos on social media.
Internal vs. External Metadata Storage
One method of metadata storage is to record metadata such as the time, author, or location directly in the file it describes (internally). Metadata can also be stored externally, in a separate file (known as a sidecar file) or in a field that is separate from the data it describes. External storage is typically used for data repositories, because searching and management are more efficient.
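The external approach can be sketched with a JSON sidecar file written next to the data file. The `.meta.json` suffix and the helper names `write_sidecar` and `read_sidecar` are hypothetical conventions for this example. Real tools define their own sidecar formats.

```python
import json
import os
import tempfile


def write_sidecar(data_path, metadata):
    """Store metadata in a separate JSON file next to the data file."""
    sidecar_path = data_path + ".meta.json"  # hypothetical naming convention
    with open(sidecar_path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2)
    return sidecar_path


def read_sidecar(data_path):
    """Load the metadata stored alongside a data file."""
    with open(data_path + ".meta.json", encoding="utf-8") as handle:
        return json.load(handle)


# Usage: the data file itself is never modified.
workdir = tempfile.mkdtemp()
photo_path = os.path.join(workdir, "photo.jpg")
with open(photo_path, "wb") as handle:
    handle.write(b"\x00" * 16)  # placeholder bytes, not a real image

write_sidecar(photo_path, {"author": "A. Example", "location": "Lisbon"})
loaded = read_sidecar(photo_path)
print(loaded)
```

Because the metadata lives in its own file, it can be searched or updated without opening or rewriting the data file.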
Human-Readable vs. Binary Metadata
Metadata can also be stored in a format that is readable by humans or in binary or other formats. Human-readable formats can be easily understood and edited by people with no technical expertise, but they require software to convert them to and from a computer-readable form. Because binary is more directly readable by computers, it is generally more efficient in terms of storage capacity, communication speed, and processing speed.
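The trade-off can be seen by encoding the same small record both ways. This sketch uses JSON as the human-readable format and a fixed binary layout built with Python's `struct` module. The record fields and the `"<HHI"` layout are assumptions chosen for illustration.

```python
import json
import struct

# Hypothetical image metadata record.
record = {"width": 1920, "height": 1080, "timestamp": 1700000000}

# Human-readable: JSON text that anyone can open and edit in a text editor.
as_text = json.dumps(record).encode("utf-8")

# Binary: two unsigned 16-bit integers and one unsigned 32-bit integer,
# little-endian with no padding. The reader must know this layout in advance.
LAYOUT = "<HHI"
as_binary = struct.pack(LAYOUT, record["width"], record["height"], record["timestamp"])

# Decoding the binary form recovers the original values.
width, height, timestamp = struct.unpack(LAYOUT, as_binary)

print(len(as_text), "bytes as JSON;", len(as_binary), "bytes as binary")
```

The binary form is smaller, but it is unintelligible without the layout definition, whereas the JSON form carries its own field names.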
Databases, Data Warehouses, and Metadata Lakes
Metadata can be stored in a relational database system that organizes the metadata into tables. This structured data can be queried in many ways, which makes it useful for creating reports, analyzing small datasets, automating business processes, and auditing data entry.
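A minimal sketch of the relational approach, using Python's built-in `sqlite3` module with an in-memory database. The table name `file_metadata`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

# One row per file; each metadata field is a column.
connection.execute(
    "CREATE TABLE file_metadata (name TEXT PRIMARY KEY, author TEXT, size_bytes INTEGER)"
)
connection.executemany(
    "INSERT INTO file_metadata VALUES (?, ?, ?)",
    [
        ("report.pdf", "Kim", 52000),
        ("photo.jpg", "Lee", 2400000),
        ("notes.txt", "Kim", 900),
    ],
)

# A simple report: number of files and total size per author.
rows = connection.execute(
    "SELECT author, COUNT(*), SUM(size_bytes) "
    "FROM file_metadata GROUP BY author ORDER BY author"
).fetchall()
print(rows)
connection.close()
```

Because the fields are parsed into columns up front, a report like this needs only a short query rather than a scan of every file.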
Data warehouses are the next step up from databases. They are large storage locations for data collected from a wide variety of sources. The structure of the data storage determines which types of analysis can be performed on it. They have been used for decades by mid- and large-size companies to share data within the silo of a team or department.
A growing trend is to store metadata in a data lake without parsing it into fields. This allows the storage of raw data that database software may not understand. Keeping the data unstructured preserves metadata that cannot be used today until other software is able to use it, but it is slower to access. Data lakes are mostly used by large corporations or businesses specializing in big data analysis.
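The idea of keeping raw records and parsing them only when they are read can be sketched as follows. A plain Python list stands in for the data lake, and the sample records and the function name `read_parsed` are assumptions for this example. Real data lakes use distributed storage and dedicated query engines.

```python
import json

# Raw records stored exactly as received, with no fields extracted.
raw_lake = [
    '{"file": "a.jpg", "camera": "X100"}',
    "file=b.wav;codec=pcm",  # a format the current reader does not understand
    '{"file": "c.pdf", "pages": 12}',
]


def read_parsed(records):
    """Parse what the current reader understands and keep the rest untouched."""
    parsed, unparsed = [], []
    for raw in records:
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError:
            unparsed.append(raw)  # preserved for a future reader
    return parsed, unparsed


parsed, unparsed = read_parsed(raw_lake)
print(len(parsed), "parsed;", len(unparsed), "kept raw")
```

Every read has to parse the raw records again, which is one reason access is slower than with data that was split into fields when it was stored.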
Metadata is especially useful for object storage, an affordable and scalable form of storage used by many popular services.
As metadata is used in ever larger volumes and for more purposes, it has been the subject of global standardization and harmonization work. The European Statistics Code of Practice and ISO 17369:2013 (Statistical data and metadata exchange, SDMX) have both helped make metadata usable by the broader statistical analysis community.
What are the key features and benefits of metadata?
Metadata Enhances Data Value and Application
Metadata enhances and enables data use and reuse. It enriches and adds value to data by making it easy to search, manage, analyze, retrieve, and otherwise use in various capacities. Analysis can improve a business’s bottom line, increase the depth of business intelligence, or make data more useful in countless ways.
Metadata Creates Business Opportunities
Metadata’s benefits are especially useful in the domains of business intelligence and financial intelligence, service-oriented architecture data services, cloud computing, enterprise research, and master data management.
The majority of applications are in business, where metadata is used to improve search engine visibility, searchability, and discoverability. Businesses often use it to analyze consumer behavior, devise advertising and marketing strategies, improve audience engagement, and inform the decision of which products and services to offer.