What is metadata?
Metadata is used to summarize basic information about existing data. In the technology space, “meta” means an underlying description. At a basic level, metadata is data about data. Metadata can be generated automatically by the application being used or created manually to optimize searchability and accuracy.
The purpose of metadata
The purpose of metadata is to help applications, websites, and databases find and work with relevant data. Some examples of basic metadata include:
- Publication date
- Image size, file type, dimensions
- Web page titles and descriptions
The most common everyday use of metadata is finding resources on the internet. Search engines like Google, Bing and DuckDuckGo match user queries against the metadata provided by websites. This metadata can be pulled from meta descriptions, title tags, image alt text, header tags and more.
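As a rough illustration of the page metadata described above, the sketch below pulls the title, named meta tags and image alt text out of an HTML string using Python's standard html.parser module. The sample page and the names PageMetadataParser and extract_page_metadata are made up for this example; real search engines use far more elaborate pipelines.

```python
from html.parser import HTMLParser


class PageMetadataParser(HTMLParser):
    """Collects the page title, <meta name=...> values and image alt text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.image_alts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name"):
            # Store e.g. {"description": "..."} keyed by the lowercased name.
            self.meta[attributes["name"].lower()] = attributes.get("content") or ""
        elif tag == "img" and attributes.get("alt"):
            self.image_alts.append(attributes["alt"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_page_metadata(html):
    """Return the metadata a crawler could read from one HTML document."""
    parser = PageMetadataParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta": parser.meta,
        "image_alts": parser.image_alts,
    }


# Hypothetical sample page, used only for illustration.
SAMPLE_HTML = """
<html><head>
<title>What is metadata?</title>
<meta name="description" content="Metadata is data about data.">
<meta name="keywords" content="metadata, search">
</head><body>
<h1>What is metadata?</h1>
<img src="camera.jpg" alt="A digital camera">
</body></html>
"""

page = extract_page_metadata(SAMPLE_HTML)
print(page["title"])
print(page["meta"])
print(page["image_alts"])
```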
Metadata is also an important factor when working with databases. Relational databases, the most common type of database, store structured data in a large number of tables and columns. When managing databases, administrators need the ability to query specific data from these tables and columns. Metadata in these databases refers to the structured schema that designates what information is stored where to make it searchable.
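To make the idea of schema metadata concrete, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. It asks the database itself which tables and columns exist. The articles and authors tables are hypothetical, and other database systems expose their schema through different catalogs.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]}."""
    schema = {}
    # sqlite_master is SQLite's built-in catalog of tables, indexes and views.
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(column[1], column[2]) for column in columns]
    return schema


# Hypothetical in-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, published_on TEXT)"
)
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")

schema = describe_schema(conn)
for table, columns in schema.items():
    print(table, columns)
conn.close()
```

The query never touches the stored rows. It reads only the description of where information lives, which is the metadata an administrator relies on to write queries.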
Digital asset management (DAM) systems also heavily rely on metadata. Whether it’s metadata automatically entered into files from digital cameras, or manually created when used in creative programs like Photoshop or Lightroom, metadata is key to storing and finding assets to be repurposed later.
Types of metadata
There are a variety of types of metadata that serve different purposes:
- Descriptive metadata: This is the most commonly used type of metadata. It’s used to identify basic and specific types of information, such as titles, dates and keywords.
- Structural metadata: This describes how specific objects or resources are organized and how their parts relate to one another, making it useful for DAM systems.
- Preservation metadata: Preservation metadata offers information to maintain the integrity of objects or files through their lifecycles.
- Provenance metadata: This provides information on the history of objects or files, namely when they’ve been altered or duplicated.
- Use metadata: Use metadata shows each time a user has accessed or manipulated an object or file.
- Administrative metadata: Files often carry rules or restrictions on how they can be used and by whom. Administrative metadata is created by administrators to designate these limitations.
Right-clicking any saved file in Windows Explorer, for example a text (.txt) file on your hard drive, lets you select “Properties” and see additional information about that specific file. In this case, the information is about the file itself and includes the file name, what program it can be opened with, when it was created, last modified and last accessed, the file size, the full pathname of the directory it is stored in, who created it, who the system owner is and more.
This additional information you can obtain about the file is the metadata. The metadata you can see in the Windows Explorer properties dialog is specifically called file system metadata. Metadata is associated with almost every type of electronic file available today. Even your email headers and attachments contain metadata. Most metadata is hidden, so you have to know how to access it to change or limit the information provided.
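The same file system metadata can be read programmatically. The sketch below uses Python's standard os.stat call to collect a few of the fields mentioned above. The helper name file_metadata and the temporary notes.txt file are invented for the example, and creation time is left out because how it is reported differs between operating systems.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Collect a few file system metadata fields for one file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "directory": os.path.dirname(os.path.abspath(path)),
        "size_bytes": info.st_size,
        # Timestamps are stored as seconds since the epoch; convert to UTC.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


# Create a throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "notes.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("hello")
    details = file_metadata(sample_path)

for field, value in details.items():
    print(f"{field}: {value}")
```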
Try right-clicking an image on your hard drive. Photographers who capture the perfect shot but can’t remember the camera settings can view the metadata attached to the picture to find out. JPEG images in particular carry this information, although metadata is available in a wide variety of image file formats. The metadata stored with an image is in-depth, ranging from the author to the white balance to the camera lens manufacturer, and it may include information you don’t want someone viewing the image to know.
In Apple’s macOS, Cmd+I or Get Info will show similar information on Mac and MacBook computers.
Microsoft Office and metadata
Microsoft Office files, like other types of digital data, also carry document metadata. Some of the types of metadata that may be stored along with your saved Office documents can include your name, initials, company name, computer name, the disk or network server the file was stored in, file properties, revisions, hidden text, deleted comments, and so much more. Microsoft Office documents are frequently passed among co-workers, clients and contractors, so when the documents are shared, quite often, large amounts of metadata are too.
This problem is often referred to as “metadata risk,” wherein private information is disclosed, usually unknowingly, because this metadata is hidden from plain view and users simply are not aware of it. When a document is sent outside your office to a client or contractor, the associated metadata may not stay hidden if the receiving party knows where and how to look for it. One critical area of interest is the capability to track changes made to the document. In Microsoft Word, metadata stores information about the changed text, the name of the author making changes, and the date and time those changes were made. This information may be something those outside your company shouldn’t have access to.
When using Microsoft Office applications, or any application, it’s important to learn which of the program’s tools let you reveal normally hidden markup and remove this metadata, so that this associated information is not passed along when you share the file. The removal of this type of data is often called data scrubbing or data cleansing.
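One way to check what a document carries before sharing it is to read its properties directly. Modern Office files such as .docx are ZIP archives, and the author, last editor and revision count normally sit in a part named docProps/core.xml. The sketch below reads a few of those fields with Python's standard library. The helper read_core_properties and the sample values are invented for illustration, and the in-memory archive is a stand-in that holds only the properties part, not a complete Word document. Tracked changes and comments live in other parts of the archive and are not covered here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of Office Open XML files.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Label used in the result -> element name in core.xml.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}


def read_core_properties(docx_file):
    """Return the core properties found in a .docx path or file object."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NAMESPACES)
        if element is not None and element.text:
            found[label] = element.text
    return found


# Hypothetical properties part, used only for illustration.
SAMPLE_CORE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:creator>Example Author</dc:creator>
  <cp:lastModifiedBy>Example Reviewer</cp:lastModifiedBy>
  <cp:revision>7</cp:revision>
</cp:coreProperties>"""

# Build a stand-in archive in memory instead of reading a real document.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", SAMPLE_CORE_XML)
buffer.seek(0)

props = read_core_properties(buffer)
print(props)
```

For a real file, pass its path to read_core_properties instead of the in-memory buffer.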
Did you know?
In a landmark 2004 case, the U.S. District Court for the District of Kansas ruled that electronic documents must be produced in native format and with their metadata intact (Williams v. Sprint). Metadata includes message attributes such as file owner, creation date, routing details, the sender, receivers, and subject line. [Source: The New Federal Rules of Civil Procedure: IT Obligations For Email]
This article has been updated by Kyle Guercio