
Metadata: Data About Data

 

"Data about data."

That simple statement describes the essence of metadata. Breaking it down further, data about data really refers to the information used to describe specific content (also data).

For example, if you right-click any saved file in Windows Explorer, such as a text (.txt) file on your hard drive, you can select "Properties" and see additional information about that specific file. This information is about the file itself and includes the file name, the program it can be opened with, when it was created, last modified and last accessed, the file size, the full path name of the directory it is stored in, who created it, who the system owner is, and so on.

This additional information you can obtain about the file is the metadata. The metadata you can see in the Windows Explorer properties dialog is specifically called file system metadata. Metadata is associated with almost every type of electronic file available today. Even your e-mail headers and attachments contain metadata. Most metadata is hidden, and you have to know how to access it to change or limit the information provided.
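The same file system metadata can be read programmatically. The following is a minimal sketch in Python using the standard library's os.stat; the function name file_metadata and the choice of fields are illustrative assumptions, not part of any tool mentioned in this article.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Return a small dict of file system metadata for `path` (hypothetical helper)."""
    info = os.stat(path)
    # Creation time is left out on purpose: os.stat reports it
    # differently across operating systems.
    return {
        "name": os.path.basename(path),
        "directory": os.path.dirname(os.path.abspath(path)),
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


if __name__ == "__main__":
    # Create a throwaway text file so the example is self-contained.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("hello")
    try:
        for key, value in file_metadata(handle.name).items():
            print(f"{key}: {value}")
    finally:
        os.remove(handle.name)
```

None of these values come from the file's contents; the operating system tracks them alongside the file.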

Try right-clicking an image on your hard drive. Photographers who capture the perfect shot and can't remember the camera settings can view the metadata attached to the picture to find out. This is especially true for JPEG images, although metadata is available in a wide variety of image file formats. The metadata stored with an image is in-depth, ranging from the author to the white balance to the camera lens manufacturer, and it may include information you don't want someone viewing the image to know.


[Image: Here's everything you could possibly want to know about this JPEG image, and then some.]
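Before sharing a photo, it can help to check whether it carries camera metadata at all. The sketch below, in Python, walks the marker segments at the start of a JPEG stream and reports the size of an embedded Exif block if one is present. It assumes the common layout in which Exif data sits in an APP1 segment beginning with the bytes "Exif" followed by two zero bytes; the function name exif_segment_size is hypothetical, and the code does not decode individual fields such as white balance.

```python
import struct

EXIF_HEADER = b"Exif\x00\x00"


def exif_segment_size(jpeg_bytes):
    """Return the size in bytes of the Exif data, or None if none is found."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # every JPEG stream starts with SOI
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = jpeg_bytes[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker in (0xD9, 0xDA):  # end of image or start of scan data
            return None
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(EXIF_HEADER):
            return len(payload) - len(EXIF_HEADER)
        pos += 2 + length
    return None
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example exif_segment_size(open(path, "rb").read()), where path is a JPEG of your choosing.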

Microsoft Office and Metadata

Microsoft Office files, like other types of digital data, also carry metadata, called document metadata. The metadata stored along with your saved Office documents can include your name, initials, company name, computer name, the disk or network server the file was stored on, file properties, revisions, hidden text, deleted comments, and much more. Microsoft Office documents are frequently passed among co-workers, clients and contractors, so when the documents are shared, quite often, so are large amounts of metadata.

The problem is often referred to as the metadata risk, and that risk is the disclosure of private information, usually unknowingly, because this metadata is hidden from plain view and users simply are not aware of it. When a document is sent outside your office to a client or contractor, the associated metadata may not stay hidden if the receiving party knows where and how to look for it. One critical area of interest is the capability to track changes made to the document. In Microsoft Word, metadata stores information about changed text, the name of the author making changes, and the date and time those changes were made. This information may be something those outside your company shouldn't have access to.
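To see part of what a document reveals, you can read its properties directly. The sketch below, in Python, assumes the file uses the Office Open XML format (.docx, .xlsx, .pptx) introduced with Office 2007, which is a ZIP archive holding a docProps/core.xml part; it does not apply to the older binary .doc format. The function name core_properties and the FIELDS mapping are illustrative assumptions, and tracked changes and comments are stored elsewhere in the archive and are not covered here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Output label -> element in docProps/core.xml (a selection, not exhaustive).
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def core_properties(document):
    """Read core properties from an Office Open XML file (path or file object)."""
    with zipfile.ZipFile(document) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NAMESPACES)
        if element is not None and element.text:
            found[label] = element.text
    return found


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory; a real .docx has many more parts.
    sample = (
        '<cp:coreProperties xmlns:cp="%(cp)s" xmlns:dc="%(dc)s">'
        "<dc:creator>Example Author</dc:creator>"
        "<cp:lastModifiedBy>Example Editor</cp:lastModifiedBy>"
        "</cp:coreProperties>" % NAMESPACES
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", sample)
    buffer.seek(0)
    print(core_properties(buffer))
```

Running core_properties on a document before sending it outside your office gives a quick view of the names and dates that travel with it.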

When using Microsoft Office or any other application, it's important to learn which of the program's tools let you reveal normally hidden mark-up and remove this metadata, so you can ensure that this associated information is not shared when you share the actual file. The removal of this type of data is often called data scrubbing or data cleansing.

In the latest version, Microsoft Office 2007, Microsoft included a Document Inspector feature in Word 2007, Excel 2007, and PowerPoint 2007 that can help you find and remove hidden data and personal information in your Office documents.


[Image: Microsoft Document Inspector finds two areas of metadata concern and provides an easy way to delete the information.]

If you use Office 2003/XP, Microsoft offers an add-in you can download that enables users to permanently remove hidden data. With this Remove Hidden Data add-in, you can run the tool on individual files from within your Office application, or run it on multiple files from the command line. Additionally, a Google search will produce a wide array of results for third-party tools and software that you can use to wipe out metadata from various files created using different programs.

The Importance of Metadata

With so many privacy concerns surrounding metadata, you may wonder why it exists. Metadata is actually useful for searching and controlling content. For example, consider metadata on Web pages. Search engines often place a higher priority on metadata tags such as page title, keywords and description than they do on the actual contents of the page. To those searching the Web, this metadata is useful for finding relevant pages. Metadata is also important for faster and more accurate database search and retrieval and for information stored in data warehouses.
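As an illustration of the Web page metadata mentioned above, the following Python sketch pulls the page title and the named meta tags (such as keywords and description) out of an HTML document using the standard library's html.parser. The class name MetaTagParser and the function page_metadata are hypothetical, and the sketch says nothing about how any particular search engine weighs these tags.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # Tag names are case-insensitive, so normalize to lowercase.
            self.meta[name.lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_metadata(html_text):
    """Return the page title plus every named meta tag as one dict."""
    parser = MetaTagParser()
    parser.feed(html_text)
    parser.close()
    return {"title": parser.title.strip(), **parser.meta}
```

Given a page whose head contains a title and a description meta tag, page_metadata returns both without looking at the body text, which is the sense in which this information describes the page rather than being part of its content.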

So while it does serve a very important role in computing, you do need to remember that metadata can disclose information about you and your business — information that you may not even realize exists.

Did You Know...

In a landmark 2004 case, the U.S. District Court ruled that electronic documents must be produced "in native format" and "with their metadata intact" (Williams v. Sprint). Metadata includes message attributes such as file owner, creation date, routing details, the sender, receivers, and subject line. [Source: The New Federal Rules of Civil Procedure: IT Obligations For Email]
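Several of the message attributes listed above live in an e-mail's headers. As a rough sketch, Python's standard email package can pull them out of a raw message; the function name message_metadata and the example addresses used to exercise it are made up for illustration.

```python
from email import message_from_string


def message_metadata(raw_message):
    """Extract a few header fields from a raw RFC 5322 style message."""
    message = message_from_string(raw_message)
    return {
        "sender": message["From"],
        "receivers": message.get_all("To", []),
        "subject": message["Subject"],
        "date": message["Date"],
        # Each mail server that relays the message adds a Received header.
        "routing": message.get_all("Received", []),
    }
```

The body of the message is never consulted; everything returned comes from the header block that precedes it.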


Key Terms to Understanding Metadata

metadata
Data about data. Metadata describes how and when and by whom a particular set of data was collected, and how the data is formatted.

meta
In computer science, a common prefix that means "about". So, for example, metadata is data that describes other data (data about data). A metalanguage is a language used to describe other languages. A metafile is a file that contains other files.

data warehouse
Sometimes abbreviated DW, a collection of data designed to support management decision making. Data warehouses contain a wide variety of data that present a coherent picture of business conditions at a single point in time.




Based in Nova Scotia, Vangie Beal has been writing about technology for more than a decade. She is a frequent contributor to EcommerceGuide and managing editor at Webopedia. You can tweet her online @AuroraGG.




