
Metadata: Data About Data


 

"Data about data."

That simple statement describes the essence of metadata. Breaking it down further, data about data really refers to the information used to describe specific content (also data).

For example, if you right-click any saved file in Windows Explorer, such as a text (.txt) file on your hard drive, you can select "Properties" and see additional information about that file. This information is about the file itself and includes the file name, the program that opens it, when it was created, last modified and last accessed, the file size, the full path of the directory it is stored in, who created it, who the system owner is, and so on.

This additional information you can obtain about the file is the metadata. The metadata you can see in the Windows Explorer properties dialog is specifically called file system metadata. Metadata is associated with almost every type of electronic file available today. Even your e-mail headers and attachments contain metadata. Most metadata is hidden, and you have to know how to access it to change or limit the information it provides.
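The same file system metadata can also be read from a program. The following is a minimal Python sketch using only the standard library; the function name `file_metadata` and the chosen fields are illustrative assumptions, not part of any tool described here. Creation time and owner are left out because how they are reported differs between operating systems.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Return a small dict of file system metadata for `path` (illustrative helper)."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "directory": os.path.dirname(os.path.abspath(path)),
        "size_bytes": info.st_size,
        # Timestamps are seconds since the epoch; convert them to UTC datetimes.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


if __name__ == "__main__":
    # Create a throwaway text file so the example is self-contained.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("hello metadata")
        sample_path = handle.name
    try:
        for key, value in file_metadata(sample_path).items():
            print(f"{key}: {value}")
    finally:
        os.remove(sample_path)
```

None of these values come from the contents of the file; they are all recorded by the file system alongside it.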

Try right-clicking an image on your hard drive. Photographers who capture the perfect shot but can't remember the camera settings can view the metadata attached to the picture to find out. This works especially well for JPEG images, although metadata is available in a wide variety of image file formats. The metadata stored with an image is in-depth, ranging from the author to the white balance to the camera lens manufacturer, and may include information you don't want someone viewing the image to know.


Here's everything you could possibly want to know about this JPEG image — and then some.

Microsoft Office and Metadata

Microsoft Office files, like other types of digital data, also carry metadata, called document metadata. The metadata stored with your saved Office documents can include your name, initials, company name, computer name, the disk or network server the file was stored on, file properties, revisions, hidden text, deleted comments, and much more. Microsoft Office documents are frequently passed among co-workers, clients and contractors, so when the documents are shared, quite often so are large amounts of metadata.
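To see some of this document metadata for yourself, note that files saved in the Office 2007 XML formats (such as .docx) are ZIP packages, and the author-related fields live in a part named docProps/core.xml. The sketch below is a rough illustration in Python, assuming that package layout; the helper names `read_core_properties` and `build_sample_package` are hypothetical, and the sample it builds is a stand-in that mimics only the metadata part, not a real Word document. Older binary .doc files store their metadata differently and are not covered.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML package.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
CORE_PART = "docProps/core.xml"


def read_core_properties(docx_file):
    """Return selected core properties from a .docx-style package (illustrative)."""
    with zipfile.ZipFile(docx_file) as package:
        if CORE_PART not in package.namelist():
            return {}
        root = ET.fromstring(package.read(CORE_PART))
    fields = {
        "author": "dc:creator",
        "last_modified_by": "cp:lastModifiedBy",
        "revision": "cp:revision",
        "created": "dcterms:created",
    }
    found = {}
    for label, tag in fields.items():
        element = root.find(tag, NAMESPACES)
        if element is not None and element.text:
            found[label] = element.text
    return found


def build_sample_package():
    """Build an in-memory ZIP that mimics only the metadata part (not a real document)."""
    core_xml = (
        "<cp:coreProperties"
        f' xmlns:cp="{NAMESPACES["cp"]}"'
        f' xmlns:dc="{NAMESPACES["dc"]}"'
        f' xmlns:dcterms="{NAMESPACES["dcterms"]}">'
        "<dc:creator>Jane Example</dc:creator>"
        "<cp:lastModifiedBy>John Example</cp:lastModifiedBy>"
        "<cp:revision>7</cp:revision>"
        "<dcterms:created>2007-01-15T09:30:00Z</dcterms:created>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(CORE_PART, core_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for key, value in read_core_properties(build_sample_package()).items():
        print(f"{key}: {value}")
```

To try it on a real file, pass the path of a .docx document to `read_core_properties` instead of the sample package.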

The problem is often referred to as the metadata risk, and that risk is the disclosure of private information, usually unknowingly, because this metadata is hidden from plain view and users simply are not aware of it. When a document is sent outside your office to a client or contractor, the associated metadata may not stay hidden if the receiving party knows where and how to look for it. One critical area of interest is the capability to track changes made to the document. In Microsoft Word, metadata stores information about changed text, the name of the author making changes, and the date and time those changes were made. This information may be something those outside your company shouldn't have access to.

When using Microsoft Office or any other application, it's important to learn which of the program's tools let you remove this metadata and reveal normally hidden markup, so you can ensure this associated information is not shared along with the file itself. Removing this type of data is often called data scrubbing or data cleansing.
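As a rough idea of what data scrubbing involves, the Python sketch below copies a .docx-style ZIP package and blanks the author and last-modified-by fields in its docProps/core.xml part. It is a simplified illustration under stated assumptions, not a substitute for a real scrubbing tool: it assumes the part uses the conventional `dc:` and `cp:` prefixes, it leaves tracked changes, comments and hidden text untouched, and the names `scrub_core_properties` and `build_sample_package` are hypothetical.

```python
import io
import re
import zipfile

CORE_PART = "docProps/core.xml"
# Elements in the core-properties part that commonly identify a person.
PERSONAL_TAGS = ("dc:creator", "cp:lastModifiedBy")


def scrub_core_properties(source, destination):
    """Copy the package at `source` to `destination`, blanking personal fields."""
    with zipfile.ZipFile(source) as src, zipfile.ZipFile(
        destination, "w", zipfile.ZIP_DEFLATED
    ) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PART:
                text = data.decode("utf-8")
                for tag in PERSONAL_TAGS:
                    # Keep the opening and closing tags, drop the text between them.
                    pattern = rf"(<{tag}(?:\s[^>]*)?>).*?(</{tag}>)"
                    text = re.sub(pattern, r"\1\2", text, flags=re.DOTALL)
                data = text.encode("utf-8")
            # Every other part of the package is copied unchanged.
            dst.writestr(item, data)


def build_sample_package():
    """Build an in-memory ZIP that mimics a package layout (not a real document)."""
    core_xml = (
        "<cp:coreProperties"
        ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
        ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:creator>Jane Example</dc:creator>"
        "<cp:lastModifiedBy>John Example</cp:lastModifiedBy>"
        "<cp:revision>7</cp:revision>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(CORE_PART, core_xml)
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    cleaned = io.BytesIO()
    scrub_core_properties(build_sample_package(), cleaned)
    cleaned.seek(0)
    with zipfile.ZipFile(cleaned) as package:
        print(package.read(CORE_PART).decode("utf-8"))
```

Writing to a separate destination rather than editing in place keeps the original file intact, which is a sensible default when experimenting with any scrubbing approach.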

Beginning with Microsoft Office 2007, Microsoft included a Document Inspector feature in Word, Excel, and PowerPoint that can help you find and remove hidden data and personal information in your Office documents.


Microsoft Document Inspector finds two areas of metadata concern and provides an easy way to delete the information.

If you use Office 2003/XP, Microsoft offers an add-in you can download that enables users to permanently remove hidden data. With this Remove Hidden Data add-in, you can run the tool on individual files from within your Office application, or run it on multiple files from the command line. Additionally, a Google search will produce a wide array of results for third-party tools and software that you can use to wipe out metadata from various files created using different programs.

The Importance of Metadata

With so many privacy concerns surrounding metadata, you may wonder why it exists. Metadata is actually useful for searching and controlling content. For example, consider metadata on Web pages. Search engines often place a higher priority on metadata tags such as page title, keywords and description than they do on the actual contents of the page. To those searching the Web, this metadata is useful for finding relevant pages. Metadata is also important for faster and more accurate database search and retrieval and for information stored in data warehouses.
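To make the Web page example concrete, here is a small Python sketch that pulls the title, keywords and description out of a page using the standard library's HTML parser. The class name `PageMetadataParser` and the sample page are made up for illustration, and the sketch says nothing about how any particular search engine actually weighs these tags.

```python
from html.parser import HTMLParser


class PageMetadataParser(HTMLParser):
    """Collect the <title> text and named <meta> tags from a page (illustrative)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = attributes.get("name")
            if name:
                self.meta[name.lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# A made-up page used only to exercise the parser.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Metadata Basics</title>
    <meta name="keywords" content="metadata, data about data">
    <meta name="description" content="A short introduction to metadata.">
  </head>
  <body><p>Page content goes here.</p></body>
</html>
"""

if __name__ == "__main__":
    parser = PageMetadataParser()
    parser.feed(SAMPLE_PAGE)
    print("title:", parser.title)
    for name, content in parser.meta.items():
        print(f"{name}: {content}")
```

Notice that none of the extracted values appear in the visible body of the page; they describe the page rather than form part of its content.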

So while it does serve a very important role in computing, you do need to remember that metadata can disclose information about you and your business — information that you may not even realize exists.

Did You Know...

In a landmark 2004 case, the U.S. District Court ruled that electronic documents must be produced "in native format" and "with their metadata intact" (Williams v. Sprint). Metadata includes message attributes such as file owner, creation date, routing details, the sender, receivers, and subject line. [Source: The New Federal Rules of Civil Procedure: IT Obligations For Email]
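Message attributes like these are stored in an e-mail's headers, and they can be read with Python's standard `email` package. The sketch below is only an illustration: the message, addresses and host names are invented, and the helper name `message_attributes` is hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses

# An invented message; each mail server that handles it adds a "Received" header.
RAW_MESSAGE = """\
Received: from mail.example.org by mx.example.com; Mon, 01 Jan 2024 10:00:02 +0000
Received: from laptop.example.org by mail.example.org; Mon, 01 Jan 2024 10:00:00 +0000
From: Alice Example <alice@example.org>
To: Bob Example <bob@example.com>, carol@example.com
Subject: Quarterly report
Date: Mon, 01 Jan 2024 10:00:00 +0000

Please see the attached report.
"""


def message_attributes(raw_text):
    """Return the sender, receivers, subject, date and routing headers (illustrative)."""
    message = message_from_string(raw_text)
    recipients = getaddresses(message.get_all("To", []))
    return {
        "sender": message["From"],
        "receivers": [address for _, address in recipients],
        "subject": message["Subject"],
        "date": message["Date"],
        "routing": message.get_all("Received", []),
    }


if __name__ == "__main__":
    for key, value in message_attributes(RAW_MESSAGE).items():
        print(f"{key}: {value}")
```

The body of the message is a single line, yet the headers reveal who sent it, who received it, when, and which servers it passed through.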


Key Terms to Understanding Metadata

metadata
Data about data. Metadata describes how and when and by whom a particular set of data was collected, and how the data is formatted.

meta
In computer science, a common prefix that means "about". So, for example, metadata is data that describes other data (data about data). A metalanguage is a language used to describe other languages. A metafile is a file that contains other files.

data warehouse
Sometimes abbreviated DW, a collection of data designed to support management decision making. Data warehouses contain a wide variety of data that present a coherent picture of business conditions at a single point in time.






