"Data about data." That simple statement describes the essence of metadata. More precisely,
metadata is the information used to
describe specific content, which is itself data.
For example, if you right-click on any
saved file, such as a text (.txt) file on your hard drive, in
Windows Explorer, you can select "Properties" and see additional
information about that specific file. This information
is about the file itself and includes such things as the file name,
what program it can be opened with, when it was created, last
modified and last accessed, the file size, the full path name of the
directory it is stored in, who created it, who the system owner is,
and so on.
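Much of the same information can also be read from a program. The following is a minimal sketch, assuming Python and its standard `os.stat` call; the function name `file_metadata` and the dictionary keys are illustrative choices, and owner or creator lookups are left out because they differ between operating systems.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Return a dictionary of basic file system metadata for `path`."""
    info = os.stat(path)
    full_path = os.path.abspath(path)

    def as_utc(timestamp):
        # Convert a POSIX timestamp to an ISO 8601 string in UTC.
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    return {
        "name": os.path.basename(full_path),
        "directory": os.path.dirname(full_path),
        "size_bytes": info.st_size,
        "modified": as_utc(info.st_mtime),
        "accessed": as_utc(info.st_atime),
        # st_ctime means "created" on Windows but "metadata last changed" on Unix.
        "created_or_changed": as_utc(info.st_ctime),
    }


# Demo on a throwaway file so the example does not depend on any real path.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "notes.txt")
    with open(sample, "w") as handle:
        handle.write("hello")
    for key, value in file_metadata(sample).items():
        print(f"{key}: {value}")
```

None of these values come from the text inside the file; they are all kept by the file system alongside it.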
This additional information you can
obtain about the file is the
metadata. The metadata you can see when
using Windows Explorer properties is specifically called file system
metadata. Metadata is associated with almost every type of
electronic file available today. Even your e-mail headers and
attachments contain metadata. Most metadata is hidden, and you have to
know how to access it to change or limit the information provided.
Try right-clicking an image on your hard drive. Photographers who capture the perfect shot
but can't remember the camera settings can view the
metadata attached to the picture to find out. This is
especially true for JPEG images, although metadata is available in a wide variety of image file formats.
The metadata stored with an image is in-depth, ranging from the author to the white balance to the camera
lens manufacturer,
and it may include information you don't want someone viewing the image to know.
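To give a rough idea of where this image metadata lives, the sketch below walks the segment headers at the start of a JPEG byte stream and reports the segments that usually carry descriptive data. It is an illustration under stated assumptions: it only lists segments and their sizes, it does not decode individual Exif fields such as white balance, and the names `METADATA_MARKERS`, `list_metadata_segments` and `_segment` are made up for this example. The sample bytes are synthetic, not a real photograph.

```python
import struct

# JPEG marker bytes whose segments commonly hold descriptive metadata.
METADATA_MARKERS = {
    0xE1: "APP1 (usually Exif or XMP)",
    0xED: "APP13 (often Photoshop/IPTC)",
    0xFE: "COM (free-text comment)",
}


def list_metadata_segments(data):
    """Return (label, payload_size) pairs for metadata segments in JPEG bytes."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing start-of-image marker")

    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # malformed stream; stop rather than guess
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # skip a fill byte before the real marker
            continue
        if marker in (0xDA, 0xD9):
            break  # start of compressed image data, or end of image
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker in METADATA_MARKERS:
            label = METADATA_MARKERS[marker]
            if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
                label = "APP1 (Exif)"
            segments.append((label, len(payload)))
        pos += 2 + length
    return segments


def _segment(marker, payload):
    """Build one synthetic JPEG segment for the demo below."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


SAMPLE_JPEG = (
    b"\xff\xd8"
    + _segment(0xE1, b"Exif\x00\x00" + b"\x00" * 8)
    + _segment(0xFE, b"shot at dusk")
    + b"\xff\xda\x00\x02\xff\xd9"
)

print(list_metadata_segments(SAMPLE_JPEG))
```

A tool that strips metadata before publishing an image would, in essence, copy the file while leaving these segments out.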
Key Terms to Understanding Metadata
metadata
Data about data. Metadata describes how and when and by whom a
particular set of data was collected, and how the data is formatted.
meta
In computer science, a common prefix that means "about". So, for
example, metadata is data that describes other data (data about
data). A metalanguage is a language used to describe other
languages. A metafile is a file that contains other files.
data warehouse
Sometimes abbreviated DW, a collection of data designed to support management
decision making. Data warehouses contain a wide variety of data that
present a coherent picture of business conditions at a single point
in time.
Here's everything you could possibly want to know
about this JPEG image, and then some.
Microsoft Office and Metadata
Microsoft Office files, like other types of digital data, also carry
metadata, called document metadata. The metadata stored
along with your saved
Office documents can include your name, initials, company name, computer name,
the disk or network server the file was stored on, file
properties, revisions, hidden text, deleted comments, and much more. Microsoft Office documents are
frequently passed among co-workers, clients and contractors, so when the
documents are shared, quite often, so are large amounts of metadata.
This problem is often referred to as the
metadata risk: the disclosure of private information,
usually unknowingly, because the metadata is hidden from plain view and users are simply
not aware of it. When a document is sent outside your office to a client
or contractor, the associated metadata may not stay hidden if the receiving
party knows where and how to look for it. One critical area of interest is
the capability to track changes made to the document. In Microsoft Word,
metadata stores information about changed text, the name of the author making changes, and the date and
time those changes were made. This information may be something those
outside your company shouldn't have access to.
When using Microsoft Office applications, or any other application, it's important to learn which of the
program's tools let you remove this metadata and reveal
normally hidden mark-up, so you can ensure this associated
information about the content is not shared when you share the actual
file. The removal of this type of data is often called
data scrubbing or
data cleansing.
In Microsoft Office 2007, the latest version at the time of writing,
Microsoft included a Document Inspector feature in Microsoft Office Word
2007, Microsoft Office Excel 2007, and Microsoft Office PowerPoint 2007 that
can help you find and remove hidden data and personal information in your
Office documents.
Microsoft Document Inspector
finds two areas of metadata concern
and provides an easy way to delete the information.
If you use Office 2003/XP, Microsoft offers
an add-in you can download that enables users to permanently remove
hidden data. With this Remove Hidden Data add-in, you can run the tool on
individual files from within your Office application, or run it on multiple
files from the command line. Additionally, a Google search will produce a
wide array of results for third-party tools and software that you can use to
wipe out metadata from various files created using different programs.
The Importance of Metadata
With so many privacy concerns surrounding metadata, you may wonder why it
exists. Metadata is actually useful for searching and
controlling content. For example, consider metadata on
Web pages. Search
engines often give weight to metadata tags such as page title,
keywords and description, in some cases more than to the actual contents of the page. To those searching the
Web, this metadata is useful for finding relevant pages.
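As an illustration of what such page metadata looks like to a program, the following sketch pulls the title and the named meta tags out of an HTML page using Python's standard `html.parser` module. The class name `PageMetadataParser`, the helper `extract_page_metadata` and the sample page are assumptions made for this example; real search engines are far more elaborate.

```python
from html.parser import HTMLParser


class PageMetadataParser(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.metadata[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            # Title text may arrive in more than one chunk, so append.
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract_page_metadata(html_text):
    parser = PageMetadataParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


SAMPLE_PAGE = """<html><head>
<title>Metadata Basics</title>
<meta name="description" content="An introduction to data about data.">
<meta name="keywords" content="metadata, files, privacy">
</head><body><p>Visible page text.</p></body></html>"""

print(extract_page_metadata(SAMPLE_PAGE))
```

Note that none of the extracted values appear in the visible body of the sample page; they describe the page rather than form part of its content.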
Metadata is also important for faster and more accurate database
search and retrieval and for information stored in
data warehouses.
So while it does serve a very important role
in computing, you do need to remember that metadata can disclose
information about you and your business that you may not even realize exists.
Did You Know... In a landmark 2004 case, the U.S. District Court
ruled that electronic documents must be produced "in native format"
and "with their metadata intact" (Williams v. Sprint). Metadata
includes message attributes such as file owner, creation date,
routing details, the sender, receivers, and subject line.
[Source: The New Federal Rules of Civil Procedure: IT Obligations For Email]
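The message attributes listed above are ordinary e-mail headers, and they are easy to read with a program. The sketch below uses Python's standard `email` package on a made-up message; the addresses, host names and the function name `message_attributes` are hypothetical and serve only to show which header holds which attribute.

```python
from email import message_from_string

# A fabricated message used only for illustration.
RAW_MESSAGE = """\
Received: from mail.example.org by mx.example.com; Fri, 10 Aug 2007 09:15:02 -0400
From: Alice <alice@example.org>
To: Bob <bob@example.com>
Subject: Contract draft
Date: Fri, 10 Aug 2007 09:15:00 -0400

Please see the attached draft.
"""


def message_attributes(raw_text):
    """Return the header-level metadata of a raw e-mail message."""
    message = message_from_string(raw_text)
    return {
        "sender": message["From"],
        "receivers": message.get_all("To", []),
        "subject": message["Subject"],
        "date": message["Date"],
        # Each mail server that relays the message adds a Received header.
        "routing": message.get_all("Received", []),
    }


for key, value in message_attributes(RAW_MESSAGE).items():
    print(f"{key}: {value}")
```

Everything printed here is metadata; the one-line body of the message is never touched.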
Vangie 'Aurora' Beal
Writer, www.Webopedia.com
Last updated: August 10, 2007
Remove hidden data and personal information from Office documents

If you plan to share an electronic copy of a Microsoft Office document, it is a
good idea to take the extra step of reviewing the document for hidden data or
personal information that might be stored in the document itself or in the
document properties (metadata, that is, data that describes other data; for
example, the words in a document are data, and the word count is an example of
metadata).
Office 2003/XP Add-in: Remove Hidden Data

With this add-in you can permanently remove hidden data and collaboration data,
such as change tracking and comments, from Microsoft Word, Microsoft Excel, and
Microsoft PowerPoint files.
Ed Bott's Windows Expertise: What's hidden in your Word documents?
Last week, a company I worked with e-mailed me a contract and a cover memo
explaining the contract's terms in plain English. Both documents were in Word
document (.doc) format. What the sender didn't know was that the cover memo
contained some comments, written by various people as the document went through
the approval process.
Metadata--Think outside the docs!
Metadata lets intelligent computer programs find the meaning of your content,
beyond that discoverable by examining the documents themselves. Librarians
called it the machine-readable catalog (MARC). Tim Berners-Lee calls it the
Semantic Web.
Metadata Mistake by Top Spy Agency
Metadata is by definition out of sight. And what is out of sight is out of mind,
and so easily forgotten. The latest metadata mistake story proves this point in
classic "spy versus spy" fashion.
Hidden Data in JPEG Files

Digital cameras and image manipulation programs add hidden data to JPEG files.
For different reasons, one might want to remove these data before publishing the
files on the Internet.
Metadata standards directory

This site hosts a collection of links to
geographic data, metadata and interchange standards, metadata standards and
resources, multilingual tools and services, and links to Web sites containing
geographic information.