Bayesian Filter

(BAY-zee-en FIL-ter) (n.) A technique for identifying incoming e-mail spam. Unlike filtering techniques that look only for spam-identifying words in subject lines and headers, a Bayesian filter considers the entire context of an e-mail when it looks for words or character strings that identify the message as spam. A Bayesian filter also differs from other content filters in that it learns: the more incoming e-mail it analyzes, the better it becomes at identifying new spam.

Named for English Mathematician Thomas Bayes

Bayesian filtering is named for English mathematician Thomas Bayes, who developed a theory of probabilistic inference. It is predicated on the idea that spam can be filtered out based on the probability that certain words correctly identify a piece of e-mail as spam, while other words correctly identify a piece of e-mail as legitimate and wanted. At its most basic level, a Bayesian filter examines a set of e-mails that are known to be spam and a set of e-mails that are known to be legitimate, then compares the content of the two sets to build a database of words that, according to probability, predict whether future e-mails are spam or not. Bayesian filters examine the words in the body of an e-mail, its header information and metadata, word pairs and phrases, and even HTML code that can reveal, for example, certain colors that tend to indicate a spam e-mail.
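The database-building step described above can be sketched in Python. This is a minimal illustration and not the method of any particular filter: the function names tokenize and train, the whitespace tokenizer, the add-one smoothing, the assumption of equal prior odds for spam and legitimate mail, and the six sample messages are all assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (assumed tokenizer; a real
    filter would also look at headers, word pairs, and HTML)."""
    return text.lower().split()


def train(spam_messages, ham_messages):
    """Build a table mapping each word to an estimate of the probability
    that a message containing it is spam.

    Assumes spam and legitimate mail are equally likely beforehand and
    applies add-one smoothing so that no estimate is exactly 0 or 1.
    """
    spam_counts = Counter()
    ham_counts = Counter()
    # Count each word at most once per message.
    for message in spam_messages:
        spam_counts.update(set(tokenize(message)))
    for message in ham_messages:
        ham_counts.update(set(tokenize(message)))

    n_spam = len(spam_messages)
    n_ham = len(ham_messages)
    table = {}
    for word in set(spam_counts) | set(ham_counts):
        p_word_given_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_word_given_ham = (ham_counts[word] + 1) / (n_ham + 2)
        # Bayes' rule with equal priors for spam and legitimate mail.
        table[word] = p_word_given_spam / (p_word_given_spam + p_word_given_ham)
    return table


# Made-up training sets, for illustration only.
spam = ["win cash now", "free cash prize", "claim your free prize now"]
ham = ["meeting notes attached", "please bring cash for lunch", "lunch meeting moved"]

table = train(spam, ham)
for word in ("free", "cash", "meeting"):
    print(word, round(table[word], 2))
```

In this toy data, “cash” appears in both sets, so its score lands between that of “free” (seen only in spam) and that of “meeting” (seen only in legitimate mail).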

How It Works

Bayesian filters are adaptable in two ways: the filter can train itself to identify new patterns of spam, and the human user can adjust it to match the user’s own criteria for what counts as spam. Another advantage is that Bayesian filters take the whole context of a message into consideration. For example, not every e-mail with the word “cash” in it is spam, so the filter estimates the probability that an e-mail containing “cash” is spam based on what other content is in the e-mail.
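The “cash” example can be made concrete with a short Python sketch. It combines per-word spam probabilities using naive Bayes, which assumes that words are independent of one another and that spam and legitimate mail are equally likely beforehand. The table WORD_SPAM_PROB, its values, and the function name spam_probability are hypothetical and chosen only for illustration; a deployed filter would learn its table from the user’s own mail.

```python
import math

# Hypothetical per-word spam probabilities, as a trained filter might store them.
WORD_SPAM_PROB = {
    "cash": 0.60,
    "free": 0.90,
    "prize": 0.95,
    "lunch": 0.10,
    "meeting": 0.05,
}


def spam_probability(text, table=WORD_SPAM_PROB):
    """Combine per-word probabilities into one score for the message.

    Sums the log-odds of every known word (naive Bayes, equal priors)
    and converts the total back to a probability. Unknown words are
    ignored, so a message with no known words scores 0.5.
    """
    log_odds = 0.0
    for word in set(text.lower().split()):
        p = table.get(word)
        if p is not None:
            log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))


promo = spam_probability("free cash prize")
lunch = spam_probability("bring cash to the lunch meeting")
print(f"promo: {promo:.3f}")
print(f"lunch: {lunch:.3f}")
```

Both messages contain “cash”, but under these assumed values the first scores above 0.99 and the second below 0.01, because the surrounding words push the combined odds in opposite directions.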

Proponents of Bayesian filters assert that the filters produce a false-positive rate of less than one percent; that is, fewer than one in a hundred legitimate e-mails is wrongly flagged as spam.

Other forms: Bayesian filtering (v.)

Webopedia Staff
