Web Crawler Definition & Meaning

A web crawler is a bot that moves through web pages and indexes their content so that users can find it in subsequent searches. The most prominent crawlers are operated by major search engines. Google runs multiple web crawling bots; others include Yahoo’s bot and the bot run by Chinese tech corporation Baidu. A web crawler travels from page to page primarily by following links, both internal (to pages on the same site) and external (to other sites). Web crawlers are also referred to as spiders.
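To illustrate how a crawler follows internal and external links, here is a minimal sketch of a breadth-first crawl. It is an illustration under stated assumptions, not how any particular search engine works: the `PAGES` dictionary is a hypothetical in-memory stand-in for the web (a real crawler would fetch each URL over HTTP), and the names `LinkParser`, `extract_links`, and `crawl` are invented for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

# Hypothetical in-memory "web": URL -> HTML. A real crawler would download these.
PAGES = {
    "https://example.com/": '<a href="/about">About</a> <a href="https://other.example/">Other</a>',
    "https://example.com/about": '<a href="/">Home</a> <a href="/contact#form">Contact</a>',
    "https://example.com/contact": "<p>No links here.</p>",
    "https://other.example/": '<a href="https://example.com/">Back</a>',
}


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute, fragment-free URLs for all links found in the page."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(start_url, max_pages=100):
    """Visit pages breadth-first, following links and skipping URLs already seen."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = PAGES.get(url)
        if html is None:  # page not available in our stand-in web
            continue
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


print(crawl("https://example.com/"))
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other, and `max_pages` is a simple safety cap.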

If a web domain owner wants their site to be found in searches, they must allow web crawling. Search engines will only present web pages that they have discovered through crawling. As a web crawler moves through a page, it indexes, or records, the relevant information on the page (often all of it) so that the search engine can pull up that page when a user makes a query. Not all of the Internet is indexed, and researchers aren’t sure how much is. Web crawlers can reach only public web pages; private pages, such as those behind a login, cannot be crawled. A website owner can also place a robots.txt file at the root of the site to tell bots which pages should not be crawled, or add a “noindex” meta tag to the HTML of an individual page to keep it out of search results.
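As a rough sketch of how a well-behaved crawler might honor these signals, the example below checks a URL against robots.txt rules using Python's standard `urllib.robotparser` module and looks for a “noindex” directive in a page's robots meta tag. The robots.txt content, the `ExampleBot` user agent, the URLs, and the helper names `build_rules` and `has_noindex` are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, as it might be served from https://example.com/robots.txt
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a rule set without making a network request."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


class RobotsMetaParser(HTMLParser):
    """Sets self.noindex when <meta name="robots"> lists the noindex directive."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            directives = [d.strip() for d in (attr.get("content") or "").lower().split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


rules = build_rules(ROBOTS_TXT)
print(rules.can_fetch("ExampleBot", "https://example.com/index.html"))
print(rules.can_fetch("ExampleBot", "https://example.com/private/report.html"))
print(has_noindex('<head><meta name="robots" content="noindex, nofollow"></head>'))
```

Note that the two mechanisms work at different stages: robots.txt asks a bot not to fetch a page at all, while a noindex tag can only be seen after the page has been fetched.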

Web crawlers and SEO

Web crawlers find content for search engines; what they gather from a web page affects how that page ranks in search results, which is the focus of search engine optimization (SEO). If a page has relevant keywords and links when it is indexed, it will tend to display more prominently on a search engine. Having keywords in important places, such as headings and metadata, also gives a web page better SEO visibility. Web crawlers pay attention not only to the plain text on a web page but also to its metadata, and search engines may additionally consider the way users respond to a page. It’s therefore important for a website to choose accurate metadata, so that it is displayed accurately in a search engine, and to have content that answers relevant search queries.
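To make the idea of headings and metadata more concrete, the following sketch pulls the title, meta description, and headings out of a page using Python's standard `html.parser` module. It only illustrates the kind of information a crawler could record; how real search engines weigh these signals is not specified here, and the class name `PageSignals`, the function `extract_signals`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class PageSignals(HTMLParser):
    """Records the title, meta description, and h1-h3 headings of a page."""

    HEADING_TAGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.headings = []
        self._current = None  # tag whose text is currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.HEADING_TAGS:
            self._current = tag
            self._buffer = []
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace so the stored text is clean.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            else:
                self.headings.append(text)
            self._current = None


def extract_signals(html):
    """Return the page's title, description, and headings as a dictionary."""
    parser = PageSignals()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "description": parser.description,
        "headings": parser.headings,
    }


SAMPLE_HTML = (
    "<html><head><title>Web Crawler  Basics</title>"
    '<meta name="description" content="How crawlers index pages."></head>'
    "<body><h1>What is a crawler?</h1><p>Body text.</p>"
    "<h2>Crawlers and SEO</h2></body></html>"
)
print(extract_signals(SAMPLE_HTML))
```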

Crawler bots have also been used for malicious purposes, such as spreading false content or harvesting user information, as well as to gauge and influence opinion.

Jenna Phipps
Jenna Phipps is a contributor for websites such as Webopedia.com and Enterprise Storage Forum. She writes about information technology security, networking, and data storage. Jenna lives in Nashville, TN.
