Web business consultants

How and Why Stop Words Are Used by Search Engines

Stop Words

To save disk space and to reduce the time needed to calculate search responses, search engines treat common words that are unlikely to be relevant to a search in one of three ways:

  • they do not index (record) them,
  • or they ignore them when calculating search responses,
  • or both.

These are known as "stop words."  Examples: "this", "that", "those", "the".
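The first option, not indexing stop words at all, can be sketched roughly as follows. This is only an illustration: the stop-word list is just the four examples above, and the function name build_index and the two sample pages are made up for the example. Real search engines use far longer lists and far more elaborate indexes.

```python
from collections import defaultdict

# Assumed stop-word list: only the four examples given in the text.
STOP_WORDS = {"this", "that", "those", "the"}

def build_index(pages):
    """Map each word that is not a stop word to the ids of the pages containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            word = word.strip('.,;:!?"')  # drop surrounding punctuation
            if word and word not in STOP_WORDS:
                index[word].add(page_id)
    return index

# Two hypothetical pages, used only to show the effect.
pages = {1: "The piano player", 2: "Those who play the piano"}
index = build_index(pages)
```

Here the index holds entries for "piano", "player", "who" and "play", but none for "the" or "those", so those words take up no room in it.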

Saving Space on Servers

Consider this sentence:

The way to the school is long and hard when walking in the rain.

The word "the" appears three times. To save space, a search engine might replace each occurrence with a short placeholder called a marker. The sentence would then be stored like this:

* way to * school is long and hard when walking in * rain.

This explanation is simplified, but the point is that using markers can save a lot of disk space. The sentence retains most of its relevancy, and the extra space can be used to store more web pages.
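The marker idea can be tried out with a few lines of code. This is a simplified sketch in the same spirit as the explanation: the marker character, the one-word stop list and the function name compress are assumptions made for the example, not a description of how any particular search engine stores pages.

```python
MARKER = "*"          # assumed marker character, as in the example sentence
STOP_WORDS = {"the"}  # only the word replaced in the example

def compress(sentence):
    """Replace every stop word in the sentence with the marker."""
    words = sentence.split()
    return " ".join(MARKER if w.lower() in STOP_WORDS else w for w in words)

sentence = "The way to the school is long and hard when walking in the rain."
stored = compress(sentence)
saved = len(sentence) - len(stored)  # characters saved in this one sentence
```

Each "the" shrinks from three characters to one, so this sentence is stored six characters shorter. The saving per sentence is small, but it adds up over millions of pages.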

Speeding up Searches

Some search engines store every word on a web page but, to save time, do not search for certain ones. Consider a search for: the piano player

The search engine has to make three runs to find matches (again, this is oversimplified).

First it looks for all matches of "the", then all matches of "piano", then all matches of "player".

Chances are, just looking for the last two words is enough to find relevant pages. So to save time, the search engine skips a select set of small, common words. It won't "stop" to look for them.
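The piano player search can be sketched like this. It is an illustration under stated assumptions: the tiny index, its page numbers and the function name search are invented for the example, and every word, including "the", is assumed to have been stored.

```python
STOP_WORDS = {"the"}  # assumed stop list for this example

# Hypothetical index that stores every word, including "the".
INDEX = {
    "the": {1, 2, 3},
    "piano": {1, 2},
    "player": {1, 3},
}

def search(query):
    """Look up only the words that are not stop words and keep pages matching all of them."""
    terms = [w for w in query.lower().split() if w not in STOP_WORDS]
    results = None
    for term in terms:  # one lookup "run" per remaining word
        matches = INDEX.get(term, set())
        results = matches if results is None else results & matches
    return terms, (results or set())

terms, hits = search("the piano player")
```

Only two lookups are made, for "piano" and "player", instead of three. In this sketch a query made up entirely of stop words returns no pages; a real search engine would need its own rule for that case.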

Internet Marketing Engine - Web business consultants to small & mid-size organizations

Phone: Chicago +1 (708) 401-0421 or Melbourne, Australia +61 (3) 9670-1165

Copyright © 2000-2008 Internet Marketing Engine Australia USA. All rights reserved.
