A Review On Web Search Engines’ Retrieval Methods

—Best match as principal and Boolean searching as auxiliary

21st December 2001

JIN, JIEJUN

MSc in Information Management

INF 6060 Information Storage and Retrieval

Abstract

This essay aims to analyse why most World Wide Web search engines provide best match searching as their principal retrieval method, with Boolean searching playing an auxiliary role.  The World Wide Web has revolutionised the way in which people access information, and search engines are widely used to find useful information on the Web.  Both best match and Boolean retrieval systems have advantages and disadvantages.  Comparisons of the two retrieval methods show that best match searching generally performs better than Boolean searching in an online environment for general use.  Nevertheless, because Boolean searching is more effective and accurate in some circumstances, it is unlikely to be replaced entirely.

Main Contents

The Importance of Searching The World Wide Web

Seeking information is an activity fundamental to all human beings.  Throughout history, one of man’s primary concerns has been to satisfy his information needs.  With the development of new technologies, the way people access information has changed greatly.  Since the Internet was invented in 1969, it has grown rapidly and now extends to every corner of the globe (Poutler (1997)).  The World Wide Web (WWW) has revolutionised the way in which people access information, and has opened up new possibilities in areas such as digital libraries and the dissemination and retrieval of scientific information.  The Internet is proving to have important implications in areas as diverse as education, commerce, entertainment, and medicine and health care.

With the help of web search engines, people can efficiently search the large amount of information available.  The volume of information available through search engines has exploded in recent years.  Because a huge amount of information has long been available in libraries, the revolution that the World Wide Web has brought is better understood as an improvement in the efficiency of accessing information.  The results of GVU’s April 1998 WWW user survey indicate that about 86% of people now find a useful Web site through search engines, and 85% find them through hyperlinks in other Web pages; people now use search engines as much as they surf the Web to find information (Kobayashi, M. & Takeda, K. (2000)).  This indicates how important web search engines have become for people accessing information on the Internet.

What is a World Wide Web Search Engine?

Poutler (1997) defines a ‘World Wide Web search engine’ as a retrieval service consisting of a database (or databases) mainly describing resources available on the World Wide Web (WWW), search software and a user interface also available via the WWW.

“Archie”, the earliest Internet search engine, appeared at McGill University.  It allowed keyword searches of a database of the names of files available via FTP (File Transfer Protocol).  A global network of Archie servers was set up, and each server offered local access to a “mirror” of the original Archie database.  Searches were limited to substrings of filenames.  However, a single Archie search could turn up references to a file stored on many different sites, and the searcher could then retrieve the nearest copy of the file by FTP.  (Poutler (1997))

Gopher, which was created at the University of Minnesota in 1992, later allowed the creation of menus and links from items in these menus to either files or menus on other Gopher servers.  This was the forerunner of the World Wide Web.  A search engine called Veronica was developed for searching Gopher.  It supported standard Boolean operators and, by default, treated a multiword search without operators as a Boolean AND.  (Poutler (1997))
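The default-AND behaviour described above can be illustrated with a minimal sketch.  This is not Veronica's actual implementation; the function name boolean_and_search and the sample menu titles are hypothetical, and the sketch assumes that a query without operators is simply split on whitespace and that every word must appear in a title.

```python
def boolean_and_search(titles, query):
    """Return the titles that contain every word of the query (implicit AND)."""
    terms = query.lower().split()
    if not terms:
        # An empty query matches nothing rather than everything.
        return []
    matches = []
    for title in titles:
        words = set(title.lower().split())
        # Boolean AND: a title qualifies only if all query terms are present.
        if all(term in words for term in terms):
            matches.append(title)
    return matches


# Hypothetical menu titles, for illustration only.
menu_titles = [
    "Campus library catalogue",
    "Library opening hours",
    "Campus map",
]
and_hits = boolean_and_search(menu_titles, "campus library")
```

With these sample titles, only the entry containing both "campus" and "library" is returned; a title matching just one of the two words is excluded, which is what distinguishes Boolean AND from best match retrieval.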

Today, web search engines use software robots to survey the Web and build their databases.  Web documents are retrieved and indexed.  When a user enters a query at a search engine website, the input is checked against the search engine's keyword indices.  The best matches are then returned to the user as “hits”.  Most web search engines offer two different types of searches—“basic” and “advanced” searches.  
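The indexing and matching steps described above can be sketched roughly as follows.  This is a simplified illustration, not the method of any particular search engine: the names build_index and best_match, the sample pages, and the scoring rule (counting how many distinct query terms a document contains) are all assumptions made for the example.  Real engines use far more elaborate weighting.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def best_match(index, query):
    """Rank documents by the number of distinct query terms they contain."""
    scores = defaultdict(int)
    for term in set(query.lower().split()):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties are broken by document id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical pages standing in for documents gathered by a robot.
documents = {
    "page1": "web search engines index web documents",
    "page2": "boolean searching in library catalogues",
    "page3": "search engines rank documents by best match",
}
index = build_index(documents)
hits = best_match(index, "best match search engines")
```

Unlike a Boolean AND search, a document does not need to contain every query term to be returned: here page3 contains all four terms and is ranked first, while page1 contains only two and is ranked second.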

A “basic” search normally uses a best match retrieval method.  This is the principal retrieval method of most Web search engines.  A "basic" search requires the user simply to enter a query without ...