Search engine



 
 
A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload.

The most public, visible form of a search engine is a Web search engine, which searches for information on the World Wide Web.




How search engines work

Search engines provide an interface to a group of items that enables users to specify criteria about an item of interest and have the engine find the matching items. The criteria are referred to as a search query. In the case of text search engines, the search query is typically expressed as a set of words that identify the desired concept that one or more documents may contain. There are several styles of search query syntax that vary in strictness. It can also switch names within the search engines from previous sites. Whereas some text search engines require users to enter two or three words separated by white space, other search engines may enable users to specify entire documents, pictures, sounds, and various forms of natural language. Some search engines apply improvements to search queries to increase the likelihood of providing a quality set of items through a process known as query expansion.
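As a rough illustration of query expansion, the sketch below splits a whitespace-separated query into terms and appends known synonyms. The synonym table and the function name `expand_query` are hypothetical and chosen only for this example; real engines may expand queries in quite different ways (stemming, spelling correction, query logs).

```python
# Hypothetical synonym table, assumed for illustration only.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "film": ["movie"],
}


def expand_query(query: str, synonyms: dict[str, list[str]] = SYNONYMS) -> list[str]:
    """Split a whitespace-separated query and append known synonyms of each term."""
    expanded: list[str] = []
    for term in query.lower().split():
        # Keep the original term first, then any synonyms, without duplicates.
        for candidate in [term, *synonyms.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

Calling `expand_query("car film")` yields the original terms followed by their synonyms, so a document that mentions only "automobile" can still match.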

The list of items that meet the criteria specified by the query is typically sorted, or ranked. Ranking items by relevance (from highest to lowest) reduces the time required to find the desired information. Probabilistic search engines rank items based on measures of similarity (between each item and the query, typically on a scale of 0 to 1, with 1 being most similar) and sometimes popularity or authority (see Bibliometrics), or use relevance feedback. Boolean search engines typically return only items that match exactly, without regard to order, although the term Boolean search engine may simply refer to the use of Boolean-style syntax (the operators AND, OR, NOT, and XOR) in a probabilistic context.
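One possible way to score similarity on a scale of 0 to 1 is cosine similarity over term counts. The text above does not name a specific measure, so this choice, along with the function names `cosine_similarity` and `rank`, is an assumption made for the sketch.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors (0.0 to 1.0)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: str, documents: dict[str, str]) -> list[tuple[str, float]]:
    """Return (document id, score) pairs sorted from most to least similar."""
    query_vector = Counter(query.lower().split())
    scored = [
        (doc_id, cosine_similarity(query_vector, Counter(text.lower().split())))
        for doc_id, text in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A document identical to the query scores 1, and a document sharing no terms with it scores 0. Popularity or authority signals, where used, would be combined with this score in some engine-specific way not shown here.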

To quickly provide a set of matching items sorted according to some criteria, a search engine will typically collect metadata about the group of items under consideration beforehand through a process referred to as indexing. The index typically requires a smaller amount of computer storage than the items themselves, which is why some search engines store only the indexed information and not the full content of each item, and instead provide a method of navigating to the items from the search engine result page. Alternatively, the search engine may store a copy of each item in a cache so that users can see the state of the item at the time it was indexed, for archive purposes, or to make repetitive processes work more efficiently and quickly.
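A minimal sketch of indexing, assuming an inverted index (a mapping from each term to the identifiers of the items that contain it). The text does not prescribe a particular index structure, so the structure and the names `build_index` and `search_all` are illustrative assumptions. The index stores identifiers only, not the full content of each item.

```python
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase term to the set of document ids containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)


def search_all(index: dict[str, set[str]], terms: list[str]) -> set[str]:
    """Return ids of documents containing every term (a Boolean AND)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)
```

Because the index is built beforehand, answering a query only requires looking up and intersecting sets of identifiers rather than scanning every item.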

Other types of search engines do not store an index. Crawler, or spider-type, search engines (also known as real-time search engines) may collect and assess items at the time of the search query, dynamically considering additional items based on the contents of a starting item (known as a seed, or seed URL in the case of an Internet crawler). Meta search engines store neither an index nor a cache and instead simply reuse the index or results of one or more other search engines to provide an aggregated, final set of results.
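The crawler-style approach can be sketched as a breadth-first traversal that starts from a seed and assesses each item at query time. The in-memory `PAGES` table stands in for real pages and links, and all names here are hypothetical; an actual crawler would fetch items over a network and extract links from their contents.

```python
from collections import deque

# Hypothetical in-memory "web": each page has text and outgoing links.
PAGES = {
    "seed": {"text": "search engine overview", "links": ["a", "b"]},
    "a": {"text": "web search engine history", "links": ["c"]},
    "b": {"text": "cooking recipes", "links": ["seed"]},
    "c": {"text": "meta search engine", "links": []},
}


def crawl_and_match(seed: str, query: str, pages: dict = PAGES, max_pages: int = 100) -> list[str]:
    """Visit pages breadth-first from the seed and return those containing every query term."""
    terms = set(query.lower().split())
    seen = {seed}
    queue = deque([seed])
    matches: list[str] = []
    visited = 0
    while queue and visited < max_pages:
        page_id = queue.popleft()
        page = pages.get(page_id)
        if page is None:
            continue  # Link points to an unknown page; skip it.
        visited += 1
        if terms <= set(page["text"].lower().split()):
            matches.append(page_id)
        # Consider additional items based on the contents of the current one.
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return matches
```

The `max_pages` limit is a design choice for the sketch: without a stored index, the cost of each query grows with the number of items visited, so some bound on the traversal is usually needed.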

See also