Database search engine
There are several categories of search engine software: web search or full-text search (example: Lucene), database or structured data search (example: Dieselpoint), and mixed or enterprise search (example: Google Search Appliance). The largest web search engines, such as Google and Yahoo!, use tens or hundreds of thousands of computers to process billions of web pages and return results for thousands of searches per second. This high volume of queries and text processing requires the software to run in a highly distributed environment with a high degree of redundancy. Modern search engines have the following main components:

Searching for text-based content in databases or other structured data formats (XML, CSV, etc.) presents some special challenges and opportunities, which a number of specialized search engines address. Databases are slow when solving complex queries (those with multiple logical or string matching arguments). Databases allow logical queries that full-text search does not (use of multi-field boolean logic, for instance). No crawling is necessary for a database since the data is already structured, but it is often necessary to index the data in a more compact form designed to allow for faster search.
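
As a rough illustration of indexing structured data in a compact form and answering a multi-field boolean query against it, the following minimal Python sketch builds an in-memory inverted index keyed by (field, token). The sample records, the field names, and the helper names build_index, search_all and search_any are all hypothetical and are not taken from any particular product.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "name": "Red Widget", "category": "tools", "city": "Boston"},
    {"id": 2, "name": "Blue Widget", "category": "toys", "city": "Boston"},
    {"id": 3, "name": "Red Gadget", "category": "tools", "city": "Denver"},
]


def build_index(records, fields):
    """Map each (field, token) pair to the set of record ids containing it."""
    index = defaultdict(set)
    for record in records:
        for field in fields:
            for token in str(record[field]).lower().split():
                index[(field, token)].add(record["id"])
    return index


def _postings(index, terms):
    """Look up the id set for every field=token condition."""
    return [index.get((field, token.lower()), set()) for field, token in terms.items()]


def search_all(index, **terms):
    """Boolean AND: ids of records that satisfy every condition."""
    postings = _postings(index, terms)
    return set.intersection(*postings) if postings else set()


def search_any(index, **terms):
    """Boolean OR: ids of records that satisfy at least one condition."""
    return set().union(*_postings(index, terms))


index = build_index(RECORDS, ["name", "category", "city"])
print(search_all(index, name="red", category="tools"))
print(search_any(index, name="gadget", category="toys"))
```

Each query is answered with set operations on precomputed id sets rather than by scanning every record, which is the general idea behind indexing the data in a form designed for faster search. A production engine would add persistence, ranking, and compression, none of which is attempted here.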

Database search engines were initially (and still usually are) included with major database software products. As such, they are usually called indexing engines. However, these indexing engines are relatively limited in their ability to customize indexing formats (compounding, normalization, transformation, transliteration, etc.). Usually they do not provide sophisticated data matching technology (string matching, boolean logic, algorithmic methods, search scripting, etc.).
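
To make the idea of a customized indexing format more concrete, here is a small Python sketch of one possible normalization step applied to values before they are indexed or compared. It uses the standard unicodedata module to strip combining accents and fold case. The function names normalize_key and same_key are hypothetical, and accent stripping is only a crude stand-in for true transliteration between scripts.

```python
import unicodedata


def normalize_key(text):
    """Return a case-folded, accent-free, whitespace-collapsed form of text."""
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case and collapse runs of whitespace into single spaces.
    return " ".join(stripped.casefold().split())


def same_key(left, right):
    """Treat two values as matching when their normalized forms are equal."""
    return normalize_key(left) == normalize_key(right)


print(normalize_key("  Café   MÜNCHEN "))
print(same_key("Zoë", "zoe"))
```

Applying the same function to both the indexed values and the incoming query terms is what lets differently typed variants of a value match each other.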

In more advanced database search systems, relational databases are indexed by compounding multiple tables into a single table containing only the fields that need to be queried (or displayed in search results). The actual data matching engines can include functions such as basic string matching, normalization, and transformation. Database search technology is heavily used by government database services, e-commerce companies, web advertising platforms, telecommunications service providers, etc.
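
The compounding of several tables into one search table can be sketched with Python's built-in sqlite3 module. Everything about the schema below (the customers and orders tables, their columns, the search_rows table, and the function names) is a hypothetical example chosen for illustration; it shows one way to do this under those assumptions, not how any specific database search product works.

```python
import sqlite3


def create_sample_tables(conn):
    """Create two hypothetical source tables with a few rows each."""
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT, notes TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                             product TEXT, internal_code TEXT);
        INSERT INTO customers VALUES (1, 'Ada Lovelace', 'London', 'vip');
        INSERT INTO customers VALUES (2, 'Alan Turing', 'Manchester', '');
        INSERT INTO orders VALUES (10, 1, 'engine', 'X1');
        INSERT INTO orders VALUES (11, 2, 'machine', 'X2');
        INSERT INTO orders VALUES (12, 1, 'notebook', 'X3');
    """)


def build_search_table(conn):
    """Compound both tables into one, keeping only the searchable fields."""
    conn.executescript("""
        CREATE TABLE search_rows AS
            SELECT o.id AS order_id, c.name AS customer_name,
                   c.city AS city, o.product AS product
            FROM orders AS o
            JOIN customers AS c ON c.id = o.customer_id;
        CREATE INDEX idx_search_city_product ON search_rows (city, product);
    """)


def find_orders(conn, city, product):
    """Query the compounded table with a two-field AND condition."""
    cursor = conn.execute(
        "SELECT order_id, customer_name FROM search_rows "
        "WHERE city = ? AND product = ? ORDER BY order_id",
        (city, product),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
create_sample_tables(conn)
build_search_table(conn)
print(find_orders(conn, "London", "engine"))
```

Fields that are never searched or displayed (notes and internal_code in this example) are left out of search_rows, so queries touch a single narrow table instead of joining the source tables each time.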

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 