Apache ZooKeeper
Apache ZooKeeper is a software project of the Apache Software Foundation that provides an open source, centralized configuration service and naming registry for large distributed systems. ZooKeeper was originally a subproject of Hadoop and is now a top-level Apache project.

ZooKeeper's architecture supports high availability through redundant services: if the first ZooKeeper server fails to answer, a client can ask another one. ZooKeeper nodes store their data in a hierarchical name space, much like a file system or a trie data structure. Clients can read from and write to the nodes, and in this way share a configuration service.
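The hierarchical name space described above can be illustrated with a small in-memory model. This is only a sketch for intuition: the class `ZNodeTree` and its methods `create`, `get`, `set`, and `children` are hypothetical names chosen for this example and are not the ZooKeeper client API. It involves no networking and no replication; it only shows how slash-separated paths map onto a tree of nodes that each hold data.

```python
class ZNodeTree:
    """Toy in-memory model of a hierarchical name space (not the ZooKeeper API)."""

    def __init__(self):
        # Each node is a dict holding its data and its children keyed by name.
        self.root = {"data": b"", "children": {}}

    @staticmethod
    def _parts(path):
        # Split an absolute path such as "/app/config" into ["app", "config"].
        if not path.startswith("/"):
            raise ValueError("path must be absolute, e.g. /app/config")
        return [p for p in path.split("/") if p]

    def _find(self, path):
        # Walk down from the root one path component at a time, like a trie lookup.
        node = self.root
        for name in self._parts(path):
            if name not in node["children"]:
                raise KeyError(path)
            node = node["children"][name]
        return node

    def create(self, path, data=b""):
        # The parent must already exist; the new node must not.
        parts = self._parts(path)
        if not parts:
            raise ValueError("cannot create the root node")
        parent = self._find("/" + "/".join(parts[:-1]))
        if parts[-1] in parent["children"]:
            raise KeyError(f"node already exists: {path}")
        parent["children"][parts[-1]] = {"data": data, "children": {}}

    def get(self, path):
        return self._find(path)["data"]

    def set(self, path, data):
        self._find(path)["data"] = data

    def children(self, path):
        return sorted(self._find(path)["children"])


if __name__ == "__main__":
    tree = ZNodeTree()
    tree.create("/app")
    tree.create("/app/config", b"timeout=30")
    tree.set("/app/config", b"timeout=60")   # a writer updates the shared value
    print(tree.get("/app/config"))           # a reader sees the new value
    print(tree.children("/app"))
```

In this model every client that holds a reference to the same tree sees the same values, which is the sense in which the name space acts as a shared configuration store.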

ZooKeeper is used by companies including Rackspace and Yahoo!, as well as by open source enterprise search systems such as Solr.

Typical use cases

  • Naming service
  • Configuration management
  • Synchronization
  • Leader election
  • Message queue
  • Notification
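As an illustration of the leader election use case, one commonly described approach is for each candidate to obtain a monotonically increasing sequence number and to treat the candidate with the lowest number as the leader; when that candidate leaves, the next lowest takes over. The sketch below models only that rule in memory. The `Election` class and its methods are hypothetical names for this example, not part of any ZooKeeper client library, and the sketch omits everything a real deployment needs, such as networking, failure detection, and notifications.

```python
import itertools


class Election:
    """Toy model of lowest-sequence-number leader election (no networking)."""

    def __init__(self):
        self._counter = itertools.count()   # monotonically increasing sequence numbers
        self._candidates = {}               # sequence number -> candidate name

    def join(self, name):
        # Register a candidate and return its sequence number.
        seq = next(self._counter)
        self._candidates[seq] = name
        return seq

    def leave(self, seq):
        # Remove a candidate, for example because its process has stopped.
        self._candidates.pop(seq, None)

    def leader(self):
        # The candidate holding the lowest sequence number is the leader.
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


if __name__ == "__main__":
    election = Election()
    a = election.join("worker-a")
    b = election.join("worker-b")
    print(election.leader())   # worker-a joined first, so it leads
    election.leave(a)
    print(election.leader())   # worker-b takes over once worker-a is gone
```

Because sequence numbers are never reused, a candidate that leaves and rejoins goes to the back of the line rather than reclaiming leadership.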

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 