Two-node cluster
A two-node cluster is the minimal high-availability cluster that can be built. Should one node fail (because of a hardware or software problem), the other must acquire the resources previously managed by the failed node in order to re-enable access to them. This process is known as failover.

Introduction and some definitions

There are various kinds of resources, notably:
  • Storage space (containing data, binaries, or anything else that needs to be accessed)
  • Network address(es) (the users can reach the resources via a network connection)
  • Application software (that acts as an interface between users and other resources)


Typical services provided by a computer cluster are built from a combination of each of the previously defined resources.

As an example, an Oracle database service might be composed of:
  • some storage space, to hold the database files (and, ultimately, the data);
  • an Oracle installation, configured to be remotely (or locally) accessed; and,
  • an IP address to listen on; the users must connect to this address in order to use Oracle to access the data.
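
As a rough illustration, the composition described above can be modeled as a service that bundles one resource of each kind. The sketch below is in Python; the class names, the storage path, the instance name, and the IP address (taken from the 192.0.2.0/24 documentation range) are assumptions made for the example and do not come from any cluster product.

```python
from dataclasses import dataclass, field
from enum import Enum


class ResourceKind(Enum):
    """The three kinds of resources listed in the introduction."""
    STORAGE = "storage space"
    NETWORK_ADDRESS = "network address"
    APPLICATION = "application software"


@dataclass(frozen=True)
class Resource:
    name: str
    kind: ResourceKind


@dataclass
class Service:
    """A cluster service: a named bundle of resources."""
    name: str
    resources: list[Resource] = field(default_factory=list)

    def kinds(self) -> set[ResourceKind]:
        # Which kinds of resources this service is built from.
        return {r.kind for r in self.resources}


# Hypothetical names and address, for illustration only.
oracle_service = Service(
    name="oracle-db",
    resources=[
        Resource("/shared/oradata", ResourceKind.STORAGE),
        Resource("oracle-instance", ResourceKind.APPLICATION),
        Resource("192.0.2.10", ResourceKind.NETWORK_ADDRESS),
    ],
)
```

Treating the service, rather than the individual resource, as the unit that moves between nodes keeps the storage, the application, and the address together during a failover.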

Required

  • Two hosts, each with its own local storage device(s)
  • Shared storage, that can be accessed by each host (or node), such as a file server
  • Some method of interconnection that enables one node to see if the other is dead, and to help coordinate resource access

Interconnection topologies

  • A serial crossover cable is the simpler (and more reliable) way to ensure proper intracluster communication
  • An Ethernet crossover cable needs each host's TCP/IP stack to be functional to ensure proper intracluster communication
  • A shared disk (in advanced setups), usually used for heartbeat only
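
Whatever the interconnect, the basic liveness check is the same: each node records when it last heard from its peer and treats the peer as dead after a period of silence. A minimal Python sketch follows. The class name `HeartbeatMonitor`, its methods, and the timeout value are hypothetical, and the transport that actually delivers heartbeats over the serial line, Ethernet link, or shared disk is left out.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer node was last heard from."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = clock()    # assume the peer is alive at start-up

    def record_heartbeat(self) -> None:
        # Call this whenever a heartbeat arrives from the peer.
        self._last_seen = self._clock()

    def peer_is_dead(self) -> bool:
        # The peer is presumed dead once it has been silent for longer
        # than the timeout.
        return self._clock() - self._last_seen > self.timeout_s


# Demonstration with a fake clock instead of real waiting.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_s=3.0, clock=lambda: fake_now[0])
fake_now[0] = 2.0
monitor.record_heartbeat()
fake_now[0] = 5.5                    # 3.5 s of silence since the last heartbeat
print(monitor.peer_is_dead())
```

Silence alone cannot distinguish a dead peer from a broken interconnect, which is one reason a method of exclusive access to shared resources is recommended.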

Optional but strongly recommended

Gear to eliminate other single points of failure:
  • Three uninterruptible power supplies (UPS), one for each node and one for the shared storage
  • Redundant network connections (using dual NICs and dual switches with bonding or trunking software on the server)


Some method of exclusive access to shared resources. This can be:
  • Physical, in order to forcefully eject the other machine:
    • Two power switches, allowing each node to remotely cut the power to the other when it becomes stuck or inoperative
  • Logical, using one of (as appropriate for each resource):

Classification by role symmetry

There are two kinds of two-node clusters, from this perspective:

Active/Passive

One node owns the services, the other one remains inoperative.

Should the primary node fail, the secondary (or backup) node takes over the resources and reactivates the services, while the former primary in turn remains inoperative.

This is a configuration where only one node is operative at any point of time.

Active/Active

There is no concept of a primary or backup node: both nodes provide some service. Should one of these nodes fail, the other must also assume the failed node's services.
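
The difference between the two configurations can be sketched as a mapping from services to the node that currently runs them. In the Python sketch below, the function `fail_over`, the node names, and the service names are hypothetical. It only computes the new placement; a real cluster manager would also stop and start the services.

```python
def fail_over(placement: dict[str, str], failed: str, survivor: str) -> dict[str, str]:
    """Return a new service -> node mapping with the failed node's services moved."""
    return {
        service: (survivor if node == failed else node)
        for service, node in placement.items()
    }


# Active/Passive: node-a owns every service, node-b is idle.
active_passive = {"web": "node-a", "mail": "node-a"}

# Active/Active: each node provides some service.
active_active = {"web": "node-a", "mail": "node-b"}

# After node-a fails, node-b runs everything in both configurations.
print(fail_over(active_passive, failed="node-a", survivor="node-b"))
print(fail_over(active_active, failed="node-a", survivor="node-b"))
```

In the Active/Active case the surviving node ends up carrying its own services plus the failed node's, so it has to be sized for the combined load.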

Service based

Each service provided by the cluster is independent of the others: for example, a web server and a mail server both run on the cluster, and each one can be independently managed and switched from one node to the other without affecting the functionality of the other services.

One open source example of this kind of cluster is Kymberlite.

Logical-host based

In more complex configurations, there can be dependencies among services.

For example, a mail server receives e-mail for local users and stores it on a local storage resource, but the users also need to be able to read this e-mail from a remote site. This in turn requires a mail retrieval server, such as an IMAP server.

Both of these services need access to the same storage resource, the first for writing the e-mail messages that arrived from the Internet, the second to read, move or delete them.

This means that the mail server cannot simply be failed over from one node to the other, as the mail retrieving server will still need access to the same data. These two services must be grouped together, forming a so-called logical host. To be more precise, this logical host will consist of three resources:
  • the storage resource, needed by both server applications;
  • the mail transfer service that receives e-mail from the Internet; and,
  • the mail retrieval service that acts as an interface, permitting users to view their email.


If it becomes necessary to fail over this logical host, the following steps need to be automatically or manually performed:
  • stop the mail retrieving service on the failed node (if possible)
  • stop the mail transfer service on the failed node (if possible)
  • release the storage resource on the failed node (if possible)
  • acquire the storage resource on the failover node
  • start the mail transfer service on the failover node
  • start the mail retrieving service on the failover node
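
The steps above follow a simple rule: resources are stopped in reverse dependency order on the failed node, then started in dependency order on the failover node. A minimal Python sketch of that ordering is shown below. The function name, the resource names, and the `stop`/`start` callbacks are assumptions made for the example; they stand in for whatever mechanism a real cluster uses to control resources.

```python
from typing import Callable

# Resources in start order: each entry depends on the ones before it.
LOGICAL_HOST = ["storage", "mail-transfer", "mail-retrieval"]


def fail_over_logical_host(
    resources: list[str],
    stop: Callable[[str, str], None],
    start: Callable[[str, str], None],
    failed_node: str,
    failover_node: str,
) -> None:
    # Stop (or release) in reverse order on the failed node. Errors are
    # tolerated here because the failed node may be unreachable; this is
    # the "if possible" in the steps above.
    for resource in reversed(resources):
        try:
            stop(failed_node, resource)
        except Exception as exc:
            print(f"could not stop {resource} on {failed_node}: {exc}")

    # Acquire the storage first, then start the services that need it.
    for resource in resources:
        start(failover_node, resource)


# Example run with callbacks that only print what they would do.
fail_over_logical_host(
    LOGICAL_HOST,
    stop=lambda node, res: print(f"stop {res} on {node}"),
    start=lambda node, res: print(f"start {res} on {node}"),
    failed_node="node-a",
    failover_node="node-b",
)
```

If the failed node cannot confirm that it released the storage, starting the services elsewhere risks two nodes writing to the same data, which is the situation the physical and logical exclusive-access methods are meant to prevent.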


One open source example of this kind of cluster is Linux-HA; one commercial example for systems running the Solaris Operating System is called Sun Cluster.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 