Linux has seen considerable growth in data-centric, high-availability clustering. Most admins are already familiar with computational clusters, known loosely as Beowulf clusters, which are built with MPI, PVM, LAM, MOSIX, and other process-sharing and process-distributing technologies. There are also "Web service clusters", such as those distributed in years past by TurboLinux and others. These were typically groups of similarly configured servers that used DNS and round-robin IP address tricks to give end users the illusion of Web server high availability.
Cohesive operation between the nodes, however, was still achieved only through a shared-storage medium, such as Fibre Channel or shared SCSI, which is prohibitively expensive for small businesses, or through proprietary cluster hardware and software, which is also prohibitively expensive. A database engine that serves a Web cluster must still itself be clustered to achieve true high availability. Application-level high-availability tools (such as the MySQL database engine) that transparently replicate themselves between servers are also being used to provide some level of redundancy.
The one area in which Linux still starves for attention is lightweight, easily configured, affordable high availability -- a general-purpose cluster. A general-purpose, high-availability cluster must be "application agnostic" -- it should not care what runs on it, whether that is a Web server, mail server, database server, or some future, yet-unknown type of service. The cluster should operate in a uniform way no matter what application is running.
In response to this, I have written TKCluster (when I initially wrote it, I couldn't think of a good name for it, so I just prefixed "cluster" with my initials). TKCluster is available under the GPL so that anyone can freely download and modify it to suit their needs.
Overview
TKCluster itself is a cluster manager. Raw data replication between nodes is performed by the wonderful DRBD driver by Philipp Reisner. DRBD is a block device that maps to a given raw disk partition and a socket. Writes to the DRBD device (/dev/nb0 .. /dev/nbX) go both to the physical disk in the local machine and to the waiting secondary node over a standard Ethernet connection.
All clusters require some kind of "heartbeat" mechanism. After experimenting with several, I chose openMosix.
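For readers new to the idea, a heartbeat mechanism boils down to remembering when each node was last heard from and treating nodes that have gone quiet as dead. The Python sketch below illustrates only that general idea; it is not how openMosix tracks its nodes, and the class name, method names, and timeout value are all hypothetical.

```python
import time


class HeartbeatTracker:
    """Toy membership list: a node counts as alive if heard from recently."""

    def __init__(self, timeout_seconds=5.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # hypothetical default
        self.clock = clock                      # injectable so the logic can be tested
        self.last_seen = {}                     # node name -> time of last heartbeat

    def record_heartbeat(self, node):
        """Note that a heartbeat from `node` has just arrived."""
        self.last_seen[node] = self.clock()

    def alive_nodes(self):
        """Return the sorted names of nodes heard from within the timeout."""
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout_seconds
        )
```

A real heartbeat layer would also have to send and receive the heartbeats over the network; the sketch covers only the bookkeeping that turns them into a list of connected machines.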
openMosix was designed to share computation-intensive process loads between multiple machines; however, I have yet to find anything that does as good a job of maintaining a frequently updated list of connected machines. The process-load sharing and MFS filesystem (analogous to traditional NFS, but much smarter) make openMosix a perfect candidate for helping to tie the cluster nodes together. Although MFS is not necessary for TKCluster operation, it sure helps when copying configuration files between the nodes of the cluster.
TKCluster's role is to use the data gleaned from both openMosix and DRBD to make decisions about starting services, seizing control of the cluster IP address, and keeping the sister copy of TKCluster on the other node of the cluster aware of what's going on.
Read more at SysAdmin and at LinuxGazette
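The kind of decision a cluster manager makes from membership and replication data can be sketched in a few lines. The Python below is an illustration under stated assumptions, not TKCluster's actual logic: the `NodeView` fields, the `decide` function, and the action names are all hypothetical, and it assumes a two-node cluster with one active node and one standby.

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    """What one node knows at decision time (all fields are hypothetical)."""
    peer_alive: bool        # reported by the heartbeat/membership layer
    data_up_to_date: bool   # reported by the replication layer
    is_active: bool         # whether this node currently runs the services


def decide(view: NodeView) -> str:
    """Return the action for this node: 'takeover', 'hold', or 'wait'."""
    if view.is_active:
        return "hold"        # already serving; keep the IP address and services
    if view.peer_alive:
        return "wait"        # the active peer is healthy; stay on standby
    if not view.data_up_to_date:
        return "wait"        # never serve from a stale copy of the data
    return "takeover"        # peer gone and data current: claim IP, start services
```

For example, `decide(NodeView(peer_alive=False, data_up_to_date=True, is_active=False))` returns `"takeover"`. Checking replication state before taking over is the design point worth noting: a standby that seizes the cluster IP address with stale data would stay available but serve wrong answers.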