High-performance clusters
High-performance clusters are more than a cloud.
High-performance clusters have many applications.
Web clusters are redundant at several levels. Redundant load balancers dispatch incoming web requests to multiple production threads. Each production thread consists of a web server and a database server. The goal is to process incoming requests massively in parallel. At the same time, this design provides redundancy to counteract system failures. Redundancy also makes it easy to maintain production sites without a complete downtime, because each redundant part of the cluster can be maintained independently.
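The dispatching described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how a production load balancer is implemented: the class RoundRobinBalancer, its methods and the thread names are hypothetical, and health information is set by hand instead of coming from real health checks.

```python
class RoundRobinBalancer:
    """Hypothetical dispatcher: hands requests to production threads in turn."""

    def __init__(self, production_threads):
        self.production_threads = list(production_threads)
        self.healthy = set(self.production_threads)
        self._next_index = 0

    def mark_down(self, thread):
        # Take a production thread out of rotation (failure or maintenance).
        self.healthy.discard(thread)

    def mark_up(self, thread):
        # Put a known production thread back into rotation.
        if thread in self.production_threads:
            self.healthy.add(thread)

    def pick(self):
        # Try each thread at most once, skipping those that are marked down.
        for _ in range(len(self.production_threads)):
            thread = self.production_threads[self._next_index]
            self._next_index = (self._next_index + 1) % len(self.production_threads)
            if thread in self.healthy:
                return thread
        raise RuntimeError("no healthy production thread available")


balancer = RoundRobinBalancer(["thread-a", "thread-b", "thread-c"])
first_round = [balancer.pick() for _ in range(4)]

# thread-b goes into maintenance; the other two keep serving requests.
balancer.mark_down("thread-b")
during_maintenance = [balancer.pick() for _ in range(3)]
```

Marking a production thread as down is all it takes to maintain it while the remaining threads carry the load.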
Number-crunching clusters distribute processes across many CPUs/GPUs and need extremely fast network links such as InfiniBand to distribute memory segments across the cluster. The goal is sheer CPU power. Redundancy plays a minor role.
Failover clusters serve primarily for redundancy. They consist of an active and a passive server. Data is mirrored in real time from the active server to the passive one. If the active server goes down, through failure or for maintenance, the passive server seamlessly takes over the role of the formerly active one.
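The role swap in a failover pair can be sketched as follows. This is a minimal model and not the mechanism of any particular cluster software: the class FailoverPair, the server names and the three-second timeout are assumptions made for illustration, and the current time is passed in explicitly so the example stays deterministic.

```python
class FailoverPair:
    """Hypothetical active/passive pair that swaps roles on missed heartbeats."""

    def __init__(self, active, passive, timeout_seconds=3.0):
        self.active = active
        self.passive = passive
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = 0.0

    def record_heartbeat(self, now):
        # Called whenever the active server reports that it is alive.
        self.last_heartbeat = now

    def check(self, now):
        # Return True if a failover happened during this check.
        if now - self.last_heartbeat <= self.timeout_seconds:
            return False
        # The passive server takes over; the old active one becomes passive.
        self.active, self.passive = self.passive, self.active
        self.last_heartbeat = now
        return True


pair = FailoverPair("db1", "db2", timeout_seconds=3.0)
pair.record_heartbeat(10.0)
still_fine = pair.check(12.0)    # heartbeat is 2.0 s old: no failover
failed_over = pair.check(14.5)   # heartbeat is 4.5 s old: roles swap
```

Note that a missing heartbeat only shows that the active server cannot be reached. The sketch cannot tell a dead server from a broken network link.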
High-performance clusters share resources. The sharing of resources is a crucial aspect of all clusters. Such resources are IP addresses, data in a database, or file systems with lots of files. A resource-sharing cluster like this can be built with standard components.
Shared IP addresses, e.g. for load balancing a group of web servers, can be handled by the Linux kernel using the IPVS kernel module.
Replicating file systems in real time is the domain of DRBD, which mirrors the underlying block device between servers.
Databases like MySQL, Postgres, and Oracle all come with built-in methods for replication.
Pacemaker is a battle-hardened tool for controlling resources distributed across a network.
High-performance clusters are complex. The complexity and error-proneness of clusters are often underestimated. Distributing resources requires controlling these resources in real time across the whole cluster. This control only works if all components of the cluster can communicate with each other at all times. The absolute reliability of all network connections is crucial for the operation of the cluster, even if all servers in the cluster are highly redundant. Every network link, from the cabling between switch and server to the connections between switches, routers, firewalls, load balancers etc., has to exist as two truly independent network paths. The usual star-shaped network topology becomes a structure of interlinked rings. But Ethernet does not tolerate ring topologies on its own. Without actively steering the topology using the spanning tree protocol, the rings and the whole network will collapse under massive packet storms within seconds. I once accidentally took a whole data center offline by connecting the two network ports of a virtual machine to their real counterparts on the host machine the wrong way round.
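Why rings need active steering can be illustrated with a small graph sketch. It is not the spanning tree protocol itself (there are no bridge IDs, priorities or BPDUs here), only the underlying idea: links that would close a loop are blocked, and the remaining links form a loop-free tree. The function name spanning_tree and the device names are hypothetical.

```python
def spanning_tree(links):
    """Split links into a loop-free forwarding set and a blocked set."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    forwarding, blocked = [], []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected: this link would close a ring.
            blocked.append((a, b))
        else:
            parent[root_a] = root_b
            forwarding.append((a, b))
    return forwarding, blocked


# Three switches cabled as a ring, plus a server attached to two of them.
links = [
    ("sw1", "sw2"),
    ("sw2", "sw3"),
    ("sw3", "sw1"),
    ("sw1", "srv"),
    ("sw2", "srv"),
]
forwarding, blocked = spanning_tree(links)
```

In this example two of the five links end up blocked. They stay available as spare paths, which is the redundancy the second cable was installed for.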
High-performance clusters need experience. We have been building clusters for more than 10 years for a wide range of applications. We started out euphoric, but disillusionment soon set in. We have learned to respect and fear the complexity of clusters and not to build anything fancy. We have put some critical failover processes back into the hands of the administrators, since the automatic failovers caused too much trouble. Currently, we believe that a short downtime until the administrator reacts to an unexpected database fault is better than two databases with inconsistent data after a split brain in the cluster. Fixing two inconsistent databases usually takes much longer than reacting to a Nagios broadcast message sent to the admins in charge.
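The cost of a split brain can be made concrete with a toy example. The two dictionaries below stand in for two database servers, and the order keys and values are invented for illustration. Both nodes accept writes while they cannot see each other, and afterwards someone has to decide by hand which version of each conflicting record is correct.

```python
def conflicting_keys(left, right):
    """Return the keys that exist on both nodes but hold different values."""
    shared = left.keys() & right.keys()
    return sorted(key for key in shared if left[key] != right[key])


# Both nodes start from the same replicated state.
node_a = {"order-1": "paid", "order-2": "open"}
node_b = dict(node_a)

# The link between the nodes breaks. An automatic failover promotes node_b
# while node_a is in fact still serving clients, so both accept writes.
node_a["order-2"] = "cancelled"
node_b["order-2"] = "shipped"
node_b["order-3"] = "open"

conflicts = conflicting_keys(node_a, node_b)
only_on_b = sorted(node_b.keys() - node_a.keys())
```

Here order-2 has two contradicting states and order-3 exists on one node only. Neither case can be resolved automatically without knowing what the application meant.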
High-performance clusters are indispensable. Our customers have virtually no downtime. Maintenance can be done on one part of the cluster. After this part has been tested for some days in production, the rest of the cluster can be maintained the same way. If the load on the cluster grows, additional servers are installed or rented from a cloud.
High-performance clusters are not a cloud. Or, the other way around: clouds are not based on high-performance clusters. Sure, clouds are also clusters and share resources. But the technology used is primitive compared to high-performance clusters, even though it is advertised otherwise. In most clouds, the virtual hard disk a virtual machine lives on is a file served by an NFS server. The NFS layer adds massive overhead when accessing the underlying physical hard disk. A high-performance cluster can also host VMs, as we do at inqbus hosting. At inqbus hosting, each VM on one of our CPU nodes has a true partition on our storage nodes, which are connected via InfiniBand. The storage nodes are replicated over a second InfiniBand network. But the true power of high-performance clusters comes not from virtualisation but from running the nodes on bare metal.