Clustering for scalability and failover

Clustering is an effective way to perform vertical and horizontal scaling of application servers.

Scalability pertains to the capability of a system to adapt readily to a greater or lesser intensity of use, volume, or demand. For example, a scalable system can efficiently adapt to work with larger or smaller networks performing tasks of varying complexity. Ideally, it is possible to handle any given load by adding more servers and machines, assuming each additional machine processes its fair share of client requests. Each machine should process a share of the total system load that is proportional to the processing power of the machine.
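The idea of a proportional share can be illustrated with a short sketch. It is not part of WebSphere; the function name `proportional_shares` and the relative capacity figures are hypothetical and serve only to show how a total load might be divided according to processing power.

```python
def proportional_shares(capacities, total_requests):
    """Split total_requests across machines in proportion to capacity.

    capacities: dict mapping a machine name to a relative processing
    power figure (hypothetical units, for illustration only).
    """
    total_capacity = sum(capacities.values())
    if total_capacity <= 0:
        raise ValueError("total capacity must be positive")

    # Ideal (possibly fractional) share for each machine.
    exact = {name: total_requests * cap / total_capacity
             for name, cap in capacities.items()}

    # Round down, then give the leftover requests to the machines
    # with the largest fractional remainders so the totals match.
    shares = {name: int(value) for name, value in exact.items()}
    leftover = total_requests - sum(shares.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - shares[n],
                          reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares
```

For example, a machine with twice the capacity of each of two others would be assigned half of the total load under this scheme.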

Using cluster members can improve the performance of a server, simplify its administration, and enable the use of workload management.

Vertical scaling (scale up topology)

In vertical scaling, as illustrated in Figure 2-3, multiple cluster members for an application server are defined on the same LPAR, or node, which might allow the LPAR's processing power to be more efficiently allocated.

Figure 2-3 Vertical scaling

Even if a single JVM can fully utilize the processing power of the machine, you might still want to have more than one cluster member on the machine for other reasons, such as using vertical clustering for process availability. If a JVM reaches a memory limit (or if there is some similar problem), then the presence of another process provides for failover.

Installing several application server instances on a single LPAR to create a vertical cluster can increase application throughput, provided that the LPAR has sufficient resources, such as CPU.

Vertical clusters are valuable for better utilization of the LPAR when the operating system otherwise constrains the availability of resources on a process boundary. For example, if a JVM process is pinned to a single processor on a symmetric multiprocessor (SMP) computer, introducing additional application server instances allows the application to use other CPUs on the same computer (presuming the additional instances are assigned to the other processors). Similarly, if the operating system limits the number of connections that can be formed to a single process, increasing the number of application server instances increases the number of effective connections to the computer.

Although vertical scaling can improve availability by creating multiple JVM processes, the LPAR itself remains a single point of failure. Therefore, the use of vertical clustering should not be viewed as a means of achieving high availability.
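A rough calculation shows why the LPAR caps the availability of a vertical cluster. The sketch below is a simplified model, not a measurement: it assumes that failures are independent and that the availability figures passed in (which are hypothetical) are known. The function names are illustrative only.

```python
def vertical_availability(lpar, jvm, members):
    """Availability of `members` JVMs that all run on one LPAR.

    The cluster is up only if the LPAR is up AND at least one JVM is up.
    Assumes independent failures (a simplification).
    """
    if members < 1:
        raise ValueError("members must be at least 1")
    all_jvms_down = (1.0 - jvm) ** members
    return lpar * (1.0 - all_jvms_down)


def horizontal_availability(lpar, jvm, lpars):
    """Availability of one JVM on each of `lpars` separate LPARs.

    A member is up only if both its LPAR and its JVM are up; the
    cluster is up if at least one member is up.
    """
    if lpars < 1:
        raise ValueError("lpars must be at least 1")
    member_up = lpar * jvm
    return 1.0 - (1.0 - member_up) ** lpars
```

Under this model, `vertical_availability` can never exceed the availability of the LPAR itself, no matter how many members are added, whereas `horizontal_availability` keeps improving as LPARs are added.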

Before implementing a vertical cluster, determine whether your applications are bound by CPU, by I/O, or by network issues. Avoid using rules of thumb when determining the number of cluster members for a given machine. The only way to determine what is correct for your environment and application is to tune a single instance of an application server for throughput and performance, then add it to a cluster and incrementally add cluster members. Ensure that you test performance and throughput as each member is added to the cluster. Always monitor memory usage when you are configuring a vertical scaling topology, and do not exceed the available physical memory on a machine.
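The incremental procedure described above might be sketched as follows. This is an illustration under stated assumptions, not a WebSphere tool: `measure_throughput` is a hypothetical callable that runs your own load test against a cluster with the given number of members and returns a throughput figure, and `min_gain` and `max_members` are arbitrary illustrative cut-offs. The memory check is deliberately crude, because it ignores memory used by the operating system and other processes.

```python
def find_member_count(measure_throughput, member_memory_mb,
                      physical_memory_mb, min_gain=0.05, max_members=16):
    """Add cluster members one at a time while throughput keeps improving.

    Returns (member_count, throughput) for the last configuration that
    both fit in physical memory and improved throughput by min_gain.
    """
    if member_memory_mb > physical_memory_mb:
        raise ValueError("a single member does not fit in physical memory")

    # Start from the tuned single instance.
    best_count = 1
    best_throughput = measure_throughput(1)

    for count in range(2, max_members + 1):
        # Never plan for more memory than the machine physically has.
        if count * member_memory_mb > physical_memory_mb:
            break
        throughput = measure_throughput(count)
        # Stop when an extra member no longer pays for itself.
        if throughput < best_throughput * (1 + min_gain):
            break
        best_count, best_throughput = count, throughput

    return best_count, best_throughput
```

In practice, each call to `measure_throughput` stands for a full performance test run, so the loop is a way of recording the decision rule rather than something to execute unattended.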

In general, CPU utilization of 85% or more on a large server indicates that there is little, if any, performance benefit to be gained from adding cluster members.

Note: We also have the flexibility of removing a resource from an LPAR if the application does not utilize it. The resource can be dynamically moved to another LPAR where required.

Horizontal scaling (scale out topology)

In horizontal scaling, as illustrated in Figure 2-4, cluster members are created on multiple physical machines (or LPARs). This allows a single WebSphere application to run on several machines, while still presenting a single system image, making the most effective use of the resources of a distributed computing environment.

Horizontal scaling is especially effective in environments that contain many small- to medium-sized LPARs; client requests that overwhelm a single LPAR can be distributed over several LPARs in the system.

Figure 2-4 Horizontal scaling

Failover is another important benefit of horizontal scaling. If a machine becomes unavailable, its workload can be routed to other machines containing cluster members.

Horizontal scaling can handle application server process failures and hardware failures (or maintenance) without significant interruption to client service. It is common to use similar machines to host the members of a cluster. This allows you to plan for future capacity needs in a linear fashion.
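The failover behavior described above can be illustrated with a toy router. This is not how WebSphere workload management is implemented; the class `ClusterRouter` and its methods are hypothetical and show only the general idea of routing around an unavailable cluster member with a simple round-robin policy.

```python
class ClusterRouter:
    """Round-robin router that skips cluster members marked as down."""

    def __init__(self, members):
        self._members = list(members)
        if not self._members:
            raise ValueError("at least one cluster member is required")
        self._available = set(self._members)
        self._next = 0  # index of the next member to try

    def mark_down(self, member):
        # Called when a machine or server process becomes unavailable.
        self._available.discard(member)

    def mark_up(self, member):
        # Called when a known member returns to service.
        if member in self._members:
            self._available.add(member)

    def route(self):
        """Return the next available member, or raise if none is left."""
        for _ in range(len(self._members)):
            member = self._members[self._next]
            self._next = (self._next + 1) % len(self._members)
            if member in self._available:
                return member
        raise RuntimeError("no cluster members available")
```

With three members and one marked down, successive calls to `route()` alternate between the two remaining members, so clients continue to be served while the failed machine is repaired.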

You can also use WebSphere Application Server Edge components, such as the Caching Proxy Edge component and the Load Balancer component set (which includes the Dispatcher component), to implement horizontal scaling.

Combining vertical and horizontal scaling

WebSphere applications can combine vertical and horizontal scaling to reap the benefits of both scaling techniques, as illustrated in Figure 2-5.

Figure 2-5 Vertical and horizontal clustering

As a rule of thumb for real-world applications, first use vertical scaling to improve performance, while carefully monitoring server-to-CPU and server-to-memory ratios. After performance is optimized, start using horizontal scaling to provide failover and redundancy support to maintain 24x7 uptime with the desired performance.