Building your site capacity to account for High Availability

Generally, the more excess capacity a site has, the higher its availability. Excess capacity translates into the number of redundant components that can fail without a noticeable drop in user experience, so that the site still meets its business capacity requirements. As always, the cost of the redundant components needed to maintain excess capacity, together with the business requirements, dictates what level of High Availability can be built into a site.
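
The relationship between excess capacity and tolerated failures can be illustrated with a minimal sketch. It assumes identical nodes whose capacities simply add up, which real sites rarely match exactly; the function name and the sample figures are hypothetical and are not taken from any product.

```python
import math


def tolerable_failures(node_count: int, node_capacity: float,
                       required_capacity: float) -> int:
    """Return how many nodes can fail while the rest still meet required_capacity.

    Assumes every node contributes the same capacity and that capacity
    scales linearly with the number of healthy nodes.
    """
    if node_capacity <= 0:
        raise ValueError("node_capacity must be positive")
    # Minimum number of healthy nodes needed to carry the required load.
    nodes_needed = math.ceil(required_capacity / node_capacity)
    # Nodes beyond that minimum are redundancy; never report a negative count.
    return max(0, node_count - nodes_needed)


# Hypothetical site: 6 nodes, 250 requests/s each, 1000 requests/s required.
print(tolerable_failures(node_count=6, node_capacity=250.0,
                         required_capacity=1000.0))
```

Under these assumptions, each additional redundant node raises the number of failures the site can absorb by one, at the cost of one more node to buy and run.
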
Expected peak capacity versus maximum business capacity

The maximum business capacity (TB) should also be higher than the expected peak capacity (TP). The expected peak capacity is based on the peak workload that the site expects. Any excess capacity over the peak capacity, C(B-P) (for example, the excess throughput TB - TP in the graph above), contributes toward the High Availability of the site.
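
As a rough illustration of the excess capacity TB - TP, the sketch below computes the headroom both as an absolute value and as a fraction of the expected peak. The function names and the sample throughput figures are hypothetical.

```python
def excess_capacity(business_max: float, expected_peak: float) -> float:
    """Return the headroom TB - TP.

    A negative result means the expected peak exceeds the maximum
    business capacity, so there is no excess capacity at all.
    """
    return business_max - expected_peak


def headroom_ratio(business_max: float, expected_peak: float) -> float:
    """Return the headroom as a fraction of the expected peak."""
    if expected_peak <= 0:
        raise ValueError("expected_peak must be positive")
    return excess_capacity(business_max, expected_peak) / expected_peak


# Hypothetical figures: TB = 1200 requests/s, TP = 1000 requests/s.
print(excess_capacity(1200.0, 1000.0))
print(headroom_ratio(1200.0, 1000.0))
```
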
Usually, if your peak workload drives your Web server CPU to >= 50% of its capacity, your database CPU to >= 60%, and your application server CPU to >= 70%, then it is time to scale your system. That is, your business capacity requirements for CPU consumption are 50%, 60%, and 70% for your Web server, database, and application server machines, respectively.
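
A threshold check of this kind could be sketched as follows. The 50%, 60%, and 70% limits come from the text; the tier labels (in particular the name of the third tier), the function name, and the choice to evaluate each tier independently are assumptions made for illustration.

```python
from typing import Dict, List, Mapping

# CPU limits (percent) from the text. The tier labels are hypothetical;
# the third one in particular is an assumed name for the 70% tier.
CPU_THRESHOLDS: Dict[str, float] = {
    "web_server": 50.0,
    "database": 60.0,
    "application_server": 70.0,
}


def tiers_needing_scale(peak_cpu: Mapping[str, float],
                        thresholds: Mapping[str, float] = CPU_THRESHOLDS
                        ) -> List[str]:
    """Return the tiers whose peak CPU utilization meets or exceeds its limit.

    Tiers missing from peak_cpu are treated as 0% and never flagged.
    """
    return [tier for tier, limit in thresholds.items()
            if peak_cpu.get(tier, 0.0) >= limit]


# Hypothetical peak-hour measurements, in percent CPU.
observed = {"web_server": 55.0, "database": 40.0, "application_server": 70.0}
print(tiers_needing_scale(observed))
```

Reporting the tiers individually makes it visible which part of the site has run out of headroom, rather than returning a single yes-or-no answer for the whole system.
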
Maximum business capacity versus maximum system capacity

Notice that the theoretical maximum system capacity is different from the business capacity, which is defined by the business requirements. These two terms are often confused with one another. The maximum business capacity should not equal the maximum system capacity. Otherwise, any good news (for example, demand that exceeds the workload the business estimated for the site) may turn into a nightmare.
Theoretical maximum system capacity does not occur at maximum throughput (TM). Instead, it occurs at the throughput and response time for which the rate of change of throughput with respect to response time, dT/dR, is at its maximum.

The maximum business capacity is a special case of this relationship. It can be derived by applying the constraints defined by the business requirements. In our case, for example, the constraint is the maximum allowed response time, R = RB.

The difference between the maximum system capacity (C) and the required business capacity (C'), that is, C - C', is the second buffer, or level of contingency, on top of the excess capacity defined in the previous section, C(B-P).
Maximum system capacity (C), like the required business capacity (C'), is usually determined experimentally, through a trial-and-error process of running various load tests and observing where the slope dT/dR changes.
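
One way to examine load-test results along these lines is to estimate dT/dR with finite differences between consecutive measurements. The sketch below is only an illustration: the function names and sample data are hypothetical, and reading the business capacity as the slope at R = RB is an interpretation of the text rather than a confirmed formula.

```python
from typing import List, Sequence, Tuple


def slopes(response_times: Sequence[float],
           throughputs: Sequence[float]) -> List[Tuple[float, float]]:
    """Estimate dT/dR between consecutive samples.

    Returns (midpoint response time, slope) pairs, ordered by response time.
    Intervals with identical response times are skipped.
    """
    pairs = sorted(zip(response_times, throughputs))
    result = []
    for (r0, t0), (r1, t1) in zip(pairs, pairs[1:]):
        if r1 == r0:
            continue  # no change in R, slope is undefined
        result.append(((r0 + r1) / 2, (t1 - t0) / (r1 - r0)))
    return result


def max_slope_point(response_times: Sequence[float],
                    throughputs: Sequence[float]) -> Tuple[float, float]:
    """Return the (midpoint response time, slope) where dT/dR is largest."""
    return max(slopes(response_times, throughputs), key=lambda p: p[1])


def slope_at(response_times: Sequence[float],
             throughputs: Sequence[float], r_limit: float) -> float:
    """Return the estimated dT/dR on the first interval containing r_limit."""
    pairs = sorted(zip(response_times, throughputs))
    for (r0, t0), (r1, t1) in zip(pairs, pairs[1:]):
        if r1 != r0 and r0 <= r_limit <= r1:
            return (t1 - t0) / (r1 - r0)
    raise ValueError("r_limit is outside the measured range")


# Hypothetical load-test samples: response time (s) and throughput (requests/s).
R = [1.0, 2.0, 3.0, 4.0, 5.0]
T = [10.0, 30.0, 70.0, 90.0, 95.0]
print(max_slope_point(R, T))  # interval with the steepest throughput gain
print(slope_at(R, T, 3.5))    # slope at an assumed business limit RB = 3.5 s
```

With real measurements the slope estimates are noisy, so several runs per load level, or some smoothing before differencing, would likely be needed before reading off where the slope changes.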
