Predicting performance and scaling behaviour in a data center with multiple application servers
As web pages become more user friendly and interactive, objects such as pictures, media files, CGI scripts and databases are used more frequently. This development puts increased stress on the servers through intensified CPU usage and a growing need for bandwidth to serve the content. At the same time, users expect low latency and high availability. This dilemma can be addressed by load balancing between the servers that serve content to the clients. Load balancing can provide high availability through redundant servers, and reduce latency by dividing the load. This paper describes a comparative study of different load balancing algorithms used to distribute packets among a set of identical web servers serving HTTP content. For packet redirection we use a Nortel Application Switch 2208, and the servers are hosted on six IBM blade servers. We compare three algorithms: Round Robin, Least Connected and Response Time. We look at properties such as response time, traffic intensity and traffic type, and ask how the algorithms perform when these variables change over time. If we can find correlations between traffic intensity and the efficiency of the algorithms, we might be able to suggest, in theory, how to create an adaptive load balancing scheme that uses the current traffic intensity to select the appropriate algorithm. We also examine how classical queueing models can be used to calculate expected response times, and whether these numbers agree with the experimental results. Our results indicate that there are measurable differences between the load balancing algorithms. We also found that our servers performed better than the queueing models predicted in most of the scenarios.
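To make the three algorithms concrete, the following is a minimal Python sketch of how each could select a server for an incoming request. It is an illustration under simplifying assumptions, not the logic of the Nortel switch, whose actual implementation may differ. The names `Server`, `RoundRobin`, `LeastConnected`, `ResponseTime` and `pick`, as well as the server names and numbers, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Server:
    """Hypothetical view of the state a load balancer keeps per server."""
    name: str
    active_connections: int = 0   # connections currently open to this server
    avg_response_ms: float = 0.0  # most recently measured response time


class RoundRobin:
    """Cycle through the servers in fixed order, ignoring their state."""

    def __init__(self):
        self._ticket = count()  # 0, 1, 2, ... one ticket per request

    def pick(self, servers):
        return servers[next(self._ticket) % len(servers)]


class LeastConnected:
    """Choose the server with the fewest open connections."""

    def pick(self, servers):
        return min(servers, key=lambda s: s.active_connections)


class ResponseTime:
    """Choose the server with the lowest measured response time.

    Assumption: a real switch may weight servers by response time
    rather than always taking the single fastest one.
    """

    def pick(self, servers):
        return min(servers, key=lambda s: s.avg_response_ms)


# Illustrative state for three servers (made-up numbers).
servers = [
    Server("blade1", active_connections=4, avg_response_ms=12.0),
    Server("blade2", active_connections=1, avg_response_ms=30.0),
    Server("blade3", active_connections=7, avg_response_ms=8.0),
]

rr = RoundRobin()
rr_picks = [rr.pick(servers).name for _ in range(4)]
lc_pick = LeastConnected().pick(servers).name
rt_pick = ResponseTime().pick(servers).name
```

With this state, Round Robin visits the servers in order regardless of load, while Least Connected and Response Time can disagree: the server with the fewest connections is not necessarily the one answering fastest. This is the kind of difference the comparison in this paper is concerned with.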