Institut für Informatik Lehrstuhl 7
Kai-Steffen Hielscher 22/01/02

Cluster-based Webserver Laboratory

Our cluster consists of 13 PCs that were built from commodity off-the-shelf components. Together they serve as a distributed webserver running under Linux with kernel modifications from the Linux Virtual Server Project. The primary goal of this project is to provide a model of a distributed webserver that can be verified and validated against measurements that supply real-world data.

Two of the PCs are used as load balancers, ten work as real servers on which the HTTP server software runs, and one is used for clock synchronization.

The cluster is connected to the internet using three 1 Gbit/s Ethernet interfaces. Incoming traffic is routed to one of the real servers by one of the load balancers. The other load balancer is a standby PC that monitors the state of the active load balancer in order to detect failures and replace the defective machine. The route taken by the responses of the real servers depends on the technology used for load distribution in the cluster and thus on the topology of the network. We can realize three different methods for this purpose: NAT (Network Address Translation), Tunneling or Direct Routing.

All of the load balancing methods used have in common that the webserver is reachable under one virtual IP address (VIP). In this way we avoid several drawbacks that arise when DNS-based load balancing is used. While DNS-based load balancing has advantages for global load balancing, a routing-based approach like the Linux Virtual Server solution is more suitable for local load balancing, where all nodes of a distributed web server are located at the same physical place (i.e. the server is not geographically distributed).

NAT
When NAT is used for load balancing, packets with the VIP as destination IP address are delivered to the load balancer, because the VIP is the IP address of the load balancer's external network interface. The load balancer changes the destination IP address of incoming requests to point to one real server determined by a scheduling algorithm. This can be accomplished using standard NAT methods. The real server processes the request and passes the responses back to the load balancer, which in turn changes the source IP address of the answer packets to its own IP address. The load balancer works as the default gateway for all real servers. Both requests and responses pass the load balancer, and the IP addresses of both types of packets must be changed using NAT. This method is therefore referred to as double rewriting.

[Figure: NAT]

The advantage of using NAT is that the real servers can run any operating system without modification and can use private IP addresses, so that only one public IP address is needed: the VIP assigned to the load balancer. The disadvantage of this approach is that all traffic has to pass the load balancer twice, once as a request and once as a response. The load on the load balancer can therefore become high, and the number of real servers should not exceed a certain limit so as not to degrade the performance of the system. All machines have to be in one (logical) IP network, preferably with private addresses (e.g. 192.168.x.0/24).
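
The double rewriting described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Linux Virtual Server implementation: the `Packet` and `NatBalancer` names, the addresses and the round-robin scheduler are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, replace
from itertools import cycle

VIP = "203.0.113.10"                              # hypothetical public virtual IP
REAL_SERVERS = ["192.168.1.11", "192.168.1.12"]   # hypothetical private addresses


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str


class NatBalancer:
    """Rewrites both requests and responses (double rewriting)."""

    def __init__(self, vip, real_servers):
        self.vip = vip
        self._next_server = cycle(real_servers)   # round robin, an assumed scheduler
        self.connections = {}                     # client address -> real server

    def inbound(self, pkt):
        # First rewrite: replace the VIP with the address of a real server.
        # A client stays on the same real server once it has been assigned.
        if pkt.src not in self.connections:
            self.connections[pkt.src] = next(self._next_server)
        return replace(pkt, dst=self.connections[pkt.src])

    def outbound(self, pkt):
        # Second rewrite: the client must see the VIP as the source address.
        return replace(pkt, src=self.vip)


def real_server_reply(request):
    # The real server answers from its own private address.
    return Packet(src=request.dst, dst=request.src, payload="200 OK")


lb = NatBalancer(VIP, REAL_SERVERS)
request = Packet(src="198.51.100.7", dst=VIP, payload="GET /")
forwarded = lb.inbound(request)
response = lb.outbound(real_server_reply(forwarded))
print(forwarded.dst, response.src)
```

Both `inbound` and `outbound` run on the load balancer, which is why its load grows with every additional real server.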

Tunneling
The tunneling method requires all real servers to have tunneling interfaces set up with the VIP as their IP address. One interface of the load balancer also has the VIP assigned to it. Since the tunneling interfaces do not respond to ARP requests, the MAC address of this load balancer interface will be in the ARP table of the router connecting the system to the internet. For this reason incoming requests arrive at the load balancer, which encapsulates them in an IP packet with the destination IP address of one of the real servers. There the request is decapsulated and processed by the server software. The answers can be passed back to the client without any modification by the load balancer, because the real server can fill in the VIP as the source address of the response packets since it is assigned to the tunneling interface. This method is called a single rewriting method.

[Figure: Tunneling]
 
Because the response packets are routed directly to the client, this architecture can handle a large number of requests per second and can be used with a large number of real servers for high performance applications. However, the real server operating system needs to support IP tunneling. The real servers can be geographically distributed.
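
The encapsulation step can be sketched as follows. This is a simplified model, not kernel code: the `IpPacket` type, the function names and all addresses are hypothetical, and real IP-in-IP tunneling adds a full outer IP header rather than a Python object.

```python
from dataclasses import dataclass

VIP = "203.0.113.10"   # hypothetical virtual IP


@dataclass(frozen=True)
class IpPacket:
    src: str
    dst: str
    payload: object    # application data or an encapsulated IpPacket


def encapsulate(inner, balancer_ip, real_server_ip):
    """Load balancer: wrap the unmodified request in an outer IP packet."""
    return IpPacket(src=balancer_ip, dst=real_server_ip, payload=inner)


def decapsulate(outer):
    """Real server: strip the outer header to recover the original request."""
    return outer.payload


def respond(request):
    """Real server: answer the client directly.

    The request was addressed to the VIP, which is assigned to the
    tunneling interface, so the VIP is used as the source address.
    """
    return IpPacket(src=request.dst, dst=request.src, payload="200 OK")


request = IpPacket(src="198.51.100.7", dst=VIP, payload="GET /")
outer = encapsulate(request, balancer_ip="192.0.2.1", real_server_ip="192.0.2.21")
inner = decapsulate(outer)
response = respond(inner)
print(outer.dst, response.src, response.dst)
```

The inner packet reaches the real server unchanged, and the response never touches the load balancer.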

Direct Routing
Here the VIP is assigned to one interface of the load balancer and to alias devices for the network devices of the real servers. The alias interfaces must not answer ARP requests. For this purpose, a special patch has to be applied to modern Linux kernels. Since the MAC address of the load balancer's interface is the one in the ARP table entry for the VIP in the internet router, request packets arrive at the load balancer first. The load balancer has neither to rewrite nor to encapsulate the packet. It passes the request to the real server determined by the scheduling algorithm by inserting the MAC address of this real server as the destination of the generated link layer frame (e.g. Ethernet frame). When the packet arrives at the real server, it is accepted there because of the alias carrying the VIP. The responses generated by the server software can be sent to the client using the VIP as the source address without having to pass the load balancer a second time.

[Figure: Direct Routing]
 
This method of load balancing works without the tunneling overhead and gives the highest performance of the three mechanisms described. There is no real packet rewriting involved; only the translation of IP addresses to MAC addresses is changed dynamically. One drawback is that all machines must be in the same physical network segment.
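
A minimal sketch of the forwarding step in direct routing, assuming a simplified `Frame` type that carries only the fields relevant here. The type, the helper functions and all MAC and IP addresses are hypothetical and serve only to show that the IP header stays untouched.

```python
from dataclasses import dataclass, replace

VIP = "203.0.113.10"               # hypothetical virtual IP
BALANCER_MAC = "02:00:00:00:00:01" # hypothetical MAC addresses
SERVER_MAC = "02:00:00:00:00:21"


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_mac: str
    ip_src: str
    ip_dst: str
    payload: str


def forward_direct(frame, balancer_mac, server_mac):
    """Load balancer: change only the link layer addresses of the frame."""
    return replace(frame, dst_mac=server_mac, src_mac=balancer_mac)


def accepts(frame, own_mac, local_addresses):
    """Real server: accept the frame if MAC and destination IP are local.

    The VIP is local because it is assigned to an alias device.
    """
    return frame.dst_mac == own_mac and frame.ip_dst in local_addresses


incoming = Frame(dst_mac=BALANCER_MAC, src_mac="02:00:00:00:00:fe",
                 ip_src="198.51.100.7", ip_dst=VIP, payload="GET /")
forwarded = forward_direct(incoming, BALANCER_MAC, SERVER_MAC)
print(forwarded.dst_mac, forwarded.ip_dst)
print(accepts(forwarded, SERVER_MAC, {"192.0.2.21", VIP}))
```

Because only MAC addresses change, the load balancer and the real servers have to share one network segment, which matches the drawback named above.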

Time Synchronization
We can obtain event traces either by hybrid monitoring with the ZM4 system developed at the chair (100 ns resolution) or by distributed software monitoring. Since the hardware monitor is more suitable for high precision, low volume measurements, most of the timestamps for performance analyses of the webserver are obtained using software monitoring. We use a combination of NTP and the PPS API to synchronize the clocks of the object system to the time obtained from one of our four GPS receivers. We found that reading the time directly from the GPS clock was far too slow for our purposes, and that NTP over the network alone does not synchronize our machines' clocks accurately enough. By feeding the PPS (pulse per second) signal generated by our GPS receiver to the serial port of all PCs and using the PPS Linux kernel modifications together with NTP, we can improve the accuracy of the synchronization. [Details]
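
The reason synchronized clocks matter for distributed software monitoring can be shown with a short sketch: traces recorded on different nodes can only be merged into one global event trace if their timestamps are comparable. The trace format (timestamp in microseconds, node name, event text), the node names and the values below are hypothetical and do not represent measured data or the tools used in the laboratory.

```python
import heapq


def merge_traces(*traces):
    """Merge per-node traces, each already sorted by timestamp, into one trace."""
    return list(heapq.merge(*traces, key=lambda event: event[0]))


# Hypothetical traces: (timestamp in microseconds, node, event)
lb_trace = [
    (1_000_010, "lb", "request 1 forwarded"),
    (1_000_400, "lb", "request 2 forwarded"),
]
rs_trace = [
    (1_000_120, "rs3", "request 1 received"),
    (1_000_800, "rs3", "response 1 sent"),
]

merged = merge_traces(lb_trace, rs_trace)
forwarding_delay = merged[1][0] - merged[0][0]   # only meaningful with synchronized clocks

for timestamp, node, event in merged:
    print(timestamp, node, event)
print("forwarding delay in microseconds:", forwarding_delay)
```

If the clock offset between two nodes were larger than the delays being measured, the merged order and the computed delay would be wrong, which is why NTP over the network alone is not sufficient here.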

The key facts about the hardware used are listed in the following tables:

Components
2     Load Balancing Nodes (Master-Nodes)
10    Real Servers (Slave-Nodes)
2     Fast Ethernet Switches
1     GPS receiver system

Load Balancers
2     Intel Pentium III CPUs with 1 GHz
1     Dual-Processor Mainboard with support for 64-bit PCI slots
512   MByte Main Memory
2     Gigabit Ethernet adapters (1000-Base-FX), 64-bit PCI, for routing the traffic
1     Fast Ethernet adapter (100-Base-TX), onboard, for management purposes

Real Servers
1     CPU AMD Athlon Thunderbird 900 MHz
256   MByte Main Memory
2     Fast Ethernet adapters (100-Base-TX)

Main Ethernet Switch
Cisco Catalyst 3500XL with 2 1000-Base-FX GBIC modules

GPS receiver system
1     roof-mounted GPS antenna
1     lightning protection
1     GPS antenna splitter for distributing the signal to four receivers
4     Meinberg GPS167PCI receiver cards

We offer several topics for Master's theses, Studienarbeiten and Diplomarbeiten in our cluster-based webserver laboratory.

Contact
Dipl.-Inf. Kai-Steffen Hielscher