Monday, 3 July 2017

Keep alive

What is KeepAlive?

HTTP is a stateless protocol. A connection is made to transfer a single file and closed once the transfer is complete. This keeps things simple, but it is not very efficient.
To improve efficiency something called KeepAlive was introduced. With KeepAlive the web browser and the web server agree to reuse the same connection to transfer multiple files.

Advantages of KeepAlive

  • Improves website speed: It reduces latency associated with HTTP transfers and delivers a better user experience.
  • Reduces CPU usage: On the server side, enabling KeepAlive reduces CPU usage. Consider that a typical web page has dozens of different files such as images, stylesheets, JavaScript files, etc. If KeepAlive is disabled, a separate connection must be made for each of those files. Creating and closing connections has an overhead, and doing it for every single file wastes CPU time.

Disadvantages of Keepalive

  • Increases memory usage: Enabling KeepAlive increases memory usage on the server. Apache processes have to keep connections open while waiting for new requests from established connections. While they are waiting they are occupying RAM that could be used to service other clients. If you turn off KeepAlive, fewer Apache processes will remain active. This will lower memory usage and allow Apache to serve more users.

When should you enable KeepAlive?

Deciding whether to enable KeepAlive or not depends on a number of different factors:
  • Server resources: How much RAM and how much CPU power do you have? RAM is often the biggest limiting factor in a web server. If you have little RAM you should turn off KeepAlive, because having Apache processes hanging around while they wait for more requests from persistent connections is a waste of precious memory. If CPU power is limited, then you want KeepAlive on because it reduces CPU load.
  • Types of sites: If you have pages with a lot of images or other files linked into them, KeepAlive will improve the user experience significantly. This is because a single connection will be used to transfer multiple files.
  • Traffic patterns: The type of traffic you get also matters. If your web traffic is spread out evenly throughout the day, then you should turn on KeepAlive. On the other hand, if you have bursty traffic where a lot of concurrent users access your sites during a short time period, KeepAlive will cause your RAM usage to skyrocket, so you should turn it off.

Configure Apache KeepAlive settings

Open up Apache's configuration file and look for the following settings. On CentOS this file is called httpd.conf and is located in /etc/httpd/conf. The following settings are noteworthy:
  • KeepAlive: Switches KeepAlive on or off. Put in “KeepAlive On” to turn it on and “KeepAlive Off” to turn it off.
  • MaxKeepAliveRequests: The maximum number of requests a single persistent connection will service. A number between 50 and 75 would be plenty.
  • KeepAliveTimeout: How long the server should wait for new requests from connected clients. The default is 15 seconds, which is far too high. Set it to between 1 and 5 seconds to avoid processes wasting RAM while waiting for requests.
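For example, after tuning, the relevant lines in httpd.conf might look like the following (the values shown are just a reasonable starting point consistent with the guidelines above, not a universal recommendation):

KeepAlive On
MaxKeepAliveRequests 75
KeepAliveTimeout 3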

Other settings

KeepAlive affects other settings in your Apache configuration file even though they are not directly related. Here are the relevant settings on an Apache prefork MPM web server:
  • MaxClients: MaxClients is the maximum number of child processes launched by Apache to service incoming requests. With KeepAlive enabled you will have a higher number of child processes active during peak times, so your MaxClients value may have to be increased.
  • MaxRequestsPerChild: The number of requests a child process will serve before it is killed and recreated. This is done to prevent memory leaks. When KeepAlive is turned on each persistent connection will count as one request. That effectively turns MaxRequestsPerChild into a maximum connections per child value. As a result you can set a lower MaxRequestsPerChild value if you allow KeepAlive. If you don’t allow KeepAlive you should increase the MaxRequestsPerChild value to prevent excessive CPU usage.

Conclusion

There is no one universal solution to tuning Apache. It all depends on the resources at your disposal and the type of sites you have. When used properly KeepAlive can improve the user experience at minimal cost in terms of server resources. But it can also be a liability when you are faced with a large number of concurrent users.

Tuesday, 27 June 2017

What does it mean to say Apache spawns a thread per request, but node.js does not?

A thread is a context of program execution. Programs that are single-threaded can only do one thing at a time, while multi-threaded programs can do many things at once.
Think of it like a kitchen at a restaurant. A single chef can really only do one task at a time, be that chopping onions or putting something in an oven. If an order comes in that requires lots of work from the chef (such as making salads vs. putting stuff in the oven and waiting), some meals may get delayed because that chef is busy. On the other hand, if that chef just has to bake a bunch of stuff, there isn't much work for him to do and he can make other meals while waiting for the food in the oven to be done.
With multiple chefs, many of these tasks can be done simultaneously. Many meals can be prepared simultaneously.
Apache's threading model is like hiring a fixed number of chefs (regardless of how many customers your restaurant has that night), and each chef can only work on one meal at a time. That means that if a meal order comes in, a dedicated chef is assigned to that meal. There will be times when that chef is busy chopping up ingredients and mixing cake batter, but there will also be times when he's just standing around waiting for the potatoes to boil. At any given time, you could have most of your chefs sitting idle, waiting on potatoes to boil and cake to bake, and no more orders will be worked on, since each chef is dedicated to one order at a time.
To make matters worse, your kitchen is only as big as you can afford to make it. Each chef takes up space and resources, and you may have a situation where a bunch of chefs standing around holding the only spoons available are preventing other chefs from getting their food made.
Nginx is another web server (often used as a proxy) that you didn't ask about, but I'm including it to explain another threading model. It also hires a fixed number of chefs, but it hires fewer of them. Each chef can work on multiple meals at a time. So, if they're waiting on potatoes to boil while an order comes in for a chopped salad, they can go work on that salad instead of standing around idle. You can have a smaller kitchen (relative to the size of restaurant/number of customers) and get the same number of meals out, or more. It's a tight crew that is effective at not wasting time and resources.
Node.js is a bit different. It is single-threaded from a JavaScript perspective, but other tasks like disk and network IO are handled on separate threads automatically. It's like having a kitchen with only one chef, but that makes sense in some cases. If your kitchen has a lot of busy work for that chef, perhaps it makes sense to hire more chefs to do work. (To do this in Node.js, you can only spawn more processes, which is effectively like building a bunch of small kitchens right next to each other. You can have one guy standing out front coordinating the orders for all those kitchens.) However, if you're just a bakery (mainly just IO, with little busy-work for the chef), maybe you only need one chef.
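Stepping out of the kitchen analogy for a moment, the two models can be sketched in a few lines of Python (used here purely as a neutral illustration, since the question is about Apache and Node.js): a thread-per-connection server versus a single-threaded event loop that juggles many connections.

# Thread-per-connection model (roughly how Apache's threaded MPMs behave):
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    pass  # a new thread is spawned for every incoming connection

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"served by a dedicated thread\n")

# ThreadedHTTPServer(("127.0.0.1", 8080), Handler).serve_forever()

# Event-loop model (roughly how Node.js behaves): one thread, many sockets.
import asyncio

async def handle(reader, writer):
    await reader.read(1024)                     # waiting here does not block other connections
    writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 8081)
    async with server:
        await server.serve_forever()

# asyncio.run(main())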
To sum all this up, different threading models are used to divide work and process it effectively. Which threading model makes sense depends on your needs, and the other characteristics of the server you are choosing.

Saturday, 24 June 2017

Latency and Bandwidth Explained


Latency
  • The term latency refers to any of several kinds of delays typically incurred in the processing of network data. A low-latency network connection is one that experiences small delay times, while a high-latency connection suffers from long delays. Besides propagation delays, latency may also involve transmission delays (properties of the physical medium) and processing delays (such as passing through proxy servers or making network hops on the internet).

So what exactly is bandwidth, anyway?

Essentially, it’s just a term to quantify the rate of traffic and data allowed to flow between users and your site via the internet.
The term “bandwidth” is loosely used to describe “data transfer”, but in reality these are two different things.
Bandwidth is the maximum amount of data that can be transferred in a given time, usually measured per second. Data transfer, on the other hand, is the amount of data actually transferred; bandwidth is the rate of that transfer. You can imagine bandwidth as the width of a water pipe, where data transfer is the amount of water flowing out of the pipe. The width of the pipe (bandwidth) determines how fast the water (data) can flow. Fundamentally, data transfer is the consumption of bandwidth.
To site owners, the amount of bandwidth that a hosting company offers can typically serve as a good indicator of that host's capabilities: the higher the bandwidth, the better the speed, network, connectivity, and systems.

How to Calculate the Bandwidth you Need

Basic Bandwidth Calculation / Guesstimation

With bandwidth, it doesn't make sense to over-purchase, which is why it makes sense to work with hosting providers who offer scalable solutions. Buying too little, on the other hand, will only get you into trouble. Know your actual need to get the service that works for you. Here's how to calculate your required bandwidth:
  1. Estimate the average page size of your site in kilobytes (KB).*
  2. Multiply that average page size (in KB) by the monthly average number of visitors.
  3. Multiply the result from step 2 by the average number of pageviews per visitor.

Needed Website Bandwidth + Redundancy (without user downloads)

To do this calculation, use the following formula:
Bandwidth needed = Average Page Views x Average Page Size x Average Daily Visitors x Number of days in a month (30) x Redundancy Factor
  • Average Daily Visitors: The total number of monthly visitors / 30.
  • Average Page Size: The average size of your web pages.
  • Average Page Views: The average number of pages viewed per visitor.
  • Redundancy Factor: A safety factor, ranging from 1.3 to 1.8.

Needed Website Bandwidth + Redundancy (with user downloads)

If your site offers or allows user downloads:
Bandwidth needed = [(Average Page Views x Average Page Size x Average Daily Visitors) + (Average Downloads per Day x Average File Size)] x Number of days in a month (30) x Redundancy Factor
  • Average Daily Visitors: The total number of monthly visitors / 30.
  • Average Page Size: The average size of your web pages.
  • Average Page Views: The average number of pages viewed per visitor.
  • Average File Size: The total file size divided by the number of files.
  • Redundancy Factor: A safety factor, ranging from 1.3 to 1.8.
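As a worked example (every number below is made up purely for illustration), here is a small Python sketch that evaluates the formulas above:

def monthly_bandwidth_kb(avg_page_views, avg_page_size_kb, avg_daily_visitors,
                         redundancy=1.5, avg_daily_downloads=0,
                         avg_file_size_kb=0, days=30):
    # Both formulas above in one function: the download terms are simply
    # zero when the site does not offer user downloads.
    daily_kb = (avg_page_views * avg_page_size_kb * avg_daily_visitors
                + avg_daily_downloads * avg_file_size_kb)
    return daily_kb * days * redundancy

# 1,000 daily visitors, 4 page views per visit, 500 KB average page size:
kb = monthly_bandwidth_kb(4, 500, 1000)
print(f"{kb / 1024 / 1024:.1f} GB per month")    # roughly 86 GB per month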

Friday, 23 June 2017

Apache Concurrent connection configuration

The default concurrent connection setting for Apache is 256. Consider the following worker MPM configuration:
ServerLimit 16
StartServers 2
MaxClients 200
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25

First of all, whenever Apache is started, it will start 2 child processes, as determined by the StartServers parameter. Each process will then start 25 threads, determined by the ThreadsPerChild parameter, which means these 2 processes can service only 50 concurrent connections/clients (25x2=50). If more concurrent users arrive, another child process will start that can service another 25 users.
How many child processes can be started in total is controlled by the ServerLimit parameter. With the configuration above, I can have 16 child processes in total, each handling 25 threads, for 16x25=400 concurrent users. But if the number defined in MaxClients is lower, which is 200 here, then after 8 child processes no extra process will start, since MaxClients defines the upper cap. This also means that if I set MaxClients to 1000, no extra process will start after 16 child processes and 400 connections, and we cannot service more than 400 concurrent clients even if we increase the MaxClients parameter. In that case, we also need to increase ServerLimit to MaxClients/ThreadsPerChild = 1000/25 = 40. So this is the optimized configuration to serve 1000 clients:
<IfModule mpm_worker_module>
    ServerLimit          40
    StartServers          2
    MaxClients          1000
    MinSpareThreads      25
    MaxSpareThreads      75 
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
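As a quick sanity check of the arithmetic above, here is a small Python sketch (illustrative only) that computes the effective concurrency for a worker MPM configuration:

def worker_capacity(server_limit, max_clients, threads_per_child):
    # Apache will never run more child processes than ServerLimit,
    # and never more total threads than MaxClients.
    children_allowed_by_clients = max_clients // threads_per_child
    effective_children = min(server_limit, children_allowed_by_clients)
    return effective_children * threads_per_child

print(worker_capacity(16, 200, 25))    # 200  (MaxClients is the cap)
print(worker_capacity(16, 1000, 25))   # 400  (ServerLimit is the cap)
print(worker_capacity(40, 1000, 25))   # 1000 (the tuned configuration above)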

How does Software and Hardware Load Balancer Work? (Loadbalancer Algorithms Explained with Examples)

When you have an enterprise application or website that gets a lot of hits, your server might be under heavy load. In that case, you may want to consider distributing the load across multiple servers.
A load balancer distributes the workload of your system across multiple individual systems, or groups of systems, to reduce the amount of load on any individual system, which in turn increases the reliability, efficiency, and availability of your enterprise application or website.

In this article, we'll cover the basics of software and hardware load balancers, and explain the various algorithms they use.
The following are the advantages of load balancing your application:
  • Reduces the workload on an individual server.
  • More work can be done in the same amount of time because of concurrency.
  • Increased performance of your application because of faster responses.
  • No single point of failure. In a load balanced environment, if a server crashes, the application is still up and served by the other servers in the cluster.
  • When an appropriate load balancing algorithm is used, it brings optimal and efficient utilization of resources, as it eliminates the scenario where some servers' resources are used more heavily than others.
  • Scalability: We can increase or decrease the number of servers on the fly without bringing down the application.
  • Load balancing increases the reliability of your enterprise application.
  • Increased security, as the physical servers and IPs are abstracted in certain cases.
On a high level, there are two types of load balancers, which implement different types of scheduling algorithms and routing mechanisms:
  1. Software load balancer
  2. Hardware load balancer

I. Software Load Balancers

Software load balancers generally implement a combination of one or more scheduling algorithms.
The following are the three basic algorithms used by load balancers. Most modern load balancers use a combination of these algorithms to reach high performance and to strike a trade-off between various parameters.

1. Weighted Scheduling Algorithm

Work is assigned to each server according to the weight assigned to that server. Different weights are assigned to the different types of servers in the group, and the load gets distributed accordingly.
The diagram below depicts a generic scenario where a load balancer is used, and how weighted scheduling works when a total of 10 requests (R1, R2, ..., R10) arrive at the server farm/cluster.
As you can see, each request is assigned to a server according to that server's weight. Administrators choose these weights based on the hardware capabilities of each server. Assume that we have different weights assigned to each server, as you can see in the figure below.
The load balancer will compute the percentage of traffic to be sent to a particular server according to the weight assigned to it.
When is this algorithm mostly used? It is used when there is a considerable difference between the capabilities and specifications of the servers present in the farm or cluster.
This algorithm is efficient at managing the load without overwhelming the low-capability servers, while efficiently utilizing the available server resources at any instant.
Fig: Load Balancer – Weighted Scheduling Algorithm
The following points can be noted in the above diagram:
  • A load balancer can be one of two types: a hardware load balancer or a software load balancer.
  • Software load balancers are often installed on the servers themselves and consume some of the servers' processor and memory. So, in the diagram above, the software load balancer overlaps the server farm.
  • Hardware load balancers are specialized hardware deployed between the servers and the clients. They can be switching/routing hardware, or even a dedicated system running load balancing software with specialized capabilities.
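To make this concrete, here is a small Python sketch of one simple way to implement weighted scheduling, weighted random selection. The server names and weights are made up for illustration; real load balancers often use deterministic weighted round robin instead.

import random

# Hypothetical servers and weights; a heavier weight receives more traffic.
SERVERS = {"s1": 5, "s2": 3, "s3": 2}

def pick_weighted():
    # random.choices performs weighted random selection.
    names, weights = zip(*SERVERS.items())
    return random.choices(names, weights=weights, k=1)[0]

# Over many requests, s1 gets about 50%, s2 about 30%, s3 about 20%.
counts = {name: 0 for name in SERVERS}
for _ in range(10_000):
    counts[pick_weighted()] += 1
print(counts)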

2. Round Robin Scheduling

Requests are served by the servers sequentially, one after another. After sending a request to the last server, the load balancer starts from the first server again.
The diagram below depicts this approach. Each request is assigned to each server in turn, one by one, and the cycle repeats. The change in which server each request is assigned to can be easily understood by looking at the diagram below.
This algorithm is used when the servers are of equal specification and there are not many persistent connections.
Fig: Load Balancer – Round Robin Algorithm
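The selection logic itself is tiny. A Python sketch (server names are illustrative only):

import itertools

SERVERS = ["s1", "s2", "s3"]
rotation = itertools.cycle(SERVERS)

def pick_round_robin():
    # Each call returns the next server in the cycle: s1, s2, s3, s1, ...
    return next(rotation)

print([pick_round_robin() for _ in range(7)])
# ['s1', 's2', 's3', 's1', 's2', 's3', 's1']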

3. Least Connection First Scheduling

Requests are sent first to the server that is currently handling the least number of persistent connections.
In the diagram above, if we are using the least connection first scheduling algorithm, request R5 could be assigned to any of the servers, because at the moment R5 arrives every server is handling the same number of connections.
Let's say that when request R5 comes in, request R3 has completed and server S3 is now free. In that case, request R5 will be assigned to server S3 instead of any other server, as server S3 has the least number of connections at that instant.
When is this algorithm used? When the traffic contains a large number of persistent connections that are unevenly distributed between the servers. It is often coupled with sticky sessions or session-aware load balancing, in which all the requests related to a session are sent to the same server to maintain session state and synchronization.
This approach is used when we have session-aware write operations that must stay in sync between the client and the server, so that inconsistencies are avoided.
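The core of the selection logic is just a minimum over the current connection counts. A Python sketch (the counts are hypothetical; a real balancer updates them as connections open and close):

ACTIVE_CONNECTIONS = {"s1": 12, "s2": 7, "s3": 9}

def pick_least_connections():
    # Choose the server that currently has the fewest open connections.
    return min(ACTIVE_CONNECTIONS, key=ACTIVE_CONNECTIONS.get)

server = pick_least_connections()     # "s2" with the counts above
ACTIVE_CONNECTIONS[server] += 1       # the new request is now assigned to it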
Load balancing software can also combine the three basic scheduling algorithms above in smarter ways. Examples of such implementations are weighted round robin scheduling and weighted least connection scheduling.
Many hybrid scheduling algorithms for load balancing have evolved using variations or combinations of the algorithms above.

Software Load Balancer Examples

The following are few examples of software load balancers:
  1. HAProxy – A TCP load balancer.
  2. NGINX – An HTTP load balancer with SSL termination support.
  3. mod_athena – An Apache-based HTTP load balancer.
  4. Varnish – A reverse proxy based load balancer.
  5. Balance – An open source TCP load balancer.
  6. LVS – Linux Virtual Server, offering layer 4 load balancing.

II. Hardware Load Balancers

Hardware load balancers are often specialized routers or switches deployed between the clients and the servers. They can also be dedicated systems placed between the client and the server to balance the load.
Hardware load balancers are implemented at Layer 4 (the transport layer) and Layer 7 (the application layer) of the OSI model, so the most prominent of these devices are L4-L7 routers.

1. Layer4 Hardware Load Balancing

These load balancers work at the transport layer of the OSI model and use details from the TCP, UDP, and SCTP transport layer protocols to decide which server the data should be sent to.
Layer 4 load balancers are mostly network address translators (NATs) that share the load among the servers they translate to. These routers hide multiple servers behind them and rewrite every response packet coming from a server so that it appears to come from the same IP address.
Similarly, when a request comes in, they reverse-translate it using the mapping table and distribute it among the multiple servers.
DNS load balancing: In DNS-based load balancing, the Domain Name Servers are configured to return different IP addresses for different systems. This approach creates a load balancing effect whenever there is a DNS lookup.
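A client-side view of DNS load balancing can be sketched in Python: when a hostname resolves to several addresses, picking one at random spreads clients across the listed servers (the hostname below is only an example):

import random
import socket

def resolve_and_pick(hostname, port=80):
    # getaddrinfo returns every address record the DNS server hands back.
    records = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    family, socktype, proto, canonname, sockaddr = random.choice(records)
    return sockaddr[0]

# print(resolve_and_pick("example.com"))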
The diagram below depicts a high-level overview of how Layer 4 and Layer 7 load balancers work and where their techniques sit in the OSI layers.
Fig: Load Balancer – Layer 4 and Layer 7
Direct routing: This is yet another configuration of hardware load balancing, where the routers are aware of the servers' MAC addresses and the servers may have ARP (Address Resolution Protocol) disabled.
It is called direct routing because all the incoming traffic is routed by the load balancer, while all the outgoing traffic reaches the client directly, which makes it a very fast load balancing configuration.
IP tunneling looks similar to direct routing in that the response is sent directly to the client, but the traffic between the router and the server can itself be routed.
In this setup, the client sends the request to the virtual IP of the load balancer, which encapsulates the IP packets, keeps a hash table, and distributes them to the different servers according to the configured load balancing technique.
When the server receives the request, it decapsulates it and sends the response directly back to the client. The corresponding record is eventually removed from the hash table when the connection is closed or there is a timeout.

2. Layer7 Hardware Load Balancing

This type of load balancer makes its decisions according to the actual content of the message (URLs, cookies, scripts), since HTTP lives at layer 7.
These layer 7 devices form an ADN (Application Delivery Network) and pass requests on to the servers according to the type of content.
For example, requests for images will go to an image server, requests for PHP scripts may go to another server, static content such as HTML, JS, and CSS may go to another one, and requests for media content may go to yet another server.
So here, a load balancing effect is achieved by distributing load according to the type of content requested.
For this, it is also very helpful to understand the fundamentals of the TCP/IP protocol suite and its different layers.
The diagram below depicts a Layer7 load balancer.
Fig: Load Balancer – Layer 7
Layer 7 load balancing uses the following three techniques:
  1. URL parsing: This tells the load balancer what type of content is being requested.
  2. Cookie sniffing: This enables session-aware routing.
  3. HTTP reading: This method looks at HTTP header information.
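As a rough illustration of the first technique, here is a Python sketch that routes requests to different backend pools based on the URL. The pool names and hosts are hypothetical, and real layer 7 devices do far more than this:

import posixpath

BACKENDS = {
    "image":   ["img1.internal:8080", "img2.internal:8080"],
    "php":     ["app1.internal:9000"],
    "static":  ["static1.internal:8080"],
    "default": ["web1.internal:8080"],
}

def pick_pool(url_path):
    # Route by file extension parsed from the request URL.
    ext = posixpath.splitext(url_path)[1].lower()
    if ext in (".png", ".jpg", ".jpeg", ".gif"):
        return BACKENDS["image"]
    if ext == ".php":
        return BACKENDS["php"]
    if ext in (".css", ".js", ".html"):
        return BACKENDS["static"]
    return BACKENDS["default"]

print(pick_pool("/photos/cat.jpg"))   # the image pool
print(pick_pool("/index.php"))        # the PHP pool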

Hardware Load Balancer Examples

  1. F5 BIG-IP load balancer
  2. Cisco Systems Catalyst
  3. Barracuda load balancer
  4. Coyote Point load balancer

Thursday, 22 June 2017

5 Common Server Setups For Your Web Application

Introduction

When deciding which server architecture to use for your environment, there are many factors to consider, such as performance, scalability, availability, reliability, cost, and ease of management.
Here is a list of commonly used server setups, with a short description of each, including pros and cons. Keep in mind that all of the concepts covered here can be used in various combinations with one another, and that every environment has different requirements, so there is no single, correct configuration.

1. Everything On One Server

The entire environment resides on a single server. For a typical web application, that would include the web server, application server, and database server. A common variation of this setup is a LAMP stack, which stands for Linux, Apache, MySQL, and PHP, on a single server.
Use Case: Good for setting up an application quickly, as it is the simplest setup possible, but it offers little in the way of scalability and component isolation.
Everything On a Single Server
Pros:
  • Simple
Cons:
  • Application and database contend for the same server resources (CPU, Memory, I/O, etc.) which, aside from possible poor performance, can make it difficult to determine the source (application or database) of poor performance
  • Not readily horizontally scalable

2. Separate Database Server

The database management system (DBMS) can be separated from the rest of the environment to eliminate the resource contention between the application and the database, and to increase security by removing the database from the DMZ, or public internet.
Use Case: Good for setting up an application quickly, but keeps application and database from fighting over the same system resources.
Separate Database Server
Pros:
  • Application and database tiers do not contend for the same server resources (CPU, Memory, I/O, etc.)
  • You may vertically scale each tier separately, by adding more resources to whichever server needs increased capacity
  • Depending on your setup, it may increase security by removing your database from the DMZ
Cons:
  • Slightly more complex setup than single server
  • Performance issues can arise if the network connection between the two servers is high-latency (i.e. the servers are geographically distant from each other), or the bandwidth is too low for the amount of data being transferred

3. Load Balancer (Reverse Proxy)

Load balancers can be added to a server environment to improve performance and reliability by distributing the workload across multiple servers. If one of the servers that is load balanced fails, the other servers will handle the incoming traffic until the failed server becomes healthy again. It can also be used to serve multiple applications through the same domain and port, by using a layer 7 (application layer) reverse proxy.
Examples of software capable of reverse proxy load balancing: HAProxy, Nginx, and Varnish.
Use Case: Useful in an environment that requires scaling by adding more servers, also known as horizontal scaling.
Load Balancer
Pros:
  • Enables horizontal scaling, i.e. environment capacity can be scaled by adding more servers to it
  • Can protect against DDOS attacks by limiting client connections to a sensible amount and frequency
Cons:
  • The load balancer can become a performance bottleneck if it does not have enough resources, or if it is configured poorly
  • Can introduce complexities that require additional consideration, such as where to perform SSL termination and how to handle applications that require sticky sessions
  • The load balancer is a single point of failure; if it goes down, your whole service can go down. A high availability (HA) setup is an infrastructure without a single point of failure. To learn how to implement an HA setup, you can read this section of How To Use Floating IPs.

4. HTTP Accelerator (Caching Reverse Proxy)

An HTTP accelerator, or caching HTTP reverse proxy, can be used to reduce the time it takes to serve content to a user through a variety of techniques. The main technique employed with an HTTP accelerator is caching responses from a web or application server in memory, so future requests for the same content can be served quickly, with less unnecessary interaction with the web or application servers.
Examples of software capable of HTTP acceleration: Varnish, Squid, Nginx.
Use Case: Useful in an environment with content-heavy dynamic web applications, or with many commonly accessed files.
HTTP Accelerator
Pros:
  • Increase site performance by reducing CPU load on web server, through caching and compression, thereby increasing user capacity
  • Can be used as a reverse proxy load balancer
  • Some caching software can protect against DDOS attacks
Cons:
  • Requires tuning to get best performance out of it
  • If the cache-hit rate is low, it could reduce performance
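At its core, the caching logic is an in-memory map from a request key to a stored response with an expiry time. A toy Python sketch of the cache-hit/cache-miss path (illustrative only; this is not how Varnish, Squid, or Nginx are actually implemented):

import time

CACHE = {}          # url -> (expires_at, response_body)
TTL_SECONDS = 60

def fetch_from_backend(url):
    # Stand-in for a real request to the web or application server.
    return f"response for {url}"

def get(url):
    entry = CACHE.get(url)
    if entry and entry[0] > time.time():
        return entry[1]                              # cache hit: no backend work
    body = fetch_from_backend(url)                   # cache miss: ask the backend
    CACHE[url] = (time.time() + TTL_SECONDS, body)
    return body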

5. Master-Slave Database Replication

One way to improve performance of a database system that performs many reads compared to writes, such as a CMS, is to use master-slave database replication. Master-slave replication requires a master and one or more slave nodes. In this setup, all updates are sent to the master node and reads can be distributed across all nodes.
Use Case: Good for increasing the read performance for the database tier of an application.
Here is an example of a master-slave replication setup, with a single slave node:
Master-Slave Database Replication
Pros:
  • Improves database read performance by spreading reads across slaves
  • Can improve write performance by using master exclusively for updates (it spends no time serving read requests)
Cons:
  • The application accessing the database must have a mechanism to determine which database nodes it should send update and read requests to
  • Updates to slaves are asynchronous, so there is a chance that their contents could be out of date
  • If the master fails, no updates can be performed on the database until the issue is corrected
  • Does not have built-in failover in case of failure of master node
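The first point under Cons is usually handled in application code or by a database proxy. A minimal Python sketch of read/write splitting, assuming hypothetical connection strings:

import random

# Hypothetical connection strings; in a real application these come from config.
MASTER_DSN   = "postgresql://master.db.internal/app"
REPLICA_DSNS = ["postgresql://replica1.db.internal/app",
                "postgresql://replica2.db.internal/app"]

def pick_dsn(sql):
    # Send statements that modify data to the master; spread reads over slaves.
    first_word = sql.lstrip().split(None, 1)[0].upper()
    if first_word in ("INSERT", "UPDATE", "DELETE"):
        return MASTER_DSN
    return random.choice(REPLICA_DSNS)

print(pick_dsn("SELECT * FROM articles"))      # one of the replicas
print(pick_dsn("UPDATE articles SET ..."))     # the master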

Example: Combining the Concepts

It is possible to load balance the caching servers, in addition to the application servers, and use database replication in a single environment. The purpose of combining these techniques is to reap the benefits of each without introducing too many issues or complexity. Here is an example diagram of what a server environment could look like:
Load Balancer, HTTP Accelerator, and Database Replication Combined
Let's assume that the load balancer is configured to recognize static requests (like images, css, javascript, etc.) and send those requests directly to the caching servers, and send other requests to the application servers.
Here is a description of what would happen when a user requests dynamic content:
  1. The user requests dynamic content from http://example.com/ (load balancer)
  2. The load balancer sends request to app-backend
  3. app-backend reads from the database and returns requested content to load balancer
  4. The load balancer returns requested data to the user
If the user requests static content:
  1. The load balancer checks cache-backend to see if the requested content is cached (cache-hit) or not (cache-miss)
  2. If cache-hit: return the requested content to the load balancer and jump to Step 7. If cache-miss: the cache server forwards the request to app-backend, through the load balancer
  3. The load balancer forwards the request through to app-backend
  4. app-backend reads from the database then returns requested content to the load balancer
  5. The load balancer forwards the response to cache-backend
  6. cache-backend caches the content then returns it to the load balancer
  7. The load balancer returns requested data to the user
This environment still has two single points of failure (the load balancer and the master database server), but it provides all of the other reliability and performance benefits that were described in each section above.

Conclusion

Now that you are familiar with some basic server setups, you should have a good idea of what kind of setup you would use for your own application(s). If you are working on improving your own environment, remember that an iterative process is best to avoid introducing too many complexities too quickly.
Let us know of any setups you recommend or would like to learn more about in the comments below!