As the consumer base of an API or web site grows, so does the number of potential abusers. Whether intentional or not, abuse can cause problems for legitimate consumers by degrading performance or even taking the servers down. For a web site it is much easier to predict an absolute maximum threshold of requests per second or minute: it wouldn’t make sense for someone to load a page more than 2-3 times a second, even allowing for accidental refreshes (F5). I’m talking here about page views, not the concurrent calls to the backend that each view triggers (CSS files, JavaScript, multiple sections loaded from different paths using jQuery).

This is much harder to do for APIs, since consumers might be proxying requests on behalf of many clients. In that scenario it may be worth using the X-Forwarded-For header to get the client IP, white-listing legitimate consumers (this may even be necessary for web sites, for companies that proxy internet traffic for their employees), and delaying or rejecting requests that exceed thresholds. Believe it or not, all of this is fairly easy to set up and comes at no extra cost if you are already using HAProxy or Nginx.
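To make the X-Forwarded-For idea concrete, here is a minimal Python sketch of how a rate limiter might pick the address to count against. The `TRUSTED_PROXIES` range and the function names are hypothetical and not taken from the repository; the point is only that the header should be honoured when the direct peer is a proxy you trust, since otherwise a client could spoof it to dodge the limit.

```python
import ipaddress
from typing import Optional

# Hypothetical range of proxies we trust to set X-Forwarded-For honestly.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(address: str) -> bool:
    """True when the address belongs to one of our own proxies."""
    ip = ipaddress.ip_address(address)  # raises ValueError if malformed
    return any(ip in network for network in TRUSTED_PROXIES)


def client_ip(peer_address: str, forwarded_for: Optional[str]) -> str:
    """Return the address that rate limiting should be keyed on."""
    if not forwarded_for or not is_trusted(peer_address):
        # No header, or the header comes from an untrusted peer: ignore it.
        return peer_address
    # The header reads "client, proxy1, proxy2". Walk from the right and
    # return the first hop that is not one of our own proxies.
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    try:
        for hop in reversed(hops):
            if not is_trusted(hop):
                return hop
    except ValueError:
        pass  # malformed header: fall back to the direct peer
    return peer_address
```

For example, `client_ip("10.0.0.5", "203.0.113.7, 10.0.0.9")` yields the original client address, while the same header arriving from an untrusted peer is ignored.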

Companies that use HAProxy can be found here and here.

Companies that use Nginx can be found here and here, and here is how Nginx competes amongst other web servers.

To demonstrate the use of HAProxy I’m using:

  • vagrant
  • chef cookbooks for haproxy (haproxy-1.5-dev19) (original can be found here) and nginx

Although haproxy-1.5-dev19 is still in development, it is already used by major companies; some of them maintain their own branches to keep it in line with their upgrade policy.

The repository can be found at https://github.com/uldissturms/request-rate-limit where all the infrastructure setup can be seen.

git clone git@github.com:uldissturms/request-rate-limit.git
cd request-rate-limit
git submodule update --init
vagrant up

If you don’t already have the vagrant-omnibus plugin installed, run:

vagrant plugin install vagrant-omnibus

This will bring up two machines:

  • HAProxy
  • Nginx

It also exposes HAProxy’s port 80 on host port 8081, so that the web site can be accessed at http://localhost:8081/.

Let’s start by testing HAProxy. To test the performance of the web site we will use ApacheBench.

sudo apt-get install apache2-utils
ab -n 500000000 -c 10 http://localhost:8081/

This installs ApacheBench and runs it against our web server, keeping 10 concurrent connections open. Let’s go ahead and try to open an 11th one.

telnet 8081

After the rate-limiting changes are applied to the HAProxy configuration:

ab -n 50000000 -c 10
  Benchmarking (be patient)
  apr_socket_recv: Connection reset by peer (104)
  Total of 15234 requests completed

A new connection opened in parallel is dropped immediately:

telnet 8081
  Connected to
  Escape character is '^]'.
  Connection closed by foreign host.
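The behaviour above, where an extra connection from the same client is refused while earlier ones are still open, can be sketched in a few lines of Python. This is only an illustration of the idea: the `ConnectionLimiter` class and the limit of 10 are assumptions, and the HAProxy configuration in the repository may enforce its limits differently (for example on connection rate rather than on a count of open connections).

```python
from collections import defaultdict


class ConnectionLimiter:
    """Tracks open connections per client address and refuses extras."""

    def __init__(self, max_per_client: int) -> None:
        self.max_per_client = max_per_client
        self._open = defaultdict(int)  # client address -> open connections

    def try_open(self, client: str) -> bool:
        """Register a new connection; False means it should be dropped."""
        if self._open[client] >= self.max_per_client:
            return False
        self._open[client] += 1
        return True

    def close(self, client: str) -> None:
        """Release a slot when the client disconnects."""
        if self._open[client] > 0:
            self._open[client] -= 1


# Hypothetical limit of 10: ten connections are accepted, the 11th is not.
limiter = ConnectionLimiter(max_per_client=10)
accepted = [limiter.try_open("127.0.0.1") for _ in range(11)]
```

Once one of the existing connections is closed, the same client can connect again, and other clients are unaffected because the count is kept per address.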

Settings that stop a client from holding a connection open for too long can also be applied:

timeout http-request 3s # maximum time the client has to send the whole HTTP request

Once the 3 seconds have passed we see:

telnet 8081
  Connected to
  Escape character is '^]'.
  HTTP/1.0 408 Request Time-out
  Connection: close
  <html><body><h1>408 Request Time-out</h1>
  Your browser didn't send a complete request in time.
  Connection closed by foreign host.

Bursts can be used instead of dropping requests, so that consumers experience a slowdown rather than a service failure. While this is a great option to consider, it can also hide problems: if legitimate consumers aren’t aware of the burst setup, API misuse will not result in an HTTP error and may go unnoticed. Monitor it closely.

Burst can be set up as in this gist: https://gist.github.com/dsuch/5872245

White-listing can be applied using the /usr/local/etc/whitelist.lst file.

Nginx can be used for request rate limiting as well. This is achieved using the limit request module (ngx_http_limit_req_module).

  curl 10 -w "Time: %{time_total} "
  <html><head><title>Welcome to nginx!</title></head><body bgcolor="white" text="black"><center><h1>Welcome to nginx!</h1></center></body></html>Time: 0.001
  ab -n 10
Percentage of the requests served within a certain time (ms)
  50%  500
  66%  501
  75%  501
  80%  501
  90%  501
  95%  501
  98%  501
  99%  501
  100% 501 (longest request)
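The steady 500 ms in the ApacheBench output is what you would expect if requests are being spaced out at roughly 2 per second. The following Python sketch simulates that throttling for a client that sends one request at a time, as `ab` does without `-c`. The rate of 2 requests per second is an assumption inferred from the timings above, the burst size is hypothetical, and the model is a simplification rather than Nginx's actual accounting.

```python
def simulate_sequential_client(requests, rate_per_sec, burst):
    """Return the wait in ms for each request, or None when it is rejected.

    Assumes zero processing time and a client that only sends the next
    request once the previous one has been answered.
    """
    interval_ms = 1000 // rate_per_sec  # minimum spacing between requests
    now = 0        # current time on the client's clock, in ms
    next_free = 0  # earliest time the next request may be served
    waits = []
    for _ in range(requests):
        wait = max(0, next_free - now)
        if wait > burst * interval_ms:
            waits.append(None)  # queue is full: request is rejected
            continue
        waits.append(wait)
        now += wait                     # client blocks until it is served
        next_free = now + interval_ms   # reserve the next slot
    return waits


# Assumed settings: 2 requests per second, hypothetical burst of 5.
waits = simulate_sequential_client(requests=10, rate_per_sec=2, burst=5)
```

Under these assumptions the first request is served immediately and every following one is held back for 500 ms, which matches the shape of the percentile table.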

White-listing can be achieved with the geo module. Let’s white-list localhost and execute a request from the Nginx server itself.

geo $nolimit {
    default 0;
    1; # my network
}

vagrant@web:~$ ab -n 10
Percentage of the requests served within a certain time (ms)
  50% 0
  66% 0
  75% 0
  80% 0
  90% 0
  95% 0
  98% 0
  99% 0
100% 0 (longest request)
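The white-listing decision itself is just a network membership test. Here is a rough Python equivalent, assuming a hypothetical `WHITELIST` containing the loopback range. Returning an empty key for white-listed clients mirrors the way Nginx does not account requests whose limit key is empty.

```python
import ipaddress

# Hypothetical whitelist: the loopback range, standing in for "my network".
WHITELIST = [ipaddress.ip_network("127.0.0.0/8")]


def is_whitelisted(address: str) -> bool:
    """True when the client address falls inside a white-listed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in WHITELIST)


def limit_key(address: str) -> str:
    """Key to rate-limit on; an empty key means the request is not limited."""
    return "" if is_whitelisted(address) else address
```

With this in place, `limit_key("127.0.0.1")` is empty and the request skips throttling, which is why the local run above completes with no delay.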

You can specify the nodelay option to serve requests within the burst immediately and reject anything beyond it, instead of throttling.

Things I liked about HAProxy: