Maciej Mróz Personal Blog

Because why not

Jul 7, 2012 - 6 minute read - Technology Server Side

Enter the Node.js

The title says it all, but I should probably still give some background for this article. Imagine a system where you need very high performance, you need it yesterday, and you don’t have infinite funds to simply throw more hardware at the problem. A very important component of this system is a web server. It handles very simple requests, always generated by code (the system in question does not serve data to end users, but you can think of AJAX - there’s definitely a similarity). So I had the web server in question: Apache + PHP, running on a virtual machine. I quickly wrote a Python script that emulated the workload the web server was supposed to handle (fuzzed POST requests in a specific format), and I did back-of-the-envelope calculations of the performance the web server had to deliver. As it turned out, Apache + PHP running on a virtual machine delivered only a very, very small percentage of what I needed.

The obvious thing to try was a VM with a lot more CPU power. It did indeed get me a lot more performance, but as you might have guessed already, getting the improvement I needed was hardly possible this way. So I moved the web server to a physical box (I have these too :) ). As I quickly learned after going physical, the Python script I was using to test the web server couldn’t cut it any more and was adding a lot of overhead. I gave up on the idea of precisely simulating the workload, took one example request, and used the industry standard - Apache Benchmark (ab). Even before that it was obvious that Apache + PHP was not going to cut it. My initial idea for improvement was to use nginx + PHP running in FastCGI mode. It did run, and the initial results were quite promising, but … you’ll see the results below. Anyway, nginx wasn’t cutting it either.

I had intended to take a deeper look at node.js for a long time but never really had a project that would push me far enough. This one was it. I am not a web developer, and JavaScript is hardly my language (I think in C++ but usually use Python because I am lazy). Anyway, imperative languages are mostly the same, so I hacked together a node.js app that did what I needed. I slightly changed the system design along the way - the node.js output format was JSON instead of the CSV used by the PHP app. Still, the apps were computationally equivalent, and in terms of I/O node.js was actually a bit more verbose. Output was pushed over the network to another service, so the web app itself was not doing any disk I/O. The backend service was not the bottleneck.
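To illustrate why JSON output is a bit more verbose than CSV for the same data, here is a minimal sketch. The record and its field names are made up for the example (the real request format is not shown in this post), and the CSV writer does no quoting or escaping.

```javascript
// Hypothetical record; the real fields handled by the app are not described here.
const record = { clientId: 'c-1024', event: 'ping', value: 42, ts: 1341619200 };

// CSV: values only, the column order is implied by position.
// No quoting or escaping is done, so values must not contain commas.
function toCsvLine(rec) {
  return Object.values(rec).join(',') + '\n';
}

// JSON: self-describing, so every line repeats the field names.
function toJsonLine(rec) {
  return JSON.stringify(rec) + '\n';
}

const csvBytes = Buffer.byteLength(toCsvLine(record));
const jsonBytes = Buffer.byteLength(toJsonLine(record));
console.log(`CSV: ${csvBytes} bytes, JSON: ${jsonBytes} bytes`);
```

The extra bytes are the repeated keys and quotes, which is the price of not having to agree on a column order with the receiving service.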

The benchmark was done on a Core i3 540 running CentOS Linux 6.2, using the newest node.js, PHP 5.3.x with APC installed, the newest stable nginx, and Apache 2.2.x. The node app used the built-in ‘cluster’ module in order to use all available logical CPU cores. Nginx talked to PHP via the FastCGI interface; Apache ran in ‘prefork’ mode with PHP compiled as a module. I tried to benchmark 8, 32, 128, 512, 1024, 2048 and 4096 simultaneous clients, with 500k requests per run. The machine used to run ab had some quad-core Intel CPU, ran CentOS Linux 5.5, and was connected to the same 1 Gbit switch as the web server.
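For readers who have not seen the ‘cluster’ module, this is roughly how it spreads one HTTP server over all logical cores. It is a sketch under assumptions, not the app that was benchmarked: `handleRequest`, `startClustered` and the response body are hypothetical names and content chosen for illustration.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical handler: collects the POST body and answers with a small JSON document.
function handleRequest(req, res) {
  let body = '';
  req.on('data', (chunk) => { body += chunk; });
  req.on('end', () => {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ received: body.length }));
  });
}

// The primary process forks one worker per logical core;
// each worker creates its own server and they all share the same port.
function startClustered(port, workerCount = os.cpus().length) {
  // Newer Node versions expose isPrimary; older ones only have isMaster.
  const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workerCount; i++) cluster.fork();
    // Replace a worker that dies so the number of workers stays constant.
    cluster.on('exit', () => cluster.fork());
    return null;
  }
  return http.createServer(handleRequest).listen(port);
}
```

Calling `startClustered(8080)` at the end of the file starts it (the port number is arbitrary). Each worker is a separate process with its own event loop, which is how a single-threaded runtime ends up using every core.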

Apache vs Nginx

Before going into the details of the results I got from node, it’s good to compare Apache and nginx - there are a lot of comparisons out there, and many of them suggest that nginx is significantly better … Apache results:

  • 8 clients: 2250 requests/sec, 4 ms latency
  • 32 clients: 2250 requests/sec, 14 ms latency
  • 128 clients: 2100 requests/sec, 60 ms latency
  • 512 clients: 2100 requests/sec, 240 ms latency
  • 1024 clients: FAILED
  • 2048 clients: n/a
  • 4096 clients: n/a

Apache had mostly flat performance, dropping slightly between 32 and 128 clients. At 1024 clients it simply stopped accepting connections. Still, whenever a request completed, it completed without error, and up to 512 simultaneous clients request latency went up linearly with the number of clients. Honestly, these are not bad results and I’d be fine with them if my target weren’t 5k requests per second. Nginx results:

  • 8 clients: 2250 rps, 4 ms latency
  • 32 clients: 2350 rps, 14 ms latency
  • 128 clients: 2350 rps, 55 ms latency
  • 512 clients: 2100 rps, 241 ms latency (~500 failed!)
  • 1024 clients: 1900 rps, 540 ms latency (~3650 failed!)
  • 2048 clients: FAILED (completed but error rate 80%+)
  • 4096 clients: n/a

Nginx was never vastly faster than Apache. Still, at realistic workloads it was indeed slightly faster. What happened at higher concurrency levels was interesting. At 512 clients roughly 500 requests failed (completed with an HTTP code other than 2xx), which is about 1 in 1000. The error rate went up even more at 1024 clients. The breakdown happened at 2048 clients - the benchmark did complete, but with more than 80% of requests ending in a non-2xx code (I suppose it was 500 Internal Server Error or something like that …). To sum things up: the way nginx performance degrades with concurrency is different from Apache, and at the extreme end nginx was actually slower. Apache and nginx, while different, were still in the same performance league. I still needed 5000 requests per second and it was time to try out node.js.
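A side note on the latency numbers: when throughput stays flat, latency growing linearly with the number of clients is exactly what you would expect, because mean time per request is roughly concurrency divided by throughput. The sketch below checks that against the Apache and nginx figures listed above, assuming the latencies quoted are mean times per request.

```javascript
// Mean time per request in ms, estimated as concurrency divided by throughput.
function expectedLatencyMs(clients, requestsPerSec) {
  return (clients / requestsPerSec) * 1000;
}

// Rows from the result lists: [clients, requests/sec, reported latency in ms]
const apache = [[8, 2250, 4], [32, 2250, 14], [128, 2100, 60], [512, 2100, 240]];
const nginx = [[8, 2250, 4], [32, 2350, 14], [128, 2350, 55], [512, 2100, 241], [1024, 1900, 540]];

for (const [name, rows] of [['apache', apache], ['nginx', nginx]]) {
  for (const [clients, rps, reported] of rows) {
    const predicted = expectedLatencyMs(clients, rps).toFixed(1);
    console.log(`${name}, ${clients} clients: predicted ${predicted} ms, reported ${reported} ms`);
  }
}
```

In other words, at a fixed request rate every extra client just waits in line longer.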

Welcome the king

Node.js results:

  • 8 clients: 4700 rps, 2 ms latency
  • 32 clients: 6900 rps, 5 ms latency
  • 128 clients: 7200 rps, 18 ms latency
  • 512 clients: 7200 rps, 70 ms latency
  • 1024 clients: 7150 rps, 143 ms latency
  • 2048 clients: 7100 rps, 288 ms latency
  • 4096 clients: 6900 rps, 590 ms latency

Shocking, isn’t it? Node was running circles around the traditional web servers! Not only did it reach its peak performance at a higher concurrency level, but the performance drop with more clients was only slight, instead of the server dying completely. It was able to complete the benchmark all the way up to 4096 simultaneous clients, with 0 (zero!) errors (even if latency was high) … For me it’s almost the end of the road - I got more performance than I actually needed. Now it’s time to polish the code, write documentation etc - and then move on to other parts of the system.
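To put the numbers next to the 5000 requests per second target, here is a small calculation using the best throughput each server reached in the lists above. The variable names are mine; the figures come straight from the results.

```javascript
const targetRps = 5000;
// Best throughput each server reached in the benchmark runs.
const peakRps = { apache: 2250, nginx: 2350, node: 7200 };

function ratio(a, b) {
  return a / b;
}

for (const [server, rps] of Object.entries(peakRps)) {
  console.log(`${server}: ${(ratio(rps, targetRps) * 100).toFixed(0)}% of target`);
}
console.log(`node vs apache: ${ratio(peakRps.node, peakRps.apache).toFixed(1)}x`);
console.log(`node vs nginx: ${ratio(peakRps.node, peakRps.nginx).toFixed(1)}x`);
```

Apache and nginx both land below half of the target, while node clears it with room to spare.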

I’d love to repeat the benchmark on a faster CPU some day (the i3 used was all I had in the office at the moment) - it seems that all the servers in question would benefit a lot from more CPU power. Also, redoing the benchmarks on a faster CPU with more physical cores might reveal some other limitations. After this experience, node is definitely going to be an important part of my toolbox - it seems perfectly suited for many of the things I am struggling with at work. Still, it is unlikely to ever replace traditional web servers across the stack.

Something to think about: HTTP is actually a very simple protocol. We made it complicated, and mainstream web servers moved along to support all the bells and whistles. What we have right now is like showing up at a car race in an expensive limousine: it may have tons of great gizmos and might be a better all-rounder, but on a race track it is crushed by cars that were built to race.