Multi-Core HTTP Server with NodeJS

NodeJS has been garnering a lot of attention of late. Briefly, NodeJS is a server-side JavaScript runtime implemented using Google's highly performant V8 engine. It provides an (almost) completely non-blocking I/O stack which, when combined with JavaScript closures and anonymous functions, makes it an excellent platform for implementing high throughput web services. As an example, a simple "Hello, world!" application written for NodeJS performs comparably to an Nginx module written to do the same.

In case you missed it, NodeJS author Ryan Dahl (@ryah) gave an excellent talk at a Bayjax event hosted by Yahoo! a few weeks ago.

Here at Yahoo!, the Mail team is investigating the use of NodeJS for some of our upcoming platform development work. We thought it would be fun to share a bit of what we've been working on, which includes

  • Performance characterization of NodeJS vs. other web stacks in various workloads: straight HTTP proxy at different latencies, scatter/gather HTTP proxy, etc.
  • Improving multi-core support in the NodeJS ecosystem and NodeJS core itself; the infrastructure for multi-process HTTP support in various NodeJS modules was contributed by the Mail team.
  • Experimenting with @davglass's YUI3 support for running in NodeJS. We hope this will let us share a single code base for rendering HTML both in the browser and on the server.

Of course, I would be remiss if I didn't mention that Mail is hiring! If you're interested, please contact us.


The case for multi-core

But all is not sunshine and lollipops in NodeJS land. While single-process performance is quite good, eventually one CPU is not going to be enough; the platform provides no ability to scale out to take advantage of the multiple cores commonly present in today's server-class hardware. With current NodeJS builds, the practical limits of a single CPU acting as an HTTP proxy are around 2100 reqs/s for a 2.5GHz Intel Xeon.

While Node is relatively solid, it does still crash occasionally, adversely impacting availability if you're running only a single NodeJS process. Such problems can be particularly common when using a buggy compiled add-on that can suffer from the usual cornucopia of C++ goodies such as segfaults and memory scribbling. When handling requests with multiple processes, one process going down will simply result in incoming requests being directed at the other processes.

Taking advantage of multiple cores

There are several ways to use multiple cores in NodeJS, each with its own benefits and drawbacks.

Using a load balancer

Until node-v0.1.98, the best practice for utilizing multiple cores was to start up a separate NodeJS process per core, each running an HTTP server bound to a different port. To route client requests to the various processes, one would front them all with a load balancer configured to know about each of the different ports. This performed just fine, but the complexity of configuring and managing these multiple processes and endpoints left something to be desired.

As a benefit, this architecture allows the load balancer to route requests to different processes based on an affinity policy (for example, by IP, by cookie, and so on).
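To make the affinity idea concrete, here is a minimal sketch of how a front end might map a client attribute to one of the per-core processes. The port list, the `sid` cookie name, and the helper names are all assumptions made up for illustration; they do not come from any particular load balancer.

```javascript
// Hypothetical backend ports, one NodeJS process per core (assumed values).
var BACKEND_PORTS = [8001, 8002, 8003, 8004];

// Simple deterministic string hash: the same key always yields the same number.
function hashString(str) {
    var h = 0;
    for (var i = 0; i < str.length; i++) {
        h = (h * 31 + str.charCodeAt(i)) >>> 0;  // keep within unsigned 32 bits
    }
    return h;
}

// Pull a single cookie value out of a Cookie header; returns null if absent.
function getCookie(cookieHeader, name) {
    var parts = (cookieHeader || '').split(';');
    for (var i = 0; i < parts.length; i++) {
        var kv = parts[i].trim().split('=');
        if (kv[0] === name) {
            return kv.slice(1).join('=');
        }
    }
    return null;
}

// Affinity policy: prefer the session cookie, fall back to the client IP.
function pickBackendPort(clientIp, cookieHeader) {
    var key = getCookie(cookieHeader, 'sid') || clientIp;
    return BACKEND_PORTS[hashString(key) % BACKEND_PORTS.length];
}
```

With a policy like this, two requests carrying the same `sid` cookie land on the same process even if they arrive from different IP addresses.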

Using the OS kernel

In node-v0.1.98, the Yahoo!-contributed file descriptor passing and re-use patches landed in core and allowed the emerging set of HTTP frameworks such as Connect and multi-node to serve HTTP requests on multiple cores simultaneously with no change in application code or configuration.

Briefly, the approach used by these frameworks is to create a bound and listening socket in a single process (say, bound to port 80). However, rather than accepting connections using this socket, it is passed off to some number of child processes using net.Stream.write() (under the covers this uses sendmsg(2) and FDs are delivered using recvmsg(2)). Each of these processes in turn inserts the received file descriptor into its event loop and accepts incoming connections as they become available. The OS kernel itself is responsible for load balancing connections across processes.

It's important to note that this is effectively an L4 load balancer with no affinity; each request by any given client may be served by any of the workers. Any application state that needs to be available to a request cannot simply be kept in-process in a single NodeJS instance.

Using NodeJS to route requests

In some cases it may be impossible or undesirable to use either of the above two facilities. For example, one's application may require affinity that cannot be configured using a load balancer (e.g., policy decisions based on complex application logic or the SE Linux security context of the incoming connection). In such cases, one can accept a connection in a single process and interrogate it before handing it off to the correct process for handling.

The following example requires node-v0.1.100 or later and node-webworker, a NodeJS implementation of the emerging HTML5 Web Workers standard for parallel execution of JavaScript. You can install node-webworker using npm by executing npm install webworker@stable.

While an in-depth explanation of Web Workers is beyond the scope of this article, for our purposes one can think of a worker as an independent execution context (such as a process) that can pass messages back and forth with the JavaScript environment that spawned it. The node-webworker implementation supports sending around file descriptors using this message passing mechanism.

First, the source of the master process, master.js:

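    var net = require('net');
    var path = require('path');
    var sys = require('sys');
    var Worker = require('webworker/webworker').Worker;

    var NUM_WORKERS = 5;

    var workers = [];
    var numReqs = 0;

    // Spawn the workers up front; each runs worker.js in its own process.
    for (var i = 0; i < NUM_WORKERS; i++) {
        workers[i] = new Worker(path.join(__dirname, 'worker.js'));
    }

    net.createServer(function(s) {
        // Stop reading so that the worker sees everything the client sends.
        s.pause();

        // Hash the peer's IPv4 address by summing its octets.
        var hv = 0;
        s.remoteAddress.split('.').forEach(function(v) {
            hv += parseInt(v, 10);
        });

        var wid = hv % NUM_WORKERS;

        sys.debug('Request from ' + s.remoteAddress + ' going to worker ' + wid);

        // Hand the request counter and the socket descriptor to the worker.
        workers[wid].postMessage(++numReqs, s.fd);
    }).listen(80);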

The master does the following:

  • The master process will fire up a net.Server instance listening for connections on port 80
  • On accepting a connection, the master process will
    • Hash the peer's IP address of the connection and use that to determine which worker to send the request to
    • Call net.Stream.pause() on the incoming stream. This prevents the master process from reading any data off of the socket -- the worker should be able to see all data sent by the remote side
    • Use postMessage() to send the (incremented) global request counter and just-received socket descriptor to the assigned worker
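The hashing step can be pulled out into a small function and tried on its own. This is a sketch rather than part of master.js: the name `workerIdFor` is hypothetical, and sending any address with an unparseable piece to worker 0 is an assumption added here, since the octet-summing hash only makes sense for dotted-quad IPv4 addresses.

```javascript
// Map a peer address to a worker index in [0, numWorkers).
// Mirrors the octet-summing hash in master.js. Any piece that does not
// parse as a number sends the connection to worker 0 (an assumption of
// this sketch, not behavior from the original listing).
function workerIdFor(remoteAddress, numWorkers) {
    var octets = String(remoteAddress).split('.');
    var hv = 0;
    for (var i = 0; i < octets.length; i++) {
        var n = parseInt(octets[i], 10);   // explicit radix
        if (isNaN(n)) {
            return 0;                      // e.g. an IPv6 peer such as '::1'
        }
        hv += n;
    }
    return hv % numWorkers;
}

console.log(workerIdFor('127.0.0.1', 5));   // 128 % 5 = 3
console.log(workerIdFor('10.1.2.3', 5));    // 16 % 5 = 1
console.log(workerIdFor('::1', 5));         // 0 (fallback)
```

Without the guard, an unparseable address would make the sum NaN, and `workers[NaN]` is undefined.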

Second, the source of the worker processes, worker.js:

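    var http = require('http');
    var net = require('net');
    var sys = require('sys');

    // Drop privileges; the master runs as the superuser to bind to port 80.
    process.setuid('nobody');

    // Note that listen() is never called: connections arrive from the master.
    var srv = http.createServer(function(req, resp) {
        resp.writeHead(200, {'Content-Type' : 'text/plain'});
        resp.write(
            'process=' + process.pid +
            '; reqno=' + req.connection.reqNo + '\n'
        );
        resp.end();
    });

    onmessage = function(msg) {
        // Wrap the received descriptor in a stream owned by our HTTP server.
        var s = new net.Stream(msg.fd);
        s.type = srv.type;
        s.server = srv;
        s.resume();

        // Stash the request counter where the request handler can find it.
        s.reqNo = msg.data;

        // Inject the connection into the HTTP processing pipeline.
        srv.emit('connection', s);
    };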

The worker does the following:

  • Decrease its privilege level to the nobody user.
  • Create an HTTP server instance without calling any of the listen() variants. We will be processing requests based on the descriptors received from the master.
  • Await receipt of a message from the master carrying the request counter and socket descriptor.
  • Stash the request counter that we got from the master in our stream object; kind of dirty, but allows us to get at this data in the HTTP request handler.
  • Synthesize a net.Stream instance with the received TCP connection and inject it into the HTTP processing pipeline by emitting the connection event manually.
  • At this point, the request handler set up above runs as normal: the HTTP server instance in the worker completely owns the connection and will parse the client's request as usual. The only wrinkle is that the request handler has access to the reqNo field which we set up when we received the message from the master.

Finally, to run this example, be sure to launch master.js as the superuser, as we want to bind to a privileged port. Then use curl to make some requests and see which process they're coming from.

% sudo node ./master.js

% curl 'http://localhost:80'
process=13049; reqno=2

Of course, the preceding example is kind of a toy in that hashing based on IP is something that any HTTP load balancer worth its salt can do for you. A more realistic example of why you might want to do this would be dispatching requests to a worker running in the right SE Linux context after interrogating the other end of the connection (say, using node-selinux). Making routing decisions based on the HTTP request itself (path, vhost, etc.) is a bit more complicated but doable as well using similar techniques.
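As a rough illustration of routing on the HTTP request itself, the sketch below parses just enough of a raw request head to pick a worker by vhost and path. Everything in it is hypothetical: the function names, the routing table, and the host names are invented for this example and are not part of node-webworker or NodeJS.

```javascript
// Parse just enough of a raw HTTP request head to route on it.
// Returns null until the full header block (ended by a blank line) has arrived.
function parseRequestHead(raw) {
    var end = raw.indexOf('\r\n\r\n');
    if (end === -1) {
        return null;
    }
    var lines = raw.slice(0, end).split('\r\n');
    var requestLine = lines[0].split(' ');      // e.g. "GET /path HTTP/1.1"
    var host = null;
    for (var i = 1; i < lines.length; i++) {
        var idx = lines[i].indexOf(':');
        if (idx !== -1 && lines[i].slice(0, idx).toLowerCase() === 'host') {
            host = lines[i].slice(idx + 1).trim();
        }
    }
    return {
        method: requestLine[0],
        path: requestLine[1] || '/',            // tolerate a malformed request line
        host: host
    };
}

// Hypothetical routing table: the first matching rule wins.
var ROUTES = [
    { host: 'mail.example.com', prefix: '/', worker: 0 },
    { host: null, prefix: '/static/', worker: 1 }   // host: null matches any vhost
];
var DEFAULT_WORKER = 2;

// Return the index of the worker that should handle this request.
function routeRequest(head) {
    for (var i = 0; i < ROUTES.length; i++) {
        var r = ROUTES[i];
        var hostOk = r.host === null || r.host === head.host;
        if (hostOk && head.path.indexOf(r.prefix) === 0) {
            return r.worker;
        }
    }
    return DEFAULT_WORKER;
}
```

Because the master has to read bytes off the socket to see the request line, it would presumably also need to forward those already-consumed bytes to the worker along with the descriptor, which is part of what makes this approach more involved than hashing the peer address.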

In conclusion

Finally, I hope this article has helped to shed some light on the state of multi-core support in NodeJS: several existing HTTP frameworks enable utilization of multiple cores of the box for a wide variety of NodeJS apps; node-webworker provides a useful abstraction on top of child_process for managing parallelism in NodeJS; and NodeJS itself can be used as an L7 HTTP router.

This post was written by Peter Griess (@pgriess), Principal Engineer, Yahoo! Mail