Open-Sourcing Traffic Server: 700k lines of code, 9 months

The following is from a conversation with Andrew Hsu, a member of the Yahoo! team that prepared the Traffic Server code base for its open source release.

What is Traffic Server (TS)?

Traffic Server is a fast, scalable, and extensible HTTP/1.1 compliant caching proxy server. TS includes a low-latency, extensible framework that is based on a “plug-in” architecture. Developers can customize existing plug-ins or develop their own to accommodate specific application requirements, while benefiting from the built-in efficiency and performance of Traffic Server. Plug-ins commonly developed by users include application-specific load balancers and URL mappers. Check out Miles Libbey’s description of TS’s usage at Yahoo!.
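To make the plug-in idea concrete, here is a minimal sketch of how a proxy core might hand URL mapping off to user-supplied plug-ins. It is written in Python for brevity and does not use the Traffic Server plug-in API; `ProxyCore`, `register`, `resolve`, `prefix_mapper`, and the example hostnames are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Optional

# Hypothetical plug-in signature (an assumption, not the Traffic Server API):
# a mapper takes a request URL and returns a rewritten URL, or None to pass.
UrlMapper = Callable[[str], Optional[str]]


class ProxyCore:
    """Toy stand-in for a proxy core that delegates URL mapping to plug-ins."""

    def __init__(self) -> None:
        self._mappers: List[UrlMapper] = []

    def register(self, mapper: UrlMapper) -> None:
        # Plug-ins are consulted in registration order.
        self._mappers.append(mapper)

    def resolve(self, url: str) -> str:
        # The first plug-in that returns a mapping wins; otherwise the
        # URL is left unchanged.
        for mapper in self._mappers:
            mapped = mapper(url)
            if mapped is not None:
                return mapped
        return url


def prefix_mapper(src_prefix: str, dst_prefix: str) -> UrlMapper:
    """Build a plug-in that rewrites URLs starting with src_prefix."""

    def mapper(url: str) -> Optional[str]:
        if url.startswith(src_prefix):
            return dst_prefix + url[len(src_prefix):]
        return None

    return mapper


core = ProxyCore()
core.register(
    prefix_mapper("http://www.example.com/", "http://origin.example.internal/")
)
```

Calling `core.resolve("http://www.example.com/a/b")` would go through the registered mapper, while a URL that matches no plug-in passes through untouched. The point of the design is that the core stays generic and application-specific behavior lives in small, replaceable functions.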

What is TS’s history?

Traffic Server was originally created by Inktomi and sold as a commercial product. The product was designed to run on every major operating system under the sun in the late 1990s: SunOS, DEC, IRIX, Windows, Linux, FreeBSD, etc. Yahoo! acquired Inktomi in 2002 and used Traffic Server internally for several years, but did not maintain all of the OS-specific code. In 2009, as we looked to open source the code, we decided to carve off or disable much of the OS-specific code for platforms other than Linux in order to simplify the cleanup process. Now that we have a community of developers growing around Traffic Server, contributors outside of Yahoo! can choose to build the code on more modern, non-Linux OSes as they see fit.

How does Yahoo! use it?

Traffic Server is serving more than 30 billion Web objects and more than 400 terabytes of data a day across the Yahoo! network. It’s in use as a proxy or cache (or both) by services like the Yahoo! homepage, Mail, Sports, Search, News, and Finance. Traffic Server is used in-house at Yahoo! to manage our own traffic and enable session management, authentication, configuration management, load balancing, and routing for entire cloud computing stacks.

What does Yahoo! gain from open-sourcing TS?

We plan to continue active development. For example, we are planning to add support for IPv6 and improve performance when dealing with very large files. We’d love to work with the community on these and other efforts and leverage the expertise and experience of people outside of Yahoo!. Since the open source release, the community has already contributed patches enabling full 64-bit functionality.

What was involved in open-sourcing TS?

A team of Yahoo! developers worked on modifying the code specifically to prepare for releasing it as open source. We also worked with the Yahoo! legal and user experience and design teams, the Open Source Working Group, and VPs and managers to coordinate the effort. We started with over 700K lines of code and reduced it to just over 300K lines of code (according to sloccount). Some of the things we removed were the OS-specific stuff we did not want to support, the Yahoo!-specific stuff we did not want to, or could not, open-source, and any obsolete functionality, like old streaming media protocols. I believe we have modified every line of code because we reformatted the code with “indent” and also restructured the directories. However, we did not “inspect” every line of code because the risk of a change in semantics was low.

How long did it take to open-source?

It was in planning for over nine months, from the first internal documentation edit about open-sourcing to the date of release.

How can people get it?

Apache’s SVN repo is the best source for Traffic Server.

Andrew Hsu
Yahoo! Traffic Server team

Erik Eldridge

Yahoo! Developer Network