Moving to the Cloud

There's a paradigm shift going on in the industry about how we deal with
computing infrastructure. Services such as Amazon Web Services, Google App Engine, and various other cloud infrastructure providers are changing the
way that companies think about writing, hosting, and deploying web
applications. At Yahoo!, we're also getting on the cloud. Our deep involvement with Hadoop is the best-known example, and we are building out other internal cloud services focused on storage and deployment. The cloud model enables Yahoo! properties and services to scale effortlessly and focus on what their audience wants, rather than on how to manage the data.

Data at Yahoo! scale

I'm the product manager for Sherpa, our cloud key-value store. We just released a new
version to Yahoo! properties, and I wanted to let the developer community know about it because it's cool technology and represents the hard work of some very talented people. While there are a lot of key-value stores out there, we needed to write our own for a specific reason: scale.

At Yahoo!, our systems have to scale horizontally (we have to handle tens of thousands of requests per second in a single datacenter) and geographically (our users are around the globe and their data needs to be close to them wherever they are). Scaling on these two axes simultaneously is a problem that very few companies have to deal with. At the same time, we must meet the latency SLAs required by user-facing pages.

Sherpa handles both of these scaling problems. Our loosely coupled architecture scales linearly and, for all intents and purposes, infinitely; it replicates data globally and still answers queries within our SLAs. We do all this behind a very simple RESTful interface with four basic operations: Get, Set, Delete, and Scan.

Client APIs

Sherpa's data model is a key-value store where data is stored as JSON blobs. Data is organized in tables where primary key uniqueness can be enforced, but other than
that, there are no fixed schemas.

A slightly modified example of a get operation illustrates the simplicity of the API as well as how we encode our data.

$ curl http://sherpa.yahoo.com/SherpaWebService/V1/get/AddressTable/yahoo
{"sherpa":{
"status":{"code":200,"message":"OK"},
"metadata":{"seq_id":"abc-4990E052:abc-5",
"modtime":1234231551},
"fields": {
"addr":{"value":"700 First Ave"},
"city":{"value":"Sunnyvale"},
"state":{"value":"CA"}
}
}}

In this case, we've retrieved a record with the primary key yahoo from the table AddressTable; the record contains the fields addr, city, and state, which in turn hold the values of Yahoo!'s address in California. If you want to add another field, just add it to the record. (Anyone who has used Python dictionaries, C++ maps, or Perl hashes will instinctively get the model.) Sherpa also returns a request status and record metadata that can be used to handle consistency issues (more about this below). Set, delete, and scan function in largely the same way. However, Sherpa does not support SQL-like queries and does not allow joins between tables.
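
To give a flavor of what a write looks like, here's a sketch of a set call in Python. The set endpoint's path, HTTP method, and payload shape are my assumptions extrapolated from the get example above, not documented details.

import json
import urllib.request

# A sketch of a Sherpa set call, mirroring the get example above. The
# endpoint path, HTTP method, and payload shape are assumptions
# extrapolated from the get response, not documented API details.
fields = {
    "addr":  {"value": "701 First Ave"},
    "city":  {"value": "Sunnyvale"},
    "state": {"value": "CA"},
}
request = urllib.request.Request(
    "http://sherpa.yahoo.com/SherpaWebService/V1/set/AddressTable/yahoo",
    data=json.dumps({"fields": fields}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    reply = json.load(response)
    print(reply["sherpa"]["status"])  # expect something like {"code":200,"message":"OK"}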

Built to scale

[Figure: Sherpa architecture (sherpa-arch.png)]

Sherpa is a classic sharded datastore with a few enhancements to provide the scalability that Yahoo! needs:
  • Our sharding algorithm is located in a separate architectural element rather than in the client. This allows us to add capacity dynamically without any user impact (see the sketch after this list).
  • The data itself is organized in small discrete chunks. This enables us to move data between storage nodes to route around hardware failures and rebalance overloaded nodes. All of this data reorganization can be carried out with minimal human intervention and no loss of availability, making Sherpa much easier to operate than conventional solutions like sharded MySQL. We also have the ability to maintain multiple copies of data in a datacenter if a property needs low-latency failover.
  • We've implemented a reliable, high-performance message bus with a flexible topology. For example, our US and main international datacenters communicate via a fully connected mesh, while smaller replicas can replicate on a spur. We achieve low write latencies by replicating inserts and updates asynchronously.
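
As a concrete illustration of the first two points, here's a minimal sketch of what keeping the shard map in a router rather than in the client buys you: moving a chunk between nodes becomes a metadata update that clients never notice. All names here are hypothetical; this is not Sherpa's actual implementation.

import bisect
import zlib

# Minimal sketch of server-side routing: the shard map lives in a router
# component rather than in the client, and data is organized in small
# chunks that can be reassigned between storage nodes. Hypothetical names
# throughout; not Sherpa's implementation.
class Router:
    def __init__(self):
        # Exclusive upper bounds of each hash-range chunk, plus the
        # storage node that currently owns that chunk.
        self.chunk_bounds = [2**30, 2**31, 3 * 2**30, 2**32]
        self.chunk_owners = ["node-a", "node-b", "node-c", "node-d"]

    def lookup(self, key: str) -> str:
        # Hash the key into [0, 2**32) and find the owning chunk.
        h = zlib.crc32(key.encode("utf-8"))
        i = bisect.bisect_right(self.chunk_bounds, h)
        return self.chunk_owners[i]

    def move_chunk(self, index: int, new_node: str) -> None:
        # Routing around a failed node or rebalancing a hot node is just
        # a metadata update here; clients keep calling lookup() unchanged.
        self.chunk_owners[index] = new_node

router = Router()
print(router.lookup("yahoo"))   # whichever node owns this key's chunk
router.move_chunk(2, "node-e")  # chunk reassigned; no client redeploy
print(router.lookup("yahoo"))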

Put your CAP on

In designing Sherpa, we've paid a lot of attention to making the system simple to use and hiding complex operations behind simple interfaces. However, we can't entirely hide the implications of the CAP theorem (first proposed by Dr. Eric Brewer). People who are much smarter than me have done a great job of explaining this (http://www.infoq.com/presentations/availability-consistency), so I'll quickly summarize here.

The CAP theorem says that all distributed systems face unavoidable trade-offs among consistency (all records are the same in all replicas), availability (all replicas can accept updates or inserts), and tolerance of network partitions (the system still functions when distributed replicas cannot talk to each other), and that you can pick only two of the three. Since Yahoo! exists in the real world, where network partitions happen, our trade-off must be between consistency and availability.

Sherpa lets users choose among various degrees of consistency and availability, and gives them tools and techniques to deal with the limitations that choice imposes. For example, if users choose high availability (as most of our users do), we provide mechanisms to identify and correct inconsistent records.
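
To make that concrete, here's a hypothetical sketch of how a client could use the seq_id and modtime metadata from the get example above to detect and resolve a stale replica. This is an illustration only, not Sherpa's actual reconciliation protocol.

# Hypothetical sketch of detecting an inconsistent record using the
# per-record metadata (seq_id, modtime) shown in the get example above.
# Not Sherpa's actual reconciliation protocol.
def pick_freshest(replica_responses):
    """Flag divergence across replica reads and return the record
    with the newest modification time."""
    records = [r["sherpa"] for r in replica_responses]
    seq_ids = {r["metadata"]["seq_id"] for r in records}
    if len(seq_ids) > 1:
        # Replicas disagree; under an availability-first configuration
        # this can happen after a network partition. Here we resolve by
        # recency, but an application could merge fields or re-read.
        print("inconsistent replicas:", sorted(seq_ids))
    return max(records, key=lambda r: r["metadata"]["modtime"])

stale = {"sherpa": {"metadata": {"seq_id": "abc-1:1", "modtime": 1234231551},
                    "fields": {"city": {"value": "Sunnyvale"}}}}
fresh = {"sherpa": {"metadata": {"seq_id": "abc-2:1", "modtime": 1234231600},
                    "fields": {"city": {"value": "San Jose"}}}}
print(pick_freshest([stale, fresh])["fields"]["city"]["value"])  # "San Jose"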

We've replaced your usual datastore...

Yahoo! properties have been very enthusiastic about Sherpa. Instead of
building and running their own storage infrastructure, dealing with hardware configuration, replication, and so forth, properties just get on the cloud. They set up their tables, write their app, and the Sherpa team handles the rest.

While we don't have any immediate plans to deploy Sherpa as an externally available web service, many user-facing Yahoo! services such as Y!OS (https://developer.yahoo.com/yos/intro/) and YQL will use Sherpa for their data storage.

Chances are, when you use Yahoo! in the near future, you'll be in the cloud with Sherpa.

If you're interested in a more technical treatment of some of the concepts employed in Sherpa, please check out this paper written by Yahoo! Research. Note that the paper describes a previous version of Sherpa; our production system outperforms it by an order of magnitude. Check back in a few months and we'll have detailed performance information that we can share.

Toby Negrin, Sherpa Product Manager