In my previous post Do you have what it takes to join Yahoo!'s US Hadoop Team? I talked about a number of technical positions we have open in the Yahoo! Hadoop team. The post was a big success. We received a number of excellent resumes, and ended up hiring a few great engineers.
But today I want to talk about a particular challlenge we have - testing Apache Hadoop. Testing distributed systems is a hard job. In addition to testing functionality, one has to deal with scalability, reliability, security, non-determinism, and a number of other tough challenges. The larger the distributed system, the harder it is to test. Yahoo! currently runs Hadoop on clusters of up to 4000 servers, and this is not the limit. These are some of the largest distributed systems in the world.
Also, the way Yahoo! uses Hadoop is changing. Previously, most Hadoop users at Yahoo! were researchers. Reseraches are usually hungry for scalability and features, but they are fairly tolerant of failures.Read More »from Do you have what it takes to test Hadoop?