I'm getting together with some of the Hadoop committers tomorrow. Considering my quality engineering background, these are some of the discussion items at the top of my mind for the project:
- Code Review Guidelines: I wrote these up a couple of years ago. Are they being followed? Are they the right set? How can we raise the quality of the code reviews performed before patches are committed?
- Feature Design Documentation: Can we agree that each feature needs a design doc? A proposed template is attached to HADOOP-5587.
- Feature Test Plans: Can we agree that each feature needs a test plan? A proposed template is also attached to HADOOP-5587.
- Warnings: We're working to reduce static analysis (Findbugs), compiler (javac), and documentation (javadoc) warnings to zero. Can we commit to keeping them there?
- Fault Injection Framework: We're working on a fault-injection framework so that my team and others can write tests that inject faults and monitor the effects. The current work is being contributed on HADOOP-5974. What additional requirements might folks have?
- Usability: The web UI and command lines could use some work to be more consistent and user-friendly. Can we agree that, by default, no stack trace should be shown to the user on the command line?
- Patch Testing: We've saturated the available hardware for test patches. More hardware is on the way. What problems do we have with the current setup (other than speed)? What improvements can we make?
- Fast Commit Builds: We need a quick (10-minute) build and test target in Hadoop. Once committed, how does this new target fit into the contributor/committer workflow?
- Project Split: This is being tracked as HADOOP-4687. How do we manage build and runtime dependencies between all Hadoop projects?
- TestNG vs. JUnit 4: Should we convert to TestNG to take advantage of some of its unique features, such as data providers and test annotations? This is being tracked in HADOOP-4901.
- True Unit Tests: Many of our current JUnit tests are really mini-system tests, since they use MiniMRCluster and MiniDFSCluster to bring up a cluster on a single node in a single process. How can we better support and encourage contributions of true unit-level tests?
- Testing for Backward Compatibility: There's a strong desire to achieve API and configuration backward compatibility from Hadoop 0.21 forward. After Hadoop 0.21, how do we ensure patches don't break backward compatibility?
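On the usability point, here's a minimal sketch of one possible pattern; the class and exception names are invented for illustration and are not actual Hadoop code. The idea: catch exceptions at the top of the command-line entry point, print a one-line message to stderr, and show the stack trace only when a debug switch is set.

```java
// Hypothetical sketch -- keep stack traces out of normal CLI output.
// None of these names come from the Hadoop code base.
public class FriendlyCli {

    // A checked exception type carrying a user-presentable message.
    static class CliException extends Exception {
        CliException(String message) { super(message); }
    }

    // The actual command logic; throws CliException on user-level errors.
    static void run(String[] args) throws CliException {
        if (args.length == 0) {
            throw new CliException("missing required argument: <path>");
        }
        System.out.println("ok: " + args[0]);
    }

    public static void main(String[] args) {
        boolean debug = Boolean.getBoolean("cli.debug"); // -Dcli.debug=true
        try {
            run(args);
        } catch (CliException e) {
            // By default, show only a concise message -- no stack trace.
            System.err.println("Error: " + e.getMessage());
            if (debug) {
                e.printStackTrace(); // full trace only on explicit request
            }
            System.exit(1);
        }
    }
}
```

The design choice is that only one top-level catch decides what the user sees, so individual commands never need their own stack-trace policy.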
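To illustrate the true-unit-test item, here's a hedged sketch; the interface and class names below are invented for this example and are not Hadoop types. The technique: put the cluster-facing calls behind a narrow interface so a test can substitute an in-memory fake instead of spinning up a MiniDFSCluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch -- these types are invented for illustration, not
// taken from Hadoop. Code under test talks to a narrow interface, so a
// unit test can supply an in-memory fake instead of a mini cluster.
public class BlockReportExample {

    // The seam: only the operations the code under test actually needs.
    interface NameNodeClient {
        void reportBlock(String blockId, long length);
    }

    // Production would wrap an RPC proxy; tests use this in-memory fake.
    static class FakeNameNodeClient implements NameNodeClient {
        final Map<String, Long> reported = new HashMap<String, Long>();
        public void reportBlock(String blockId, long length) {
            reported.put(blockId, length);
        }
    }

    // Code under test: pure logic -- no cluster, no network, no threads.
    static void reportIfComplete(NameNodeClient client, String blockId,
                                 long written, long expected) {
        if (written >= expected) {
            client.reportBlock(blockId, written);
        }
    }

    public static void main(String[] args) {
        FakeNameNodeClient fake = new FakeNameNodeClient();
        reportIfComplete(fake, "blk_1", 512, 512); // complete -> reported
        reportIfComplete(fake, "blk_2", 100, 512); // incomplete -> skipped
        System.out.println(fake.reported.containsKey("blk_1")); // true
        System.out.println(fake.reported.containsKey("blk_2")); // false
    }
}
```

A test like this runs in milliseconds and fails with a precise message, which is exactly what the mini-cluster tests can't give us.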
Quality and Release Engineering
Yahoo! Cloud Computing