Vespa Product Updates, February 2020: Ranking with LightGBM Models, Matrix Multiplication Performance, Benchmarking Guide, Query Builder, and Hadoop Integration
<p><a href="https://www.linkedin.com/in/kraune/">Kristian Aune</a>, Tech Product Manager, Verizon Media</p><p>In the <a href="https://yahoodevelopers.tumblr.com/post/190520850938/vespa-product-updates-january-2020-tensor">January Vespa product update</a>, we mentioned Tensor Operations, New Sizing Guides, Performance Improvements for Matched Elements in Map/Array-of-Struct, and Boolean Query Optimizations. This month, we’re excited to share the following updates:</p><p><b>Ranking with LightGBM Models</b></p><p>Vespa now supports <a href="https://docs.vespa.ai/documentation/lightgbm.html">LightGBM</a> machine learning models in addition to ONNX, TensorFlow and XGBoost. LightGBM is a gradient boosting framework that trains fast, has a small memory footprint, and offers accuracy similar to or better than XGBoost. LightGBM also supports categorical features.</p><p><b>Matrix Multiplication Performance</b></p><p>Vespa now uses <a href="https://www.openblas.net/">OpenBLAS</a> for matrix multiplication, which improves performance in machine-learned models using matrix multiplication.</p><p><b>Benchmarking Guide</b></p><p>Teams use Vespa to implement applications with strict latency requirements and minimal cost. In January, we released a new sizing guide. 
This month, we’re adding a <a href="https://docs.vespa.ai/documentation/performance/vespa-benchmarking.html">benchmarking guide</a> that you can use to find the sweet spot between cost and performance.</p><p><b>Query Builder</b></p><p>Thanks to contributions from <a href="https://github.com/vespa-engine/vespa/commits?author=yehzu">yehzu</a>, Vespa now has a fluent library for composing <a href="https://docs.vespa.ai/documentation/query-language.html">queries</a>. Explore the <a href="https://github.com/vespa-engine/vespa/tree/master/client">client</a> module for details.</p><p><b>Hadoop Integration</b></p><p>Vespa integrates with <a href="https://docs.vespa.ai/documentation/feed-using-hadoop-pig-oozie.html">Hadoop</a> and is easy to feed from a grid. The grid integration now also supports conditional writes; see <a href="https://github.com/vespa-engine/vespa/pull/12081">#12081</a>.</p><p>We welcome your contributions and feedback (<a href="https://twitter.com/vespaengine">tweet</a> or <a href="mailto:info@vespa.ai">email</a>) about any of these new features or future improvements you’d like to request.</p><p><b>About Vespa:</b> Largely developed by Yahoo engineers, <a href="https://github.com/vespa-engine/vespa">Vespa</a> is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.</p>
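<p><b>Example: a conditional write</b></p><p>As a rough illustration of the conditional writes mentioned under Hadoop Integration, the sketch below builds a Vespa JSON update operation that carries a test-and-set <code>condition</code>. The namespace <code>mynamespace</code>, the document type <code>music</code>, the field <code>year</code>, and the helper <code>conditional_update</code> are hypothetical names chosen for this example. It only shows the shape of the operation; how the grid integration attaches the condition when feeding is described in the linked pull request, not here.</p>

```python
import json


def conditional_update(doc_id, condition, fields):
    """Build a Vespa JSON update operation guarded by a condition.

    Hypothetical helper for illustration: the operation is meant to be
    applied only if the stored document matches `condition`, which is a
    document selection expression.
    """
    return {
        "update": doc_id,
        "condition": condition,
        # Each field is overwritten with an "assign" update.
        "fields": {name: {"assign": value} for name, value in fields.items()},
    }


# Assumed example data: set year to 2020 only if it is currently 2019.
op = conditional_update(
    "id:mynamespace:music::example-1",
    "music.year==2019",
    {"year": 2020},
)

print(json.dumps(op, indent=2))
```

<p>If the condition does not match the stored document, the write is expected to be rejected rather than applied, which lets concurrent feeders avoid overwriting each other's changes.</p>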