Vespa Product Updates, October/November 2019: Nearest Neighbor and Tensor Ranking, Optimized JSON Tensor Feed Format,
Matched Elements in Complex Multi-value Fields, Large Weighted Set Update Performance, and Datadog Monitoring Support
<p><a href="https://www.linkedin.com/in/kraune/">Kristian Aune</a>, Tech Product Manager, Verizon Media</p><p>In the <a href="https://yahoodevelopers.tumblr.com/post/188005614883/vespa-product-updates-september-2019-tensor">September Vespa product update</a>, we mentioned Tensor Float Support, Reduced Memory Use for Text Attributes, Prometheus Monitoring Support, and Query Dispatch Integrated in Container.</p><p>This month, we’re excited to share the following updates:</p><p><b>Nearest Neighbor and Tensor Ranking</b></p><p><a href="https://docs.vespa.ai/documentation/tensor-intro.html">Tensors</a> are native to Vespa. We compared <a href="http://elastic.co">elastic.co</a> to <a href="http://vespa.ai">vespa.ai</a> by testing nearest neighbor ranking using a dense tensor dot product. With an out-of-the-box configuration, Vespa performed 5 times faster than Elastic. <a href="https://github.com/jobergum/dense-vector-ranking-performance">View the test results</a>.</p><p><b>Optimized JSON Tensor Feed Format</b></p><p>A tensor is a data type used for advanced ranking and recommendation use cases in Vespa. This month, we released an optimized tensor format that enables more than a 10x improvement in feed rate. <a href="https://docs.vespa.ai/documentation/reference/document-json-format.html#tensor">Read more</a>.</p><p><b>Matched Elements in Complex Multi-value Fields</b></p><p>Vespa is used in many use cases with structured data, where documents can have arrays of structs or maps. Such arrays and maps can grow large, and often only the entries matching the query are relevant. You can now use the recently released <a href="https://docs.vespa.ai/documentation/reference/search-definitions-reference.html#matched-elements-only">matched-elements-only</a> setting to return only the matching entries.
This increases performance and simplifies front-end code.</p><p><b>Large Weighted Set Update Performance</b></p><p><a href="https://docs.vespa.ai/documentation/advanced-ranking.html#weightedset-example">Weighted sets</a> in documents are used to store a large number of elements used in ranking. Such sets are often updated at high volume, in real time, enabling online big data serving. Vespa-7.129 includes a performance optimization for updating large sets. For example, a set with 10K elements, without <a href="https://docs.vespa.ai/documentation/attributes.html#search">fast-search</a>, is 86.5% faster to update.</p><p><b>Datadog Monitoring Support</b></p><p>Vespa is often used in large-scale, mission-critical applications. For easy integration into dashboards, Vespa is now in Datadog’s <a href="https://github.com/DataDog/integrations-extras/tree/master/vespa">integrations-extras</a> GitHub repository. Existing Datadog users will now find it easy to monitor Vespa. <a href="https://docs.vespa.ai/documentation/reference/metrics.html#datadog-integration">Read more</a>.</p><p>About Vespa: Largely developed by Yahoo engineers, <a href="https://github.com/vespa-engine/vespa">Vespa</a> is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.</p><p>We welcome your contributions and feedback (<a href="https://twitter.com/vespaengine">tweet</a> or <a href="mailto:info@vespa.ai">email</a>) about any of these new features or future improvements you’d like to request.</p>