Vespa Product Updates, December 2019: Improved ONNX Support, New Rank Feature attributeMatch().maxWeight, Free Lists for Attribute Multivalue Mapping, Faster Updates for Out-of-Sync Documents, and ZooKeeper 3.5.6 Support
<p><a href="https://www.linkedin.com/in/kraune/">Kristian Aune</a>, Tech Product Manager, Verizon Media</p><p>In the <a href="https://yahoodevelopers.tumblr.com/post/188839744218/vespa-product-updates-octobernovember-2019">November Vespa product update</a>, we mentioned Nearest Neighbor and Tensor Ranking, Optimized JSON Tensor Feed Format, Matched Elements in Complex Multi-value Fields, Large Weighted Set Update Performance and Datadog Monitoring Support.</p><p>Today, we’re excited to share the following updates:</p><p><b>Improved ONNX Support</b></p><p>Vespa now supports more operations in its ONNX model API, such as General Matrix Multiplication (GEMM); see the <a href="https://docs.vespa.ai/documentation/onnx.html#onnx-operation-support">list of supported opsets</a>. Vespa has also improved support for PyTorch through ONNX; see the pytorch_test.py <a href="https://github.com/vespa-engine/vespa/blob/master/model-integration/src/test/models/pytorch/pytorch_test.py#L60">example</a>.</p><p><b>New Rank Feature attributeMatch().maxWeight</b></p><p><a href="https://docs.vespa.ai/documentation/reference/rank-features.html#attributeMatch(name).maxWeight">attributeMatch(name).maxWeight</a> was added in Vespa-7.135.5. Its value is the maximum weight of the attribute keys matched in a weighted set attribute.</p><p><b>Free Lists for Attribute Multivalue Mapping</b></p><p>Since Vespa-7.141.8, <a href="https://docs.vespa.ai/documentation/attributes.html">multivalue attributes</a> use a free list to improve performance. This reduces CPU usage (no compaction jobs) and lowers memory usage by approximately 10%. It primarily benefits applications with a high update rate to such attributes.</p><p><b>Faster Updates for Out-of-Sync Documents</b></p><p>Vespa handles replica consistency using bucket checksums. 
Updating a document can be cheaper than putting a new document, because it requires fewer updates to posting lists. For updates to documents in inconsistent buckets, a GET-UPDATE is now used instead of a GET-PUT whenever the document to update is consistent across replicas. This is the common case when only a subset of the documents in the bucket are out of sync. The change benefits applications with high update rates that update multi-value fields with large sets. See the <a href="https://github.com/vespa-engine/vespa/pull/11319">pull request</a> for details.</p><p><b>ZooKeeper 3.5.6</b></p><p>Vespa now uses Apache ZooKeeper 3.5.6 and can encrypt communication between ZooKeeper servers.</p><p>About Vespa: Largely developed by Yahoo engineers, <a href="https://github.com/vespa-engine/vespa">Vespa</a> is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.</p><p>We welcome your contributions and feedback (<a href="https://twitter.com/vespaengine">tweet</a> or <a href="mailto:info@vespa.ai">email</a>) about any of these new features or future improvements you’d like to request.</p>
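<p>Appendix: as a rough illustration of what the attributeMatch(name).maxWeight rank feature described above computes, here is a minimal Python sketch. It is not Vespa code: the function name <code>max_matched_weight</code>, the dictionary representation of a weighted set, and the choice to return 0 when no key matches are all assumptions made for this example.</p>

```python
def max_matched_weight(weighted_set, query_keys):
    """Return the largest weight among the weighted-set keys matched by the query.

    weighted_set: dict mapping key -> integer weight (hypothetical stand-in
                  for a weighted set attribute).
    query_keys:   iterable of keys the query searches for.
    Returns 0 when nothing matches (an assumption made for this sketch).
    """
    matched_weights = [weighted_set[key] for key in query_keys if key in weighted_set]
    return max(matched_weights, default=0)


if __name__ == "__main__":
    # Hypothetical document with a weighted set attribute of tags.
    tags = {"rock": 30, "pop": 80, "jazz": 55}
    # The query matches "rock" and "jazz"; "pop" is not matched, so its
    # higher weight does not count.
    print(max_matched_weight(tags, ["rock", "jazz", "metal"]))
```

<p>In this sketch only the matched keys contribute, so an unmatched key with a higher weight does not affect the result.</p>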