Vespa Product Updates, December 2018: ONNX Import and Map Attribute Grouping
<p>Hi Vespa Community!</p><p>Today we’re kicking off a blog post series of need-to-know updates on Vespa, summarizing the features and fixes detailed in <a href="https://github.com/vespa-engine/vespa/issues">GitHub issues</a>.</p><p>We welcome your <a href="https://github.com/vespa-engine/vespa/blob/master/CONTRIBUTING.md">contributions</a> and feedback about any new features or improvements you’d like to see.</p><p>For December, we’re excited to share the following product news:</p><p><b>Streaming Search Performance Improvement</b><br/>Streaming Search is a solution for applications where each query only searches a small, statically determined subset of the corpus. In this case, Vespa searches without building reverse indexes, reducing storage cost and making writes more efficient. With the latest changes, the document type is used to further limit data scanning, resulting in lower latencies and higher throughput. Read more in the <a href="https://docs.vespa.ai/documentation/streaming-search.html">Streaming Search documentation</a>.</p><p><br/><b>ONNX Integration</b><br/><a href="https://onnx.ai/">ONNX</a> is an open ecosystem for interchangeable AI models. Vespa now supports importing models in the ONNX format and transforming the models into <a href="https://docs.vespa.ai/documentation/tensor-intro.html">Tensors</a> for use in ranking. This adds to the TensorFlow import included earlier this year and allows Vespa to support many training tools. Vespa’s strength is real-time model evaluation over large datasets, but to get started by evaluating single data points, try the <a href="https://docs.vespa.ai/documentation/stateless-model-evaluation.html">stateless model evaluation API</a>. Explore this integration further in <a href="https://docs.vespa.ai/documentation/onnx.html">Ranking with ONNX models</a>.</p><p><br/><b>Precise Transaction Log Pruning</b><br/>Vespa is built for large applications running continuous integration and deployment. 
This means nodes restart often for software upgrades, and node restart time matters. A common pattern is to keep serving while restarting hosts one by one. Vespa has optimized transaction log pruning with prepareRestart, which flushes as much as possible before stopping. This is quicker than replaying the same data after restarting. This feature is on by default. Learn more in <a href="https://docs.vespa.ai/documentation/operations/live-content-cluster-upgrade.html">live upgrade</a> and <a href="https://docs.vespa.ai/documentation/reference/vespa-cmdline-tools.html#vespa-proton-cmd">prepareRestart</a>.</p><p><br/><b>Grouping on Maps</b><br/>Grouping is used to implement faceting. Vespa has added support for grouping on map attribute fields, creating a group for values whose keys match the specified key, or field values referenced by the key. This is useful for creating indirections and relations in data, and is a good fit for use cases with structured data such as e-commerce. Leverage key values instead of field names to simplify the search definition. Read more in <a href="https://docs.vespa.ai/documentation/reference/grouping-syntax.html">Grouping on Map Attributes</a>.</p><p>Questions or suggestions? <a href="https://twitter.com/vespaengine">Send us a tweet</a> or an <a href="mailto:info@vespa.ai">email</a>.</p>
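<p><b>Appendix: the idea behind grouping on a map key</b><br/>For readers new to the Grouping on Maps item above, here is a minimal plain-Python sketch of the concept. It is not Vespa’s grouping syntax or API. The function <code>group_by_map_key</code>, the field name <code>attrs</code>, and the sample products are all hypothetical, and skipping documents that lack the key is an assumption of this sketch rather than documented Vespa behavior.</p>

```python
from collections import defaultdict


def group_by_map_key(docs, field, key):
    """Bucket document ids by the value stored under `key` in the map field `field`.

    Illustration only: hypothetical helper, not part of any Vespa API.
    """
    groups = defaultdict(list)
    for doc in docs:
        value = doc.get(field, {}).get(key)
        # Assumption of this sketch: documents without the key form no group.
        if value is not None:
            groups[value].append(doc["id"])
    return dict(groups)


# Hypothetical e-commerce documents with one map field holding all attributes.
products = [
    {"id": "p1", "attrs": {"color": "red", "size": "M"}},
    {"id": "p2", "attrs": {"color": "blue", "size": "M"}},
    {"id": "p3", "attrs": {"color": "red"}},
]

# The same map field serves two different facets, selected by key.
by_color = group_by_map_key(products, "attrs", "color")
by_size = group_by_map_key(products, "attrs", "size")
```

<p>In this sketch, faceting on another attribute only requires passing a different key. No additional field has to be declared, which is the simplification the map-based approach is aiming for.</p>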