Vespa Product Updates, April 2020: Improved Performance for Large Fan-out Applications, Improved Node Auto-fail Handling, CloudWatch Metric Import, & CentOS 7 Dev Environment
<p><b><a href="https://www.linkedin.com/in/kraune/">Kristian Aune</a>, Tech Product Manager, Verizon Media</b><br/></p><p>In the <a href="https://yahoodevelopers.tumblr.com/post/611541510021857280/vespa-product-updates-february-2020-ranking-with">previous update</a>, we mentioned Ranking with LightGBM Models, Matrix Multiplication Performance, Benchmarking Guide, Query Builder and Hadoop Integration. This month, we’re excited to share the following updates:</p><p><b>Improved Performance for Large Fan-out Applications</b></p><p>Vespa container nodes execute queries by fanning out to a set of content nodes, each of which evaluates its part of the data in parallel. When the fan-out or the partial result from each node is large, this can exhaust network bandwidth. Vespa now provides an optimization that lets you control the tradeoff between the size of the partial results and the probability of getting a 100% correct global result. As it turns out, tolerating a small probability of less than 100% correctness gives a large reduction in network usage. <a href="https://docs.vespa.ai/documentation/reference/services-content.html#top-k-probability">Read more</a>.</p><p><b>Improved Node Auto-fail Handling</b></p><p>Whenever content nodes fail, data is auto-migrated to other nodes. This consumes resources on both the sending and receiving nodes, competing with the resources used for processing client operations. Starting with Vespa-7.197, we have improved operation and thread scheduling, which reduces the impact on the latency of client document API operations when a node is under heavy migration load.</p><p><b>CloudWatch Metric Import</b></p><p>Vespa metrics can now be pushed or pulled into <a href="https://aws.amazon.com/cloudwatch/">AWS CloudWatch</a>. Read more in <a href="https://docs.vespa.ai/documentation/monitoring.html">monitoring</a>.
</p><p><b>CentOS 7 Dev Environment</b></p><p>A <a href="https://github.com/vespa-engine/docker-image-dev#vespa-development-on-centos-7">development environment</a> for Vespa on CentOS 7 is now available. This ensures that the turnaround time between code changes and running unit tests and system tests is short, and makes it easier to contribute to Vespa.</p><p><b>About Vespa:</b> Largely developed by Yahoo engineers, <a href="https://github.com/vespa-engine/vespa">Vespa</a> is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.</p><p>We welcome your contributions and feedback (<a href="https://twitter.com/vespaengine">tweet</a> or <a href="mailto:info@vespa.ai">email</a>) about any of these new features or future improvements you’d like to request.</p>