May 7, 2025

OpenSearch Gets Parallel Performance Boost Thanks to GPUs

Organizations adopting OpenSearch for big data search, analytics, and AI will see a 9.5x performance increase compared to a prior release of the product, the group behind the open source project says, including a 9.3x boost in the performance of vector database workloads thanks to an experimental new GPU-powered indexing mechanism.

OpenSearch is an open source search and analytics engine whose creation was spearheaded by AWS and which is currently managed by the OpenSearch Software Foundation. The software, which is derived from Elasticsearch version 7.10.2, enables users to store, search, and analyze large amounts of data, including logs and real-time data streams.

With OpenSearch 3.0, the foundation has added a number of new enhancements, including a new vector database engine that can be used to help power GenAI workloads, such as retrieval-augmented generation (RAG). The new vector engine in OpenSearch 3.0 supports Nvidia’s cuVS library, which enables it to utilize the power of Nvidia GPUs for creating vector indexes as well as for powering vector searches against those indexes.
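To make that concrete, here is a minimal sketch of creating a vector index with the OpenSearch Python client. The index name, field name, dimension, and HNSW/Faiss configuration are illustrative choices, not defaults, and the GPU-accelerated build path is experimental and not configured here.

```python
from opensearchpy import OpenSearch

# Connect to a local OpenSearch cluster (host and port are assumptions).
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Create an index with a knn_vector field; "docs", "embedding", and the
# 768-dimension HNSW/Faiss setup below are illustrative.
client.indices.create(
    index="docs",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 768,
                    "method": {
                        "name": "hnsw",
                        "space_type": "l2",
                        "engine": "faiss",
                    },
                }
            }
        },
    },
)
```

Once documents with embeddings are indexed into a field like this, the engine builds the vector index behind the scenes, which is the step the new cuVS integration offloads to the GPU.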

The new GPU support, which is currently in the experimental phase, will speed up data-intensive vector index builds by up to 9.3x while reducing costs by 3.75x compared to CPU-only solutions, the foundation says.

While the OpenSearch vector engine made big strides in 2024 (AVX-512 SIMD support, segment replication, efficient vector formats for reading and writing vectors, iterative index builds, intelligent graph builds, and derived source), the addition of GPU support is poised to turbocharge the project for vector use cases, OpenSearch engineers wrote in a recent blog.

“Vector operations, particularly distance calculations, are computationally intensive tasks that are ideally suited for parallel processing,” they wrote in the March 18 blog post. “GPUs excel in this domain due to their massively parallel architecture, capable of performing thousands of calculations simultaneously. By leveraging GPU acceleration for these compute-heavy vector operations, OpenSearch can dramatically reduce index build times. This not only improves performance but also translates to significant cost savings, as shorter processing times mean reduced resource utilization and lower operational expenses.”
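The point about distance calculations is easy to see in miniature: a brute-force nearest-neighbor pass is one big batched arithmetic operation in which every row is independent, exactly the shape of work GPUs parallelize. Here is a rough sketch (corpus size and dimension are arbitrary; NumPy stands in on the CPU, while a library like Nvidia's cuVS runs the same math across thousands of GPU cores):

```python
import numpy as np

# A toy corpus: 100,000 indexed vectors of 768 dimensions (sizes arbitrary).
corpus = np.random.rand(100_000, 768).astype(np.float32)
query = np.random.rand(768).astype(np.float32)

# Squared L2 distance from the query to every corpus vector in one batched op.
# Each row is computed independently of the others, which is why this maps
# cleanly onto massively parallel GPU hardware.
dists = np.sum((corpus - query) ** 2, axis=1)

# Indices of the 5 nearest neighbors (unordered).
top_k = np.argpartition(dists, 5)[:5]
print(top_k, dists[top_k])
```

Index construction repeats this kind of distance work many times over while wiring up the graph structure, which is why shorter build times translate so directly into lower costs.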

The new release of OpenSearch brings other AI enhancements, including support for the Model Context Protocol (MCP), the protocol created by Anthropic in 2024 to enable coordination among LLMs, agents, and databases. This release also brings support for derived sources, which the foundation says will “reduce storage consumption by up to one-third by removing redundant vector data sources and utilizing primary data to recreate source documents as needed for reindexing or source call back.”

OpenSearch brings several other new capabilities, some of which are experimental in nature, including:

  • Support for gRPC, Google's open source remote procedure call framework, which should bring faster and more efficient data communication;
  • The capability to pull data in from streaming systems like Apache Kafka and Amazon Kinesis (a sketch of which appears after this list);
  • Isolation of OpenSearch reader and writer components, enabling both to scale independently;
  • Support for Apache Calcite, the query planning and optimization framework, which will bolster security, observability, and log analysis workloads;
  • And automatic detection of log-related indexes.
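As an illustration of the pull-based ingestion item above, a streaming index would be created with an ingestion source declared in its settings. This is a sketch only: the feature is experimental, and the exact setting names under "ingestion_source" are assumptions that may differ from or change in the shipping API; the topic and broker address are placeholders.

```python
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Sketch of a pull-based streaming index backed by a Kafka topic.
# The "ingestion_source" block is an assumption about the experimental
# API; "app-logs" and "localhost:9092" are placeholders.
client.indices.create(
    index="logs-stream",
    body={
        "settings": {
            "ingestion_source": {
                "type": "kafka",
                "param": {
                    "topic": "app-logs",
                    "bootstrap_servers": "localhost:9092",
                },
            },
            "index.number_of_shards": 1,
            "index.number_of_replicas": 0,
        }
    },
)
```

In this pull-based model the index polls the stream itself rather than waiting for clients to push bulk requests, which pairs naturally with the reader/writer separation also listed above.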

OpenSearch 3.0 also features several other core enhancements, including an upgrade to Apache Lucene 10, Java 21 as the minimum supported runtime, and support for the Java Platform Module System (JPMS).

Related Items:

AWS Brings OpenSearch Under the Linux Foundation

AWS Announces General Availability of OpenSearch Serverless

AWS Adds Vector Capabilities to More Databases
