

Rockset today unveiled new vector database capabilities, such as the addition of approximate nearest neighbor (ANN) search and native support for LlamaIndex and LangChain, that it says will help companies efficiently scale their GenAI applications once they’re in production.
As companies experiment with the new generative AI capabilities delivered via large language models (LLMs) and vector search, they’re getting good early results, says Rockset co-founder and CEO Venkat Venkataramani.
“We’re not educating people on what can vector search do for you,” he says. “They’ve already tinkered it at very small scale, built prototypes, and they already see the magic.”
While vector search and GenAI prototypes tease a tantalizing future, companies often run into trouble when they try to make the leap from development to production.
“Not a week goes by where somebody calls me and says, ‘Venkat, I started with this toy open source vector database and we did a shadow launch and a scale test, and it just bombed,’” Venkataramani says. “Other vector databases may have good vector support, but the database part is very shaky. Is it scalable? Is it reliable? It gets very expensive and very hard to operate very quickly.”
Rockset rolled out its initial support for vector search and storing vectorized embeddings earlier this year. Like many other SQL and NoSQL databases, the Silicon Valley firm experienced a surge in demand for these data types, which are instrumental for enabling vector search as well as other types of GenAI applications built atop LLMs and computer vision models.
The addition today of ANN and native support for LlamaIndex and LangChain, which are open source tools for automating prompt engineering and other critical behind-the-scenes GenAI data workflows, bolsters Rockset’s existing capabilities for serving scalable GenAI apps.
The ANN algorithm is critical for quickly matching GenAI app user input to pre-generated vector embeddings stored in a vector database. It’s used both in vector search, where it powers the similarity search, as well as other GenAI use cases for text and computer vision.
Rockset’s implementation of ANN is unique, Venkataramani says, because it updates the ANN index in real time as new data arrives, rather than rebuilding it in a batch job that requires downtime.
“Other vector databases require you to rebuild the entire ANN index and all of that in batch mode, and so you don’t really get a real time application,” he says. “Rebuilding these indexes also is actually way more computationally expensive, but if you can incrementally maintain it, it is a lot cheaper and also more real-time.”
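The difference between rebuilding an index in batch and maintaining it incrementally can be sketched in a few lines of Python. This is only a toy illustration, not Rockset’s implementation: the `IncrementalBucketIndex` class, its fixed centroids, and the `nprobe` parameter are all assumptions made for the example. Each insert touches a single bucket, so a new vector is searchable immediately, and a query scans only the closest buckets, which is what makes the search approximate.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class IncrementalBucketIndex:
    """Toy ANN index (hypothetical): vectors are grouped under the nearest
    of a fixed set of centroids, and queries scan only a few buckets."""

    def __init__(self, centroids):
        self.centroids = [tuple(c) for c in centroids]
        self.buckets = [[] for _ in self.centroids]

    def _ranked_buckets(self, vector):
        # Bucket numbers ordered from closest centroid to farthest.
        return sorted(
            range(len(self.centroids)),
            key=lambda i: squared_distance(vector, self.centroids[i]),
        )

    def insert(self, item_id, vector):
        # Incremental maintenance: only one bucket changes, no global rebuild.
        nearest = self._ranked_buckets(vector)[0]
        self.buckets[nearest].append((item_id, tuple(vector)))

    def search(self, query, k=1, nprobe=1):
        # Approximate search: only the nprobe closest buckets are scanned,
        # so a true neighbor sitting in another bucket can be missed.
        candidates = []
        for bucket in self._ranked_buckets(query)[:nprobe]:
            candidates.extend(self.buckets[bucket])
        candidates.sort(key=lambda item: squared_distance(query, item[1]))
        return [item_id for item_id, _ in candidates[:k]]


index = IncrementalBucketIndex(centroids=[(0.0, 0.0), (10.0, 10.0)])
index.insert("a", (0.5, 0.2))
index.insert("b", (9.0, 9.5))
print(index.search((9.2, 9.4), k=1))  # ['b']

# A newly arrived vector is visible to the very next query.
index.insert("c", (9.2, 9.4))
print(index.search((9.2, 9.4), k=1))  # ['c']
```

A batch-oriented design would instead recompute every bucket assignment (and typically the centroids) before new vectors became searchable.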
Rockset’s support for compute-compute separation enables it to run workloads such as index rebuilding, compaction, and ongoing maintenance without impacting the application’s main vector query workload, Venkataramani says. Compute-compute separation gives the database a big advantage when it comes to scaling GenAI applications, he says.
“You can have one or more compute instances for searches and similarity searches and vector searches and other real-time analytics and reporting–whatever applications you have,” the Datanami 2022 Person to Watch says. “They’re completely decoupled. They’re fully independently scalable and isolated from each other. But they work on the same copy of the data, and new data coming in–new updates, inserts, and deletes–will be available for your searches within single-digit milliseconds.”
The fact that Rockset, as a distributed relational database, can store all of a customer’s data as opposed to just storing vectors, as a dedicated vector database does, is another big advantage, Venkataramani says.
“You can have one column that’s basically vector embeddings, and all the other columns and other structured data available right there,” he says. “Building these kinds of hybrid searches across vectors and other metadata that you have is as simple as a SQL where clause. It’s not like you have a vector database and then you put all the other metadata and other structured data in a second separate database and you have to somehow in the application wire them together.”
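The idea of a hybrid search expressed as an ordinary SQL WHERE clause can be illustrated with Python’s built-in `sqlite3` module standing in for the database. This is a sketch under stated assumptions: the `songs` table, its columns, and the `l2_distance` function are hypothetical and are not Rockset’s schema or SQL functions. The embedding lives in one column, the metadata in the others, and a single query filters on metadata and ranks by vector distance.

```python
import json
import sqlite3


def l2_distance(embedding_json, query_json):
    """Euclidean distance between two vectors stored as JSON arrays."""
    a = json.loads(embedding_json)
    b = json.loads(query_json)
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL.
conn.create_function("l2_distance", 2, l2_distance)
conn.execute(
    "CREATE TABLE songs (title TEXT, genre TEXT, year INTEGER, embedding TEXT)"
)

# Hypothetical rows: metadata columns plus one embedding column.
rows = [
    ("Song A", "jazz", 1959, [0.9, 0.1]),
    ("Song B", "rock", 1975, [0.8, 0.2]),
    ("Song C", "jazz", 2001, [0.1, 0.9]),
    ("Song D", "jazz", 1964, [0.6, 0.4]),
]
conn.executemany(
    "INSERT INTO songs VALUES (?, ?, ?, ?)",
    [(title, genre, year, json.dumps(emb)) for title, genre, year, emb in rows],
)


def hybrid_search(query_vector, genre, max_year, k):
    """Metadata filter and vector ranking combined in one SQL statement."""
    sql = """
        SELECT title FROM songs
        WHERE genre = ? AND year <= ?
        ORDER BY l2_distance(embedding, ?)
        LIMIT ?
    """
    params = (genre, max_year, json.dumps(query_vector), k)
    return [title for (title,) in conn.execute(sql, params)]


# Song B is the closest vector overall, but the WHERE clause excludes it.
print(hybrid_search([0.8, 0.2], genre="jazz", max_year=1970, k=2))
# ['Song A', 'Song D']
```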
Having all of the data in one place turns out to be very important in some GenAI use cases, such as powering a song recommendation engine, Venkataramani says. Running an ANN search or a K nearest neighbor (KNN) search, the latter of which applies a brute-force approach that delivers exact answers, is just one of many steps that happen behind the scenes in a recommendation engine. Developers may also apply pre- and post-filtering using other metadata to get the best song recommendations in front of the user.
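For readers unfamiliar with the brute-force approach, exact KNN amounts to scoring the query against every stored vector and keeping the best k. The following is a minimal sketch, assuming cosine similarity as the metric and a plain dictionary of hypothetical song embeddings; neither choice comes from Rockset.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_exact(query, vectors, k):
    """Exact KNN: every stored vector is compared with the query.

    `vectors` maps an item id to its embedding. The cost grows linearly
    with the number of stored vectors, which is why ANN indexes exist.
    """
    return heapq.nlargest(
        k, vectors, key=lambda item_id: cosine_similarity(query, vectors[item_id])
    )


# Hypothetical two-dimensional embeddings, for illustration only.
embeddings = {"s1": [1.0, 0.0], "s2": [0.9, 0.1], "s3": [0.0, 1.0]}
print(knn_exact([1.0, 0.0], embeddings, k=2))  # ['s1', 's2']
```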
“You want to push the computation close to where the data lives, but the optimizer needs to be able to know which filters to apply first and which filters to apply second,” he says. “Imagine I have all the vectors in the vector database and all the metadata in the second database. Which one do I do first? If I go and get the 10 songs that are closest in the vector database, all of them might be in my recent playlist. If I go and look at all the songs from all these artists, none of them might be nearest neighbors. So I have to be able to combine them in the same SQL WHERE clause to be able to do this efficiently on the same data set.”
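The ordering problem Venkataramani describes can be reproduced with a small Python sketch. The song ids, vectors, and function names below are made up for the example. Filtering after the nearest-neighbor step can leave nothing to recommend, while filtering before it ranks only the eligible songs.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def post_filter(query, songs, excluded, k):
    """Take the k nearest songs first, then drop the excluded ones."""
    nearest = sorted(songs, key=lambda s: squared_distance(query, songs[s]))[:k]
    return [s for s in nearest if s not in excluded]


def pre_filter(query, songs, excluded, k):
    """Drop the excluded songs first, then take the k nearest of the rest."""
    allowed = [s for s in songs if s not in excluded]
    return sorted(allowed, key=lambda s: squared_distance(query, songs[s]))[:k]


# Hypothetical embeddings; s1 and s2 are in the listener's recent playlist.
songs = {
    "s1": (0.0, 0.0),
    "s2": (0.1, 0.0),
    "s3": (0.5, 0.5),
    "s4": (0.9, 0.9),
}
recent_playlist = {"s1", "s2"}
query = (0.0, 0.0)

print(post_filter(query, songs, recent_playlist, k=2))  # []
print(pre_filter(query, songs, recent_playlist, k=2))   # ['s3', 's4']
```

When the vectors and the metadata live in the same system, a query optimizer can choose between these orderings per query instead of leaving that choice to application code.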
Since OpenAI ignited the GenAI storm a year ago with the launch of ChatGPT, the need for vector capabilities has exploded in the database market. Rockset’s vector capabilities are attracting attention among existing customers as well as prospects that are building GenAI applications, ranging from chatbots to recommendation engines to vector search, Venkataramani says.
“It’s really hot. It’s very, very significant,” he says. “AI applications are not like…a separate category of apps. Every application will have parts of their application powered by AI models and AI kind of capabilities, and it’ll be invisible…You’re not going to have a separate one-off side database to build your AI apps. Every single app in the world right now is going to get enhanced and have some components of it.”
One of the companies adopting Rockset’s vector capabilities is JetBlue. The airline, which recently participated in the vendor’s one-day conference, did a bake-off between Rockset and several other vector databases, and picked Rockset to power GenAI and other applications.
“We saw the immense power of real-time analytics and AI to transform JetBlue’s real-time decision augmentation and automation, since stitching together three to four database solutions would have slowed down application development,” Sai Ravuru, JetBlue’s senior manager of data science and analytics, says in a recent case study. “With Rockset, we found a database that could keep up with the fast pace of innovation at JetBlue.”
Related Items:
Rockset Says It’s Ready for Real-Time AI
Rockset Looks to Compute-Compute Isolation for Real-Time Advantage