March 18, 2025

Confluent GAs Tableflow, Adds Flink Native Inference in Bengaluru

(Quardia/Shutterstock)

Confluent today announced the general availability of Tableflow, the Apache Iceberg-based functionality that it first revealed a year ago. The company also used its conference this week in Bengaluru, India as the platform to launch Flink Native Inference, a new Apache Flink-based capability designed to make it easier to implement AI inference on streaming data.

It’s been almost exactly a year since Confluent announced it was adding a new feature called Tableflow to Confluent Cloud, the company’s hosted version of Apache Kafka. Tableflow makes it easy for customers to stream any data flowing in a Kafka topic directly into a data lake as a table in the Apache Iceberg format. In addition to the data, Tableflow grabs associated metadata, enabling the table to get all the benefits of Iceberg management, including support for ACID transactions.

Many Confluent customers tried to build this Iceberg capability themselves as they moved data from operational to analytical systems, said Adi Polak, director of advocacy and developer experience for Confluent.

“But it takes time, resources, and cost to build these additional data pipelines,” she said. “So what we did in Confluent is said, how about we’ll create this Tableflow for you, and with a click of a button you don’t even need to think about it.”

In addition to Iceberg, Tableflow now supports Delta Lake, the table format created by Databricks for its data lake, or lakehouse, platform, Polak said. While Databricks committed to supporting both Iceberg and Delta Lake (and eventually merging them) following its acquisition of Tabular last year, the two formats continue to be used independently. Since Confluent and Databricks forged a strategic partnership last month, it made sense for Confluent to support Delta Lake, too.

In addition to creating Iceberg or Delta Lake tables out of Kafka topics, Confluent also generates the metadata necessary for those tables to be discovered and managed by metadata catalogs. The company supports AWS Glue and Snowflake’s Polaris catalogs out of the gate, Polak said.

Support for Iceberg and Delta Lake is important for Confluent customers because it makes it easier to connect their transactional (or operational) and analytical systems. Confluent has been working with media companies that would periodically dump data from their operational systems, including Kafka topics, Confluent Cloud, and Flink streams, into a data warehouse to feed dashboards and ad hoc queries. But those companies wanted to add real-time capabilities.

The company’s other big announcement is around Apache Flink, the popular data processing engine that works on streaming and static data. Confluent has been integrating the Flink stream processing engine into its Kafka streaming data pipelines for the past year. With the launch of Flink Native Inference, the integration between Flink and Kafka gets even deeper.

According to Polak, many Confluent customers want to run machine learning or AI models against their streaming data. But taking the data out of Confluent Cloud to run ML or AI algorithms increases latency and raises privacy concerns. The solution is to use Flink to run arbitrary machine learning models against streaming data, all hosted within Confluent Cloud.

“We’re enabling them to have native inference on top of Confluent Cloud,” Polak said. “That gives them flexibility and security. We also help them with cost efficiency on the compute side as well as latency, because now they’re running their stream pipeline adjacent to where their model is being hosted. This is a game changer for a lot of our customers.”

Whether the model is homegrown, developed in PyTorch, or open source, like DeepSeek or Llama, customers can call it using the Flink API and Flink SQL functions and run inference directly within their Confluent Cloud account.
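Confluent did not publish exact syntax in this announcement, but in Confluent Cloud’s Flink SQL the general pattern is to register a model as a first-class object and then invoke it per row with `ML_PREDICT`. The model name, provider settings, and table and column names below are illustrative assumptions, not Confluent’s published example:

```sql
-- Register a remotely hosted model as a Flink SQL object
-- (model name, provider, and options are illustrative)
CREATE MODEL fraud_scorer
INPUT (transaction_text STRING)
OUTPUT (fraud_score DOUBLE)
WITH (
  'task' = 'classification',
  'provider' = 'openai'   -- could equally be a self-hosted endpoint
);

-- Invoke the model against each row of a streaming table of Kafka events
SELECT t.transaction_id, p.fraud_score
FROM transactions AS t,
     LATERAL TABLE(ML_PREDICT('fraud_scorer', t.transaction_text)) AS p;
```

Because the query and the model both run inside Confluent Cloud, the events never leave the platform, which is the latency and privacy argument Polak makes above.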

The company announced two other Flink capabilities: Flink Search, which lets customers perform vector searches across MongoDB, Elasticsearch, and Pinecone from within Confluent Cloud’s Flink SQL; and Built-in ML Functions (in early access), which provides Confluent-developed algorithms for data science tasks such as forecasting, anomaly detection, and real-time visualization.
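Confluent did not detail Flink Search’s syntax either; purely as a hypothetical sketch, a vector lookup joined to a streaming query might look something like the following, where the function name, index, and all table and column names are assumptions for illustration:

```sql
-- Hypothetical: for each incoming query embedding, fetch the top 3
-- nearest documents from an externally hosted vector index
SELECT q.query_id, d.doc_id, d.score
FROM user_queries AS q,
     LATERAL TABLE(
       VECTOR_SEARCH(doc_index, 3, DESCRIPTOR(embedding), q.query_embedding)
     ) AS d;
```

The retrieved documents would then be passed to a model call (for example, via a function like `ML_PREDICT` above) to ground the generation, which is the RAG pattern Polak describes below.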

Flink Search will enable customers to build retrieval augmented generation (RAG) pipelines that are well-grounded, Polak said.

“A lot of the challenges with AI models is around hallucinations,” she said. “I’m taking an existing model, I’m deploying it, but it hallucinates because it’s not grounded in recent context. And for that, we need a RAG pattern or a RAG architecture, and this is exactly what our Flink Search enables.”

All three Flink capabilities are available as early access on Confluent Cloud, meaning the functionality may change and may not be fully stable. Confluent made these announcements from Current Bengaluru 2025, its Kafka conference in the South Indian city of 14 million (also known as Bangalore). Tickets for the show, which began Tuesday, March 18, are sold out.

“We’re doing it in India because there’s a lot of excitement in India for data streaming,” Polak said. “We see a huge growth in this population around data streaming, and data streaming engineers as well.”

Related Items:

Confluent and Databricks Join Forces to Bridge AI’s Data Gap

Confluent Goes On Prem with Apache Flink Stream Processing

Confluent Adds Flink, Iceberg to Hosted Kafka Service

BigDATAwire