
Tag: Matei Zaharia
LOTUS Promises Fast Semantic Processing on LLMs
Researchers at Stanford University and UC Berkeley recently announced the version 1.0 release of LOTUS, an open source query engine designed to make LLM-powered data processing fast, easy, and declarative. The project’ Read more…
Databricks Strings a Data Mesh with Lakehouse Federation
Databricks this week unveiled Lakehouse Federation, a set of new capabilities in its Unity Catalog that will enable its Delta Lake customers to access, govern, and process data residing outside of its lakehouse. The comp Read more…
A Dozen Questions for Databricks CTO Matei Zaharia
Matei Zaharia is a very busy man. When he’s not helping to shape the future of Databricks as its CTO, he is helping to shape the future of computer science as an assistant professor at Stanford University. He also fi Read more…
Databricks Bolsters Governance and Secure Sharing in the Lakehouse
Data governance is one of the four pillars necessary for the future of AI, along with past-looking analytics, future-looking AI, and real-time decision-making. To that end, Databricks rolled out several new governance ca Read more…
To Centralize or Not to Centralize Your Data–That Is the Question
Should you strive to centralize your data, or leave it scattered about? It seems like it should be a simple question, but it’s actually a tough one to answer, particularly because it has so many ramifications for how d Read more…
Spark 3.0 Brings Big SQL Speed-Up, Better Python Hooks
Apache Spark 3.0 is now here, and it’s bringing a host of enhancements across its diverse range of capabilities. The headliner is a big bump in performance for the SQL engine and better coverage of ANSI specs, while e Read more…
Databricks Brings Data Science, Engineering Together with New Workspace
Data scientists and software engineers work in different ways and use different tools. But both personas will feel more comfortable developing applications in the new version of Databricks Data Science Workspace, which t Read more…
Will Databricks Build the First Enterprise AI Platform?
Ali Ghodsi might have one of the best jobs in technology right now. As the CEO of Databricks, Ghodsi just completed an oversubscribed $400 million round of funding that gave the company a $6.2 billion valuation. Better s Read more…
Apache Spark Is Great, But It’s Not Perfect
Apache Spark is one of the most widely used tools in the big data space, and will continue to be a critical piece of the technology puzzle for data scientists and data engineers for the foreseeable future. With that said Read more…
What Makes Apache Spark Sizzle? Experts Sound Off
Apache Spark is one of the most popular open source projects in the world, and has lowered the barrier to entry for processing and analyzing data at scale. We asked some of the leaders in the big data space to give us th Read more…
Databricks Open Sources MLflow to Simplify Machine Learning Lifecycle
Databricks today unveiled MLflow, a new open source project that aims to provide some standardization to the complex processes that data scientists oversee during the course of building, testing, and deploying machine le Read more…
Spark 2.0 to Introduce New ‘Structured Streaming’ Engine
The folks at Databricks last week gave a glimpse of what's to come in Spark 2.0, and among the changes that are sure to capture the attention of Spark users is the new Structured Streaming engine that leans on the Spark Read more…
Spark Steals the Show at Strata
There was a lot of good stuff on display at last week's Strata + Hadoop World conference. But if there was one product or technology that stood out from the pack, that would have to be Apache Spark, the versatile in-memo Read more…