Anyscale Emerges from Stealth with Plan to Scale Ray
Anyscale emerged from stealth today with a Series A round of venture capital worth $20.6 million from Andreessen Horowitz and the rough outlines of a plan to scale Ray, the RISELab technology that effectively turns everyday Python coders into parallel computing developers.
We’re well over a decade into the big data epoch and are entering a new era of AI. Despite the improvements we’ve made in data management and deep learning, developing distributed applications that run on two or more nodes, or clusters, is still too hard.
According to Ion Stoica, a RISELab advisor, co-creator of Ray, and co-founder of Anyscale, Ray has the potential to dramatically simplify the development of distributed applications and usher in a new era of parallel computation productivity.
“If you look where we are today, Moore’s Law has ended, and you have a huge exponential increase in demand for machine learning and data workloads,” Stoica says. “What’s the answer? The answer is to develop specialized hardware, like GPUs, TPUs and so forth.”
However, these are not enough, he says. “There is still a gap between the demand of the applications and even these specialized chips,” he tells Datanami. “What this means is that almost every application will be distributed.”
The challenge there, however, is that developing distributed applications is extremely difficult, Stoica says. “Programming these applications today, it requires a rocket scientist,” he says. “There is no easy way. You need to be a Kubernetes expert. You need to know how to synchronize different process, communicate among them, maintain the state consistently. It’s very, very hard.
“So if all the applications are going to go distributed, and there is no good way to write this application today, there has to be a way. And we believe that Ray is the way,” he continues. “That’s why I got so excited. That’s why I think the opportunity is massive.”
Stoica joins Anyscale as a co-founder and is currently spending much of his time with the Anyscale team. He’s working closely with the two other co-creators of Ray at RISELab, Robert Nishihara and Philipp Moritz.
Nishihara, who talked with Datanami last month about Ray and is the CEO of Anyscale, says the new company will focus heavily on Ray. That includes contributing to the core open source project, as well as creating related tooling that works in the emerging Ray ecosystem.
“Of course there are contributors and committers from other companies as well,” he says. “But, yes, we’re investing very heavily in Ray and in making the open source project great.”
The co-founders wouldn’t divulge many specifics about their tool development plans at Anyscale, except to say that plans are still being hashed out and will be publicly announced in 2020.
“If you think about the experience that we want users or customers to have,” Nishihara says, “we want them to be able to focus just on developing their application logic, and not have to think about infrastructure and system and things like that.”
Stoica ruled out the traditional commercial open source business plan for Anyscale. “We are not looking to emulate the business like Horton or Cloudera, at least early on,” he says. “If there is a business we may want to emulate, it would be that of Databricks, for obvious reasons.”
Databricks, of course, is the company behind Apache Spark that Stoica co-founded with Spark creator Matei Zaharia and several others from AMPLab, including RISELab advisor Ali Ghodsi, who is Databricks’ CEO. Databricks, which recently completed an oversubscribed $400-million round of funding at a valuation of $6.2 billion, is racing to develop the first enterprise AI platform.
Ray is the most promising technology to come out of RISELab, the advanced computing program at UC Berkeley that is the follow-on to AMPLab, which yielded Apache Spark and several other technologies that continue to make a mark on the computing industry.
The plan calls for Ray to essentially slide into the computational stack that’s emerging for deep learning workloads and provide an immediate and relatively painless productivity boost by enabling any Python code to run in a distributed manner. Ray runs on Kubernetes, the orchestration layer that simplifies infrastructure management (or on bare metal), and works with a variety of distributed deep learning frameworks, like TensorFlow and PyTorch.
While developers can use TensorFlow and PyTorch straight away, using them with Ray will simplify much of the other work needed to build an end-to-end deep learning system, from training to inference.
“The promise of Ray is it will be this universal framework where you can build all these applications to support all these kinds of workloads,” Stoica says. “So you can learn one system, and then you can build hopefully any application. Any application you can build in Python, you should be able to build in Ray, with a few line change.”
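To give a rough sense of what a "few line change" could look like, here is a minimal sketch based on Ray's publicly documented Python API (`ray.init`, `ray.remote`, `.remote()`, `ray.get`, `ray.shutdown`). The function names `square`, `run_serial`, and `run_with_ray` are hypothetical and chosen only for illustration; this is not code from Anyscale or from the article's sources, and it assumes Ray is installed separately (for example with `pip install ray`).

```python
def square(x):
    """Ordinary Python function; nothing Ray-specific here."""
    return x * x


def run_serial(values):
    """Plain single-process version: call the function in a loop."""
    return [square(v) for v in values]


def run_with_ray(values):
    """Ray version of the same loop (assumes the third-party `ray` package).

    The import is done lazily so the serial code above still works on a
    machine where Ray is not installed.
    """
    import ray

    # Start Ray locally; on a cluster this would connect to existing nodes.
    ray.init(ignore_reinit_error=True)
    try:
        # Wrap the unchanged function so calls become asynchronous tasks.
        remote_square = ray.remote(square)
        # Each .remote() call returns a future immediately.
        futures = [remote_square.remote(v) for v in values]
        # ray.get blocks until all task results are available.
        return ray.get(futures)
    finally:
        ray.shutdown()


if __name__ == "__main__":
    print(run_serial([1, 2, 3]))
```

In this sketch the application logic in `square` is untouched; only the call site changes, which is the kind of small edit the quote describes. Whether a real workload benefits depends on task size and cluster setup, which the sketch does not address.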
In the future, AI will be in every application, Stoica says. That means there’s a need for a universal layer that automatically handles a lot of the complexity that’s inherent in building distributed systems.
“We want to make it as easy to develop applications which run on hundreds of nodes or thousands of nodes as it is to develop and run an application on your laptop,” Stoica says. “You can imagine there are lots of tools, for development and runtime, which will be required to achieve that vision.”
While it’s relatively young technology, Ray is already being used in production settings, according to the Anyscale folks. Interest in the open source project is growing quickly. All these factors combined to get the attention of Ben Horowitz, the co-founder and general partner of the legendary venture capital firm Andreessen Horowitz.
“Ray is one of the fastest-growing open source projects we’ve ever tracked, and it’s being used in production at some of the largest and most sophisticated companies,” Horowitz said in a press release. “Its massive popularity is both a testament to the importance of the problem it is tackling and how well the team behind it has executed on building a product that works and does what it claims. We look forward to working with Robert, Philipp, and Ion to bringing Anyscale to users around the world.”
Anyscale is based in Berkeley and currently has about a dozen employees. “More people signed who are joining in 2020, to move quickly, to build a strong team as quickly as we can,” Nishihara says. “We’re very much focused on hiring.”