Has Codeium Cracked the Code for AI Assistants?
When it comes to AI-powered coding assistants, Microsoft’s Copilot has the name and the numbers. But a competitor called Codeium is growing quickly, and according to its co-founder and CEO Varun Mohan, the sky is the limit for AI assistants.
Codeium started life in 2021 as Exafunction, an infrastructure startup that provided big compute for other companies developing deep learning systems. Mohan and his business partner, Douglas Chen, managed 10,000 GPUs on behalf of autonomous vehicle development companies, an industry they previously worked in.
But by late 2022, ChatGPT had exploded onto the scene, and Mohan and Chen realized that the transformer model, the Google-developed model powering the large language model (LLM) breakthrough, was going to be a massive game-changer. Autonomous driving would eventually come to fruition, but transformers are, uh, “transforming” the world right now.
“Very rarely do you see something that drops that completely changes the world all in one shot,” Mohan tells Datanami. “This is extremely, extremely uncommon. And that’s what makes it so cool.”
Mohan and Chen pivoted from supporting autonomous driving startups and launched Codeium in early 2023 with an AI coding assistant. The product, which the company terms an “intelligent AI code generation tool,” is powered by a custom LLM that customers run on their own gear or in the cloud (customers can opt for GPT-4 running in the cloud if they like). Codeium plugs into more than 40 integrated development environments (IDEs), including major ones like JetBrains, VS Code, and Eclipse, and works with more than 70 languages, including big ones like Java, Python, and SQL.
When Codeium launched, the market for AI-powered assistance was dominated by GitHub Copilot, the product of a collaboration between Microsoft and OpenAI that debuted in June 2021. That gave GitHub Copilot a big head start, which Microsoft is building upon by transitioning Copilot into a company-wide development effort. (Microsoft even added a Copilot key to the keyboard of Windows PCs, just to show the world that it’s playing with Monopoly money.)
Despite the massive head start for GitHub Copilot, the market has shown it’s open to other “copilots,” especially ones that are more open and work with a wider ecosystem of tools than the one from Microsoft. And as Mohan points out, most of the largest companies don’t actually use GitHub. Instead, they use other tools like Bitbucket, GitLab, Mercurial, Subversion, and CVS.
“There’s a lot of different tools that people use to store their source code,” Mohan says. “We give people personalized experiences, so we make sure that the code that gets generated is actually tied to the private code that a company has. And we actually made sure that the models are tuned and trained on permissively licensed data. So not data that is GPL-licensed.”
Codeium functions like a junior programmer that’s there to help the human programmer working at the IDE, according to Mohan. Its autocomplete function will finish the line of code started by the human, while its AI chat function allows the human programmer to ask questions of all the code in the repo.
“Codeium helps you write a lot of software, but writing software isn’t the goal for a developer,” Mohan says. “The goal for a developer is solving a task and writing software is one part of solving the task.”
Thanks to how Codeium automatically creates an index for each code base it’s exposed to, the product is better able to answer questions the developer might have, and also offer better suggestions, Mohan says. That translates into time-savings for the developer.
“One of the big things Codeium has actually done is shrink the time it takes to onboard a new code base from three to six months to three to six weeks because we know what the code base is fundamentally doing,” he says.
Because Codeium understands the context of the code it’s working with, it lends itself to code reuse. That helps to minimize code bloat, Mohan says.
There are restrictions to what Codeium can do. You can’t just tell Codeium to go create new software for you, test it, integrate it with the code base, and then deploy it. The tendency for copilots to hallucinate means that humans need to maintain strict oversight, Mohan says.
“It’s for the building of code, for generating ideas, and for more quickly reviewing software,” he says. “But the core fundamental principles of the software development lifecycle are still the same. You need to test your code, debug your code, review your code, and deploy your code.”
Soon after the company launched in early 2023, it had garnered about 1,000 users. But Codeium has grown significantly since then, and today, more than 600,000 developers use the product, according to Mohan.
“We process over 100 billion tokens of code every day, which is over 10 billion lines of code every day,” he says. “We’re one of the top five largest generative AI apps in the world in terms of amount of text processed every day for the product.”
And it’s not just empty virtual keystrokes, either. According to Mohan, about 45% of all software committed to the customers’ code base is written by Codeium and unedited. That’s significantly above the industry average.
One of the early adopters of Codeium is Dell, the Texas-based computer company. According to Mohan, Dell developers are able to get more work done because they’re able to focus on the work in the IDE, eliminating the need to context-switch, and remain in “flow state” longer.
“Writing software isn’t the only thing that developer does, but the reason why it still provides a lot of value is Codeium is able to enable the developer to navigate software way more quickly,” he says. “If there’s context switching overhead, if you make them look at a Web page, where they can’t test the software, they can’t compile the software, and then after that, they need to bring it back to the IDE,” it decreases productivity.
There is a strong case that copilots and AI coding assistants provide real benefits to developers now. There are still limitations, such as the tendency of LLMs to hallucinate, which means they need strict oversight, like junior programmers typically do. And some of the tougher coding problems, like migrating the billions of lines of old COBOL code to more modern platforms like Java or .NET, aren’t going to be solved by copilots anytime soon.
But in the long run, Mohan, who has a Master of Engineering from MIT, is bullish on the potential for AI to significantly impact the world of IT.
“Large-scale autonomous agents replacing the way in which software development works in the next year, despite the existing hysteria, probably is not going to happen,” he says. “But is AI going to generate and do higher- and higher-level tasks? Yes. I think the next five years are going to be very crazy. There’s going to be a lot of innovation.”
Editor’s note: This article was corrected. Exafunction managed GPUs for customers; it did not own them. And 45% of code committed by customers is written by Codeium; previously the story stated that 45% of Codeium-generated code was committed. Datanami regrets the errors.