Devin, a much-hyped AI "IT professional," is making headlines in the tech world, but not for the right reasons. Three data scientists have concluded that Devin lacks substantial skills in his field.
Devin was introduced in March 2024 by Cognition AI. By December, he was available to interested users at a starting price of $500 per month. According to the developers, Devin was supposed to be able to order lunch via DoorDash, create apps from scratch, and fix errors independently. In practice, however, Devin looks more like a pretender.
Early tests had already highlighted Devin’s weaknesses. A promotional video released by Cognition AI, meant to showcase Devin’s ability to complete projects independently on the freelancer platform Upwork, was thoroughly dissected by developer Carl Brown on his aptly named YouTube channel, “Internet of Bugs.”
Another critic pointed out significant security issues in Devin’s system. Now, three data scientists affiliated with the AI research lab Answer.AI have put Devin to the test, with disappointing results.
Out of 20 assigned tasks, Devin successfully completed only three. He initially showed promise with minor successes, such as exporting data from a Notion database to Google Sheets or creating a planet tracker. However, as tasks became more complex, Devin’s weaknesses became apparent: technical dead ends, overly complicated solutions, and a tendency to get stuck on impossible tasks.
In one particularly absurd example, Devin attempted to deploy multiple applications on the Railway platform without recognizing that what he was attempting was technically impossible. Instead of reporting the limitation, the bot spent more than a day “hallucinating” non-existent functions.
The testers, Hamel Husain, Isaac Flath, and Johno Whitaker, delivered a harsh verdict on Devin’s performance. “When Devin worked, the user experience was impressive – but that was rare,” they wrote. “Even worse: It was impossible to predict which tasks he would actually succeed in. Even seemingly simple tasks failed in surprisingly complex ways.”
If Devin is a preview of the future of autonomous AI developers, there is still a long way to go. Cognition AI has not yet commented on the test results.