
An extremely solid and convincing rebuttal. Sad. I wonder what the Devin team will say in response, if anything. Summarizing the video:

• Devin is sold as being able to solve arbitrary Upwork tasks. In the video demo the problem it was asked to solve doesn't match the stated requirements of the customer (who asked for setup instructions, not code).

• Devin is shown fixing errors in the source of a GitHub repo, but the files it's shown editing don't actually exist in that repo, and some of the errors it's fixing are nonsensical, of the type that'd never be made by a human. Inference: Devin must be fixing bugs in files it has itself created, but that's not clearly indicated.

• There is no need to do any coding in the first place, because the README in the repository has all the instructions needed to achieve the task ready to go and they still work fine with only a one-line tweak, even though the repository is old. This is why the customer asked for instructions for how to run it on EC2 rather than for some coding. Devin didn't seem to read the README or understand that it only had to execute a couple of pre-existing Python scripts. The output in the video makes it look like the task was complex and sophisticated, with a long plan and many check boxes showing work completed, but the work was in fact pointless and redundant.

• Devin's code changes are bad, e.g. writing its own low level file read loop instead of using the standard library properly.

• The demo video makes it look like Devin did the task quickly, and the author of the rebuttal video was able to do the requested task himself in ~30 minutes, but the timestamps in Devin's chat show the task stretching over many hours and even into the next day.

• Devin does nonsensical shell commands like `head -n 5 foo | tail -n 5`
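
To spell out why that last pipeline is pointless: `head -n 5` emits at most five lines, so `tail -n 5` of that output returns exactly the same lines. A minimal Python sketch of the idea, where `head` and `tail` are hypothetical stand-ins for the shell tools and not anything taken from Devin's session:

```python
from collections import deque
from itertools import islice


def head(lines, n):
    """Return the first n items, like `head -n N`."""
    return list(islice(lines, n))


def tail(lines, n):
    """Return the last n items, like `tail -n N`."""
    return list(deque(lines, maxlen=n))


# Made-up sample input standing in for the file `foo`.
sample = [f"line {i}" for i in range(1, 21)]

piped = tail(head(sample, 5), 5)  # head -n 5 foo | tail -n 5
plain = head(sample, 5)           # head -n 5 foo

# The second stage changes nothing: both are the first five lines.
print(piped == plain)
```

The same holds when the file has fewer than five lines, since `tail` then just passes through everything it receives.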

The strange mistakes lead to questions about what underlying model it's using. I don't think GPT-4 would make mistakes like that.

The Internet of Bugs guy is an AI fan and uses coding AI himself, but points out that the company behind it says you can "watch Devin get paid for doing work" which isn't actually supported by their video evidence when watched carefully.



Hearing this just makes me sick.

Like, how fake do you want to be?

Also > Devin does nonsensical shell commands like `head -n 5 foo | tail -n 5`

Why is Devin executing this code, like why?


In other words, they managed to successfully recreate most people's experience on Upwork.


In other words, they managed to fake it until they make it! Like most visionaries in Silicon Valley: lie now, tweet about it, promote it through fake influencers with their mouths open on YouTube, get that VC money without any due diligence, hire smart people and force them to do it!


You are supposed to make it before you get caught faking it.


a) Taking notes

b) On the topic of notes: what are the odds of this being your first comment on a 2-year-old account, and of me taking note? A little sus.


Underrated


It is confirmed to be using GPT-4, though (not sure which version).



