“No Duh,” say senior developers everywhere.
The article explains that vibe code often is close, but not quite, functional, requiring developers to go in and find where the problems are - resulting in a net slowdown of development rather than productivity gains.



I code with LLMs every day as a senior developer but agents are mostly a big lie. LLMs are great for information index and rubber duck chats which already is incredible feaute of the century but agents are fundamentally bad. Even for Python they are intern-level bad. I was just trying the new Claude and instead of using Python’s pathlib.Path it reinvented its own file system path utils and pathlib is not even some new Python feature - it has been de facto way to manage paths for at least 3 years now.
That being said when prompted in great detail with exact instructions agents can be useful but thats not what being sold here.
After so many iterations it seems like agents need a fundamental breakthrough in AI tech is still needed as diminishing returns is going hard now.
Oh yes. The Great
pathlib. The Blessedpathlib. Hallowed be it and all it does.I’m a Ruby girl. A couple of years ago I was super worried about my decision to finally start learning Python seriously. But once I ran into
pathlib, I knew for sure that everything will be fine. Take an everyday headache problem. Solve it forever. Boom. This is how standard libraries should be designed.I disagree. Take a routine problem and invent a new language for it. Then split it into various incompatible dialects, and make sure in all cases it requires computing power that no one really has.
Pathlib is very nice indeed, but I can understand why a lot of languages don’t do similar things. There are major challenges implementing something like that. Cross-platform functionality is a big one, for example. File permissions between Unix systems and Windows do not map perfectly from one system to another which can be a maintenance burden.
But I do agree. As a user, it feels great to have. And yes, also in general, the things Python does with its standard library are definitely the way things should be done, from a user’s point of view at least.
If it wasn’t for all the AI hype that it’s going to do everyone’s job, LLMs would be widely considered an amazing advancement in computer-human interaction and human assistance. They are so much better than using a search engine to parse web forums and stack overflow, but that’s not going to pay for investing hundreds of billions into building them out. My experience is like yours - I use AI chat as a huge information index mainly, and helpful sounding board occasionally, but it isn’t much good beyond that.
The hallucinations (more accurately bullshitting) and the fact they have to get new training data but are discouraging people from engaging in the structures that do so make this highly debatable
I agree that it is certainly debatable. However, my experience has been that information extracted about, say what may cause a strange error message from some R output, has been at least as reliable as random stack overflow posts - however, I get that answer instantly rather than after significant effort with a search engine. It can often find actual links better than a search engine for esoteric problems as well. This, however is merely a relative improvement, and not some world-changing event like AI boosters will claim, and it’s one of the only use-cases where AI provides a clear advantage. Generating broken code isn’t useful to me.
I will concur with the whole ‘llm keeps suggesting to reinvent the wheel’
And poorly. Not only did it not use a pretty basic standard library to do something, it’s implementation is generally crap. For example it offered up a solution that was hard coded to IPv4, and the context was very ipv6 heavy
I have a theory that it’s partly because a bunch of older StackOverflow answers have more votes than newer ones using new features. More referring to not using relatively new features as much as it should.
I’d wager that the votes are irrelevant. Stock overflow is generously <50% good code and is mostly people saying ‘this code doesn’t work – why?’ and that is the corpus these models were trained on.
I’ve yet to see something like a vibe coding livestream where something got done. I can only find a lot of ‘tutorials’ that tell how to set up tools. Anyone want to provide one?
I could… possibly… imagine a place where someone took quality code from a variety of sources and generate a model that was specific to a single language, and that model was able to generate good code, but I don’t think we have that.
Vibe coders: Even if your code works and seems to be a success, do you know why it works, how it works? Does it handle edge cases you didn’t include in your prompt? Does it expose the database to someone smarter than the LLM? Does it grant an attacker access to the computer it’s running on, if they are smarter than the LLM? Have you asked your LLM how many 'r’s are in strawberry?
At the very least, we will have a cyber-security crisis due to vibe coding; especially since there seems to be a high likelihood of HR and Finance vibe coders who think they can do the traditional IT/Dev work without understanding what they are doing and how to do it safely.