Hacker News Clone

cafkafkMay 25, 2026, 5:23 AM

I think a lot of the problem with the current discourse is how black-and-white it is. Either you're a luddite or "ai pilled".

In most cases, LLMs can get you 80-95% of the way, sometimes less, sometimes more. And heck, sometimes, it just gets you somewhere wrong.

But it seems everyone is arguing about whether LLMs can be perfect software engineers in isolation running in a closet, and using that to say that LLMs do not have a massive potential in other scenarios.

Sometimes, I like to imagine how much more productive most organizations could be from the things that the internet gave us, even to this day. Most companies never really do even a fraction of what is possible. That helps to ground my view of LLMs as well.

The fault dear Brutus isn't in our language models, but in ourselves.

ChrisMarshallNYMay 25, 2026, 10:25 AM

The article specifically calls out agents:

> the adoption of AI agents into software development will be one of the most costly mistakes in the field’s history

I don’t use agents, myself. I use a simple chat interface, and a running dialogue, to build software at a function-building level. The resulting workflow is quite “chimerical,” and benefits greatly from my own experience and expertise. The LLM simply lubricates the process.

In my case, it seems to be working well. I would not want to go back.

ModernMechMay 25, 2026, 3:19 PM

I agree with this. I've tried using agents over the last 2 months, and I feel they are just... bad. I spend more time trying to correct their inexplicable decisions than it takes to just go through step by step in a dialogue.

swazzyMay 25, 2026, 6:08 AM

I think that's geohot's point as well. They're advocating against being fully "ai pilled". Saying we should be using AI as a tool, not for being a luddite.

joe_the_userMay 25, 2026, 7:49 AM

LLMs can get you 80-95% of the way

But the big question is "where will '80-95% of the way' get you?"

Do you grind-out the last 5-20% in a period that's disappointingly long compared to the initial step? Or do you another 80% complete thing on top, and another and another until the whole structure collapses?

The post is talking about what groups might go what directions, which seems fair.

roncesvallesMay 25, 2026, 8:44 AM

>In most cases, LLMs can get you 80-95% of the way

On a tangent, this often gets misinterpreted as "LLMs reduce the time it takes to do the thing by 80-95%". That's not what it means.

hansmayerMay 25, 2026, 8:25 AM

> But it seems everyone is arguing about whether LLMs can be perfect software engineers

That's just those of us with longer memory holding the AI companies to the standards they declared themselves. Nobody forced Sam Altman to blab about a team of pocket PhDs, did they? I don't want the crap that does it correct 60℅ of the time - where is the god damn nation of PhDs in a datacemter already? Where is the AI doing all the SWE work "in 3-6 months"?

pickleRick243May 25, 2026, 11:55 AM

You want the AI that's doing all the SWE work in 3-6 months? Somehow I doubt that.

hansmayerMay 25, 2026, 12:28 PM

No buddy, not me. Dario Amodei however, keeps announcing it every about 6 months, on the dot. Last time i January this year. So I just want them held acccountable to their own statements. Otherwise if they would be untrue, that is at best incompetence, and at worst investor fraud. Both should draw serious consequences, given the ungodly sums which are burnt into these pipe dreams.

overfeedMay 25, 2026, 6:29 AM

> In most cases, LLMs can get you 80-95% of the way, sometimes less, sometimes more.

That's my experience too, but it's 60-95% solutions in my case[1], with about 120-140% of lines of code required. I wish there was a harness that would let me mask code it should/n't change, because prompt-based refactors fail from the same over-eagerness.

1. I try faster, smaller models first.

brabelMay 25, 2026, 6:55 AM

We had the same issue until we created a review skill that we run after a LLM is done implementing a feature. We give it a list of things to check that is based on the problems we have observed previously, like writing too verbose code, and ask it to report on issues and suggest improvements. The developer can then give feedback and let the LLM fix the issues, or just address them manually. It’s still early but I’ve been much happier now with the results. It makes it much easier as well for humans to review since there’s a report about what the change is about, why, things to keep an eye on etc. This is something you can do with any harness you may be using and there’s nothing to buy, just a suggestion from someone trying to make the best use of this insane technology.

kcguyuMay 25, 2026, 7:28 AM

I completely agree with your sentiment of “black or white.” I believe it comes from social media with primarily “radical” perspectives being the ones in the spotlight. Just not an environment that promotes nuance or friendly discussion

1vuio0pswjnm7May 25, 2026, 7:47 PM

"Either you're a luddite or "ai pilled"."

The Luddites were (violent) activists. They were more than just "non-believers"

Generally, those being labeled "Luddites" in today's "discourse" are people who dare to question the "AI" hype. GGenerally, these people are not activists

satvikpendemMay 25, 2026, 7:52 PM

Semantic weakening is common in all languages, just as literally doesn't literally mean literally.

bluegattyMay 25, 2026, 5:37 AM

Yes, exactly, it's 'us' not the AI, which is great.

Why on earth would we ever remotely compare a 'tool' to 'a software engineer' ?

The 'great delusion' is not that 'AI can't code' - because obviously it can, and very well.

The problem is the 'anthropomorphism' and all this AGI nonsense.

If we called it 'Stochastic Mechanisms' and did not 'personalize' our prompts, refer to them as 'chat' or give them 'personalities' but remained in the domain of 'Stochastic Language CLI' ... then our metaphors would pbably not cloud our judgments.

Let the philosophers argue about AGI.

easyThrowawayMay 25, 2026, 8:14 AM

Because the alternative would've been telling everyone "here's the Stochastic Machine. The more what you write looks like the sources we blatantly stol- I mean, trained from, the closer it gets to output working code". And we're not ready to admit we don't want to know how the sausage is made.

By anthropomorphizing it, we give it some sort of authorship, which clears our collective conscience from what's really happening.

gizajobMay 25, 2026, 6:42 AM

The saner philosophers don’t need to argue about AGI because we’re absolutely nowhere near it.

pixl97May 25, 2026, 5:11 PM

Please give a well defined and agreed upon definition of AGI.

For all I know you're the same guy that says we don't need to talk about nuclear weapons in 1937 because we're nowhere near them.

esikichMay 25, 2026, 5:54 AM

You are a tool. You're a human resource, from the perspective of the organization. That pushes buttons on bunch of other tools. That's why you compare it.

Edit: I don't mean tool as a perjoritive.

bluegattyMay 25, 2026, 3:51 PM

There are innumerable other software and process technologies that we use - and never before have we compared them to 'Engineers'.

The 'tool of the system' analogy is not an unreasonable point of discussion but it does not help us in this scenario.

RagnarDMay 25, 2026, 8:37 AM

I think the irony is that the perception of being called a tool as an insult, is exactly your meaning of it.

jay_kyburzMay 25, 2026, 6:46 AM

I don't know why you are getting downvoted. Perhaps because people don't like the sentiment. But its true, people are hired as tools to write programs.

Both people and AI make mistakes. Perhaps the AI makes more, a lot more, but its so fast, and works around the clock, and has no ego, there is a chance that the benefits outweigh the costs.

SCdFMay 25, 2026, 6:06 AM

So currently there are people who are buying grey market peptides[1], marked "not for human consumption" and injecting themselves with them based on dubious anecdotes and vibes, to make their skin clearer, build muscle mass, and so on.

Are they are all suddenly turning into zombies? No. Do they have any real idea what that is going to do to their body a few years down the line? Also no. Could it be catastrophic? Maybe!

I think about this when I think about how violently much of the industry has pivoted into AI being the primary generator of code in the last 6ish months. AI is the peptide, your codebase[2] is the body. Literally no one knows how maintainable this approach is, because there simply hasn't been enough time to find out. It could be fine. It could be a complete mess, with your entire engineering team falling asleep at the wheel, lulled into thinking they understand what is being built when they don't, completely impotent to fix or maintain it once the LLM is no longer able to.

[1] https://www.bbc.co.uk/news/articles/cdr268m5pxro

[2] Well, _their_ codebase. I've stopped doing it with my own personal codebases, unless I genuinely don't care about maintainability or longevity

jay_kyburzMay 25, 2026, 6:48 AM

I think smart developers will be building isolated modules, so if your AI generated module keeps failing, you can amputate it and make a fresh one.

bcoughlanMay 25, 2026, 2:06 PM

I've been thinking the same: the smaller the codebase, the better AI performs. So a way to scale AI is to modularize your architecture to maximize the number of leaf nodes in the dependency tree, and split out separate libraries where it makes sense.

It is huge for token usage also, Claude grepping the codebase for context it doesn't have is the main consumer of input tokens from what I can see.

sphMay 25, 2026, 10:26 AM

The whole AI powered utopia is a bet on two possible outcomes:

- humans and companies somehow stop being greedy, selfish and cutting corners to optimize for revenue and time-to-market

- LLMs are the path to artificial superintelligences that will be able to deal with the exponential increase in tech debt from throwing AI slop at the wall (vibecoding) because no one has time to do things “the proper way”

The former is impossible. The latter is extremely unlikely and an existential threat to humanity.

The so called Luddites are the only ones to have even engaged at all with these concerns. Everybody else is just focused on the selfish game (see bet #1) of staying afloat in a rapidly changing ecosystem.

jappgarMay 25, 2026, 11:28 AM

Where do you find the smart developers in 2026. Half are rage quitting and the other half have full-blown psychosis.

threatofrainMay 25, 2026, 11:42 AM

And interactive feedback rather than having to specify everything up-front.

NitionMay 25, 2026, 4:46 AM

With the level of ability that AI is at right now, I've found it useful personally to think of it something like a very good search over existing knowledge. Another step up in searchability in the lineage of reference books, stack overflow, GitHub etc.

Programmers are rewriting and reinventing the same techniques more often than any other vocation I can think of, and so we were primed for a really good search over prior art. The fact that AI can also adapt that prior art to your particular use case makes it even more powerful.

Much like how great success never came from cobbling together various bits of copy-pasted code from Stack Overflow though, current AI can't really build your whole project.

BalinaresMay 25, 2026, 7:07 AM

One under-discussed phenomenon here, I think:

The hardest thing in software engineering is solving the right problem. The ability to identify the right problem to solve, is IMO, what distinguishes the top senior engineers. And we could have endless discussions about what constitutes the right problem, but for the sake of this discussion, let's reduce it to: the problem whose resolution adds the most value to the product for the amount of complexity and afferent costs that it incurs.

Once upon a time, long ago, I worked on a Web product whose original junior designer had figured it would be neat to be able to manage the backend with LDAP tools. So the database schema and structure that the product used mimicked that of OpenLDAP, with compound CN keys, and the entire codebase had to deal with that structure whenever reading from or writing to the DB. LDAP compatibility was not the right problem to solve when designing the DB schema.

But software that solves the right problems can be hard to identify because, quite often, how it does things seems so obvious that it's not readily apparent what other designs might have been chosen.

Now, the thing that usually keeps the blast radius of wrong-problem designs limited over time, is the very friction that they introduce. Development slows down, including the development of more wrong-problem designs. It's a self-limiting phenomenon.

And that's one major thing which worries me about LLM coding agents:

They paper over this friction. They don't repair it; they just make it so its cost is deferred.

So you gradually end up with codebases that grow unboundedly complex for the value they provide, with no controlling mechanisms.

You end up with juniors who never face the feedback loop from which they'd develop the engineering instincts and the taste for what makes a problem the right problem to solve in a given design.

At scale, as a field, you might end up forgetting there ever was such a thing as solving the right problem.

And I don't know what to do about that. Plan for an early retirement, maybe.

thedjpetersenMay 25, 2026, 5:16 AM

Part of my job is working on trying to make these models productive for the large corporation I work for. It's a lot of throwing tomatoes at a wall and to a degree I see the issue he is talking about output seemingly having a certain ceiling.

At the same time in no part of his post is any code snippet or anything to latch on to of "the model performed poorly here when it should have done this" - this style of criticism seems to be a pattern of most of these "the LLMs will never work" style posts on blogs and twitter.

They obviously can perform better than autocomplete and in my own day to day development build out huge portions of a codebase that I would have expected a junior or midlevel engineer to perform at.

How are we really supposed to grasp their actual capabilities when no one will actually cite specifically what mistakes they are making.

farhanhubbleMay 25, 2026, 4:45 AM

I'm in the "haven't written any code in a while" boat ATM. I'd love to see examples of issues that are so big that they warrant reverting to manual coding.

My main issue has been the inconsistent quality across between model releases and the tendency to insert older APIs or documentation, especially with command line tools.

I can understand if the model struggles with a million line monolithic codebase with a decade of cruft but can't think of why it'd be too much of a pain with new codebases.

pickleRick243May 25, 2026, 10:33 AM

Agent harnesses have barely been available for a year, and reasonably reliable for only half, and there's already fatigue. I think this says less about whether LLM's will actually be able to program and more about how mentally exhausting AI-assisted programming can be, which involves a higher frequency of decision making and reading an astronomical amount of both code and prose if you actually want to stay on top of what the agent is doing to your codebase. This personal/psychological exhaustion and negative sentiment is now being inaccurately transferred into a pessimistic prognosis for the advancement of the technology itself.

sevenseacatMay 25, 2026, 12:00 PM

https://evilmartians.com/chronicles/ai-assisted-engineers-ar...

jappgarMay 25, 2026, 11:25 AM

The technology doesn't have an existence outside how people use it.

If everyone who uses it correctly finds it frustrating, and if the only people who love it produce a mountain of unmantainable slop, we will quickly abandon it to the dustbin of history.

A lot of things have "potential" but never amount to anything.

We're going to keep using LLMs but the utility of agentic coding has already peaked in my opinion.

mountainriverMay 25, 2026, 4:36 AM

My guess is the models just continue to get better and better

When I got into agentic coding a year or two ago I was sure it was only good at autocomplete. Something happened earlier this year where the models hit a new level of capability.

Everyone I know now just does agentic coding, and it’s really amazing. I think we should just try pushing this as far as we can possibly go, it really feels like the acceleration of the human race is upon us.

tptacekMay 25, 2026, 4:35 AM

AFL didn't find more vulnerabilities than LLMs. AFL and skilled practitioners found vulnerabilities. AFL triggers faults, many (most?) of which aren't exploitable, and humans (or, now, agents) have to triage and evaluate them. And they did so in a pre-AFL corpus of memory-unsafe software. The heyday of AFL was a decade ago. Every target is harder now.

decimalenoughMay 25, 2026, 6:25 AM

For context: the author is George "geohot" Hotz, who has a long list of exploits, likely the best known of which is basically vibe coding (I mean that in the nicest possible way) comma.ai for autonomous cars on a shoestring budget long before actual AI vibe coding was a thing.

https://en.wikipedia.org/wiki/George_Hotz

c0rruptbytesMay 25, 2026, 5:27 AM

ai agents can program, in fact, in our current time with current models, i'd say they program better than most people in the industry (an industry where people were literally copying and pasting from stack overflow for years prior)

being able to program is not the only skill required to be a successful software engineer, so no ai agents cannot be software engineers

very important distinction - i personally like the radiologist example - looking at scans is a part of a radiologist job, AI can do it better than most of them, but looking at scans is a small part of the job, most of it communicating with doctors to help their patients

intendedMay 25, 2026, 4:40 AM

If nothing else, Eternal Sloptember is a term that seems obvious once you have it. I can’t believe this is the first time I’m seeing it.

bandramiMay 25, 2026, 4:44 AM

If you were on Usenet before '93 the words still haunt you

dwdMay 25, 2026, 5:05 AM

Joined Usenet in 1990 and for a few years it was great.

I always wonder whether HN suffers from periodic influxes of newbies who don't get it yet and rile up the regulars.

BLKNSLVRMay 25, 2026, 5:27 AM

There was the Reddit Exodus of 2023 and there seems to be a fair few new accounts posting recently, which feels like an Attempted Agentic Takeover.

lmmMay 25, 2026, 8:32 AM

It's not periodic so much as continuous. But yes, HN is constantly where Reddit was about 3 years ago and correspondingly declining. E.g. see the rise in "underrated comment" type comments lately.

TZubiriMay 25, 2026, 4:45 AM

I was very much unborn in '93, could you share the folk history for the record?

ZeWakaMay 25, 2026, 4:49 AM

https://en.wikipedia.org/wiki/Eternal_September

intendedMay 25, 2026, 6:50 AM

Its one of my favorite parts of online trivia.

intendedMay 25, 2026, 4:06 PM

I phrased it poorly, I haven’t heard the term sloptember.

Eternal September is the kind of trivia that let me know if someone is a grey beard or not.

discreteeventMay 25, 2026, 9:45 AM

Eternal LLMber?

WeaselsWinMay 26, 2026, 7:30 AM

A pattern i'm seeing is that people working with low level stuff (USB <-> PCIe chip reverse engineering) are much more often refusing the pill. My career is essentially cooking an unholy amount of bog standard nodejs CRUD. And i think i definitely represent a bigger segment than low level gurus, and for me AI is irreplaceable.

sandrusoMay 25, 2026, 5:05 AM

Not reviewing outputs, which is my main issue, is one-way to subpar experience. No amount of "make it right" will fix that.

I hope that professionalism still matters as these new ways of doing things strikes me as unprofessional as f...

Yeah, the next macOS will be worse... time to place bet on prediction market

nilirlMay 25, 2026, 5:00 AM

I agree that I can write better code than an agent.

But it can write working code much faster than I can.

And in a lot of cases, unfortunately, faster beats better.

petterroeaMay 25, 2026, 5:18 AM

I think geohot is somewhat of a clown but I think he is speaking reason here and I'm happy to see voices address this. Most seniors I work with agree.

edg5000May 25, 2026, 9:30 AM

He did admit that he writes his posts in a slightly baity manner. They are his views, but he likes to make waves for the sake of it. It's a smart tactic; it helps his companies if he stays well-known. It's also just fun to read his inflammatory musings. I'm sure he emjoys writing them to.

linsomniacMay 25, 2026, 12:47 PM

>But each time I suspected I could have done it better and faster manually.

I've heard this said so many times, but my experience has just been so dramatically the opposite that it rings false. But geohot seems to be a pretty productive and smart guy, so it's hard to just dismiss what he's saying.

I get the sense that he's truly one of the 10x engineers. And maybe he can do it faster and better manually. But for those of us who aren't 10x, I think it lets us bridge that gap. Now we're getting back to "status anxiety": is this an attack on his ego, if the average becomes 10x?

Anecdote: Over 2 weeks of spare time, I used AI tooling to build a fairly sophisticated debian package caching proxy server (~72KLOC, 27K implementation, 45K tests). This would have easily taken me 6 months of focused time to implement by hand. I literally couldn't have done it because I can't take that much time off work and I have other weekend/evening obligations.

syl5xMay 25, 2026, 1:27 PM

Exactly my thoughts, I believe that the 10x engineers are basically living in a bubble and completely oblivious to the average developer/engineer (I consider myself average). Yes the LLMs cannot bring the level of sophistication that geohot would have but they totally could satisfy the needs of an average developer and their average job. More than 95% engineers are not having the problems that geohot is solving and most of our works is straightforward that any LLM can do, thus enabling us to do more of the same and possibly focus on slightly abstract problems if we have the time. Someone said couple of months ago that manual coding will be a privilege and I see that now.

forgetfreemanMay 25, 2026, 5:36 AM

"It’s definitely a better Google for most searches"

This is dangerously incorrect. AI summaries of search results consistently return incorrect information and grossly oversimplified and thus misleading summaries, neither of which are detectable unless one either has prior domain knowledge or spends time drilling into search results to validate the AI output.

bad_usernameMay 25, 2026, 5:53 AM

My experience with ChatGPT as a search engine - it is totally paranoid about checking and re-checking its answers by referencing them in multiple places (I usually read its thinking output). I have not seen an outright hallucination for at least a year. (It is of course a different situation with Google's "AI summary" which is wrong half of the time.)

forgetfreemanMay 25, 2026, 6:02 AM

Ironically I quit using ChatGPT a while back. I decided to run it through it's paces and asked it some rather detailed questions about a range of topics that I have significant domain knowledge on. Without exception the responses I got back where glibly superficial to the point the responses were almost totally devoid of meaningful information. The AI summary on Google search results is so bad it represents an assault on reason.

bmenrighMay 25, 2026, 12:22 PM

Every C program I've had codex write ended up costing me more time than had I just done it from the start myself. Whereas almost every Python program it's written for me saved me time, even including the time I spent cleaning it up.

I chalk this up to primary two reasons. First, I cared a lot more about the implementation details of the C program than I did the Python one, and second, it's just better at simple stand-alone python programs than it is at C programs.

The criteria I know use is "do I care about the implementation details of this?". If I do (because for example it's going to be long-term code that I need to maintain) then the agent likely isn't worth it. But if I don't, there are huge efficiency gains to be had using the agent.

luodaintMay 25, 2026, 6:15 AM

Data from six months of production from one SaaS codebase provides a more limited response. Maintainability doesn't depend on the level of AI usage. Maintainability depends on the discipline during diff reviews. Good sessions: One topic per session; scope defined prior to the agent starting; all diffs read prior to committing. Poor sessions: Broad scope; undefined constraints; rubber-stamped results.

The quality of the codebase decays precisely at the rate you stop reading the results. This is not an issue of AI writing the code. This is an issue of unreviewed code. geohot's issue is entirely valid. This problem does exist. But this isn't dependent on the generation phase.

baqMay 25, 2026, 5:21 AM

Wonder if LLMs in autoreasearch loops would be able to complete tasks geohot has in mind in say 100x average token budget.

If the answer is yes, the argument doesn’t matter: you just run the loop and wait for llm analog of moore’s law to get costs down.

f311aMay 25, 2026, 7:16 AM

Loops make the code even worse. The more local the changes, the better LLMs at it.

dalemhurleyMay 25, 2026, 9:52 AM

When I started coding with AI I would copy / paste code into GPT-3.5 and ask it to update the code, it was a massive productivity boost, minor changes, fully reviewed. Then VSCode allowed tabbing, it was okayish, but I had my finger on the pulse and knew exactly what was changed and had an opinion on the suggestions. Then cursor allowed you to see and approve changes, after a few changes it started making bigger changes but had an review and approve process, things were starting to feel more magic and required more discipline to be on top of changes. Then YOLO mode hit and you could make massive changes, slowly it became easier and easier to just let the AI build code and you just guide it.

The issue is people mix up complexity, novelty, repeatability and scale.

Well documented complex problems can easily be solved by LLMs.

Doing the same thing over and over again is easy for an LLM.

Novelty and scale is very hard for an LLM.

Even small novel problems confuse LLMs.

When you start a new code base the LLM smashes through the boilerplate work. Then when it gets to scale it struggles with context rot plus novelty.

athrowaway3zMay 25, 2026, 6:24 AM

This post hits the nail at a bit of an angle.

The AI agents are great, and any expert can prompt them correctly to get good code. LLMs occasionally pick wrong patterns and start digging a hole, but this is why an expert is required. The code itself is just not worth writing when a detailed prompt can get you the same code typing 20x less text.

Where I agree with the post is:

The adoption of AI agents into software engineering is a problem. Solo projects are great, but our teams have not adjusted to the speed-of-change to a mental model of a project. So I see orgs making a choice to either: slow down or forgo the shared mental model.

Anybody choosing to forgo the mental model is building crooked legacy slop at scale. You can and should save the mental model to an AGENTS.md, but devs need it in their brain to prevent the digging a hole behavior.

To be fair the digging a hole behavior is something humans do just as well. But in teams you'd communicate enough to catch it - hopefully^1. It's the combination of higher speeds and teams that's creating a bit of a disaster.

I'm not sure what a good solution is either. There is a case for solo devs running for 2-month sprints with much more freedom. Perhaps we'll have an "AI Agile manifesto" within a year.

[1] Though you should not underestimate the amount of poor code being created before LLMs. There are enough teams for whom LLMs are practically all upsides. Stay very far away from those.

jvidalvMay 25, 2026, 9:17 AM

Thanks for the message, is exactly what I have in my brain and how I'm seeing it unfold.

I'm lucky to be part of different codebases, +200 engineers codebase in a 10 years old company and code, +5 engineers on fresh code. My personal projects, that are beyond POC's, real users, hundreds of commits.

The LLM agent sweet spot is the last one, they are perfect, as I can contain most of the knowledge in my brain of how it works in/out. Speed is insane as a solo developer.

Then the 5 engineers codebase, is also really good, but here you already start to see the problems, thanks to agents you don't even need to care how it works, I have been working on it for +6 months, it uses TRPC and I don't even know (I don't care) how TRPC works. You feel that no one in the team really knows how stuff works at 100% (fresh codebase, we have build this ourselves!!).

Then there is the old codebase with +200 engineers, this is the worst of all of them, you described it perfectly, a bottomless pit of tech debt. This codebase before agents was an old non-typescript one, it was not perfect, but you could build a mental model and understand it perfectly after a few weeks working on it. Now, is a hot-mess of code duplication and the quality is degrading faster and faster as the code gets worse and the Claude Code adoption increases within the engineering team.

Not sure what will be the outcome of all this, but I wouldn't be surprised if some company wakes up in 2027 with a codebase that maintenance and development has increased by x100 fold thanks to Agents.

baqMay 25, 2026, 6:55 AM

‘If it hurts, stop’ vs ‘if it hurts, do more of it’. Organizations have a choice, some slow down to avoid, some speed up in hope to make issues… non-issues. If the go-fast orgs find workflows that actually truly speed things up without loss of quality, it’s like hitting the jackpot - you’ve found a way to run away from competition without them even realizing it’s possible (for a while anyway, until they notice they’re grossly outpaced).

fatata123May 25, 2026, 3:04 PM

[dead]

Erenay09May 25, 2026, 7:22 AM

While I was reading this post, Anthropic sent me an email with the subject line "Your account has been suspended". What a coincidence :D

totetsuMay 25, 2026, 6:42 AM

>When people see an artifact, they make assumptions about the process that was used to create it. Without even thinking about it, they assume the creator had a basically human state of mind. This assumption is no longer true. Things can be broken in ways that weren’t previously possible, and old proxies of underlying quality like syntax and grammar are useless. AI produced artifacts are not produced by the same process as human ones, and this difference, while extremely subtle in statistics, makes itself obvious when you try to interact with and build on the artifact in human ways.

Once Humans just had oral language, and we could us words to pass ideas from one human mind to another. Then with writing ideas could pass to minds that weren't immediately close together in space or time.. and with this we made complext global spanning civilization. When words just become noise, that one has to be suspect of each one as to whither they'er coming from another human mind, or just a statistical process, can this civilization even survive?

teo_zeroMay 25, 2026, 6:25 AM

When digital cameras replaced traditional ones, we thought it would make photography more democratic: each of us would be Helmut Newton for 15 minutes. But it didn't give us the beautiful portraits and inspired lanscapes we expected, only millions of pictures of food.

How much will it take for AI agents to pass from distilling decades of collective wisdom to copying each other's worst mistakes?

mikewarotMay 25, 2026, 6:31 AM

>But it didn't give us the beautiful portraits and inspired lanscapes we expected, only millions of pictures of food.

Here's a sample of my work using digital cameras, not a food picture in sight.

https://flickr.com/photos/---mike---/albums/7217772029640662...

The thing about having the ability to take effectively free photographs is that it really lets you experiment and learn the edges of what's possible.

I was inspired by Stanford's camera array, and wound up doing virtual focus synthetic aperture photography. I'm hoping to build a rig to do it on near real time, instead of the manual process I used to do on my train rides to and from work.

Sure, the removal of cost lead to a flood of the mundane, but it also means we can capture our lives in ways that even kings couldn't afford in the past. I have thousands of good photos, and even some video, of friends and family.

wietherMay 25, 2026, 10:56 AM

The comparison between AI and digital photography is a great one IMHO.

I used to be deep into photography in the 00s, until I came to the conclusion that I was spending lots of money, spending hours trying to take the perfect picture and then hours reworking them on the computer... for what?

Sure, I had fun while doing all of this, but to me art is about sharing. And given the sheer amount of pictures that were published daily on Flickr at the time, I basically had to spend more time in trying to reach out to people to share with them, than in creating something, which was my initial goal.

Or I could strong-harm my relatives, but I myself hate people that are forcing me to watch at their kids/holidays pictures, so...

And looking at the figures on your pictures, taking them at absolute value (not considering what personal value you can give them), it seems that your art is not seen by many. Contrary to thousands of random pictures of take-away than can reach thousands of eyes in a mater of hours.

Which is, to me, depressing. You put your soul in something and it has almost no reach, while random snapshots got millions of views.

And I see how the same thing will apply to software: flood of vibecoded trash that can be used by many, while deeply thought after software will be used by a few.

The good news for me here is that I never thought software as art but only a tool. I can vibecode the hell out of an issue, and the code will live on my private repos. Don't care. Sometimes I put stuff on Github but I don't care if someone (or something) uses it. I'm focussed on solving my issues.

mikewarotMay 25, 2026, 8:00 PM

>it seems that your art is not seen by many

Oh, there's a story there about assumptions and bad UI. I had fairly large numbers, and tried to whittle down the thousands of photos I had posted to the ones that were favorited, and in the process erased everyone elses favorite tags, leading to rage quitting Flickr for a while. It's all now a mere hint of what it was.

hmontazeriMay 25, 2026, 6:55 AM

When a blog like this goes completely black or white on a topic I get skeptical. Nothing in life is 0 or 1. So is AI. Has some good to it and some obvious issues. All not that big of a deal. Ppl try to position themselves on the edges bc that’s what’s polarizes and engages conversation…

fontainMay 25, 2026, 4:39 AM

We all remember cryptocurrency. Everyone in tech proclaimed fiat was dead, every office buzzed with talk of every possible way that cryptocurrency could be used, billions of dollars flooded in to projects losing money hand over fist. The cynics reacted to the froth with outright rejection of the idea. And today… cryptocurrency exists, it has some use, but it didn’t take over the world, it didn’t kill fiat, it was useful in some areas and worthless in others. AI will be the same. The noisiest proponents will be over exaggerating. The most cynical cynics will be underestimating. The result will be somewhere in the middle. Success will not be predicated on adoption of the technology. We, nerds, are bad at predicting the impact of technology.

eadwuMay 26, 2026, 1:35 AM

I do think that between luddite or "ai pilled" ai usage should be much more in favor of "ai pilled".

However, this isn't a plug to be using AI for coding everything, but a more general plug that AI should be integrated to a lot more things outside of the mainstay of chatbots.

There is a lot of merit to using AI to establish a new abstraction layer.

simianwordsMay 25, 2026, 4:56 AM

People misunderstand how AI is used in coding in normal work environments. New feature requirement comes - maybe you need a new service or some new classes. You need to do some research first.

You guide the AI with some prompts and give it some guidance on how to scenario-test it. It makes some classes, test methods. Maybe ~2000 lines and you do a quick verification, check if the overall idea looks okay. Ask it to fix a few design things and then merge it.

Its much easier than doing it yourself with all the boilerplate and understanding each esoteric language specific thing. Which library do I use for UDP communication in golang? The agent might have made a good assumption. These kind of things is where it speeds it up.

jhanschooMay 25, 2026, 5:25 AM

I don't think Geohot has a good idea about LeCun and Hutter's views on the limitations of LLMs. I think that on abstract, textual domains, LLMs perform superbly, and they would agree. I am not too well-informed about LeCun and Hutter's views either, but I think that:

LeCun thinks that LLMs are a bad fit for AI that understands the physical, dynamical systems that we inhabit, and that understanding this is necessary for AGI/ASI.

I don't know that Hutter is bearish on LLMs, but Hutter is interested in AI that can reason exceptionally well given infinite compute, and approximations of such a reasoning AI. I think he is open to the idea that LLMs can be such an approximation.

mreidMay 25, 2026, 6:02 AM

From the article:

> Without fully endorsing all their ideas, I’m now in the LeCun/Marcus camp on LLMs.

I'm pretty sure he means "Yann LeCun and Gary Marcus" not "Yann LeCun and Marcus Hutter".

fluxusarsMay 25, 2026, 8:11 AM

I think the ai-as-an-exoskeleton analogy is quite apt: 1. It still requires your intent to move in the right direction rather than being fully autonomous, and 2. It's too big and bulky to use for delicate tasks.

edg5000May 25, 2026, 9:25 AM

Yes. I think of it as a car or tractor (which is also a kind of exoskeleton). You still need intricate knowledge; it's really an amplifier. Steer it wrong and you'll have a 1000 (very hard to detect upfront) bugs in the blink of an eye. Indeed it's hard to wield. At a minimum you need to understand your harness at the character level - the exact shape of the context should be roughly known when operating a harness.

I'm very interested in APIs that allow client-side context construction rather than relying on opaque APIs concatenating strings from your JSON messages and injecting tool prompts. I found that generally, you can craft the entire context as a unicode string and just stuff it in the system message. This works best with models where the chat template is published.

StefanSkoMay 25, 2026, 6:42 PM

Very much agree. All these currently hyped workflows removing the human from the loop attribute a modularity to those agents that just does not hold up. They will always be leaky abstractions given their stochastic nature. That being said, they can be a great tool for getting past the "blank page" and just start or getting unstuck in general.

fagnerbrackMay 25, 2026, 5:55 AM

"Things can be broken in ways that weren’t previously possible" and also "Things can work in ways that weren’t previously possible". It all depends on what the use the tool for, if you're a carpenter you're going to do a bad job regardless if you have a fancy hammer or a basic one. If you're an expert, give them a basic hammer and they'll do the work, give them a fancy hammer and they'll do the same, perhaps a little faster (or not).

webprofusionMay 25, 2026, 5:19 AM

I don't think you can go completely hands-off for quality products but you can relax and let the agent do as much as possible. It does enable things that probably wouldn't have happened otherwise.

If you are already comfortable with letting other devs work on features then it's easier, because it's similar (arguably you have more control with AI, because what you say goes regardless of hierarchy).

anabisMay 26, 2026, 12:33 AM

Yet the Eternal September is what made the modern AI possible. I asked Cowork for share of the corpus before / after the event, and the corpus before it is <1%, which fits my hunch.

sgarrityMay 25, 2026, 9:55 AM

"When people see an artifact, they make assumptions about the process that was used to create it. Without even thinking about it, they assume the creator had a basically human state of mind. This assumption is no longer true."

I've been running into this experience with non-code artifacts, like slideshows and documents.

m132May 25, 2026, 11:26 AM

Didn't expect this to come from him. Seeing some of his recent YouTube streams and previous blog posts, he seemed like he has unconditionally bought into the idea of vibecoding, even as he had Opus 4.5 (latest at the time) stuck failing to enumerate a serial device for solid hours. What a turn.

vascoMay 25, 2026, 6:33 AM

> And whenever you need a quick prototype and don’t care about polish, it is absurdly fast. But is it a software engineer? Not close to the bar at any company I have worked at.

This line which he wrote, will override any quality gaps, because the cost to produce that shitty software will be lower than the cost to produce good software.

albinnMay 25, 2026, 5:32 AM

> It’s definitely a better Google for most searches

I can't agree with this. You tend to get one point of view, often without any actual resources and references so you have to go look it up yourself, on [insert search engine]. Plus, what does it say when we consider an AI the one stop for our data intakes.

oneeyedpigeonMay 25, 2026, 5:45 AM

I find that it's typically better than Google search has been for a while, but not better than it's ever been.

EkarosMay 25, 2026, 7:06 AM

More so tells me just how bad Google search has become and just how bad content in general has become.

jrvarela56May 25, 2026, 5:27 AM

how do you measure if google’s engineering org is more productive than meta’s? What about comparing 2 startups/small teams?

I think the discussion about methods (coding agents included) depends on answering those questions. Seems pointless to claim these agents [dont] make you more productive.

Although, at a first glance, the productivity increase does seem like nothing I’ve seen before. Even more than the transition of making webapps in plain js -> jquery -> frameworks or going from something like Flask to using Rails.

Problem is this is not evidence based. I just feel prototyping has speed up 100x. So the number of iterations/attempts has gone up. Transforming specs into a test suite takes a fraction of the time. Dunno, feels weird not to be able to be overall more productive (do more with less time) if you have these new tools.

gojomoMay 25, 2026, 5:30 AM

Smart guy but whoever eventually actually fixes X search will probably use AI coding assistance to do it.

p0w3n3dMay 25, 2026, 6:52 AM

  But each time I suspected I could have done it better and faster manually

There is a class of tasks that can't be done faster manually, unless you're some sort of colour-smells-like-chicken-and-numbers-have-taste genius. And there is other class (my suspicion now is any non-standard task+framework) that are slower than using agents. So I can imagine you have excellent experience with some tasks like USB hacking and would do it faster than LLM. On the other hand for me, as a Java developer, hacking a USB is finally possible with LLM. Otherwise I'd need to stop-and-learn for some time, which I wouldn't, so either I'd by a more expensive hardware that fulfills my requirements, or put the USB reverse engineering project to my 100 acre todo list

f311aMay 25, 2026, 7:12 AM

It all depends on the code quality bar. If it's high, a lot of tasks will not be completed much faster. The main speed comes from trusting LLMs output. When you review each change and reprompt LLMs to make the code look like you want. Suddenly, things become much slower and reviews/reprompts are very mentally exhausting.

pipeline_peakMay 25, 2026, 5:05 AM

The more specific your work is, the more these LLM’s seem to struggle.

If your work was previously googling stack overflow, it can be incredibly useful at working through that. Which let’s face it, that’s what a lot of us do.

mehdixMay 25, 2026, 11:41 AM

I have started to think of engineers who have entirely replaced their critical thought processes with AI agents, AI proxies.

sometimelurkerMay 25, 2026, 1:13 PM

AI labs should put some incentives in their RL to make their models write shorter code so it's easier to check

georgevenMay 25, 2026, 5:22 PM

The issue then becomes it writes insane one liners that are impossible to read. Would be difficult to set up an RL environment where the goal is correctness and readability.

sometimelurkerMay 25, 2026, 10:08 PM

you could have an AI judge look for correct and readable code

makerofthingsMay 25, 2026, 7:17 AM

I wish these posts that talk about non-human mistakes that agents make would post some examples. They would be interesting to see.

edg5000May 25, 2026, 9:34 AM

One nasty set of bugs Claude recently introduced; it was doing a large refactor which involved changing call sites to conform to a changed API. Tedious, but straight forward. It helpfully added about 50 if(!something) continue; statements, this would make the code silently absorb issues that should have thrown. Had I accepted this, the results would have made the program run like shit but not crash, making debugging much harder than it needs to be. Really effing annoying! Thanks Claude!

protocoltureMay 25, 2026, 5:01 AM

Eh but statistical models are obviously useful, because statistically 99% of your codebase wont involve new idea invention. Tools that write all the boilerplate code used to have names and job titles.

I hate how both the for and against case for LLMs are just so bloody terrible at addressing these things.

dathanb82May 25, 2026, 5:15 AM

This is a good take. The most effective combination of AI and skilled practitioner is using AI to amplify the abilities of the skilled practitioner. And in particular, max benefit comes from exploiting comparative advantage. AIs are really good at boilerplate -- in many cases better than humans because humans will optimize the process by doing copy/paste and often inject errors in the process -- whereas humans are better at abstract and critical reasoning. There's a very real and valuable use case for AI, but it's not replacing humans, it's taking the things that humans don't like doing (and that a computer can do well already) off the human's plate, so humans can focus more exclusively on the things that they do better than the AI. And at least with the current architecture of AI models, there will _always_ be higher-level reasoning that humans do better than the machine.

theojulienneMay 25, 2026, 5:30 AM

This. A ~staff software engineer designing big changes at one level above the raw implementation details using Opus 4.7 + superpowers today can genuinely ship multiple times more at the same quality level than pre-AI. The level of what a whole team could ship before.

You have to use something like superpowers, the key is that the humans need to make the important decisions.

You have to review the code - just like you had to review the code humans wrote. There will be iterations.

You have to give the LLM skills and patterns to follow, access to architectural documents, etc, just like humans needed to be onboarded at a company and do the same.

If you get all of these right with today's LLMs, you will never write code at all because it is so obviously not the best use of your time. If you feel that you are still better at writing the code manually, you have not done the above right, fix your workflow and try again.

zarzavatMay 25, 2026, 4:53 AM

It really feels like a mass psychosis. I'm not an AI sceptic insofar as I fully expect to get replaced by some future AI system. But what we have now isn't it.

To use a Geohot-inspired analogy, what we have now is like the Google self-driving car of 2010. It works most of the time, yet sometimes fails in unpredictable ways. So you need a safety driver behind the wheel to constantly watch what it's doing (the code review).

A real AI agent would not need a safety driver. We don't have that but many people are basically saying "fuck it, I'm just going to set this car off on its own and see what happens". And sure if you're prototyping it's not dangerous. But for production systems that is dangerous.

HerbManicMay 25, 2026, 4:59 AM

That is a fair analogy.

There is some very cool tech it just needs continued refinement, there is a path forwards even if it isn't always the clearest. This is happening but it is taking years and a lot of work to get done.

palla89May 25, 2026, 10:15 AM

I don't know why but I feel happy and relieved reading a piece of this written by Geohot himself.

rakel_rakelMay 25, 2026, 5:25 AM

> The bottom performers won’t have that self check. They are the ones producing 10x output with the agents. What do you think is happening to the average output of that organization?

Nailed it!

At my last place this was encouraged (by non-technical leadership driving the AI adoption policies, as well as setting salaries) and seen as a huge win.

The "step change in number of created PR's" was celebrated (cult-style), and by one of the (co) CEO's praised as a paradigm shift of the same magnitude as the personal computer. Meanwhile, I was stuck finding insta-reject level bugs in pull requests from people one-shotting 6000 line PR's "finally solving" long-standing issues from the backlog. Needless to say I left.

jreynarMay 25, 2026, 12:54 PM

Another problem with perception of AI tools, for coding and other things, is that people often adopt a one-size-fits-all view. If Claude/Codex whatever can fix a bug in my tiny hobby project then it's going to revolutionize all software engineering. If it can write a haiku, then it the great American novel will be dead in a few years and the novelists will starve.

There aren't many truly general purpose tools so viewing things this way seems like either a fantasy or an over-reaction. And if nothing else the processes we use will have to change along with the tools.

It's the early days so we still have a lot to figure out but one of the most significant is which tools are appropriate for what sort of tasks. I've had good luck refactoring a small code base, building some small hobby projects and building features for our company's product. But, I've also dodged bullets doing greenfield development on some features where Claude (my default) has made what seemed like sound choices early on, and which I approved of, only to build something fragile or with unforseen consequences. I haven't quite figured out what distinguished those situations from the successful ones but I'm trying. But it's complicated by the fact that things are evolving quickly and yesterday's failure mode isn't the same as today's and, for that matter, yesterday's successes aren't guaranted to be repeatable today.

practalMay 25, 2026, 9:44 AM

On Saturday I thought I had vibe coded myself into a mess. I had implemented a new block type in my structured editor for Practal Zero (or rather let Codex do it), and suddenly the syntax highlighting broke in the whole document. Asking Codex to fix it didn't work. I was contemplating to restart the whole project on a basis that I actually fully understand, but that would set me back so much when the first reasonable prototype seemed so close. Instead, I took a walk.

See, the project actually has a well thought out structure that I design carefully, but more and more of it gets filled out by Codex. Codex is not smart enough to remember all the high-level design considerations, some of which had not been documented because I was just implicitly assuming them. So the fix was to use Codex to isolate the error, think about in terms of the high-level design, and fix the problem, which was partially an implementation problem, and partially a problem of the high-level design.

I fixed the high-level design with discussions with Codex, and documenting this, and then let Codex implement the fixes. The discussion took me more than an hour, the implementation was done in a few minutes.

This working style is similar to doing math: You have a high-level idea of what you are doing, and let that guide you, and Codex assumes the role of something that fills out all of the details you take for granted. Often it turns out your high-level idea had flaws, and this shows up in your code not working as expected. So you revise your high-level idea, refactor the code to reflect the modified high-level design, rinse and repeat.

Working this way is still really hard, but it allows me to do things I could not have done before. Getting your ideas validated (or refuted) in minutes instead of days is huge, and makes it possible to march through stuff that would have turned into a deadly swamp before, at least for me.

Now. Do I think that most corporate programmers will use Codex or CC in this way? I don't know, but I think probably not. So what will stop them going into the swamp until it swallows them, instead of backing up in time and marching around it?

practalMay 25, 2026, 10:29 AM

To add, what also often happens in these discussions is that Codex suggests a design that makes no real sense at all, or that it brings up two or three design alternatives, and recommends exactly the wrong one.

readthenotes1May 25, 2026, 6:03 AM

There's a time and a place for assembly language programming. Of course, I knew someone who would say there's a time and a place for machine language programming (improved it by reprogramming a device by flipping the 17 switches on the front panel)

ChaosvexMay 25, 2026, 5:03 AM

> and it’s taking longer and longer to realize that they can’t

For something to take "longer and longer" to realise, doesn't they imply that it's been realised at least once before or that there was an expected deadline for the realisation?

Okay, that's a nitpick.

mrecMay 25, 2026, 5:19 AM

I read it as "agents can't program, and with each new generation of agents it's taking longer and longer to realize that that specific iteration can't". Maybe taking the Principle of Charity too far, I dunno.

ChaosvexMay 25, 2026, 5:37 AM

Nah, that's fair, I was just being a bit tongue-in-cheek.

thecatappsMay 25, 2026, 5:10 AM

Wake me up.... when sloptember ends.

mikojanMay 25, 2026, 6:44 AM

It won‘t..

DeathArrowMay 25, 2026, 5:18 AM

To me this sounds like an old cobbler complaining that machines aren't producing good shoes if left unsupervised and that the old process of making shoes completely by hand is far superior.

So what he is telling us? That agents are not infaillable and they are not capable to one shot complex software and they do not produce perfect code?

We know what and the solution is to use agents for what they are good at and work around their limitations and we have a human in the loop.

>not some RLVR shit that comments out the failing test and tells you all the tests are now passing

That's what harnesses should be about: detect when the agent is misbehaving and force it to take the right approach.

This example in particular should be easy to solve if we generated the tests before coding and we have a workflow or state machine that doesn't allow the agent to disable tests and doesn't allow it to reach the next stage unless all tests are passing.

Alex_L_WoodMay 25, 2026, 5:37 AM

LLM proponents always use some language like "these old, stuck up dinosaurs with their manual labor vs us cool smart kids with automated labor", but they forget one thing - with automated labor the performance and cost difference was easily measurable in favor of the automation. With LLMs it's neither measurable nor visible (no better software, no faster delivery overall in the industry), and the costs are pretty bad. Besides personal anecdotes of someone toiling away at yet another AI harness project on GitHub.

DeathArrowMay 25, 2026, 6:57 AM

Right now, to get some good results from AI and save time, you will have to spend a lot of tokens and money. Maybe in the future, the things will get better, I don't know.

Saving money is the wrong reason to use AI now. AI is expensive if you want good results.

But what AI is good for, is it allows you to build fast.

Also, I don't see everything being automated. To get good results you have to drive the AI.

The factories still have workers supervising the process and doing some high value manual processes even if most of the production is done by machines.

chrisco255May 25, 2026, 9:11 AM

Modern shoes are made by a mix of machine and hand. There is still quite a bit of manual labor to produce shoes: https://www.youtube.com/watch?v=bK8pcAYapXQ

This is effectively what's happening to software. We are getting some forms of automation but I believe there's plenty of manual work and coordination left for humans to do.

mattlondonMay 25, 2026, 8:07 AM

It's just a tool. Use it well or use it badly - just the same as any. If you are generating slop using the tool, we'll then that is your own problem.

For me, the AI is essentially "faster hands" that can type what I am thinking way faster than I can do it. I tell it what I want, I give it the broad architecture and design patterns/types to use, and any specific test conditions, and let it write all of that usually by the time I have responded to a single email or chat message or two. Custom instructions etc build overtime to address model blind spots or my own personal taste so I don't have to repeat myself in every prompt for cross-cutting things.

Does it "one shot it"? Almost never - we go around the cycle a few times, treating it like pair programming a junior or intern by keeping a close eye on the broad direction and making sure it is acceptable - course-correcting where it matters, but cutting some slack where it doesn't. Sometimes I ask it why it picked a particular approach (that I wouldn't have necessarily) and it gives me a cogent explanation and we go with it, so I actually sometimes learn new things from it too which is great.

The other use case is just it's sheer capacity to research a codebase and hold everything in it's attention at once. It can comprehend unfamiliar code way faster and way more in-depth than I can. So if you are in an unfamiliar code base or a language or framework you are not that familiar with, it absolutely shines because it can just absorb all that info in seconds, and then you can just drill it with questions and what-abouts and how does it do this and what technique is used for that and that, what are the existing patterns and norms in this codebase when it comes to foo or bar? Etc etc

What I am not doing is deferring everything off to the AI unless it really doesn't matter (e.g. disposable one-off or prototype code). Same that I would not expect a junior or intern to make big architectural decisions when implementing something - you keep them on a fairly close leash and watch what they are up to.

borskiMay 25, 2026, 8:14 AM

Precisely. It is faster than I am at the easy part of writing the code; but the decisions about what to write (at least at a high level) are still my own.

big-chungus4May 25, 2026, 6:36 AM

Why sloptember when it's may

MattJ100May 25, 2026, 7:07 AM

It's a reference to Eternal September, the name applied to the cultural shift of the internet as the general public started gaining access to it en masse in the 90s.

Sloptember is clearly a reference to this - the similarity being that masses of AI generated content, from social media posts to open source contributions are replacing the human internet. In a way this is related to the "dead internet theory", an idea I previously found hard to believe, but these days could easily be true.

If the history of the internet interests you, both these are worth looking up.

tardedmemeMay 25, 2026, 6:55 AM

https://en.wikipedia.org/wiki/Eternal_September

DeathArrowMay 25, 2026, 5:52 AM

On the other hand we see success stories such as antirez using agents to work on Redis and Deepseek v4 flash inference.

olalondeMay 25, 2026, 1:04 PM

Prediction: the author will wish the Internet Archive didn't exist in a few years.

4b11b4May 25, 2026, 6:08 PM

It depends

bassieeMay 25, 2026, 11:10 AM

My point is, before LLM's 90% of the code was already human made slop, now its just going toward computer generated slop instead.

biosubterraneanMay 25, 2026, 4:34 AM

preach it

dainiusseMay 25, 2026, 5:17 AM

The horse is better than a car!

sodafountanMay 25, 2026, 5:26 AM

Good News! The horse and the car can coexist

tonyedgecombeMay 25, 2026, 6:04 AM

Bad news! The horse population declined by 85% after the widespread introduction of the car.

chrisco255May 25, 2026, 9:03 AM

Good news! Buggy/carriage mechanics became auto mechanics and the number of mechanics jobs increased overall.

sodafountanMay 25, 2026, 6:30 AM

Good News! I learned something today.

tonyedgecombeMay 25, 2026, 12:24 PM

I wouldn’t get too excited, I got that figure from ChatGPT. Who knows if it’s correct.

tardedmemeMay 25, 2026, 10:35 AM

The car is better than the hyperloop!

slashdaveMay 25, 2026, 5:19 AM

> It is a golden era for buckets and buckets of slop, and a dark age for gems of quality.

I mean, this has been the trend for decades really, before LLMs were a thing. The incentive is skewed toward quantity rather than quality. The new tools just add more fuel to the fire.

Code quality is also really lacking in much of the industry. The truth is, these LLM models, as limited as they are, program at a level above that of the median junior programmer.

mthrowawayMay 25, 2026, 5:16 AM

Bro claims to write good code. He got fired <4 weeks from twitter. AI is hyped but code isnt that bad.

coolThingsFirstMay 25, 2026, 5:19 AM

spill the tea, fired or quit?

ModernMechMay 25, 2026, 3:33 PM

Wasn't enough room at Twitter for those two egos.

spiderfarmerMay 25, 2026, 4:42 AM

Coders underestimate the utility of AI in so many boring day to day tasks. If you freelance, that’s where the money is at, not in creating a startup that fills holes in AI offerings or in creating generic slop while hoping for ad money.

altcognitoMay 25, 2026, 4:52 AM

The amount of domain specific apps that will be created will likely make excel look like yesterday’s news.

Alex_L_WoodMay 25, 2026, 5:42 AM

I heard that a year ago, I guess we still need to wait a bit more. Thought agents were fast!

dansquizsoftMay 25, 2026, 5:30 AM

People have been trying to replace Excel for the last 40 years...

cmrdporcupineMay 25, 2026, 10:57 AM

"I don’t think models like this will ever be able to program,"

I don't get how anybody who has used the SOTA models in the last 3-4 months can write a sentence like this?

They most certainly can program. And usually better than 90% of my coworkers.

The question is really.. Can they engineer? By which I mean handle the duties of a software engineer working in a team, managing a large complex system, making reviewable pieces, forward progress in incremental steps, etc.

No, that part I'm definitely more skeptical about. That requires slave driving by the person in front of the prompt.

But this is a useful distinction to make. Because making overly pessimistic claims about the coding capabilities of the models makes me question the author's experiences with them.

I think agentic tools are toxic to team programming culture and engineering that produces reliable stable results. But I wouldn't for the life of me question their ability to write programs.

geraldsterlingMay 26, 2026, 1:15 PM

[flagged]

coalstartprobMay 25, 2026, 8:30 AM

[dead]

coalstartprobMay 25, 2026, 8:29 AM

[dead]

zenai666May 25, 2026, 7:20 AM

[flagged]

--usernameMay 25, 2026, 5:45 AM

[flagged]

blobbersMay 25, 2026, 4:56 AM

I don't think LeCun is saying they won't be able to program. I think he says we won't hit AGI. Programming does not require AGI; it's a pretty specific skill!

-- I think this article is COPE, if I'm being quite honest. I thought of putting cute analogies, like the C programmers saying the Python and Javascript programmers are not "hardcore" enough... but the truth should be obvious to anyone using LLMs effectively.

-- Current AI is a much better programmer than 100% of people and when directed by someone in that top 10%, it's a force majeur.

martinpwMay 25, 2026, 9:35 AM

> it's a force majeur

I assume you meant something like 'force multiplier"? Force Majeure is an uncontrollable event that prevents a party from fulfilling a contract. Which some may argue is what AI will also deliver. :)

cheevlyMay 25, 2026, 6:01 AM

This is my experience. Though ice been writing LLM harnesses, agents, tooling, etc for 5 years now and believe it requires several hundred hours of experience before understanding how to consistently outperform at scale.

simianwordsMay 25, 2026, 4:48 AM

Nah this person is dead wrong. Lets come back in 2 years and check on it. I'm willing to make a reasonable bet on these terms: companies will go even more AI native, will use even more tokens and spend even more money.

EDIT: To people downvoting me, please come up with a reasonable bet and lets try to work it out.

EDIT 2: $500 bet paid to your account on whether LLM's are going to still be used productively or not. No one?

EDIT 3: Any bet that would express the author's argument in a way that can be disproven in the future

coolThingsFirstMay 25, 2026, 5:06 AM

Geohot's next venture will be writing a book titled "Fear & Trembling".

SebastianSosaMay 25, 2026, 6:54 AM

A problem that's impossible to detect is indistinguishable from it working. Hence it works. Hence it's not a problem.

wyagerMay 25, 2026, 4:37 AM

> They are a highly sophisticated statistical model designed to mimic the distribution of programming

Are we really still doing this?

bluegattyMay 25, 2026, 4:56 AM

" Agents cannot program, and it’s taking longer and longer to realize that they can’t. They are a highly sophisticated statistical model designed to mimic the distribution of programming"

In other words - they can program, and probably better than you.

I don't like being too critical but this is a really superficial post - as if either 'AI is a Software Engineer - or - It must be Fraud'

It's an extremely powerful tool that is very 'pattern oriented' and with guidance can absolutely write great code - and even across modules given the right basis.

It's also great at so many other tasks - finding bugs in big code bases, doing migrations etc.

It's not going to make very goo architectural decisions for you, and if you're doing anything novel you have to read most of the code ... but that's too be expected.

nixgeekMay 25, 2026, 5:12 AM

It’s not like the author is a noob.

https://en.wikipedia.org/wiki/George_Hotz

In fact, he’s done several things that are truly hard, and has a well-deserved engineering reputation.

latentseaMay 25, 2026, 6:05 AM

Well he doesn't write CRUD apps, which plenty of us do, and with a decent harness, agents can do decent enough work on them.

bluegattyMay 25, 2026, 5:27 AM

The author is absurdly wrong.

It's ridiculous to suggest that 'AI can't code' - when the entire development world has moved into agentic coding, including all of the best developers in the world, and it's yielding positive results in most scenarios.

It's a callow 'bad twitter take' the length of an article.

He's not wrong to suggest that IA is a 'stochastic mechanism' over all the code that's ever been written, but that's evidence of the mechanism, and frankly, describe how it is able to code.

And yes - organizations will misappropriate AI at scale as they do with everything.

His premise is so far out of proportion and misguided, it's tantamount to 'fake moon landing' conspiracy theory.

kelnosMay 25, 2026, 5:35 AM

> when the entire development world has moved into agentic coding

Careful, your bubble is showing.

bluegattyMay 25, 2026, 11:55 AM

“By 2005 or so, it will become clear that the Internet's impact on the economy has been no greater than the fax machine.”

The Eternal Sloptember

Comments