by bertman on 4/28/25, 6:16 AM with 129 comments
by n4r9 on 4/28/25, 10:34 AM
Firstly:
> LLMs are capable of appearing to have a theory about a program ... but it’s, charitably, illusion.
To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.
Secondly:
> Theories are developed by doing the work and LLMs do not do the work
Isn't this a little... anthropocentric? That's the way humans develop theories. In principle, could a theory not be developed by transmitting information into someone's brain patterns as if they had done the work?
by ebiester on 4/28/25, 1:52 PM
However, I have two counters:
- First, the rational argument right now is that one person plus money spent on LLMs can replace three or more programmers. That argument comes with a roughly three-year bound: the current technology will improve, and developers will learn how to use it to its potential.
- Second, the optimistic argument is that a combination of LLMs with larger context windows and other supporting technology around them will allow them to emulate a theory of mind similar to the average programmer's. Consider Go or Chess - we didn't think computers had the theory of mind to be better than a human, but they found other ways. For humans, Naur's advice stands. We cannot assume it still holds for tools with strengths and weaknesses different from ours.
by falcor84 on 4/28/25, 1:25 PM
> Second, you cannot effectively work on a large program without a working "theory" of that program...
I find the whole argument and particularly the above to be a senseless rejection of bootstrapping. Obviously there was a point in time (for any program, individual programmer and humanity as a whole) that we didn't have a "theory" and didn't do the work, but now we have both, so a program and its theory can appear "de novo".
So with that in mind, how can we reject the possibility that as an AI Agent (e.g. Aider) works on a program over time, it bootstraps a theory?
by andai on 4/28/25, 10:29 PM
The former is false, and the latter is kind of true -- the network does not update itself yet, unfortunately, but we work around it with careful manipulation of the context.
Part of the discussion here is that when an LLM is working with a system it designed, it understands it better than one it didn't, because the system matches its own "expectations" and "habits" (overall design, naming conventions, etc.).
I often notice humans creating complicated systems (e.g. 20-page-long prompts), adding more and more to the prompt to compensate for the fact that the model is fundamentally struggling to work in the way asked of it, instead of letting the model design a workflow that comes naturally to it.
by woah on 4/28/25, 4:56 PM
by BenoitEssiambre on 4/28/25, 3:36 PM
by philipswood on 4/28/25, 11:40 AM
It isn't certain that this framing is true. As part of learning to predict the outcome of the work token by token, LLMs very well might be "doing the work" as an intermediate step via some kind of reverse engineering.
by BiraIgnacio on 4/28/25, 12:43 PM
by andriesm on 4/29/25, 11:10 AM
However, is it really true that LLMs cannot reason AT ALL or cannot do theory construction AT ALL?
Maybe they are just pretty bad at it. Say 2 out of 10. But almost certainly not 0 out of 10.
They used to be at 0, and now they're at 2.
Systematically breaking down problems and systematically reasoning through the parts, as we see with chain-of-thought, hints that further improvements may come.
What most people now agree on, however, is that LLMs can learn and apply existing theories.
So if you teach an LLM enough theories it can still be VERY useful and solve many coding problems, because an LLM can memorise more theories than any human can. Big chunks of computer software still keep reinventing wheels.
The other objection from the article, that without theory building an AI cannot make additions or changes to a large code base very effectively, suggests an idea to try: before prompting the AI for a change on a large code base, prepend the prompt with a big description of the entire program, the main ideas and how they map to certain files, classes, modules, etc., and see if this doesn't improve your results (see the sketch below).
And in case you are concerned about documenting and typing out entire system theories for every new prompt, keep in mind that this is something you can write once and keep reusing (and add to incrementally over time).
Of course context limits may still be a constraint.
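A minimal sketch of that idea, assuming a hypothetical, hand-maintained THEORY.md in the repo and a placeholder send_to_llm standing in for whatever model API or agent tool you actually use:

```python
from pathlib import Path

def build_change_prompt(theory_path: str, change_request: str) -> str:
    """Prepend a reusable 'theory of the program' document to a change request."""
    # THEORY.md is hypothetical: a hand-written overview of the main ideas
    # and how they map to files, classes, and modules.
    theory = Path(theory_path).read_text(encoding="utf-8")
    return (
        "Overview of the program's design and how its ideas map to the code:\n\n"
        f"{theory}\n\n"
        "With that context, make the following change:\n"
        f"{change_request}\n"
    )

prompt = build_change_prompt("THEORY.md", "Add rate limiting to the login endpoint.")
# send_to_llm(prompt)  # placeholder: swap in your actual model API or agent tool
```

The same document can then grow incrementally as the code (and your theory of it) changes.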
Of course I am not saying "definitely AI will make all human programmers jobless".
I'm merely saying, these things are already a massive productivity boost, if used correctly.
I've been programming for 30 years, started using Cursor last year, and you would need to fight me to take it away from me.
I'm happy to press ESC to cancel all the bad code suggestions and still have all the good tab-completes, prompts, better-than-Stack-Overflow question answering, etc.
by analyte123 on 4/28/25, 4:00 PM
Indeed, it's quickly obvious where an LLM is lacking context because the type of a variable is not well-specified (or specified at all), the schema of a JSON blob is not specified, or there is some other secret constraint that maybe someone had in their head X years ago.
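A hypothetical before/after of the kind of context being described, with the "secret constraint" moved out of someone's head and into the code itself:

```python
from typing import Literal, TypedDict

# Before: the schema of this JSON blob lives only in the original author's head.
# payload = {"amt": 1999, "cur": "usd", "st": 2}

# After: the same blob with its constraints spelled out, so neither a new
# teammate nor an LLM has to guess what the fields mean or allow.
class Payment(TypedDict):
    amount_cents: int                       # integer cents, never floats
    currency: Literal["USD", "EUR", "GBP"]  # the only currencies handled
    status: Literal["pending", "captured", "refunded"]

payload: Payment = {"amount_cents": 1999, "currency": "USD", "status": "captured"}
```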
by xpe on 4/28/25, 3:55 PM
This is often the case but does not _have_ to be so. LLMs can use chain of thought to “talk out loud” and “do the work”. They can use supplementary documents and iterate on their work. The quality of course varies, but it is getting better. When I read Gemini 2.5’s “thinking” notes, it can indeed build up text that is not directly present in its training data.
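One hedged sketch of what “iterate on its work” can look like, with a placeholder llm() standing in for any chat model:

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to whichever chat model you use."""
    raise NotImplementedError

def draft_and_revise(task: str, rounds: int = 2) -> str:
    # First pass: ask the model to "talk out loud" before answering.
    answer = llm(f"Think step by step, then solve:\n{task}")
    for _ in range(rounds):
        # Later passes: the model critiques and then revises its own draft.
        critique = llm(f"Task:\n{task}\n\nDraft:\n{answer}\n\nList concrete flaws.")
        answer = llm(
            f"Task:\n{task}\n\nDraft:\n{answer}\n\nFlaws:\n{critique}\n\n"
            "Write an improved answer."
        )
    return answer
```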
Putting aside anthropocentric definitions of “reasoning” and “consciousness” is key to how I think about the issues here. I’m intentionally steering completely clear of consciousness.
Modern SOTA LLMs are indeed getting better at what people call “reasoning”. We don’t need to quibble over defining some quality bar; that is probably context-dependent and maybe even arbitrary.
It is clear LLMs are doing better at “reasoning” — I’m using quotes to emphasize that (to me) it doesn’t matter if their inner mechanisms for doing reasoning don’t look like human mechanisms. Instead, run experiments and look at the results.
We’re not talking about the hard problem of consciousness, we’re talking about something that can indeed be measured: roughly speaking, the ability to derive new truths from existing ones.
(Because this topic is charged and easily misunderstood, let me clarify some questions that I’m not commenting on here: How far can transformer-based models take us? Are data- and power-hungry AI models cost-effective? What viable business plans exist? How much short-term risk to, say, employment and cybersecurity? How much long-term risk to human values, security, thriving, and self-determination?)
Even if you disagree with parts of my characterization above, hear this: we should at least be honest with ourselves when we move the goal posts.
Don’t mistake my tone for zealotry. I’m open to careful criticism. If you do, please don’t try to lump me into one “side” on the topic of AI — whether it be market conditions, commercialization, safety, or research priorities — you probably don’t know me well enough to do that (yet). Apologies for the pre-defensive posture; but the convos here are often … fraught, so I’m trying to head off some of the usual styles of reply.
by IanCal on 4/28/25, 10:40 AM
> In this essay, I will perform the logical fallacy of argument from authority (wikipedia.org) to attack the notion that large language model (LLM)-based generative "AI" systems are capable of doing the work of human programmers.
Is any part of this intended to be valid? It's a very weak argument - is that the purpose?
by philipswood on 4/28/25, 11:49 AM
I suspect that the question to his final answer is:
> To replace human programmers, LLMs would need to be able to build theories by Ryle’s definition
by edanm on 4/28/25, 6:28 PM
by hnaccountme on 4/30/25, 10:06 AM
But most programmers I've encountered are just converting English to <programming language>. If a bug is reported, they convert English to <programming language> again.
AI is the new Crypto
by voidhorse on 4/28/25, 12:17 PM
There are alternative views on theorizing that reject flat positivistic reductions and attempt to show that theories are metaphysical and force us to make varying degrees of ontological and normative claims, see the work of Marx Wartofsky, for example. This view is far more humanistic and ties in directly to sociological bases in praxis. This view will support the author's claims much better. Furthermore, Wartofsky differentiates between different types of cognitive representations (e.g. there is a difference between full blown theories and simple analogies). A lot of people use the term "theory" way more loosely than a proper analysis and rigorous epistemic examination would necessitate.
(I'm not going to make the argument here but fwiw, it's clear under these notions that LLMs do not form theories, however, they are playing an increasingly important part in our epistemic activity of theory development)
by fedeb95 on 4/28/25, 3:49 PM
A problem as old as humanity itself.
by ninetyninenine on 4/28/25, 3:41 PM
I have a new concept for the author to understand: proof. He doesn’t have any.
Let me tell you something about LLMs. We don’t understand what’s going on internally. LLMs say things that are true and untrue, just like humans do, and we don’t know whether what they say reflects a general lack of theory-building ability, or whether they’re lying, or whether they have flickers of theory building and become delusional at other times. We literally do not know. The whole thing is a black box that we can only poke at.
What ticks me off is all these geniuses who write these blog posts with the authority of a know it all when clearly we have no fucking clue about what’s going on.
Even more genius is when he uses concepts like “mind” and “theory building”, the most hand-wavy, disagreed-upon words in existence, and rests his foundations on these words when people never really agree on what these fucking things are.
You can muse philosophically all you want and in any direction, but it’s all bs without definitive proof. It’s like religion: people made up shit about nature because they didn’t truly understand nature. This is the idiocy with this article. It’s building a religious following and making wild claims without proof.
by turtlethink on 4/28/25, 6:11 PM
The basic argument in the article above (and in most of this comment thread) is that LLMs could never reason because they can't do what humans are doing when we reason.
This whole thread is amusingly a rebuttal of itself. I would argue it's humans that can't reason, because of what we do when we "reason", the proof being this article, which is a silly output of human reasoning. In other words, the above argument for why LLMs can't reason is so obviously fallacious in multiple ways, the first of which is treating human reasoning as the golden standard of reasoning (and it is itself a good example of how bad humans are at reasoning).
LLMs use naive statistical models to find the probability of a certain output, like "what's the most likely next word". Humans use equally rationally-irrelevant models that are something along the lines of "what's the most likely next word that would have the best internal/external consequence in terms of dopamine or more indirectly social standing, survival, etc."
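For concreteness, the "most likely next word" step amounts to roughly this (a toy sketch, assuming the model has already scored each candidate token):

```python
import math

def next_token(scores: dict[str, float]) -> str:
    """Greedy decoding: softmax the scores, return the most probable token."""
    exps = {tok: math.exp(s) for tok, s in scores.items()}
    total = sum(exps.values())
    return max(exps, key=lambda tok: exps[tok] / total)

# Toy scores for continuing "The cat sat on the ..."
print(next_token({"mat": 4.1, "roof": 2.3, "keyboard": 1.7}))  # -> "mat"
```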
We have very weak rational and logic circuits that arrive at wrong conclusions far more often than right conclusions, as long as it's beneficial to whatever goal our mind thinks is subconsciously helpful to survival. Often that is simple nonsense output that just sounds good to the listener (e.g. most human conversation)
Think how much nonsense you have seen output by the very "smartest" of humans. That is human reasoning. We are woefully ignorant of the actual mechanics of our own reasoning. The brain is a marvelous machine, but it's not what you think it is.
by latexr on 4/28/25, 10:36 AM
> Go read Peter Naur's "Programming as Theory Building" and then come back and tell me that LLMs can replace human programmers
Which to me gives a very different understanding of what the article is going to be about than the current HN title. This is not a criticism of the submitter, I know HN has a character limit and sometimes it’s hard to condense titles without unintentionally losing meaning.
by karmakaze on 4/28/25, 11:38 AM