by ynniv on 12/27/24, 4:04 PM with 60 comments
Now I'm watching LLMs empirically, and they too seem to be moving faster than I expected. I used to have a solid list of things that separated people from AI, but it's mostly crossed off at this point. I'd like some fresh lines in the sand, and since HN is both technical and cynical, this seems like the perfect place to ask.
I do have a couple of caveats: I'm not interested in intentional or form factor limitations. You can easily add stateful memory, or a robotic body. I'm also trying to avoid subjective measures, so even if it isn't "good", an LLM can already write a book. Finally, I'd prefer to focus on things that are testable. It's tempting to say they won't be conscious or have a soul, but I don't see how I would ever be able to cross those off.
That said, let's see what you've got!
by idontwantthis on 12/27/24, 4:24 PM
by valbaca on 12/27/24, 6:28 PM
They’re so, so bad at saying “I don’t know” or “no results”. They’re still sycophants.
by mikewarot on 12/27/24, 5:38 PM
I hope to offer a few more decades of performance boost to the show, with Isolinear Memory[1], to help dissolve those limitations. ;-)
[1] https://bitgrid.blogspot.com/2024/09/bitgrid-and-isolinear-m...
by fzzzy on 12/27/24, 5:40 PM
by jitl on 12/27/24, 5:24 PM
They're bad at novel tasks and at subtlety. For example, when I ask for help with Postgres queries, LLMs frequently give me made-up syntax, or combine features from several different databases (Oracle, MS-SQL, Postgres) into a single answer.
by SirMaster on 12/27/24, 4:21 PM
by admissionsguy on 12/27/24, 6:32 PM
Solving freshman physics and maths problems that are a variation of common problems (since they will solve the common problem and not the variation). My experience with this makes me completely confident that all the claims about OpenAI models solving whatever maths problems they claim to be solving are bogus. "Testing on training is all you need" basically.
by andersco on 12/27/24, 4:28 PM
by jonifico on 12/27/24, 4:35 PM
by not_your_vase on 12/27/24, 5:16 PM
It gets very obvious (and annoying) when you ask non-trivial technical questions.
by jqpabc123 on 12/27/24, 7:34 PM
Instead, they offer possibilities of questionable reliability at high cost.
In other words, they turn the traditional model of computing upside down.
by rsynnott on 12/28/24, 8:13 AM
I mean, if ‘good’ is not a criterion, then so can a cat.
“What can they do _usefully_” is perhaps a more interesting question, and it’s a far shorter list. The only thing I’ve seen that I fully buy that they can do to a useful standard is writing estate agent advertising copy, and that’s only because, in that industry, it is customary to use painfully awful prose, and accuracy has never been considered important.
by scjody on 12/27/24, 4:40 PM
(But we're working on it.)
by al2o3cr on 12/27/24, 6:16 PM
"made a bet with my friends that we would have autonomous-only highway lanes in 2024"

Which... checks notes... we don't have. In the end I think that I was more right than wrong.
In exactly the same sense that "false" is mostly true, sure. /s

More concretely, here's a simple question put to Gemini ( https://g.co/gemini/share/2d68036d0ff7 ):
I have a two-gallon bucket and a five-gallon bucket.
How can I measure out 7 gallons of water?
And Gemini's scintillating reply:

You can't measure out exactly 7 gallons using only a 2-gallon bucket
and a 5-gallon bucket. The combinations you can make are limited to
multiples of the greatest common divisor of the bucket sizes. In this
case, the greatest common divisor of 2 and 5 is 1. This means you can
measure any whole number of gallons. Here's one way to get 7 gallons
using the two buckets:
* Fill the 5-gallon bucket completely.
* Pour water from the 5-gallon bucket into the 2-gallon bucket until it's full.
You now have 3 gallons left in the 5-gallon bucket.
* Empty the 2-gallon bucket.
* Pour the 3 gallons from the 5-gallon bucket into the empty 2-gallon bucket.
* Fill the 5-gallon bucket completely again.
* Carefully pour water from the 5-gallon bucket into the 2-gallon bucket
(which already has 3 gallons) until the 2-gallon bucket is full. This will
use 1 gallon from the 5-gallon bucket.
* You now have 4 gallons in the 5 gallon bucket. Empty the two gallon bucket.
* Pour the four gallons into the two gallon bucket twice. This will give you
eight gallons.
* Fill the five gallon bucket and pour one gallon into the two gallon bucket.
This will leave you with four gallons in the five gallon bucket. Empty the
two gallon bucket.
* Pour the four gallons into the two gallon bucket twice. This will give you
eight gallons.
You can't get exactly 7 gallons with just these two buckets.
You can get 1, 2, 3, 4, 5, 6, 8, 9, etc., but not 7 directly.
Reminding Gemini that pouring 3 gallons into a 2-gallon bucket is impossible produces even more nonsense:

You are absolutely right! My apologies. You can't pour 3 gallons into a 2-gallon
bucket. That was a significant error in my reasoning.
You cannot measure out exactly 7 gallons using only a 2-gallon bucket and a
5-gallon bucket. It's not possible through any combination of filling,
emptying, and pouring between the two.
Here's why: Any amount of water you measure will be a linear combination of
2 and 5 (2x + 5y, where x and y are integers representing the number of times
you fill or empty each bucket). The possible amounts are:
2
4
5
6
7 (2+5)
8
9
10 and so on.
However, you can't get exactly 1 or 3 by doing this.
My apologies for the initial incorrect response. Thank you for pointing out my mistake!
This second explanation is fractally wrong:

* 1 and 3 are both achievable by the "partially pour from the 5-gallon to the 2-gallon" approach
* this explanation for why 7 is impossible identifies 7 as possible and even decomposes it
Prompting a THIRD time, calling out those impossibilities, produces another round of "put 3 gallons into the 2-gallon bucket", then a detour that works out how to get 1 gallon in order to very inefficiently reach 7, and only then the obvious "fill both buckets" answer.
On the one hand, the production of so much sensible-looking prose from simple prompts is impressive.
On the other hand, pretending that a system that can't even ADD 2 AND 5 TOGETHER is somehow anywhere near ready to be used without direct human involvement is bizarre.
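Note that Gemini's own GCD argument actually proves the opposite of its conclusion: since 7 = 2 + 5, filling both buckets suffices. A brute-force search over bucket states makes this trivial to check; here's a minimal sketch (the function name and setup are mine, not from the thread):

```python
from collections import deque

def reachable_volumes(cap_a=2, cap_b=5):
    """Breadth-first search over bucket states (a, b) using the standard
    puzzle moves: fill a bucket, empty a bucket, or pour one into the
    other until the source is empty or the target is full."""
    start = (0, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        moved_ab = min(a, cap_b - b)   # how much a -> b can transfer
        moved_ba = min(b, cap_a - a)   # how much b -> a can transfer
        for state in (
            (cap_a, b), (a, cap_b),              # fill either bucket
            (0, b), (a, 0),                      # empty either bucket
            (a - moved_ab, b + moved_ab),        # pour a into b
            (a + moved_ba, b - moved_ba),        # pour b into a
        ):
            if state not in seen:
                seen.add(state)
                queue.append(state)
    # total water on hand in each reachable state
    return sorted({a + b for a, b in seen})

print(reachable_volumes())  # every total from 0 through 7 is reachable
```

The state (2, 5), i.e. both buckets full, is reached in two moves, so 7 gallons is not merely possible but the easiest target of all.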
by timelesstard on 12/27/24, 5:28 PM
fuck does this even mean
by chistev on 12/27/24, 4:21 PM