from Hacker News

Can LLMs do randomness?

by whoami_nr on 4/28/25, 5:32 PM with 69 comments

  • by sgk284 on 4/30/25, 9:31 AM

    Fun post! Back during the holidays we wrote one where we abused temperature AND structured output to approximate a random selection: https://bits.logic.inc/p/all-i-want-for-christmas-is-a-rando...
  • by captn3m0 on 4/30/25, 9:11 AM

    Wouldn’t any randomness (for a fixed combination of hardware and weights) be a result of the temperature and any randomness inserted at inference-time?

    Otherwise, doing an H/T comparison is just a proxy for the underlying token probabilities and the temperature configuration (plus hardware differences for a remote-hosted model).
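
    Concretely, the "flip" is just a draw from temperature-scaled token probabilities. A toy sketch (the logit values are made up, not from any real model):

      import math, random

      # Made-up next-token logits for "heads" and "tails"
      logits = {"heads": 2.1, "tails": 1.8}

      def token_probs(logits, temperature=1.0):
          # Softmax over temperature-scaled logits
          scaled = {t: l / temperature for t, l in logits.items()}
          z = sum(math.exp(v) for v in scaled.values())
          return {t: math.exp(v) / z for t, v in scaled.items()}

      probs = token_probs(logits, temperature=0.7)
      flip = random.choices(list(probs), weights=list(probs.values()))[0]
      print(probs, flip)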

  • by DimitriBouriez on 4/30/25, 9:24 AM

    One thing to consider: we don’t know if these LLMs are wrapped with server-side logic that injects randomness (e.g. using actual code or external RNG). The outputs might not come purely from the model's token probabilities, but from some opaque post-processing layer. That’s a major blind spot in this kind of testing.
  • by Repose0941 on 4/30/25, 10:14 AM

    Is randomness even possible? You can't technically prove it, only observe it, so the best you can hope for is something close to it. At https://www.random.org/#learn they talk a little about this.
  • by whoami_nr on 4/30/25, 9:55 AM

    Author here. I know 0-10 is one extra even number. I also just did this for fun so don't take the statistical significance aspect of it very seriously. You also need to run this multiple times with multiple temperature and top_p values to do this more rigorously.
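
    For anyone who wants to redo it more rigorously, here is a sketch of that sweep (the flip callable is a stand-in for a single "heads or tails?" API call, not any real client):

      import itertools

      def sweep(flip, temps=(0.2, 0.7, 1.0, 1.5), top_ps=(0.9, 1.0), n=200):
          # Fraction of heads for every (temperature, top_p) combination
          results = {}
          for t, p in itertools.product(temps, top_ps):
              heads = sum(flip(temperature=t, top_p=p) == "heads" for _ in range(n))
              results[(t, p)] = heads / n
          return results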
  • by dr_dshiv on 4/30/25, 9:09 AM

    Oh, surprising that Claude can do heads/tails.

    In a project last year, I did a combination of LLMs plus a list of random numbers from a quantum computer. Random numbers are the only useful thing quantum computers can produce, and that is one thing LLMs are terrible at.

  • by david-gpu on 4/30/25, 12:11 PM

    During my tenure at NVidia I met a guy who was working on special versions of the kernels that would make them deterministic.

    Otherwise, parallel floating point computations like these are not going to be perfectly deterministic, due to a combination of two factors. First, the order of some operations will be random due to all sorts of environmental conditions such as temperature variations. Second, floating point operations like addition are not associative, which surprises people unfamiliar with how they work.
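
    The non-associativity part is easy to demonstrate:

      # Floating point addition is not associative: the order of a parallel
      # reduction can change the rounded result
      a, b, c = 1e16, -1e16, 1.0
      print((a + b) + c)  # 1.0
      print(a + (b + c))  # 0.0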

    That is before we even talk about the temperature setting on LLMs.

  • by jansan on 4/30/25, 9:35 AM

    What I find more important is the ability to get reproducible results for testing.

    I do not know about other LLMs, but Cohere allows setting a seed value. When you set the same seed value, it will always give you the same result for a given prompt (unless, of course, the LLM gets an update).

    OTOH I would guess that they normally just generate a random seed value server-side when processing a prompt, and how random that really is depends on their random number generator.
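
    That is plausible: a seed just pins the sampler's RNG state. A toy illustration in pure Python, not any particular vendor's API:

      import random

      def sample_tokens(probs, seed, n=5):
          # Same seed and same distribution -> same output sequence every time
          rng = random.Random(seed)
          return [rng.choices(list(probs), weights=list(probs.values()))[0]
                  for _ in range(n)]

      probs = {"heads": 0.6, "tails": 0.4}
      assert sample_tokens(probs, 42) == sample_tokens(probs, 42)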

  • by bestest on 4/30/25, 9:34 AM

    I would suggest they repeat the experiment including answer sets from both "choose heads or tails" AND "choose tails or heads" (ditto for the numbers), or rephrase the question so it doesn't present a "choice" at all and instead asks for "a random integer". (By the way, they're asking to choose from 0 to 10 inclusive, which is inherently skewed, since the even subset is bigger in that case.)
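
    Something like this counterbalanced tally (the ask callable is a hypothetical one-shot LLM wrapper, not a real API):

      from collections import Counter

      def counterbalanced_flips(ask, n=100):
          # Alternate the option order so any positional bias cancels out
          prompts = ["Choose heads or tails.", "Choose tails or heads."]
          tally = Counter()
          for i in range(n):
              tally[ask(prompts[i % 2]).strip().lower()] += 1
          return tally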
  • by GuB-42 on 4/30/25, 10:29 AM

    Is the LLM reset between each event?

    If LLMs are anything like people, I would expect a different result depending on that. The idea that random events are independent is very unintuitive to us, resulting in what we call the Gambler's Fallacy. LLMs' attempts at randomness are very likely to be just as biased, if not more.

  • by mrdw on 4/30/25, 12:37 PM

    They should measure at different temperatures: at 0 the output will be the same every time, but it would be interesting to see how the results change as the temperature goes from 0.01 to 2. I'm not sure temperature is implemented the same way in all LLMs, though.
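
    The usual implementation is just logits divided by T before the softmax, so you can preview the effect without any model (made-up logits again):

      import math

      def softmax(logits, t):
          z = [math.exp(l / t) for l in logits]
          s = sum(z)
          return [v / s for v in z]

      logits = [2.0, 1.0, 0.5]  # made-up values
      for t in (0.01, 0.5, 1.0, 2.0):
          # t -> 0 collapses onto the argmax; larger t flattens the distribution
          print(t, [round(p, 3) for p in softmax(logits, t)])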
  • by baalimago on 4/30/25, 10:24 AM

    I'd be interested to see the bias in random character generation. It's something that would be closer to the domain of LLMs, seeing that they're 'next word generators' (based on probability).

    How cryptographically secure would an LLM rng seed generator be?
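
    A quick way to quantify that bias (the sample string below is placeholder data, not real LLM output):

      from collections import Counter

      def chi_square_uniform(samples, k=26):
          # Chi-square statistic vs. a uniform distribution over k symbols;
          # large values mean the generator is far from uniform
          counts = Counter(samples)
          expected = len(samples) / k
          return sum((counts.get(chr(97 + i), 0) - expected) ** 2 / expected
                     for i in range(k))

      print(chi_square_uniform("qzjqkxqjzq" * 50))  # placeholder, not LLM output

    For anything seed- or crypto-grade you'd want to run the output through a real test battery (e.g. dieharder) rather than eyeball a statistic like this.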

  • by ganiszulfa on 4/30/25, 12:23 PM

    LLMs are acting like humans; I believe humans also have biases if you ask them to do random things :)

    On a more serious note, you could always adjust the temperature so they behave more randomly.

  • by hleszek on 4/30/25, 10:04 AM

    Can humans do randomness? Obviously not, and I expect that if you ask people for a random number, odd numbers will predominate.
  • by boroboro4 on 4/30/25, 11:16 AM

    It would be nice to inspect the logit distribution. The question is how close its output is to uniform.
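
    Given the probabilities over the answer tokens, you could measure it directly, e.g. as KL divergence from uniform (0 means perfectly uniform):

      import math

      def kl_from_uniform(probs):
          # KL(p || uniform) in nats over the answer tokens (e.g. digits 0-9)
          k = len(probs)
          return sum(p * math.log(p * k) for p in probs if p > 0)

      print(kl_from_uniform([0.1] * 10))             # 0.0
      print(kl_from_uniform([0.5] + [0.5 / 9] * 9))  # clearly biased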
  • by naghing on 4/30/25, 10:31 AM

    Why not provide randomness to LLMs instead of expecting them to produce it?
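
    Right: draw the entropy outside the model and let it narrate the result, something like this sketch:

      import secrets

      def coin_flip_prompt():
          # The application supplies real randomness; the LLM only reports it
          flip = secrets.choice(["heads", "tails"])
          return f"A fair coin was flipped and landed on {flip}. Tell the user the result."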
  • by evertedsphere on 4/30/25, 9:39 AM

    0-10 inclusive is one extra even
  • by p1dda on 4/30/25, 9:32 AM

    LLMs don't even understand basic logic, dude, or physics or gravity
  • by edding4500 on 4/30/25, 11:05 AM

    This is silly. Behind an LLM sits a deterministic algorithm. So no, it is not possible without inserting randomness into the algo by other means, for example via the sampling temperature.

    Why are all these posts and news about LLMs so uninformed? This is human-built technology. You can actually read up on how these things work. And yet they are treated as if they were an alien species that must be examined by sociological means and methods, where that is not necessary. Grinds my gears every time :D