from Hacker News

Do not confuse a random variable with its distribution

by reqo on 6/21/24, 10:54 PM with 61 comments

  • by eru on 6/26/24, 3:25 AM

    > A random variable measures a numerical quantity which depends on the outcome of a random phenomenon.

    Hmm, that sentence at the beginning is already wrong. Random variables can measure anything, not just numbers. Heads or Tails of a coin, or colours of cars etc.

    It's fine to restrict yourself to numeric random variables only. But if you are writing a rant telling other people to be more careful in their analysis, you better dot your i's and cross your t's yourself.

  • by auraai on 6/26/24, 5:49 AM

    This is pretty important in mathematical finance, where one moves from a real-world measure to a risk-neutral measure to make computations feasible.

    https://en.wikipedia.org/wiki/Girsanov_theorem

    https://en.wikipedia.org/wiki/Risk-neutral_measure

  • by btown on 6/26/24, 7:00 AM

    For the code-minded out there, a "random variable" is something of a lazily evaluated value that can be "sampled" and emit a quantity (or a vector/tensor thereof) each time. And the OP article boils down to the fact that it's generally incorrect to assume that any random variable can be represented solely by its unconditional probability distribution; a distribution is more of a visualization than a sufficient definition. Rather, one must track the entire graph of other random variables that may feed the current one (e.g. that the current one is conditional on), akin to how an Excel spreadsheet models all the dependencies of a cell.
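
    To make the "lazy value plus dependency graph" picture concrete, here is a minimal sketch (the class and names are my own invention, nothing from the article):

```python
import random

class RandomVariable:
    """A lazily evaluated value: each joint draw propagates one shared
    outcome ("omega") through the whole dependency graph."""
    def __init__(self, fn, parents=()):
        self.fn = fn            # how to compute this node from its parents
        self.parents = parents  # the random variables this one depends on

    def sample(self, cache):
        # The cache plays the role of a single omega: every node is
        # evaluated at most once per draw, like spreadsheet cells
        # recomputed from the same inputs.
        if self not in cache:
            cache[self] = self.fn(*(p.sample(cache) for p in self.parents))
        return cache[self]

def joint_sample(*rvs):
    cache = {}
    return tuple(rv.sample(cache) for rv in rvs)

# X ~ Bernoulli(0.5); Y = 1 - X is a function of the *same* outcome, so
# X and Y share a distribution but are never equal.
X = RandomVariable(lambda: random.randint(0, 1))
Y = RandomVariable(lambda x: 1 - x, parents=(X,))

draws = [joint_sample(X, Y) for _ in range(1000)]
assert all(x != y for x, y in draws)  # the marginal distributions alone can't reveal this
```

    Note that sampling X and Y through the same cache is exactly the "track the whole graph" point: without it, each would be an independent Bernoulli(0.5) draw.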

    The fun part comes when you can ask this computation graph: "what parameters for a random variable early on in the chain would be the ones that optimize some function of variables later in the chain?" And, handwaving a ton of nuance here, when those parameters are weights in a neural network, the function is a loss function on the training data, and the optimization is done by automatic differentiation (e.g. https://pytorch.org/tutorials/beginner/introyt/autogradyt_tu...), you have modern AI.

    If you're interested in the theoretical underpinnings here, Bishop's PRML is perhaps the classic starting point: https://www.microsoft.com/en-us/research/uploads/prod/2006/0...

  • by panic on 6/26/24, 6:36 AM

    There’s an interesting connection here to another article on the front page: https://news.ycombinator.com/item?id=40794786

    In that article, squaring a number in interval arithmetic is different from multiplying two independent numbers with the same interval. Here, squaring a random variable is different from multiplying two independent random variables with the same distribution.
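
    A quick sketch of the same point for random variables (my own toy example, not from either article): take X uniform on {-1, 1}; squaring X always yields 1, while multiplying two independent copies does not.

```python
import random

random.seed(0)
N = 100_000
xs = [random.choice([-1, 1]) for _ in range(N)]  # X uniform on {-1, 1}
ys = [random.choice([-1, 1]) for _ in range(N)]  # independent copy, same distribution

squares = {x * x for x in xs}               # the same draw multiplied by itself
products = {x * y for x, y in zip(xs, ys)}  # two independent draws multiplied

assert squares == {1}       # X^2 is constant
assert products == {-1, 1}  # X * X' takes both values
```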

  • by condwanaland on 6/26/24, 2:19 AM

    Love to see things built with bookdown, which is such an awesome R package (although its successor, Quarto, is much better and simpler)

  • by jhrmnn on 6/26/24, 4:18 AM

    In quantum mechanics, the measurement and observation are two sides of the same coin, and the sample space is _defined_ by the random variable (observable) of interest, so it makes a little less sense to separate the two. (There is no hidden observation-independent sample space.)

  • by kazinator on 6/26/24, 6:46 AM

    Who confuses a random variable with its distribution, and what does that mistake look like? I don't get it.

  • by dwqwdqd on 6/26/24, 2:10 AM

    Does this work?

    X = 1 with probability 0.5, 0 with probability 0.5

    Y = 0 when X = 1, 1 when X = 0 (for the \omega for which X(\omega) = 1, Y(\omega) = 0).

    They're both Bernoulli distributions with p = 0.5 (i.e. they follow the same distribution), and yet P(X = Y) = 0.
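
    A short simulation of this pair (my own sketch, not from the thread):

```python
import random

random.seed(1)
# X ~ Bernoulli(0.5); Y = 1 - X is defined on the same outcome.
pairs = [(x, 1 - x) for x in (random.randint(0, 1) for _ in range(10_000))]

p_x1 = sum(x for x, _ in pairs) / len(pairs)
p_y1 = sum(y for _, y in pairs) / len(pairs)

assert abs(p_x1 - 0.5) < 0.02 and abs(p_y1 - 0.5) < 0.02  # same Bernoulli(0.5) marginals
assert not any(x == y for x, y in pairs)                  # but X and Y are never equal
```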

  • by baking on 6/26/24, 1:31 AM

    Why make it complicated? One coin flip, X = Heads and Y = Tails. P(X = Y) = 0.

  • by Davidzheng on 6/26/24, 2:47 AM

    Honestly, it's fine to confuse a random variable with its distribution if you're only working with a single RV. Changing the probability space without changing the distribution doesn't really matter much; the probability space is more of an abstraction, and it isn't really measurable anyway.

  • by tpoacher on 6/27/24, 8:18 PM

    I've often felt that one of the reasons such warnings are even necessary is that the notation we use to denote probabilities in the first place is atrocious, and clearly an abuse of notation.

    A better convention would make clear the distinction between the set of possible outcomes, the act of obtaining a (range of) samples from that set, and the probability that those events match a value range of interest. p(x=X) is not enough to capture all that information, let alone p(x) vs p(X).

  • by clircle on 6/26/24, 2:09 AM

    Whoops, you posted the wrong page. The statistics page that Hacker News needs to read is the one about how the Central Limit Theorem doesn't apply to every damn thing.

  • by dinobones on 6/26/24, 5:14 AM

    These types of explanations are the reason I dislike school. This is such a stuffy and contrived way to explain things.

    I’m so glad I have ChatGPT now; I always ask for applied examples and ask it to explain things intuitively. I would’ve been a 4.0 student if I’d had ChatGPT as my personal tutor when I was in school.

  • by glitchc on 6/26/24, 4:20 AM

    This article is overly complicated. The random variable X is just a function mapping each outcome to a number. Its distribution says how likely those values are; for a continuous variable the distribution is described by the probability density function, or pdf, and the cumulative distribution function, the cdf, is in turn the integral of the pdf.
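
    To nail down the pdf/cdf relationship numerically, here is a quick check for the standard normal (my own sketch, using the closed-form cdf via the error function):

```python
import math

def pdf(t):
    # Standard normal density
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def cdf(x):
    # Closed form via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def cdf_numeric(x, lo=-8.0, n=10_000):
    # Midpoint-rule integral of the pdf from far in the left tail up to x
    h = (x - lo) / n
    return h * sum(pdf(lo + (i + 0.5) * h) for i in range(n))

# The integral of the pdf reproduces the cdf.
for x in (-1.0, 0.0, 1.5):
    assert abs(cdf(x) - cdf_numeric(x)) < 1e-6
```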