by kcorbitt on 10/28/24, 5:17 PM with 95 comments
by jerjerjer on 10/28/24, 6:29 PM
> Even if the model gets extremely good at predicting final_score_if_it_hits_front_page, there’s still the inherent randomness of probability_of_hitting_front_page that is fundamentally unpredictable.
In addition to date, you might want to include three fields:
- day of week (categorical)
- is weekend/holiday (boolean)
- hour or time of day (categorical; either all 24 hours, or coarser buckets like morning/afternoon/evening).
The probability of a post hitting the front page is usually affected by these factors, so including them can really help the model.
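A minimal pandas sketch of deriving those three fields from a hypothetical created_at timestamp column (a real is-holiday flag would also need a holiday calendar on top of the weekend check):

  import pandas as pd

  # Hypothetical input: one timestamp per submission.
  df = pd.DataFrame({"created_at": pd.to_datetime([
      "2024-10-28 17:17", "2024-10-26 09:05", "2024-10-29 02:40",
  ])})

  df["day_of_week"] = df["created_at"].dt.day_name()     # categorical
  df["is_weekend"] = df["created_at"].dt.dayofweek >= 5  # boolean
  df["hour"] = df["created_at"].dt.hour                  # categorical, 0-23
  # Coarser alternative to 24 hourly buckets:
  df["time_of_day"] = pd.cut(df["hour"], bins=[0, 6, 12, 18, 24],
                             labels=["night", "morning", "afternoon", "evening"],
                             right=False)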
by kelnos on 10/28/24, 6:52 PM
* 1 had a score reasonably close (off by 8.4%) to what the model predicted
* 4 had scores wildly lower than the model predicted
* 2 had scores wildly higher than the model predicted
* the remaining 3 were not wildly off, but weren't really that close either (25%-42% off)
Then there's a list of 10 submissions that the model predicted would score between 33 and 135, but every one of them actually received a score of just 1.
The graph shown paints a bit of a better picture, I guess, but it's still not all that compelling to me.
by youoy on 10/28/24, 6:27 PM
> The correlation is actually not bad (0.53), but our model is very consistently over-estimating the score at the low end, and underestimating it at the high end. This is surprising; some variation on any given data point is expected, but such a consistent mis-estimation trend isn’t what we’d expect.
This is a consequence of the model objective. If the model can't tell what is really going to happen, regressing toward the mean is a good way of reducing the overall error. If it instead tried to exactly predict the very highs and very lows, it would incur very large errors on those points, resulting in a bigger overall error.
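A quick way to see this: even a model that predicts the conditional mean perfectly will look like it over-estimates the lows and under-estimates the highs once you bin by realized score, because at the extremes the unpredictable "luck" component dominates. A minimal numpy sketch on synthetic data (all names are illustrative, not from the post):

  import numpy as np

  rng = np.random.default_rng(0)
  n = 100_000

  # Latent quality the model can see, plus luck it can't.
  quality = rng.normal(0.0, 1.0, n)
  luck = rng.normal(0.0, 1.0, n)
  outcome = quality + luck

  # The MSE-optimal prediction given only quality is E[outcome | quality] = quality.
  pred = quality

  # Bin by *realized* outcome and compare means.
  for lo, hi in [(-4.0, -2.0), (-1.0, 1.0), (2.0, 4.0)]:
      m = (outcome >= lo) & (outcome < hi)
      print(f"outcome in [{lo:+.0f},{hi:+.0f}): "
            f"mean outcome {outcome[m].mean():+.2f}, mean prediction {pred[m].mean():+.2f}")

With equal variance on quality and luck, the mean prediction in the extreme bins comes out at roughly half the mean outcome: the same compressed-range pattern the post describes.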
Apart from that, I want to comment on AI alignment here. For me, the objective of "most upvotes" is not fully correlated with where I get the most value on HN. Most of the time, I would have found the most upvoted posts anyway on other platforms; it's the middle range that I really value. So be careful implementing this algorithm at scale, as it could turn the website into another platform with shitty AI recommendations.
by oli5679 on 10/28/24, 5:59 PM
https://scikit-learn.org/dev/modules/generated/sklearn.isoto...
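The link points at scikit-learn's IsotonicRegression. One way it could be applied here (a sketch under that assumption, with made-up numbers) is as a monotone calibration map from raw model predictions to realized scores on a held-out set:

  import numpy as np
  from sklearn.isotonic import IsotonicRegression

  # Made-up held-out predictions and realized scores, purely illustrative.
  preds = np.array([2.0, 5.0, 9.0, 30.0, 80.0, 150.0])
  actual = np.array([1.0, 1.0, 4.0, 55.0, 120.0, 400.0])

  # Fit a non-decreasing map from raw predictions to scores,
  # then use it to calibrate new predictions.
  iso = IsotonicRegression(out_of_bounds="clip").fit(preds, actual)
  calibrated = iso.predict(np.array([10.0, 100.0]))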
I also agree with your intuition that if your output is censored at 0, with a large mass there, it's good to create two models, one for likelihood of zero karma, and another expected karma, conditional on it being non-zero.
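A minimal sketch of that two-model ("hurdle") setup, assuming scikit-learn and hypothetical X/karma arrays (synthetic stand-ins, not the post's actual features):

  import numpy as np
  from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

  # Synthetic zero-inflated stand-ins for a feature matrix and raw karma.
  rng = np.random.default_rng(0)
  X = rng.random((1000, 5))
  karma = rng.poisson(0.3, 1000) * rng.integers(1, 50, 1000)

  nonzero = karma > 0

  # Model 1: likelihood of getting any karma at all.
  clf = GradientBoostingClassifier().fit(X, nonzero)

  # Model 2: expected karma, conditional on it being non-zero.
  reg = GradientBoostingRegressor().fit(X[nonzero], karma[nonzero])

  # Combined estimate: P(karma > 0) * E[karma | karma > 0].
  expected_karma = clf.predict_proba(X)[:, 1] * reg.predict(X)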
by swyx on 10/28/24, 6:09 PM
by Arctic_fly on 10/28/24, 6:32 PM
Based on the later analysis in the post (which I agree with), the total score of a submission is disproportionately tied to whether it hits the front page, and of course how long it stays there. Regardless of the quality of the average post, starting in 2015 the sheer quantity would make it impossible for all but a few to stay on the front page for very long. Hacker News got more popular, so each story got less prime time.
by kcorbitt on 10/28/24, 5:23 PM
by sdflhasjd on 10/28/24, 6:00 PM
by pclmulqdq on 10/28/24, 5:47 PM
by manx on 10/29/24, 6:11 AM
by Nevermark on 10/29/24, 11:04 AM
You would do better to leave out dates and authors.
Do you really want the model to home in on dates & authors? If you just trained on those, would it create anything useful?
It can't for dates, since it never sees examples from future dates that would prepare it for them. I suppose you could argue that month & day matter, but surely that would be a much lower-quality discriminator than forcing the model to stay focused on title & content.
Similarly with author. You can find out which authors produce content with the most upvotes with a simple calculation.
But again, is that the discriminator you want the model to use? Or the title & content? Because it will use the easiest discriminator it can.
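For reference, the "simple calculation" above can be a one-line groupby; a hypothetical pandas sketch:

  import pandas as pd

  # Illustrative data: one row per post, with author and final score.
  posts = pd.DataFrame({
      "author": ["alice", "bob", "alice", "carol", "bob"],
      "score":  [120, 3, 45, 7, 220],
  })

  # Mean score (and post count) per author, highest first.
  author_stats = (posts.groupby("author")["score"]
                       .agg(["mean", "count"])
                       .sort_values("mean", ascending=False))
  print(author_stats)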
by gavin_gee on 10/28/24, 11:19 PM
by 6gvONxR4sf7o on 10/28/24, 6:36 PM
by Havoc on 10/28/24, 5:46 PM
Did you ever figure out what happened in 2016?
by 1024core on 10/28/24, 8:02 PM
by hnburnsy on 10/29/24, 5:22 PM
Maybe the reputation of the poster is also a factor?
by metalman on 10/30/24, 7:26 AM
by eugenekolo on 10/28/24, 5:43 PM
by hn_throwaway_99 on 10/28/24, 11:29 PM
Well, thanks HN, you were good while it lasted...
by suyash on 10/28/24, 6:20 PM
by octocop on 10/29/24, 8:35 AM
by floobertoober on 10/28/24, 10:16 PM
by chx on 10/28/24, 7:57 PM
This is dangerous talk.
It doesn't understand anything at all.
Reminder: we are more prone to anthropomorphizing LLMs than to humanizing suffering humans.
by ChrisArchitect on 10/28/24, 6:23 PM
by ivanovm on 10/29/24, 6:04 PM