by Jhsto on 2/10/25, 5:05 PM with 584 comments
by brink on 2/10/25, 9:13 PM
The developer of Balatro made an award-winning deck-builder game by not being aware of existing deck builders.
I'm beginning to think that the best way to approach a problem is by either not being aware of, or disregarding, most of the similar efforts that came before. This makes me kind of sad, because the current world is so interconnected that we rarely see such novelty; people tend to "fall in the rut of thought" of those that came before. The internet is great, but it also homogenizes the world of thought, and that kind of sucks.
by abetusk on 2/11/25, 6:51 AM
This is just a quick overview from a single viewing of the video, but it's called "funnel hashing". The idea is to split into exponentially smaller sub arrays, so the first chunk is n/m, the second is n/(m^2), etc. until you get down to a single element. Call them A0, A1, etc., so |A0| = n/m, |A1| = n/(m^2) etc., k levels in total.
Try inserting into A0 c times. If it fails, try inserting into A1 c times. If it fails, go down the "funnel" until you find a free slot.
Call δ the fraction of slots that are empty (I'm unclear if this is a parameter that gets set at hash table creation or one that's dynamically updated). Setting c = log(1/δ) and k = log(1/δ) gives worst-case complexity O(log^2(1/δ)).
This circumvents Yao's result by not being greedy. Yao's result holds for greedy insertion and search policies, and the above is non-greedy, as it cascades down the funnels.
There are probably many little hairy details to work out but that's the idea, as far as I've been able to understand it. People should let me know if I'm way off base.
This very much reminds me of the "Distinct Elements in Streams" idea by Chakraborty, Vinodchandran and Meel[2].
[0] https://news.ycombinator.com/item?id=43007860
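A minimal sketch of the cascade described above (my own toy code, not the paper's construction; the class name is made up, m and c are arbitrary defaults rather than the log(1/δ) values above, and deletions and resizing are ignored):

    class FunnelSketch:
        """Toy cascade of shrinking sub-arrays A0, A1, ... (open addressing, no deletions)."""

        def __init__(self, n, m=2, c=3):
            self.c = c                       # probe attempts per level; the comment ties this to log(1/δ)
            self.levels = []                 # |A0| ~ n/m, |A1| ~ n/m^2, ..., down to a single slot
            size = n // m
            while size >= 1:
                self.levels.append([None] * size)
                size //= m

        def insert(self, key):
            for lvl, arr in enumerate(self.levels):
                for attempt in range(self.c):
                    slot = hash((key, lvl, attempt)) % len(arr)
                    if arr[slot] is None:
                        arr[slot] = key
                        return True
            return False                     # cascaded off the bottom of the funnel

        def contains(self, key):
            # Search replays the exact insertion probe sequence.
            for lvl, arr in enumerate(self.levels):
                for attempt in range(self.c):
                    slot = hash((key, lvl, attempt)) % len(arr)
                    if arr[slot] is None:
                        return False         # insert would have stopped at this empty slot
                    if arr[slot] == key:
                        return True
            return False

The non-greedy part is visible in insert: instead of probing one big array until an empty slot turns up, a failed batch of c probes drops you down to the next, smaller level.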
by monort on 2/11/25, 1:29 AM
by orlp on 2/10/25, 8:37 PM
This means insertions when the hash table is less full are slower, but you avoid the worst-case scenario where you're probing for the last (few) remaining open slot(s) without any idea as to where they are.
[1]: https://arxiv.org/pdf/2501.02305
---
An interesting theoretical result but I would expect the current 'trick' of simply allocating a larger table than necessary to be the superior solution in practice. For example, Rust's hashbrown intentionally leaves 1/8th (12.5%) of the table empty, which does cost a bit more memory but makes insertions/lookups very fast with high probability.
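A rough illustration of that trick (a toy wrapper, not hashbrown's actual code; only the 7/8 threshold comes from the comment above):

    class SpaciousTable:
        """Toy linear-probing table that refuses to get more than 7/8 full."""

        def __init__(self, capacity=8):
            self.slots = [None] * capacity
            self.count = 0

        def _grow(self):
            old = [kv for kv in self.slots if kv is not None]
            self.slots = [None] * (len(self.slots) * 2)
            self.count = 0
            for k, v in old:
                self.insert(k, v)

        def insert(self, key, value):
            # Resize before crossing 7/8 occupancy, so probe sequences stay short.
            if (self.count + 1) * 8 > len(self.slots) * 7:
                self._grow()
            i = hash(key) % len(self.slots)
            while self.slots[i] is not None and self.slots[i][0] != key:
                i = (i + 1) % len(self.slots)    # linear probing
            if self.slots[i] is None:
                self.count += 1
            self.slots[i] = (key, value)

Spending that 12.5% of memory means a probe almost never has to walk far, which is the practical trade-off being described here.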
by trebligdivad on 2/10/25, 10:09 PM
by quantum2022 on 2/10/25, 10:00 PM
by joe_the_user on 2/10/25, 8:41 PM
What I realized is that the theory of hash tables involves a fixed-size collection of objects. For this fixed collection, you create a hash function, use it like a vector index, and store the collection in a (pre-allocated) vector. This gives a (fuzzy-lens'd) recipe for O(1)-time insert, deletion and look-up. (The various tree structures, in contrast, don't assume a particular size.)
The two problems are that you have to decide the size beforehand, and that if your vector gets close to full, your insert (etc.) operations might bog down. So scanning the article, it seems this is a solution to the bogging-down part: it allows quick insertion into a nearly-full table. It seems interesting and clever but actually not a great practical advance. In practice, rather than worrying about a clever way to fill the table, I'd assume you just increase your assumed size.
Edit: I'm posting partly to test my understanding, so feel free to correct me if I'm not getting something.
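A toy version of that picture (my own sketch, not from the article): a pre-allocated, fixed-capacity table where inserts are O(1) on average but visibly bog down as it approaches full.

    import random

    class FixedTable:
        """Pre-allocated vector, hash used as the index: the classic fixed-size picture."""

        def __init__(self, capacity):
            self.slots = [None] * capacity

        def insert(self, key):
            probes = 0
            i = hash(key) % len(self.slots)
            while self.slots[i] is not None:
                probes += 1
                if probes > len(self.slots):
                    raise RuntimeError("table is full")
                i = (i + 1) % len(self.slots)    # linear probing
            self.slots[i] = key
            return probes                        # work done for this insert

    t = FixedTable(10_000)
    for fill in (0.5, 0.9, 0.99):
        occupied = sum(1 for s in t.slots if s is not None)
        batch = [t.insert(random.random()) for _ in range(int(fill * 10_000) - occupied)]
        print(f"filled to {fill:.0%}: avg probes in this batch = {sum(batch) / len(batch):.1f}")

The last batch is where this classic scheme hurts, and that near-full regime is what the new result is about.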
by default-kramer on 2/10/25, 8:47 PM
> The team’s results may not lead to any immediate applications
I don't understand why it wouldn't lead to immediate applications. Is this a situation where analysis of real-world use cases allows you to tune your hash implementation better than what a purely mathematical approach would get you?
by dooglius on 2/10/25, 9:29 PM
by throwme_123 on 2/11/25, 12:17 AM
by matsemann on 2/11/25, 5:18 PM
Not sure if it's viewable somewhere. But the conference itself was so fun. https://sites.google.com/view/fun2018/home
I'm not an academic and got my company to sponsor a trip to this Italian island to relax on the beach and watch fun talks, heh.
by ThinkBeat on 2/11/25, 1:52 AM
by _1tan on 2/11/25, 6:19 AM
by cb321 on 2/11/25, 1:09 PM
by sternma on 2/11/25, 8:25 AM
by foota on 2/11/25, 5:19 AM
by nexawave-ai on 2/11/25, 11:31 AM
by elcritch on 2/11/25, 8:01 AM
It's likely a DHT would greatly benefit from this sort of algorithmic reduction in time and be less susceptible to constant factor overheads (if there are any).
by froh on 2/11/25, 7:07 AM
by Canigou on 2/12/25, 4:44 PM
Can someone explain to me how this isn't some kind of Dewey Decimal Classification (https://en.wikipedia.org/wiki/Dewey_Decimal_Classification) ?
by duskwuff on 2/10/25, 8:01 PM
by shaganer on 2/13/25, 3:00 PM
by varjag on 2/10/25, 7:46 PM
by seinecle on 2/11/25, 6:22 AM
by isaacfrond on 2/11/25, 9:26 AM
Curiously, Andrew Krapivin, the genius undergrad in the article, is not one of the authors.
by reportgunner on 2/11/25, 12:29 PM
by bnly on 2/12/25, 6:11 PM
Step two: Try to solve hard problems
Step three: Avoid reading too much of other people's work in the area
Step four: (Maybe) Invent a brilliant new solution
But really, really don't skip step one.
by jjallen on 2/10/25, 8:22 PM
by lupire on 2/10/25, 11:50 PM
That's why the conjecture resists proof -- there is a counterexample that people aren't seeing.
by DeathArrow on 2/11/25, 8:45 AM
by pizza on 2/11/25, 8:02 PM
by aqueueaqueue on 2/11/25, 2:09 AM
by hoseja on 2/11/25, 8:15 AM
by victor106 on 2/11/25, 11:52 AM
Why not?
by qntty on 2/10/25, 8:12 PM
by EternalFury on 2/11/25, 1:38 AM
by amazingamazing on 2/10/25, 8:36 PM
Edit: GPT-4, Gemini 2 and Claude had no luck. Human-driven computer science is still safe.
by hemant1041 on 2/11/25, 1:28 PM
by nickhodge on 2/10/25, 10:39 PM
"Yeah, sorry. You didn't use the right Hash Table"
by ziofill on 2/11/25, 1:11 AM
by percentcer on 2/10/25, 11:34 PM
by MR4D on 2/10/25, 8:44 PM
It's as though the conclusion defies common sense, yet it is provable. [1]
[0] - https://priceonomics.com/the-time-everyone-corrected-the-wor...
[1] - 2nd to the last paragraph: "The fact that you can achieve a constant average query time, regardless of the hash table’s fullness, was wholly unexpected — even to the authors themselves."
by jheriko on 2/11/25, 5:33 AM
hash tables are constant time on average for all insertion, lookup and deletion operations, and in some special cases, which i've seen used in practice very, very often, they have very small constant run-time just like a fixed-size array (exactly equivalent, in fact).
this came up in an interview question i had in 2009 where i got judged poorly for deriding the structure as "not something i've often needed", and i've seen it in much older code.
i'm guessing maybe there are constraints at play here, like having to support unbounded growth, and some generic use case that i've not encountered in the wild...?
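One reading of that special case (a guess at what's meant, not necessarily the structure from that interview): when keys are small integers from a known range, the hash is the identity and the table really is just a fixed-size array.

    class DirectAddressTable:
        """Keys are ints in [0, capacity): 'hashing' degenerates to plain array indexing."""

        def __init__(self, capacity):
            self.slots = [None] * capacity

        def insert(self, key, value):
            self.slots[key] = value    # exactly one slot per key, no probing, no collisions

        def lookup(self, key):
            return self.slots[key]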
by ascorbic on 2/10/25, 8:45 PM
by ryao on 2/10/25, 9:22 PM
by travisgriggs on 2/10/25, 11:24 PM
by pmags on 2/11/25, 12:39 AM
<rhetorical> Hmm....I wonder how such research gets funded?... </rhetorical>
by jimnotgym on 2/10/25, 10:28 PM
by ChrisMarshallNY on 2/10/25, 10:40 PM
"And I would have gotten away with it, if it hadn't been for those meddling kids!"
by zombiwoof on 2/10/25, 10:16 PM
by sam0x17 on 2/10/25, 7:52 PM
by bruce343434 on 2/11/25, 11:26 AM
by kittikitti on 2/11/25, 1:11 AM