from Hacker News

Two Bites of Data Science in K

by crux on 1/26/25, 6:29 PM with 10 comments

  • by RodgerTheGreat on 1/26/25, 11:58 PM

    I had a go at rewriting the latter half in Lil, flexing its K/Q heritage:

        t:readcsv[read["ICC Test Bowl 3003.csv"] "ssiiiiisffiiis"]
    
        sorted: select orderby Wkts desc orderby Ave asc where Wkts from t
        best: select where !gindex by Wkts from sorted
    
        bestInClass: select where each v i in Ave v~min (i+1) take Ave end from best
    
        allWkts: sorted.Wkts
        mostCompetitive: extract where (gindex=0)&15<count gindex by value from allWkts
        mostCompetitiveBowlers: select where Wkts in mostCompetitive from best
    
        gap: min allWkts drop 1+range max allWkts
    
    "bestInClass" is probably the most awkward adaptation; I didn't see a tidy way to make a prefix list like ",\".
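
For readers who don't know Lil, here is a rough Python sketch of what the "bestInClass" filter appears to compute: keep a row only if its average equals the running minimum of the averages up to and including that row. The rows and the function name `best_in_class` are made up for illustration and are not taken from the CSV in the post.

```python
from itertools import accumulate

# Hypothetical (wickets, average) rows, sorted by wickets descending,
# one row per wicket count. The numbers are invented for illustration.
best = [(50, 30.0), (40, 35.0), (30, 25.0), (20, 28.0), (10, 24.0)]

def best_in_class(rows):
    """Keep rows whose average equals the minimum average seen so far."""
    averages = [ave for _, ave in rows]
    # accumulate(..., min) yields the minimum of each prefix of averages.
    running_min = accumulate(averages, min)
    return [row for row, low in zip(rows, running_min) if row[1] == low]

result = best_in_class(best)
print(result)
```

With rows sorted by wickets descending, a surviving row is one whose average is not beaten by any row with more wickets.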

  • by gitonthescene on 1/27/25, 2:04 AM

    I had a similar-ish project a while ago. I enjoy doing the "Spelling Bee" game in The NY Times Games section. In the comments someone worried that there weren't enough arrangements to keep the game going very long. I used an open-source dictionary to generate all possible puzzles, restricted by some basic heuristics like never using the letter S, having the total number of possible words in some reasonable range, etc. I found about 23,000 possible puzzles. My next idea was to use Google's n-gram statistics to add some sort of "commonly known" heuristic, but my energy for the project petered out.

    In any event these languages are great for exploring data in projects like these.
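
The puzzle enumeration described above might look something like this minimal Python sketch. It is not the commenter's code. It assumes the usual Spelling Bee rules (seven distinct letters, one required center letter, words of at least four letters, at least one word using all seven), and the word list, the function name `enumerate_puzzles`, and the word-count thresholds are all hypothetical.

```python
def enumerate_puzzles(words, min_words, max_words):
    """Return (letters, center, solutions) for each puzzle passing the filters."""
    # Only words of 4+ letters without "s" can appear in a puzzle.
    usable = [w for w in words if len(w) >= 4 and "s" not in w]
    # Candidate letter sets come from words with exactly 7 distinct letters.
    letter_sets = {frozenset(w) for w in usable if len(set(w)) == 7}
    puzzles = []
    for letters in sorted(letter_sets, key=sorted):
        fitting = [w for w in usable if set(w) <= letters]
        for center in sorted(letters):
            solutions = [w for w in fitting if center in w]
            # Heuristic from the comment: keep the word count in a sane range.
            if min_words <= len(solutions) <= max_words:
                puzzles.append(("".join(sorted(letters)), center, solutions))
    return puzzles

# Tiny hypothetical word list standing in for a real dictionary.
WORDS = ["drawing", "grid", "wing", "ward", "rain", "drain", "wind",
         "win", "winds", "table"]

# Thresholds chosen only so this toy list produces a visible filter effect.
puzzles = enumerate_puzzles(WORDS, min_words=4, max_words=5)
for letters, center, solutions in puzzles:
    print(letters, center, len(solutions))
```

A real run would swap in a full dictionary and wider thresholds; the roughly 23,000 figure in the comment depends on the commenter's own dictionary and heuristics, which this sketch does not reproduce.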

  • by g939763 on 1/26/25, 10:49 PM

    OP, which version of K are you using in the post? Asking for those who'd like to follow along.