from Hacker News

Base65536 encoding

by rahiel on 6/2/17, 10:13 AM with 120 comments

  • by ChuckMcM on 6/2/17, 9:01 PM

    A 256-byte packet and a 192-bit authentication hash: why use fast-flux DNS to run C&C on your botnet when you can just make the bots Twitter followers?

    EDIT: In case that isn't clear: imagine you have a botnet, and each member creates a Twitter account. All of the botnet's Twitter accounts follow the 'master', who can tweet a command (and corresponding authentication key) to the botnet to say "follow chuck and do up to n things for him, here is his public key". Now Chuck suddenly has all these followers, and when the time is right he tweets out his command, "ddos my greatest enemy", and adds his 'proof'. Off they go and blast his enemy. If he was only allotted one command, they then all un-follow him.

    Basically it's social media for botnets.

  • by pfraze on 6/2/17, 8:36 PM

    Yeah but what you really want is base-emoji. https://github.com/pfrazee/base-emoji
  • by anderskaseorg on 6/2/17, 9:30 PM

    See also:

    https://blogs.oracle.com/ksplice/the-1st-international-longe...

    Twitter characters can actually store up to nearly 31 bits each, if you’re using the JSON API. (Or at least, this was true in 2010. I don’t know whether this is still true.)

    http://blog.kevinalbs.com/base122

    https://news.ycombinator.com/item?id=13049329

    Base-122 encoding is 87.5% efficient in UTF-8, better than anything listed in the base65536 repository’s comparison table.
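
The 87.5% figure can be checked with a little arithmetic. This is a minimal sketch, assuming "efficiency" means payload bits divided by encoded bits, and that Base-122 carries 7 payload bits per one-byte UTF-8 character. The `efficiency` helper is a made-up name for illustration.

```python
from fractions import Fraction

def efficiency(payload_bits: int, encoded_bits: int) -> Fraction:
    """Payload bits carried per bit of encoded output."""
    return Fraction(payload_bits, encoded_bits)

# Assumption: Base-122 carries 7 payload bits in each 8-bit UTF-8 byte.
base122_utf8 = efficiency(7, 8)
# For comparison: Base64 carries 6 payload bits in each 8-bit UTF-8 byte.
base64_utf8 = efficiency(6, 8)

print(float(base122_utf8))  # 0.875
print(float(base64_utf8))   # 0.75
```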

  • by matt_wulfeck on 6/2/17, 11:48 PM

    I think they missed a great opportunity to call it "base64k" encoding.
  • by girst on 6/2/17, 10:15 PM

    I'm the one who made the C / UNIX Shell implementation - it was a fun and quick thing to make.

    https://github.com/girst/base65536

    I'd appreciate some feedback.

  • by Asdfbla on 6/2/17, 9:21 PM

    I don't seem to get the efficiency table (or how efficiency is defined here?). Since Base65536 encodes 16 bits per character, why can't it reach 100% efficiency in UTF-16? It says the efficiency is 64% instead.

    I'm sure it's true, just curious why.
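
One likely reason, sketched under the assumption that every Base65536 character carries 16 payload bits: UTF-16 stores code points above U+FFFF as a surrogate pair, so those characters cost 32 bits rather than 16. The two sample characters are copied from a replay string posted in this thread; the helper name `utf16_bits` is made up for illustration. This sketch does not reproduce the table's 64% figure, which would depend on how many characters of each kind the alphabet uses.

```python
def utf16_bits(ch: str) -> int:
    """Number of bits a single character occupies in UTF-16."""
    return len(ch.encode("utf-16-le")) * 8

bmp_char = "耈"     # inside the Basic Multilingual Plane: one 16-bit unit
astral_char = "𤄻"  # outside it: a surrogate pair, two 16-bit units

print(utf16_bits(bmp_char))     # 16
print(utf16_bits(astral_char))  # 32

# Assumption: 16 payload bits per character.
print(16 / utf16_bits(bmp_char))     # 1.0
print(16 / utf16_bits(astral_char))  # 0.5
```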

  • by bcoates on 6/2/17, 8:29 PM

    At first I thought this was going to be a joke, then I thought it was going to be stupid, but it's actually brilliant.
  • by carry_bit on 6/2/17, 8:26 PM

    You could expand the encoding further if you didn't restrict yourself to a whole number of bits per character.
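
A minimal sketch of that idea, not taken from any of the projects linked here: treat the whole input as one big integer and write it out in an arbitrary base, so the alphabet size need not be a power of two. The function names, the sentinel-byte trick, and the alphabet size of 100000 are all assumptions made for illustration. Mapping each digit to an actual safe Unicode character is left out.

```python
import math

def encode(data: bytes, alphabet_size: int) -> list[int]:
    """Convert bytes to digits in the given base, most significant first."""
    # A leading 0x01 sentinel byte keeps leading zero bytes from being lost.
    n = int.from_bytes(b"\x01" + data, "big")
    digits = []
    while n:
        n, d = divmod(n, alphabet_size)
        digits.append(d)
    return digits[::-1]

def decode(digits: list[int], alphabet_size: int) -> bytes:
    """Inverse of encode: rebuild the integer, then drop the sentinel."""
    n = 0
    for d in digits:
        n = n * alphabet_size + d
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:]

ALPHABET_SIZE = 100_000  # hypothetical, not a power of two
payload = b"\x00\x00replay data"
digits = encode(payload, ALPHABET_SIZE)
assert decode(digits, ALPHABET_SIZE) == payload

# Each digit carries log2(ALPHABET_SIZE) bits, a non-integer above 16.
print(math.log2(ALPHABET_SIZE) > 16)  # True
```

The cost of this approach is that encoding is no longer a simple per-chunk table lookup: the big-integer division touches the whole message.
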
  • by fiatjaf on 6/3/17, 12:22 AM

    I hate this game.

    Manage to make 1 point at 𤄻𣺻𣼋耈𣺻興𣼫兊𠨋𢪄𡚻𡢁𢙌𢚻𠛀𣪻栌𤄋𤯄𤆻𤆠𠞠𤪇𤆻𠙀𤅴𤆧𣪤𡚻𥪹炌𤆀㶸聙𡊰𠨌𡪻𤇅𤆀薠嫊䂔𔔌𥩋㲼耈𠊁繈倘𤨸𣾔㼬𤚱𢩋𣿋𡉌膹敃ꎹ𡩋肐𠝒𠚬醸聛㰩

    https://qntm.org/files/hatetris/hatetris.html

  • by jxy on 6/2/17, 8:58 PM

    Since when did people start to label a C implementation as "Unix shell"?
  • by prophesi on 6/3/17, 12:31 AM

    So, anyone got any good HATETRIS replays? I'll edit my post if I find myself getting a good one. https://qntm.org/files/hatetris/hatetris.html

    Edit: If you didn't look at the repo, this encoding was made to post HATETRIS replays on Twitter.

    Edit: Only 3 points so far 𤆂𤆻𡚻𤆥㲺着遈𥮸㼉𤄛皲𤆻孈𤇆𡊾缎𓍌𤂻职𢪻郇膻𤅋𠅌傺𢊰䡪𤇄𤪤𡪻ꋇ𥆸𤶹膺𢡋聜𠆬𤪄膹𠬋㿄𠘬臀㾤冹𣾻𡈰𠭀䂹𤄔㼌𤚐𤢰𢢻𤇀𤞁䂺㬅𢉋𤮹㼆𣛄𡫀𤚒㡋𤢀ᖠ

  • by eponeponepon on 6/2/17, 9:02 PM

    Crikey, I thought this was going to be a joke project, but it isn't. Is it..?

    Either way, it's a neat piece of thinking.

  • by dheera on 6/2/17, 11:22 PM

    If you really want to put binary data on Twitter, why not encode it in an image? You could probably get several tens of kilobytes of binary data reliably encoded in a JPEG of the maximum size Twitter allows.
  • by supernintendo on 6/3/17, 1:31 AM

    Neat. I see a lot of mention of Twitter but the first thing I thought of was packet compression. A ~50 byte packet shaves off around 20 bytes with this. Those are good savings although I haven't looked into the encoder / decoder enough to know if it's worth the tradeoff of having to translate every packet on both ends. I can also see UDP datagrams being a pain in the ass to work with when you're throwing around streams of Unicode characters.

    Overall though, I like it and look forward to Base131072 being possible!

  • by shemnon42 on 6/2/17, 8:58 PM

    All those stats and one lingering question: what's the Weissman Score?
  • by gvx on 6/3/17, 6:30 AM

    Last year, I did a similar project: https://github.com/gvx/base116676

    It had a feature where it would automatically try a couple of compression algorithms on the text, to cram even more into a single tweet.

    I don't think it has a practical use, but it was fun to make.

  • by stcredzero on 6/2/17, 9:12 PM

    Base 32768 has a very sexy 93.75% efficiency! Maybe I should use that with my browser game?
  • by comboy on 6/2/17, 8:19 PM

    Do not try playing this game. You're welcome.
  • by marcosdumay on 6/2/17, 9:38 PM

    Looks like we should create some other 20k emoji.
  • by pbhjpbhj on 6/2/17, 9:02 PM

    "See a need, fill a need" (Bigweld).
  • by dzuc on 6/2/17, 8:23 PM

    Enantiomorphic tetris!
  • by terminado on 6/2/17, 11:43 PM

    What, no Java?
  • by jheriko on 6/3/17, 2:29 AM

    a sad sign of our times... what nonsense.
  • by bitwize on 6/2/17, 9:25 PM

    What surprises me is that this encoding was developed to allow people to share replays of an illegal, and very pathological, Tetris variant. Hackers gonna hack.