from Hacker News

What Your Email Address Reveals About You: LLMs and Digital Footprints

by peab on 2/19/25, 4:01 PM with 16 comments

  • by nico on 2/22/25, 5:23 PM

    This looks just like those personality tests or "what kind of fruit are you" quizzes, etc.

    At best it’s just SEO spam, at worst it’s collecting people’s emails for direct spam.

  • by cmdtab on 2/22/25, 5:38 PM

    Nice way to collect emails for marketing spam. The post seems AI-generated as well.

  • by ryantj54 on 2/19/25, 6:02 PM

    I feel like this could be taken as a meta-commentary about how easy it is to put someone in a box based on one or two facts about them, and the best generalizing function we have to date is able to do that very well... no surprise that my personal email "ropebunny69@gmail.com" reveals my playful love of rock climbing.

  • by ideashower on 2/22/25, 5:39 PM

    Having a text box to enter an email address without saying anywhere what you'll do with that information or whether you will retain it in any form is a big red flag tbh...

  • by simonw on 2/22/25, 5:45 PM

    That story says:

    > Estimates for GPT4, for example, give training data sizes of up to 1 petabyte of data.

    I followed the provided link, which led to an ad-laden https://seifeur.com/chat-gpt-4-data-size/ article which looks suspiciously like AI-generated slop. It ends with this set of Q&As which make no sense at all:

    > How much data was used to train ChatGPT-4?

    > ChatGPT-4 was trained on a dataset size of 570 GB.

    > How does the size of GPT-4 compare to GPT-3 in terms of training data?

    > GPT-4 has 45 gigabytes of training data, which is significantly larger than GPT-3’s 17 gigabytes.

    > How many terabytes of text data does GPT-4 utilize compared to GPT-3?

    > GPT-4 utilizes a dataset of 1 petabyte, which is notably larger than GPT-3’s 45 terabytes.

  • by gostsamo on 2/22/25, 5:29 PM

    I gave it two of my secondary emails. In both cases it decided that I live in an English-speaking country, missing obvious hints that I'm actually not a native English speaker. The rest of each email address was on the nose, so it managed to guess those parts.

    Not really impressed, tbh, but still fun.

  • by platelminto on 2/22/25, 5:20 PM

    Surprisingly, it didn't infer anything from my protonmail email address.

  • by eek2121 on 2/20/25, 6:45 AM

    Their tool got it mostly right for me.

  • by Tepix on 2/22/25, 5:33 PM

    Pretty lame result for me.