from Hacker News

Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp conversations

by advaith08 on 9/9/23, 5:43 PM with 51 comments

Hello HN!

I've been thinking about the idea of an LLM that's a clone of me: instead of generating replies to be a helpful assistant, it generates replies that are exactly like mine. The concept's appeared in fiction numerous times (the talking paintings in Harry Potter that mimic the person painted, the clones in The Prestige), and I think with LLMs, there might actually be a possibility of us doing something like this!

I've just released a fork of facebookresearch/llama-recipes that lets you fine-tune a Llama model on your personal WhatsApp conversations. This adaptation can train the model (using QLoRA) to respond in a way that's eerily similar to your own texting style.
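For illustration only, here is a minimal sketch of how an exported chat might be turned into (context, reply) training pairs. It is not the repository's actual preprocessing. The Android-style line format ("9/9/23, 5:43 PM - Name: text"), the helper names `parse_export` and `build_pairs`, and the 5-message context window are all assumptions; export formats differ by platform and locale, so the regex would likely need adjusting.

```python
import re
from dataclasses import dataclass

# Assumed timestamp prefix of an Android-style export line,
# e.g. "9/9/23, 5:43 PM - Alice: hello". Other locales differ.
STAMP_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}(?:\s?[AP]M)? - ")


@dataclass
class Message:
    sender: str
    text: str


def parse_export(raw: str) -> list[Message]:
    """Parse exported chat text into messages (hypothetical helper)."""
    messages: list[Message] = []
    current = None  # message that continuation lines get appended to
    for line in raw.splitlines():
        stamp = STAMP_RE.match(line)
        if stamp:
            sender, sep, text = line[stamp.end():].partition(": ")
            if sep:
                current = Message(sender, text)
                messages.append(current)
            else:
                # Timestamped line with no "sender: " part, treated
                # as a system notice and skipped.
                current = None
        elif current is not None:
            # No timestamp: continuation of a multi-line message.
            current.text += "\n" + line
    return messages


def build_pairs(messages: list[Message], me: str, context_size: int = 5):
    """Pair each of my messages with the messages that preceded it."""
    pairs = []
    for i, msg in enumerate(messages):
        if msg.sender != me or i == 0:
            continue  # only my replies that have some context
        context = messages[max(0, i - context_size):i]
        prompt = "\n".join(f"{m.sender}: {m.text}" for m in context)
        pairs.append((prompt, msg.text))
    return pairs
```

Calling `build_pairs(parse_export(raw_text), me="Your Name")` would give a list of prompt/reply tuples that could then be formatted for whatever fine-tuning script you use.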

What I've figured out so far:

Quick Learning: The model quickly adapts to personal nuances, emoji usage, and phrases that you use. I've trained just 1 epoch on a P100 GPU using QLoRA and 4-bit quantization, and it's already captured my mannerisms.

Turing Tests: As an experiment, I had my friends ask me 3 questions each, and I replied to every question with 2 candidate responses (one from me and one from Llama). My friends then had to guess which candidate response was mine and which one was Llama's. Llama managed to fool 10% of my friends, but with more compute, I think it can do way better.

Here's the GitHub repository: https://github.com/Ads-cmu/WhatsApp-Llama/

Would love to hear feedback, suggestions, and any cool experiences if you decide to give it a try! I'd love to see how far we can push this by training bigger models for more epochs (I ran out of compute credits).

by brap on 9/9/23, 10:28 PM
I wonder if my clone will also respond 3 days later as if nothing happened
by dools on 9/9/23, 8:15 PM
> The concept's appeared in fiction numerous times (the talking paintings in Harry Potter that mimic the person painted, the clones in The Prestige)
How is your most notable example not when Gilfoyle does exactly this so he doesn’t have to talk to Dinesh in Silicon Valley??
by olvy0 on 9/9/23, 9:23 PM
I was immediately reminded of this black mirror episode:
https://en.m.wikipedia.org/wiki/Be_Right_Back
by cypress66 on 9/9/23, 9:03 PM
Llama 7B is quite dumb. Using the 13B you'd get significantly better results, and you can train a QLoRA on a single 3090 (I think even less is possible but not sure).
by tloriato on 9/9/23, 7:37 PM
You said that the model fooled your friends 10% of the time.
I wonder how well ChatGPT or Llama 2 would do given just the last 5 messages of each and asked to generate the next reply pretending to be you…
Somehow I don’t think it would be worse?
by oDot on 9/9/23, 8:03 PM
We are very close to where AI tech can replicate Harry Potter portraits
by porridgeraisin on 9/9/23, 7:37 PM
Nice. I remember thinking of doing something like this when I was much much more of a novice. I wrote a WhatsApp message parser and thought of doing this with the parsed messages. Unfortunately I knew too little back then, and Llama didn't exist either. Cool to see it!
by alt-glitch on 9/9/23, 8:24 PM
Super cool! I had a similar idea where I wanted to create such clones of some of my friends (with consent ofc) and see how well they know me. To extend your clone even more, you can also throw in every piece of digital text you have into this, eg. emails, notes, essays, blogs etc. I'm super down to work on LLM clones like these!
edit: I actually started a little work on this. If you wanna export more messages than the limited 40k, you can use [0]. I did and I have every text I've ever sent since I had WhatsApp.
[0]: https://github.com/YuvrajRaghuvanshiS/WhatsApp-Key-Database-...
by rosslazer on 9/9/23, 8:20 PM
Nice! I did something similar with GPT 3.5 and slack https://rosslazer.com/posts/fine-tuning/
by andai on 9/9/23, 8:40 PM
A few years ago I did the same thing with GPT-2 on my friend's and my WhatsApp conversation history.
So it would simulate conversations between us.
The result was hilarious yet at times uncomfortably accurate... like looking into a mirror...
by codetrotter on 9/9/23, 10:39 PM
> the talking paintings in Harry Potter that mimic the person painted
I remember that the moving photos in the newspaper mimic the person.
But I thought the talking paintings were ghosts living in the paintings or something.
by jzemeocala on 9/9/23, 8:31 PM
Awesome work, I've had the idea for a while of setting up a pipeline like this that could take input from all available sources of the person to clone their voice and image as well as dialogue.
The intent being to create digital avatars of lost loved ones to help people with the grieving process.
I know that there would be tremendous opportunity in such tech for malicious actors to do serious harm, but the stated goal is still a worthwhile endeavor.
by f0e4c2f7 on 9/9/23, 9:12 PM
Cool idea. One more fictional example:
https://www.youtube.com/watch?v=IWIusSdn1e4
by jmkni on 9/9/23, 7:32 PM
This is cool, although I’m guessing you need to input your conversation history manually? Or is there a way to export it from WhatsApp?
by lacrimacida on 9/9/23, 11:07 PM
Good work. But how is this useful other than for deception and trickery, besides the fun aspect of it all? Maybe I'm lacking imagination, and perhaps this type of progress in mimicking human interaction will actually push more and more people back to the IRL world of person-to-person communication.
by gojomo on 9/9/23, 8:15 PM
Good idea!
I expect there will be profitable businesses based on training LLMs to simulate eminent people & celebrities – on both their public utterances and their private correspondence – then charging for access to the best models.
by slmkbh on 9/10/23, 11:39 AM
Black Mirror was not a manual for the future... Just watch S02E01 from 2013(!). I know the LLMs are not quite there yet, but still.
by andai on 9/10/23, 12:35 AM
I'd love to try this but my GPU is potato.
Does anyone know a convenient way to access the kind of GPUs required for this?
Should I just pay for Google Colab?
by andai on 9/10/23, 12:36 AM
Any plans for a Llama 2 version? (Wondering how much difference it makes at such small model sizes.)
by SubiculumCode on 9/9/23, 9:08 PM
So discord, google and fb chats can pretty much do this too...should have been obvious by now.
by RockstarSprain on 9/9/23, 8:50 PM
Very interesting! I’m wondering if anyone attempted something similar in Telegram though.
by BasedAnon on 9/9/23, 9:16 PM
i am screaming in horror on the inside