Show HN: MyGPT a toy LLM which can be trained on Project Gutenberg and dad jokes

by disconnection on 9/26/23, 2:51 PM with 4 comments

My puny version of ChatGPT.

This was based on the excellent LLM lecture series by Andrej Karpathy: https://www.youtube.com/watch?v=kCc8FmEb1nY

The main points of differentiation are that my version is token-based (using tiktoken), with code to load multiple text files as a training set. Plus, it has a minimal server which is a drop-in replacement for the OpenAI REST API.
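
The corpus-loading step is roughly this shape (a sketch, not the exact code from the repo; the directory name and encoding choice are illustrative):

    import tiktoken
    from pathlib import Path

    enc = tiktoken.get_encoding("gpt2")  # BPE tokenizer; the actual encoding may differ

    # Concatenate every text file in the corpus into one flat token stream.
    tokens = []
    for path in sorted(Path("corpus").glob("*.txt")):  # hypothetical data directory
        tokens.extend(enc.encode(path.read_text(encoding="utf-8")))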

So you can train the default tiny 15M-parameter model and use it in your projects instead of ChatGPT.
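
For example, you can point the old-style (pre-1.0) openai Python client at the local server; something like this (the port, model id, and endpoint here are illustrative, not necessarily what the repo exposes):

    import openai

    # Redirect the stock OpenAI client to the local MyGPT server.
    openai.api_base = "http://localhost:8000/v1"  # assumed host/port
    openai.api_key = "unused"  # the toy server presumably ignores auth

    resp = openai.Completion.create(
        model="mygpt-15m",  # hypothetical model id
        prompt="Q: What do you call a fish with no eyes?\nA:",
        max_tokens=30,
    )
    print(resp.choices[0].text)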

I trained it on 20 MB of Project Gutenberg encyclopaedias, then fine-tuned it on 120 dad jokes to get a Q:/A: prompt format.
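
So the fine-tuning data is just plain text in this shape (an illustrative example, not a joke from the actual set):

    Q: Why did the scarecrow win an award?
    A: Because he was outstanding in his field.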

This model + training set is so small that the results are basically a joke; it's for entertainment purposes only. The code is also very rough, and the server implements only the minimum functionality.

I embodied this model in my talking LLM-driven hexapod robot, and it could give very silly answers to spoken questions.

  • by ferfumarma on 9/26/23, 6:04 PM

    Can we see some examples of the jokes it produces?
  • by getwiththeprog on 9/27/23, 2:16 AM

    This is a great idea. I want to make a 'pet' for my kid. I can't get them a real dog, so why not a tinyLLM?

    Training on Gutenberg data is a great idea. What I would do is train it on all the e-books I have that are suitable for kids (I managed to find quite a lot online).

    The dad jokes idea is great, please keep doing things along this line.