by prettyblocks on 1/21/25, 7:39 PM with 345 comments
by kaspermarstal on 1/22/25, 7:48 AM
The nice thing is that she can copy/paste the titles and abstracts into two columns and write e.g. "=PROMPT(A1:B1, "If the paper studies diabetic neuropathy and stroke, return 'Include', otherwise return 'Exclude'")" and then drag the formula down across 7,000 rows to bulk process the data on her own, because it's just Excel. There is a GIF in the README of the GitHub repo that shows it.
by antonok on 1/21/25, 11:57 PM
Most cookie notices turn out to be pretty similar, HTML/CSS-wise, and then you can grab their `innerText` and filter out false positives with a small LLM. I've found the 3B models have decent performance on this task, given enough prompt engineering. They do fall apart slightly around edge cases like less common languages or combined cookie notice + age restriction banners. 7B has a negligible false-positive rate without much extra cost. Either way these things are really fast and it's amazing to see reports streaming in during a crawl with no human effort required.
Code is at https://github.com/brave/cookiemonster. You can see the prompt at https://github.com/brave/cookiemonster/blob/main/src/text-cl....
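For anyone curious what the classification step looks like mechanically, here is a minimal sketch, not Brave's actual pipeline: it assumes a local Ollama install with the ollama Python package, and the prompt is made up (the real one is linked above).

    import ollama

    PROMPT = (
        "Reply YES if the following page text is a cookie consent notice, "
        "otherwise reply NO.\n\n"
    )

    def is_cookie_notice(inner_text: str) -> bool:
        # 3B-class models are fast enough to run on every candidate element.
        response = ollama.chat(
            model="llama3.2:3b",
            messages=[{"role": "user", "content": PROMPT + inner_text}],
        )
        return response["message"]["content"].strip().upper().startswith("YES")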
by Evidlo on 1/22/25, 12:00 AM
by behohippy on 1/21/25, 8:57 PM
by nozzlegear on 1/21/25, 11:30 PM
Here's the script: https://github.com/nozzlegear/dotfiles/blob/master/fish-func...
And for this change [1] it generated these messages:
1. `fix: change from printf to echo for handling git diff input`
2. `refactor: update codeblock syntax in commit message generator`
3. `style: improve readability by adjusting prompt formatting`
[1] https://github.com/nozzlegear/dotfiles/commit/0db65054524d0d...
by sidravi1 on 1/22/25, 3:04 AM
https://idinsight.github.io/tech-blog/blog/enhancing_materna...
by flippyhead on 1/21/25, 10:12 PM
by simonjgreen on 1/21/25, 10:33 PM
Recently deployed in Home Assistant's fully local Alexa replacement. https://www.home-assistant.io/voice_control/about_wake_word/
by RhysU on 1/21/25, 8:37 PM
https://m.youtube.com/watch?v=M2o4f_2L0No
Spend the 45 minutes watching this talk. It is a delight. If you are unsure, wait until the speaker picks up the guitar.
by azhenley on 1/21/25, 8:50 PM
by computers3333 on 1/22/25, 8:24 AM
It's a lightweight tool that summarizes Hacker News articles. For example, here’s what it outputs for this very post, "Ask HN: Is anyone doing anything cool with tiny language models?":
"A user inquires about the use of tiny language models for interesting applications, such as spam filtering and cookie notice detection. A developer shares their experience with using Ollama to respond to SMS spam with unique personas, like a millennial gymbro or a 19th-century British gentleman. Another user highlights the effectiveness of 3B and 7B language models for cookie notice detection, with decent performance achieved through prompt engineering."
I originally used LLaMA 3:Instruct for the backend, which performs much better, but recently started experimenting with the smaller LLaMA 3.2:1B model.
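The core call is simple enough to sketch; this is an approximation rather than gophersignal's actual code (prompt wording is mine, and it assumes a local Ollama server):

    import ollama

    def summarize(article_text: str) -> str:
        # With a 1B model, short and concrete instructions work best.
        response = ollama.generate(
            model="llama3.2:1b",
            prompt="Summarize this article in 2-3 sentences:\n\n" + article_text,
        )
        return response["response"]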
It’s been cool seeing other people’s ideas too. Curious—does anyone have suggestions for small models that are good for summaries?
Feel free to check it out or make changes: https://github.com/k-zehnder/gophersignal
by deet on 1/21/25, 9:28 PM
The local models do things ranging from cleaning up OCR, to summarizing meetings, to estimating the user's current goals and activity, to predicting search terms, to predicting queries and actions that, if run, would help the user accomplish their current task.
The capabilities of these tiny models have really surged recently. Even small vision models are becoming useful, especially if fine tuned.
by mettamage on 1/21/25, 8:17 PM
Maybe should write a plugin for it (open source):
1. Put all your work-related questions into the plugin; an LLM will turn each one into an abstracted question for you to preview and send
2. Then it gets the answer back with all your original data mapped in
E.g. df["cookie_company_name"] becomes df["a"] and back
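A minimal sketch of that masking idea (all names here are hypothetical, and a real plugin would need to scrub cell values too, not just column names):

    import pandas as pd

    def mask_columns(df: pd.DataFrame):
        # Map sensitive column names to opaque placeholders; keep the reverse map local.
        forward = {col: f"col{i}" for i, col in enumerate(df.columns)}
        reverse = {v: k for k, v in forward.items()}
        return df.rename(columns=forward), reverse

    def unmask_answer(answer: str, reverse: dict) -> str:
        # Substitute longest placeholders first so "col1" never clobbers "col12".
        for placeholder in sorted(reverse, key=len, reverse=True):
            answer = answer.replace(placeholder, reverse[placeholder])
        return answer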
by jwitthuhn on 1/22/25, 1:45 AM
I don't have a pre-trained model to share, but you can make one yourself from the git repo, assuming you have an Apple Silicon Mac.
by deivid on 1/21/25, 11:14 PM
Bergamot is already used inside Firefox, but I wanted translation outside the browser as well.
[0]: bergamot https://github.com/browsermt/bergamot-translator
by ata_aman on 1/21/25, 11:08 PM
It also does RAG on apps there, like the music player, contacts app and to-do app. I can ask it to recommend similar artists to listen to based on my music library for example or ask it to quiz me on my PDF papers.
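The music-library case could be approximated with local embeddings; a toy sketch, not the device's actual code (library contents and model choice are placeholders):

    import ollama

    # Hypothetical library; in practice this would come from the music player's database.
    LIBRARY = ["Radiohead - OK Computer", "Portishead - Dummy", "Massive Attack - Mezzanine"]

    def embed(text: str) -> list[float]:
        return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        return dot / ((sum(x * x for x in a) ** 0.5) * (sum(x * x for x in b) ** 0.5))

    def similar_artists(query: str, k: int = 2) -> list[str]:
        # Rank library entries by embedding similarity to the query, then
        # hand the top hits to the LLM as context for its recommendation.
        q = embed(query)
        return sorted(LIBRARY, key=lambda item: cosine(q, embed(item)), reverse=True)[:k]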
by bashbjorn on 1/22/25, 2:42 PM
At those sizes, it's great for generating non-repetitive flavortext for NPCs. No more "I took an arrow to the knee".
Models at around the 2B size aren't really capable enough to act as a competent adversary, but they are great for something like bargaining with a shopkeeper, or some other role where natural language lets players do a bit more immersive roleplay.
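For example, a sketch of the flavortext generation; the model and prompt here are illustrative, and any local runner would do:

    import ollama

    def npc_line(npc: str, mood: str) -> str:
        # Regenerate on every interaction so guards never repeat themselves.
        response = ollama.generate(
            model="gemma2:2b",
            prompt=(
                f"Write one short line of dialogue for {npc}, who is feeling {mood}. "
                "Medieval fantasy setting. No quotation marks."
            ),
            options={"temperature": 1.0},
        )
        return response["response"]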
by mritchie712 on 1/21/25, 9:51 PM
1. Create several different personas
2. Generate a ton of variation using a high temperature
3. Compare the variations head-to-head using the LLM to get a win/loss ratio
The best ones can be quite good.
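A rough sketch of step 3; the judging prompt and model are my assumptions, not the author's setup:

    import itertools
    import ollama

    def judge(a: str, b: str) -> str:
        # Ask the model to pick a winner; a one-letter reply keeps parsing trivial.
        response = ollama.chat(
            model="llama3.2:3b",
            messages=[{
                "role": "user",
                "content": f"Which draft is better, A or B? Reply with one letter.\n\nA: {a}\n\nB: {b}",
            }],
        )
        return response["message"]["content"].strip()[:1]

    def win_counts(variations: list[str]) -> dict[str, int]:
        # Round-robin every pair and tally wins to get a win/loss ratio.
        wins = {v: 0 for v in variations}
        for a, b in itertools.combinations(variations, 2):
            wins[a if judge(a, b) == "A" else b] += 1
        return wins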
by psyklic on 1/21/25, 8:05 PM
For context, GPT-2-small is 0.124B params (w/ 1024-token context).
by jbentley1 on 1/22/25, 2:08 PM
1. Getting the speed gains is hard unless you are able to pay for dedicated GPUs. Some services offer LoRA as serverless but you don't get the same performance for various technical reasons.
2. Lack of talent to actually do the finetuning. Regular engineers can do a lot of LLM implementation, but actually performing training is a scarcer skillset. Most small to medium orgs don't have people who can do it well.
3. Distribution. Sharing finetunes is hard. HuggingFace exists, but discoverability is an issue. It is flooded with random models with no documentation, and it isn't easy to find a good one for your task. Plus, with a good finetune you also need the prompt and possibly parsing code to make it work the way it is intended, and the bundling hasn't been worked out well.
by gpm on 1/22/25, 3:34 AM
    function trans
        llm "Translate \"$argv\" from French to English please"
    end
Llama 3.2:3b is a fine French-English dictionary IMHO.
by ignoramous on 1/21/25, 9:51 PM
A more difficult problem we foresee is turning it into a real-time (online) firewall (for calls, for example).
[1] https://chat.deepseek.com/a/chat/s/d5aeeda1-fefe-4fc6-8c90-2...
[1] MediaPipe in particular makes it simple to prototype around Gemma2 on Android: https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inf...
[2] Intend to open source it once we get it working for anything other than SMSes
by JLCarveth on 1/22/25, 1:22 AM
by eb0la on 1/21/25, 9:04 PM
by juancroldan on 1/22/25, 12:02 AM
by cwmoore on 1/21/25, 11:37 PM
by spiritplumber on 1/22/25, 12:10 AM
by iamnotagenius on 1/21/25, 8:42 PM
by mrmage on 1/23/25, 9:08 PM
Using it, I find myself often writing only the first half of most words, because the second part can usually already be guessed by the AI. In fact, it has a dedicated shortcut for accepting only the first word of the suggestion — that way, it can save you some typing even when later words deviate from your original intent.
Completions are generated in real-time locally on your Mac using a variety of models (primarily Qwen 2.5 1.5B).
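Not Cotypist's actual implementation, but the shape of it can be sketched with a raw completion call against a local model (parameters are guesses):

    import ollama

    def suggest(prefix: str) -> str:
        # Base-style continuation: feed the text so far and let the model extend it.
        # raw=True skips the chat template, since we want plain continuation.
        response = ollama.generate(
            model="qwen2.5:1.5b",
            prompt=prefix,
            raw=True,
            options={"num_predict": 12, "temperature": 0.2},
        )
        return response["response"]

    def accept_first_word(prefix: str, suggestion: str) -> str:
        # The dedicated shortcut: take only the first suggested word,
        # useful when later words deviate from what you meant to write.
        words = suggestion.split()
        return prefix + (words[0] + " " if words else "")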
It is currently in open beta: https://cotypist.app
by jmward01 on 1/22/25, 12:05 AM
by jothflee on 1/22/25, 12:31 AM
some of the situations get pretty wild, for the office :)
by lightning19 on 1/22/25, 5:23 PM
by ceritium on 1/22/25, 8:42 AM
It is probably super overengineered, considering that pretty good libraries already do that in different languages, but it would be funny. I did some tests with ChatGPT, and it worked sometimes. It would probably work with some fine-tuning, but I don't have the experience or the time right now.
by lormayna on 1/22/25, 10:35 AM
by arionhardison on 1/21/25, 8:48 PM
by sauravpanda on 1/22/25, 6:00 AM
With just three lines of code, you can run small language models inside the browser. We feel this unlocks a ton of potential for businesses, so that they can introduce AI without fear of cost and can personalize the experience using AI.
Would love your thoughts and what we can do more or better!
by thetrash on 1/21/25, 11:41 PM
by danbmil99 on 1/22/25, 12:06 AM
by linsomniac on 1/22/25, 2:58 AM
by kianN on 1/22/25, 12:32 AM
Its effective context window is pretty small, but I have a much more robust statistical model that handles thematic extraction. The LLM is essentially just rewriting ~5-10 sentences into a single paragraph.
I’ve found the less you need the language model to actually do, the less the size/quality of the model actually matters.
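That division of labor might look like this sketch (the statistical extractor is stubbed out, and the model and prompt are my assumptions):

    import ollama

    def rewrite_theme(theme: str, sentences: list[str]) -> str:
        # The model only paraphrases ~5-10 extracted sentences, so a small
        # context window is plenty.
        notes = "\n".join(f"- {s}" for s in sentences)
        response = ollama.generate(
            model="llama3.2:3b",
            prompt=f"Rewrite these notes about '{theme}' as one coherent paragraph:\n{notes}",
        )
        return response["response"]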
by A4ET8a8uTh0_v2 on 1/21/25, 9:18 PM
by addandsubtract on 1/22/25, 11:58 AM
by dh1011 on 1/22/25, 3:04 AM
by merwijas on 1/22/25, 6:03 AM
by reeeeee on 1/22/25, 10:06 AM
The bots don't do a lot of interesting stuff though, I plan to add the following functionalities:
- Instead of just resetting every 100 messages, I'm going to provide them with a rolling window of context.
- Instead of only allowing BASH commands, they will be able to also respond with reasoning messages, hopefully to make them a bit smarter.
- Give them a better docker container with more CLI tools such as curl and a working package manager.
If you're interested in seeing the developments, you can subscribe on the platform!
by krystofee on 1/22/25, 11:06 AM
Let's say I want some outcome: it will autonomously handle the process, prompting me and the other side for additional requirements if necessary, and then, based on that, reach the outcome?
by ahrjay on 1/22/25, 11:56 AM
by guywithahat on 1/22/25, 3:54 AM
I haven't benchmarked it yet, but I'd be happy to hear opinions on it. It's written in C++ (specifically not Python), and is designed to be a self-contained microservice based around llama.cpp.
by herol3oy on 1/22/25, 4:56 PM
by Thews on 1/22/25, 3:23 PM
I think the content you can get from the SLMs for fake data is a lot more engaging than, say, the Ruby ffaker library.
by codazoda on 1/22/25, 1:08 AM
I'm tired of the bad playlists I get from algorithms, so I made a specific playlist with Llama 2 based on several songs I like. I started with 50 songs, removed any I didn't like, and added more to fill in the spaces. The small models were pretty good at this. Now I have a decent fixed playlist. It does get "tired" after a few weeks and I need to add more to it. I've never been able to do this myself with more than a dozen songs.
by sharnabeel on 1/24/25, 9:39 AM
I really hope there would be some amazing models this year for translation.
by sebazzz on 1/22/25, 4:42 PM
I’m now just wondering if there is any way to build tests on the input+output of the LLM :D
by mogaal on 1/22/25, 12:14 PM
by kolinko on 1/22/25, 9:15 AM
by itskarad on 1/22/25, 1:34 AM
by accrual on 1/22/25, 3:47 PM
by HexDecOctBin on 1/22/25, 1:51 AM
I was thinking of hooking them into RPGs with text-based dialogue, so that a character will say something slightly different every time you speak to them.
by jftuga on 1/22/25, 2:48 AM
by ittaboba on 1/24/25, 2:48 PM
by evacchi on 1/22/25, 7:55 AM
by panchicore3 on 1/22/25, 4:08 PM
by numba888 on 1/22/25, 8:53 AM
by kristopolous on 1/21/25, 10:55 PM
My needs are narrow and limited but I want a bit of flexibility.
by Havoc on 1/21/25, 8:44 PM