from Hacker News

Show HN: Countless.dev – A website to compare every AI model: LLMs, TTSs, STTs

by ahmetd on 12/7/24, 9:42 AM with 76 comments

by vunderba on 12/7/24, 4:01 PM
OP, were you inspired by this LLM comparison tool?
https://whatllm.vercel.app
The tables are very similar - though you've added a custom calculator which is a nice touch.
Also for the Versus Comparison, it might be nice to have a checkbox that when clicked highlights the superlative fields of each LLM at a glance.
by ursaguild on 12/7/24, 12:38 PM
I like the idea of more comparisons of models. Are there plans to add independent analyses of these models or is it only an aggregation of input limits?
How do you see this differing from or adding to other analyses such as:
https://artificialanalysis.ai
https://huggingface.co/spaces/TTS-AGI/TTS-Arena
https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
https://huggingface.co/spaces/TIGER-Lab/GenAI-Arena
Great work on all the aggregation. The website is nice to navigate.
by karpatic on 12/7/24, 3:37 PM
Great! I wish there was a "bang to buck" value. Some way to know the cheapest model I could use for creating structured data from unstructured text, reliably. Using gpt4o-mini which is cheap but wouldn't know if anything cheaper could do the job too.
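A rough sketch of what such a "bang for buck" lookup could compute, assuming you run your own reliability check per model first. The model names, prices, and the `passes_eval` flag below are hypothetical placeholders, not figures from the site:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str                  # hypothetical model name
    input_per_million: float   # USD per 1M input tokens (placeholder value)
    output_per_million: float  # USD per 1M output tokens (placeholder value)
    passes_eval: bool          # result of your own structured-output check


def job_cost(model: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one workload at per-million-token prices."""
    return (
        (input_tokens / 1_000_000) * model.input_per_million
        + (output_tokens / 1_000_000) * model.output_per_million
    )


def cheapest_reliable(models, input_tokens, output_tokens):
    """Return the cheapest model that passed the check, or None if none did."""
    candidates = [m for m in models if m.passes_eval]
    if not candidates:
        return None
    return min(candidates, key=lambda m: job_cost(m, input_tokens, output_tokens))


# Placeholder data for illustration only; substitute real prices and eval results.
MODELS = [
    ModelPrice("model-a", 0.10, 0.40, passes_eval=False),
    ModelPrice("model-b", 0.25, 1.00, passes_eval=True),
    ModelPrice("model-c", 2.00, 8.00, passes_eval=True),
]

best = cheapest_reliable(MODELS, input_tokens=2_000_000, output_tokens=500_000)
```

The cheapest model overall ("model-a" here) is skipped because it failed the check, which is the whole point: price only matters among models that do the job.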
by wslh on 12/7/24, 2:06 PM
I'd like to share a personal perspective/rant on AI that might resonate with others: like many, I'm incredibly excited about this AI moment. The urge to dive headfirst into the field and contribute is natural; after all, it's the frontier of innovation right now.
But I think this moment mirrors financial markets during times of frenzy. When markets are volatile, one common piece of advice is to “wait and see”. Similarly, in AI, so many brilliant minds and organizations are racing to create groundbreaking innovations. Often, what you're envisioning as your next big project might already be happening, or will soon be, somewhere else in the world.
Adopting a “wait and see” strategy could be surprisingly effective. Instead of rushing in, let the dust settle, observe trends, and focus on leveraging what emerges. In a way, the entire AI ecosystem is working for you: building the foundations for your next big idea.
That said, this doesn't mean you can't integrate the state of the art into your own (working) products and services.
by gtirloni on 12/7/24, 3:42 PM
Tangent question: is there anything better on the desktop than ChatGPT's native client? I find it too simple for organizing chats, but I'm having a hard time evaluating the dozen or so apps (most are a disguise for some company's API service). Any recommendations? macOS/Linux compatibility preferred.
by politelemon on 12/7/24, 1:13 PM
There are only two audio transcription models. Is this generally true? Are there no open-source ones, like Llama but for transcribing? Or is it just a small dataset on that site?
by ursaguild on 12/7/24, 1:02 PM
Just saw that this was built for a hackathon. Huge kudos and congratulations!
by tonetegeatinst on 12/7/24, 11:46 PM
Love the UI and table layout. Have you thought about showing the different VRAM requirements for models?
by mcklaw on 12/7/24, 11:53 AM
It would be great if lmarena leaderboard information also appeared, to compare performance vs. cost.
by xnx on 12/7/24, 11:37 AM
Nice resource. Almost too comprehensive for someone who doesn't know all the sub-version names. Would be great to have a column of the score from lmarena leaderboard. Some prices are 0.00? Is there a page that each row could link to for more detail?
by lolinder on 12/8/24, 12:59 AM
One thing that stands out playing with the sorting is that Google's Gemini claims to have a context window more than 10x that of most of its competition. Has anyone experimented with this to see if its useful context window is actually anything close to that?
In my own experiments with the chat models they seem to lose the plot after about 10 replies unless constantly "refreshed", which is a tiny fraction of the supposed 128000 token input length that 4o has. Does Gemini actually do something dramatically differently, or is their 3 million token context window pure marketing nonsense?
by nikvdp on 12/8/24, 4:17 AM
you guys might also like http://llmprices.dev, similar but it's automatically updated with the latest info every 24h
by robbiemitchell on 12/7/24, 6:16 PM
One helpful addition would be Requests Per Minute (RPM), which varies wildly and is critical for streaming use cases -- especially with Bedrock where the quota is account wide.
by alif_ibrahim on 12/7/24, 3:20 PM
thanks for the comparison table! would be great if the header is sticky so i don't get lost in identifying which column is which.
by ProofHouse on 12/7/24, 8:52 PM
These are hard to keep updated. I find they usually fall off. It would be cool to have one, but honestly, this one doesn't even have 4o and pro on it, which it obviously would if it were being maintained. Updating a table shouldn't take days. It's like a one-minute event.
by Bigie on 12/8/24, 1:06 AM
I feel like the number is still a bit lacking, especially since many models made by Chinese companies are not represented, like speech-to-text.
As far as I know, there's a volcano engine in China that has impressive text-to-speech capabilities. Many local companies are using this model.
by moralestapia on 12/7/24, 3:56 PM
Hey this is great!
A small suggestion: a toggle to switch between "free" and hosted models.
Reason is, I'm obv. interested in seeing the cheaper models first but am not interested in self-hosted models, which dominate the first chunk of results because they're "free".
by dangoodmanUT on 12/7/24, 2:40 PM
This is missing... so many models... like most TTS and STT ones.
11labs, deepgram, etc.
by tomp on 12/7/24, 7:53 PM
"every"
you're missing a lot
TTS: 11labs, PlayHT, Cartesia, iFLYTEK, AWS Polly, Deepgram Aura
STT: Deepgram (multiple models, including Whisper), Gladia Whisper, Soniox
just off the top of my head (it's my dayjob!)
by wiradikusuma on 12/7/24, 5:13 PM
Suggestions:
1. Maybe explain what Chat, Embedding, Image generation, Completion, Audio transcription, and TTS (Text To Speech) mean?
2. Put a running number on the left, or at least just show total?
by mtkd on 12/7/24, 11:47 AM
It would possibly be more useful to have columns for release date, license type, and whether the model is EU-restricted, and also to right-align and comma-delimit the numeric cells.
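For the numeric formatting, a minimal sketch of the idea in Python. The `format_cell` helper and the column width are hypothetical, and the site's actual rendering code is unknown:

```python
def format_cell(value: int, width: int = 12) -> str:
    """Right-align an integer in a fixed-width cell with comma thousands separators."""
    return f"{value:>{width},}"


# Example: context-window sizes line up on the right edge of the column.
for tokens in (8192, 128000, 2000000):
    print(format_cell(tokens))
```

With the digits aligned on the right, orders of magnitude can be compared at a glance instead of by counting digits.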
by 5563221177 on 12/8/24, 3:50 AM
Logs emitted during the build, or test results, or metrics captured during the build (such as how long it took)... these can all themselves be build outputs.
I've got one where "deploying" means updating a few version strings and image references in a different repo. The "build" clones that repo and makes the changes in the necessary spots and makes a commit. Yes, the side effect I want is that the commit gets pushed--which requires my ssh key which is not a build input--but I sort of prefer doing that bit by hand.
by shahzaibmushtaq on 12/7/24, 2:47 PM
It's weird that OpenAI has lower prices and Azure has higher prices for the same models. Can anyone explain?
BTW impressive idea and upvoted on PH as well.
by mentalgear on 12/7/24, 2:52 PM
This is interesting price-wise, but quality-wise, if you do not provide benchmark results, it's not that helpful a comparison.
by ikishorek on 12/10/24, 10:31 AM
Can you please consider adding sort options to the cost columns in the Pricing Calculator?
by Its_Padar on 12/7/24, 12:38 PM
Would be great if it was possible to get to the page where the pricing was found to make it easier to use the model
by victoriawu on 12/9/24, 8:34 AM
It’s indeed quite intuitive to see the details of each AI model, but it feels a bit overwhelming with too much information.
I wonder if adding a chatbot might be a good idea. Users could ask specific questions based on their needs, and the bot could recommend the most suitable model. Perhaps this would add more value.
by SubiculumCode on 12/7/24, 9:50 PM
I was surprised: what is the model that costs the most per token? Luminous-Supreme-Control
by e-clinton on 12/7/24, 5:39 PM
DeepInfra prices are significantly better than what’s listed for OS models.
by amelius on 12/7/24, 4:51 PM
I'm missing the "IQ" column.
by methou on 12/7/24, 3:06 PM
Thank you on behalf of my waifu!
by NoZZz on 12/7/24, 5:34 PM
Stop feeding their machine.