by neural_thing on 12/10/24, 5:30 PM with 132 comments
by winkle on 12/10/24, 7:03 PM
by Topfi on 12/11/24, 1:51 PM
What we do get is a price of $500 per month from a company that has been caught lying about this very product [0] and has never allowed independent testing.
Cognition, I am sorry to tell you, but there is no reason to trust you. In fact, there are multiple good reasons not to, even if you offered Devin at a fraction of the price.
If this were, e.g., Anthropic launching a new beyond-Opus-size model that was still performant and came with "chain-of-thought" capabilities, a far larger context window that still fully passes needle-in-a-haystack tests, is absolutely solid in sourcing from provided files, stays on track even when given large documents, has few or no restrictions on usage, and comes with extensive, verifiable benchmarks showing a significant upgrade over other models, maybe such a price could be justified.
You know why, Cognition? Because they haven’t actively lied. What they did instead was let people use their models and actually test the advantages. Even Claude Instant, way back when, had certain use cases that gave it its own niche, and they showed they could execute before expanding with 2 and the larger context, then 3 with more applications. You never did any of that. You never gave anyone reason to believe what you claim; you didn’t even release benchmarks. See the difference?
Seems more like a simple cash grab, attempting to ride the o1 wave. OpenAI has a hard time justifying their Pro pricing; your doubling that makes this an out-of-season April Fools' joke. Waiting for the inevitable reporting that this is just another API wrapper for Claude or ChatGPT with our old faithful RAG.
[0] https://www.youtube.com/watch?v=tNmgmwEtoWE&pp=ygUJZGV2aW4gY...
by preommr on 12/10/24, 9:40 PM
But these are the kinds of problems that help shape the product. The software architecture should be a compression of a deep and intuitive understanding of the problem space. How can you develop that knowledge if you're just delegating it to a black box that can't operate at a near-human level?
I've used AI-based tools to great success, but on an ad-hoc basis, for specific and small functions or modules. Doing the integration part requires an understanding of what abstraction is appropriate where. I don't think these tools are good at that.
by a-arbabian on 12/10/24, 7:09 PM
by paradite on 12/10/24, 6:36 PM
by Yusefmosiah on 12/10/24, 6:31 PM
In my own experience using Cursor with Claude 3.5 Sonnet (new) and o1-preview, Claude is sufficient for most things, but there are times when Claude gets stumped. Invariably that means I asked it to do too much. But sometimes, maybe 10-20% of the time, o1-preview is able to do what Claude couldn’t.
I haven’t signed up for o1 Pro because going from Cursor to copy/pasting from ChatGPT is a big DevX downgrade. But from what I’ve heard o1 Pro can solve harder coding problems that would stump Claude or o1-preview.
My solution is just to split the problem into smaller chunks that make it tractable for Claude. I assume this is what Devin’s doing. Or is Devin using custom models or an early version of the o1 (full or pro) API?
by gexla on 12/11/24, 2:20 AM
by mfdupuis on 12/11/24, 12:51 AM
That said, I'm super excited about this space and love seeing smart folks putting energy into this. Even if it's still a bit aspirational, I think the idea of cutting down time spent debugging and refactoring and putting more power in the hands of less technical folks is awesome.
by waldenyan20 on 12/10/24, 5:55 PM
by adamgordonbell on 12/10/24, 5:55 PM
(Removed pricing question, as I missed that it is $500/month for whole teams. I get why that is the pricing, but sadly it doesn't work for me to try it in side projects.)
by binarynate on 12/10/24, 6:11 PM
https://www.washingtonpost.com/technology/interactive/2021/p...
by debacle on 12/10/24, 7:54 PM
by didip on 12/10/24, 6:03 PM
by anticensor on 12/12/24, 6:10 PM
by Oras on 12/10/24, 6:32 PM
And other points where it should shine. How does it compare to using Cursor? Is it the slack integration?
by allusernamesare on 12/10/24, 6:42 PM
by daft_pink on 12/10/24, 5:54 PM
by WesleyJohnson on 12/10/24, 7:49 PM
by nextworddev on 12/11/24, 8:59 AM
by DidYaWipe on 12/12/24, 12:34 AM
by adastra22 on 12/11/24, 7:28 AM