from Hacker News

Ask HN: How to explain to execs why gen AI hasn't 10x'd feature dev

by macphisto178 on 1/26/25, 8:24 PM with 27 comments

Our senior eng team has found that gen AI tools like GPT and Claude haven't significantly reduced the time required to develop features in our system. We're having open, good-faith talks with product and exec teams about the reasons behind this and would love insights on how to effectively explain the limitations and nuances of using gen AI in engineering.
  • by austin-cheney on 1/26/25, 11:27 PM

    Just be clear about the current technology, in language a 5-year-old can follow. Use short sentences and connect the dots.

    1. AI is business jargon for Large Language Models.

    2. LLMs are predictive models using plain text.

    3. LLMs are not more accurate or creative than what they are fed.

    4. LLMs can provide excellent documentation for previously encountered problems but cannot provide original solutions to new ones.

    5. A more effective means of cost reduction is to reduce or eliminate regressions. This provides the same benefit as LLMs but without sacrificing creativity.

    6. This is an opportunity for executives to dramatically reduce expenses while radically increasing product quality. Lead the developers to increase test automation coverage and execution speed (a minimal test sketch follows below).

    7. The alternative for the developers is replacement by LLMs. LLMs cannot replace people, but they cost so much less that they make up the difference if product quality remains marginal or maintenance costs remain high.

    That is how you do it. Short sentences, use numbers to explain the finances, and play devil's advocate.
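
    A minimal sketch of the regression-test direction item 6 points at, assuming a hypothetical billing module; the module and function names are illustrative, not from the thread:

        # test_billing_regression.py -- pin down current behavior so future
        # changes cannot silently break it.
        import pytest

        from billing import compute_total  # hypothetical module under test

        @pytest.mark.parametrize(
            "items, expected",
            [
                ([], 0.0),                       # empty cart
                ([("widget", 2, 9.99)], 19.98),  # quantity * unit price
                ([("widget", 1, 9.99), ("gadget", 3, 2.50)], 17.49),
            ],
        )
        def test_compute_total_regression(items, expected):
            # Each case encodes behavior relied on today; a regression flips
            # an assertion and fails fast in CI, before it reaches customers.
            assert compute_total(items) == pytest.approx(expected)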

  • by tacostakohashi on 1/26/25, 9:48 PM

    Why is the onus on you? Why don't the execs explain why it _would_ 10x the efficiency of their feature factory, if they're so smart?

    The obvious answer is that generating code is not the hard part of building products.

  • by 2024user on 1/26/25, 8:39 PM

    I would look at the claim that it would 10x feature dev. Where did that claim come from? AFAIK, OpenAI and Anthropic don't make that claim.
  • by drooby on 1/26/25, 9:02 PM

    Increasing the speed at which you throw shit at a wall does not change the fact that you are throwing shit at a wall.
  • by MattGaiser on 1/26/25, 8:56 PM

    In my experience, the biggest cause is ineffective reasoning over a large context (more than about 350 lines).

    ChatGPT breaks down when it needs to consider more than 350 lines, and its performance is sloppy before that.

    To get solid performance out of it, I essentially need to specify the important areas and changes as well as the desired approach.
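
    As a rough illustration of "specify the important areas", here is a hypothetical sketch that slices a file down to the region you actually want changed before it goes into a prompt; the 350-line budget echoes the number above, and every name is made up:

        # Keep the prompt focused: send one slice of a file plus a stated
        # approach, instead of the whole module.
        from pathlib import Path

        MAX_LINES = 350  # rough budget matching the breakdown point above

        def build_focused_prompt(path: str, start: int, end: int, approach: str) -> str:
            lines = Path(path).read_text().splitlines()
            snippet = lines[start - 1:end]  # 1-indexed, inclusive slice
            if len(snippet) > MAX_LINES:
                raise ValueError("slice still too large; narrow the region")
            return (
                f"File {path}, lines {start}-{end}:\n" + "\n".join(snippet) +
                f"\n\nDesired approach: {approach}\n"
                "Change only this region; keep the public interface intact."
            )

        # e.g. build_focused_prompt("billing.py", 120, 180,
        #                           "replace the nested loops with a dict lookup")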

    That being said, I've found it has cut my development time for most features by at least half, even in large codebases, so I would be curious to know how you are approaching it currently.

  • by muzani on 1/28/25, 11:27 AM

    It can 10x writing code and debugging, but neither of these is a software engineer's day to day.

    You still need to do code reviews, especially of the AI's output. Its code is fairly bug-free, more so than human code, but you still need to explain the product. v0 doesn't quite replace a good product designer.

    It's not yet at the level where it can do architectural work, and it doesn't understand the scope or goal of products. It doesn't understand roadmaps. It can't plan the code around where it needs to be in a year. You still need a proper architect for that.

    The average code it's trained on is from around 2019. Newer models have people writing new data for them, but most of that data is not production data. So you're likely getting dated designs too, and it tends to recommend them until encouraged otherwise.

    Also, if you're not using the newer tools like Cursor, Aider, or Windsurf, a lot of AI's contribution is just better test coverage. The value of the "agent" tools is that they will write and edit code across multiple files, and they save you the trouble of explaining context, since you can just share the source code.

  • by TheMongoose on 1/26/25, 8:41 PM

    I would look at why you're outsourcing this thinking to HN instead of putting together data and an understanding of your own environment.

    Or just ask the LLMs to write it for you.

  • by threecheese on 1/27/25, 12:30 AM

    I had a similar question that I posed to an ally in senior leadership. The answer, obviously I guess, was metrics. Assuming your teams didn’t just FAFO through it, you could reconstruct what the expected outcomes were vs actuals. And follow that with sound recommendations about where it could be used along with some thoughts about the future state (when the tools are faster or cheaper or whatever).
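
    If you do reconstruct expected vs. actual outcomes, even a crude comparison carries the argument; a hypothetical sketch with invented numbers, not data from the thread:

        # Compare cycle time per feature, before and after adopting the AI
        # tooling. All figures below are placeholders.
        baseline_days = {"search-filters": 8, "bulk-export": 13, "sso": 21}
        with_ai_days = {"search-filters": 7, "bulk-export": 12, "sso": 20}

        for feature, before in baseline_days.items():
            after = with_ai_days[feature]
            speedup = before / after
            print(f"{feature}: {before}d -> {after}d ({speedup:.2f}x, not 10x)")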

    Another commenter had a great note: you should take this opportunity to advance your career, given you are in an excellent position to do so. It made me giggle, but then think that you really need to make sure your PoC was executed thoughtfully and seriously, that you understand the SotA in using assistants and their various modalities (chat, agent, copilot, etc.), and that despite your expertise it was a no-go. Because if you don't, somebody else who is taking it dead seriously is going to take that commenter's advice and demonstrate the value that you didn't, and that may reflect poorly on you. Execs are getting hype from their back channels and vendors like I have never seen in my career, and you are going to go against that. ($bigco perspective there)

  • by jarsin on 1/27/25, 1:12 AM

    I just watched the latest Y Combinator AI promo piece on YouTube. Among a bunch of other claims, they say founders in the latest batch won't hire any engineers who don't use AI, due to the "force multiplier".

    Then I come to check out Ask HN and see this as top post.

  • by PaulHoule on 1/26/25, 8:37 PM

    Because it can't. It can beat Stack Overflow, but that's because Stack Overflow sucks and hasn't had a serious competitor.

    If somebody else seems to be getting a 10x speed-up, they got lucky with a simple problem, are lying (to make it big as an AI influencer), or are delusional.

    Could some product come out next year that's better? Maybe. Right now, though, it's not productive to hunt for some "nuance" that will get you to 10x.

  • by viraptor on 1/26/25, 9:00 PM

    Why ask HN? Have you actually tried using the tech? If so, you should already have the answers. There are lots of different types of software development; gen AI will be extremely useful in some areas and a net negative in others. Only you can answer the questions about your environment.
  • by franktankbank on 1/26/25, 9:22 PM

    Do you really need to explain it? Why are other teams picking your tools?
  • by AnimalMuppet on 1/27/25, 2:38 AM

    "It hasn't for the same reason it hasn't made you a 10x exec".
  • by verdverm on 1/26/25, 10:17 PM

    AI is like hiring junior developers. Do they have any evidence that has worked before? Why would they expect an equivalent to work?

    How did they come to this belief in the 10x AI developer? Get them to question their base assumptions and ask them to justify their expectations.

  • by epcoa on 1/27/25, 9:08 AM

    Why not just let them fuck around and find out?
  • by iExploder on 1/26/25, 8:49 PM

    You would be wise to do the exact opposite, in fact: claim it does more than 10x. Ask to spearhead the AI transformation of your company. Promise huge cost savings. Make your bag. And who knows, maybe you'll actually deliver on those promises as a side effect. And if not? Remember, the senior eng team is there as a convenient scapegoat ...
  • by ianpurton on 1/27/25, 6:38 AM

    Because writing prompts is hard.

    Most AI coding tools can generate a Todo app from a small prompt. This is because that problem is well understood.

    When you try to use AI coding tools on your own projects you need to start writing a prompt that teaches the AI about your current architecture and decisions.

    So the initial prompt is large.

    Often the task needs knowledge of other files in your project. You can add them by hand, or some AI tools will search the codebase.

    The prompt is now huge.
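
    To make that growth concrete, a hypothetical sketch of how the prompt balloons as context is added; the file names are invented, and the four-characters-per-token figure is only a common rough approximation:

        # Watch the prompt grow as architecture notes and related files pile up.
        from pathlib import Path

        def rough_tokens(text: str) -> int:
            return len(text) // 4  # crude rule of thumb: ~4 characters/token

        parts = ["Task: add rate limiting to the upload endpoint."]
        for name in ["ARCHITECTURE.md", "uploads/views.py", "uploads/limits.py"]:
            parts.append(f"--- {name} ---\n{Path(name).read_text()}")  # hypothetical files

        prompt = "\n\n".join(parts)
        print(f"prompt is ~{rough_tokens(prompt)} tokens")  # "small" becomes huge fast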

    When you run that prompt you may or may not get what you expected.

    So now the issue is how much time you spend getting the prompt right vs. just writing the code yourself.

    This area is brand new and there are very few resources on how to use AI coding tools effectively.

    I have yet to see one demonstration of effective AI coding tool use on a project of reasonable complexity.