from Hacker News

Why Your Chunking Strategy Makes or Breaks Your AI System

by savanpatel on 6/20/25, 9:12 PM with 2 comments

  • by colbyn on 6/20/25, 11:36 PM

    Personally, I've been thinking about this problem for some time and have a neat idea that I'm almost tempted to patent. I have yet to test it with an actual implementation, but the core idea is quite simple…
  • by chiccomagnus on 6/23/25, 2:38 PM

    IMHO, your article is missing an important point: 90% of implementations today flatten documents to plain text before chunking them. Why not consider the visual structure that the author gave the document? By combining layout information with semantics, you can improve RAG performance by 160% (tested via benchmarks), so why do most of us use only text?

    Note: multimodal ≠ layout
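
To make the layout point concrete, here is a rough sketch of chunking that respects document structure instead of flattening everything to plain text first. It is only an illustration under stated assumptions, not the commenter's method or any benchmarked system: the `Block` type, its `kind` labels ("heading" / "paragraph"), and the `chunk_by_layout` function are all hypothetical names, and it assumes some upstream parser has already labeled the blocks.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str  # hypothetical labels: "heading" or "paragraph"
    text: str


def chunk_by_layout(blocks, max_chars=500):
    """Group blocks into chunks that never cross a heading boundary.

    Each chunk is prefixed with the heading it falls under, so the
    section context survives retrieval.
    """
    chunks = []
    heading = ""
    body = []

    def flush():
        # Emit the paragraphs collected so far as one chunk.
        if body:
            prefix = heading + "\n" if heading else ""
            chunks.append(prefix + "\n".join(body))
            body.clear()

    for block in blocks:
        if block.kind == "heading":
            # A new section starts: close the previous chunk first.
            flush()
            heading = block.text
        else:
            # Split within a section once the size budget is exceeded.
            size = sum(len(t) + 1 for t in body)
            if body and size + len(block.text) > max_chars:
                flush()
            body.append(block.text)
    flush()
    return chunks


doc = [
    Block("heading", "Intro"),
    Block("paragraph", "a"),
    Block("paragraph", "b"),
    Block("heading", "Methods"),
    Block("paragraph", "c"),
]
print(chunk_by_layout(doc))
```

A plain-text splitter with a fixed window could easily merge the end of "Intro" with the start of "Methods"; here the heading acts as a hard boundary and travels with each chunk.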