from Hacker News

Exploring spaCy-based prompt compression for LLMs – thoughts welcome

by metawake on 4/17/25, 1:21 PM with 1 comment

  • by metawake on 4/17/25, 1:21 PM

    Hi HN,

I’ve been exploring whether prompt compression (applied before input is sent to the LLM) can help cut token usage and cost without losing key meaning.

    Instead of using a neural model, I wrote a small open-source tool that uses handcrafted rules + spaCy NLP to reduce prompt verbosity while preserving named entities and domain terms. It’s mostly aimed at high-volume systems (e.g. support bots, moderation pipelines, embedding pipelines for vector DBs).
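    To make that concrete, here is a minimal sketch of the rule-based idea (illustrative only, not the repo's actual code; the filler list, POS set, and compress() function are assumptions for this example): run spaCy's tagger and NER over the prompt, always keep named entities, and drop filler words and low-information parts of speech.

      # Illustrative sketch, not the prompt_compressor implementation.
      # Requires: pip install spacy && python -m spacy download en_core_web_sm
      import spacy

      nlp = spacy.load("en_core_web_sm")

      # Example rules: filler words and low-information POS tags to drop.
      FILLERS = {"please", "kindly", "basically", "really", "just", "very"}
      DROP_POS = {"DET", "INTJ"}

      def compress(prompt: str) -> str:
          doc = nlp(prompt)
          kept = []
          for tok in doc:
              if tok.ent_type_:  # always preserve named entities
                  kept.append(tok.text)
              elif tok.is_punct or tok.lower_ in FILLERS or tok.pos_ in DROP_POS:
                  continue  # drop punctuation, fillers, and weak POS tags
              else:
                  kept.append(tok.text)
          return " ".join(kept)

      print(compress("Could you please summarize the report Alice sent on Tuesday?"))
      # Output (tagging-dependent), roughly:
      # "Could you summarize report Alice sent on Tuesday"

    The real rule set is more involved than this; the sketch just shows the general pattern of POS- and NER-driven filtering.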

I tested it on 135 real prompts and measured 22.4% average compression (by token count) with high semantic fidelity.

    GitHub: https://github.com/metawake/prompt_compressor

    Would love feedback, use cases, or critiques!