from Hacker News

Extending the context length to 1M tokens

by cmcconomy on 11/18/24, 4:27 PM with 107 comments

  • by aliljet on 11/18/24, 6:05 PM

    This is fantastic news. I've been using Qwen2.5-Coder-32B-Instruct with Ollama locally and it's honestly such a breath of fresh air. I wonder if any of you have had a moment to try this newer context length locally?

    BTW, I can't effectively run this on my 2080 Ti, so I've just loaded up the machine with classic RAM. It's not going to win any races, but as they say, it's not the speed that matters, it's the quality of the effort. (A rough sketch of bumping Ollama's context window is below.)
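
    A minimal sketch of how one might ask a locally served Ollama instance for a larger context window; the model tag, prompt, and num_ctx value here are illustrative, and the usable window is still bounded by what the model itself supports and by available RAM/VRAM.

      # Sketch: request a larger context window from a local Ollama server.
      # Assumes Ollama is running on its default port and that the model
      # has already been pulled (e.g. `ollama pull qwen2.5-coder:32b`).
      import requests

      resp = requests.post(
          "http://localhost:11434/api/generate",
          json={
              "model": "qwen2.5-coder:32b",      # illustrative tag
              "prompt": "Summarize the design notes pasted below: ...",
              "stream": False,
              "options": {"num_ctx": 131072},    # ask for a 128k-token window
          },
          timeout=600,
      )
      print(resp.json()["response"])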

  • by lr1970 on 11/19/24, 12:41 AM

    > We have extended the model’s context length from 128k to 1M, which is approximately 1 million English words

    Actually, English-language tokenizers map on average 3 words to 4 tokens. Hence 1M tokens is about 750K English words, not a million as claimed (a quick check with the Qwen tokenizer is sketched below).
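
    As a rough check, tokenizing ordinary English with the Qwen2.5 tokenizer from Hugging Face gives a tokens-per-word ratio in that ballpark (the exact figure depends on the text, and the repo id below is simply the coder model mentioned upthread); at 4 tokens per 3 words, 1,000,000 tokens works out to 1,000,000 × 3/4 = 750,000 words.

      # Sketch: estimate tokens per word with the Qwen2.5 tokenizer.
      # The ratio varies with the text; this is only a rough check.
      from transformers import AutoTokenizer

      tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")
      text = "The quick brown fox jumps over the lazy dog. " * 200
      n_words = len(text.split())
      n_tokens = len(tok.encode(text))
      print(f"{n_tokens / n_words:.2f} tokens per word")
      print(f"{1_000_000 * 3 // 4:,} words in 1M tokens at 4 tokens per 3 words")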

  • by lostmsu on 11/18/24, 6:10 PM

    Is this model downloadable?

  • by swazzy on 11/18/24, 5:48 PM

    Note: unexpected Three-Body Problem spoilers on this page.

  • by anon291 on 11/18/24, 5:49 PM

    Can we all agree that these models far surpass human intelligence now? I mean, they process hours' worth of audio in less time than it would take a human to even listen. I think the singularity has passed and we didn't even notice (which would be expected).