from Hacker News

Mistral AI Launches New 8x22B MoE Model

by varunvummadi on 4/10/24, 1:31 AM with 153 comments

  • by freeqaz on 4/10/24, 2:40 AM

    What's the easiest way to run this assuming that you have the weights and the hardware? Even if it's offloading half of the model to RAM, what tool do you use to load this? Ollama? Llama.cpp? Or just import it with some Python library?

    Also, what's the best way to benchmark a model to compare it with others? Are there any tools to use off-the-shelf to do that?
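
    A minimal sketch of partial CPU/GPU offload via llama-cpp-python, assuming someone publishes a GGUF quantization of these weights (the file name below is hypothetical):

        # pip install llama-cpp-python (built with CUDA or Metal for offload)
        from llama_cpp import Llama

        llm = Llama(
            model_path="mixtral-8x22b.Q4_K_M.gguf",  # hypothetical GGUF quant
            n_gpu_layers=28,  # offload only as many layers as fit in VRAM
            n_ctx=4096,
        )
        out = llm("Q: What is a mixture-of-experts model?\nA:", max_tokens=64)
        print(out["choices"][0]["text"])

    Layers beyond n_gpu_layers stay in system RAM, which is exactly the half-in-RAM setup described above. For benchmarking, EleutherAI's lm-evaluation-harness is the usual off-the-shelf tool for comparing models on standard tasks.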

  • by SushiHippie on 4/10/24, 2:02 AM

  • by mlsu on 4/10/24, 1:40 AM

    8x22b. If this is as good as Mixtral 8x7b we are in for a wonderful time.
  • by nazka on 4/10/24, 6:34 AM

    Off topic, but are we now back to the same performance as GPT-4 at the time people said it worked like magic (i.e., before the nerf that made it more politically correct but tanked its performance)?
  • by zmmmmm on 4/10/24, 2:08 AM

    A pre-Llama 3 race for everyone to get their best small models on the table?
  • by nen-nomad on 4/10/24, 1:53 AM

    Mixtral 8x7b has been good to work with, and I am looking forward to trying this one as well.
  • by ZeljkoS on 4/11/24, 1:28 PM

  • by deoxykev on 4/10/24, 3:09 AM

    4-bit quants should require ~85GB of VRAM, so this will fit nicely across 4x 24GB consumer GPUs, with some left over for the KV cache.
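
    A back-of-the-envelope check, assuming the reported ~141B total parameters (the experts share attention weights, so it is not a full 176B) and ~4.8 effective bits per weight for a 4-bit quant once per-group scales are included:

        # Rough estimate for the quantized weights alone (KV cache excluded)
        total_params = 141e9     # reported total parameter count
        bits_per_weight = 4.8    # ~Q4 quant including per-group scales
        weight_gb = total_params * bits_per_weight / 8 / 1e9
        print(f"weights: {weight_gb:.0f} GB")               # ~85 GB
        print(f"left of 4x24 GB: {96 - weight_gb:.0f} GB")  # ~11 GB for KV cache
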
  • by zone411 on 4/11/24, 4:33 PM

    Very important to note that this is a base model, not an instruct model. Instruct fine-tuned models are what's useful for chat.
  • by talsperre on 4/10/24, 2:30 AM

    Right on time, just ahead of the Llama 3 release.
  • by abdullahkhalids on 4/10/24, 2:20 AM

    Why are some of their models open, and others closed? What is their strategy?
  • by wkat4242 on 4/11/24, 8:05 AM

    Weird, the last post I see at that link is from the 8th of December 2023, and it's not about this.

    Edit: Ah, it's the wrong link. https://news.ycombinator.com/item?id=39986047

    Thanks SushiHippie!

  • by intellectronica on 4/11/24, 12:54 PM

    It's weird that more than a day after the weights dropped, there still isn't a proper announcement from Mistral with a model card. Nor is it available on Mistral's own platform.
  • by ein0p on 4/10/24, 2:48 AM

    To this day 8x7b Mixtral remains the best model you can run on a single 48GB GPU. This has the potential to become the best model you can run on two such GPUs, or on an MBP with maxed out RAM, when 4-bit quantized.
  • by varunvummadi on 4/10/24, 1:33 AM

    They just announced their new model on Twitter; you can download it via torrent.
  • by aurareturn on 4/10/24, 7:26 AM

    Might be a dumb question, but does this mean this model has 176B params?
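
    For what it's worth, the figures Mistral later published say no: a sparse MoE only replicates the feed-forward experts, while attention, embedding, and routing weights are shared, giving ~141B total with ~39B active per token (2 of 8 experts routed). A quick check of the gap between the naive reading and the reported total:

        naive_total = 8 * 22e9    # 176B if the experts shared nothing
        actual_total = 141e9      # reported total parameters
        active = 39e9             # ~2 of 8 experts routed per token
        print(f"naive: {naive_total / 1e9:.0f}B, actual: {actual_total / 1e9:.0f}B, "
              f"active per token: {active / 1e9:.0f}B")
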
  • by resource_waste on 4/11/24, 4:09 PM

    What is the excitement around models that aren't as good as Llama?

    This is clearly an inferior model that they are willing to share for marketing purposes.

    If it were an improvement over Llama, sure, but it seems like just an ad for bad AI.

  • by swalsh on 4/10/24, 1:33 AM

    Is this Mistral large?
  • by stainablesteel on 4/11/24, 3:43 PM

    Has anyone had success building an AutoGPT-style agent on Mistral/Llama models? I haven't found one.
  • by angilly on 4/10/24, 2:21 AM

    The lack of a corresponding announcement on their blog makes me worry about a Twitter account compromise and a malicious model. Any way to verify it’s really from them?
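
    One sanity check once Mistral publishes official checksums (e.g., in a Hugging Face repo): hash the downloaded shards and compare against the published digests. The shard name below is hypothetical.

        import hashlib

        def sha256(path, chunk_size=1 << 20):
            """Stream a large file through SHA-256 without loading it whole."""
            h = hashlib.sha256()
            with open(path, "rb") as f:
                while block := f.read(chunk_size):
                    h.update(block)
            return h.hexdigest()

        print(sha256("consolidated.00.safetensors"))  # compare to official value
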
  • by tjtang2019 on 4/10/24, 2:19 AM

    What are the advantages compared to GPT? Looking forward to using it!