by Thoreandan on 3/25/25, 3:31 AM with 4 comments
by 42772827 on 3/25/25, 6:09 AM
>The model only came out a few hours ago and MLX developer Awni Hannun already has it running at >20 tokens/second on a 512GB M3 Ultra Mac Studio ($9,499 of ostensibly consumer-grade hardware) via mlx-lm and this mlx-community/DeepSeek-V3-0324-4bit 4bit quantization, which reduces the on-disk size to 352 GB.
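A rough back-of-the-envelope check of the figures in that quote, as a sketch only. The 352 GB on-disk size and the 512GB machine come from the quote. The parameter count of roughly 671 billion is an assumption not stated anywhere in this thread, and decimal gigabytes are assumed. The helper name `effective_bits_per_weight` is made up for illustration.

```python
def effective_bits_per_weight(size_bytes: float, n_params: float) -> float:
    """Average number of stored bits per parameter for a checkpoint."""
    return size_bytes * 8 / n_params


# On-disk size of the 4-bit quantization, taken from the quote (decimal GB assumed).
ON_DISK_BYTES = 352e9
# Memory of the Mac Studio mentioned in the quote (decimal GB assumed).
MEMORY_BYTES = 512e9
# ASSUMPTION: about 671e9 total parameters; this number is not given in the thread.
N_PARAMS = 671e9

bits = effective_bits_per_weight(ON_DISK_BYTES, N_PARAMS)
fits_in_memory = ON_DISK_BYTES < MEMORY_BYTES

print(f"about {bits:.2f} bits per weight")
print(f"weights fit in memory: {fits_in_memory}")
```

Under those assumptions the result lands a little above 4 bits per weight, which would be consistent with a 4-bit format that also stores some per-group metadata, and the weights leave headroom within 512GB for the KV cache and the rest of the system.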
by mdaniel on 3/25/25, 5:53 AM
by sinenomine on 3/25/25, 3:58 AM
by ilrwbwrkhv on 3/25/25, 7:04 AM