- Better than DeepSeek R1? MiniMax-M1: open-weight hybrid-attention reasoning model
by helloericsf on 6/16/25, 5:28 PM, with comments
- kit - Code Intelligence Toolkit
by helloericsf on 5/8/25, 11:16 PM, with comments
- DeepSeek Open Source Optimized Parallelism Strategies (3 repos)
by helloericsf on 2/27/25, 2:01 AM, with comments
- DeepSeek Open Source DeepGEMM – FP8 GEMM library (300 lines for 1350+ FP8 TFLOPS)
by helloericsf on 2/26/25, 1:08 AM, with comments
- Alibaba Open Source Large-Scale Video Generative Models: Wan2.1
by helloericsf on 2/25/25, 3:03 PM, with comments
- DeepSeek Open Source DeepEP – library for MoE training and inference
by helloericsf on 2/25/25, 2:27 AM, with comments
- DeepSeek Open Source FlashMLA – MLA Decoding Kernel for Hopper GPUs
by helloericsf on 2/24/25, 1:37 AM, with comments
- New Qwen2.5-Max Outperforms DeepSeek V3 in Benchmarks
by helloericsf on 1/28/25, 4:08 PM, with comments
- MiniMax-01: open-source 456B hybrid model with context up to 4M tokens
by helloericsf on 1/14/25, 7:32 PM, with comments
- DeepSeek V3 beats Claude Sonnet 3.5 and is far cheaper
by helloericsf on 12/26/24, 11:47 AM, with comments
- NeurIPS and Dr. Picard release statements over remark singling out Chinese scholars
by helloericsf on 12/16/24, 6:16 PM, with comments
- Tencent Hunyuan-Large
by helloericsf on 11/5/24, 6:52 PM, with comments
- Chinese AI Community: Open-Source Heatmap
by helloericsf on 7/31/24, 10:46 PM, with comments
- Poolside is raising $400M+ at a $2B valuation to build a coding co-pilot
by helloericsf on 6/20/24, 8:10 PM, with comments
- Is LMDeploy the Ultimate Solution? Why It Outshines vLLM, TRT-LLM, TGI, and MLC
by helloericsf on 6/20/24, 3:48 PM, with comments
- 21.2× faster than llama.cpp? Plus a 40% reduction in memory usage
by helloericsf on 6/12/24, 9:58 PM, with comments