Moore Threads Open‑Sources MusaCoder: LLM for GPU Kernel Generation, Trained Entirely on Domestic GPUs

Release date: 2026-06-11

On June 10, Moore Threads released and open‑sourced MusaCoder, the first dedicated code LLM fully trained and validated on a domestic GPU compute base. MusaCoder comes in 9B and 27B parameter versions and generates native CUDA/MUSA kernel code from standard PyTorch operators, lowering the bar for hand‑coding low‑level GPU kernels.


The full post‑training pipeline (SFT, RFT, RL, async rollout, online compilation/execution validation, and reward calculation) was completed on Moore Threads’ Kua'e AI compute cluster, which is built with MTT S5000 GPUs. This demonstrates that domestic GPUs can handle the entire code‑LLM lifecycle, including frequent compilation and feedback loops.
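The article does not describe how MusaCoder's reward is computed. As a rough illustration of what "compilation/execution validation and reward calculation" can look like in an RL loop for kernel generation, here is a minimal Python sketch. The `KernelResult` fields, the `reward` function, the tier values (0.0, 0.1, 0.5 to 1.0), and the speedup cap are all hypothetical choices made for this example, not Moore Threads' method.

```python
from dataclasses import dataclass


@dataclass
class KernelResult:
    """Hypothetical outcome of compiling and running one generated kernel."""
    compiled: bool   # did the kernel source compile?
    correct: bool    # did its output match the PyTorch reference?
    speedup: float   # reference_time / kernel_time (assumed measurement)


def reward(res: KernelResult, speedup_cap: float = 2.0) -> float:
    """Tiered reward: compile failure < wrong output < correct (scaled by speed).

    All thresholds here are illustrative assumptions.
    """
    if not res.compiled:
        return 0.0
    if not res.correct:
        return 0.1  # small credit for code that at least compiles
    # Clamp the speedup into [0, speedup_cap] and map it to a bonus in [0, 1].
    bonus = min(max(res.speedup, 0.0), speedup_cap) / speedup_cap
    return 0.5 + 0.5 * bonus


if __name__ == "__main__":
    for r in (
        KernelResult(compiled=False, correct=False, speedup=0.0),
        KernelResult(compiled=True, correct=False, speedup=0.0),
        KernelResult(compiled=True, correct=True, speedup=1.0),
    ):
        print(r, reward(r))
```

A tiered design like this gives the policy a learning signal even when a kernel fails, which matters when every rollout has to be compiled and executed before it can be scored.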

On KernelBench, MusaCoder‑27B‑RL achieved an Overall Pass@8 of 93.2% and an Avg.@8 of 88.60%, surpassing Claude Opus 4.7, DeepSeek‑V4 Pro, GLM‑5.1, and Kimi K2.6.
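The article does not define the two metrics. Under a common reading, Pass@8 is the share of tasks solved by at least one of 8 sampled kernels, and Avg.@8 is the mean per-sample success rate across those 8 samples. The sketch below computes both under that assumption; the function names and the sample data are hypothetical and do not reproduce the reported scores.

```python
from typing import Dict, List


def pass_at_k(results: Dict[str, List[bool]]) -> float:
    """Fraction of tasks where at least one sample passed."""
    return sum(any(samples) for samples in results.values()) / len(results)


def avg_at_k(results: Dict[str, List[bool]]) -> float:
    """Per-task pass rate over the k samples, averaged across tasks."""
    per_task = [sum(samples) / len(samples) for samples in results.values()]
    return sum(per_task) / len(per_task)


# Hypothetical outcomes: three tasks, 8 sampled kernels each.
results = {
    "task_a": [True] * 8,
    "task_b": [False, True, False, False, True, False, False, False],
    "task_c": [False] * 8,
}

if __name__ == "__main__":
    print(f"Pass@8: {pass_at_k(results):.3f}")
    print(f"Avg@8:  {avg_at_k(results):.3f}")
```

Under these definitions Pass@8 can never be lower than Avg.@8, which is consistent with the ordering of the two reported figures.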

ICgoodFind: Open‑source MusaCoder fills a software gap for domestic GPUs, proving China’s AI compute stack can train top‑tier code models end‑to‑end.
