At GTC 2026, NVIDIA released Nemotron 3 Super, a 120B-parameter open-weight Mamba-Transformer hybrid mixture-of-experts (MoE) model with 12B active parameters and a 1-million-token context window.
It leads open-weight models on SWE-Bench Verified at 60.47%, achieves up to 7.5x higher inference throughput than comparable dense models, and is released under a permissive license aimed at enterprise agentic AI.
The Mamba-Transformer hybrid architecture enables significantly more efficient inference at long context lengths, making it particularly attractive for agentic workflows where models need to process large amounts of context.
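The efficiency claim follows from the asymptotics: self-attention's cost grows quadratically with context length, while a Mamba-style state-space scan grows linearly. A back-of-envelope sketch (illustrative only; the constants and state dimension here are assumptions, not figures from the release) shows why the gap widens at million-token contexts:

```python
# Rough cost model, not NVIDIA's numbers: self-attention does pairwise token
# interactions (O(n^2)); a state-space (Mamba-style) layer does a fixed-size
# state update per token (O(n * d_state)).

def attention_cost(n: int) -> int:
    """Pairwise token interactions: O(n^2)."""
    return n * n

def ssm_cost(n: int, state_dim: int = 16) -> int:
    """Sequential state update per token: O(n * state_dim).
    state_dim=16 is an illustrative assumption."""
    return n * state_dim

for n in (1_000, 100_000, 1_000_000):
    ratio = attention_cost(n) / ssm_cost(n)
    print(f"context={n:>9,}  attention/ssm cost ratio ~ {ratio:,.0f}x")
```

At a 1,000-token context the two are within the same order of magnitude, but at 1 million tokens the quadratic term dominates by tens of thousands of times, which is why replacing most attention layers with Mamba layers pays off precisely in long-context agentic workloads.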
Visit the model page for full specifications and benchmark breakdowns.