Nvidia today launched the latest group of AI models in its open source Nemotron series for powering AI agents: Nemotron 3 Nano, Super and Ultra. For the first time, Nvidia is also releasing not just the models but also 3 trillion tokens worth of pre-training data and 18 million samples of post-training data. Thanks to Nvidia's existing training environments, for which the company is also launching 10 new gym environments, and the company's open source reinforcement learning libraries, developers will also be able to easily take these models and train them for their own use cases.
The Nano model is available now, with the Super and Ultra models expected to be available in the first half of 2026.
The Nemotron 3 Model Family
As for the models themselves, this is the first of the Nemotron families to use the mixture of experts (MoE) technique, which essentially decouples model size from compute cost by only keeping a subset of parameters active at any given time. That, in turn, means these new models are significantly faster, with Nvidia arguing that the 30-billion-parameter Nemotron 3 Nano model (with 3 billion active parameters because of that MoE technique) is up to 4x more performant than the equivalent Nemotron 2 Nano model. It also generates up to 60% fewer reasoning tokens to arrive at its answers, which will bring down the cost of using this model even more. It's also one of the few open source models to offer a context window of 1 million tokens.
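To make the MoE idea concrete, here is a minimal, self-contained sketch of top-k expert routing. The dimensions and expert count are toy values chosen for illustration, not Nemotron 3's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 64, 8, 2  # toy sizes (assumption, for illustration only)
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02  # learned in a real model

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ router                              # (tokens, n_experts) routing scores
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # indices of the k highest-scoring experts
    out = np.zeros_like(x)
    for t, expert_ids in enumerate(top):
        w = np.exp(logits[t, expert_ids])            # softmax over the selected experts only
        w /= w.sum()
        for weight, e in zip(w, expert_ids):
            out[t] += weight * (x[t] @ experts[e])   # only k of n_experts actually run
    return out

tokens = rng.standard_normal((4, d_model))           # a batch of 4 token vectors
print(moe_layer(tokens).shape)                       # (4, 64); 6 of 8 expert matrices sit idle per token
```

The key point the sketch shows is the speed claim's mechanism: all experts' weights exist in memory, but each token only pays the compute cost of the few experts it is routed to.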
The Nano model, which Nvidia says should work especially well for targeted tasks, is now available on HuggingFace.
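For developers who want to try it, loading the model with the Hugging Face transformers library might look roughly like the sketch below. The repository id shown is an assumption; check Nvidia's Hugging Face organization for the published name.

```python
from transformers import pipeline

# Minimal sketch; the repo id below is hypothetical, not the confirmed name.
pipe = pipeline(
    "text-generation",
    model="nvidia/Nemotron-3-Nano",  # assumption: replace with the actual repo id
    device_map="auto",               # requires a GPU with enough memory for a 30B MoE model
)
print(pipe("Summarize the benefits of MoE models:", max_new_tokens=64)[0]["generated_text"])
```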
The Nemotron 3 Super model, a 100-billion-parameter model with 10 billion active parameters, is meant for multi-agent applications. The Nemotron 3 Ultra model features 500 billion parameters and 50 billion active ones, and while that is going to make it the smartest of these new models and great for more complex applications, it'll also be the most expensive to run.

Credit: Nvidia.
Nvidia did not provide the press with detailed benchmarks ahead of the embargo time. All the company has said so far is that "Artificial Analysis, an independent organization that benchmarks AI, ranked the model as the most open and efficient among models of the same size, with leading accuracy."
You can gather a bit more information about where the Nano model falls from the Artificial Analysis graph below, which puts Nemotron 3 Nano in the same ballpark as OpenAI's GPT-OSS-20B (high), Qwen 3 30B and Qwen 3 VL 32B, though with much higher tokens-per-second output speed. ServiceNow's Apriel Thinking model is significantly slower but a bit ahead of the Nano model in Artificial Analysis's intelligence index.

Nvidia Nemotron 3 Nano benchmark according to Artificial Analysis. Credit: Nvidia.
Availability
Given the open source nature and license of these new models, developers will be able to run them themselves as an Nvidia NIM microservice if they have the required hardware, but the models will also be available through commercial providers and other platforms, including public clouds like Amazon Bedrock (serverless) and, soon, on Google Cloud, Coreweave, Nebius, Nscale and Yotta.
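As a rough illustration of the self-hosted path: NIM microservices expose an OpenAI-compatible API, so querying a locally running instance could look like the sketch below. The port follows NIM's usual convention, and the model id is an assumption; a deployed instance will report its actual id.

```python
from openai import OpenAI

# Sketch of calling a locally hosted NIM endpoint (OpenAI-compatible API).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

resp = client.chat.completions.create(
    model="nvidia/nemotron-3-nano",  # hypothetical id; list real ones via client.models.list()
    messages=[{"role": "user", "content": "Draft a plan for a multi-step agent task."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```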
Inference services like Baseten, Deepinfra, Fireworks, FriendliAI, OpenRouter and Together AI will also offer it, as will platforms like Couchbase, DataRobot, H2O.ai, JFrog, Lambda and UiPath.
Why Nvidia Builds Its Own Models
While Nvidia is better known for creating the hardware accelerators that the vast majority of large language models have been trained on, the company's journey in building its own models started in 2019, with the Megatron-LM model. The first models under the Nemotron brand launched in 2024, with a reasoning model based on Meta's Llama 3.1. Since then, Nvidia has launched quite a few Nemotron models in different sizes and tuned for specific use cases, all with relatively permissive licenses that allowed companies like ServiceNow to tune these models for their own use cases.
When asked in a press conference ahead of today's announcement why Nvidia is building its own models, and whether the company is trying to become a frontier model builder, Kari Briski, the VP of Generative AI for Enterprise at Nvidia, noted that part of the idea here is to push the company's own hardware to the limit in both training and model inference.
"I wouldn't have to say 'is it competing?' It's for building it for ourselves and we're giving it to the ecosystem to see and create on top of," she explained.
This, Briski argued, is also why Nvidia is interested in building this open ecosystem around its models, and model creation in general. "If we believe that generative AI and large language models are — and we do — the development platform of the future, I'm looking at these LLMs as if they're a library. And what do we do with libraries? We put them out there for [developers] to inspect the code, so that you can understand it, that you can build on it, that we can fix bugs, that we can improve it, and then put that back out there. So the more that we put that out there, the more developer engagement."