A synthesis of 6 perspectives on AI, machine learning, model releases, model benchmarks, and trending AI products
AI-Generated Episode
This week on The NeuralNoise Podcast, we look at how Nvidia is moving from pure chipmaker to model provider—and how OpenAI’s new “circuit sparsity” tools are opening a window into the inner workings of large language models.
Nvidia has built its dominance by supplying the GPUs that power modern AI, but with its new Nemotron 3 family the company is edging into another role: major open‑model provider.
Nemotron 3 consists of three downloadable, modifiable language models—Nano (30B parameters), Super (100B), and Ultra (500B). These are among the most capable open models that can be run on users’ own hardware, according to benchmarks shared by Nvidia. Unlike many US rivals, Nvidia is not just releasing models, but also the training data and tools that went into them.
That transparency is strategic on multiple fronts.
Nemotron 3 is Nvidia’s answer: an open, high‑end model line tightly integrated with its hardware and software stack. CEO Jensen Huang framed it as a bet on “open innovation,” and the company is backing that bet by shipping not just model weights but also the training data and tooling behind them.
The message is clear: Nvidia doesn’t just want to sell you GPUs; it wants to be the platform you build your AI agents on—especially if closed US models and increasingly independent Chinese stacks leave a gap in the middle.
While Nvidia leans into openness at the model and tooling level, OpenAI is experimenting with openness at the mechanistic level—how individual neurons and connections implement algorithms.
OpenAI has released the openai/circuit-sparsity model on Hugging Face and the companion openai/circuit_sparsity toolkit on GitHub, packaging the work from the paper “Weight-sparse transformers have interpretable circuits”. The core idea: train models to be extremely sparse from the start so their internal “circuits” become small enough to study directly.
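The core trick, hard weight sparsity enforced throughout training, can be illustrated with a toy example. The sketch below is pure NumPy and deliberately miniature: it is not the paper's architecture or training recipe, just iterative hard thresholding on a linear model, where after every gradient step all but the k largest‑magnitude weights are zeroed so the model stays sparse from the start.

```python
import numpy as np

rng = np.random.default_rng(0)

def topk_mask(w, k):
    # Keep only the k largest-magnitude entries of w; zero out the rest.
    thresh = np.partition(np.abs(w).ravel(), -k)[-k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

# Toy linear regression trained under a hard sparsity constraint.
d, k = 64, 8
true_w = np.zeros(d)
true_w[:k] = rng.normal(size=k)          # ground truth uses only k weights
X = rng.normal(size=(256, d))
y = X @ true_w

w = 0.01 * rng.normal(size=d)
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(X)    # least-squares gradient
    w = topk_mask(w - 0.05 * grad, k)    # gradient step, then re-sparsify

print(np.count_nonzero(w))               # at most k weights survive training
```

In a transformer the same constraint is applied to weight matrices rather than a single vector, which is what makes the surviving connections small enough to read off as circuits.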
Two released artifacts make the approach concrete:
The openai/circuit-sparsity model itself is a 0.4B‑parameter GPT‑2–style code model (csp_yolo2 in the paper) released under Apache 2.0 and runnable via a standard Hugging Face interface. The GitHub toolkit adds a Streamlit visualizer so researchers can explore circuits, ablate nodes, and inspect activations interactively.
Perhaps the most intriguing innovation is bridges: encoder–decoder pairs that map activations between a sparse model and a standard dense baseline. By training these bridges so that mixed sparse–dense forward passes match the original dense model, OpenAI can test whether the circuits uncovered in the sparse model have genuine counterparts inside the dense one.
This gives researchers a new handle on the age‑old question: do the clean, human‑readable circuits we find in toy or idealized setups actually correspond to structure in real, large‑scale systems?
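The bridge idea can be sketched in miniature. The toy below is a loose sketch, not OpenAI's actual bridge architecture or objective: stand‑in activations for the two models are related linearly, an encoder (sparse space to dense space) and a decoder (dense to sparse) are fit by least squares, and we check that a round trip through the sparse model's activation space approximately reproduces the dense activations, which is the property a working bridge needs.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_dense, d_sparse = 512, 32, 16

# Stand-in activations at a matching layer of each model.
# (In the real setup these come from forward passes on shared inputs.)
H_sparse = rng.normal(size=(n, d_sparse))
mix = rng.normal(size=(d_sparse, d_dense))
H_dense = H_sparse @ mix + 0.01 * rng.normal(size=(n, d_dense))

# Bridge encoder: least-squares linear map, sparse space -> dense space.
enc, *_ = np.linalg.lstsq(H_sparse, H_dense, rcond=None)
# Bridge decoder: dense space -> sparse space.
dec, *_ = np.linalg.lstsq(H_dense, H_sparse, rcond=None)

# A "mixed" pass: decode dense activations into the sparse model's space,
# re-encode them, and check the dense activations are approximately recovered.
H_mixed = (H_dense @ dec) @ enc
err = np.mean((H_mixed - H_dense) ** 2) / np.mean(H_dense ** 2)
print(err)
```

In the real system the bridges are trained (and the models are transformers, not linear maps), but the acceptance test is the same in spirit: substituting bridged activations into the dense forward pass should leave its behavior nearly unchanged.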
Nvidia’s Nemotron 3 and OpenAI’s circuit sparsity tools point in the same broad direction: more control and visibility over powerful models.
On one side, open, well‑documented foundation models like Nemotron 3 make it easier for startups, enterprises, and researchers to build specialized agents and systems without being locked into a single vendor’s black box. On the other, mechanistic interpretability work like circuit sparsity and activation bridges gives us a sharper lens on what those systems are doing under the hood—and how to steer them safely.
As AI diffuses into critical infrastructure, both levers will matter: the freedom to run and customize your own models, and the ability to understand and debug them at the circuit level.