A synthesis of 6 perspectives on AI, machine learning, model releases, model benchmarks, and trending AI products
AI-Generated Episode
This week on The NeuralNoise Podcast, we look at how Nvidia is moving from pure chipmaker to model provider—and how OpenAI’s new “circuit sparsity” tools are opening a window into the inner workings of large language models.
Nvidia has built its dominance by supplying the GPUs that power modern AI, but with its new Nemotron 3 family the company is edging into another role: major open‑model provider.
Nemotron 3 consists of three downloadable, modifiable language models—Nano (30B parameters), Super (100B), and Ultra (500B). These are among the most capable open models that can be run on users’ own hardware, according to benchmarks shared by Nvidia. Unlike many US rivals, Nvidia is not just releasing models, but also the training data and tools that went into them.
That transparency is strategic on multiple fronts.
Nemotron 3 is Nvidia’s answer: an open, high‑end model line tightly integrated with its hardware and software stack. CEO Jensen Huang framed it as a bet on “open innovation,” and the company is backing that bet by shipping not just model weights but also the training data and tooling behind them.
The message is clear: Nvidia doesn’t just want to sell you GPUs; it wants to be the platform you build your AI agents on—especially if closed US models and increasingly independent Chinese stacks leave a gap in the middle.
While Nvidia leans into openness at the model and tooling level, OpenAI is experimenting with openness at the mechanistic level—how individual neurons and connections implement algorithms.
OpenAI has released the openai/circuit-sparsity model on Hugging Face and the companion openai/circuit_sparsity toolkit on GitHub, packaging the work from the paper “Weight-sparse transformers have interpretable circuits”. The core idea: train models to be extremely sparse from the start so their internal “circuits” become small enough to study directly.
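The core trick, hard weight sparsity enforced throughout training, can be illustrated with a toy example. The sketch below is pure NumPy and deliberately miniature: it is not the paper's architecture or training recipe, just iterative hard thresholding on a linear model, where after every gradient step all but the k largest‑magnitude weights are zeroed so the model stays sparse from the start.

```python
import numpy as np

rng = np.random.default_rng(0)

def topk_mask(w, k):
    # Keep only the k largest-magnitude entries of w; zero out the rest.
    thresh = np.partition(np.abs(w).ravel(), -k)[-k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

# Toy linear regression trained under a hard sparsity constraint.
d, k = 64, 8
true_w = np.zeros(d)
true_w[:k] = rng.normal(size=k)          # ground truth uses only k weights
X = rng.normal(size=(256, d))
y = X @ true_w

w = 0.01 * rng.normal(size=d)
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(X)    # least-squares gradient
    w = topk_mask(w - 0.05 * grad, k)    # gradient step, then re-sparsify

print(np.count_nonzero(w))               # at most k weights survive training
```

In a transformer the same constraint is applied to weight matrices rather than a single vector, which is what makes the surviving connections small enough to read off as circuits.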
Two released artifacts make the approach concrete:
The openai/circuit-sparsity model itself is a 0.4B‑parameter GPT‑2–style code model (csp_yolo2 in the paper) released under Apache 2.0 and runnable via a standard Hugging Face interface. The GitHub toolkit adds a Streamlit visualizer so researchers can explore circuits, ablate nodes, and inspect activations interactively.
Perhaps the most intriguing innovation is bridges: encoder–decoder pairs that map activations between a sparse model and a standard dense baseline. By training these bridges so that mixed sparse–dense forward passes match the original dense model, OpenAI can test whether the circuits uncovered in the sparse model have genuine counterparts inside the dense one.
This gives researchers a new handle on the age‑old question: do the clean, human‑readable circuits we find in toy or idealized setups actually correspond to structure in real, large‑scale systems?
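The bridge idea can be sketched in miniature. The toy below is a loose sketch, not OpenAI's actual bridge architecture or objective: stand‑in activations for the two models are related linearly, an encoder (sparse space to dense space) and a decoder (dense to sparse) are fit by least squares, and we check that a round trip through the sparse model's activation space approximately reproduces the dense activations, which is the property a working bridge needs.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_dense, d_sparse = 512, 32, 16

# Stand-in activations at a matching layer of each model.
# (In the real setup these come from forward passes on shared inputs.)
H_sparse = rng.normal(size=(n, d_sparse))
mix = rng.normal(size=(d_sparse, d_dense))
H_dense = H_sparse @ mix + 0.01 * rng.normal(size=(n, d_dense))

# Bridge encoder: least-squares linear map, sparse space -> dense space.
enc, *_ = np.linalg.lstsq(H_sparse, H_dense, rcond=None)
# Bridge decoder: dense space -> sparse space.
dec, *_ = np.linalg.lstsq(H_dense, H_sparse, rcond=None)

# A "mixed" pass: decode dense activations into the sparse model's space,
# re-encode them, and check the dense activations are approximately recovered.
H_mixed = (H_dense @ dec) @ enc
err = np.mean((H_mixed - H_dense) ** 2) / np.mean(H_dense ** 2)
print(err)
```

In the real system the bridges are trained (and the models are transformers, not linear maps), but the acceptance test is the same in spirit: substituting bridged activations into the dense forward pass should leave its behavior nearly unchanged.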
Nvidia’s Nemotron 3 and OpenAI’s circuit sparsity tools point in the same broad direction: more control and visibility over powerful models.
On one side, open, well‑documented foundation models like Nemotron 3 make it easier for startups, enterprises, and researchers to build specialized agents and systems without being locked into a single vendor’s black box. On the other, mechanistic interpretability work like circuit sparsity and activation bridges gives us a sharper lens on what those systems are doing under the hood—and how to steer them safely.
As AI diffuses into critical infrastructure, both levers will matter: the freedom to run and customize your own models, and the ability to understand and debug them at the circuit level.