Jensen Huang: NVIDIA and the AI Revolution

Presenter: Jensen Huang
Host Institute: Lex Fridman Podcast
Host: Lex Fridman
This post distills a conversation between Jensen Huang (CEO of NVIDIA) and Lex Fridman, recorded in March 2026 (Lex Fridman Podcast #494). The central theme: NVIDIA's rise to a $4 trillion company is not about building better chips — it's about extreme co-design across the entire computing stack, from silicon to software to supply chain, all driven by first-principles reasoning about physical limits. Full transcript available at lexfridman.com.

Extreme Co-Design

Jensen opens with a principle that defines NVIDIA’s approach: “the problem no longer fits inside one computer to be accelerated by one GPU.” Modern AI workloads span thousands of machines, which means optimization must happen across every layer simultaneously — GPUs, CPUs, memory, networking, power, cooling, and software.

This isn’t just a technical observation; it’s an organizational one. NVIDIA’s structure mirrors its product. Jensen has 60+ direct reports, almost all engineers, who participate in group problem-solving sessions rather than isolated one-on-ones. A memory specialist tunes into a discussion about thermal management because everything interconnects. The Vera Rubin rack exemplifies this: 1.3 million components, 200 suppliers, 7 chip types, all optimized together.

His design philosophy: “as complex as necessary, but as simple as possible.” Complexity is not avoided — it is managed through co-design across every boundary that traditional organizations treat as separate.

Four Types of AI Scaling

Jensen lays out a framework of four scaling laws, each compounding on the others:

  1. Pre-training scaling: skeptics predicted data would become the bottleneck. Instead, synthetic data generation dissolved this constraint — humans create ground truth, AI augments and regenerates it in a self-perpetuating cycle.

  2. Post-training scaling: continued expansion through refinement, fine-tuning, and alignment processes.

  3. Test-time scaling: Jensen rejects the notion that inference is “easy” and commoditizable. “Inference is thinking, and thinking is hard.” Reasoning, planning, and problem-solving demand immense compute. This scaling law surprised many but now drives hardware demand.

  4. Agentic scaling: agents spawning sub-agents create exponential compute requirements. One agent orchestrating multiple specialized agents scales intelligence faster than linearly.

These form interconnected loops: agents generate experiences that become training data for pre-training, refined through post-training, enhanced at test-time, then deployed agentically — perpetuating endless scaling potential. The implication for hardware: “intelligence is gonna scale by one thing, and that’s compute.”
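
To make the compounding concrete, here is a minimal illustrative sketch in Python. The growth factors are made-up placeholders, not figures from the conversation; the point is only that four multiplicative scaling regimes drive far more compute demand than any one of them alone:

```python
# Illustrative only: placeholder growth factors, not numbers from the podcast.
# Shows that the four scaling regimes multiply compute demand per "generation"
# of the loop rather than adding to it.

def compute_demand(generations: int,
                   pretrain_growth: float = 2.0,   # synthetic data keeps pre-training scaling
                   posttrain_growth: float = 1.5,  # refinement / alignment passes
                   testtime_growth: float = 3.0,   # reasoning tokens per query
                   agentic_growth: float = 4.0     # agents spawning sub-agents
                   ) -> float:
    """Relative compute demand after `generations` cycles of the loop."""
    per_generation = (pretrain_growth * posttrain_growth
                      * testtime_growth * agentic_growth)
    return per_generation ** generations

for g in range(1, 4):
    print(f"after {g} generation(s): {compute_demand(g):,.0f}x baseline compute")
```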

CUDA and the Install Base Bet

One of the most revealing segments covers the decision to put CUDA on GeForce gaming GPUs — a bet that nearly killed the company. It increased costs by 50%, consuming all gross profit on a 35% margin business. NVIDIA’s market cap dropped from ~$8 billion to $1.5 billion.

The reasoning: “install base defines an architecture. Everything else is secondary.” Jensen draws the analogy to x86 — an inelegant instruction set that dominated because developers followed users. Beautiful RISC architectures lost not on technical merit but on adoption. By embedding CUDA in millions of gaming GPUs, NVIDIA ensured researchers would discover it organically — and they did, building the first deep learning clusters from GeForce cards.

This is a pattern worth studying: Jensen repeatedly bets on creating installed ecosystems rather than optimizing isolated products. The cost is existential risk in the short term; the payoff is architectural lock-in that compounds over decades.

Speed of Light Thinking

Jensen’s decision-making framework centers on what he calls “speed of light” thinking — testing every decision against physics-based limits before considering practical trade-offs. The sequence matters: first establish what’s theoretically possible, then discuss what’s practical.

This prevents incremental thinking. If a manufacturing process takes 74 days, the natural instinct is to optimize and save 2 days. But if you first ask “what is the physical minimum?” and discover it’s 6 days, you realize the entire process needs to be reimagined, not tweaked.
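
As a sketch of the decision rule this implies, the question is always how far the current process sits from the physics-derived floor before any optimization begins. The day counts below come from the example above; the 20% threshold is an arbitrary placeholder for "close enough to the limit to tune rather than rebuild":

```python
# Sketch of "speed of light" thinking: measure against the physical floor first.

def assess_process(current_days: float, physical_floor_days: float,
                   rebuild_threshold: float = 0.2) -> str:
    gap = current_days / physical_floor_days  # how far we are from the physical limit
    if gap <= 1 + rebuild_threshold:
        return f"{gap:.1f}x the floor: tune the existing process"
    return f"{gap:.1f}x the floor: reimagine the process, don't tweak it"

print(assess_process(current_days=74, physical_floor_days=6))  # ~12.3x -> rebuild
print(assess_process(current_days=7, physical_floor_days=6))   # ~1.2x -> tune
```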

On shaping organizational belief: rather than top-down mandates, Jensen lays “bricks” of information over months or years — at GTC talks, board meetings, management sessions, company-wide communications. When a major pivot is announced (like “go all-in on deep learning” or “acquire Mellanox”), employees feel it was inevitable. The goal is 100% buy-in because everyone has been convinced incrementally, not surprised.

This is manifesting the future through systematic belief construction — arguably more powerful than charismatic leadership because it scales beyond personal presence.

Supply Chain as Strategy

NVIDIA’s growth isn’t just accelerating — it’s accelerating while increasing market share. Jensen spends substantial time educating upstream partners (TSMC, ASML, memory manufacturers) and downstream infrastructure providers about demands that don’t yet exist.

He convinced DRAM manufacturers that HBM memory — previously niche in supercomputing — would become mainstream. He persuaded them that low-power phone memory (LPDDR5) could scale to data centers. These conversations preceded actual demand by years, requiring billions in capital investments from suppliers.

The shift from DGX-1 to NVLink-72 changed where assembly happens: from data centers to the supply chain itself. Each NVLink-72 rack ships as a complete two-ton supercomputer. This requires supply chain partners to scale up manufacturing capacity at a rate measured in gigawatts per week.

Jensen’s approach to these relationships: visit partners personally, reason from first principles, draw pictures, respect their questions, build trust so they act with confidence on multi-billion-dollar bets. The supply chain is not a vendor relationship — it’s a co-engineering partnership.

Elon Musk and Colossus

Jensen offers a detailed assessment of how Elon Musk built Colossus — 200,000 GPUs deployed in Memphis in four months. He attributes this to several compounding factors:

Radical minimalism: strip everything to essentials without sacrificing core functionality. “He questions everything: Is it necessary? Does it have to be done this way? Must it take this long?”

Systems thinking: apply minimalism across all disciplines simultaneously, not just one domain.

Ground presence: Elon is physically present at the point of action, detailing cable-routing processes with engineers to eliminate errors. Being on the ground forces problem-solving that remote management cannot replicate.

Urgent leadership: personal urgency cascades through supply chains. Suppliers treat his projects as top priority because he demonstrates it relentlessly.

Jensen sees parallels to NVIDIA’s extreme co-design philosophy but notes Elon’s unique willingness to rebuild entire processes from scratch rather than incrementally improve existing ones. The Colossus timeline was not achieved through better project management — it was achieved through questioning whether the standard process should exist at all.

China's Tech Ecosystem

Jensen provides one of the more nuanced assessments of China’s tech competitiveness. “50% of the world’s AI researchers are Chinese.” He identifies several structural advantages:

Competitive federalism: 30+ provinces with mayors competing on economic metrics. This produces dozens of EV companies, AI startups, and tech ventures in ruthless internal competition that forges winners.

Knowledge diffusion: a cultural norm of sharing within extended networks — schoolmates, family connections, former colleagues. “Family first, friends second, company third” means information flows freely across organizational boundaries. Open-sourcing technology feels natural when your schoolmate at the competing company already knows how it works.

Engineering culture: strong math and science education, cultural prestige attached to engineering careers rather than law or finance. Jensen notes: “It’s a builder nation. Their leaders are engineers, ours are lawyers.”

The result is the fastest-innovating country globally in certain domains, driven by rapid knowledge diffusion, intense internal competition, and abundant top talent.

Open Source and Nemotron

NVIDIA’s open-source strategy with Nemotron 3 reflects a deeper logic than altruism:

  1. Co-design intelligence: NVIDIA conducts basic research (SSMs, conditional GANs, diffusion models) to anticipate future computing requirements. Open-sourcing models reveals architectural innovations that inform hardware design.

  2. Ecosystem activation: proprietary models serve as products; open models activate every industry, researcher, and country. NVIDIA can afford this because its business model is hardware, not model APIs.

  3. Modality diversity: “AI is not just language.” Biology AI, weather prediction, physical AI, chemistry — not everything fits language model architectures. NVIDIA doesn’t build cars but wants every car company to have access to great models; it doesn’t discover drugs but wants Eli Lilly to have world-class biology AI.

  4. Full transparency: NVIDIA open-sourced Nemotron’s weights, data, and creation methodology — not just model weights.

The strategic insight: when your business is selling the pickaxes, you want as many people mining as possible. Open-sourcing models maximizes demand for NVIDIA hardware across every domain.

Agents and Future Computing

Jensen reasons about agents through a thought experiment: what does a capable agent actually need?

  1. Access to ground truth — file systems, databases, documentation.
  2. Research capability — because no agent is omniscient.
  3. Tool use — rather than morphing into different instruments, use existing tools intelligently.

This naturally leads to agents that access file systems, read manuals, and execute code — but with security constraints. NVIDIA developed OpenClaw (agent framework) and NemoClaw (security layer) to demonstrate responsible deployment: agents can access files and execute code, but enterprise policy engines control what combinations of capabilities are available.
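
The conversation names OpenClaw and NemoClaw but does not describe their APIs, so the following is a generic, hypothetical sketch of the underlying idea: a policy engine that decides which combinations of capabilities an agent may hold at once. None of the names or rules below are real NVIDIA interfaces.

```python
# Hypothetical sketch of a capability policy engine gating an agent's tool use.
# Names (Capability, PolicyEngine) and the rules are illustrative placeholders.

from enum import Flag, auto

class Capability(Flag):
    READ_FILES = auto()
    WRITE_FILES = auto()
    EXECUTE_CODE = auto()
    NETWORK = auto()

class PolicyEngine:
    """Denies combinations of capabilities the enterprise considers unsafe."""
    def __init__(self, forbidden_combos: list[Capability]):
        self.forbidden_combos = forbidden_combos

    def allows(self, requested: Capability) -> bool:
        # A request is denied if it contains every capability of a forbidden combination.
        return not any((combo & requested) == combo for combo in self.forbidden_combos)

# Example policy: an agent may read files or execute code, but never both
# while it also has network access (to limit exfiltration risk).
policy = PolicyEngine(forbidden_combos=[
    Capability.READ_FILES | Capability.EXECUTE_CODE | Capability.NETWORK,
])

request = Capability.READ_FILES | Capability.EXECUTE_CODE
print(policy.allows(request))                       # True: no network access requested
print(policy.allows(request | Capability.NETWORK))  # False: forbidden combination
```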

Jensen says the impact of agentic computing on the future of computing is “deeply profound” — not because individual agents are revolutionary, but because agent-spawning-agent dynamics create exponential scaling of intelligence that maps directly to hardware demand. Each layer of agent orchestration multiplies compute requirements, creating a flywheel between software capability and hardware sales.

The Vera Rubin Pod

The conversation culminates with the Vera Rubin pod — what Jensen calls “the most complex computer the world has ever made”:

  • 1 pod = 1,100+ Rubin GPUs, 60 exaflops, 10 petabytes/second bandwidth
  • 1 NVLink-72 rack = 1.3 million components, 1,300 chips
  • 7 chip types, 5 rack types, 40 racks per pod
  • Nearly 1.2 quadrillion transistors
  • Target: 200 pods per week

Vera Rubin adds storage accelerators and a new Rock subsystem for agentic workloads — requirements invisible two years prior but logical through first-principles reasoning about how agents interact with data. Grace Blackwell focused on inference; Vera Rubin anticipates the agent era.

Jensen’s hardware anticipation cycle: model architectures change every 6 months, system architectures every 3 years, hardware every 3+ years. NVIDIA must predict trends 2-3 years ahead through research, industry signals, and principled reasoning — which is why the co-design philosophy is not optional but existential.

Power Grid and Graceful Degradation

Jensen identifies a non-obvious bottleneck: power availability is constrained not by generation capacity but by contractual rigidity. “99% of the time, power grid has excess power” — grids are engineered for worst-case scenarios but typically operate at ~60% capacity.

The problem is a cascade of uptime guarantees: end customers demand six-nines availability, cloud providers replicate those demands, and utilities must then guarantee impossibly high reliability. Jensen’s proposed solution is not engineering but contracting: data centers that voluntarily reduce power during infrastructure emergencies, shift workloads geographically, or accept slightly longer latency.

“We need data centers that gracefully degrade.” This uses idle grid capacity without forcing utilities to overbuild — a systems-thinking approach to what most people frame as a physics problem.
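
As a rough sketch of what such a contract could look like in scheduling terms (the thresholds and actions below are invented for illustration; the conversation only describes the principle of voluntary curtailment and geographic workload shifting):

```python
# Illustrative demand-response policy for a "gracefully degrading" data center.
# Thresholds, power targets, and actions are placeholders, not values from the podcast.

from dataclasses import dataclass

@dataclass
class GridSignal:
    utilization: float       # fraction of grid capacity currently in use, 0..1
    emergency: bool = False  # utility-declared emergency event

def plan_power(signal: GridSignal, rated_mw: float) -> dict:
    """Return a power target and workload action for the next interval."""
    if signal.emergency:
        # Contracted voluntary curtailment: shed deferrable training jobs.
        return {"power_mw": rated_mw * 0.5,
                "action": "defer batch training, keep latency-critical inference"}
    if signal.utilization > 0.9:
        return {"power_mw": rated_mw * 0.8,
                "action": "shift deferrable work to another region"}
    return {"power_mw": rated_mw, "action": "run normally on spare grid capacity"}

print(plan_power(GridSignal(utilization=0.95), rated_mw=500))
print(plan_power(GridSignal(utilization=0.60, emergency=True), rated_mw=500))
```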
