Conversation with NVIDIA's Vice President of Business: The "ChatGPT Moment" for Robotics Is Coming
Understanding today’s NVIDIA may be more challenging than ever, but how this company, which influences many AI developments, is shaping the future of AI remains worth exploring.
Signals of NVIDIA’s business expansion are becoming clear. At this year’s GTC conference, NVIDIA announced products spanning data center accelerators, rack systems, networking equipment, and multiple open-source models. Keywords such as CUDA, GPU, LPU (Language Processing Unit), AI factories, robotics, autonomous driving, and open-source models came up repeatedly in CEO Jensen Huang’s keynote. The company best known for its GPUs now looks more accurately described as a provider of AI infrastructure, or AI factories, across multiple segments.
Even within the data center accelerator segment, NVIDIA’s product lineup has diversified. Beyond the Rubin platform, an LPU has been added. As an application-specific integrated circuit (ASIC), the LPU stands in contrast to the general-purpose GPU, but after NVIDIA licensed Groq’s technology, the integration of the two chip types has begun.
Beyond the roughly 60% of its business that serves large cloud providers, NVIDIA is also venturing into the remaining, more complex 40%. Autonomous driving and robotics, both forms of physical AI, have become two key areas. To deploy physical AI, NVIDIA is developing not only hardware but also autonomous driving platforms and models.
During GTC, First Financial journalists held separate conversations with Ian Buck, NVIDIA’s Vice President of Hyperscale and High-Performance Computing, and Rev Lebaredian, Vice President of Omniverse and Simulation Technology, to interpret NVIDIA’s product strategy and the thinking behind it, discuss the trend toward chip heterogeneity and NVIDIA’s plans for physical AI, and explore why the “ChatGPT moment” for robots is arriving.
Why GPUs Still Dominate
Based on Groq’s technology, NVIDIA launched the Groq 3 and Groq 3 LPX rack at this GTC. According to the company, Groq 3 LPX, used alongside Rubin CPUs and GPUs, can increase inference throughput per megawatt by 35 times. Groq 3 LPX will be integrated into the next-generation Vera Rubin AI factory in the second half of this year.
The addition of Groq 3 means GPUs are no longer the only form of data center accelerator NVIDIA offers. How the GPU camp would respond to challenges from the ASIC camp has long been a topic of discussion. NVIDIA’s non-exclusive IP agreement with Groq at the end of last year, along with its hiring of Groq founder Jonathan Ross, President Sunny Madra, and other core team members, is widely seen as a strategic response to those market challenges. Low-latency inference is the defining feature of Groq’s LPU. So what is NVIDIA trying to achieve by adding the LPU to its product lineup?
Jensen Huang explained that token generation differs across models of different sizes. While Rubin remains the crucial platform for today’s token-production needs, new niche markets are emerging: as models grow larger and context lengths increase, inference must become very fast. Combining new chip configurations can meet these varied computational demands.
Ian Buck offered his perspective. He told reporters that the Groq 3 LPU can be seen as an “enhancement package” for Rubin: it features fast SRAM and very rapid floating-point computation. It also has limitations. Running a trillion-parameter model solely on LPUs might require dozens of racks, making large-scale deployment impractical, costly, and inefficient at the infrastructure level. But if an LPX rack lets LPU and Rubin racks work together, playing to the strengths of both chips, all attention calculations can be handled on GPUs and all matrix computations for the expert models on LPUs.
“For most current chatbots or recommendation systems, the majority of AI workloads will continue to be served by Rubin. LPU won’t replace these scenarios. But for the next generation of intelligent agents, with trillion-parameter models, hundreds of thousands of tokens in context, and speeds of thousands of tokens per second, combining these two chips becomes possible,” said Ian Buck.
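To make the division of labor concrete, here is a minimal sketch, not NVIDIA’s implementation, of the split Buck describes: attention runs on the GPU while the experts’ matrix computations run on a second accelerator. The LPU is hypothetical here and is stood in for by a second CUDA device (or the CPU when only one device is available); all module names and dimensions are illustrative.

```python
# Sketch: attention on one accelerator ("GPU"), expert matmuls on another ("LPU").
# Both are ordinary torch devices here; the "LPU" is a hypothetical stand-in.
import torch
import torch.nn as nn

gpu = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
lpu = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")  # stand-in

d_model, n_heads, n_experts = 512, 8, 4

attention = nn.MultiheadAttention(d_model, n_heads, batch_first=True).to(gpu)
experts = nn.ModuleList(
    [nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                   nn.Linear(4 * d_model, d_model))
     for _ in range(n_experts)]
).to(lpu)
router = nn.Linear(d_model, n_experts).to(lpu)

def moe_layer(x: torch.Tensor) -> torch.Tensor:
    # 1) Attention stays on the GPU.
    h, _ = attention(x.to(gpu), x.to(gpu), x.to(gpu))
    # 2) Expert matrix computations move to the second accelerator.
    h = h.to(lpu)
    weights = torch.softmax(router(h), dim=-1)              # [batch, seq, n_experts]
    expert_out = torch.stack([e(h) for e in experts], -1)   # [batch, seq, d_model, n_experts]
    out = (expert_out * weights.unsqueeze(2)).sum(-1)       # weighted mix of experts
    return out.to(gpu)  # hand the result back for the next layer's attention

tokens = torch.randn(2, 16, d_model)
print(moe_layer(tokens).shape)  # torch.Size([2, 16, 512])
```

The math is unchanged by the split; the only design question is where each tensor lives, which is why such a pairing can, in principle, be layered onto an existing serving stack.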
Other GPU makers are also experimenting with different chips in the data center. AMD, for example, announced a partnership with Meta at the end of February that includes a collaboration on semi-custom chip design. Earlier this month, AMD CEO Lisa Su explained that AI infrastructure is becoming more complex, with different workloads (training, inference, large models, small models) requiring different types of computation. “In the next stage of AI infrastructure, no single chip can do everything best; this is already a heterogeneous world. People also need to consider cost per watt and aim for high efficiency when running large AI workloads. ASICs will always have a place in meeting computational demands,” she said. Her views align with Jensen Huang’s on the cost and diversification of AI workloads.
As chips move toward heterogeneity, will ASICs pose a growing challenge to the programmable, general-purpose GPU, especially when ASIC products tailored to specific workloads offer speed and cost advantages?
Ian Buck sees this as a trade-off between meeting specific computational needs and preserving platform programmability. “We could design an ASIC for GPT-OSS and optimize it to the extreme. I believe that would be efficient. But the model and its implementation would then be fixed in silicon, ruling out further software-based optimization and making GPT-OSS less adaptable and scalable,” he explained.
He also shared that DeepSeek-R1, released a year ago, has become more efficient as the community has learned new methods to run mixture-of-experts models on GPUs. “This is possible because these chips are open and configurable. New execution methods like tensor parallelism, wide expert parallelism, pipeline parallelism, and moving from FP16 to FP8 and FP4 have emerged. The platform’s programmability enables performance improvements by multiples, allowing general GPUs to run faster, reduce costs, and increase revenue,” Buck said.
He recounted that NVIDIA’s 400 software engineers spent about four months and 1.2 million GPU hours optimizing DeepSeek-R1, achieving a fourfold performance boost through software alone. “We can tailor solutions for different workloads or even hard-code models into chips, but that risks missing opportunities for new algorithms and innovations. We found that about 95% of optimizations and techniques based on a programmable platform can be applied across the ecosystem, making future models smarter,” he added.
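As an illustration of the kind of execution method Buck refers to, the toy sketch below shows tensor (column) parallelism in plain PyTorch: a linear layer’s weight matrix is sharded across two workers, each computes its slice, and the slices are concatenated. The “workers” are simulated on one machine, nothing here reflects NVIDIA’s actual kernels, and the FP8/FP4 side of the optimization is not shown.

```python
# Toy tensor (column) parallelism: shard a weight matrix's output dimension
# across two simulated workers and gather the partial results.
import torch

torch.manual_seed(0)
d_in, d_out, batch = 256, 1024, 8

x = torch.randn(batch, d_in)
W = torch.randn(d_in, d_out)

# Reference: the full matmul on a single device.
y_full = x @ W

# Tensor parallelism: each worker holds half of the output columns.
W_shard_0, W_shard_1 = W.chunk(2, dim=1)    # each is [d_in, d_out / 2]
y_0 = x @ W_shard_0                          # computed on "worker 0"
y_1 = x @ W_shard_1                          # computed on "worker 1"
y_tp = torch.cat([y_0, y_1], dim=1)          # gather the shards

print(torch.allclose(y_full, y_tp, atol=1e-5))  # True: same result, half the weights per worker
```

Because the split is purely a software decision, the same model can later be re-sharded in a different way, which is the adaptability Buck contrasts with a fixed-function ASIC.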
Regarding whether Groq will also join the CUDA ecosystem, Buck said that while the first-generation LPU isn’t ready yet, NVIDIA plans to open its programming environment later, possibly through CUDA or other means.
Laying the Foundation for Physical AI
At this GTC, NVIDIA also announced several developments in physical AI. For robotics, NVIDIA introduced the Isaac simulation framework, Cosmos, and the Isaac GR00T open-source models for industry development, training, and deployment. Cosmos 3 is the first unified world-generation model combining synthetic environment creation, physical AI inference, and motion simulation. For autonomous driving, NVIDIA launched Alpamayo 1.5, a reasoning VLA (vision-language-action) model, to enhance vehicles’ reasoning capabilities.
Beyond hardware, NVIDIA is putting growing emphasis on the software side of deploying physical AI, including open-source efforts that go as deep as the model level.
Rev Lebaredian emphasized that open-source is more important than ever. NVIDIA has invested heavily in open research and open-source technologies, especially for physical AI, because no single company can build physical AI alone. Achieving a ChatGPT-like moment for robots requires collective contribution. As NVIDIA is at the center of AI, connecting everyone in the ecosystem, this work must start with NVIDIA.
He explained that large world models learn from the physical laws of the universe, not just language. Cosmos is open-source, allowing any company to run it on their computers for various applications. NVIDIA provides data, frameworks, and blueprints needed to create models. “We do this because we’re still far from fully realized physical AI and robotics. Open-source efforts are crucial. Many world model developers are using Cosmos for training and evaluation, turning AI into another teacher for AI,” he said.
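As a schematic of what “AI as another teacher for AI” can look like in practice, the sketch below trains a toy robot policy by behavior cloning on rollouts produced by a world model. The world model here is a random stub; Cosmos’s actual interfaces are not assumed, and every name in the snippet is hypothetical.

```python
# Schematic: a world model generates synthetic rollouts; a policy is trained on them.
# The world model is a random stub, not Cosmos; all names are hypothetical.
import torch
import torch.nn as nn

obs_dim, action_dim = 64, 8

class StubWorldModel:
    """Stand-in for a generative world model: emits (observation, action) pairs."""
    def sample_rollout(self, steps: int = 32):
        obs = torch.randn(steps, obs_dim)
        actions = torch.tanh(torch.randn(steps, action_dim))  # pretend "expert" actions
        return obs, actions

policy = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, action_dim))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
world_model = StubWorldModel()

for step in range(100):  # behavior cloning on synthetic rollouts
    obs, target_actions = world_model.sample_rollout()
    loss = nn.functional.mse_loss(policy(obs), target_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The same loop can be run in reverse for evaluation, with the world model scoring the policy’s rollouts, which is the sense in which one model becomes a teacher for another.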
Regarding the different development stages of physical AI, Lebaredian noted that for autonomous vehicles, the challenge has shifted from science to engineering—scaling up and figuring out how to get more cars on the road. For general robotics, the situation is different. Challenges include the robot’s physical form, such as the lack of good bodies, hands, sensors, actuators, motors, and batteries.
He stated that even with perfect robot bodies, they wouldn’t be used effectively without extensive engineering and programming. The industry is at a pivotal moment, with enough technology to make robot brains useful. The “ChatGPT moment” for robots is approaching. The connection between technology and application is happening now—using inference capabilities to generate data within Cosmos to train robots.