NVIDIA Corporation (NASDAQ:NVDA) Q4 2024 Earnings Call Transcript

So we’re seeing sovereign AI infrastructure being built in Japan, in Canada, in France, and so many other regions. And so my expectation is that what is being experienced here in the United States, in the West, will surely be replicated around the world, and these AI generation factories are going to be in every industry, every company, every region. And so I think this last year, we’ve seen generative AI really becoming a whole new application space, a whole new way of doing computing. A whole new industry is being formed, and that’s driving our growth.

Operator: Your next question comes from the line of Joe Moore from Morgan Stanley. Your line is open.

Joe Moore: Great. Thank you. I wanted to follow up on the 40% of revenues coming from inference. That’s a bigger number than I expected. Can you give us some sense of where that number was maybe a year before, how much you’re seeing growth around LLMs from inference? And how are you measuring that? Is that — I assume it’s in some cases the same GPUs you use for training and inference. How solid is that measurement? Thank you.

Jensen Huang: I’ll go backwards. The estimate is probably understated, but we estimated it. And let me tell you why. When you use the internet, the news, the videos, the music, the products are being recommended to you because, as you know, the internet has trillions (I don’t know how many trillions, but trillions) of things out there, and your phone is 3 inches square. And so the ability for them to fit all of that information down to such a small piece of real estate is through an amazing system called a recommender system. These recommender systems used to be all based on CPU approaches. But the recent migration to deep learning and now generative AI has really put these recommender systems directly into the path of GPU acceleration.

It needs GPU acceleration for the embeddings. It needs GPU acceleration for the nearest neighbor search. It needs GPU acceleration for the re-ranking, and it needs GPU acceleration to generate the augmented information for you. So GPUs are in every single step of a recommender system now. And as you know, the recommender system is the single largest software engine on the planet. Almost every major company in the world has to run these large recommender systems. Whenever you use ChatGPT, it’s being inferenced. Whenever you hear about Midjourney and just the number of things that they’re generating for consumers, or when you see Getty, the work that we do with Getty, and Firefly from Adobe, these are all generative models. The list goes on. And none of these, as I mentioned, existed a year ago. They are 100% new.

Operator: Your next question comes from the line of Stacy Rasgon from Bernstein Research. Your line is open.

Stacy Rasgon: Hi, guys. Thanks for taking my question. Colette, I wanted to touch on your comment that you expected the next generation of products (I assume that meant Blackwell) to be supply constrained. Could you dig into that a little bit? What is the driver of that? Why does that get constrained as Hopper is easing up? And how long do you expect that to be constrained? Do you expect the next generation to be constrained all the way through calendar ’25? When do those start to ease?

Jensen Huang: Yeah. The first thing is that, overall, our supply is improving. Our supply chain is just doing an incredible job for us, everything from, of course, the wafers, the packaging, the memories, and all of the power regulators, to transceivers and networking and cables, you name it. Consider the list of components that we ship. As you know, people think that an NVIDIA GPU is like a chip. But the NVIDIA Hopper GPU has 35,000 parts. It weighs 70 pounds. These things we’ve built are really complicated. People call it an AI supercomputer for good reason. If you ever look in the back of the data center, the cabling system is mind-boggling. It is the most dense, complex cabling system for networking the world has ever seen.

Our InfiniBand business grew 5x year over year. The supply chain is really doing fantastic supporting us. And so overall, the supply is improving. We expect the demand will continue to be stronger than our supply provides through the year, and we’ll do our best. The cycle times are improving, and we’re going to continue to do our best. However, whenever we have new products, as you know, it ramps from zero to a very large number. And you can’t do that overnight. Everything is ramped up. It doesn’t step up. And so whenever we have a new generation of products, and right now we are ramping H200s, there is no way we can reasonably keep up with demand in the short term as we ramp. We’re ramping Spectrum-X. We’re doing incredibly well with Spectrum-X.

It’s our brand-new product into the world of Ethernet. InfiniBand is the standard for AI-dedicated systems. Ethernet is just not a very good scale-out system. But with Spectrum-X, we’ve augmented, layered on top of Ethernet, fundamental new capabilities like adaptive routing, congestion control, and noise isolation or traffic isolation, so that we could optimize Ethernet for AI. And so InfiniBand will be our AI-dedicated infrastructure. Spectrum-X will be our AI-optimized networking, and that is ramping. And so with all of the new products, demand is greater than supply. That’s just kind of the nature of new products, and so we work as fast as we can to capture the demand. But overall, net-net, our supply is increasing very nicely.