Nvidia Shatters MLPerf Records with 288-GPU Powerhouse
Nvidia breaks MLPerf records with 288 GPUs as AMD, Intel pursue other goals
Nvidia just turned another page in the MLPerf scorebook, cranking out record‑setting numbers by running inference on its B200 and B300 accelerators across a 288‑GPU cluster. While the headline grabs attention, the deeper story is about how the results line up—or don’t—with the data other chipmakers are willing to share. AMD, for its part, submitted a handful of runs that let observers compare percentages directly against Nvidia’s figures.
Intel, meanwhile, has been quiet on this particular front, pursuing its own set of benchmarks. The disparity raises a simple question: when one vendor scales to a massive array and the other doesn’t, how useful are the comparisons? Analysts are watching not just the raw throughput but also the methodology behind the numbers.
That’s why the caveats around transparency, scenario coverage, and software gains matter. They cut to the heart of whether Nvidia’s leap in performance can be measured against anything AMD actually ran, or whether the gap is simply unbridgeable.
AMD's percentage comparisons against Nvidia's B200 and B300 represent the most transparent head-to-head data available, but they only apply to the models and scenarios AMD actually submitted. Nvidia's scaling results with 288 GPUs have no AMD counterpart. And Nvidia's 2.7x software improvement and AMD's 3.1x generational leap measure fundamentally different things: pure software optimization on the same hardware versus a new chip architecture.
Nvidia pushes for a new benchmark that measures real-world API performance
A step toward better comparability could come with the upcoming MLPerf Endpoints benchmark. Nvidia says in its blog post that it is driving the definition of this benchmark within the MLCommons consortium.
Nvidia’s 288‑GPU run shatters the latest MLPerf Inference v6.0 numbers, topping the benchmark that for the first time includes multimodal and video models. Yet the headline masks a more nuanced picture. AMD and Intel each spotlighted different metrics, so a straight‑line comparison is impossible.
AMD’s percentage‑based head‑to‑head data against Nvidia’s B200 and B300 chips is the most transparent slice of the results, but it only covers the models AMD actually submitted. Nvidia, on the other hand, reports a 2.7× software improvement and 288‑GPU scaling results that have no AMD counterpart in this round.
Intel’s focus diverges further, emphasizing separate goals rather than chasing the same headline figures. Consequently, the three vendors are essentially fighting on distinct fronts.
It remains unclear whether Nvidia’s scaling advantage will translate to broader workloads, or whether AMD’s and Intel’s alternative emphases will prove more relevant in future submissions. The benchmark’s expanded model set offers more data points, but the lack of common ground makes definitive conclusions about overall superiority premature.
Further Reading
- NVIDIA Extreme Co-Design Delivers New MLPerf Inference Records - NVIDIA Developer Blog
- NVIDIA Blackwell Ultra GPUs Crush MLPerf Benchmarks with 2.7x Performance Gains - Blockchair
- NVIDIA Sets MLPerf Inference v6.0 Records with Blackwell Ultra Platform - StorageReview
- AMD Delivers Breakthrough MLPerf Inference 6.0 Results - AMD Blogs
Common Questions Answered
How many GPUs did Nvidia use to break MLPerf records?
Nvidia used a 288-GPU cluster to set new performance benchmarks in MLPerf Inference v6.0. This massive GPU configuration allowed them to demonstrate unprecedented scaling and performance across multiple models, including multimodal and video models.
What makes the MLPerf Inference v6.0 benchmark unique compared to previous versions?
MLPerf Inference v6.0 is the first benchmark to include multimodal and video models, expanding the scope of performance testing beyond traditional AI workloads. This new inclusion provides a more comprehensive view of AI computational capabilities across different types of AI models and processing requirements.
Why is a direct performance comparison between Nvidia, AMD, and Intel challenging in this MLPerf benchmark?
Each company highlighted different metrics and submitted varying models, making a straightforward performance comparison impossible. For instance, Nvidia's 2.7x software improvement differs fundamentally from AMD's 3.1x generational leap, which represents a new chip architecture rather than pure software optimization.