
NVIDIA Backs Baseten with $150M AI Inference Bet

NVIDIA puts $150 million into Baseten, backing Jensen Huang’s inference‑first pivot


The $150 million cash infusion from NVIDIA into Baseten marks a concrete bet on the next phase of AI commercialization. Baseten, a startup that builds tools for running machine‑learning models at scale, lands a deep‑pocketed backer just as the industry’s focus shifts from tinkering in labs to delivering production‑grade services. Jensen Huang’s company has been quietly reshaping its roadmap, earmarking inference—running trained models for real‑world tasks—as the growth engine that will outpace the flashier training segment.

While the funding round itself is modest compared with NVIDIA’s broader portfolio, the timing is notable: enterprises are moving beyond proofs of concept, looking for dependable, high‑throughput pipelines that can handle billions of queries daily. The partnership therefore serves as a litmus test for how the chip giant plans to capture that demand.
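To put "billions of queries daily" in concrete terms, a quick back-of-envelope conversion shows the sustained per-second load such a pipeline must absorb. The daily volume and peak multiplier below are illustrative assumptions, not figures from the article:

```python
# Rough scale check: what "billions of queries daily" means per second.
queries_per_day = 2e9            # assumed illustrative figure, not from the article
seconds_per_day = 24 * 60 * 60   # 86,400

avg_qps = queries_per_day / seconds_per_day
print(f"Average load: {avg_qps:,.0f} queries/sec")

# Real traffic is bursty; peak load typically runs several times the average.
peak_multiplier = 3              # assumed
peak_qps = avg_qps * peak_multiplier
print(f"Peak load (assumed {peak_multiplier}x): {peak_qps:,.0f} queries/sec")
```

Even at the average, that is tens of thousands of model invocations per second, which is why serving infrastructure (batching, autoscaling, GPU utilization) matters as much as the model itself.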

For NVIDIA, the investment reinforces a strategic pivot championed by chief executive Jensen Huang, who has repeatedly argued that inference will ultimately become a much larger market than model training. As enterprises move from experimentation to full-scale deployment, demand for reliable and cost-efficient inference infrastructure is accelerating, placing companies like Baseten at the center of this transition. Baseten's platform is optimized for NVIDIA's latest GPU architectures, including the H100 and next-generation B200 chips.

By enabling high-performance inference workloads on these GPUs, Baseten effectively extends NVIDIA's ecosystem, helping ensure its hardware remains the default choice as AI adoption spreads across enterprises. The participation of CapitalG, Alphabet's growth fund, adds a competitive dimension, given Alphabet's own investments in AI infrastructure and model deployment. Nevertheless, the collaboration underlines the strategic importance of inference, even among industry rivals.

At a $5 billion valuation, Baseten now joins a small group of AI infrastructure startups commanding premium multiples.

Related Topics: #AI inference #NVIDIA #Baseten #Jensen Huang #GPU infrastructure #Machine learning #Production AI #Enterprise deployment #B200 chips #Model serving

Will inference truly eclipse training as the primary revenue driver? NVIDIA’s $150 million injection into Baseten, part of a $300 million round that lifted the startup’s valuation to $5 billion, signals a clear bet on that hypothesis. The financing, led by Institutional Venture Partners and CapitalG with NVIDIA’s participation, underscores the chipmaker’s aggressive shift toward inference‑first ventures.

Jensen Huang has repeatedly argued that enterprises moving from experimentation to full‑scale deployment will generate far greater demand for reliable, cost‑effective inference services. Baseten’s recent funding positions it to meet that anticipated need, yet the pace at which the market will expand remains uncertain. While the valuation more than doubles the company’s prior mark, investors have yet to see concrete revenue metrics tied to large‑scale deployment.

The move reflects NVIDIA’s broader strategy to diversify beyond model training, but whether inference will become the dominant segment is still an open question. Only forthcoming adoption data will clarify the success of this pivot.

Common Questions Answered

How does Baseten's inference stack improve AI model performance?

[baseten.co](https://baseten.co/resources/guide/the-baseten-inference-stack) explains that Baseten's inference stack optimizes every layer of AI model deployment, from hardware to software. The stack combines open-source techniques with proprietary enhancements to deliver low latency, high throughput, and cost-efficient model serving across different AI modalities.

What performance improvements did Baseten achieve with NVIDIA Blackwell GPUs?

[nvidia.com](https://www.nvidia.com/en-us/customer-stories/baseten-cloud-scaling-ai-inference/) reports that Baseten achieved 5× higher throughput for high-traffic endpoints and up to 38% faster LLM serving with NVIDIA Blackwell GPUs. The improvements enable Baseten to serve more user requests with the same GPU infrastructure and reduce latency for large language models.
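The reported gains compound in a useful way: 5× throughput shrinks the fleet needed for a fixed traffic level, while the 38% latency cut improves each individual response. The baseline figures in this sketch are assumptions chosen for illustration; only the 5× and 38% multipliers come from the article:

```python
# Back-of-envelope arithmetic for the reported Blackwell gains.
# Baseline numbers are assumptions for illustration, not from the article.
baseline_rps = 100.0         # assumed requests/sec per GPU pre-Blackwell
baseline_latency_ms = 200.0  # assumed per-request latency pre-Blackwell

# Reported: 5x higher throughput, up to 38% faster LLM serving.
blackwell_rps = baseline_rps * 5
blackwell_latency_ms = baseline_latency_ms * (1 - 0.38)

print(f"Throughput per GPU: {baseline_rps:.0f} -> {blackwell_rps:.0f} req/s")
print(f"Latency:            {baseline_latency_ms:.0f} -> {blackwell_latency_ms:.0f} ms")

# Serving the same traffic at 5x throughput needs ~1/5 the GPUs.
gpus_before = 50
gpus_after = gpus_before / 5
print(f"GPUs for equal traffic: {gpus_before} -> {gpus_after:.0f}")
```

Under these assumed baselines, the same 50-GPU workload would fit on roughly 10 GPUs, which is the cost story behind "serve more user requests with the same GPU infrastructure."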

Why did Baseten raise $150 million in its Series D funding round?

[baseten.co](https://baseten.co/blog/announcing-baseten-150m-series-d) indicates the funding was raised to push the boundaries of performant, reliable, and cost-efficient AI inference. The company aims to build infrastructure that supports production-grade AI systems with fast models, interchangeable compute, and flexible, Pythonic runtimes.