NVIDIA DGX Spark Expands Memory for AI Workloads
NVIDIA DGX Spark expands node support to four, doubling memory capacity
Why does the memory ceiling matter for autonomous AI agents? Those models chew through data fast, and a single DGX Spark board caps out at 128 GB of RAM. When researchers tried to push larger workloads, they hit a wall: the platform could only double that headroom by linking two nodes, reaching 256 GB.
That limit forced teams to split jobs or compromise on model size, slowing experiments that already run at the edge of feasibility. Expanding the cluster beyond two units changes the calculus entirely: more nodes mean not just more memory, but new ways to arrange compute, with different execution topologies that can be matched to specific tasks.
The upgrade promises a broader canvas for scaling agents that need to reason across massive state spaces, and it could reshape how labs benchmark multi‑node performance. In short, the next step opens the door to workloads that were previously out of reach, setting the stage for the announcement that follows.
Until now, NVIDIA DGX Spark has supported scaling up to two nodes, increasing the available memory from 128 GB on one node to 256 GB on two nodes. This capability has now been increased to up to four DGX Spark nodes. DGX Spark also now supports several execution topologies, each tailored to different goals through the low latency of RoCE communication enabled by ConnectX-7 NICs.
- One DGX Spark node: Ideal for low-latency, large-context-size inference, fine-tuning up to 120B parameters, and local agentic workloads
- Two DGX Spark nodes: Balanced scaling for faster fine-tuning and larger models, as well as support for up to 400B-parameter inference
- Three DGX Spark nodes in a ring: Ideal for fine-tuning larger models or small training jobs
- Four DGX Spark nodes with a RoCE 200 GbE switch: A local inference server ideal for state-of-the-art models up to 700B parameters, communication-intensive workloads, and local AI factory operations

Inference can scale up linearly on DGX Spark when internode communication is minimal. When work is largely independent per GPU, the results are aggregated once at the end rather than continuously. In this case, DGX Spark nodes can run in parallel with low synchronization overhead.
For example, a reinforcement learning (RL) workload in NVIDIA Isaac Lab can run many simulations independently on each node. Results are collected in a single step, yielding near-linear scaling across multiple DGX Spark nodes. Inference scaling is less than linear when the workload requires frequent, fine-grained communication between nodes.
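The independent-rollout pattern described above can be illustrated with a toy Python sketch. This is not Isaac Lab or DGX Spark code: `run_rollouts` and its dummy "reward" arithmetic are hypothetical stand-ins, and each local process plays the role of one node. The point is the communication shape: every worker computes on its own, and the partial results are merged exactly once at the end.

```python
from concurrent.futures import ProcessPoolExecutor

def run_rollouts(node_id: int, n_episodes: int) -> float:
    """Stand-in for an independent simulation batch on one node.

    Each worker produces its partial result with no cross-node traffic;
    the arithmetic is a dummy placeholder, not a real RL reward.
    """
    total = 0.0
    for ep in range(n_episodes):
        total += (node_id * n_episodes + ep) % 7  # placeholder "reward"
    return total

if __name__ == "__main__":
    nodes = 4
    # Each process plays the role of one DGX Spark node.
    with ProcessPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(run_rollouts, range(nodes), [1000] * nodes))
    # Single aggregation step at the end -- the only synchronization point,
    # which is why this pattern can scale near-linearly with node count.
    print(sum(partials))
```

Because synchronization happens once rather than continuously, adding more workers adds throughput almost for free; the aggregation cost stays constant.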
During LLM inference, model execution occurs layer by layer, with continuous synchronization required across nodes. Partial results from different DGX Spark nodes must be exchanged and merged repeatedly, which introduces significant communication overhead.
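By contrast, the per-layer exchange pattern can be sketched as follows. Again this is a toy illustration, not real tensor-parallel inference code: `all_reduce` and `sharded_forward` are hypothetical names, and the scalar "layers" stand in for sharded weight matrices. What it shows is that with layer-by-layer execution, the number of mandatory synchronization points grows with model depth, rather than collapsing to a single merge at the end.

```python
def all_reduce(partials):
    """Stand-in for the per-layer exchange: every node must see every
    other node's partial result before the next layer can start."""
    merged = sum(partials)
    return [merged] * len(partials)

def sharded_forward(x, layer_shards, n_nodes):
    """Toy layer-by-layer forward pass with one exchange per layer."""
    activations = [x] * n_nodes
    exchanges = 0
    for shards in layer_shards:                 # model executes layer by layer
        # Each node computes only its shard of the layer...
        partials = [a * w for a, w in zip(activations, shards)]
        # ...then all partial results must be exchanged and merged.
        activations = all_reduce(partials)      # mandatory synchronization point
        exchanges += 1
    return activations[0], exchanges

# Two "nodes", three toy layers: one exchange per layer, not one at the end.
out, n_exchanges = sharded_forward(1.0, [(0.5, 0.5), (1.0, 1.0), (2.0, 0.0)], 2)
print(out, n_exchanges)
```

Every added layer adds another round trip between nodes, which is why this regime scales less than linearly and why low-latency interconnects such as RoCE over ConnectX-7 matter here.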
The upgrade pushes DGX Spark from a two‑node ceiling to four, effectively doubling the pool of local memory that autonomous agents can draw on. Where a single node offered 128 GB and two nodes already delivered 256 GB, four nodes now promise up to 512 GB of local capacity. NVIDIA NemoClaw, part of the NVIDIA Agent Tool, is bundled with the platform, suggesting tighter integration for complex workflows.
Several execution topologies are now supported, each tailored to different workload patterns. This expansion could ease the pressure autonomous agents face when juggling long‑running tasks, multiple communication channels, and background subprocesses. Yet whether the added nodes will translate into proportionally faster or more efficient agent performance remains unclear.
The hardware boost is tangible, but the real‑world impact on AI agent productivity will depend on software orchestration and workload characteristics. As the system scales, monitoring how memory and compute resources are allocated will be essential to validate the promised benefits.
Further Reading
- Nvidia DGX Spark Now Supports Four-Node Linking and 512GB Shared RAM - Guru3D
- NVIDIA GTC 2026: Live Updates on What's Next in AI - NVIDIA Blogs
- NVIDIA DGX Spark: AI Supercomputer on Your Desk - NVIDIA
- Nvidia GTC San Jose 2026 - Level1Techs Forums
Common Questions Answered
How has NVIDIA DGX Spark expanded its node support capabilities?
NVIDIA DGX Spark has increased its node support from two to four nodes, effectively doubling the available memory from 256 GB to 512 GB. This expansion allows researchers to run larger AI workloads and more complex autonomous agent models without previous memory constraints.
What communication technology enables the new DGX Spark node configurations?
The new DGX Spark configurations leverage RoCE (RDMA over Converged Ethernet) communication enabled by ConnectX-7 NICs, which provides low-latency connections between nodes. This technology allows for multiple execution topologies that can be tailored to different computational goals and workload patterns.
What tool is bundled with the updated NVIDIA DGX Spark platform?
NVIDIA NemoClaw, part of the NVIDIA Agent Tool, is now bundled with the DGX Spark platform, suggesting enhanced integration for complex AI workflows. This tool is likely designed to help manage and optimize the expanded multi-node computational capabilities.