DGX Containers
NVIDIA DGX Spark · GB10 · ARM64 · CUDA 13
Build vertexnova/gsplat-spark and vertexnova/aiml-spark from the same PyTorch base image. Work through the tabs in order: prerequisites once, then each image, then verification and day-to-day use, with the summary and limits last. For raw Docker commands, open Docker on Spark in another tab (it uses the same tab and copy layout).
Prerequisites — do once after first boot
DGX Spark uses unified memory: the CPU and GPU share the same 128 GB pool. With swap enabled, a heavy GPU workload can trigger a death spiral: training fills memory → the OS swaps to disk → the machine freezes. Disabling swap turns a machine freeze into a clean job crash.
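To see whether swap is currently active before changing anything, the kernel reports it as SwapTotal in /proc/meminfo. The following is a minimal sketch of that check; `swap_total_kib` is a hypothetical helper name introduced here for illustration, and the script assumes a Linux host.

```python
from pathlib import Path


def swap_total_kib(meminfo_text):
    """Return the SwapTotal value (in KiB) from /proc/meminfo-style text.

    Returns None when the text has no SwapTotal line.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Line format: "SwapTotal:       8388604 kB"
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if not meminfo.exists():
        print("/proc/meminfo not found (not a Linux host?)")
    else:
        total = swap_total_kib(meminfo.read_text())
        if total is None:
            print("SwapTotal not reported")
        elif total == 0:
            print("swap is off")
        else:
            print(f"swap is active: {total} KiB")
```

Running the same script again after the steps below should report that swap is off.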
# Comment out swap in fstab
sudo nano /etc/fstab
# Change: /swap.img none swap sw 0 0
# To: # /swap.img none swap sw 0 0
# Apply immediately
sudo swapoff -a
swapon --show # should print nothing

# Confirm Docker can reach the GPU
docker run --rm --gpus all \
nvcr.io/nvidia/cuda:13.0.0-base-ubuntu24.04 nvidia-smi

All containers build on NVIDIA's official PyTorch container: Blackwell-optimised, ARM64 native, CUDA 13, PyTorch 2.7.
docker pull nvcr.io/nvidia/pytorch:25.03-py3
# Size: 21.8GB, download once, reuse for all containers

These flags are required for every container on DGX Spark. Without --ipc=host, the container's shared memory is capped at 64MB and PyTorch will crash during training.
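The 64MB figure is the size of /dev/shm in a container started without --ipc=host (or an explicit --shm-size). One way to check what a running container actually received is to measure /dev/shm from inside it. This is a minimal sketch; `total_mib` and `looks_capped` are hypothetical helper names, and the 64 MiB threshold is taken from the limit described above.

```python
import shutil

# Docker's default /dev/shm size when neither --ipc=host nor --shm-size is given.
DEFAULT_DOCKER_SHM_MIB = 64


def total_mib(path):
    """Total size, in MiB, of the filesystem that holds `path`."""
    return shutil.disk_usage(path).total / (1024 * 1024)


def looks_capped(size_mib, cap_mib=DEFAULT_DOCKER_SHM_MIB):
    """True when the shared-memory size is at or below the default cap."""
    return size_mib <= cap_mib


if __name__ == "__main__":
    try:
        size = total_mib("/dev/shm")
    except OSError:
        print("/dev/shm is not available on this system")
    else:
        status = "capped, check --ipc=host" if looks_capped(size) else "ok"
        print(f"/dev/shm: {size:.0f} MiB ({status})")
```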
# --ipc=host               shared memory, required for PyTorch
# --ulimit memlock=-1      no memory lock limit
# --ulimit stack=67108864  64MB stack
# -v ~/workspace:...       persist files
# -p 7007:7007             nerfstudio viewer
docker run --gpus all -it \
--name CONTAINER_NAME \
--ipc=host \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
-v ~/workspace:/workspace \
-p 7007:7007 \
IMAGE_NAME bash

The GB10 reports compute capability 12.1, but PyTorch 2.7 in the NVIDIA container does not recognise 12.1. Use 12.0 when building any CUDA extension from source.
export TORCH_CUDA_ARCH_LIST="12.0"
# Set this before pip install gsplat or any other CUDA extension build

vertexnova/gsplat-spark:v1
3D Gaussian Splatting workspace — gsplat 1.5.3, PyTorch 2.7, ARM64 native.
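The build container is started with the flag set from the prerequisites. Since that flag set recurs for every container in this guide, one option is to assemble the command with a small helper. The sketch below is illustrative only: `docker_run_argv` and its parameters are hypothetical names, not part of Docker, and the function simply mirrors the flags listed in this guide.

```python
import os
import shlex


def docker_run_argv(name, image, ports=(), workspace="~/workspace"):
    """Build the `docker run` argument list used throughout this guide.

    `ports` is an iterable of port numbers published as PORT:PORT.
    """
    argv = [
        "docker", "run", "--gpus", "all", "-it",
        "--name", name,
        "--ipc=host",                    # shared memory, required for PyTorch
        "--ulimit", "memlock=-1",        # no memory lock limit
        "--ulimit", "stack=67108864",    # 64MB stack
        # Expand ~ here because no shell will do it for an argument list.
        "-v", f"{os.path.expanduser(workspace)}:/workspace",
    ]
    for port in ports:
        argv += ["-p", f"{port}:{port}"]
    argv += [image, "bash"]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    cmd = docker_run_argv(
        "gsplat-build", "nvcr.io/nvidia/pytorch:25.03-py3", ports=[7007]
    )
    print(shlex.join(cmd))
```

Printing with shlex.join gives a single line that can be pasted into a shell; the equivalent raw command follows.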
docker run --gpus all -it \
--name gsplat-build \
--ipc=host \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
-v ~/workspace:/workspace \
-p 7007:7007 \
nvcr.io/nvidia/pytorch:25.03-py3 bash

python3 -c "import torch; print(torch.cuda.get_device_name(0))"
# → NVIDIA GB10

gsplat must be built from source: no pre-compiled ARM64 + CUDA 13 wheel exists on PyPI. The --no-build-isolation flag prevents pip from trying to install a conflicting PyTorch version.
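Before setting TORCH_CUDA_ARCH_LIST, it may be worth checking which architectures the installed PyTorch was built for and what the GPU reports. The sketch below is a heuristic, not an official procedure: `parse_arch` and `best_arch` are hypothetical helpers, and it assumes that PyTorch's own architecture list is a reasonable proxy for what an extension build will accept. It only considers architectures with the same major version as the device, since compiled kernels are compatible only within one major version.

```python
import re


def parse_arch(tag):
    """Turn a tag such as 'sm_120' or 'compute_90a' into (major, minor).

    Returns None for tags that do not match that pattern.
    """
    match = re.fullmatch(r"(?:sm|compute)_(\d+)(\d)[a-z]?", tag)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def best_arch(device_cc, built_tags):
    """Highest built architecture with the device's major version that does
    not exceed the device capability, formatted as 'X.Y' (None if no match)."""
    candidates = [
        cc
        for cc in map(parse_arch, built_tags)
        if cc is not None and cc[0] == device_cc[0] and cc <= device_cc
    ]
    if not candidates:
        return None
    major, minor = max(candidates)
    return f"{major}.{minor}"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed; run this inside the PyTorch container")
    else:
        if torch.cuda.is_available():
            cc = torch.cuda.get_device_capability(0)
            built = torch.cuda.get_arch_list()
            print("device capability:", cc)
            print("built architectures:", built)
            print("suggested TORCH_CUDA_ARCH_LIST:", best_arch(cc, built))
        else:
            print("CUDA is not available in this environment")
```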
export TORCH_CUDA_ARCH_LIST="12.0"
pip install git+https://github.com/nerfstudio-project/gsplat.git \
--no-build-isolation
# Compilation takes 5-10 minutes, which is normal

python3 -c "import gsplat; print('gsplat OK', gsplat.__version__)"
# → gsplat OK 1.5.3

exit
docker commit gsplat-build vertexnova/gsplat-spark:v1
docker images | grep vertexnova

vertexnova/aiml-spark:v1
CV, tabular work, light 3D mesh I/O, and JupyterLab on top of the same NVIDIA PyTorch base. NumPy, pandas, matplotlib, and scikit-learn already ship in the base image.
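Once inside the container, one way to confirm what the base image already ships, and what still needs installing, is to probe for the modules without importing them. This is a minimal sketch using the standard library; `missing_modules` is a hypothetical helper name, and the module lists are taken from the packages named in this section (scikit-learn imports as `sklearn`, opencv-python-headless as `cv2`).

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Expected to ship with the base image, according to this guide.
    base = ["numpy", "pandas", "matplotlib", "sklearn"]
    # Added by the pip install steps in this section.
    extras = ["cv2", "skimage", "imageio", "PIL", "trimesh", "plyfile"]

    for label, names in (("base image", base), ("extras", extras)):
        missing = missing_modules(names)
        print(f"{label}: missing {missing}" if missing else f"{label}: all present")
```

Running it before the pip install steps should list the extras as missing, and running it afterwards should report everything present.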
docker run --gpus all -it \
--name aiml-dev \
--ipc=host \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
-v ~/workspace:/workspace \
nvcr.io/nvidia/pytorch:25.03-py3 bash

pip install \
opencv-python-headless \
scikit-image \
imageio \
Pillow

pip install \
seaborn \
tqdm \
rich
# pandas, matplotlib, scikit-learn, numpy already in base image

pip install trimesh plyfile
# open3d skipped, no ARM64 wheel available

pip install \
jupyterlab \
ipywidgets \
gdown \
python-dotenv \
pyyaml \
requests

exit
docker commit aiml-dev vertexnova/aiml-spark:v1
docker images | grep vertexnova

Verify & launch containers
# 1. GPU driver OK
nvidia-smi
# 2. Docker GPU access OK
docker run --rm --gpus all \
nvcr.io/nvidia/cuda:13.0.0-base-ubuntu24.04 nvidia-smi
# 3. Images intact
docker images | grep vertexnova
# 4. Optional — containers you expect to stay up
docker ps | grep YOUR_SERVICE

docker run --gpus all -it \
--name gsplat-dev \
--ipc=host \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
-v ~/workspace:/workspace \
-p 7007:7007 \
vertexnova/gsplat-spark:v1 bash
# Inside — verify
python3 -c "import gsplat; print('gsplat', gsplat.__version__)"

# The build container already holds the name aiml-dev; remove it first
docker rm aiml-dev
docker run --gpus all -it \
--name aiml-dev \
--ipc=host \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
-v ~/workspace:/workspace \
-p 8888:8888 \
vertexnova/aiml-spark:v1 bash
# Inside — verify everything
python3 -c "
import torch, cv2, numpy, pandas, trimesh, plyfile
print('torch:', torch.__version__, '| GPU:', torch.cuda.get_device_name(0))
print('opencv:', cv2.__version__)
print('trimesh:', trimesh.__version__)
print('ALL OK')
"

# Inside aiml-dev container
jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root
# On your laptop browser
# http://dgx-spark.local:8888

# Check if stopped
docker ps -a | grep dev
# Restart and re-enter
docker start aiml-dev
docker exec -it aiml-dev bash
# Or one line
docker start -ai aiml-dev

exit
docker commit aiml-dev vertexnova/aiml-spark:v2
docker images | grep vertexnova

Container registry summary
Image                        Tag  Status   Description
vertexnova/gsplat-spark      v1   built    3D Gaussian Splatting workspace.
vertexnova/aiml-spark        v1   built    General AI/ML workspace.
vertexnova/nerfstudio-spark  -    planned  Full nerfstudio with COLMAP for training 3DGS from custom scenes.
vertexnova/vertexnova-spark  -    planned  C++ rendering engine development: Vulkan, CMake, SPIRV-Cross.