Argonne turns spare supercomputers into private AI inference portal

AI | May 28, 2026 at 12:45 AM

Source: Situation Publishing

TLDR: CHICAGO - Argonne has launched a private AI inference service that uses spare Aurora-era compute and runs on its Sophia and Metis clusters. Researchers get secure LLM access without sending data to public chatbots.

Key Takeaways:

Argonne runs major supercomputing clusters near Chicago, including Aurora, plus smaller AI-optimized systems for inference workloads.
The portal currently runs on Sophia, with 192 Nvidia A100 GPUs, and Metis, with 32 SambaNova SN40L accelerators, and is expanding to Nvidia GH200 and B200 systems.
Researchers can test chatbots at scale on sensitive datasets, with uses ranging from fusion plasma disruption prediction to candidate searches in particle and telescope data.

This is supercomputer muscle repurposed into something researchers can tap like a utility rather than build as a project. The real win is letting people ask LLMs hard questions while keeping their data off public chatbots.

Q&A

What bottleneck disappears when inference becomes a shared service instead of a per-group build?

Teams do not need to provision GPUs, deploy model stacks, or maintain serving systems, so experimentation time shrinks while utilization rises.

How does keeping data off public chatbots such as ChatGPT change model selection?

It pushes researchers toward models and workflows that can run in air-gapped or tightly controlled environments, making open and locally hosted options more practical.

Why does Argonne’s GPU mix matter for scientific workloads?

Different accelerators and memory sizes determine which model sizes and context lengths can run reliably for long-running analysis tasks.
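
To make the memory point concrete, here is a rough back-of-the-envelope sketch in Python. It is not Argonne's sizing method. The model shape (70 billion parameters, 80 layers, 8 key/value heads, head dimension 128), the 16-bit precision, the 90% usable-memory fraction, and the per-device memory figures are all hypothetical values chosen for illustration, and the estimate ignores activations and serving overhead.

```python
import math

GIB = 2**30  # bytes per GiB


def weights_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the model weights (2 bytes per parameter = 16-bit)."""
    return n_params * bytes_per_param / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, batch: int = 1,
                 bytes_per_value: int = 2) -> float:
    """Memory for the attention key/value cache at a given context length.

    The leading factor of 2 accounts for storing both keys and values.
    """
    values = 2 * n_layers * n_kv_heads * head_dim * context_len * batch
    return values * bytes_per_value / GIB


def min_devices(total_gib: float, device_gib: float,
                usable_fraction: float = 0.9) -> int:
    """Smallest device count whose combined usable memory covers total_gib.

    usable_fraction is an assumed allowance for runtime overhead.
    """
    return math.ceil(total_gib / (device_gib * usable_fraction))


# Hypothetical 70B-parameter model with a 32,768-token context.
total = weights_gib(70e9) + kv_cache_gib(80, 8, 128, 32_768)

# Compare two assumed per-device memory sizes (in GiB).
for device_gib in (40, 80):
    print(device_gib, "GiB per device ->", min_devices(total, device_gib), "devices")
```

Under these assumptions, halving per-device memory roughly doubles the number of accelerators a single model instance needs, and longer contexts or larger batches push the key/value cache term up linearly.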

What happens next if inference demand grows beyond spare capacity?

Argonne may need scheduling policy changes, more AI-dedicated nodes, or tighter integration with broader HPC allocations to avoid starving training jobs.

Could secure inference become a standard interface for future missions like Genesis?

If the portal proves it can handle real-time data, reproducibility, and governance, other labs and mission teams will likely want similar internal service layers.