Tech giant NVIDIA on Thursday launched a new supercomputer, claimed to be the fastest in the world for AI (artificial intelligence) workloads, at the National Energy Research Scientific Computing Center (NERSC) in California.
The new supercomputer, “Perlmutter”, is named in honor of Saul Perlmutter, an astrophysicist at Berkeley Lab who shared the 2011 Nobel Prize in Physics for the groundbreaking discovery that the expansion of the universe is accelerating.
It will deliver nearly four exaflops of AI performance for more than 7,000 researchers, which NVIDIA says makes it “the fastest system on the planet on 16-bit and 32-bit mixed-precision math AI uses”.
“Traditional supercomputers can barely handle the math required to generate simulations of a few atoms over a few nanoseconds with programs such as Quantum Espresso,” said Wahid Bhimji, the acting head of NERSC’s data and analytics services team.
“But by combining their highly accurate simulations with machine learning, scientists can study more atoms over longer stretches of time.”
The new supercomputer boasts 6,144 NVIDIA A100 Tensor Core GPUs on board and will start by helping to create the largest-ever 3D map of the visible universe to study the dark energy accelerating the cosmos’ expansion.
“In one project, the supercomputer will help assemble the largest 3D map of the visible universe to date. It will process data from the Dark Energy Spectroscopic Instrument (DESI), a kind of cosmic camera that can capture as many as 5,000 galaxies in a single exposure,” wrote Dion Harris, NVIDIA HPC & AI Product Marketing Lead, in a blog post.
“Researchers need the speed of Perlmutter’s GPUs to capture dozens of exposures from one night to know where to point DESI the next night.
“Preparing a year’s worth of data for publication would take weeks or months on prior systems, but Perlmutter should help them accomplish the task in as little as a few days.”
Perlmutter is being installed in two phases. Phase 1 includes the system’s GPU-accelerated nodes and scratch file system, while Phase 2 will add CPU-only nodes.
As AMD notes, Phase 1 is now being deployed and features 1,536 nodes, each with one AMD EPYC 7763 processor and four NVIDIA NVLink-connected A100 Tensor Core GPUs. It also includes a 35 PB all-flash Lustre file system that will provide very high-bandwidth storage.
The Phase 2 system, expected later this year, will add 3,072 CPU-only nodes, each with two AMD EPYC 7763 processors and 512 GB of memory.
Firing up an AI-optimized supercomputer “represents a very real milestone,” said Bhimji.
“AI for science is a growth area at the U.S. Department of Energy, where proof of concepts are moving into production use cases in areas like particle physics, materials science, and bioenergy,” he added.
“People are exploring larger and larger neural-network models and there’s a demand for access to more powerful resources, so Perlmutter with its A100 GPUs, all-flash file system, and streaming data capabilities is well-timed to meet this need for AI.”
Once the ambitious 3D universe map is completed, Perlmutter will help researchers learn more about dark energy, probe subatomic interactions in search of green energy sources, and advance work in astrophysics, climate science and more. It will also be used to study atomic interactions that could point the way to better batteries and biofuels.