Intel Successfully Completes Aurora Supercomputer Installation: Over 63K Data Center GPU Max & 21K Xeon CPU Max Chips

By: Muhammad Zuhair Haider Zaidi

Intel and Argonne National Laboratory have announced that blade installation in the Aurora supercomputer is complete, bringing it one step closer to full functionality.

Intel-Based Aurora Supercomputer Features 2 ExaFLOPS Computing Power, Potentially Surpassing AMD’s Frontier

The Aurora supercomputer has suffered several delays since its inception, but we may finally see it running. For those unaware, Aurora is built on Intel’s Xeon CPU Max and Data Center GPU Max series, which are expected to bring its peak performance to 2 ExaFLOPS. One of the platform’s applications will be to provide state-of-the-art generative AI models for science.

It has 10,624 nodes featuring 21,248 Xeon CPUs from the Sapphire Rapids-SP lineup and a total of 63,744 GPUs based on the Ponte Vecchio design. The system offers a peak injection bandwidth of 2.12 PB/s and a peak bisection bandwidth of 0.69 PB/s.
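The node, CPU, and GPU totals above imply a fixed number of devices per node. The short Python sketch below works that out, along with a rough per-node share of the quoted injection bandwidth. The names are illustrative, and the bandwidth conversion assumes decimal units (1 PB = 1,000,000 GB), which the article does not specify.

```python
# Totals quoted in the article.
NODES = 10_624
CPUS = 21_248
GPUS = 63_744
PEAK_INJECTION_PB_S = 2.12
PEAK_BISECTION_PB_S = 0.69

GB_PER_PB = 1_000_000  # assumption: decimal (SI) units


def per_node(total: int, nodes: int = NODES) -> int:
    """Return devices per node, failing loudly if the split is uneven."""
    count, remainder = divmod(total, nodes)
    if remainder:
        raise ValueError(f"{total} devices do not divide evenly across {nodes} nodes")
    return count


cpus_per_node = per_node(CPUS)
gpus_per_node = per_node(GPUS)

# Average share of the peak injection bandwidth per node, in GB/s.
injection_gb_s_per_node = PEAK_INJECTION_PB_S * GB_PER_PB / NODES

# How much of the aggregate injection bandwidth can cross the bisection.
bisection_ratio = PEAK_BISECTION_PB_S / PEAK_INJECTION_PB_S

print(f"CPUs per node: {cpus_per_node}")
print(f"GPUs per node: {gpus_per_node}")
print(f"Injection bandwidth per node: {injection_gb_s_per_node:.1f} GB/s")
print(f"Bisection / injection ratio: {bisection_ratio:.2f}")
```

Under these assumptions, each node works out to two CPUs and six GPUs, with a little under 200 GB/s of injection bandwidth apiece.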

Here is how the hardware behind Aurora has an edge, as previously detailed by Jeff McVeigh, VP of Intel’s Super Compute Group:

  • The Intel Data Center GPU Max Series outperforms the Nvidia H100 PCIe card by an average of 30% on diverse workloads, while independent software vendor Ansys shows a 50% speedup for the Max Series GPU over H100 on AI-accelerated HPC applications.
  • The Xeon Max Series CPU, the only x86 processor with high bandwidth memory, exhibits a 65% improvement over AMD’s Genoa processor on the High Performance Conjugate Gradients (HPCG) benchmark, using less power. High memory bandwidth has been noted as among the most desired features for HPC customers.
  • 4th Gen Intel Xeon Scalable processors – the most widely used in HPC – deliver a 50% average speedup over AMD’s Milan, and energy company BP’s newest 4th Gen Xeon HPC cluster provides an 8x increase in performance over its previous-generation processors with improved energy efficiency.
  • The Gaudi2 deep learning accelerator performs competitively on deep learning training and inference, with up to 2.4x faster performance than Nvidia A100.

For memory capacity, the Aurora supercomputer features 10.9 PB of DDR5 system DRAM, 1.36 PB of HBM capacity through the CPUs, and 8.16 PB of HBM capacity through the GPUs. Moreover, it uses an arrangement of 1,024 storage nodes providing a total capacity of 220 PB. If you’re curious about how this gigantic system will be utilized, the following is a quick explanation:

From tackling climate change to finding cures for deadly diseases, researchers face monumental challenges that demand advanced computing technologies at scale. Aurora is poised to address the needs of the HPC and AI communities, providing the necessary tools to push the boundaries of scientific exploration.
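To put the memory totals quoted earlier in perspective, the Python sketch below divides them by the node, CPU, and GPU counts to estimate capacity per device. These per-device values are inferred from the article's totals rather than stated in it, and the conversion assumes decimal units (1 PB = 1,000,000 GB); the function name is illustrative.

```python
# Totals quoted in the article.
NODES = 10_624
CPUS = 21_248
GPUS = 63_744
DDR5_TOTAL_PB = 10.9
CPU_HBM_TOTAL_PB = 1.36
GPU_HBM_TOTAL_PB = 8.16

GB_PER_PB = 1_000_000  # assumption: decimal (SI) units


def gb_per_unit(total_pb: float, units: int) -> float:
    """Spread a total capacity in PB evenly across `units`, returning GB each."""
    return total_pb * GB_PER_PB / units


ddr5_gb_per_node = gb_per_unit(DDR5_TOTAL_PB, NODES)
hbm_gb_per_cpu = gb_per_unit(CPU_HBM_TOTAL_PB, CPUS)
hbm_gb_per_gpu = gb_per_unit(GPU_HBM_TOTAL_PB, GPUS)

print(f"DDR5 per node: ~{ddr5_gb_per_node:.0f} GB")
print(f"HBM per CPU:   ~{hbm_gb_per_cpu:.0f} GB")
print(f"HBM per GPU:   ~{hbm_gb_per_gpu:.0f} GB")
```

The results come out to roughly 1 TB of DDR5 per node, about 64 GB of HBM per CPU, and about 128 GB of HBM per GPU, which suggests the quoted totals are simple multiples of per-device capacities.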

The newest Intel Data Center GPU Max Series 1550, operating on Aurora, provides the best SimpleFOMP performance, beating out the Nvidia A100 and AMD Instinct MI250X accelerators. However, the supercomputer has yet to pass preliminary testing. Once it does, it is expected to appear on the TOP500 list, potentially overtaking the AMD-powered Frontier supercomputer. Aurora is on track to be fully functional this year.