Intel-based Aurora supercomputer achieves exascale status

The exascale barrier is starting to look less like a barrier and more like a milestone. Intel, Argonne National Laboratory, and Hewlett Packard Enterprise jointly announced at the International Supercomputing Conference in Hamburg, Germany, this week that the Aurora supercomputer at Argonne had achieved exascale performance of 1.012 exaflops, or more than a quintillion calculations per second.

Aurora joins the AMD-powered Frontier supercomputer at Oak Ridge National Laboratory in Tennessee as the second machine to achieve exascale status; Frontier did so in 2022, registering 1.1 exaflops at the time. The El Capitan supercomputer at Lawrence Livermore National Laboratory in California, which uses AMD chips and is expected to be finished this year, also has exascale ambitions.

Intel, in a statement, called Aurora “the fastest AI system in the world dedicated to AI for open science, achieving 10.6 AI exaflops.” The company also said the machine will play a “crucial role” in helping further the development of open ecosystems for AI-accelerated high-performance computing (HPC).

The company also told Fierce Electronics via email, "Running at exascale doubles the amount of exascale systems available to researchers as they attempt to solve some of the world’s biggest challenges. Argonne National Laboratory’s goal is to provide a super powerful system to the world’s scientists and engineers, and we’re excited to see how science applications are taking advantage of the system." 

Aurora’s achievement has been expected for some time: the system was announced in 2015 and was delayed more than once before becoming active late last year. The system features 166 racks, 10,624 compute blades, 21,248 Intel Xeon CPU Max Series processors, and 63,744 Intel Data Center GPU Max Series units, making it one of the world's largest GPU clusters, according to Intel and Argonne.

Aurora also includes the largest open, Ethernet-based supercomputing interconnect on a single system, with 84,992 HPE Slingshot fabric endpoints. While Aurora is listed second to Frontier in the most recent edition of the TOP500 list of supercomputers, which is ranked using the LINPACK benchmark, it is notable that it achieved its exascale mark using 9,234 nodes, “only 87% of the system,” Intel said.
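The hardware counts quoted by Intel and Argonne can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only: the per-blade figures are derived by division rather than stated in the announcement, and it assumes that one "node" corresponds to one compute blade, which the source does not say explicitly.

```python
# Figures as quoted in the article (attributed to Intel and Argonne).
RACKS = 166
BLADES = 10_624
CPUS = 21_248
GPUS = 63_744
ENDPOINTS = 84_992
NODES_USED = 9_234

# ASSUMPTION (not stated in the article): one node == one compute blade.
TOTAL_NODES = BLADES


def per_blade(total: int) -> int:
    """Return total / BLADES, raising if the division is not exact."""
    quotient, remainder = divmod(total, BLADES)
    if remainder:
        raise ValueError(f"{total} is not a whole multiple of {BLADES}")
    return quotient


cpus_per_blade = per_blade(CPUS)
gpus_per_blade = per_blade(GPUS)
endpoints_per_blade = per_blade(ENDPOINTS)
blades_per_rack, leftover_blades = divmod(BLADES, RACKS)

# Share of the machine used for the exascale run, under the assumption above.
fraction_used = NODES_USED / TOTAL_NODES

print(f"CPUs per blade:      {cpus_per_blade}")
print(f"GPUs per blade:      {gpus_per_blade}")
print(f"Endpoints per blade: {endpoints_per_blade}")
print(f"Blades per rack:     {blades_per_rack} (leftover: {leftover_blades})")
print(f"Share of nodes used: {fraction_used:.1%}")
```

Under that assumption, the quoted totals divide evenly across the blades, and the share of nodes used rounds to the 87% figure Intel cited.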

Intel described Aurora as an "AI-centric" supercomputer. Asked what defines such a machine, Intel answered at length:

"It’s important to first explain the high performance LINPACK (HPL-MxP) benchmark as it is significant because it measures how well a computer system can handle heavy-duty number-crunching tasks. In simpler terms, it's like a fitness test for supercomputers. The benchmark specifically tests a computer's ability to solve complex mathematical equations quickly and efficiently. For real-world applications like weather forecasting or calculating numbers the faster and more accurately they can crunch numbers, the more useful they are for these critical applications.

"At the heart of the Aurora supercomputer is the Intel Data Center GPU Max Series. The Intel Xe GPU architecture is foundational to the Max Series, featuring specialized hardware like matrix and vector compute blocks optimized for both AI and HPC tasks. The Intel Xe architecture’s design that delivers unparalleled compute performance is the reason the Aurora supercomputer secured the top spot in the high-performance LINPACK-mixed precision (HPL-MxP) benchmark – which best highlights the importance of AI workloads in HPC.

"The Xe architecture's parallel processing capabilities excel in managing the intricate matrix-vector operations inherent in neural network AI computation. These compute cores are pivotal in accelerating matrix operations crucial for deep learning models. Complemented by Intel's suite of software tools, including Intel® oneAPI DPC++/C++ Compiler, a rich set of performance libraries, and optimized AI frameworks and tools, the Xe architecture fosters an open ecosystem for developers that is characterized by flexibility and scalability across various devices and form factors."

Ultimately, exaflops are just one measure of a supercomputer’s power. Argonne knows that Aurora's value can also be measured in other ways that are meaningful to its users.

“Aurora is fundamentally transforming how we do science for our country,” Argonne Laboratory Director Paul Kearns said in a statement. “It will accelerate scientific discovery by combining high performance computing and AI to fight climate change, develop life-saving medical treatments, create new materials, understand the universe and so much more.”

Rick Stevens, Argonne’s associate lab director for Computing, Environment and Life Sciences, added, “Aurora excels at tackling both traditional scientific computing problems and AI-powered research. As AI continues to reshape the scientific landscape, Aurora gives us a platform to develop new tools and approaches that will significantly accelerate the pace of research.”