By Charles King. Pund-IT, Inc. June 19, 2019
There are numerous ways to quantify the performance of supercomputers. In fact, Top500.org, the organization that publishes semi-annual lists ranking the world’s most powerful supercomputing installations, includes data ranging from architectures to processors to accelerators to interconnects. Vendors’ contributions to specific systems are provided, as well as how their solutions fared in terms of overall performance and share of the list. The group also ranks systems by energy efficiency (the Green500) and High Performance Conjugate Gradients (HPCG).
While it is common for specific vendors and systems to enjoy relatively stable positions at the high end of the Top500, the Green500 list tends to be more diverse and fluid. In addition, supercomputers owned by businesses are seldom seen among the 20 or 30 highest-ranked Top500 installations. That’s mainly due to high-end supercomputers’ costs and complexities, which are far easier to manage in universities and government research facilities than in real-world business settings.
These points are worth considering in light of the new Top500 and Green500 lists published this week. IBM announced that it is the only vendor to have multiple supercomputers in the top 10 of both lists – the Summit, Sierra, and Lassen systems deployed at U.S. Department of Energy (DOE) labs. Plus, IBM announced that the new Pangea III supercomputer the company built for Total Exploration, an oil and gas company with operations in 130 countries, was ranked #11 on the Top500 list and #8 on the Green500.
Let’s take a closer look at this trifecta of announcements and consider what IBM’s top ranked systems portend for supercomputing applications and customers.
The top of the Top500 and Green500
Do the DOE’s Summit and Sierra supercomputers have anything in common? Absolutely. Both leverage IBM’s AI-optimized POWER9 processors and systems, along with NVIDIA’s V100 Tensor Core GPUs and NVIDIA NVLink interconnect technologies.
Why is that important for supercomputing? IBM POWER9 systems are designed from the ground up to move data throughout the system faster and with fewer bottlenecks, helping to improve compute performance and energy efficiency. According to IBM, its memory technologies enable data transfer between the POWER9 CPUs and attached accelerators that is 9.5X faster than in comparable x86-based systems. In addition, the CPU-to-GPU NVIDIA NVLink interconnect (developed jointly by IBM and NVIDIA) supports bandwidth between IBM POWER9 CPUs and NVIDIA V100 GPUs that is 5.6X higher than in comparable x86-based systems. Those are crucial advantages in systems designed to work with and analyze massive volumes of data.
Those IBM and NVIDIA technologies are the reason that Summit and Sierra have remained highly ranked in Top500 lists since their entrance last June. But their positions among the top 10 systems on the new Green500 list are equally impressive and offer proof that at least some modern supercomputers are considerably more flexible and adaptable than many might assume.
These points are important for customers to remember when considering the current state of supercomputing. Though x86 leads the Top500 in terms of the overall number of systems and installations listed, innovative solutions from IBM and NVIDIA are leading the Top500 field by delivering both superior performance and better energy efficiency.
Total’s Pangea III
Those same performance and efficiency issues underlie the Pangea III system owned by Total Exploration. Using a competitive bidding process, Total selected IBM due to its AI-enabled, GPU-accelerated solutions. Accordingly, Pangea III was built with the same IBM POWER9 and NVIDIA architecture as the DOE’s Summit and Sierra systems. The new system delivers 25 petaflops of compute while supporting 50 petabytes of storage capacity, capabilities that are crucial given Total’s plans to utilize Pangea III for complex oil and gas exploration analysis.
Another factor is the radically higher combined compute and energy efficiency that IBM was able to deliver. According to the company, Pangea III requires just one-third as much energy as the previous Pangea supercomputer, an SGI-built system leveraging Intel Xeon E5 CPUs and InfiniBand interconnects that used 4.5 megawatts at peak. Total also noted that, combined with the increased performance of the POWER9/NVIDIA architecture, Pangea III is using less than 10% of the energy per petaflop of compute that the previous Pangea system demanded.
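For readers who want to check the arithmetic behind those efficiency claims, here is a rough sketch in Python. It uses only the figures quoted above (4.5 megawatts, one-third, 25 petaflops, 10%). The function names are illustrative, and the sketch assumes that both systems are compared at peak power draw, which the vendors' statements do not spell out. The previous system's performance is not stated here, so the sketch derives only the ceiling implied by the "less than 10%" claim rather than an actual figure.

```python
# Back-of-the-envelope check of the quoted Pangea efficiency figures.
# Assumption: all power numbers refer to peak draw for both systems.

PREVIOUS_PEAK_POWER_MW = 4.5   # previous Pangea system, peak draw (quoted)
ENERGY_DIVISOR = 3             # Pangea III needs about one-third as much
PANGEA_III_PETAFLOPS = 25.0    # compute delivered by Pangea III (quoted)
PER_PETAFLOP_RATIO = 0.10      # "less than 10%" of energy per petaflop


def pangea_iii_power_mw() -> float:
    """Estimated Pangea III power draw in megawatts."""
    return PREVIOUS_PEAK_POWER_MW / ENERGY_DIVISOR


def mw_per_petaflop(power_mw: float, petaflops: float) -> float:
    """Power required per petaflop of compute."""
    return power_mw / petaflops


def implied_previous_petaflops_ceiling() -> float:
    """Upper bound on the previous system's petaflops implied by the claim.

    If the new MW-per-petaflop figure is below 10% of the old one, the old
    figure must exceed new / 0.10, which caps the old system's performance.
    """
    new_ratio = mw_per_petaflop(pangea_iii_power_mw(), PANGEA_III_PETAFLOPS)
    min_old_ratio = new_ratio / PER_PETAFLOP_RATIO
    return PREVIOUS_PEAK_POWER_MW / min_old_ratio


if __name__ == "__main__":
    power = pangea_iii_power_mw()
    ratio = mw_per_petaflop(power, PANGEA_III_PETAFLOPS)
    print(f"Pangea III power: {power:.2f} MW")
    print(f"Pangea III power per petaflop: {ratio:.3f} MW")
    print(f"Implied ceiling for previous system: "
          f"{implied_previous_petaflops_ceiling():.2f} petaflops")
```

Under these assumptions, Pangea III draws roughly 1.5 megawatts, or about 0.06 megawatts per petaflop, and the 10% claim holds as long as the previous system delivered less than about 7.5 petaflops.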
That significant price/performance improvement should be of interest to businesses balancing the cost and benefits of supercomputing-enabled research against financial and market realities.
So, what are the essential takeaways of this trio of announcements? First and foremost, despite the myriad benefits offered by industry standard components, vendors willing to invest in innovative research and development can deliver significant performance, efficiency and other benefits.
Those can be important for customers utilizing traditional data center solutions, but they become increasingly compelling in larger systems, including supercomputer installations. The muscular POWER9-based Summit and Sierra installations that IBM built for the DOE offer proof of that.
Second, Total Exploration’s new POWER9- and NVIDIA-based Pangea III system suggests that the future of commercial supercomputing may finally be near. Despite years of rosy predictions, significant CAPEX outlays and painful energy-related OPEX have limited businesses’ interest in these solutions. With Pangea III, IBM has proven that it can successfully deliver top-line performance with bottom-line benefits.
As a result, Total should be rightfully proud of the ranking Pangea III earned in the new Top500 list. At the same time, it will be no surprise if the company is soon joined by other commercial organizations that decide to avail themselves of the help IBM POWER9- and NVIDIA-based solutions can offer in the race to the future of supercomputing.
© 2019 Pund-IT, Inc. All rights reserved.