Lenovo DCG: “Exascale for Everyone” Meets HPC for All

By Charles King, Pund-IT, Inc.  July 2, 2019

In past commentaries on supercomputing and high-performance computing (HPC), including the results of the latest Top500.org lists, I noted the benefits that dynamically flow from high-end systems to smaller businesses. In large part, that is due to the continuing upward evolution of computing technologies that drives costs down across the spectrum of IT solutions. But that process has also been hastened by specific trends, including the rise and performance of supercomputers leveraging industry standard components.

How that impacts commercial markets is evident in Lenovo’s standing in the most recent Top500 list, where the company increased its leadership position in the total number of systems listed. But the nature of those listings is equally important, as is a recent blog post by Matt Ziegler, director of HPC and AI Product Management and Marketing at Lenovo, entitled “Exascale for Everyone.” Let’s take a closer look at Lenovo’s recent Top500 listings, Ziegler’s blog and what they mean for commercial markets and businesses.

Lenovo and the Top500

What was so special about Lenovo’s standings on the June 2019 Top500.org list? For one thing, the company substantially added to its previous leadership position, moving from 140 installations on the November 2018 list to 173 on the new list.

A respectable nine of those systems are in the top 100 performing systems, including one in the top 10 and two in the top 25, and the company has 54 entries among the top 200 listings. Additionally, two Lenovo-based installations were in the top 50 systems in the Green500 list of the world’s most energy efficient supercomputers.

But what Lenovo achieved in the lower end of the new Top500 is also worth consideration. The fact is that despite the rise of systems based on commercial off-the-shelf (COTS) commodity components, supercomputing is still largely considered a realm of government labs and university facilities that have the funding necessary to support pure research efforts.

But of the 119 Lenovo systems at the lower end of the new Top500 list (the company’s 173 entries minus the 54 in the top 200), the vast majority are owned by businesses. These are IT-intensive organizations such as Internet, service and cloud providers, networking companies and hosting services, along with financial companies, energy players and software organizations.

In other words, Lenovo is an exemplar of the fact that even as supercomputing moves toward an exascale (a billion billion—i.e. a quintillion—calculations per second) future, achieving world-class performance is increasingly within the means of commercial organizations.

The journey to exascale

That was the focus of Lenovo’s Matt Ziegler in his “Exascale for Everyone” blog. He noted that when the petaflop barrier (a million billion, i.e., a quadrillion, calculations per second) was broken in 2008 by the hybrid Roadrunner supercomputer at the DOE’s Los Alamos National Lab, it was exactly eleven years to the day since the ASCI Red installation at Sandia National Labs broke the teraflop barrier (a thousand billion, i.e., a trillion, calculations per second).

So, where do things stand today, eleven years after Roadrunner? The leader on the new Top500 list, Oak Ridge National Laboratory’s Summit system, is delivering just over 200 petaflops of peak performance, while the #2, #3 and #4 listed systems are pushing somewhat more than 100 petaflops. After that, things fall way off, with the #5 listed system delivering under 40 petaflops of performance.

In other words, exascale computing remains a far distant goal. Ziegler notes that this slowdown has been well-documented within the industry and reflects a number of issues, such as the slowing and eventual extinction of Moore’s Law and the difficulty of achieving 7nm or better processor fabrication. Plus, there are daunting physical challenges, including power and space constraints, cooling capabilities, network scalability and systems management.

At the same time, the industry is doubling down on the hybrid systems approach that resulted in the 2008 Roadrunner system at Los Alamos. That approach requires vendors to closely collaborate with end customers, and contribute their expertise with designs leveraging COTS components, including x86 processors and GPUs, and open source software and tools.

As Ziegler stated, “At Lenovo, we recognize that co-development and partnerships between the best and brightest minds are needed to successfully tackle grand challenges like exascale. We believe that the approach taken in the development of Roadrunner, using commonly available components and driving performance of those technologies through collaboration and co-development with the end user, was the correct one.”

Ziegler also noted that, “The base technology in LRZ’s SuperMUC-NG (the #9 system on the June 2019 Top500 list) is available to all our customers today worldwide.” That’s a critical point that reinforces the growing value of supercomputing for businesses, and also casts light on the substantial presence of commercial organizations in Lenovo’s Top500.org listings.

Final analysis

Matt Ziegler takes things a step further in his blog, noting that, “Lenovo’s approach to ‘cascade’ computing advancements is at the core of who we are,” as is, “leveraging our deep partnerships and skills to advance computing, and then making those advancements available to all our customers…”

He concludes, “Instead of looking backwards and developing purpose-built, proprietary systems that only a few can afford, Lenovo will continue to ensure that all users reap the benefits of technology innovation as it happens. Our goal is to make exascale available to everyone.”

Is that grand vision practically achievable? No and yes. First, it must be acknowledged that exascale computing is still a distant dream, despite the plans that vendors and their public sector customers are touting to make it a reality. In addition, leading supercomputers that are delivering well below exascale performance today are well beyond the means and needs of the vast majority of today’s businesses.

At the same time, teraflop and petaflop computing were also once considered distant dreams, and systems supporting those levels of performance were nearly inconceivable. With that in mind, consider that on the new Top500.org list, the lowest ranked #500 system—a 60,000 core cluster that Lenovo built for a commercial networking business in China—is supporting over 2 petaflops of peak performance.

In other words, what is commonplace today was nearly always believed to be impossible at some earlier time. With “Exascale for Everyone,” Lenovo may be setting its sights on a far distant destination. But the company has demonstrated, time and again, that it can more than match its vision with a deep understanding of the steps required to achieve its goals.

That farsighted yet practical approach to making the impossible real has proved invaluable to Lenovo, its customers and its partners.