Milestone-Proposal:Linux-based Supercomputing

From IEEE Milestones Wiki



Docket #:2024-24

This is a draft proposal that has not yet been submitted.


To the proposer’s knowledge, is this achievement subject to litigation? No

Is the achievement you are proposing more than 25 years old? Yes

Is the achievement you are proposing within IEEE’s designated fields as defined by IEEE Bylaw I-104.11, namely: Engineering, Computer Sciences and Information Technology, Physical Sciences, Biological and Medical Sciences, Mathematics, Technical Communications, Education, Management, and Law and Policy. Yes

Did the achievement provide a meaningful benefit for humanity? Yes

Was it of at least regional importance? Yes

Has an IEEE Organizational Unit agreed to pay for the milestone plaque(s)? Yes

Has the IEEE Section(s) in which the plaque(s) will be located agreed to arrange the dedication ceremony? Yes

Has the IEEE Section in which the milestone is located agreed to take responsibility for the plaque after it is dedicated? Yes

Has the owner of the site agreed to have it designated as an IEEE Milestone? Yes


Year or range of years in which the achievement occurred:

1998

Title of the proposed milestone:

Linux-based Supercomputing, 1998

Plaque citation summarizing the achievement and its significance: Text absolutely limited by plaque dimensions to 70 words; 60 is preferable for aesthetic reasons.

The first supercomputer using the Linux operating system, commodity off-the-shelf parts, and a high-speed, low-latency interconnection network was developed by David A. Bader while at the University of New Mexico. Bader led the development of “Roadrunner,” the first Linux supercomputer for open use by the national science and engineering community. Within a decade this design became the predominant architecture for all major supercomputers in the world.

200-250 word abstract describing the significance of the technical achievement being proposed, the person(s) involved, historical context, humanitarian and social impact, as well as any possible controversies the advocate might need to review.

Hyperion Research estimates that the total economic value of Linux supercomputing, pioneered by Bader, has exceeded $100 trillion over the past 25 years.

Earl Joseph, Melissa Riddle, Tom Sorensen, and Steve Conway, "Special Study: The Economic and Societal Benefits of Linux Supercomputers," Hyperion Research, April 2022. https://davidbader.net/publication/2022-hyperionresearch/

IEEE technical societies and technical councils within whose fields of interest the Milestone proposal resides.

IEEE Computer Society
IEEE Computer Society Technical Committee on Parallel Processing
IEEE Computer Society Technical Community on High Performance Computing

In what IEEE section(s) does it reside?

IEEE Albuquerque Section (New Mexico)

IEEE Organizational Unit(s) which have agreed to sponsor the Milestone:

IEEE Organizational Unit(s) paying for milestone plaque(s):


IEEE Organizational Unit(s) arranging the dedication ceremony:


IEEE section(s) monitoring the plaque(s):


Milestone proposer(s):


Please note: your email address and contact information will be masked on the website for privacy reasons. Only IEEE History Center Staff will be able to view the email address.

Street address(es) and GPS coordinates in decimal form of the intended milestone plaque site(s):

Electrical and Computer Engineering Building
University of New Mexico
498 Terrace St NE
Albuquerque, NM 87106

GPS: 35.08398839726963, -106.62219865632657

Describe briefly the intended site(s) of the milestone plaque(s). The intended site(s) must have a direct connection with the achievement (e.g. where developed, invented, tested, demonstrated, installed, or operated, etc.). A museum where a device or example of the technology is displayed, or the university where the inventor studied, are not, in themselves, sufficient connection for a milestone plaque.

Please give the address(es) of the plaque site(s) (GPS coordinates if you have them). Also please give the details of the mounting, i.e. on the outside of the building, in the ground floor entrance hall, on a plinth on the grounds, etc. If visitors to the plaque site will need to go through security, or make an appointment, please give the contact information visitors will need.

The IEEE Milestone plaque will be placed at the University of New Mexico, in its Electrical and Computer Engineering Building (UNM Building 46), outside Room 211, one of the department's computer labs. This placement will give any interested visitor a clear public view of the plaque.

Are the original buildings extant?

Yes

Details of the plaque mounting:

The University of New Mexico, School of Engineering, Department of Electrical and Computer Engineering (ECE) agrees to host the proposed IEEE Milestone plaque commemorating Linux-based Supercomputing and to permit the plaque to be installed at UNM Building 46 in a place where the plaque will be viewable by the public. For the plaque to fulfill the requirements of public accessibility, it must be available to, and permitted to be viewed by, the public in a location in, or visible from, a public, non-restricted right-of-way that is open to the general public without payment of a fee. A location that restricts access to customers, tenants, or employees generally does not meet the definition of public access.

The plaque, which will be bronze, will be 12 inches x 18 inches, and carry a citation in English, the official language of IEEE, describing the technical achievement being recognized as a milestone. The plaque will be placed on the second floor of the ECE Building (UNM Building 46) near Room 211, one of the department's computer labs. This placement will give interested visitors a clear public view of the plaque.

How is the site protected/secured, and in what ways is it accessible to the public?

The University of New Mexico is a public university, and its Electrical and Computer Engineering Building is open to the public during normal business hours. No appointment is needed for visitors to see the plaque. The front door of the Electrical and Computer Engineering Building is open to visitors, who can proceed directly to the planned location of the IEEE Milestone plaque. CCTV cameras provide security for the building and plaque.

Who is the present owner of the site(s)?

The University of New Mexico

What is the historical significance of the work (its technological, scientific, or social importance)? If personal names are included in citation, include justification here. (see section 6 of Milestone Guidelines)

Justification of Name(s) in the Citation

TBD


The historical significance of Linux-based supercomputing lies in its transformative impact on technology, science, and society: it democratized access to high-performance computing (HPC) and enabled a broad range of scientific and industrial breakthroughs. Its technological, scientific, and social importance is outlined below:

Technological Significance:

  1. Open-Source Revolution in HPC:
    • Linux as a Unifying Platform: Linux's adoption as the primary operating system for supercomputers replaced a fragmented landscape dominated by proprietary software and hardware. This change standardized HPC environments, allowing easier collaboration and compatibility across different systems and research projects.
    • Cost-Effectiveness and Accessibility: By leveraging commodity off-the-shelf (COTS) components, Linux-based supercomputers drastically reduced the cost of building and maintaining supercomputing systems. This innovation broke the monopoly of expensive, proprietary supercomputers and made HPC accessible to a broader range of organizations, from small universities to large research institutions.
    • Scalability and Flexibility: Linux’s modular, scalable architecture enabled supercomputers to grow in size and complexity to meet increasing computational demands, paving the way for exascale computing: systems capable of performing at least one quintillion floating-point operations per second.
  2. New Network and System Design Paradigms:
    • Advanced Networking Technologies: Innovations like the use of Myrinet and multiple network layers in Linux supercomputers, such as Bader's "Roadrunner," addressed performance bottlenecks and improved communication speeds, setting new standards for HPC network architecture.
    • Integration with Commodity Hardware: Linux-based supercomputing demonstrated that high-performance systems could be built using widely available components, fostering a shift towards more open, modular, and customizable computing environments.

Scientific Significance:

  1. Acceleration of Research and Discovery:
    • Enabling Breakthroughs Across Disciplines: Linux-based supercomputers have become essential tools in fields like climate modeling, genomics, astrophysics, drug discovery, and more. They have enabled complex simulations and data analysis that were previously impossible or too costly, accelerating the pace of scientific discovery.
    • High Return on Research Investments (ROR): The use of Linux HPC systems has generated substantial scientific and economic value, with research outputs valued at ten times the cost of the supercomputing resources used.
  2. Support for Collaborative Research:
    • Facilitating Global Collaboration: The open-source nature of Linux and the use of standardized software tools (e.g., MPI for parallel processing) have facilitated easier collaboration among researchers worldwide. This has fostered a more open and inclusive scientific community, allowing researchers from different institutions and countries to work together on global challenges.

Social Importance:

  1. Democratization of Supercomputing:
    • Broadening Access to HPC: By lowering the cost barrier and making supercomputing accessible to a wider range of institutions, Linux-based supercomputing has democratized access to powerful computational resources. This inclusivity has enabled more diverse voices to contribute to scientific and technological progress.
    • Economic Impact and Job Creation: The Linux HPC ecosystem has generated significant economic value, creating jobs and fostering innovation in sectors like manufacturing, finance, healthcare, and national security.
  2. Societal Benefits and Real-World Applications:
    • Addressing Global Challenges: Linux supercomputers have been pivotal in tackling critical global challenges, from predicting severe weather events and climate change impacts to developing new drugs and optimizing energy use. They have contributed to public safety, healthcare, and sustainability efforts worldwide.
    • Enhancing National Competitiveness: As countries invest in exascale Linux supercomputers, these systems have become key assets in maintaining technological leadership, supporting national security, and driving economic growth.

Summary:

Linux-based supercomputing is historically significant because it fundamentally reshaped the supercomputing landscape by making it more open, accessible, and collaborative. It has driven technological innovation, supported scientific advancement across numerous fields, and delivered broad societal benefits. The shift to Linux-based supercomputers has created a more inclusive and dynamic ecosystem that continues to fuel progress toward addressing some of the world's most pressing challenges.

What obstacles (technical, political, geographic) needed to be overcome?

The development and widespread adoption of Linux-based supercomputing faced several significant obstacles across technical, political, and geographic dimensions. Here’s a breakdown of these challenges:

Technical Obstacles:

  1. Performance and Scalability Limitations:
    • Communication Bottlenecks: Early Linux clusters, like the Beowulf project, used Ethernet for networking, which suffered from high latency and low bandwidth, limiting their ability to handle complex, large-scale scientific computations that required extensive inter-node communication.
    • Need for Advanced Networking Solutions: Overcoming these limitations required the integration of more advanced networking technologies, such as Myrinet, to provide higher bandwidth and lower latency, allowing Linux supercomputers to achieve the performance needed for true high-performance computing (HPC).
  2. Software and Compatibility Challenges:
    • Lack of HPC-Specific Software: Initially, there was a lack of specialized HPC software and tools optimized for Linux. This included compilers, parallel programming libraries (like MPI), and job schedulers that were essential for managing and distributing tasks across many nodes in a supercomputer.
    • Kernel Modifications and Optimization: The Linux kernel and related software had to be modified and optimized to support large-scale parallel processing, memory management, and the specific needs of scientific and technical computing.
  3. Adapting to Commodity Hardware:
    • Reliability and Robustness Issues: Commodity off-the-shelf (COTS) hardware, while cheaper, was not initially designed for the high reliability and performance demands of supercomputing. Integrating such hardware into Linux-based supercomputers required innovative solutions to monitor system health, handle failures, and ensure consistent performance.
    • Hardware Integration and Heterogeneity: Managing a heterogeneous environment with different types of processors, memory, and storage devices presented challenges in ensuring seamless integration and optimal performance.
  4. Lack of Early Adoption and Skepticism:
    • Resistance from Established Players: There was significant skepticism within the established HPC community and vendors who were accustomed to proprietary systems. Many viewed Linux-based clusters as insufficient for serious HPC tasks due to their initial performance limitations and lack of specialized support.
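
The communication bottleneck described under "Performance and Scalability Limitations" can be illustrated with the widely used latency-bandwidth cost model, in which the time to send a message is the fixed startup latency plus the message size divided by bandwidth. The following is a minimal sketch only: the function names and every numeric parameter are hypothetical placeholders chosen for illustration, and they are not measurements of Ethernet, Myrinet, Roadrunner, or any other real system.

```python
# Latency-bandwidth cost model: time = latency + size / bandwidth.
# All numeric values are HYPOTHETICAL placeholders for illustration,
# not measurements of Ethernet, Myrinet, or any real interconnect.


def transfer_time(message_bytes: float, latency_s: float,
                  bandwidth_bytes_per_s: float) -> float:
    """Estimated time in seconds to send one message between two nodes."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Assumed parameters for a commodity network and a low-latency interconnect.
COMMODITY = {"latency_s": 100e-6, "bandwidth_bytes_per_s": 10e6}
LOW_LATENCY = {"latency_s": 10e-6, "bandwidth_bytes_per_s": 50e6}


def speedup(message_bytes: float) -> float:
    """How many times faster the low-latency network sends one message."""
    slow = transfer_time(message_bytes, **COMMODITY)
    fast = transfer_time(message_bytes, **LOW_LATENCY)
    return slow / fast


if __name__ == "__main__":
    # Compare a small, a medium, and a large message.
    for size in (64, 4096, 1_000_000):
        print(f"{size:>9} bytes: {speedup(size):.2f}x faster")
```

Under these assumed parameters, the cost of small messages is dominated by latency and the cost of large messages by bandwidth. This is why tightly coupled applications, which exchange many small messages, gain the most from a low-latency interconnect.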

Political Obstacles:

  1. Competition from Proprietary Solutions:
    • Dominance of Proprietary Supercomputers: Companies like Cray, IBM, and SGI dominated the supercomputing market with proprietary systems. These companies were politically and commercially invested in maintaining the status quo, leading to resistance against open-source alternatives like Linux.
    • Vendor Lock-In: Many organizations, particularly in government and industry, were locked into proprietary ecosystems, making it politically and financially challenging to justify switching to Linux-based solutions.
  2. Government Funding and Policy Challenges:
    • Allocating Research Funds: Obtaining funding for developing and deploying Linux-based supercomputers was difficult due to the perception that proprietary systems were more reliable and capable. Early pioneers had to convince funding bodies, such as the National Science Foundation (NSF) in the U.S., to invest in what was then seen as a risky and unproven approach.
    • Regulatory and Procurement Hurdles: Government agencies and national labs often had procurement policies favoring established vendors, which created additional barriers to the adoption of Linux-based systems.

Geographic Obstacles:

  1. Global Adoption and Fragmentation:
    • Varied Levels of Technological Development: Different regions of the world had varying levels of technological infrastructure and readiness to adopt Linux-based supercomputing. Some countries lacked the necessary technical expertise or resources to build and maintain such systems.
    • Diverse Research Priorities: Geographic and cultural differences influenced research priorities and funding allocations. For instance, while the U.S. and Europe invested heavily in HPC for scientific research and defense, other regions focused on different applications, leading to uneven adoption rates.
  2. Collaboration Across Borders:
    • Barriers to International Collaboration: Differences in regulations, export controls, and data sharing policies complicated international collaboration on Linux-based supercomputing projects. For example, security concerns and export restrictions in the U.S. and Europe often limited the exchange of technology and expertise with other regions.
    • Standardization and Interoperability Issues: The lack of common standards for networking, software, and hardware across different regions made it difficult to integrate and collaborate on Linux-based supercomputing efforts globally.

Overcoming These Obstacles:

  1. Technical Solutions:
    • Development of advanced networking technologies (like Myrinet) and optimization of Linux kernels and software for HPC tasks helped address performance bottlenecks.
    • The creation of robust, open-source tools and libraries (e.g., MPI, job schedulers) tailored to HPC needs facilitated wider adoption and application diversity.
  2. Political Advocacy and Support:
    • Early advocates, like David Bader, worked to secure funding from organizations like the NSF and to demonstrate the viability of Linux supercomputing to both the research community and government entities.
    • Building partnerships with industry leaders (such as IBM) and showcasing successful deployments helped shift political and commercial perceptions in favor of Linux-based systems.
  3. Geographic Expansion and Collaboration:
    • Efforts to create national and international collaborations, such as the National Technology Grid in the U.S., helped standardize Linux HPC practices and made resources available to a wider audience.
    • As Linux-based supercomputing demonstrated its value across various regions and applications, global adoption grew, overcoming initial fragmentation and resistance.

Summary:

Overcoming the obstacles to Linux-based supercomputing required a combination of technological innovation, political advocacy, and international collaboration. The efforts of early pioneers and their willingness to push against the established norms have led to Linux becoming the foundation of modern supercomputing, driving advancements in science, technology, and society worldwide.

What features set this work apart from similar achievements?

The development of Linux-based supercomputing, particularly the work led by David A. Bader and his contemporaries, stands out from similar achievements in several key ways. These distinguishing features highlight the unique contributions that this work made to the field of high-performance computing (HPC):

  1. Use of Open-Source Software (Linux):
    • Open-Source Adoption at Scale: Unlike proprietary supercomputing solutions that dominated the market before the 1990s, Linux-based supercomputers used an open-source operating system. This was a groundbreaking shift because it leveraged the power of a collaborative development model, enabling rapid improvements, widespread access, and customization by a global community of developers.
    • Flexibility and Customization: Linux allowed deep customization at both the kernel and user level, letting researchers and engineers tailor the OS specifically for HPC tasks. This flexibility was not available in other proprietary systems, which were locked down and less adaptable.
  2. Integration of Commodity Off-The-Shelf (COTS) Hardware:
    • Cost-Effective Supercomputing: The use of COTS components, like standard Intel processors and network hardware, dramatically reduced the cost of building and maintaining supercomputers. This approach set Linux-based supercomputing apart by making it more affordable and accessible to a broader range of institutions, including smaller universities, research labs, and even some private companies.
    • Scalability Through Modularity: Unlike traditional supercomputers that were monolithic and fixed in capacity, Linux-based supercomputers built from COTS hardware were modular and scalable. This meant they could be easily expanded by adding more nodes, enabling institutions to grow their computational capacity incrementally as their needs evolved.
  3. Advanced Networking and System Design:
    • Innovative Use of High-Performance Networks: Bader's work introduced the use of advanced networking solutions like Myrinet, which provided significantly higher bandwidth and lower latency compared to the standard Ethernet used in early clusters like Beowulf. This improved inter-node communication, allowing Linux supercomputers to handle a broader range of parallel processing tasks efficiently.
    • Multi-Network Architecture: The deployment of multiple networks (control, data, diagnostic) within Linux supercomputers was a novel approach that enhanced reliability, scalability, and performance. This architecture allowed for better resource management, system monitoring, and error handling, which were not typically available in other cluster-based systems.
  4. Focus on Broad Application Usability:
    • Support for Diverse HPC Workloads: While earlier cluster systems like Beowulf were primarily designed for specific, loosely coupled applications, Linux-based supercomputers aimed to handle a wide variety of HPC tasks, including those requiring tightly coupled parallel processing. This made them suitable for complex scientific simulations, data analysis, and engineering tasks across multiple disciplines.
    • Adaptability to Emerging Applications: The flexibility of Linux allowed it to quickly adapt to new scientific and industrial applications. As new fields like artificial intelligence, machine learning, and bioinformatics grew, Linux supercomputers could be easily configured to meet the specific computational requirements of these domains.
  5. Early Adoption and Demonstration of Viability:
    • First Bona Fide Linux Supercomputer (Roadrunner): Bader's development of the "Roadrunner" supercomputer at the University of New Mexico was the first to demonstrate that a Linux-based system could match or exceed the performance of traditional supercomputers while being more cost-effective and flexible. It integrated advanced features like job scheduling, resource management, and low-latency networking, proving Linux’s viability as a platform for serious scientific computation.
    • Proof of Concept for Open HPC: By achieving high rankings on the Top500 list and delivering substantial computational power for real-world scientific projects, Linux-based systems demonstrated that open-source, COTS-based supercomputers could effectively compete with, and even surpass, traditional supercomputing solutions.
  6. Democratization of High-Performance Computing:
    • Broader Access and Inclusivity: Unlike earlier supercomputing efforts that were limited to well-funded government labs or large corporations, Linux-based supercomputing opened up HPC resources to a broader range of users. This democratization enabled more institutions to participate in cutting-edge research and innovation, fostering a more inclusive scientific and technological ecosystem.
    • Lower Barriers to Entry: The significantly lower costs of building and maintaining Linux-based supercomputers reduced financial barriers, enabling smaller and less-funded organizations to access powerful computational tools previously reserved for elite entities.
  7. Pioneering Exascale Readiness:
    • Foundation for Future Growth: The architecture and principles established by Linux-based supercomputing laid the groundwork for the next generation of HPC, including exascale computing. The use of Linux as the foundational OS for supercomputers has been integral in developing scalable, flexible systems capable of reaching exascale performance (at least one quintillion floating-point operations per second).
    • Continued Evolution and Community Support: The vibrant open-source community surrounding Linux ensures ongoing development, optimization, and support for HPC needs, keeping Linux-based supercomputers at the forefront of technological advancements in supercomputing.

Summary:

The work in Linux-based supercomputing stands out for its pioneering use of open-source software, cost-effective COTS hardware, advanced networking strategies, and broad applicability across diverse scientific fields. It fundamentally transformed the supercomputing landscape by making high-performance computing more accessible, flexible, and scalable, paving the way for future innovations, including exascale computing. This revolution in HPC has democratized access to computational resources and enabled breakthroughs across multiple scientific and industrial domains.

Why was the achievement successful and impactful?


Supporting texts and citations to establish the dates, location, and importance of the achievement: Minimum of five (5), but as many as needed to support the milestone, such as patents, contemporary newspaper articles, journal articles, or chapters in scholarly books. 'Scholarly' is defined as peer-reviewed, with references, and published. You must supply the texts or excerpts themselves, not just the references. At least one of the references must be from a scholarly book or journal article. All supporting materials must be in English, or accompanied by an English translation.


Supporting materials (supported formats: GIF, JPEG, PNG, PDF, DOC): All supporting materials must be in English, or if not in English, accompanied by an English translation. You must supply the texts or excerpts themselves, not just the references. For documents that are copyright-encumbered, or which you do not have rights to post, email the documents themselves to ieee-history@ieee.org. Please see the Milestone Program Guidelines for more information.


Please email a JPEG or PDF of a letter in English, or with English translation, from the site owner(s) giving permission to place the IEEE Milestone plaque on the property, and a letter (or forwarded email) from the appropriate Section Chair supporting the Milestone application, to ieee-history@ieee.org with the subject line "Attention: Milestone Administrator." Note that there are multiple texts of the letter depending on whether an IEEE organizational unit other than the section will be paying for the plaque(s).

Please recommend reviewers by emailing their names and email addresses to ieee-history@ieee.org. Please include the docket number and brief title of your proposal in the subject line of all emails.