Milestone-Proposal talk:Linux-based Supercomputing

From IEEE Milestones Wiki

Advocates and reviewers will post their comments below. In addition, any IEEE member can sign in with their ETHW login (different from IEEE Single Sign On) and comment on the milestone proposal's accuracy or completeness as a form of public review.

-- Administrator4 (talk) 12:28, 30 September 2024 (UTC)

Advocates’ Checklist

  1. Is the proposal for an achievement rather than for a person? Yes. If the citation includes a person's name, have the proposers provided the required justification for inclusion of the person's name? Yes.
  2. Was the proposed achievement a significant advance rather than an incremental improvement to an existing technology? Yes.
  3. Were there prior or contemporary achievements of a similar nature? There were supercomputers prior to this accomplishment, but none of them used Linux as discussed in the proposal. If so, have they been properly considered in the background information and in the citation? Yes, non-Linux supercomputer environments are discussed.
  4. Has the achievement truly led to a functioning, useful, or marketable technology? Absolutely.
  5. Is the proposal adequately supported by significant references (minimum of five) such as patents, contemporary newspaper articles, journal articles, or citations to pages in scholarly books? Yes. At least one of the references should be from a peer-reviewed scholarly book or journal article. Reference 1 is a peer-reviewed scholarly journal article. The full text of the material, not just the references, shall be present. If the supporting texts are copyright-encumbered and cannot be posted on the ETHW for intellectual property reasons, the proposers shall email a copy to the History Center so that it can be forwarded to the Advocate. If the Advocate does not consider the supporting references sufficient, the Advocate may ask the proposer(s) for additional ones.
  6. Are the scholarly references sufficiently recent? Yes.
  7. Does the proposed citation explain why the achievement was successful and impactful? Yes.
  8. Does the proposed citation include important technical aspects of the achievement? Yes.
  9. Is the proposed citation readable and understandable by the general public? Yes.
  10. Will the citation be read correctly in the future by only using past tense? Yes. Does the citation wording avoid statements that read accurately only at the time that the proposal is written? Yes.
  11. Does the proposed plaque site fulfill the requirements? Yes.
  12. Is the proposal quality comparable to that of IEEE publications? Yes.
  13. Are any scientific and technical units correct (e.g., km, mm, hertz, etc.)? N/A. Are acronyms correct and properly upper-cased or lower-cased? N/A. Are the letters in any acronym explained in the title or the citation? N/A.
  14. Are date formats correct as specified in Section 6 of Milestones Program Guidelines (Helpful Hints on Citations, plaque locations)? Yes.
  15. Do the year(s) appearing in the citation fall within the range of the year(s) included at the end of the title? Yes.
  16. Note that it is the Advocate's responsibility to confirm that the independent reviewers have no conflict of interest (e.g., that they do not work for a company or a team involved in the achievement being proposed, that they have not published with the proposer(s), and have not worked on a project related to the funding of the achievement). An example of a way to check for this would be to search reviewers' publications on IEEE Xplore. Done.

Independent Expert Reviewers’ Checklist

  1. Is suggested wording of the Plaque Citation accurate?
  2. Is evidence presented in the proposal of sufficient substance and accuracy to support the Plaque Citation?
  3. Does proposed milestone represent a significant technical achievement?
  4. Were there similar or competing achievements? If so, have the proposers adequately described these and their relationship to the achievement being proposed?
  5. Have proposers shown a clear benefit to humanity?


In answering these questions, the History Committee asks that independent expert reviewers apply a similar level of rigor to that used to peer-review an article, or evaluate a research proposal. Some elaboration is desirable. Of course the Committee would welcome any additional observations that you may have regarding this proposal.

Submission and Approval Log

Submitted date: 26 January 2025
Advocate approval date: 12 February 2025
History Committee approval date:
Board of Directors approval date:

Expert Review #1: Steven J. Wallach -- Bberg (talk) 23:35, 24 January 2025 (UTC)

  1. Is the suggested wording of the Plaque Citation accurate? Yes.
  2. Is the evidence presented in the proposal of sufficient substance and accuracy to support the Plaque Citation? Yes.
  3. Does the proposed milestone represent a significant technical achievement? Yes.
  4. Were there similar or competing achievements? Yes, but the Cray Research, SGI, IBM, Thinking Machines, and Fujitsu platforms each used its own operating system. If so, have the proposers adequately described these and their relationship to the achievement being proposed? Yes.
  5. Have the proposers shown a clear benefit to humanity? Yes, particularly with Reference 6.


In 1982, Steve Wallach co-founded Convex Computer Corp. to compete in the supercomputer industry, and he served as its CTO. Hewlett-Packard bought Convex in 1995, and Wallach became CTO of HP’s Enterprise Systems Group. For this work, Wallach was the 2008 recipient of IEEE Computer Society’s Seymour Cray Computer Engineering Award “for contribution to high-performance computing through design of innovative vector and parallel computing systems, notably the Convex mini-supercomputer series, a distinguished industrial career and acts of public service.”

Wallach was cited by HPCwire (the high-performance computing information nexus) as one of its 35 HPC legends for being “instrumental in creating the mini-supercomputer system category, bringing supercomputing to a wider audience.” Wallach served as a consultant to the U.S. Department of Energy’s ASC Program at Los Alamos National Laboratory (1998-2007), and he is currently a Visiting Scientist there. He is a member of the National Academy of Engineering, an IEEE Fellow “for contributions to high performance computing,” and was the 2002 recipient of the IEEE Computer Society’s Charles Babbage Award.

Expert Review #2: Dr. Larry Smarr -- Bberg (talk) 00:10, 25 January 2025 (UTC)

  1. Is suggested wording of the Plaque Citation accurate? Yes.
  2. Is evidence presented in the proposal of sufficient substance and accuracy to support the Plaque Citation? Yes.
  3. Does proposed milestone represent a significant technical achievement? Yes.
  4. Were there similar or competing achievements? No. If so, have the proposers adequately described these and their relationship to the achievement being proposed? Yes.
  5. Have proposers shown a clear benefit to humanity? Yes.


Larry Smarr is a national leader in scientific computing and Internet cyberinfrastructure. After earning his Physics PhD from the University of Texas at Austin in 1975, he was a postdoctoral fellow at Princeton, Harvard, and Yale, and in 1979 became a Professor of Physics and of Astronomy at the University of Illinois Urbana-Champaign (UI-UC). He then carried out pioneering computational research on the dynamics of black holes and astrophysical jets.

While at UI-UC, Smarr founded the National Center for Supercomputing Applications (NCSA), serving as its Director from 1985 to 2000. During the 1990s the WWW browser/server NCSA Mosaic (which led to Netscape Navigator, Microsoft’s Internet Explorer, and the Apache Web Server) originated at NCSA. He has continued to provide national leadership in networked cyberinfrastructure (CI), serving over the last two decades as PI on multiple NSF CI research grants. These were unified in 2023 to form the National Research Platform (NRP), NSF’s largest distributed academic AI/Machine Learning/Data Science CI, which is now led by the San Diego Supercomputer Center.

Since becoming a Professor in the Department of Computer Science and Engineering at UC San Diego in 2000, he served until 2020 as founding Director of the California Institute for Telecommunications and Information Technology (Calit2), a UC San Diego/UC Irvine partnership. Dr. Smarr is a member of the National Academy of Engineering, a Fellow of the American Physical Society, the American Association for the Advancement of Science, and the American Academy of Arts and Sciences, and an IEEE Fellow “for contributions to supercomputing and metacomputer cyberinfrastructure.” In 2006, he received the IEEE Computer Society Tsutomu Kanai Award for his lifetime achievements in distributed computing systems. For 35 years he has served on top-level advisory committees to NSF, DOE, NASA, and NIH, and he is now a UCSD Distinguished Professor Emeritus.

COTS -- Coronath (talk) 00:21, 8 February 2025 (UTC)

The term COTS is defined in the citation as "Commercial Off The Shelf", but then later in multiple places COTS is referred to as commodity-based off-the-shelf. Edit to be consistent.

Re: COTS -- Bberg (talk) 23:53, 8 February 2025 (UTC)

Thanks for this comment, and this point had actually already been brought to my attention. I was working with the proposer to address it around the time that you entered this comment, and this clarification paragraph has now been added at the very start of the "historical significance" section:

Please note that the acronym COTS appears in the Milestone citation and the supporting information, and that the "C" is used to indicate either "commercial," "commodity," or "consumer." As there is no consequential difference in their usage herein, all three interpretations are correct and consistent with each other.

Brian Berg, Advocate for this proposal

Re: Re: COTS -- Amy Bix (talk) 20:04, 19 February 2025 (UTC)

Is there some particular reason why the acronym "COTS" needs to be included in the citation at all? Usually the reason to use an acronym is to make subsequent references more concise - but that isn't what's happening here. I think we can save a word by leaving out the acronym - the meaning comes through fine just saying, "Roadrunner, the first supercomputer using the Linux operating system and commercial off-the-shelf parts, was developed...."

Support of Milestone Proposal -- Jbart64 (talk) 20:43, 21 February 2025 (UTC)

I read the proposal and the discussion, and I support this milestone. I agree that the acronym COTS is not needed. Dave Bart

Re: Support of Milestone Proposal -- Bberg (talk) 00:52, 22 February 2025 (UTC)

I worked with the proposer and suggested the inclusion of "(COTS)" since this acronym is used for multiple wording variants on the "C" as shown in the supporting information ("commercial," "commodity," or "consumer"), and in particular since it is often used in the literature to refer to this term. As such, even though the acronym is not used elsewhere in the citation, I feel that "(COTS)" should remain in the citation.

Who was first? -- Dmichelson (talk) 09:59, 26 February 2025 (UTC)

I have grave concerns that we're setting a bad precedent by allowing an individual to nominate their own achievement.

Also, we need to acknowledge that others claim to have developed the world's first Linux-powered supercomputer.

From Linux Journal,

https://www.linuxjournal.com/content/linux-and-supercomputers

"In June 1998, that change arrived in the form of Linux. Known as the "Avalon Cluster", the world's first Linux-powered supercomputer was developed at the Los Alamos National Laboratory for the (comparatively) tiny cost of $152,000."

More details at

https://docs.huihoo.com/hpc-cluster/avalon/

Avalon was co-winner of the 1998 Gordon Bell Price/Performance Prize.

This matter needs to be resolved before this Milestone can go forward.

Re: Who was first? -- John Vardalas (talk) 20:26, 26 February 2025 (UTC)

David Michelson’s comments give me pause. As regards setting a precedent, I understand David Michelson’s concern. On the other hand, I do not know of any rules that prohibit self-nomination.

Re: Re: Who was first? -- Bberg (talk) 20:44, 26 February 2025 (UTC)

Please note that the webpage has been updated with extensive detailed information comparing Roadrunner with the Avalon project. As such, this should erase any questions about the importance of Roadrunner and this proposal.

I was not initially aware that the proposer would be named in the citation. We currently have no rule preventing this, but I have asked Dave Michelson to look into this as I agree that we should have a rule against this. However, this should not prevent this proposal and citation from being able to be accepted.

Brian Berg

Re: Re: Re: Who was first? -- Dmichelson (talk) 07:11, 27 February 2025 (UTC)
The extensive comparison has omitted that Avalon was awarded a 1998 Gordon Bell price/performance prize for significant achievement in parallel processing.

Avalon is a 140 processor Alpha/Linux Beowulf cluster constructed entirely from commodity personal computer technology and freely available software. Computational Physics simulations performed on Avalon resulted in the award of a 1998 Gordon Bell price/performance prize for significant achievement in parallel processing. Avalon ranked as the 113th fastest computer in the world on the November 1998 TOP500 list, obtaining a result of 47.8 Gigaflops on the parallel Linpack benchmark.

https://aric.hagberg.org/papers/warren-1999-avalon.pdf

I get that:

- Roadrunner's Myrinet-based inter-node communication was more sophisticated than Avalon's 
- Roadrunner's precedents influenced later work

However,

- Myrinet did not appear to be COTS:

Bader also partnered with Myricom's president and CEO Chuck Seitz to incorporate the first Myrinet interconnection network for Intel/Linux systems.

...my system design maximized performance per price per megaFLOP, and used both mass market commodity components and proprietary software and networks. Beowulf used only Ethernet for the system area network, and I engineered the first use of a proprietary scalable network, Myrinet, in a Linux system..

- Myrinet's influence rapidly declined.

"According to Myricom, 141 (28.2%) of the June 2005 TOP500 supercomputers used Myrinet technology. In the November 2005 TOP500, the number of supercomputers using Myrinet was down to 101 computers, or 20.2%, in November 2006, 79 (15.8%), and by November 2007, 18 (3.6%), a long way behind gigabit Ethernet at 54% and InfiniBand at 24.2%."

"In the June 2014 TOP500 list, the number of supercomputers using Myrinet interconnect was 1 (0.2%).[7][8]"

"In November, 2013, the assets of Myricom (including the Myrinet technology) were acquired by CSP Inc.[9] In 2016, it was reported that Google had also offered to buy the company.[10]"

https://en.wikipedia.org/wiki/Myrinet

I appreciate that Roadrunner was important, but can only support the following modified citation, which I believe captures the essence of Bader's claims and the timelines that Bader has reported in previous articles, e.g.,

https://davidbader.net/publication/2021-b/2021-b.pdf

NSF and NCSA, led by Larry Smarr, made a high risk, high payoff bet in my vision of the first Linux supercomputer widely available to national science communities by allocating $400,000, based on demonstrations of my 1998 16-processor Linux machine prototype. I assembled a team and we built Roadrunner, which entered production mode in April 1999.


Linux-based Supercomputing, 1998-99

Roadrunner, one of the first supercomputers based upon the Linux operating system and commercial off-the-shelf parts, demonstrated the value of incorporating high-performance system-area networks into Linux-based high-performance computing clusters. Developed by David A. Bader at the University of New Mexico in 1998-99, its cost-effective design, computational performance, and use of open-source code enabled important new scientific and industrial applications and influenced later work.

[63 words]


We need to be mindful that Bader is asking IEEE to endorse his claims, and that we risk having others, e.g., LANL, coming after us (not him) if the claims that we endorse are too broad.

I hope that the above represents a good compromise.

Re: Re: Re: Re: Who was first? -- Dmichelson (talk) 05:25, 28 February 2025 (UTC)
Alex Magoun has advised the proposer that Reference 1, an article written by the proposer about his own work, cited by the proposer as evidence of the validity of his claims, and listed as peer-reviewed, is not actually peer-reviewed.

Note also that your Anecdote in IEEE Annals of the History of Computing is not a peer-reviewed article. Anecdotes appear in a non-peer reviewed Department in the magazine that I edit, unlike the peer-reviewed articles in each issue.

Alex has advised the proposer that:

Given the continuities of history and the ways people have stood on the shoulders of others going back to the mists of time, I and other staff historians always have misgivings about certifying anything as a first. The founders of IEEE's Milestone program understood that, and established Milestones as a means of stating simply that something important happened on or near the site of the plaque, on a regional, national, or international basis, rather than as a bronze version of an IEEE award or a patent.

I fully support recognizing Bader's work as a Milestone. It was clearly important and influential. We do, however, need to be very careful about endorsing claims of priority that are based on matters of opinion and which could be successfully challenged by others holding different opinions.

Re: Who was first? -- Coronath (talk) 00:54, 28 February 2025 (UTC)

Good discussion on this proposal. I would like to hear final thoughts this weekend.

Re: Re: Who was first? -- Bberg (talk) 19:45, 28 February 2025 (UTC)

David Bader has revised the proposal so that Reference 1 is now a peer-reviewed paper. It was an error on my part to say that the earlier Reference #1 was peer reviewed. He has also added a new section ("Early Commodity-Based Parallel Systems: Mid-1980s-1997") that shows that he built a DEC Alpha cluster in 1996-1997, which is more than a year prior to Los Alamos's June 1998 release of Avalon.

Brian Berg

Re: Re: Re: Who was first? -- Bberg (talk) 16:30, 1 March 2025 (UTC)

I respectfully note that some points raised by Dave Michelson are not correct, as it can sometimes be difficult to find accurate information online. To address this, new information has been added to the "What features set this work apart from similar achievements?" section:

(1) Myrinet did not appear to be COTS
This is incorrect, as detailed in the "Bader's Roadrunner: The First True Linux Supercomputer (1998-1999)" subsection.

(2) Myrinet's influence rapidly declined
I'm not sure how this is relevant since we are citing a point in time for the accomplishment.

(3) Avalon won the Gordon Bell Award
Avalon actually came in second, as detailed in the "Los Alamos National Laboratory's Avalon: June-November 1998" subsection.

(4) Gordon Bell praised the work that David Bader had done, as shown in an email sent by Gordon to David included in the "Quotes from Supercomputing Experts" subsection.

Brian Berg