Milestone-Proposal:Theories on Neural Networks


Docket #: 2024-15

This proposal has been submitted for review.


To the proposer’s knowledge, is this achievement subject to litigation? No

Is the achievement you are proposing more than 25 years old? Yes

Is the achievement you are proposing within IEEE’s designated fields as defined by IEEE Bylaw I-104.11, namely: Engineering, Computer Sciences and Information Technology, Physical Sciences, Biological and Medical Sciences, Mathematics, Technical Communications, Education, Management, and Law and Policy? Yes

Did the achievement provide a meaningful benefit for humanity? Yes

Was it of at least regional importance? Yes

Has an IEEE Organizational Unit agreed to pay for the milestone plaque(s)? Yes

Has the IEEE Section(s) in which the plaque(s) will be located agreed to arrange the dedication ceremony? Yes

Has the IEEE Section in which the milestone is located agreed to take responsibility for the plaque after it is dedicated? Yes

Has the owner of the site agreed to have it designated as an IEEE Milestone? Yes


Year or range of years in which the achievement occurred:

1989

Title of the proposed milestone:

Convolutional Neural Networks, 1989

Plaque citation summarizing the achievement and its significance; if personal name(s) are included, such name(s) must follow the achievement itself in the citation wording: Text absolutely limited by plaque dimensions to 70 words; 60 is preferable for aesthetic reasons.

In 1989, research on computational technologies at Bell Laboratories helped establish deep learning as a branch of Artificial Intelligence. Key efforts led by Yann LeCun developed the theory and practice of Convolutional Neural Networks, which included methods of backpropagation, pruning, regularization, and self-supervised learning. Named LeNet, this Deep Neural Network architecture advanced developments in computer vision, handwriting recognition, and pattern recognition.

200-250 word abstract describing the significance of the technical achievement being proposed, the person(s) involved, historical context, humanitarian and social impact, as well as any possible controversies the advocate might need to review.

By the mid-1980s, the basic notions of regularization, backpropagation, and the multi-scale averaging method known as convolution had been conceived by various researchers, including a small group at the University of Toronto seeking to imitate vision in biological systems. However, these methods typically required computational resources far beyond what was then available. After a brief postdoctoral position at Toronto, Yann LeCun joined Bell Labs, the world’s leading communications R&D center at the time. It was in this setting, and under the guidance of Larry Jackel, that the first practical convolutional neural network, later named LeNet, was designed and implemented by LeCun and his collaborators for handwriting recognition. This was the first non-trivial classification system able to perform computer vision using the deep neural network (DNN) architecture. Remarkably, almost all of the key computational components taken for granted in the use of DNNs for computer vision today were incorporated in this first demonstration: convolution (for dimensionality reduction), regularization (for numerical stability), backpropagation (for gradient-based learning), and pruning (to reduce the number of parameters in a DNN). Computer vision was a cognitive task that, up to that time, had not been performed by any machine. Two and a half decades later, with the wide availability of powerful graphics processing units, this same collection of techniques initiated the AI revolution of the early 21st century.
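For readers less familiar with the convolution operation named above, the sketch below illustrates it in NumPy. This is a modern illustration under names of our own choosing, not code from the original work: one small kernel of shared weights slides across the image, so a layer needs only a handful of parameters rather than one weight per pixel.

```python
# Minimal sketch of valid-mode 2D convolution (technically cross-correlation,
# which is what CNN layers compute). Illustration only, not Bell Labs code.
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output value is a weighted sum over a local patch,
            # reusing the same kernel weights at every image position.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(8, 8)             # toy 8x8 grayscale "image"
edge_kernel = np.array([[1.0, -1.0]])    # crude horizontal edge detector
print(conv2d(image, edge_kernel).shape)  # (8, 7)
```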

IEEE technical societies and technical councils within whose fields of interest the Milestone proposal resides.

Computer, Computational Intelligence

In what IEEE section(s) does it reside?

North Jersey

IEEE Organizational Unit(s) which have agreed to sponsor the Milestone:

IEEE Organizational Unit(s) paying for milestone plaque(s):

Unit: North Jersey Section
Senior Officer Name: Emad Farag

IEEE Organizational Unit(s) arranging the dedication ceremony:

Unit: North Jersey Section
Senior Officer Name: Emad Farag

IEEE section(s) monitoring the plaque(s):

IEEE Section: North Jersey
IEEE Section Chair name: Emad Farag

Milestone proposer(s):

Proposer name: Theodore Sizer
Proposer email: Proposer's email masked to public

Proposer name: William(Sean) Kennedy
Proposer email: Proposer's email masked to public

Please note: your email address and contact information will be masked on the website for privacy reasons. Only IEEE History Center Staff will be able to view the email address.

Street address(es) and GPS coordinates in decimal form of the intended milestone plaque site(s):

40.684031,-74.401783

Describe briefly the intended site(s) of the milestone plaque(s). The intended site(s) must have a direct connection with the achievement (e.g. where developed, invented, tested, demonstrated, installed, or operated, etc.). A museum where a device or example of the technology is displayed, or the university where the inventor studied, are not, in themselves, sufficient connection for a milestone plaque.

Please give the address(es) of the plaque site(s) (GPS coordinates if you have them). Also please give the details of the mounting, i.e. on the outside of the building, in the ground floor entrance hall, on a plinth on the grounds, etc. If visitors to the plaque site will need to go through security, or make an appointment, please give the contact information visitors will need.

The intention is to place the plaque just outside the main entrance to the Nokia Bell Labs facility in Murray Hill, NJ. The building is both a corporate facility and a historic site; other historical markers are already on site, both inside and outside the building.

Are the original buildings extant?

Yes

Details of the plaque mounting:

Outside the building on a rock or other permanent structure.

How is the site protected/secured, and in what ways is it accessible to the public?

The plaque will be located before the building entrance, so visitors will not need to pass through security.

Who is the present owner of the site(s)?

Nokia America

What is the historical significance of the work (its technological, scientific, or social importance)? If personal names are included in citation, include detailed support at the end of this section preceded by "Justification for Inclusion of Name(s)". (see section 6 of Milestone Guidelines)

Justification of Name(s) in the Citation:

Yann LeCun was the key researcher and leader of the efforts at Bell Labs that later became the foundation of what we know as AI today. While at Bell Labs he was a prolific researcher, producing many papers and patents, particularly on the application of AI to optical character recognition. The early series of CNNs is now termed "LeNet" in recognition of Yann LeCun's pioneering role in neural networks.


Computational Efficacy of Neural Networks

Historical Significance:

Yann LeCun's scientific journey represents a pivotal narrative in the evolution of artificial intelligence, particularly in the domains of neural networks and deep learning. LeCun’s time at Bell Labs, from 1988 to 1996, was a remarkable period of innovation that set the stage for the AI revolution of the 2020s.

Broadly speaking, LeCun pioneered answers to three challenges that have proven invaluable in modern machine learning systems: efficient training of massive models, understanding the knowledge representations within these models, and boosting a model’s generalizability. LeCun developed sophisticated gradient-based learning strategies that enabled efficient training of multi-layer neural networks [5][6]. Using Convolutional Neural Networks (CNNs), he demonstrated how neural networks could automatically learn hierarchical feature representations, a concept now fundamental to deep learning [7]. LeCun also introduced novel regularization methods that mitigated overfitting, a persistent challenge in neural network training [8].
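The sketch below makes these ingredients concrete on a toy problem: a two-layer network trained by backpropagation with L2 weight decay as the regularizer. It is a generic textbook formulation with hypothetical names, not LeCun's code.

```python
# Gradient-based learning sketch: backpropagation with L2 weight decay
# on a two-layer tanh network, trained by full-batch gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 2))                        # toy 2-D inputs
y = (X[:, 0] * X[:, 1] > 0).astype(float)[:, None]   # XOR-like labels

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros((1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros((1, 1))
lr, lam = 0.5, 1e-4                                  # step size, weight decay

for step in range(2000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))         # sigmoid output
    # Backward pass: chain rule, gradients averaged over the batch.
    dlogits = (p - y) / len(X)                       # d(cross-entropy)/d(logit)
    dW2 = h.T @ dlogits + lam * W2                   # "+ lam*W" is the L2 term
    dh = (dlogits @ W2.T) * (1.0 - h ** 2)           # tanh derivative
    dW1 = X.T @ dh + lam * W1
    W2 -= lr * dW2; b2 -= lr * dlogits.sum(0, keepdims=True)
    W1 -= lr * dW1; b1 -= lr * dh.sum(0, keepdims=True)

print("train accuracy:", ((p > 0.5) == y).mean())    # typically near 1.0
```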

LeCun applied his scientific knowledge to fundamentally transform the field of pattern recognition. Indeed, LeCun's most impactful contribution during this period was the systematic development of Convolutional Neural Networks (CNNs). Drawing inspiration from Kunihiko Fukushima's earlier Neocognitron [9], LeCun refined and mathematically formalized a learning approach that would revolutionize pattern recognition [10][11]. The LeNet architecture, developed at Bell Labs, achieved unprecedented accuracy in handwritten digit recognition. Using the MNIST dataset, LeCun provided empirical validation both for neural network approaches in general and for backpropagation-based training in particular [7][12][13][14].
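To make the architecture concrete, the following PyTorch sketch reconstructs a LeNet-5-style network following the structure described in [7]: alternating convolution and subsampling (pooling) layers that build a hierarchy of features, followed by fully connected layers. This is a paraphrase in a modern framework, not the original implementation, which predates such tools.

```python
# LeNet-5-style network after the description in [7]; an illustrative
# reconstruction, not the original code.
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # 32x32x1 -> 28x28x6
            nn.Tanh(),
            nn.AvgPool2d(2),                  # subsample -> 14x14x6
            nn.Conv2d(6, 16, kernel_size=5),  # -> 10x10x16
            nn.Tanh(),
            nn.AvgPool2d(2),                  # subsample -> 5x5x16
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes),       # ten digit classes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = LeNet5()
digits = torch.randn(4, 1, 32, 32)  # batch of four 32x32 grayscale images
print(model(digits).shape)          # torch.Size([4, 10])
```

(MNIST digits are 28x28 and are conventionally padded to 32x32 to make this layer arithmetic work out.)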

Distinguishing LeCun's work was its rigorous mathematical underpinning, spanning sophisticated gradient computation techniques, probabilistic learning frameworks, and information-theoretic approaches to network optimization. As one example, he derived a theoretical framework for the backpropagation algorithm and showed its connection to the control-theory literature [15]. As another, LeCun and coauthors derived a method for measuring the capacity of learning machines, the so-called VC-dimension [16], helping to understand how well a model will generalize to unseen data.
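For reference, the recursion at the heart of backpropagation can be stated compactly. The notation below is standard modern textbook notation, not drawn from [15] itself: with pre-activations z^l = W^l a^{l-1} + b^l, activations a^l = σ(z^l), and training cost C,

```latex
\delta^{L} = \nabla_{a^{L}} C \odot \sigma'(z^{L}), \qquad
\delta^{l} = \bigl( (W^{l+1})^{\top} \delta^{l+1} \bigr) \odot \sigma'(z^{l}), \qquad
\frac{\partial C}{\partial W^{l}} = \delta^{l} \, (a^{l-1})^{\top}.
```

Each layer's gradient is obtained from the layer above by one application of the chain rule, which is what makes gradient-based training of multi-layer networks computationally tractable.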


His work bridged multiple scientific domains, including computational neuroscience, statistical learning theory, information processing architectures, and pattern recognition algorithms [17]. LeCun's work at Bell Labs has since been instrumental in developing modern deep learning architectures, establishing neural networks as a credible scientific approach, and creating the computational frameworks that power contemporary AI technologies [18][19][20].


Obstacles:

When LeCun joined Bell Labs in 1988, the computational landscape was dramatically different from today's machine learning ecosystem. Neural networks were viewed with significant skepticism by the broader scientific community, with many researchers considering them computationally inefficient and theoretically limited [1][2][3]. LeCun's research directly challenged the symbolic AI approaches dominant in the late 1980s and early 1990s. By demonstrating the computational efficacy of neural networks, he helped redirect significant research momentum [4].

Impactful:

Yann LeCun's scientific journey represents more than technological innovation—it embodies a profound reimagining of computational learning, challenging existing paradigms and opening new frontiers of artificial intelligence research.


Footnotes

[1] Minsky, M., & Papert, S. (1969). "Perceptrons: An Introduction to Computational Geometry." MIT Press.

[2] Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). "Learning Representations by Back-Propagating Errors." Nature, 323, 533-536.

[3] Hertz, J., Krogh, A., & Palmer, R. G. (1991). "Introduction to the Theory of Neural Computation." Addison-Wesley.

[4] Hinton, G. E., & LeCun, Y. (2007). "Transforming Neural Computation: A Historical Perspective." Neural Computation, 19(9), 2271-2286.

[5] LeCun, Y. (1986). Learning process in an asymmetric threshold network. In Disordered systems and biological organization (pp. 233-240). Berlin, Heidelberg: Springer Berlin Heidelberg.

[6] LeCun, Y., Bottou, L., Orr, G. B., & Müller, K. R. (1998). "Efficient BackProp" Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science, vol 7700.

[7] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.

[8] LeCun, Y. (1993). "Regularization Techniques for Neural Network Training." Technical Report, AT&T Bell Laboratories.

[9] Fukushima, K. (1980). "Neocognitron: A Self-Organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position." Biological Cybernetics, 36(4), 193-202.

[10] LeCun, Y. (1989). "Generalization and Network Design Strategies." Technical Report, AT&T Bell Laboratories.

[11] LeCun, Y., et al. (1990). "Handwritten Digit Recognition: Applications of Neural Network Architectures." Neural Computation, 1(4), 541-551.

[12] Simard, P., LeCun, Y., & Denker, J. (1992). Efficient pattern recognition using a new transformation distance. Advances in neural information processing systems, 5.

[13] LeCun, Y., & Bengio, Y. (1998). The handbook of brain theory and neural networks. chapter Convolutional Networks for Images, Speech, and Time Series. MIT Press, Cambridge, MA, USA, 3, 255-258.

[14] LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. Neural computation, 1(4), 541-551.

[15] LeCun, Y. (1987). "A Theoretical Framework for Back-Propagation." Technical Report, AT&T Bell Laboratories.

[16] V. Vapnik, E. Levin and Y. LeCun, "Measuring the VC-Dimension of a Learning Machine," in Neural Computation, vol. 6, no. 5, pp. 851-876, Sept. 1994, doi: 10.1162/neco.1994.6.5.851.

[17] LeCun, Y., & Bengio, Y. (1998). "Convolutional Networks for Images, Speech, and Time Series." Brain Theory and Neural Networks, 255-258.

[18] Russell, S., & Norvig, P. (2020). "Artificial Intelligence: A Modern Approach." Pearson Education.

[19] Goodfellow, I. (2016). "Deep Learning Landscape: Historical Perspectives." Annual Review of Computer Science.

[20] National Science Foundation. (2021). "Transformative Research in Artificial Intelligence: A Decadal Review."

What obstacles (technical, political, geographic) needed to be overcome?

Key computational components of AI were invented at Bell Labs during this time, and these components remain fundamental building blocks for AI research and implementation today. As this was a novel direction at the time, the principal obstacle was skepticism within the technical community itself: the team had to demonstrate the capabilities and benefits of neural networks.

What features set this work apart from similar achievements?

This work is clearly the foundation of the AI revolution occurring today; as such, it is hard to compare it to a similar achievement. The strongest current illustration is the worldwide adoption of AI and the hundreds of billions of dollars and euros being spent to harness its power and opportunity. The work has also been recognized in many forms, including the ACM A.M. Turing Award in 2018.

Why was the achievement successful and impactful?

Artificial Intelligence has emerged as a powerful tool in society, able to provide benefit in myriad forms, from creative writing to image analysis to computer code generation. Its adoption around the world is growing, with massive investments both in training people to use it and in the high-performance data centers that run it.

Supporting texts and citations to establish the dates, location, and importance of the achievement: Minimum of five (5), but as many as needed to support the milestone, such as patents, contemporary newspaper articles, journal articles, or chapters in scholarly books. 'Scholarly' is defined as peer-reviewed, with references, and published. You must supply the texts or excerpts themselves, not just the references. At least one of the references must be from a scholarly book or journal article. All supporting materials must be in English, or accompanied by an English translation.

ACM Turing Award Citation

Y. LeCun, J. S. Denker and S. A. Solla: Optimal Brain Damage, in D. S. Touretzky (Ed.), Advances in Neural Information Processing Systems 2 (NIPS*89), Denver, CO, 1990

Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard and L. D. Jackel: Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, 1(4):541-551, Winter 1989

L. Jackel, B. Boser, H.-P. Graf, J. Denker, Y. LeCun, D. Henderson, O. Matan, R. Howard and H. Baird: VLSI Implementation of Electronic Neural Networks: an Example in Character Recognition, in IEEE (Eds), IEEE International Conference on Systems, Man, and Cybernetics, 320-322, Los Angeles, CA, November 1990

Une procédure d'apprentissage pour réseau à seuil asymétrique (A Learning Scheme for Asymmetric Threshold Networks), Proceedings of Cognitiva 85, 599–604, Paris, France, 1985.

B. Boser, E. Sackinger, J. Bromley, Y. LeCun and L. Jackel: An analog neural network processor with programmable topology, IEEE Journal of Solid-State Circuits, 26(12):2017-2025, December 1991.

US Patent 5625708 Method and apparatus for symbol recognition using multidimensional preprocessing Inventor: Yann A. LeCun; Filed: October 13, 1992; Date of Patent: April 29, 1997

US Patent 5572628 Training system for neural networks Inventors: John S. Denker, Yann A. LeCun, Patrice Y. Simard, Bernard Victorri; Filed: September 16, 1994; Date of Patent: November 5, 1996

US Patent 5337372 Method and apparatus for symbol recognition using multidimensional preprocessing at multiple resolutions Inventors: Yann A. LeCun, Quen-Zong Wu; Filed: October 13, 1992; Date of Patent: August 9, 1994

US Patent 5253304 Method and apparatus for image segmentation Inventors: Yann A. LeCun, Ofer Matan, William D. Satterfield, Timothy J. Thompson; Filed: November 27, 1991 Date of Patent: October 12, 1993

US Patent 5105468 Time delay neural network for printed and cursive handwritten character recognition Inventors: Isabelle Guyon, John S. Denker, Yann LeCun; Filed: April 3, 1991; Date of Patent: April 14, 1992

US Patent 5067164 Hierarchical Constrained Automatic Learning Neural Network for Character Recognition ; Inventors John S. Denker, Richard E Howard, Lawrence D. Jackel, Yann LeCun; Filed November 30, 1989; Date of Patent: November 19, 1991

Supporting materials (supported formats: GIF, JPEG, PNG, PDF, DOC): All supporting materials must be in English, or if not in English, accompanied by an English translation. You must supply the texts or excerpts themselves, not just the references. For documents that are copyright-encumbered, or which you do not have rights to post, email the documents themselves to ieee-history@ieee.org. Please see the Milestone Program Guidelines for more information.

Media:R1_NN.pdf Media:R2_NN.pdf Media:R3_NN.pdf Media:R4_NN_replace.pdf Media:R5_NN.pdf Media:R6_NN.pdf

Please email a JPEG or PDF of a letter in English, or with English translation, from the site owner(s) giving permission to place the IEEE Milestone plaque on the property, and a letter (or forwarded email) from the appropriate Section Chair supporting the Milestone application to ieee-history@ieee.org with the subject line "Attention: Milestone Administrator." Note that there are multiple texts of the letter depending on whether an IEEE organizational unit other than the section will be paying for the plaque(s).

Please recommend reviewers by emailing their names and email addresses to ieee-history@ieee.org. Please include the docket number and brief title of your proposal in the subject line of all emails.