{"id":6041,"date":"2024-11-19T08:10:06","date_gmt":"2024-11-19T08:10:06","guid":{"rendered":"https:\/\/tech.newat9.com\/index.php\/2024\/11\/19\/the-unexpected-journey-of-neural-networks\/"},"modified":"2024-11-19T08:10:06","modified_gmt":"2024-11-19T08:10:06","slug":"the-unexpected-journey-of-neural-networks","status":"publish","type":"post","link":"https:\/\/tech.newat9.com\/index.php\/2024\/11\/19\/the-unexpected-journey-of-neural-networks\/","title":{"rendered":"The Unexpected Journey of Neural Networks"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p class=\"dropcap\" dir=\"ltr\"><span>When federal agencies issue a research grant, they never know if their investment will reap rewards for society. This was almost certainly true in the late 1970s and early 1980s, when the National Science Foundation and the Office of Naval Research funded projects by\u00a0<\/span><a href=\"https:\/\/profiles.stanford.edu\/jay-mcclelland\" target=\"_blank\" rel=\"noopener\"><span>James \u201cJay\u201d McClelland<\/span><\/a><span>, David Rumelhart, and\u00a0<\/span><a href=\"https:\/\/www.cs.toronto.edu\/~hinton\/\" target=\"_blank\" rel=\"noopener\"><span>Geoffrey Hinton<\/span><\/a><span> to model human cognitive abilities.<\/span><\/p>\n<p dir=\"ltr\"><span>Yet that investment led to a cascade of research progress: a neural network model of how humans perceive letters and words; two volumes published in 1986 describing the team\u2019s theory of how neural networks in our brains function as parallel distributed processing systems; and a seminal article in\u00a0<\/span><em>Nature<\/em><span> by Rumelhart, Hinton and a student named Ronald J. Williams demonstrating the power of what\u2019s called the backpropagation algorithm \u2014 a way of training neural network models to learn from their mistakes.<\/span><\/p>\n<p dir=\"ltr\"><span>And that research in turn spawned much of modern AI. 
\u201cToday, the backpropagation algorithm forms the basis for all of the deep learning systems that have been developed since, and for virtually all of the AI systems that have become drivers of the modern tech industry,\u201d says McClelland, the Lucie Stern Professor in the Social Sciences in the Stanford\u00a0<\/span><a href=\"https:\/\/humsci.stanford.edu\/\" target=\"_blank\" rel=\"noopener\"><span>School of Humanities and Sciences<\/span><\/a><span> and director of the\u00a0<\/span><a href=\"https:\/\/neuroscience.stanford.edu\/initiatives-centers\/center-mind-brain-computation-and-technology\" target=\"_blank\" rel=\"noopener\"><span>Center for Mind, Brain, Computation and Technology<\/span><\/a><span> at Stanford\u2019s Wu Tsai Neurosciences Institute.<\/span><\/p>\n<p dir=\"ltr\"><span>It\u2019s an outcome that earned the trio a 2024<\/span><a href=\"https:\/\/www.goldengooseaward.org\/01awardees\/pdp\" target=\"_blank\" rel=\"noopener\"><span>\u00a0Golden Goose Award<\/span><\/a><span> in recognition of the impact their basic science research has had on the world.<\/span><\/p>\n<p dir=\"ltr\"><span>McClelland \u2014 like the NSF and ONR \u2014 never anticipated such a result. As a cognitive scientist, \u201cI was never thinking about building an AI,\u201d he says. But now the progress in AI has come full circle. \u201cI\u2019m drawing inspiration from what\u2019s been learned in AI and deep learning to help me think about the human mind, while also asking what the mind and brain have to teach AI.\u201d<\/span><\/p>\n<h2>From Letter Perception to Neural Networks<\/h2>\n<p dir=\"ltr\"><span>In the 1970s, when McClelland and Rumelhart began collaborating, their ideas about how the brain works diverged from the mainstream. 
Researchers such as Noam Chomsky and Jerry Fodor at MIT believed that language processing was an inherently symbolic process that involves manipulating organized arrangements of symbols according to clear rules.<\/span><\/p>\n<p dir=\"ltr\"><span>McClelland had a different view. With a background in sensory neurophysiology and animal learning, he couldn\u2019t reconcile the abstractions that people like Chomsky and Fodor talked about with what he\u2019d seen in animal experiments. For example, experiments that measured single neurons in the cortex of a cat as it responded to line segments showed that perception didn\u2019t seem to follow clear rules. \u201cIt\u2019s continuous and doesn\u2019t happen in discrete steps. And it\u2019s sensitive to context,\u201d he says. McClelland wanted to build a model that captured that sensitivity.<\/span><\/p>\n<p dir=\"ltr\"><span>Meanwhile, Rumelhart published a paper in 1977 proposing that whenever we\u2019re trying to understand a letter, a word, a phrase, or the meaning of a word in a sentence, we\u2019re using all of the available information simultaneously to constrain the problem. Again: Context matters.<\/span><\/p>\n<p dir=\"ltr\"><span>After McClelland read Rumelhart\u2019s paper, the two met and soon realized they could formalize their ideas in a computational neural network model \u2014 a set of layered, simple computing elements (sometimes referred to as \u201cneurons\u201d) that receive inputs from each other (i.e., take context into account) and update their states accordingly.<\/span><\/p>\n<p dir=\"ltr\"><span>\u201cWe wanted to develop a neural network model that could capture some of the features of how the brain perceives letters in different contexts,\u201d says McClelland. 
For example, we recognize letters faster when they are in a word than when they are in a string of random letters; and we can intuitively determine what a word is likely to be even if part of it is obscured, distorted, or masked, he says.<\/span><\/p>\n<p dir=\"ltr\"><span>Their initial model produced results similar to those seen in language experiments with human subjects \u2014 McClelland\u2019s primary goal. This suggested that neural network models, which are parallel processing systems, are appropriate models of human cognition.<\/span><\/p>\n<p dir=\"ltr\"><span>But the team\u2019s initial model treated letters and words as discrete units (\u201cneurons\u201d) with connections between them. When Hinton joined the team in the early 1980s, he suggested the team should back away from the idea that each unit, or neuron, represents a letter, word, or some other symbol recognizable or meaningful to a human. Instead, he proposed, the symbolic representation of a letter, word, or other symbol should be thought of as only existing in the combined activity of many neurons in the model network.\u00a0<\/span><em>Parallel Distributed Processing<\/em><span>, a two-volume book published by the group in 1986, set forth these theories.\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>Next came the coup de gr\u00e2ce: The backpropagation algorithm that Rumelhart, Hinton and Williams presented in\u00a0<\/span><em>Nature,\u00a0<\/em><span>also in 1986.<\/span><\/p>\n<p dir=\"ltr\"><span>Until then, neural network models\u2019 learning capabilities had been fairly limited: Errors were only adjusted in the final output layer of the network, limiting how effectively experience could shape the model\u2019s performance. To overcome that limitation, Hinton suggested Rumelhart set minimizing error as a specific goal or \u201cobjective function,\u201d and derive a procedure to optimize the network to meet that goal. 
From that inspiration, Rumelhart found a way to send the error signal backward to teach neurons at lower levels of a model how to adjust the intensity of their connections. And he and Hinton showed that such networks could learn to perform computations that couldn\u2019t be solved with a single layer of modifiable connections. \u201cOthers developed backpropagation at around the same time,\u201d McClelland notes, \u201cbut it was Dave and Geoff\u2019s demonstrations of what backprop could do that struck a responsive chord.\u201d\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>At the time, Rumelhart was using backpropagation with networks that had a very small number of input units and one layer of units in between the inputs and the output, McClelland says. By contrast, today\u2019s models may have thousands of intermediate layers of neurons that are learning the same way.\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>Despite the elegance of the backpropagation algorithm, neural network models didn\u2019t immediately take off. Indeed, it wasn\u2019t until 25 years later that Hinton and his students leveraged Fei-Fei Li\u2019s ImageNet dataset \u2014 using computers that were many orders of magnitude more powerful than the computers Rumelhart had at his disposal \u2014 to demonstrate convolutional neural networks\u2019 impressive ability to classify images. \u201cBefore then, it was very hard to train networks that were deep enough or had sufficient training data,\u201d McClelland says.<\/span><\/p>\n<h2>From the Brain to AI and Back Again<\/h2>\n<p dir=\"ltr\"><span>Meanwhile, McClelland continued to use neural nets to model human cognition, consistently finding that these models effectively capture data from human experiments. He remains fascinated by the ways human cognition both resembles and differs from computerized neural networks. 
\u201cThe neural networks in our brains that allow us to function, speak, and communicate with each other in continuous sentences are clearly neural networks similar in some ways to these AI systems.\u201d<\/span><\/p>\n<p dir=\"ltr\"><span>Today\u2019s language models, which use distributed representations and are trained using backpropagation, have also achieved human-like fluency in translation, he says. \u201cThey can translate from one language to another in ways that no symbolic, rule-based system ever could.\u201d\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>In addition, unlike the models that preceded them, large language models that rely on the so-called transformer architecture exhibit an interesting brain-like feature: They can hold information in context as new information is provided. \u201cThese models are using the information in context as though it were sort of hanging in mind \u2014 like the last sentence somebody said to you,\u201d McClelland says.\u00a0\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>And that development inspired McClelland to join collaborators at Google DeepMind to explore whether neural network models, like humans, reason more accurately when they have prior contextual knowledge compared to when they are given completely abstract topics requiring symbolic logic.\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>For example, people struggle with a question like \u201cIf some A are B, and all B are C, are any C A?\u201d But phrase the same question in a specific context using familiar concepts (\u201cIf some cows are Herefords and all Herefords are mammals, are any mammals cows?\u201d), and they are more likely to give the correct answer. \u201cOur research found that that\u2019s also what these models do,\u201d McClelland says. \u201cThey are not pure logic machines. 
Humans and models alike infuse their thinking with their prior knowledge and beliefs.\u201d They are also biased toward factually true or widely believed conclusions, even when they don\u2019t follow from the given premises, he says. These results were published in\u00a0<\/span><a href=\"https:\/\/academic.oup.com\/pnasnexus\/article\/3\/7\/pgae233\/7712372\" target=\"_blank\" rel=\"noopener\"><span>a 2024 paper<\/span><\/a><span> in\u00a0<\/span><em>PNAS Nexus<\/em><span>.<\/span><\/p>\n<p dir=\"ltr\"><span>\u201cThis research helps me convince others that the way we humans think is less strictly logical and more grounded in the kind of intuitive knowledge that comes from adjusting connection strengths across a neural network,\u201d he says.<\/span><\/p>\n<p dir=\"ltr\"><span>Despite these similarities, McClelland notes that there are differences. One that separates humans from machines is our ability to learn both fast and with little data. \u201cThese language models need approximately 100,000 times more data than a human would need to learn a language. That\u2019s a lot!\u201d he says. \u201cSo, we\u2019re interested in understanding how the biological brain is capable of learning with far less data than today\u2019s AI systems.\u201d<\/span><\/p>\n<p dir=\"ltr\"><span>Rumelhart\u2019s backpropagation algorithm is part of the problem: \u201cIt\u2019s why these AI systems are so slow and require so much data,\u201d he says. 
Neural networks have nearly countless connections, and \u2014 compared with humans \u2014 they require lots of extra data to determine which connections matter most.\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>For example, if a large language model makes a mistake in predicting the last word in a sentence such as \u201cJohn likes coffee with cream and honey,\u201d it might learn to make the word \u201csugar\u201d less likely in general, rather than learning that it\u2019s just John who has unusual taste.<\/span><\/p>\n<p dir=\"ltr\"><span>\u201cAll these connections are getting little changes to try to reduce the error, but to figure out which ones are important, you have to include many training sentences in which the common preference for sugar is maintained \u2014 and that\u2019s inefficient,\u201d McClelland says.<\/span><\/p>\n<p dir=\"ltr\"><span>It\u2019s also not the way the brain works. \u201cBackpropagation was a wonderful solution to a computational problem,\u201d McClelland says. \u201cBut no one ever thought it captured an accurate view of how the brain works.\u201d In backpropagation, the network is activated in one direction and the errors are propagated backward across the same network, McClelland says. By contrast, in the brain, activation itself is bi-directional, and many different parts of the brain are interacting \u2014 including multiple senses perceiving the world simultaneously \u2014 to provide an integrated perceptual experience of the world.\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>Hinton was well aware that backpropagation failed to capture the way the brain works, and he went on to develop several other algorithms that are much closer to being biologically plausible, McClelland says. 
And now McClelland is taking on the same task but in a different way: by going back to studies of neuron activation in animals and humans.\u00a0<\/span><\/p>\n<p dir=\"ltr\"><span>\u201cI\u2019ve become inspired to find ways of understanding how our brains so efficiently target the right connections to adjust,\u201d he says.\u00a0<\/span><\/p>\n<\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/hai.stanford.edu\/news\/brain-machine-unexpected-journey-neural-networks\" target=\"_blank\" rel=\"noopener\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>When federal agencies issue a research grant, they never know if their investment will reap rewards for society. This was [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":6042,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/posts\/6041"}],"collection":[{"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/comments?post=6041"}],"version-history":[{"count":0,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/posts\/6041\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/media\/6042"}],"wp:attachment":[{"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/media?parent=6041"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/categories?post=6041"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/tags?post=6041"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}