Computational Intelligence
Michael I. Jordan and Stuart Russell
There are two complementary views of artificial intelligence (AI): one as an engineering discipline concerned with the creation of intelligent machines, the other as an empirical science concerned with the computational modeling of human intelligence. When the field was young, these two views were seldom distinguished. Since then, a substantial divide has opened up, with the former view dominating modern AI and the latter view characterizing much of modern cognitive science. For this reason, we have adopted the more neutral term “computational intelligence” as the title of this article—both communities are attacking the problem of understanding intelligence in computational terms.
It is our belief that the differences between the engineering models and the cognitively inspired models are small compared to the vast gulf in competence between these models and human levels of intelligence. For humans are, to a first approximation, intelligent; they can perceive, act, learn, reason, and communicate successfully despite the enormous difficulty of these tasks. Indeed, we expect that as further progress is made in trying to emulate this success, the engineering and cognitive models will become more similar. Already, the traditionally antagonistic “connectionist” and “symbolic” camps are finding common ground, particularly in their understanding of reasoning under uncertainty and learning. This sort of cross-fertilization was a central aspect of the early vision of cognitive science as an interdisciplinary enterprise.
1 Machines and Cognition
The conceptual precursors of AI can be traced back many centuries. LOGIC, the formal theory of deductive reasoning, was studied in ancient Greece, as were ALGORITHMS for mathematical computations. In the late seventeenth century, Gottfried Wilhelm Leibniz actually constructed simple “conceptual calculators,” but their representational and combinatorial powers were far too limited. In the nineteenth century, Charles Babbage designed (but did not build) a device capable of universal computation, and his collaborator Ada Lovelace speculated that the machine might one day be programmed to play chess or compose music. Fundamental work by ALAN TURING in the 1930s formalized the notion of universal computation; the famous CHURCH–TURING THESIS proposed that all sufficiently powerful computing devices were essentially identical in the sense that any one device could emulate the operations of any other. From here it was a small step to the bold hypothesis that human cognition was a form of COMPUTATION in exactly this sense, and could therefore be emulated by computers.
By this time, neurophysiology had already established that the brain consisted largely of a vast interconnected network of NEURONS that used some form of electrical signalling mechanism. The first mathematical model relating computation and the brain appeared in a seminal paper entitled “A logical calculus of the ideas immanent in nervous activity,” by WARREN MCCULLOCH and WALTER PITTS (1943). The paper proposed an abstract model of neurons as linear threshold units—logical “gates” that output a signal if the weighted sum of their inputs exceeds a threshold value (see COMPUTING IN SINGLE NEURONS). It was shown that a network of such gates could represent any logical function, and, with suitable delay components to implement memory, would be capable of universal computation. Together with HEBB’s model of learning in networks of neurons, this work can be seen as a precursor of modern NEURAL NETWORKS and connectionist cognitive modeling. Its stress on the representation of logical concepts by neurons also provided impetus to the “logicist” view of AI.
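To make the linear threshold model concrete, the following minimal sketch (in Python, with illustrative weights and thresholds chosen here rather than taken from the 1943 paper) shows how such units realize elementary logic gates and how a small network of them represents a function, such as exclusive-or, that no single unit can compute.

```python
# A minimal sketch of McCulloch-Pitts linear threshold units: a unit "fires"
# (outputs 1) when the weighted sum of its binary inputs reaches its threshold.
# The weights and thresholds below are illustrative choices, not values drawn
# from the original paper.

def threshold_unit(inputs, weights, threshold):
    """Output 1 if the weighted sum of the inputs meets the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Elementary logical gates, each realized by a single threshold unit.
def AND(x1, x2):
    return threshold_unit([x1, x2], weights=[1, 1], threshold=2)

def OR(x1, x2):
    return threshold_unit([x1, x2], weights=[1, 1], threshold=1)

def NOT(x):
    return threshold_unit([x], weights=[-1], threshold=0)

# Composing gates into a network represents a function (XOR) that no single
# threshold unit can compute.
def XOR(x1, x2):
    return AND(OR(x1, x2), NOT(AND(x1, x2)))

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"XOR({a}, {b}) = {XOR(a, b)}")
```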
The emergence of AI proper as a recognizable field required the availability of usable computers; this resulted from the wartime efforts led by Turing in Britain and by JOHN VON NEUMANN in the United States. It also required a banner to be raised;
this was done with relish by Turing’s (1950) paper “Computing Machinery and Intelligence,” wherein an operational definition for intelligence was proposed (the Turing test) and many future developments were sketched out.
One should not underestimate the level of controversy surrounding AI’s initial phase. The popular press was only too ready to ascribe intelligence to the new “electronic super-brains,” but many academics refused to contemplate the idea of intelligent computers. In his 1950 paper, Turing went to great lengths to catalogue and refute many of their objections. Ironically, one objection already voiced by Kurt Gödel, and repeated up to the present day in various forms, rested on the ideas of incompleteness and undecidability in formal systems to which Turing himself had contributed (see GÖDEL’S THEOREMS and FORMAL SYSTEMS, PROPERTIES OF). Other objectors denied the possibility of CONSCIOUSNESS in computers, and with it the possibility of intelligence. Turing explicitly sought to separate the two, focusing on the objective question of intelligent behavior while admitting that consciousness might remain a mystery—as indeed it has.
The next step in the emergence of AI was the formation of a research community; this was achieved at the 1956 Dartmouth meeting convened by John McCarthy. Perhaps the most advanced work presented at this meeting was that of ALLEN NEWELL and Herb Simon, whose program of research in symbolic cognitive modeling was one of the principal influences on cognitive psychology and information-processing psychology. Newell and Simon’s IPL languages were the first symbolic programming languages and among the first high-level languages of any kind. McCarthy’s LISP language, developed slightly later, soon became the standard programming language of the AI community and in many ways remains unsurpassed even today.
Contemporaneous developments in other fields also led to a dramatic increase in the precision and complexity of the models that could be proposed and analyzed. In linguistics, for example, work by Chomsky (1957) on formal grammars opened up new avenues for the mathematical modeling of mental structures. NORBERT WIENER developed the field of cybernetics (see CONTROL THEORY and MOTOR CONTROL) to provide mathematical tools for the analysis and synthesis of physical control systems. The theory of optimal control in particular has many parallels with the theory of rational agents (see below), but within this tradition no model of internal representation was ever developed.
As might be expected from so young a field with so broad a mandate that draws on so many traditions, the history of AI has been marked by substantial changes in fashion and opinion. Its early days might be described as the “Look, Ma, no hands!” era, when the emphasis was on showing a doubting world that computers could play chess, learn, see, and do all the other things thought to be impossible. A wide variety of methods was tried, ranging from general-purpose symbolic problem solvers to simple neural networks. By the late 1960s, a number of practical and theoretical setbacks had convinced most AI researchers that there would be no simple “magic bullet.” The general-purpose methods that had initially seemed so promising came to be called weak methods because their reliance on extensive combinatorial search and first-principles knowledge could not overcome the complexity barriers that were, by that time, seen as unavoidable. The 1970s saw the rise of an alternative approach based on the application of large amounts of domain-specific knowledge, expressed in forms that were close enough to the explicit solution as to require little additional computation. Ed Feigenbaum’s gnomic dictum, “Knowledge is power,” was the watchword of the boom in industrial and commercial application of expert systems in the early 1980s.
When the first generation of expert system technology turned out to be too fragile for widespread use, a so-called AI Winter set in—government funding of AI and public perception of its promise both withered in the late 1980s. At the same time, a revival of interest in neural network approaches led to the same kind of optimism as had characterized “traditional” AI in the early 1980s. Since that time, substantial progress has been made in a number of areas within AI, leading to renewed commercial interest in fields such as data mining (applied machine learning) and a new wave of expert system technology based on probabilistic inference. The 1990s may in fact come to be seen as the decade of probability. Besides expert systems, the so-called