When will machines learn?

Machine Learning, Dec 1989

Douglas B. Lenat



DOUGLAS B. LENAT
Principal Scientist and Director of AI, MCC, 3500 West Balcones Center Drive, Austin, Texas 78759

Why don't our learning programs just keep on going and become generally intelligent? The source of the problem is that most of our learning occurs at the fringe of what we already know. The more you know, the more (and faster) you can learn. Unfortunately, fringe (analogical) reasoning is frequently employed purely as a dramatic device. For example, a news reporter talks about a child's valiant battle against disease; or a government issues a clinical-sounding report of a military containment and sterilization operation. This use obscures the fact that analogical reasoning is a critical component of human intelligence; it can help discover new concepts (e.g., is there a military analogue of vaccination? is there a medical analogue of propaganda?) and help flesh them out, as well as helping us to cope with novel situations.

The inverse of "the more you know..." is the real culprit: not knowing much implies slow learning. Even the largest machine learning programs (e.g., Eurisko) know only a tiny, tiny fraction of what even a six-year-old child knows (10^4 things versus 10^9 things). So Learning is fueled by Knowledge, and human-scale learning demands a human-scale amount of knowledge. I see two ways to get it:

1. The 100% Natural Approach: Figure out all the instincts, skills, needs, drives, and predispositions to learning that Nature (Evolution, God, ...) has hard-wired into human brains and spinal cords and sense organs, and figure out how neonates' raw perception refines into usable knowledge. Then build such a system incorporating all of those things, plus, of course, the right sort of "body," and allow it to "live" for years in the real world: nurture it, let it play, let it bump into walls, teach it to talk, let it go to kindergarten, etc.

2. The Prime the Pump Approach: Codify, in one immense knowledge base, the tens of millions of facts, algorithms, heuristics, stories, representations, etc., that "everybody knows"--the things that the writer of a newspaper article can safely assume that the reader already knows (consensus reality knowledge).

Once the large consensus reality knowledge base exists, whether via methodology (1) or (2), the everyday sort of fringe learning takes over, and the system should be educable in the usual ways: by giving it a carefully graded series of readings to do, asking it thought-provoking questions, and helping it over novel or difficult parts by posing a good metaphor drawn from its existing knowledge base.

Many researchers are working on limited forms of approach (1)--e.g., the CMU World Modeling Project--and approach (2)--e.g., the Stanford KSL Engineering Design Project. The CYC project, which Mary Shepherd and I have been working on at MCC since late 1984, is aiming at the fully scaled-up approach (2). We knew when we started that we would have to overcome many representation thorns (e.g., how to deal with time, space, belief, awareness, causality, emotion, stuffs, etc.) and methodological thorns (e.g., how to have tons of knowledge enterers simultaneously editing the same KB, and how to keep their semantics from diverging). Overcoming those thorns meant finding an adequate way to handle the 99% of the common cases that crop up in everyday life.
For example, CYC only represents pieces of time that can be expressed using a simple grammar; those pieces of time are interrelated using a set of 50 relations (such as ends-during) derived by R.V. Guha. We have developed two dozen specialized inference methods (such as inheritance, automatic classification, slot-value subsumption, and Horn clause rules) rather than having a system that relies on one general inference procedure. CYC can't easily represent or reason about "the Cantor set of moments from three to four p.m."--but then again, neither can most people! Time and again, that pragmatic focus (not always scruffy, by the way) has pulled us through. Lenat and Guha [1988] describes the CYC project in great detail and explains our solutions to each thorn.

Since 1984, we've been building and organizing and reorganizing our growing consensus reality KB in CYC. We now have about half a million entries in it, and we expect it to increase by one order of magnitude by mid-1990 and by one more by the end of 1994. We expect that at roughly that point a kind of crossover will occur, and it will be cost-effective to enlarge the system from that point onward by having it learn mostly on its own and from online texts.

Naturally, we must build up the CYC KB from some sort of primitives. We have assumed that it must be built from deeply understood knowledge rather than from complex "impenetrable" predicates (or slots or whatever). That is, you can't have LaysEggsInWater unless you also have eggs, water, and so on. At first, doing this just made life difficult; having a deep but small KB didn't pa (...truncated)
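To make the machinery described above a bit more concrete, here is a minimal sketch in Python; the class, unit, and slot names are my own illustrative assumptions, not CYC's actual representation or vocabulary. It shows a piece of time expressible in a trivially simple grammar related to another by an ends-during style test, inheritance as one cheap specialized inference method that merely walks isa links instead of invoking a general prover, and a laysEggsIn slot whose value is stated in terms of separately represented Egg and Water units rather than as an impenetrable predicate.

    # Purely illustrative sketch -- not CYC's implementation.
    from dataclasses import dataclass

    # --- Pieces of time: only intervals a simple grammar can express --------
    @dataclass
    class Interval:
        start: int          # e.g., minutes past midnight
        end: int

    def ends_during(a: Interval, b: Interval) -> bool:
        """True if a ends strictly inside b (one of the pairwise interval relations)."""
        return b.start < a.end < b.end

    # --- A tiny frame KB with an isa hierarchy -------------------------------
    isa = {"Frog": "Animal", "Animal": "TangibleThing", "Water": "Liquid"}
    slots = {
        "Animal": {"needs": "Water"},
        "Frog":   {"laysEggsIn": "Water"},   # stated via Egg/Water units, not an opaque symbol
        "Egg":    {"isMadeOf": "OrganicStuff"},
    }

    def inherited_value(unit, slot):
        """Specialized inference method: inherit a slot value along isa links."""
        while unit is not None:
            value = slots.get(unit, {}).get(slot)
            if value is not None:
                return value
            unit = isa.get(unit)
        return None

    if __name__ == "__main__":
        lunch   = Interval(12 * 60, 13 * 60)          # noon to 1 p.m.
        meeting = Interval(12 * 60 + 30, 14 * 60)     # 12:30 to 2 p.m.
        print(ends_during(lunch, meeting))            # True: lunch ends during the meeting
        print(inherited_value("Frog", "needs"))       # "Water", inherited from Animal
        print(inherited_value("Frog", "laysEggsIn"))  # "Water", asserted locally on Frog

The point of the toy is only to convey why a handful of cheap, specialized procedures (an interval test, a walk up isa links) can cover most everyday cases without a single general inference engine.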



Douglas B. Lenat. When will machines learn? Machine Learning, Volume 4, Issue 3-4, 1989, pp. 255-257. DOI: 10.1007/BF00130713