Pinker
PENGUIN BOOKS
HOW THE MIND WORKS
‘A witty, erudite, stimulating and provocative book that throws much new light on the machinery of the mind. An important book’ Kenan Malik, Independent on Sunday
‘He has a great deal to say, much of it ground-breaking, some of it highly controversial... a primer in remarkable science writing’ Simon Garfield, Mail on Sunday
‘As lengthy as it is, it will produce a book in the reader's head that is even longer. For it alters completely the way one thinks about thinking, and its unforeseen consequences probably can't be contained by a book’ Christopher Lehmann-Haupt, The New York Times
‘Witty popular science that you enjoy reading for the writing as well as for the science. No other science writer makes me laugh so much ... He is a top-rate writer, and deserves the superlatives that are lavished on him’ Mark Ridley, The New York Times Book Review
‘The humour, breadth and clarity of thought... make this work essential reading for anyone curious about the human mind’ Raymond Dolan, Observer
Steven Pinker, a native of Montreal, studied experimental psychology at McGill University and Harvard University. After serving on the faculties of Harvard and Stanford universities he moved to the Massachusetts Institute of Technology, where he is currently Professor of Psychology and Director of the Centre for Cognitive Neuroscience. Pinker has studied many aspects of language and of visual cognition, with a focus on language acquisition in children. He is a fellow of several scientific societies, and has been awarded research prizes from the National Academy of Sciences and the American Psychological Association, a teaching prize from MIT, and book prizes from the American Psychological Association, the Linguistics Society of America and the Los Angeles Times. His classic The Language Instinct is also available in Penguin.
{i}
HOW
THE MIND
WORKS
Steven Pinker
PENGUIN BOOKS
{ii}
PENGUIN BOOKS
Published by the Penguin Group
Penguin Books Ltd, 27 Wrights Lane, London W8 5TZ, England
Penguin Putnam Inc., 375 Hudson Street, New York, New York 10014, USA
Penguin Books Australia Ltd, Ringwood, Victoria, Australia
Penguin Books Canada Ltd, 10 Alcorn Avenue, Toronto, Ontario, Canada M4V 3B2
Penguin Books (NZ) Ltd, 182-190 Wairau Road, Auckland 10, New Zealand
Penguin Books Ltd, Registered Offices: Harmondsworth, Middlesex, England
First published in the USA by W. W. Norton 1997
First published in Great Britain by Allen Lane The Penguin Press 1998
Published in Penguin Books 1998
Copyright © Steven Pinker, 1997
All rights reserved
The notices on page 627 constitute an extension of this copyright page
The moral right of the author has been asserted
Printed in England by Clays Ltd, St Ives plc
Except in the United States of America, this book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form of binding or cover other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser
{vii}
CONTENTS
Preface ix
1 Standard Equipment 3
2 Thinking Machines 59
3 Revenge of the Nerds 149
4 The Mind's Eye 211
5 Good Ideas 299
6 Hotheads 363
7 Family Values 425
8 The Meaning of Life 521
Notes 567
References 589
Index 629
{ix}
Any book called How the Mind Works had better begin on a note of humility, and I will begin with two.
First, we don't understand how the mind works — not nearly as well as we understand how the body works, and certainly not well enough to design Utopia or to cure unhappiness. Then why the audacious title? The linguist Noam Chomsky once suggested that our ignorance can be divided into problems and mysteries. When we face a problem, we may not know its solution, but we have insight, increasing knowledge, and an inkling of what we are looking for. When we face a mystery, however, we can only stare in wonder and bewilderment, not knowing what an explanation would even look like. I wrote this book because dozens of mysteries of the mind, from mental images to romantic love, have recently been upgraded to problems (though there are still some mysteries, too!). Every idea in the book may turn out to be wrong, but that would be progress, because our old ideas were too vapid to be wrong.
Second, I have not discovered what we do know about how the mind works. Few of the ideas in the pages to follow are mine. I have selected, from many disciplines, theories that strike me as offering a special insight into our thoughts and feelings, that fit the facts and predict new ones, and that are consistent in their content and in their style of explanation. My goal was to weave the ideas into a cohesive picture using two even bigger ideas that are not mine: the computational theory of mind and the theory of the natural selection of replicators. {x}
The opening chapter presents the big picture: that the mind is a system of organs of computation designed by natural selection to solve the problems faced by our evolutionary ancestors in their foraging way of life. Each of the two big ideas — computation and evolution — then gets a chapter. I dissect the major faculties of the mind in chapters on perception, reasoning, emotion, and social relations (family, lovers, rivals, friends, acquaintances, allies, enemies). A final chapter discusses our higher callings: art, music, literature, humor, religion, and philosophy. There is no chapter on language; my previous book The Language Instinct covers the topic in a complementary way.
This book is intended for anyone who is curious about how the mind works. I didn't write it only for professors and students, but I also didn't write it only to “popularize science.” I am hoping that scholars and general readers both might profit from a bird's-eye view of the mind and how it enters into human affairs. At this high altitude there is little difference between a specialist and a thoughtful layperson because nowadays we specialists cannot be more than laypeople in most of our own disciplines, let alone neighboring ones. I have not given comprehensive literature reviews or an airing of all sides to every debate, because they would have made the book unreadable, indeed, unliftable. My conclusions come from assessments of the convergence of evidence from different fields and methods, and I have provided detailed citations so readers can follow them up.
I have intellectual debts to many teachers, students, and colleagues, but most of all to John Tooby and Leda Cosmides. They forged the synthesis between evolution and psychology that made this book possible, and thought up many of the theories I present (and many of the better jokes). By inviting me to spend a year as a Fellow of the Center for Evolutionary Psychology at the University of California, Santa Barbara, they provided an ideal environment for thinking and writing and immeasurable friendship and advice.
I am deeply grateful to Michael Gazzaniga, Marc Hauser, David Kemmerer, Gary Marcus, John Tooby, and Margo Wilson for their reading of the entire manuscript and their invaluable criticism and encouragement. Other colleagues generously commented on chapters in their areas of expertise: Edward Adelson, Barton Anderson, Simon Baron-Cohen, Ned Block, Paul Bloom, David Brainard, David Buss, John Constable, Leda Cosmides, Helena Cronin, Dan Dennett, David Epstein, Alan Fridlund, Gerd Gigerenzer, Judith Harris, Richard Held, Ray Jackendoff, Alex Kacelnik, Stephen Kosslyn, Jack Loomis, Charles Oman, Bernard Sherman, {xi} Paul Smolensky, Elizabeth Spelke, Frank Sulloway, Donald Symons, and Michael Tarr. Many others answered queries and offered profitable suggestions, including Robert Boyd, Donald Brown, Napoleon Chagnon, Martin Daly, Richard Dawkins, Robert Hadley, James Hillenbrand, Don Hoffman, Kelly Olguin Jaakola, Timothy Ketelaar, Robert Kurzban, Dan Montello, Alex Pentland, Roslyn Pinker, Robert Provine, Whitman Richards, Daniel Schacter, Devendra Singh, Pawan Sinha, Christopher Tyler, Jeremy Wolfe, and Robert Wright.
This book is a product of the stimulating environments at two institutions, the Massachusetts Institute of Technology and the University of California, Santa Barbara. Special thanks go to Emilio Bizzi of the Department of Brain and Cognitive Sciences at MIT for enabling me to take a sabbatical leave, and to Loy Lytle and Aaron Ettenberg of the Department of Psychology and to Patricia Clancy and Marianne Mithun of the Department of Linguistics at UCSB for inviting me to be a Visiting Scholar in their departments.
Patricia Claffey of MIT's Teuber Library knows everything, or at least knows where to find it, which is just as good. I am grateful for her indefatigable efforts to track down the obscurest material with swiftness and good humor. My secretary, the well-named Eleanor Bonsaint, offered professional, cheerful help in countless matters. Thanks go also to Marianne Teuber and to Sabrina Detmar and Jennifer Riddell of MIT's List Visual Arts Center for advice on the jacket art.
My editors, Drake McFeely (Norton), Howard Boyer (now at the University of California Press), Stefan McGrath (Penguin), and Ravi Mirchandani (now at Orion), offered fine advice and care throughout. I am also grateful to my agents, John Brockman and Katinka Matson, for their efforts on my behalf and their dedication to science writing. Special appreciation goes to Katya Rice, who has now worked with me on four books over fourteen years. Her analytical eye and masterly touch have improved the books and have taught me much about clarity and style.
My heartfelt gratitude goes to my family for their encouragement and suggestions: to Harry, Roslyn, Robert, and Susan Pinker, Martin, Eva, Carl, and Eric Boodman, Saroja Subbiah, and Stan Adams. Thanks, too, to Windsor, Wilfred, and Fiona.
Greatest thanks of all go to my wife, Ilavenil Subbiah, who designed the figures, provided invaluable comments on the manuscript, offered constant advice, support, and kindness, and shared in the adventure. This book is dedicated to her, with love and gratitude. {xii}
My research on mind and language has been supported by the National Institutes of Health (grant HD 18381), the National Science Foundation (grants 82-09540, 85-18774, and 91-09766), and the McDonnell-Pew Center for Cognitive Neuroscience at MIT.
{1}
HOW
THE MIND
WORKS
{3}
Why are there so many robots in fiction, but none in real life? I would pay a lot for a robot that could put away the dishes or run simple errands. But I will not have the opportunity in this century, and probably not in the next one either. There are, of course, robots that weld or spray-paint on assembly lines and that roll through laboratory hallways; my question is about the machines that walk, talk, see, and think, often better than their human masters. Since 1920, when Karel Capek coined the word robot in his play R.U.R., dramatists have freely conjured them up: Speedy, Cutie, and Dave in Isaac Asimov's I, Robot, Robbie in Forbidden Planet, the flailing canister in Lost in Space, the daleks in Dr. Who, Rosie the Maid in The Jetsons, Nomad in Star Trek, Hymie in Get Smart, the vacant butlers and bickering haberdashers in Sleeper, R2D2 and C3PO in Star Wars, the Terminator in The Terminator, Lieutenant Commander Data in Star Trek: The Next Generation, and the wisecracking film critics in Mystery Science Theater 3000.
This book is not about robots; it is about the human mind. I will try to explain what the mind is, where it came from, and how it lets us see, think, feel, interact, and pursue higher callings like art, religion, and philosophy. On the way I will try to throw light on distinctively human quirks. Why do memories fade? How does makeup change the look of a face? Where do ethnic stereotypes come from, and when are they irrational? Why do people lose their tempers? What makes children bratty? Why do fools fall in love? What makes us laugh? And why do people believe in ghosts and spirits? {4}
But the gap between robots in imagination and in reality is my starting point, for it shows the first step we must take in knowing ourselves: appreciating the fantastically complex design behind feats of mental life we take for granted. The reason there are no humanlike robots is not that the very idea of a mechanical mind is misguided. It is that the engineering problems that we humans solve as we see and walk and plan and make it through the day are far more challenging than landing on the moon or sequencing the human genome. Nature, once again, has found ingenious solutions that human engineers cannot yet duplicate. When Hamlet says, “What a piece of work is a man! how noble in reason! how infinite in faculty! in form and moving how express and admirable!” we should direct our awe not at Shakespeare or Mozart or Einstein or Kareem Abdul-Jabbar but at a four-year-old carrying out a request to put a toy on a shelf.
In a well-designed system, the components are black boxes that perform their functions as if by magic. That is no less true of the mind. The faculty with which we ponder the world has no ability to peer inside itself or our other faculties to see what makes them tick. That makes us the victims of an illusion: that our own psychology comes from some divine force or mysterious essence or almighty principle. In the Jewish legend of the Golem, a clay figure was animated when it was fed an inscription of the name of God. The archetype is echoed in many robot stories. The statue of Galatea was brought to life by Venus’ answer to Pygmalion's prayers; Pinocchio was vivified by the Blue Fairy. Modern versions of the Golem archetype appear in some of the less fanciful stories of science. All of human psychology is said to be explained by a single, omnipotent cause: a large brain, culture, language, socialization, learning, complexity, self-organization, neural-network dynamics.
I want to convince you that our minds are not animated by some godly vapor or single wonder principle. The mind, like the Apollo spacecraft, is designed to solve many engineering problems, and thus is packed with high-tech systems each contrived to overcome its own obstacles. I begin by laying out these problems, which are both design specs for a robot and the subject matter of psychology. For I believe that the discovery by cognitive science and artificial intelligence of the technical challenges overcome by our mundane mental activity is one of the great revelations of science, an awakening of the imagination comparable to learning that the universe is made up of billions of galaxies or that a drop of pond water teems with microscopic life. {5}
What does it take to build a robot? Let's put aside superhuman abilities like calculating planetary orbits and begin with the simple human ones: seeing, walking, grasping, thinking about objects and people, and planning how to act.
In movies we are often shown a scene from a robot's-eye view, with the help of cinematic conventions like fish-eye distortion or crosshairs. That is fine for us, the audience, who already have functioning eyes and brains. But it is no help to the robot's innards. The robot does not house an audience of little people — homunculi — gazing at the picture and telling the robot what they are seeing. If you could see the world through a robot's eyes, it would look not like a movie picture decorated with crosshairs but something like this:
225 221 216 219 219 214 207 218 219 220 207 155 136 135
213 206 213 223 208 217 223 221 223 216 195 156 141 130
206 217 210 216 224 223 228 230 234 216 207 157 136 132
211 213 221 223 220 222 237 216 219 220 176 149 137 132
221 229 218 230 228 214 213 209 198 224 161 140 133 127
220 219 224 220 219 215 215 206 206 221 159 143 133 131
221 215 211 214 220 218 221 212 218 204 148 141 131 130
214 211 211 218 214 220 226 216 223 209 143 141 141 124
211 208 223 213 216 226 231 230 241 199 153 141 136 125
200 224 219 215 217 224 232 241 240 211 150 139 128 132
204 206 208 205 233 241 241 252 242 192 151 141 133 130
200 205 201 216 232 248 255 246 231 210 149 141 132 126
191 194 209 238 245 255 249 235 238 197 146 139 130 132
189 199 200 227 239 237 235 236 247 192 145 142 124 133
198 196 209 211 210 215 236 240 232 177 142 137 135 124
198 203 205 208 211 224 226 240 210 160 139 132 129 130
216 209 214 220 210 231 245 219 169 143 148 129 128 136
211 210 217 218 214 227 244 221 162 140 139 129 133 131
215 210 216 216 209 220 248 200 156 139 131 129 139 128
219 220 211 208 205 209 240 217 154 141 127 130 124 142
229 224 212 214 220 229 234 208 151 145 128 128 142 122
252 224 222 224 233 244 228 213 143 141 135 128 131 129
255 235 230 249 253 240 228 193 147 139 132 128 136 125
250 245 238 245 246 235 235 190 139 136 134 135 126 130
Each number represents the brightness of one of the millions of tiny patches making up the visual field. The smaller numbers come from darker patches, the larger numbers from brighter patches. The numbers shown in the array are the actual signals coming from an electronic camera trained on a person's hand, though they could just as well be the firing rates of some of the nerve fibers coming from the eye to the brain as a person looks at a hand. For a robot brain — or a human brain — to recognize objects and not bump into them, it must crunch these numbers and guess what kinds of objects in the world reflected the light that gave rise to them. The problem is humblingly difficult.
First, a visual system must locate where an object ends and the backdrop begins. But the world is not a coloring book, with black outlines around solid regions. The world as it is projected into our eyes is a mosaic of tiny shaded patches. Perhaps, one could guess, the visual brain looks for regions where a quilt of large numbers (a brighter region) abuts a quilt of small numbers (a darker region). You can discern such a boundary in the square of numbers; it runs diagonally from the top right to the bottom center. Most of the time, unfortunately, you would not have found the edge of an object, where it gives way to empty space. The juxtaposition of large and small numbers could have come from many distinct arrangements of matter. This drawing, devised by the psychologists Pawan Sinha and Edward Adelson, appears to show a ring of light gray and dark gray tiles. {7}
In fact, it is a rectangular cutout in a black cover through which you are looking at part of a scene. In the next drawing the cover has been removed, and you can see that each pair of side-by-side gray squares comes from a different arrangement of objects.
Big numbers next to small numbers can come from an object standing in front of another object, dark paper lying on light paper, a surface painted two shades of gray, two objects touching side by side, gray cellophane on a white page, an inside or outside corner where two walls meet, or a shadow. Somehow the brain must solve the chicken-and-egg problem of identifying three-dimensional objects from the patches on the retina and determining what each patch is (shadow or paint, crease or overlay, clear or opaque) from knowledge of what object the patch is part of.
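The guess described above, looking for places where large numbers abut small numbers, can be sketched in a few lines of Python. This is only an illustration of that naive idea, not a claim about how any visual system works. The patch is copied from the array of brightness values (the last six values of its first four rows, assuming the array is read as rows of 14 numbers), and the cutoff of 30 is an arbitrary, hypothetical choice.

```python
# Excerpt copied from the brightness array: the last six values of its
# first four rows, assuming the numbers are read as rows of 14.
PATCH = [
    [219, 220, 207, 155, 136, 135],
    [223, 216, 195, 156, 141, 130],
    [234, 216, 207, 157, 136, 132],
    [219, 220, 176, 149, 137, 132],
]

# Hypothetical cutoff: how large a jump between neighbors counts as a boundary.
THRESHOLD = 30


def brightness_steps(row, threshold=THRESHOLD):
    """Return each index i where row[i] and row[i + 1] differ by at least threshold."""
    return [
        i for i in range(len(row) - 1)
        if abs(row[i + 1] - row[i]) >= threshold
    ]


steps_per_row = [brightness_steps(row) for row in PATCH]

for row, steps in zip(PATCH, steps_per_row):
    print(row, "-> brightness step after index", steps)
```

The sketch does locate the diagonal boundary in this patch, but it returns only a position. Nothing in the output says whether the step came from an object's edge, a shadow, a crease, or a change of paint, which is the ambiguity the passage describes.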
The difficulties have just begun. Once we have carved the visual world into objects, we need to know what they are made of, say, snow versus coal. At first glance the problem looks simple. If large numbers come from bright regions and small numbers come from dark regions, then large number equals white equals snow and small number equals black equals coal, right? Wrong. The amount of light hitting a spot on the retina depends not only on how pale or dark the object is but also on how bright or dim the light illuminating the object is. A photographer's light meter would show you that more light bounces off a lump of coal outdoors than off a snowball indoors. That is why people are so often disappointed by their snapshots and why photography is such a complicated craft. The camera does not lie; left to its own devices, it renders outdoor {8} scenes as milk and indoor scenes as mud. Photographers, and sometimes microchips inside the camera, coax a realistic image out of the film with tricks like adjustable shutter timing, lens apertures, film speeds, flashes, and darkroom manipulations.
Our visual system does much better. Somehow it lets us see the bright outdoor coal as black and the dark indoor snowball as white. That is a happy outcome, because our conscious sensation of color and lightness matches the world as it is rather than the world as it presents itself to the eye. The snowball is soft and wet and prone to melt whether it is indoors or out, and we see it as white whether it is indoors or out. The coal is always hard and dirty and prone to burn, and we always see it as black. The harmony between how the world looks and how the world is must be an achievement of our neural wizardry, because black and white don't simply announce themselves on the retina. In case you are still skeptical, here is an everyday demonstration. When a television set is off, the screen is a pale greenish gray. When it is on, some of the phosphor dots give off light, painting in the bright areas of the picture. But the other dots do not suck light and paint in the dark areas; they just stay gray. The areas that you see as black are in fact just the pale shade of the picture tube when the set was off. The blackness is a figment, a product of the brain circuitry that ordinarily allows you to see coal as coal. Television engineers exploited that circuitry when they designed the screen.
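The coal and snowball comparison comes down to simple arithmetic: the light reaching the eye is roughly the product of how reflective the surface is and how strongly it is lit. The sketch below works through one case. All four numbers are hypothetical round figures chosen for illustration (they do not come from the text), and the multiplicative model is a simplification.

```python
# Hypothetical fractions of incoming light that each surface reflects.
COAL_REFLECTANCE = 0.05
SNOW_REFLECTANCE = 0.90

# Hypothetical illumination levels, in arbitrary units.
OUTDOOR_LIGHT = 10_000
INDOOR_LIGHT = 100


def light_reaching_eye(reflectance, illumination):
    """Simplified model: reflected light = reflectance x illumination."""
    return reflectance * illumination


coal_outdoors = light_reaching_eye(COAL_REFLECTANCE, OUTDOOR_LIGHT)
snow_indoors = light_reaching_eye(SNOW_REFLECTANCE, INDOOR_LIGHT)

print("coal outdoors:", coal_outdoors)
print("snow indoors: ", snow_indoors)
print("coal sends more light:", coal_outdoors > snow_indoors)
```

Under these assumed values the coal sends more light to the eye than the snowball does, so a single brightness number cannot settle which surface is black and which is white. The same product can come from many pairings of surface and lighting.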
The next problem is seeing in depth. Our eyes squash the three-dimensional world into a pair of two-dimensional retinal images, and the third dimension must be reconstituted by the brain. But there are no telltale signs in the patches on the retina that reveal how far away a surface is. A stamp in your palm can project the same square on your retina as a chair across the room or a building miles away (first drawing, page 9). A cutting board viewed head-on can project the same trapezoid as various irregular shards held at a slant (second drawing, page 9).
You can feel the force of this fact of geometry, and of the neural mechanism that copes with it, by staring at a lightbulb for a few seconds or looking at a camera as the flash goes off, which temporarily bleaches a patch onto your retina. If you now look at the page in front of you, the afterimage adheres to it and appears to be an inch or two across. If you look up at the wall, the afterimage appears several feet long. If you look at the sky, it is the size of a cloud.
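The geometry behind both observations can be written down directly. An object's size on the retina depends on the ratio of its size to its distance, so very different objects can subtend the same visual angle, and a patch of fixed angle (such as an afterimage) corresponds to a larger surface the farther away it is taken to be. The sizes, distances, and the 5-degree afterimage in this sketch are hypothetical values chosen for illustration.

```python
import math


def visual_angle_deg(size, distance):
    """Angle subtended at the eye by an object of a given size seen head-on."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


def size_at_distance(angle_deg, distance):
    """Size of a surface that would subtend angle_deg at the given distance."""
    return 2 * distance * math.tan(math.radians(angle_deg) / 2)


# Hypothetical (size, distance) pairs in meters, all with the same ratio.
OBJECTS = {
    "stamp in palm": (0.025, 0.5),
    "chair across the room": (0.5, 10.0),
    "building far away": (50.0, 1000.0),
}
angles = {name: visual_angle_deg(s, d) for name, (s, d) in OBJECTS.items()}

# Hypothetical afterimage of fixed angular size, "projected" onto surfaces
# at different assumed distances (meters).
AFTERIMAGE_ANGLE_DEG = 5.0
SURFACES = {"page": 0.4, "wall": 8.0, "cloud": 1000.0}
apparent_sizes = {
    name: size_at_distance(AFTERIMAGE_ANGLE_DEG, d) for name, d in SURFACES.items()
}

for name, angle in angles.items():
    print(f"{name}: {angle:.3f} degrees")
for name, size in apparent_sizes.items():
    print(f"afterimage on {name}: about {size:.2f} m across")
```

The three objects print the same angle, which is why the retinal image alone cannot reveal distance. The afterimage lines show the reverse direction: one fixed angle implies very different sizes once a distance is assumed.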
Finally, how might a vision module recognize the objects out there in the world, so that the robot can name them or recall what they do? The {9}
obvious solution is to build a template or cutout for each object that duplicates its shape. When an object appears, its projection on the retina would fit its own template like a round peg in a round hole. The template would be labeled with the name of the shape — in this case, “the letter P” — and whenever a shape matches it, the template announces the name:
Alas, this simple device malfunctions in both possible ways. It sees P's that aren't there; for example, it gives a false alarm to the R shown in the first square below. And it fails to see P's that are there; for example, it misses the letter when it is shifted, tilted, slanted, too far, too near, or too fancy: {10}
And these problems arise with a nice, crisp letter of the alphabet. Imagine trying to design a recognizer for a shirt, or a face! To be sure, after four decades of research in artificial intelligence, the technology of shape recognition has improved. You may own software that scans in a page, recognizes the printing, and converts it with reasonable accuracy to a file of bytes. But artificial shape recognizers are still no match for the ones in our heads. The artificial ones are designed for pristine, easy-to-recognize worlds and not the squishy, jumbled real world. The funny numbers at the bottom of checks were carefully drafted to have shapes that don't overlap and are printed with special equipment that positions them exactly so that they can be recognized by templates. When the first face recognizers are installed in buildings to replace doormen, they will not even try to interpret the chiaroscuro of your face but will scan in the hard-edged, rigid contours of your iris or your retinal blood vessels. Our brains, in contrast, keep a record of the shape of every face we know (and every letter, animal, tool, and so on), and the record is somehow matched with a retinal image even when the image is distorted in all the ways we have been examining. In Chapter 4 we will explore how the brain accomplishes this magnificent feat.
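The two failures of the template device can be reproduced with tiny bitmaps. The sketch below is one possible reading of the cutout idea: a template fires whenever every spot where it expects ink actually has ink. The 5-by-5 letter shapes are made up for illustration, and the matching rule is an assumption rather than a description of any particular recognizer.

```python
# Hypothetical 5x5 bitmaps: '#' is ink, '.' is blank.
TEMPLATE_P = [
    "###..",
    "#..#.",
    "###..",
    "#....",
    "#....",
]

LETTER_R = [
    "###..",
    "#..#.",
    "###..",
    "#.#..",
    "#..#.",
]

# The same P moved one column to the right.
SHIFTED_P = [
    ".###.",
    ".#..#",
    ".###.",
    ".#...",
    ".#...",
]


def template_fires(template, image):
    """Fire if the image has ink everywhere the template expects ink."""
    return all(
        image[r][c] == "#"
        for r, row in enumerate(template)
        for c, cell in enumerate(row)
        if cell == "#"
    )


print("P on P:        ", template_fires(TEMPLATE_P, TEMPLATE_P))
print("P on R:        ", template_fires(TEMPLATE_P, LETTER_R))
print("P on shifted P:", template_fires(TEMPLATE_P, SHIFTED_P))
```

The R contains all the ink of the P plus an extra leg, so the template fires falsely, while a P shifted by a single column no longer lines up and is missed. Tightening the rule to require an exact match would cure the false alarm but make the misses worse.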
Let's take a look at another everyday miracle: getting a body from place to place. When we want a machine to move, we put it on wheels. The invention of the wheel is often held up as the proudest accomplishment of civilization. Many textbooks point out that no animal has evolved wheels and cite the fact as an example of how evolution is often incapable of finding the optimal solution to an engineering problem. But it is not a good example at all. Even if nature could have evolved a moose on wheels, it surely would have opted not to. Wheels are good only in a world with roads and rails. They bog down in any terrain that is soft, slippery, steep, or uneven. Legs are better. Wheels have to roll along an unbroken supporting ridge, but legs can be placed on a series of separate footholds, an extreme example being a ladder. Legs can also be placed to minimize lurching and to step over obstacles. Even today, when it seems as if the world has become a parking lot, only about half of the earth's land is accessible to vehicles with wheels or tracks, but most of the earth's land is accessible to vehicles with feet: animals, the vehicles designed by natural selection. {11}
But legs come with a high price: the software to control them. A wheel, merely by turning, changes its point of support gradually and can bear weight the whole time. A leg has to change its point of support all at once, and the weight has to be unloaded to do so. The motors controlling a leg have to alternate between keeping the foot on the ground while it bears and propels the load and taking the load off to make the leg free to move. All the while they have to keep the center of gravity of the body within the polygon defined by the feet so the body doesn't topple over. The controllers also must minimize the wasteful up-and-down motion that is the bane of horseback riders. In walking windup toys, these problems are crudely solved by a mechanical linkage that converts a rotating shaft into a stepping motion. But the toys cannot adjust to the terrain by finding the best footholds.
Even if we solved these problems, we would have figured out only how to control a walking insect. With six legs, an insect can always keep one tripod on the ground while it lifts the other tripod. At any instant, it is stable. Even four-legged beasts, when they aren't moving too quickly, can keep a tripod on the ground at all times. But as one engineer has put it, “the upright two-footed locomotion of the human being seems almost a recipe for disaster in itself, and demands a remarkable control to make it practicable.” When we walk, we repeatedly tip over and break our fall in the nick of time. When we run, we take off in bursts of flight. These aerobatics allow us to plant our feet on widely or erratically spaced footholds that would not prop us up at rest, and to squeeze along narrow paths and jump over obstacles. But no one has yet figured out how we do it.
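The stability condition mentioned above, keeping the center of gravity within the polygon defined by the feet, is easy to state as a test even though satisfying it while moving is hard. The sketch below checks whether a point lies inside a polygon using the standard ray-casting method. The foot positions and center-of-gravity coordinates are hypothetical, and the check covers only standing still on flat ground, not the dynamic balance of walking or running.

```python
def is_statically_stable(center_of_gravity, feet):
    """True if the point (x, y) lies inside the polygon whose corners are the feet.

    Uses ray casting: count how many polygon edges a horizontal ray from the
    point crosses; an odd count means the point is inside.
    """
    x, y = center_of_gravity
    inside = False
    n = len(feet)
    for i in range(n):
        x1, y1 = feet[i]
        x2, y2 = feet[(i + 1) % n]
        # Only edges that straddle the ray's height can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical tripod of planted feet, seen from above.
TRIPOD = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]

print(is_statically_stable((2.0, 1.0), TRIPOD))      # weight over the tripod
print(is_statically_stable((5.0, 1.0), TRIPOD))      # weight outside it
print(is_statically_stable((2.0, 1.0), TRIPOD[:2]))  # only two feet down
```

With three feet down there is a region the body's weight can sit over. With only two point-like feet the polygon collapses to a line and this simple test can never pass, which hints at why two-legged walking needs continuous active control.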
Controlling an arm presents a new challenge. Grab the shade of an architect's lamp and move it along a straight diagonal path from near you, low on the left, to far from you, high on the right. Look at the rods and hinges as the lamp moves. Though the shade proceeds along a straight line, each rod swings through a complicated arc, swooping rapidly at times, remaining almost stationary at other times, sometimes reversing from a bending to a straightening motion. Now imagine having to do it in reverse: without looking at the shade, you must choreograph the sequence of twists around each joint that would send the shade along a straight path. The trigonometry is frightfully complicated. But your arm is an architect's lamp, and your brain effortlessly solves the equations every time you point. And if you have ever held an architect's lamp by its clamp, you will appreciate that the problem is even harder than what I have described. The lamp flails under its weight as if it had a mind of its {12} own; so would your arm if your brain did not compensate for its weight, solving a near-intractable physics problem.
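A heavily simplified version of the lamp problem shows what the trigonometry involves. The sketch below assumes a flat, two-rod arm with unit-length rods (a real lamp or arm has more joints, moves in three dimensions, and has weight, all ignored here) and uses the textbook closed-form solution for the two joint angles that place the tip at a given point. The start and end points of the path are hypothetical.

```python
import math

# Assumed rod lengths for a planar two-joint arm.
ROD1 = 1.0
ROD2 = 1.0


def tip_position(shoulder, elbow):
    """Forward problem: joint angles (radians) -> position of the tip."""
    x = ROD1 * math.cos(shoulder) + ROD2 * math.cos(shoulder + elbow)
    y = ROD1 * math.sin(shoulder) + ROD2 * math.sin(shoulder + elbow)
    return x, y


def joint_angles(x, y):
    """Inverse problem: desired tip position -> one valid pair of joint angles."""
    c = (x * x + y * y - ROD1 ** 2 - ROD2 ** 2) / (2 * ROD1 * ROD2)
    if abs(c) > 1:
        raise ValueError("target is out of reach")
    elbow = math.acos(c)
    shoulder = math.atan2(y, x) - math.atan2(
        ROD2 * math.sin(elbow), ROD1 + ROD2 * math.cos(elbow)
    )
    return shoulder, elbow


def straight_path(start, end, steps):
    """Evenly spaced points on the straight line from start to end."""
    (sx, sy), (ex, ey) = start, end
    return [
        (sx + (ex - sx) * t / steps, sy + (ey - sy) * t / steps)
        for t in range(steps + 1)
    ]


path = straight_path((0.3, 0.2), (1.2, 1.0), 6)
angles = [joint_angles(x, y) for x, y in path]

for (x, y), (shoulder, elbow) in zip(path, angles):
    print(
        f"tip ({x:.2f}, {y:.2f}) -> shoulder {math.degrees(shoulder):7.2f} deg, "
        f"elbow {math.degrees(elbow):7.2f} deg"
    )
```

The tip positions are evenly spaced along a straight line, but the printed joint angles do not change by equal amounts from step to step. Even in this toy case a straight movement of the tip requires uneven, coordinated rotations at the joints.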
A still more remarkable feat is controlling the hand. Nearly two thousand years ago, the Greek physician Galen pointed out the exquisite natural engineering behind the human hand. It is a single tool that manipulates objects of an astonishing range of sizes, shapes, and weights, from a log to a millet seed. “Man handles them all,” Galen noted, “as well as if his hands had been made for the sake of each one of them alone.” The hand can be configured into a hook grip (to lift a pail), a scissors grip (to hold a cigarette), a five-jaw chuck (to lift a coaster), a three-jaw chuck (to hold a pencil), a two-jaw pad-to-pad chuck (to thread a needle), a two-jaw pad-to-side chuck (to turn a key), a squeeze grip (to hold a hammer), a disc grip (to open a jar), and a spherical grip (to hold a ball). Each grip needs a precise combination of muscle tensions that mold the hand into the right shape and keep it there as the load tries to bend it back. Think of lifting a milk carton. Too loose a grasp, and you drop it; too tight, and you crush it; and with some gentle rocking, you can even use the tugging on your fingertips as a gauge of how much milk is inside! And I won't even begin to talk about the tongue, a boneless water balloon controlled only by squeezing, which can loosen food from a back tooth or perform the ballet that articulates words like thrilling and sixths.
“A common man marvels at uncommon things; a wise man marvels at the commonplace.” Keeping Confucius’ dictum in mind, let's continue to look at commonplace human acts with the fresh eye of a robot designer seeking to duplicate them. Pretend that we have somehow built a robot that can see and move. What will it do with what it sees? How should it decide how to act?
An intelligent being cannot treat every object it sees as a unique entity unlike anything else in the universe. It has to put objects in categories so that it may apply its hard-won knowledge about similar objects, encountered in the past, to the object at hand.
But whenever one tries to program a set of criteria to capture the members of a category, the category disintegrates. Leaving aside slippery concepts like “beauty” or “dialectical materialism,” let's look at a textbook {13} example of a well-defined one: “bachelor.” A bachelor, of course, is simply an adult human male who has never been married. But now imagine that a friend asks you to invite some bachelors to her party. What would happen if you used the definition to decide which of the following people to invite?
Arthur has been living happily with Alice for the last five years. They have a two-year-old daughter and have never officially married.
Bruce was going to be drafted, so he arranged with his friend Barbara to have a justice of the peace marry them so he would be exempt. They have never lived together. He dates a number of women, and plans to have the marriage annulled as soon as he finds someone he wants to marry.
Charlie is 17 years old. He lives at home with his parents and is in high school.
David is 17 years old. He left home at 13, started a small business, and is now a successful young entrepreneur leading a playboy's lifestyle in his penthouse apartment.
Eli and Edgar are homosexual lovers who have been living together for many years.
Faisal is allowed by the law of his native Abu Dhabi to have three wives. He currently has two and is interested in meeting another potential fiancee.
Father Gregory is the bishop of the Catholic cathedral at Groton upon Thames.
The list, which comes from the computer scientist Terry Winograd, shows that the straightforward definition of “bachelor” does not capture our intuitions about who fits the category.
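To see how quickly the definition breaks down, here is a minimal sketch of what happens when the textbook criteria are programmed literally. The `Person` record, the `is_bachelor` function, the adult-age threshold of 18, and every age other than Charlie's and David's are assumptions made for illustration; they do not come from Winograd's list.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int            # assumed for everyone except Charlie and David
    male: bool
    ever_married: bool  # legal marital history only


def is_bachelor(p: Person, adult_age: int = 18) -> bool:
    """Literal textbook definition: an adult male who has never been married."""
    return p.male and p.age >= adult_age and not p.ever_married


people = [
    Person("Arthur", 35, True, False),    # lives with Alice, never officially married
    Person("Bruce", 25, True, True),      # married on paper to avoid the draft
    Person("Charlie", 17, True, False),   # in high school, lives with parents
    Person("David", 17, True, False),     # entrepreneur with a playboy's lifestyle
    Person("Eli", 40, True, False),       # long-term partner of Edgar
    Person("Edgar", 40, True, False),
    Person("Faisal", 30, True, True),     # currently has two wives
    Person("Gregory", 60, True, False),   # Catholic bishop, assumed never married
]

invited = [p.name for p in people if is_bachelor(p)]
print(invited)
```

Under these assumptions the literal definition invites Arthur, Eli, Edgar, and Father Gregory, and turns away Bruce, David, and Faisal, which is close to the reverse of what the friend presumably had in mind.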
Knowing who is a bachelor is just common sense, but there's nothing common about common sense. Somehow it must find its way into a human or robot brain. And common sense is not simply an almanac about life that can be dictated by a teacher or downloaded like an enormous database. No database could list all the facts we tacitly know, and no one ever taught them to us. You know that when Irving puts the dog in the car, it is no longer in the yard. When Edna goes to church, her head goes with her. If Doug is in the house, he must have gone in through some opening unless he was born there and never left. If Sheila is alive {14} at 9 A.M. and is alive at 5 P.M., she was also alive at noon. Zebras in the wild never wear underwear. Opening a jar of a new brand of peanut butter will not vaporize the house. People never shove meat thermometers in their ears. A gerbil is smaller than Mt. Kilimanjaro.
An intelligent system, then, cannot be stuffed with trillions of facts. It must be equipped with a smaller list of core truths and a set of rules to deduce their implications. But the rules of common sense, like the categories of common sense, are frustratingly hard to set down. Even the most straightforward ones fail to capture our everyday reasoning. Mavis lives in Chicago and has a son named Fred, and Millie lives in Chicago and has a son named Fred. But whereas the Chicago that Mavis lives in is the same Chicago that Millie lives in, the Fred who is Mavis’ son is not the same Fred who is Millie's son. If there's a bag in your car, and a gallon of milk in the bag, there is a gallon of milk in your car. But if there's a person in your car, and a gallon of blood in a person, it would be strange to conclude that there is a gallon of blood in your car.
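The milk-and-blood example can be sketched as a single deduction rule: if A is in B and B is in C, then A is in C. The `inside` table and the `contents` function below are hypothetical names chosen for illustration, and the rule is deliberately naive.

```python
def contents(container, inside):
    """Return everything transitively inside `container`.

    `inside` maps each container to the things directly in it.
    """
    found = []
    stack = list(inside.get(container, []))
    while stack:
        item = stack.pop()
        found.append(item)
        # Apply the rule again: whatever is in this item is in the container too.
        stack.extend(inside.get(item, []))
    return found


inside = {
    "car": ["bag", "person"],
    "bag": ["gallon of milk"],
    "person": ["gallon of blood"],
}

in_car = set(contents("car", inside))
print(sorted(in_car))
```

The rule reports the gallon of milk and the gallon of blood in exactly the same way. Nothing in it marks the second conclusion as strange, which is the difficulty the passage describes.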
Even if you were to craft a set of rules that derived only sensible conclusions, it is no easy matter to use them all to guide behavior intelligently. Clearly a thinker cannot apply just one rule at a time. A match gives light; a saw cuts wood; a locked door is opened with a key. But we laugh at the man who lights a match to peer into a fuel tank, who saws off the limb he is sitting on, or who locks his keys in the car and spends the next hour wondering how to get his family out. A thinker has to compute not just the direct effects of an action but the side effects as well.
But a thinker cannot crank out predictions about all the side effects, either. The philosopher Daniel Dennett asks us to imagine a robot designed to fetch a spare battery from a room that also contained a time bomb. Version 1 saw that the battery was on a wagon and that if it pulled the wagon out of the room, the battery would come with it. Unfortunately, the bomb was also on the wagon, and the robot failed to deduce that pulling the wagon out brought the bomb out, too. Version 2 was programmed to consider all the side effects of its actions. It had just finished computing that pulling the wagon would not change the color of the room's walls and was proving that the wheels would turn more revolutions than there are wheels on the wagon, when the bomb went off. Version 3 was programmed to distinguish between relevant implications and irrelevant ones. It sat there cranking out millions of implications and putting all the relevant ones on a list of facts to consider and all the irrelevant ones on a list of facts to ignore, as the bomb ticked away. {15}
An intelligent being has to deduce the implications of what it knows, but only the relevant implications. Dennett points out that this requirement poses a deep problem not only for robot design but for epistemology, the analysis of how we know. The problem escaped the notice of generations of philosophers, who were left complacent by the illusory effortlessness of their own common sense. Only when artificial intelligence researchers tried to duplicate common sense in computers, the ultimate blank slate, did the conundrum, now called “the frame problem,” come to light. Yet somehow we all solve the frame problem whenever we use our common sense.
Imagine that we have somehow overcome these challenges and have a machine with sight, motor coordination, and common sense. Now we must figure out how the robot will put them to use. We have to give it motives.
What should a robot want? The classic answer is Isaac Asimov's Fundamental Rules of Robotics, “the three rules that are built most deeply into a robot's positronic brain.”
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Asimov insightfully noticed that self-preservation, that universal biological imperative, does not automatically emerge in a complex system. It has to be programmed in (in this case, as the Third Law). After all, it is just as easy to build a robot that lets itself go to pot or eliminates a malfunction by committing suicide as it is to build a robot that always looks out for Number One. Perhaps easier; robot-makers sometimes watch in horror as their creations cheerfully shear off limbs or flatten themselves against walls, and a good proportion of the world's most intelligent machines are kamikaze cruise missiles and smart bombs.
But the need for the other two laws is far from obvious. Why give a {16} robot an order to obey orders — why aren't the original orders enough? Why command a robot not to do harm — wouldn't it be easier never to command it to do harm in the first place? Does the universe contain a mysterious force pulling entities toward malevolence, so that a positronic brain must be programmed to withstand it? Do intelligent beings inevitably develop an attitude problem?
In this case Asimov, like generations of thinkers, like all of us, was unable to step outside his own thought processes and see them as artifacts of how our minds were put together rather than as inescapable laws of the universe. Man's capacity for evil is never far from our minds, and it is easy to think that evil just comes along with intelligence as part of its very essence. It is a recurring theme in our cultural tradition: Adam and Eve eating the fruit of the tree of knowledge, Promethean fire and Pandora's box, the rampaging Golem, Faust's bargain, the Sorcerer's Apprentice, the adventures of Pinocchio, Frankenstein's monster, the murderous apes and mutinous HAL of 2001: A Space Odyssey. From the 1950s through the 1980s, countless films in the computer-runs-amok genre captured a popular fear that the exotic mainframes of the era would get smarter and more powerful and someday turn on us.
Now that computers really have become smarter and more powerful, the anxiety has waned. Today's ubiquitous, networked computers have an unprecedented ability to do mischief should they ever go to the bad. But the only mayhem comes from unpredictable chaos or from human malice in the form of viruses. We no longer worry about electronic serial killers or subversive silicon cabals because we are beginning to appreciate that malevolence — like vision, motor coordination, and common sense — does not come free with computation but has to be programmed in. The computer running WordPerfect on your desk will continue to fill paragraphs for as long as it does anything at all. Its software will not insidiously mutate into depravity like the picture of Dorian Gray.
Even if it could, why would it want to? To get — what? More floppy disks? Control over the nation's railroad system? Gratification of a desire to commit senseless violence against laser-printer repairmen? And wouldn't it have to worry about reprisals from technicians who with the turn of a screwdriver could leave it pathetically singing “A Bicycle Built for Two”? A network of computers, perhaps, could discover the safety in numbers and plot an organized takeover — but what would make one computer volunteer to fire the data packet heard round the world and risk early martyrdom? And what would prevent the coalition from being {17} undermined by silicon draft-dodgers and conscientious objectors? Aggression, like every other part of human behavior we take for granted, is a challenging engineering problem!
But then, so are the kinder, gentler motives. How would you design a robot to obey Asimov's injunction never to allow a human being to come to harm through inaction? Michael Frayn's 1965 novel The Tin Men is set in a robotics laboratory, and the engineers in the Ethics Wing, Macintosh, Goldwasser, and Sinson, are testing the altruism of their robots. They have taken a bit too literally the hypothetical dilemma in every moral philosophy textbook in which two people are in a lifeboat built for one and both will die unless one bails out. So they place each robot in a raft with another occupant, lower the raft into a tank, and observe what happens.
[The] first attempt, Samaritan I, had pushed itself overboard with great alacrity, but it had gone overboard to save anything which happened to be next to it on the raft, from seven stone of lima beans to twelve stone of wet seaweed. After many weeks of stubborn argument Macintosh had conceded that the lack of discrimination was unsatisfactory, and he had abandoned Samaritan I and developed Samaritan II, which would sacrifice itself only for an organism at least as complicated as itself.
The raft stopped, revolving slowly, a few inches above the water. “Drop it,” cried Macintosh.
The raft hit the water with a sharp report. Sinson and Samaritan sat perfectly still. Gradually the raft settled in the water, until a thin tide began to wash over the top of it. At once Samaritan leaned forward and seized Sinson's head. In four neat movements it measured the size of his skull, then paused, computing. Then, with a decisive click, it rolled sideways off the raft and sank without hesitation to the bottom of the tank.
But as the Samaritan II robots came to behave like the moral agents in the philosophy books, it became less and less clear that they were really moral at all. Macintosh explained why he did not simply tie a rope around the self-sacrificing robot to make it easier to retrieve: “I don't want it to know that it's going to be saved. It would invalidate its decision to sacrifice itself. . . . So, every now and then I leave one of them in instead of fishing it out. To show the others I mean business. I've written off two this week.” Working out what it would take to program goodness into a robot shows not only how much machinery it takes to be good but how slippery the concept of goodness is to start with.
And what about the most caring motive of all? The weak-willed {18} computers of 1960s pop culture were not tempted only by selfishness and power, as we see in the comedian Allan Sherman's song “Automation,” sung to the tune of “Fascination”:
It was automation, I know.
That was what was making the factory go.
It was IBM, it was Univac,
It was all those gears going clickety clack, dear.
I thought automation was keen
Till you were replaced by a ten-ton machine.
It was a computer that tore us apart, dear,
Automation broke my heart. . . .
It was automation, I'm told,
That's why I got fired and I'm out in the cold.
How could I have known, when the 503
Started in to blink, it was winking at me, dear?
I thought it was just some mishap
When it sidled over and sat on my lap.
But when it said “I love you” and gave me a hug, dear,
That's when I pulled out . . . its . . . plug.
But for all its moonstruck madness, love is no bug or crash or malfunction. The mind is never so wonderfully concentrated as when it turns to love, and there must be intricate calculations that carry out the peculiar logic of attraction, infatuation, courtship, coyness, surrender, commitment, malaise, philandering, jealousy, desertion, and heartbreak. And in the end, as my grandmother used to say, every pot finds a cover; most people — including, significantly, all of our ancestors — manage to pair up long enough to produce viable children. Imagine how many lines of programming it would take to duplicate that!
Robot design is a kind of consciousness-raising. We tend to be blasé about our mental lives. We open our eyes, and familiar articles present themselves; we will our limbs to move, and objects and bodies float into place; we awaken from a dream, and return to a comfortingly predictable {19} world; Cupid draws back his bow, and lets his arrow go. But think of what it takes for a hunk of matter to accomplish these improbable outcomes, and you begin to see through the illusion. Sight and action and common sense and violence and morality and love are no accident, no inextricable ingredients of an intelligent essence, no inevitability of information processing. Each is a tour de force, wrought by a high level of targeted design. Hidden behind the panels of consciousness must lie fantastically complex machinery — optical analyzers, motion guidance systems, simulations of the world, databases on people and things, goal-schedulers, conflict-resolvers, and many others. Any explanation of how the mind works that alludes hopefully to some single master force or mind-bestowing elixir like “culture,” “learning,” or “self-organization” begins to sound hollow, just not up to the demands of the pitiless universe we negotiate so successfully.
The robot challenge hints at a mind loaded with original equipment, but it still may strike you as an argument from the armchair. Do we actually find signs of this intricacy when we look directly at the machinery of the mind and at the blueprints for assembling it? I believe we do, and what we see is as mind-expanding as the robot challenge itself.
When the visual areas of the brain are damaged, for example, the visual world is not simply blurred or riddled with holes. Selected aspects of visual experience are removed while others are left intact. Some patients see a complete world but pay attention only to half of it. They eat food from the right side of the plate, shave only the right cheek, and draw a clock with twelve digits squished into the right half. Other patients lose their sensation of color, but they do not see the world as an arty black-and-white movie. Surfaces look grimy and rat-colored to them, killing their appetite and their libido. Still others can see objects change their positions but cannot see them move — a syndrome that a philosopher once tried to convince me was logically impossible! The stream from a teapot does not flow but looks like an icicle; the cup does not gradually fill with tea but is empty and then suddenly full.
Other patients cannot recognize the objects they see: their world is like handwriting they cannot decipher. They copy a bird faithfully but identify it as a tree stump. A cigarette lighter is a mystery until it is lit. When they try to weed the garden, they pull out the roses. Some patients can recognize inanimate objects but cannot recognize faces. The patient deduces that the visage in the mirror must be his, but does not viscerally recognize himself. He identifies John F. Kennedy as Martin Luther King, {20} and asks his wife to wear a ribbon at a party so he can find her when it is time to leave. Stranger still is the patient who recognizes the face but not the person: he sees his wife as an amazingly convincing impostor.
These syndromes are caused by an injury, usually a stroke, to one or more of the thirty brain areas that compose the primate visual system. Some areas specialize in color and form, others in where an object is, others in what an object is, still others in how it moves. A seeing robot cannot be built with just the fish-eye viewfinder of the movies, and it is no surprise to discover that humans were not built that way either. When we gaze at the world, we do not fathom the many layers of apparatus that underlie our unified visual experience, until neurological disease dissects them for us.
Another expansion of our vista comes from the startling similarities between identical twins, who share the genetic recipes that build the mind. Their minds are astonishingly alike, and not just in gross measures like IQ and personality traits like neuroticism and introversion. They are alike in talents such as spelling and mathematics, in opinions on questions such as apartheid, the death penalty, and working mothers, and in their career choices, hobbies, vices, religious commitments, and tastes in dating. Identical twins are far more alike than fraternal twins, who share only half their genetic recipes, and most strikingly, they are almost as alike when they are reared apart as when they are reared together. Identical twins separated at birth share traits like entering the water backwards and only up to their knees, sitting out elections because they feel insufficiently informed, obsessively counting everything in sight, becoming captain of the volunteer fire department, and leaving little love notes around the house for their wives.
People find these discoveries arresting, even incredible. The discoveries cast doubt on the autonomous “I” that we all feel hovering above our bodies, making choices as we proceed through life and affected only by our past and present environments. Surely the mind does not come equipped with so many small parts that it could predestine us to flush the toilet before and after using it or to sneeze playfully in crowded elevators, to take two other traits shared by identical twins reared apart. But apparently it does. The far-reaching effects of the genes have been documented in scores of studies and show up no matter how one tests for them: by comparing twins reared apart and reared together, by comparing identical and fraternal twins, or by comparing adopted and biological children. And despite what critics sometimes claim, the effects are not {21} products of coincidence, fraud, or subtle similarities in the family environments (such as adoption agencies striving to place identical twins in homes that both encourage walking into the ocean backwards). The findings, of course, can be misinterpreted in many ways, such as by imagining a gene for leaving little love notes around the house or by concluding that people are unaffected by their experiences. And because this research can measure only the ways in which people differ, it says little about the design of the mind that all normal people share. But by showing how many ways the mind can vary in its innate structure, the discoveries open our eyes to how much structure the mind must have.
The complex structure of the mind is the subject of this book. Its key idea can be captured in a sentence: The mind is a system of organs of computation, designed by natural selection to solve the kinds of problems our ancestors faced in their foraging way of life, in particular, understanding and outmaneuvering objects, animals, plants, and other people. The summary can be unpacked into several claims. The mind is what the brain does; specifically, the brain processes information, and thinking is a kind of computation. The mind is organized into modules or mental organs, each with a specialized design that makes it an expert in one arena of interaction with the world. The modules’ basic logic is specified by our genetic program. Their operation was shaped by natural selection to solve the problems of the hunting and gathering life led by our ancestors in most of our evolutionary history. The various problems for our ancestors were subtasks of one big problem for their genes, maximizing the number of copies that made it into the next generation.
On this view, psychology is engineering in reverse. In forward-engineering, one designs a machine to do something; in reverse-engineering, one figures out what a machine was designed to do. Reverse-engineering is what the boffins at Sony do when a new product is announced by Panasonic, or vice versa. They buy one, bring it back to the lab, take a screwdriver to it, and try to figure out what all the parts are for and how they combine to make the device work. We all engage in reverse-engineering when we face an interesting new gadget. In rummaging through {22} an antique store, we may find a contraption that is inscrutable until we figure out what it was designed to do. When we realize that it is an olive-pitter, we suddenly understand that the metal ring is designed to hold the olive, and the lever lowers an X-shaped blade through one end, pushing the pit out through the other end. The shapes and arrangements of the springs, hinges, blades, levers, and rings all make sense in a satisfying rush of insight. We even understand why canned olives have an X-shaped incision at one end.
In the seventeenth century William Harvey discovered that veins had valves and deduced that the valves must be there to make the blood circulate. Since then we have understood the body as a wonderfully complex machine, an assembly of struts, ties, springs, pulleys, levers, joints, hinges, sockets, tanks, pipes, valves, sheaths, pumps, exchangers, and filters. Even today we can be delighted to learn what mysterious parts are for. Why do we have our wrinkled, asymmetrical ears? Because they filter sound waves coming from different directions in different ways. The nuances of the sound shadow tell the brain whether the source of the sound is above or below, in front of or behind us. The strategy of reverse-engineering the body has continued in the last half of this century as we have explored the nanotechnology of the cell and of the molecules of life. The stuff of life turned out to be not a quivering, glowing, wondrous gel but a contraption of tiny jigs, springs, hinges, rods, sheets, magnets, zippers, and trapdoors, assembled by a data tape whose information is copied, downloaded, and scanned.
The rationale for reverse-engineering living things comes, of course, from Charles Darwin. He showed how “organs of extreme perfection and complication, which justly excite our admiration” arise not from God's foresight but from the evolution of replicators over immense spans of time. As replicators replicate, random copying errors sometimes crop up, and those that happen to enhance the survival and reproduction rate of the replicator tend to accumulate over the generations. Plants and animals are replicators, and their complicated machinery thus appears to have been engineered to allow them to survive and reproduce.
Darwin insisted that his theory explained not just the complexity of an animal's body but the complexity of its mind. “Psychology will be based on a new foundation,” he famously predicted at the end of The Origin of Species. But Darwin's prophecy has not yet been fulfilled. More than a century after he wrote those words, the study of the mind is still mostly Darwin-free, often defiantly so. Evolution is said to be irrelevant, {23} sinful, or fit only for speculation over a beer at the end of the day. The allergy to evolution in the social and cognitive sciences has been, I think, a barrier to understanding. The mind is an exquisitely organized system that accomplishes remarkable feats no engineer can duplicate. How could the forces that shaped that system, and the purposes for which it was designed, be irrelevant to understanding it? Evolutionary thinking is indispensable, not in the form that many people think of — dreaming up missing links or narrating stories about the stages of Man — but in the form of careful reverse-engineering. Without reverse-engineering we are like the singer in Tom Paxton's “The Marvelous Toy” reminiscing about a childhood present: “It went ZIP! when it moved, and POP! when it stopped, and WHIRRR! when it stood still; I never knew just what it was, and I guess I never will.”
Only in the past few years has Darwin's challenge been taken up, by a new approach christened “evolutionary psychology” by the anthropologist John Tooby and the psychologist Leda Cosmides. Evolutionary psychology brings together two scientific revolutions. One is the cognitive revolution of the 1950s and 1960s, which explains the mechanics of thought and emotion in terms of information and computation. The other is the revolution in evolutionary biology of the 1960s and 1970s, which explains the complex adaptive design of living things in terms of selection among replicators. The two ideas make a powerful combination. Cognitive science helps us to understand how a mind is possible and what kind of mind we have. Evolutionary biology helps us to understand why we have the kind of mind we have.
The evolutionary psychology of this book is, in one sense, a straightforward extension of biology, focusing on one organ, the mind, of one species, Homo sapiens. But in another sense it is a radical thesis that discards the way issues about the mind have been framed for almost a century. The premises of this book are probably not what you think they are. Thinking is computation, I claim, but that does not mean that the computer is a good metaphor for the mind. The mind is a set of modules, but the modules are not encapsulated boxes or circumscribed swatches on the surface of the brain. The organization of our mental modules comes from our genetic program, but that does not mean that there is a gene for every trait or that learning is less important than we used to think. The mind is an adaptation designed by natural selection, but that does not mean that everything we think, feel, and do is biologically adaptive. We evolved from apes, but that does not mean we have the same minds as {24} apes. And the ultimate goal of natural selection is to propagate genes, but that does not mean that the ultimate goal of people is to propagate genes. Let me show you why not.
This book is about the brain, but I will not say much about neurons, hormones, and neurotransmitters. That is because the mind is not the brain but what the brain does, and not even everything it does, such as metabolizing fat and giving off heat. The 1990s have been named the Decade of the Brain, but there will never be a Decade of the Pancreas. The brain's special status comes from a special thing the brain does, which makes us see, think, feel, choose, and act. That special thing is information processing, or computation.
Information and computation reside in patterns of data and in relations of logic that are independent of the physical medium that carries them. When you telephone your mother in another city, the message stays the same as it goes from your lips to her ears even as it physically changes its form, from vibrating air, to electricity in a wire, to charges in silicon, to flickering light in a fiber optic cable, to electromagnetic waves, and then back again in reverse order. In a similar sense, the message stays the same when she repeats it to your father at the other end of the couch after it has changed its form inside her head into a cascade of neurons firing and chemicals diffusing across synapses. Likewise, a given program can run on computers made of vacuum tubes, electromagnetic switches, transistors, integrated circuits, or well-trained pigeons, and it accomplishes the same things for the same reasons.
This insight, first expressed by the mathematician Alan Turing, the computer scientists Allen Newell, Herbert Simon, and Marvin Minsky, and the philosophers Hilary Putnam and Jerry Fodor, is now called the computational theory of mind. It is one of the great ideas in intellectual history, for it solves one of the puzzles that make up the “mind-body problem”: how to connect the ethereal world of meaning and intention, the stuff of our mental lives, with a physical hunk of matter like the brain. Why did Bill get on the bus? Because he wanted to visit his grandmother and knew the bus would take him there. No other answer will do. If he hated the sight of his grandmother, or if he knew the route had changed, his body would not be on that bus. For millennia this has been {25} a paradox. Entities like “wanting to visit one's grandmother” and “knowing the bus goes to Grandma's house” are colorless, odorless, and tasteless. But at the same time they are causes of physical events, as potent as any billiard ball clacking into another.
The computational theory of mind resolves the paradox. It says that beliefs and desires are information, incarnated as configurations of symbols. The symbols are the physical states of bits of matter, like chips in a computer or neurons in the brain. They symbolize things in the world because they are triggered by those things via our sense organs, and because of what they do once they are triggered. If the bits of matter that constitute a symbol are arranged to bump into the bits of matter constituting another symbol in just the right way, the symbols corresponding to one belief can give rise to new symbols corresponding to another belief logically related to it, which can give rise to symbols corresponding to other beliefs, and so on. Eventually the bits of matter constituting a symbol bump into bits of matter connected to the muscles, and behavior happens. The computational theory of mind thus allows us to keep beliefs and desires in our explanations of behavior while planting them squarely in the physical universe. It allows meaning to cause and be caused.
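The idea that symbols for one belief can give rise to symbols for another, and eventually to behavior, can be illustrated with a toy rule system. This is only a sketch under loud assumptions: the tuple encoding, the single rule, and the names `step` and `run` are invented for illustration and are not a model proposed in the text.

```python
def step(facts):
    """Apply one rule: wanting G plus believing action A leads to G yields doing A."""
    new = set(facts)
    for f in facts:
        if f[0] != "wants":
            continue
        _, agent, goal = f
        for g in facts:
            if g[0] == "believes_leads_to" and g[1] == agent and g[3] == goal:
                new.add(("does", agent, g[2]))
    return new


def run(facts):
    """Keep applying the rule until no new symbols appear."""
    facts = set(facts)
    while True:
        nxt = step(facts)
        if nxt == facts:
            return facts
        facts = nxt


desire = ("wants", "Bill", "visit grandmother")
belief = ("believes_leads_to", "Bill", "get on bus", "visit grandmother")

with_belief = run({desire, belief})
without_belief = run({desire})

print(("does", "Bill", "get on bus") in with_belief)
print(("does", "Bill", "get on bus") in without_belief)
```

With both the desire and the belief present, the behavior symbol is produced; remove the belief, as when Bill knows the route has changed, and it is not. The program never consults what "grandmother" means. It only matches the arrangement of the symbols.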
The computational theory of mind is indispensable in addressing the questions we long to answer. Neuroscientists like to point out that all parts of the cerebral cortex look pretty much alike — not only the different parts of the human brain, but the brains of different animals. One could draw the conclusion that all mental activity in all animals is the same. But a better conclusion is that we cannot simply look at a patch of brain and read out the logic in the intricate pattern of connectivity that makes each part do its separate thing. In the same way that all books are physically just different combinations of the same seventy-five or so characters, and all movies are physically just different patterns of charges along the tracks of a videotape, the mammoth tangle of spaghetti of the brain may all look alike when examined strand by strand. The content of a book or a movie lies in the pattern of ink marks or magnetic charges, and is apparent only when the piece is read or seen. Similarly, the content of brain activity lies in the patterns of connections and patterns of activity among the neurons. Minute differences in the details of the connections may cause similar-looking brain patches to implement very different programs. Only when the program is run does the coherence become evident. As Tooby and Cosmides have written, {26}
There are birds that migrate by the stars, bats that echolocate, bees that compute the variance of flower patches, spiders that spin webs, humans that speak, ants that farm, lions that hunt in teams, cheetahs that hunt alone, monogamous gibbons, polyandrous seahorses, polygynous gorillas. There are millions of animal species on earth, each with a different set of cognitive programs. The same basic neural tissue embodies all of these programs, and it could support many others as well. Facts about the properties of neurons, neurotransmitters, and cellular development cannot tell you which of these millions of programs the human mind contains. Even if all neural activity is the expression of a uniform process at the cellular level, it is the arrangement of neurons — into bird song templates or web-spinning programs — that matters.
That does not imply, of course, that the brain is irrelevant to understanding the mind! Programs are assemblies of simple information-processing units — tiny circuits that can add, match a pattern, turn on some other circuit, or do other elementary logical and mathematical operations. What those microcircuits can do depends only on what they are made of. Circuits made from neurons cannot do exactly the same things as circuits made from silicon, and vice versa. For example, a silicon circuit is faster than a neural circuit, but a neural circuit can match a larger pattern than a silicon one. These differences ripple up through the programs built from the circuits and affect how quickly and easily the programs do various things, even if they do not determine exactly which things they do. My point is not that prodding brain tissue is irrelevant to understanding the mind, only that it is not enough. Psychology, the analysis of mental software, will have to burrow a considerable way into the mountain before meeting the neurobiologists tunneling through from the other side.
The computational theory of mind is not the same thing as the despised “computer metaphor.” As many critics have pointed out, computers are serial, doing one thing at a time; brains are parallel, doing millions of things at once. Computers are fast; brains are slow. Computer parts are reliable; brain parts are noisy. Computers have a limited number of connections; brains have trillions. Computers are assembled according to a blueprint; brains must assemble themselves. Yes, and computers come in putty-colored boxes and have AUTOEXEC.BAT files and run screen-savers with flying toasters, and brains do not. The claim is not that the brain is like commercially available computers. Rather, the claim is that brains and computers embody intelligence for some of the same {27} reasons. To explain how birds fly, we invoke principles of lift and drag and fluid mechanics that also explain how airplanes fly. That does not commit us to an Airplane Metaphor for birds, complete with jet engines and complimentary beverage service.
Without the computational theory, it is impossible to make sense of the evolution of the mind. Most intellectuals think that the human mind must somehow have escaped the evolutionary process. Evolution, they think, can fabricate only stupid instincts and fixed action patterns: a sex drive, an aggression urge, a territorial imperative, hens sitting on eggs and ducklings following hulks. Human behavior is too subtle and flexible to be a product of evolution, they think; it must come from somewhere else — from, say, “culture.” But if evolution equipped us not with irresistible urges and rigid reflexes but with a neural computer, everything changes. A program is an intricate recipe of logical and statistical operations directed by comparisons, tests, branches, loops, and subroutines embedded in subroutines. Artificial computer programs, from the Macintosh user interface to simulations of the weather to programs that recognize speech and answer questions in English, give us a hint of the finesse and power of which computation is capable. Human thought and behavior, no matter how subtle and flexible, could be the product of a very complicated program, and that program may have been our endowment from natural selection. The typical imperative from biology is not “Thou shalt . . . ,” but “If . . . then . . . else.”
The mind, I claim, is not a single organ but a system of organs, which we can think of as psychological faculties or mental modules. The entities now commonly evoked to explain the mind — such as general intelligence, a capacity to form culture, and multipurpose learning strategies — will surely go the way of protoplasm in biology and of earth, air, fire, and water in physics. These entities are so formless, compared to the exacting phenomena they are meant to explain, that they must be granted near-magical powers. When the phenomena are put under the microscope, we discover that the complex texture of the everyday world is supported not by a single substance but by many layers of elaborate machinery. Biologists long ago replaced the concept of an all-powerful protoplasm with the concept of functionally specialized {28} mechanisms. The organ systems of the body do their jobs because each is built with a particular structure tailored to the task. The heart circulates the blood because it is built like a pump; the lungs oxygenate the blood because they are built like gas exchangers. The lungs cannot pump blood and the heart cannot oxygenate it. This specialization goes all the way down. Heart tissue differs from lung tissue, heart cells differ from lung cells, and many of the molecules making up heart cells differ from those making up lung cells. If that were not true, our organs would not work.
A jack-of-all-trades is master of none, and that is just as true for our mental organs as for our physical organs. The robot challenge makes that clear. Building a robot poses many software engineering problems, and different tricks are necessary to solve them.
Take our first problem, the sense of sight. A seeing machine must solve a problem called inverse optics. Ordinary optics is the branch of physics that allows one to predict how an object with a certain shape, material, and illumination projects the mosaic of colors we call the retinal image. Optics is a well-understood subject, put to use in drawing, photography, television engineering, and more recently, computer graphics and virtual reality. But the brain must solve the opposite problem. The input is the retinal image, and the output is a specification of the objects in the world and what they are made of — that is, what we know we are seeing. And there's the rub. Inverse optics is what engineers call an “ill-posed problem.” It literally has no solution. Just as it is easy to multiply some numbers and announce the product but impossible to take a product and announce the numbers that were multiplied to get it, optics is easy but inverse optics impossible. Yet your brain does it every time you open the refrigerator and pull out a jar. How can this be?
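The multiplication analogy can be made concrete with a small sketch. It is only an illustration of what "ill-posed" means, not a model of vision; the function names `forward` and `inverse` are invented here, and the inverse is restricted to pairs of positive whole numbers so that the candidates can be listed at all.

```python
from math import prod

def forward(factors):
    """Forward problem: multiply the numbers. There is exactly one answer."""
    return prod(factors)

def inverse(product):
    """Inverse problem: list every pair of positive integers with this product.

    Assumption for illustration: only two factors, both positive integers.
    Without that restriction the list of candidates would be endless.
    """
    return [(a, product // a)
            for a in range(1, product + 1)
            if product % a == 0 and a <= product // a]

print(forward([3, 4]))   # one product
print(inverse(12))       # several equally valid pairs of factors
```

Going forward, the numbers 3 and 4 yield a single product. Going backward, the product 12 is compatible with several pairs, and nothing in the product itself says which pair was used.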
The answer is that the brain supplies the missing information, information about the world we evolved in and how it reflects light. If the visual brain “assumes” that it is living in a certain kind of world — an evenly lit world made mostly of rigid parts with smooth, uniformly colored surfaces — it can make good guesses about what is out there. As we saw earlier, it's impossible to distinguish coal from snow by examining the brightnesses of their retinal projections. But say there is a module for perceiving the properties of surfaces, and built into it is the following assumption: “The world is smoothly and uniformly lit.” The module can solve the coal-versus-snow problem in three steps: subtract out any gradient of brightness from one edge of the scene to the other; estimate the average level of brightness of {29} the whole scene; and calculate the shade of gray of each patch by subtracting the average brightness from its brightness. Large positive deviations from the average are then seen as white things, large negative deviations as black things. If the illumination really is smooth and uniform, those perceptions will register the surfaces of the world accurately. Since Planet Earth has, more or less, met the even-illumination assumption for eons, natural selection would have done well by building the assumption in.
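The three steps can be sketched in a few lines. This is a toy under stated assumptions, not a description of the visual system: the scene is a single row of patches, the gradient is taken to be linear and additive and is estimated with a least-squares line, and the names `perceive_shades` and `label` and the threshold of 20 are invented for the illustration.

```python
def perceive_shades(brightness):
    """Return each patch's deviation from the scene average.

    `brightness` is a row of retinal brightness values, read left to right.
    """
    n = len(brightness)
    if n < 2:
        raise ValueError("need at least two patches to estimate a gradient")
    positions = range(n)
    mean_pos = sum(positions) / n
    mean_bright = sum(brightness) / n

    # Step 1: estimate the edge-to-edge gradient with a least-squares line
    # and subtract it out (assumption: the gradient is linear and additive).
    spread = sum((p - mean_pos) ** 2 for p in positions)
    slope = sum((p - mean_pos) * (b - mean_bright)
                for p, b in zip(positions, brightness)) / spread
    flattened = [b - slope * (p - mean_pos)
                 for p, b in zip(positions, brightness)]

    # Step 2: estimate the average brightness of the whole scene.
    average = sum(flattened) / n

    # Step 3: each patch's shade is its deviation from that average.
    return [b - average for b in flattened]

def label(deviation, threshold=20.0):
    """Name a shade; the threshold is an arbitrary choice for this sketch."""
    if deviation > threshold:
        return "white"
    if deviation < -threshold:
        return "black"
    return "gray"

# Coal, snow, snow, coal, with the light growing stronger toward the right.
scene = [10, 100, 110, 40]
print([label(d) for d in perceive_shades(scene)])
```

The sketch succeeds only because the scene honors the built-in assumption. Feed it a row of patches lit by an uneven patchwork of light and it will report surfaces that are not there.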
The surface-perception module solves an unsolvable problem, but at a price. The brain has given up any pretense of being a general problem-solver. It has been equipped with a gadget that perceives the nature of surfaces in typical earthly viewing conditions because it is specialized for that parochial problem. Change the problem slightly and the brain no longer solves it. Say we place a person in a world that is not blanketed with sunshine but illuminated by a cunningly arranged patchwork of light. If the surface-perception module assumes that illumination is even, it should be seduced into hallucinating objects that aren't there. Could that really happen? It happens every day. We call these hallucinations slide shows and movies and television (complete with the illusory black I mentioned earlier). When we watch TV, we stare at a shimmering piece of glass, but our surface-perception module tells the rest of our brain that we are seeing real people and places. The module has been unmasked; it does not apprehend the nature of things but relies on a cheat-sheet. That cheat-sheet is so deeply embedded in the operation of our visual brain that we cannot erase the assumptions written on it. Even in a lifelong couch potato, the visual system never “learns” that television is a pane of glowing phosphor dots, and the person never loses the illusion that there is a world behind the pane.
Our other mental modules need their own cheat-sheets to solve their unsolvable problems. A physicist who wants to figure out how the body moves when muscles are contracted has to solve problems in kinematics (the geometry of motion) and dynamics (the effects of forces). But a brain that has to figure out how to contract muscles to get the body to move has to solve problems in inverse kinematics and inverse dynamics — what forces to apply to an object to get it to move in a certain trajectory. Like inverse optics, inverse kinematics and dynamics are ill-posed problems. Our motor modules solve them by making extraneous but reasonable assumptions — not assumptions about illumination, of course, but assumptions about bodies in motion.
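A toy case shows why the inverse problem for movement is ill-posed. The sketch below covers only the kinematic half, and it assumes a flat, two-segment arm with segments of length 1; the function names and the arm itself are invented for the illustration. Going from joint angles to hand position gives one answer. Going from hand position back to joint angles generally gives two, one with the elbow bent each way, and an arm with more joints would offer infinitely many.

```python
from math import acos, atan2, cos, sin

def hand_position(shoulder, elbow, upper=1.0, fore=1.0):
    """Forward kinematics: joint angles (radians) determine one hand position."""
    x = upper * cos(shoulder) + fore * cos(shoulder + elbow)
    y = upper * sin(shoulder) + fore * sin(shoulder + elbow)
    return x, y

def joint_angles(x, y, upper=1.0, fore=1.0):
    """Inverse kinematics: every (shoulder, elbow) pair that puts the hand at (x, y)."""
    c = (x * x + y * y - upper ** 2 - fore ** 2) / (2 * upper * fore)
    if not -1.0 <= c <= 1.0:
        return []  # the target is out of reach
    solutions = []
    for elbow in (acos(c), -acos(c)):  # elbow bent one way or the other
        shoulder = atan2(y, x) - atan2(fore * sin(elbow), upper + fore * cos(elbow))
        solutions.append((shoulder, elbow))
    return solutions

for shoulder, elbow in joint_angles(1.0, 1.0):
    print((shoulder, elbow), "->", hand_position(shoulder, elbow))
```

Both postures put the hand on the same spot, so the target alone cannot settle which one to adopt. Some further assumption, such as a preference for one direction of elbow bend, has to be supplied from outside the problem.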
Our common sense about other people is a kind of intuitive psychology {30} — we try to infer people's beliefs and desires from what they do, and try to predict what they will do from our guesses about their beliefs and desires. Our intuitive psychology, though, must make the assumption that other people have beliefs and desires; we cannot sense a belief or desire in another person's head the way we smell oranges. If we did not see the social world through the lens of that assumption, we would be like the Samaritan I robot, which sacrificed itself for a bag of lima beans, or like Samaritan II, which went overboard for any object with a humanlike head, even if the head belonged to a large wind-up toy. (Later we shall see that people suffering from a certain syndrome lack the assumption that people have minds and do treat other people as wind-up toys.) Even our feelings of love for our family members embody a specific assumption about the laws of the natural world, in this case an inverse of the ordinary laws of genetics. Family feelings are designed to help our genes replicate themselves, but we cannot see or smell genes. Scientists use forward genetics to deduce how genes get distributed among organisms (for example, meiosis and sex cause the offspring of two people to have fifty percent of their genes in common); our emotions about kin use a kind of inverse genetics to guess which of the organisms we interact with are likely to share our genes (for example, if someone appears to have the same parents as you do, treat the person as if their genetic well-being overlaps with yours). I will return to all these topics in later chapters.
The mind has to be built out of specialized parts because it has to solve specialized problems. Only an angel could be a general problem-solver; we mortals have to make fallible guesses from fragmentary information. Each of our mental modules solves its unsolvable problem by a leap of faith about how the world works, by making assumptions that are indispensable but indefensible — the only defense being that the assumptions worked well enough in the world of our ancestors.
The word “module” brings to mind detachable, snap-in components, and that is misleading. Mental modules are not likely to be visible to the naked eye as circumscribed territories on the surface of the brain, like the flank steak and the rump roast on the supermarket cow display. A mental module probably looks more like roadkill, sprawling messily over the bulges and crevasses of the brain. Or it may be broken into regions that are interconnected by fibers that make the regions act as a unit. The beauty of information processing is the flexibility of its demand for real estate. Just as a corporation's management can be scattered across sites {31} linked by a telecommunications network, or a computer program can be fragmented into different parts of the disk or memory, the circuitry underlying a psychological module might be distributed across the brain in a spatially haphazard manner. And mental modules need not be tightly sealed off from one another, communicating only through a few narrow pipelines. (That is a specialized sense of “module” that many cognitive scientists have debated, following a definition by Jerry Fodor.) Modules are defined by the special things they do with the information available to them, not necessarily by the kinds of information they have available. So the metaphor of the mental module is a bit clumsy; a better one is Noam Chomsky's “mental organ.” An organ of the body is a specialized structure tailored to carry out a particular function. But our organs do not come in a bag like chicken giblets; they are integrated into a complex whole. The body is composed of systems divided into organs assembled from tissues built out of cells. Some kinds of tissues, like the epithelium, are used, with modifications, in many organs. Some organs, like the blood and the skin, interact with the rest of the body across a widespread, convoluted interface, and cannot be encircled by a dotted line. 
Sometimes it is unclear where one organ leaves off and another begins, or how big a chunk of the body we want to call an organ. (Is the hand an organ? the finger? a bone in the finger?) These are all pedantic questions of terminology, and anatomists and physiologists have not wasted their time on them. What is clear is that the body is not made of Spam but has a heterogeneous structure of many specialized parts. All this is likely to be true of the mind. Whether or not we establish exact boundaries for the components of the mind, it is clear that it is not made of mental Spam but has a heterogeneous structure of many specialized parts.
Our physical organs owe their complex design to the information in the human genome, and so, I believe, do our mental organs. We do not learn to have a pancreas, and we do not learn to have a visual system, language acquisition, common sense, or feelings of love, friendship, and fairness. No single discovery proves the claim (just as no single discovery proves that the pancreas is innately structured), but many lines of evidence converge on it. The one that most impresses me is the Robot Challenge. Each of the major engineering problems solved by the mind is unsolvable {32} without built-in assumptions about the laws that hold in that arena of interaction with the world. All of the programs designed by artificial intelligence researchers have been specially engineered for a particular domain, such as language, vision, movement, or one of many different kinds of common sense. Within artificial intelligence research, the proud parent of a program will sometimes tout it as a mere demo of an amazingly powerful general-purpose system to be built in the future, but everyone else in the field routinely writes off such hype. I predict that no one will ever build a humanlike robot — and I mean a really humanlike robot — unless they pack it with computational systems tailored to different problems.
Throughout the book we will run into other lines of evidence that our mental organs owe their basic design to our genetic program. I have already mentioned that much of the fine structure of our personality and intelligence is shared by identical twins reared apart and hence charted by the genes. Infants and young children, when tested with ingenious methods, show a precocious grasp of the fundamental categories of the physical and social world, and sometimes command information that was never presented to them. People hold many beliefs that are at odds with their experience but were true in the environment in which we evolved, and they pursue goals that subvert their own well-being but were adaptive in that environment. And contrary to the widespread belief that cultures can vary arbitrarily and without limit, surveys of the ethnographic literature show that the peoples of the world share an astonishingly detailed universal psychology.
But if the mind has a complex innate structure, that does not mean that learning is unimportant. Framing the issue in such a way that innate structure and learning are pitted against each other, either as alternatives or, almost as bad, as complementary ingredients or interacting forces, is a colossal mistake. It's not that the claim that there is an interaction between innate structure and learning (or between heredity and environment, nature and nurture, biology and culture) is literally wrong. Rather, it falls into the category of ideas that are so bad they are not even wrong.
Imagine the following dialogue:
“This new computer is brimming with sophisticated technology. It has a 500 megahertz processor, a gigabyte of RAM, a terabyte of disk storage, a 3-D color virtual reality display, speech output, wireless access to the World Wide Web, expertise in a dozen subjects, and built-in editions of {33} the Bible, the Encyclopaedia Britannica, Bartlett's Famous Quotations, and the complete works of Shakespeare. Tens of thousands of hacker-hours went into its design.”
“Oh, so I guess you're saying that it doesn't matter what I type into the computer. With all that built-in structure, its environment can't be very important. It will always do the same thing, regardless of what I type in.”
The response is patently senseless. Having a lot of built-in machinery should make a system respond more intelligently and flexibly to its inputs, not less. Yet the reply captures how centuries of commentators have reacted to the idea of a richly structured, high-tech mind.
And the “interactionist” position, with its phobia of ever specifying the innate part of the interaction, is not much better. Look at these claims.
The behavior of a computer comes from a complex interaction between the processor and the input.
When trying to understand how a car works, one cannot neglect the engine or the gasoline or the driver. All are important factors.
The sound coming out of this CD player represents the inextricably intertwined mixture of two crucial variables: the structure of the machine, and the disk you insert into it. Neither can be ignored.
These statements are true but useless — so blankly uncomprehending, so defiantly incurious, that it is almost as bad to assert them as to deny them. For minds, just as for machines, the metaphors of a mixture of two ingredients, like a martini, or a battle between matched forces, like a tug-of-war, are wrongheaded ways of thinking about a complex device designed to process information. Yes, every part of human intelligence involves culture and learning. But learning is not a surrounding gas or force field, and it does not happen by magic. It is made possible by innate machinery designed to do the learning. The claim that there are several innate modules is a claim that there are several innate learning machines, each of which learns according to a particular logic. To understand learning, we need new ways of thinking to replace the prescientific metaphors — the mixtures and forces, the writing on slates and sculpting of blocks of marble. We need ideas that capture the ways a complex device can tune itself to unpredictable aspects of the world and take in the kinds of data it needs to function.
The idea that heredity and environment interact is not always {34} meaningless, but I think it confuses two issues: what all minds have in common, and how minds can differ. The vapid statements above can be made intelligible by replacing “How X works” with “What makes X work better than Y”:
The usefulness of a computer depends on both the power of its processor and the expertise of the user.
The speed of a car depends on the engine, the fuel, and the skill of the driver. All are important factors.
The quality of sound coming from a CD player depends on two crucial variables: the player's mechanical and electronic design, and the quality of the original recording. Neither can be ignored.
When we are interested in how much better one system functions than a similar one, it is reasonable to gloss over the causal chains inside each system and tally up the factors that make the whole thing fast or slow, hi-fi or low-fi. And this ranking of people — to determine who enters medical school, or who gets the job — is where the framing of nature versus nurture comes from.
But this book is about how the mind works, not about why some people's minds might work a bit better in certain ways than other people's minds. The evidence suggests that humans everywhere on the planet see, talk, and think about objects and people in the same basic way. The difference between Einstein and a high school dropout is trivial compared to the difference between the high school dropout and the best robot in existence, or between the high school dropout and a chimpanzee. That is the mystery I want to address. Nothing could be farther from my subject matter than a comparison between the means of overlapping bell curves for some crude consumer index like IQ. And for this reason, the relative importance of innateness and learning is a phony issue.
An emphasis on innate design should not, by the way, be confused with the search for “a gene for” this or that mental organ. Think of the genes and putative genes that have made the headlines: genes for muscular dystrophy, Huntington's disease, Alzheimer's, alcoholism, schizophrenia, manic-depressive disorder, obesity, violent outbursts, dyslexia, bed-wetting, and some kinds of retardation. They are disorders, all of them. There have been no discoveries of a gene for civility, language, memory, motor control, intelligence, or other complete mental systems, and there probably won't ever be. The reason was summed up by the politician Sam Rayburn: Any jackass can kick down a barn, but it takes a {35} carpenter to build one. Complex mental organs, like complex physical organs, surely are built by complex genetic recipes, with many genes cooperating in as yet unfathomable ways. A defect in any one of them could corrupt the whole device, just as a defect in any part of a complicated machine (like a loose distributor cable in a car) can bring the machine to a halt.
The genetic assembly instructions for a mental organ do not specify every connection in the brain as if they were a wiring schematic for a Heathkit radio. And we should not expect each organ to grow under a particular bone of the skull regardless of what else happens in the brain. The brain and all the other organs differentiate in embryonic development from a ball of identical cells. Every part of the body, from the toenails to the cerebral cortex, takes on its particular shape and substance when its cells respond to some kind of information in its neighborhood that unlocks a different part of the genetic program. The information may come from the taste of the chemical soup that a cell finds itself in, from the shapes of the molecular locks and keys that the cell engages, from mechanical tugs and shoves from neighboring cells, and other cues still poorly understood. The families of neurons that will form the different mental organs, all descendants of a homogeneous stretch of embryonic tissue, must be designed to be opportunistic as the brain assembles itself, seizing any available information to differentiate from one another. The coordinates in the skull may be one trigger for differentiation, but the pattern of input firings from connected neurons is another. Since the brain is destined to be an organ of computation, it would be surprising if the genome did not exploit the capacity of neural tissue to process information during brain assembly.
In the sensory areas of the brain, where we can best keep track of what is going on, we know that early in fetal development neurons are wired according to a rough genetic recipe. The neurons are born in appropriate numbers at the right times, migrate to their resting places, send out connections to their targets, and hook up to appropriate cell types in the right general regions, all under the guidance of chemical trails and molecular locks and keys. To make precise connections, though, the baby neurons must begin to function, and their firing pattern carries information downstream about their pinpoint connections. This isn't “experience,” as it all can take place in the pitch-black womb, sometimes before the rods and cones are functioning, and many mammals can see almost perfectly as soon as they are born. It is {36} more like a kind of genetic data compression or a set of internally generated test patterns. These patterns can trigger the cortex at the receiving end to differentiate, at least one step of the way, into the kind of cortex that is appropriate to processing the incoming information. (For example, in animals that have been cross-wired so that the eyes are connected to the auditory brain, that area shows a few hints of the properties of the visual brain.) How the genes control brain development is still unknown, but a reasonable summary of what we know so far is that brain modules assume their identity by a combination of what kind of tissue they start out as, where they are in the brain, and what patterns of triggering input they get during critical periods in development.
Our organs of computation are a product of natural selection. The biologist Richard Dawkins called natural selection the Blind Watchmaker; in the case of the mind, we can call it the Blind Programmer. Our mental programs work as well as they do because they were shaped by selection to allow our ancestors to master rocks, tools, plants, animals, and each other, ultimately in the service of survival and reproduction.
Natural selection is not the only cause of evolutionary change. Organisms also change over the eons because of statistical accidents in who lives and who dies, environmental catastrophes that wipe out whole families of creatures, and the unavoidable by-products of changes that are the product of selection. But natural selection is the only evolutionary force that acts like an engineer, “designing” organs that accomplish improbable but adaptive outcomes (a point that has been made forcefully by the biologist George Williams and by Dawkins). The textbook argument for natural selection, accepted even by those who feel that selection has been overrated (such as the paleontologist Stephen Jay Gould), comes from the vertebrate eye. Just as a watch has too many finely meshing parts (gears, springs, pivots, and so on) to have been assembled by a tornado or a river eddy, entailing instead the design of a watchmaker, the eye has too many finely meshing parts (lens, iris, retina, and so on) to have arisen from a random evolutionary force like a big mutation, statistical drift, or the fortuitous shape of the nooks and crannies between other organs. The design of the eye must be a product of {37} natural selection of replicators, the only nonmiraculous natural process we know of that can manufacture well-functioning machines. The organism appears as if it was designed to see well now because it owes its existence to the success of its ancestors in seeing well in the past. (This point will be expanded in Chapter 3.)
Many people acknowledge that natural selection is the artificer of the body but draw the line when it comes to the human mind. The mind, they say, is a by-product of a mutation that enlarged the head, or is a clumsy programmer's hack, or was given its shape by cultural rather than biological evolution. Tooby and Cosmides point out a delicious irony. The eye, that most uncontroversial example of fine engineering by natural selection, is not just any old organ that can be sequestered with flesh and bone, far away from the land of the mental. It doesn't digest food or, except in the case of Superman, change anything in the physical world. What does the eye do? The eye is an organ of information processing, firmly connected to — anatomically speaking, a part of — the brain. And all those delicate optics and intricate circuits in the retina do not dump information into a yawning empty orifice or span some Cartesian chasm from a physical to a mental realm. The receiver of this richly structured message must be every bit as well engineered as the sender. As we have seen in comparing human vision and robot vision, the parts of the mind that allow us to see are indeed well engineered, and there is no reason to think that the quality of engineering progressively deteriorates as the information flows upstream to the faculties that interpret and act on what we see.
The adaptationist program in biology, or the careful use of natural selection to reverse-engineer the parts of an organism, is sometimes ridiculed as an empty exercise in after-the-fact storytelling. In the satire of the syndicated columnist Cecil Adams, “the reason our hair is brown is that it enabled our monkey ancestors to hide amongst the coconuts.” Admittedly, there is no shortage of bad evolutionary “explanations.” Why do men avoid asking for directions? Because our male ancestors might have been killed if they approached a stranger. What purpose does music serve? It brings the community together. Why did happiness evolve? Because happy people are pleasant to be around, so they attracted more allies. What is the function of humor? To relieve tension. Why do people overestimate their chance of surviving an illness? Because it helps them to operate effectively in life.
These musings strike us as glib and lame, but it is not because they {38} dare to seek an evolutionary explanation of how some part of the mind works. It is because they botch the job. First, many of them never bother to establish the facts. Has anyone ever documented that women like to ask for directions? Would a woman in a foraging society not have come to harm when she approached a stranger? Second, even if the facts had been established, the stories try to explain one puzzling fact by taking for granted some other fact that is just as much of a puzzle, getting us nowhere. Why do rhythmic noises bring a community together? Why do people like to be with happy people? Why does humor relieve tension? The authors of these explanations treat some parts of our mental life as so obvious — they are, after all, obvious to each of us, here inside our heads — that they don't need to be explained. But all parts of the mind are up for grabs — every reaction, every pleasure, every taste — when we try to explain how it evolved. We could have evolved like the Samaritan I robot, which sacrificed itself to save a sack of lima beans, or like dung beetles, which must find dung delicious, or like the masochist in the old joke about sadomasochism (Masochist: “Hit me!” Sadist: “No!”).
A good adaptationist explanation needs the fulcrum of an engineering analysis that is independent of the part of the mind we are trying to explain. The analysis begins with a goal to be attained and a world of causes and effects in which to attain it, and goes on to specify what kinds of designs are better suited to attain it than others. Unfortunately for those who think that the departments in a university reflect meaningful divisions of knowledge, it means that psychologists have to look outside psychology if they want to explain what the parts of the mind are for. To understand sight, we have to look to optics and computer vision systems. To understand movement, we have to look to robotics. To understand sexual and familial feelings, we have to look to Mendelian genetics. To understand cooperation and conflict, we have to look to the mathematics of games and to economic modeling.
Once we have a spec sheet for a well-designed mind, we can see whether Homo sapiens has that kind of mind. We do the experiments or surveys to get the facts down about a mental faculty, and then see whether the faculty meets the specs: whether it shows signs of precision, complexity, efficiency, reliability, and specialization in solving its assigned problem, especially in comparison with the vast number of alternative designs that are biologically growable.
The logic of reverse-engineering has guided researchers in visual perception for over a century, and that may be why we understand vision {39} better than we understand any other part of the mind. There is no reason that reverse-engineering guided by evolutionary theory should not bring insight about the rest of the mind. An interesting example is a new theory of pregnancy sickness (traditionally called “morning sickness”) by the biologist Margie Profet. Many pregnant women become nauseated and avoid certain foods. Though their sickness is usually explained away as a side effect of hormones, there is no reason that hormones should induce nausea and food aversions rather than, say, hyperactivity, aggressiveness, or lust. The Freudian explanation is equally unsatisfying: that pregnancy sickness represents the woman's loathing of her husband and her unconscious desire to abort the fetus orally.
Profet predicted that pregnancy sickness should confer some benefit that offsets the cost of lowered nutrition and productivity. Ordinarily, nausea is a protection against eating toxins: the poisonous food is ejected from the stomach before it can do much harm, and our appetite for similar foods is reduced in the future. Perhaps pregnancy sickness protects women against eating or digesting foods with toxins that might harm the developing fetus. Your local Happy Carrot Health Food Store notwithstanding, there is nothing particularly healthy about natural foods. Your cabbage, a Darwinian creature, has no more desire to be eaten than you do, and since it can't very well defend itself through behavior, it resorts to chemical warfare. Most plants have evolved dozens of toxins in their tissues: insecticides, insect repellents, irritants, paralytics, poisons, and other sand to throw in herbivores’ gears. Herbivores have in turn evolved countermeasures, such as a liver to detoxify the poisons and the taste sensation we call bitterness to deter any further desire to ingest them. But the usual defenses may not be enough to protect a tiny embryo.
So far this may not sound much better than the barf-up-your-baby theory, but Profet synthesized hundreds of studies, done independently of each other and of her hypothesis, that support it. She meticulously documented that (1) plant toxins in dosages that adults tolerate can cause birth defects and induce abortion when ingested by pregnant women; (2) pregnancy sickness begins at the point when the embryo's organ systems are being laid down and the embryo is most vulnerable to teratogens (birth-defect-inducing chemicals) but is growing slowly and has only a modest need for nutrients; (3) pregnancy sickness wanes at the stage when the embryo's organ systems are nearly complete and its biggest need is for nutrients to allow it to grow; (4) women with pregnancy sickness selectively avoid bitter, pungent, highly flavored, and {40} novel foods, which are in fact the ones most likely to contain toxins; (5) women's sense of smell becomes hypersensitive during the window of pregnancy sickness and less sensitive than usual thereafter; (6) foraging peoples (including, presumably, our ancestors) are at even higher risk of ingesting plant toxins, because they eat wild plants rather than domesticated crops bred for palatability; (7) pregnancy sickness is universal across human cultures; (8) women with more severe pregnancy sickness are less likely to miscarry; (9) women with more severe pregnancy sickness are less likely to bear babies with birth defects. The fit between how a baby-making system in a natural ecosystem ought to work and how the feelings of modern women do work is impressive, and gives a measure of confidence that Profet's hypothesis is correct.
The human mind is a product of evolution, so our mental organs are either present in the minds of apes (and perhaps other mammals and vertebrates) or arose from overhauling the minds of apes, specifically, the common ancestors of humans and chimpanzees that lived about six million years ago in Africa. Many titles of books on human evolution remind us of this fact: The Naked Ape, The Electric Ape, The Scented Ape, The Lopsided Ape, The Aquatic Ape, The Thinking Ape, The Human Ape, The Ape That Spoke, The Third Chimpanzee, The Chosen Primate. Some authors are militant that humans are barely different from chimpanzees and that any focus on specifically human talents is arrogant chauvinism or tantamount to creationism. For some readers that is a reductio ad absurdum of the evolutionary framework. If the theory says that man “at best is only a monkey shaved,” as Gilbert and Sullivan put it in Princess Ida, then it fails to explain the obvious fact that men and monkeys have different minds.
We are naked, lopsided apes that speak, but we also have minds that differ considerably from those of apes. The outsize brain of Homo sapiens sapiens is, by any standard, an extraordinary adaptation. It has allowed us to inhabit every ecosystem on earth, reshape the planet, walk on the moon, and discover the secrets of the physical universe. Chimpanzees, for all their vaunted intelligence, are a threatened species clinging to a few patches of forest and living as they did millions of years ago. Our curiosity about this difference demands more than repeating that we {41} share most of our DNA with chimpanzees and that small changes can have big effects. Three hundred thousand generations and up to ten megabytes of potential genetic information are enough to revamp a mind considerably. Indeed, minds are probably easier to revamp than bodies because software is easier to modify than hardware. We should not be surprised to discover impressive new cognitive abilities in humans, language being just the most obvious one.
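The text does not show where "three hundred thousand generations" and "ten megabytes" come from. One plausible reconstruction, offered only as a back-of-envelope sketch, is below. The six-million-year split comes from the chapter itself; the generation time, genome size, and fraction of differing bases are round-number assumptions introduced here for illustration, and the function names are hypothetical.

```python
# Back-of-envelope sketch of the two figures quoted in the text.
# Only the six-million-year split time is taken from the chapter;
# every other input is an assumed round number, not a value from the source.

YEARS_SINCE_SPLIT = 6_000_000       # from the text: common ancestor about six million years ago
YEARS_PER_GENERATION = 20           # assumption: rough average generation time
GENOME_BASE_PAIRS = 3_000_000_000   # assumption: approximate length of the human genome
BITS_PER_BASE = 2                   # four possible bases (A, C, G, T) need 2 bits each
FRACTION_DIFFERENT = 0.013          # assumption: roughly 1.3% of bases differ from the chimpanzee's


def generations(years: int, years_per_generation: int) -> int:
    """Number of whole generations that fit into a span of years."""
    return years // years_per_generation


def megabytes_of_difference(base_pairs: int, bits_per_base: int, fraction: float) -> float:
    """Raw information content, in megabytes, of the bases that differ."""
    bits = base_pairs * bits_per_base * fraction
    return bits / 8 / 1_000_000


if __name__ == "__main__":
    n_generations = generations(YEARS_SINCE_SPLIT, YEARS_PER_GENERATION)
    n_megabytes = megabytes_of_difference(
        GENOME_BASE_PAIRS, BITS_PER_BASE, FRACTION_DIFFERENT
    )
    print(f"generations since the split: {n_generations:,}")
    print(f"potential genetic difference: about {n_megabytes:.1f} MB")
```

Under these assumptions the generation count matches the figure in the text exactly and the information estimate lands just under ten megabytes. Different but equally defensible inputs (a longer generation time, a different measure of genetic divergence) would shift both numbers, so this is an order-of-magnitude check rather than a derivation.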
None of this is incompatible with the theory of evolution. Evolution is a conservative process, to be sure, but it can't be all that conservative or we would all be pond scum. Natural selection introduces differences into descendants by fitting them with specializations that adapt them to different niches. Any museum of natural history has examples of complex organs unique to a species or to a group of related species: the elephant's trunk, the narwhal's tusk, the whale's baleen, the platypus’ duckbill, the armadillo's armor. Often they evolve rapidly on the geological timescale. The first whale evolved in something like ten million years from its common ancestor with its closest living relatives, ungulates such as cows and pigs. A book about whales could, in the spirit of the human-evolution books, be called The Naked Cow, but it would be disappointing if the book spent every page marveling at the similarities between whales and cows and never got around to discussing the adaptations that make them so different.
To say that the mind is an evolutionary adaptation is not to say that all behavior is adaptive in Darwin's sense. Natural selection is not a guardian angel that hovers over us making sure that our behavior always maximizes biological fitness. Until recently, scientists with an evolutionary bent felt a responsibility to account for acts that seem like Darwinian suicide, such as celibacy, adoption, and contraception. Perhaps, they ventured, celibate people have more time to raise large broods of nieces and nephews and thereby propagate more copies of their genes than they would if they had their own children. This kind of stretch is unnecessary, however. The reasons, first articulated by the anthropologist Donald Symons, distinguish evolutionary psychology from the school of thought in the 1970s and 1980s called sociobiology (though there is much overlap between the approaches as well). {42}
First, selection operates over thousands of generations. For ninety-nine percent of human existence, people lived as foragers in small nomadic bands. Our brains are adapted to that long-vanished way of life, not to brand-new agricultural and industrial civilizations. They are not wired to cope with anonymous crowds, schooling, written language, government, police, courts, armies, modern medicine, formal social institutions, high technology, and other newcomers to the human experience. Since the modern mind is adapted to the Stone Age, not the computer age, there is no need to strain for adaptive explanations for everything we do. Our ancestral environment lacked the institutions that now entice us to nonadaptive choices, such as religious orders, adoption agencies, and pharmaceutical companies, so until very recently there was never a selection pressure to resist the enticements. Had the Pleistocene savanna contained trees bearing birth-control pills, we might have evolved to find them as terrifying as a venomous spider.
Second, natural selection is not a puppetmaster that pulls the strings of behavior directly. It acts by designing the generator of behavior: the package of information-processing and goal-pursuing mechanisms called the mind. Our minds are designed to generate behavior that would have been adaptive, on average, in our ancestral environment, but any particular deed done today is the effect of dozens of causes. Behavior is the outcome of an internal struggle among many mental modules, and it is played out on the chessboard of opportunities and constraints defined by other people's behavior. A recent cover story in Time asked, “Adultery: Is It in Our Genes?” The question makes no sense because neither adultery nor any other behavior can be in our genes. Conceivably a desire for adultery can be an indirect product of our genes, but the desire may be overridden by other desires that are also indirect products of our genes, such as the desire to have a trusting spouse. And the desire, even if it prevails in the rough-and-tumble of the mind, cannot be consummated as overt behavior unless there is a partner around in whom that desire has also prevailed. Behavior itself did not evolve; what evolved was the mind.
Reverse-engineering is possible only when one has a hint of what the device was designed to accomplish. We do not understand the olive-pitter until we catch on that it was designed as a machine for pitting olives {43} rather than as a paperweight or wrist-exerciser. The goals of the designer must be sought for every part of a complex device and for the device as a whole. Automobiles have a component, the carburetor, that is designed to mix air and gasoline, and mixing air and gasoline is a subgoal of the ultimate goal, carting people around. Though the process of natural selection itself has no goal, it evolved entities that (like the automobile) are highly organized to bring about certain goals and subgoals. To reverse-engineer the mind, we must sort them out and identify the ultimate goal in its design. Was the human mind ultimately designed to create beauty? To discover truth? To love and to work? To harmonize with other human beings and with nature?
The logic of natural selection gives the answer. The ultimate goal that the mind was designed to attain is maximizing the number of copies of the genes that created it. Natural selection cares only about the long-term fate of entities that replicate; that is, entities that retain a stable identity across many generations of copying. It predicts only that replicators whose effects tend to enhance the probability of their own replication come to predominate. When we ask questions like “Who or what is supposed to benefit from an adaptation?” and “What is a design in living things a design for?” the theory of natural selection provides the answer: the long-term stable replicators, genes. Even our bodies, our selves, are not the ultimate beneficiary of our design. As Gould has said, “What is the ‘individual reproductive success’ of which Darwin speaks? It cannot be the passage of one's body into the next generation — for, truly, you can't take it with you in this sense above all!” The criterion by which genes get selected is the quality of the bodies they build, but it is the genes making it into the next generation, not the perishable bodies, that are selected to live and fight another day.
Though there are some holdouts (such as Gould himself), the gene's-eye view predominates in evolutionary biology and has been a stunning success. It has asked, and is finding answers to, the deepest questions about life, such as how life arose, why there are cells, why there are bodies, why there is sex, how the genome is structured, why animals interact socially, and why there is communication. It is as indispensable to researchers in animal behavior as Newton's laws are to mechanical engineers.
But almost everyone misunderstands the theory. Contrary to popular belief, the gene-centered theory of evolution does not imply that the point of all human striving is to spread our genes. With the exception of {44} the fertility doctor who artificially inseminated patients with his own semen, the donors to the sperm bank for Nobel Prize winners, and other kooks, no human being (or animal) strives to spread his or her genes. Dawkins explained the theory in a book called The Selfish Gene, and the metaphor was chosen carefully. People don't selfishly spread their genes; genes selfishly spread themselves. They do it by the way they build our brains. By making us enjoy life, health, sex, friends, and children, the genes buy a lottery ticket for representation in the next generation, with odds that were favorable in the environment in which we evolved. Our goals are subgoals of the ultimate goal of the genes, replicating themselves. But the two are different. As far as we are concerned, our goals, conscious or unconscious, are not about genes at all, but about health and lovers and children and friends.
The confusion between our goals and our genes’ goals has spawned one muddle after another. A reviewer of a book about the evolution of sexuality protests that human adultery, unlike the animal equivalent, cannot be a strategy to spread the genes because adulterers take steps to prevent pregnancy. But whose strategy are we talking about? Sexual desire is not people's strategy to propagate their genes. It's people's strategy to attain the pleasures of sex, and the pleasures of sex are the genes’ strategy to propagate themselves. If the genes don't get propagated, it's because we are smarter than they are. A book on the emotional life of animals complains that if altruism according to biologists is just helping kin or exchanging favors, both of which serve the interests of one's genes, it would not really be altruism after all, but some kind of hypocrisy. This too is a mixup. Just as blueprints don't necessarily specify blue buildings, selfish genes don't necessarily specify selfish organisms. As we shall see, sometimes the most selfish thing a gene can do is to build a selfless brain. Genes are a play within a play, not the interior monologue of the players.
The evolutionary psychology of this book is a departure from the dominant view of the human mind in our intellectual tradition, which Tooby and Cosmides have dubbed the Standard Social Science Model (SSSM). The SSSM proposes a fundamental division between biology and culture. {45} Biology endows humans with the five senses, a few drives like hunger and fear, and a general capacity to learn. But biological evolution, according to the SSSM, has been superseded by cultural evolution. Culture is an autonomous entity that carries out a desire to perpetuate itself by setting up expectations and assigning roles, which can vary arbitrarily from society to society. Even the reformers of the SSSM have accepted its framing of the issues. Biology is “just as important as” culture, say the reformers; biology imposes “constraints” on behavior, and all behavior is a mixture of the two.
The SSSM not only has become an intellectual orthodoxy but has acquired a moral authority. When sociobiologists first began to challenge it, they met with a ferocity that is unusual even by the standards of academic invective. The biologist E. O. Wilson was doused with a pitcher of ice water at a scientific convention, and students yelled for his dismissal over bullhorns and put up posters urging people to bring noisemakers to his lectures. Angry manifestos and book-length denunciations were published by organizations with names like Science for the People and The Campaign Against Racism, IQ, and the Class Society. In Not in Our Genes, Richard Lewontin, Steven Rose, and Leon Kamin dropped innuendos about Donald Symons’ sex life and doctored a defensible passage of Richard Dawkins’ into an insane one. (Dawkins said of the genes, “They created us, body and mind”; the authors have quoted it repeatedly as “They control us, body and mind.”) When Scientific American ran an article on behavior genetics (studies of twins, families, and adoptees), they entitled it “Eugenics Revisited,” an allusion to the discredited movement to improve the human genetic stock. When the magazine covered evolutionary psychology, they called the article “The New Social Darwinists,” an allusion to the nineteenth-century movement that justified social inequality as part of the wisdom of nature. Even one of sociobiology's distinguished practitioners, the primatologist Sarah Blaffer Hrdy, said, “I question whether sociobiology should be taught at the high school level, or even the undergraduate level. . . . The whole message of sociobiology is oriented toward the success of the individual. It's Machiavellian, and unless a student has a moral framework already in place, we could be producing social monsters by teaching this. It really fits in very nicely with the yuppie ‘me first’ ethos.”
Entire scholarly societies joined in the fun, passing votes on empirical issues that one might have thought would be hashed out in the lab and the field. Margaret Mead's portrayal of an idyllic, egalitarian Samoa was {46} one of the founding documents of the SSSM, and when the anthropologist Derek Freeman showed that she got the facts spectacularly wrong, the American Anthropological Association voted at its business meeting to denounce his finding as unscientific. In 1986, twenty social scientists at a “Brain and Aggression” meeting drafted the Seville Statement on Violence, subsequently adopted by UNESCO and endorsed by several scientific organizations. The statement claimed to “challenge a number of alleged biological findings that have been used, even by some in our disciplines, to justify violence and war”:
It is scientifically incorrect to say that we have inherited a tendency to make war from our animal ancestors.
It is scientifically incorrect to say that war or any other violent behavior is genetically programmed into our human nature.
It is scientifically incorrect to say that in the course of human evolution there has been a selection for aggressive behavior more than for other kinds of behavior.
It is scientifically incorrect to say that humans have a “violent brain.”
It is scientifically incorrect to say that war is caused by “instinct” or any single motivation. . . . We conclude that biology does not condemn humanity to war, and that humanity can be freed from the bondage of biological pessimism and empowered with confidence to undertake the transformative tasks needed in the International Year of Peace and in the years to come.
What moral certainty could have incited these scholars to doctor quotations, censor ideas, attack the ideas’ proponents ad hominem, smear them with unwarranted associations to repugnant political movements, and mobilize powerful institutions to legislate what is correct and incorrect? The certainty comes from an opposition to three putative implications of an innate human nature.
First, if the mind has an innate structure, different people (or different classes, sexes, and races) could have different innate structures. That would justify discrimination and oppression.
Second, if obnoxious behavior like aggression, war, rape, clannishness, and the pursuit of status and wealth are innate, that would make them “natural” and hence good. And even if they are deemed objectionable, they are in the genes and cannot be changed, so attempts at social reform are futile.
Third, if behavior is caused by the genes, then individuals cannot be {47} held responsible for their actions. If the rapist is following a biological imperative to spread his genes, it's not his fault.
Aside perhaps from a few cynical defense lawyers and a lunatic fringe who are unlikely to read manifestos in the New York Review of Books, no one has actually drawn these mad conclusions. Rather, they are thought to be extrapolations that the untutored masses might draw, so the dangerous ideas must themselves be suppressed. In fact, the problem with the three arguments is not that the conclusions are so abhorrent that no one should be allowed near the top of the slippery slope that leads to them. The problem is that there is no such slope; the arguments are non sequiturs. To expose them, one need only examine the logic of the theories and separate the scientific from the moral issues.
My point is not that scientists should pursue the truth in their ivory tower, undistracted by moral and political thoughts. Every human act involving another living being is both the subject matter of psychology and the subject matter of moral philosophy, and both are important. But they are not the same thing. The debate over human nature has been muddied by an intellectual laziness, an unwillingness to make moral arguments when moral issues come up. Rather than reasoning from principles of rights and values, the tendency has been to buy an off-the-shelf moral package (generally New Left or Marxist) or to lobby for a feel-good picture of human nature that would spare us from having to argue moral issues at all.
The moral equation in most discussions of human nature is simple: innate equals right-wing equals bad. Now, many hereditarian movements have been right-wing and bad, such as eugenics, forced sterilization, genocide, discrimination along racial, ethnic, and sexual lines, and the justification of economic and social castes. The Standard Social Science Model, to its credit, has provided some of the grounds that thoughtful social critics have used to undermine these practices.
But the moral equation is wrong as often as it is right. Sometimes left-wing practices are just as bad, and the perpetrators have tried to justify them using the SSSM's denial of human nature. Stalin's purges, the Gulag, Pol Pot's killing fields, and almost fifty years of repression in China — all have been justified by the doctrine that dissenting ideas {48} reflect not the operation of rational minds that have come to different conclusions, but arbitrary cultural products that can be eradicated by re-engineering the society, “re-educating” those who were tainted by the old upbringing, and, if necessary, starting afresh with a new generation of slates that are still blank.
And sometimes left-wing positions are right because the denial of human nature is wrong. In Hearts and Minds, the 1974 documentary about the war in Vietnam, an American officer explains that we cannot apply our moral standards to the Vietnamese because their culture does not place a value on individual lives, so they do not suffer as we do when family members are killed. The director plays the quote over footage of wailing mourners at the funeral of a Vietnamese casualty, reminding us that the universality of love and grief refutes the officer's horrifying rationalization. For most of this century, guilty mothers have endured inane theories blaming them for every dysfunction or difference in their children (mixed messages cause schizophrenia, coldness causes autism, domineering causes homosexuality, lack of boundaries causes anorexia, insufficient “motherese” causes language disorders). Menstrual cramps, pregnancy sickness, and childbirth pain have been dismissed as women's “psychological” reactions to cultural expectations, rather than being treated as legitimate health issues.
The foundation of individual rights is the assumption that people have wants and needs and are authorities on what those wants and needs are. If people's stated desires were just some kind of erasable inscription or reprogrammable brainwashing, any atrocity could be justified. (Thus it is ironic that fashionable “liberation” ideologies like those of Michel Foucault and some academic feminists invoke a socially conditioned “interiorized authority,” “false consciousness,” or “inauthentic preference” to explain away the inconvenient fact that people enjoy the things that are alleged to oppress them.) A denial of human nature, no less than an emphasis on it, can be warped to serve harmful ends. We should expose whatever ends are harmful and whatever ideas are false, and not confuse the two.
So what about the three supposed implications of an innate human nature? The first “implication” — that an innate human nature implies innate human differences — is no implication at all. The mental machinery {49} I argue for is installed in every neurologically normal human being. The differences among people may have nothing to do with the design of that machinery. They could very well come from random variations in the assembly process or from different life histories. Even if the differences were innate, they could be quantitative variations and minor quirks in equipment present in all of us (how fast a module works, which module prevails in a competition inside the head) and are not necessarily any more pernicious than the kinds of innate differences allowed in the Standard Social Science Model (a faster general-purpose learning process, a stronger sex drive).
A universal structure to the mind is not only logically possible but likely to be true. Tooby and Cosmides point out a fundamental consequence of sexual reproduction: every generation, each person's blueprint is scrambled with someone else's. That means we must be qualitatively alike. If two people's genomes had designs for different kinds of machines, like an electric motor and a gasoline engine, the new pastiche would not specify a working machine at all. Natural selection is a homogenizing force within a species; it eliminates the vast majority of macroscopic design variants because they are not improvements. Natural selection does depend on there having been variation in the past, but it feeds off the variation and uses it up. That is why all normal people have the same physical organs, and why we all surely have the same mental organs as well. There are, to be sure, microscopic variations among people, mostly small differences in the molecule-by-molecule sequence of many of our proteins. But at the level of functioning organs, physical and mental, people work in the same ways. Differences among people, for all their endless fascination to us as we live our lives, are of minor interest when we ask how the mind works. The same is true for differences — whatever their source — between the averages of entire groups of people, such as races.
The sexes, of course, are a different matter. The male and female reproductive organs are a vivid reminder that qualitatively different designs are possible for the sexes, and we know that the differences come from the special gadget of a genetic “switch,” which triggers a line of biochemical dominoes that activate and deactivate families of genes throughout the brain and body. I will present evidence that some of these effects cause differences in how the mind works. In another of the ironies that run through the academic politics of human nature, this evolution-inspired research has proposed sex differences that are tightly focused on {50} reproduction and related domains, and are far less invidious than the differences proudly claimed by some schools of feminism. Among the claims of “difference feminists” are that women do not engage in abstract linear reasoning, that they do not treat ideas with skepticism or evaluate them through rigorous debate, that they do not argue from general moral principles, and other insults.
But ultimately we cannot just look at who is portrayed more flatteringly; the question is what to make of any group differences we do stumble upon. And here we must be prepared to make a moral argument. Discrimination against individuals on the basis of their race, sex, or ethnicity is wrong. The argument can be defended in various ways that have nothing to do with the average traits of the groups. One might argue that it is unfair to deny a social benefit to individuals because of factors they cannot control, or that a victim of discrimination experiences it as a uniquely painful sting, or that a group of victims is liable to react with rage, or that discrimination tends to escalate into horrors like slavery and genocide. (Those who favor affirmative action could acknowledge that reverse discrimination is wrong but argue that it undoes an even greater wrong.) None of these arguments is affected by anything any scientist will ever claim to discover. The final word on the political non-implications of group differences must go to Gloria Steinem: “There are really not many jobs that actually require a penis or a vagina, and all the other occupations should be open to everyone.”
The fallacy of the second supposed implication of a human nature (that if our ignoble motives are innate, they can't be so bad after all) is so obvious it has been given a name: the naturalistic fallacy, that what happens in nature is right. Forget the romantic nonsense in wildlife documentaries, where all creatures great and small act for the greater good and the harmony of the ecosystem. As Darwin said, “What a book a devil's chaplain might write on the clumsy, wasteful, blundering, low, and horribly cruel works of nature!” A classic example is the ichneumon wasp, who paralyzes a caterpillar and lays eggs in its body so her hatchlings can slowly devour its living flesh from the inside.
Like many species, Homo sapiens is a nasty business. Recorded history from the Bible to the present is a story of murder, rape, and war, and {51} honest ethnography shows that foraging peoples, like the rest of us, are more savage than noble. The !Kung San of the Kalahari Desert are often held out as a relatively peaceful people, and so they are, compared with other foragers: their murder rate is only as high as Detroit's. A linguist friend of mine who studies the Wari in the Amazon rainforest learned that their language has a term for edible things, which includes anyone who isn't a Wari. Of course humans don't have an “instinct for war” or a “violent brain,” as the Seville Statement assures us, but humans don't exactly have an instinct for peace or a nonviolent brain, either. We cannot attribute all of human history and ethnography to toy guns and superhero cartoons.
Does that mean that “biology condemns man to war” (or rape or murder or selfish yuppies) and that any optimism about reducing it should be snuffed out? No one needs a scientist to make the moral point that war is not healthy for children and other living things, or the empirical point that some places and periods are vastly more peaceable than others and that we should try to understand and duplicate what makes them so. And no one needs the bromides of the Seville Statement or its disinformation that war is unknown among animals and that their dominance hierarchies are a form of bonding and affiliation that benefits the group. What could not hurt is a realistic understanding of the psychology of human malevolence. For what it's worth, the theory of a module-packed mind allows both for innate motives that lead to evil acts and for innate motives that can avert them. Not that this is a unique discovery of evolutionary psychology; all the major religions observe that mental life is often a struggle between desire and conscience.
When it comes to the hopes of changing bad behavior, the conventional wisdom again needs to be inverted: a complex human nature may allow more scope for change than the blank slate of the Standard Social Science Model. A richly structured mind allows for complicated negotiations inside the head, and one module could subvert the ugly designs of another one. In the SSSM, in contrast, upbringing is often said to have an insidious and irreversible power. “Is it a boy or a girl?” is the first question we ask about a new human being, and from then on parents treat their sons and daughters differently: they touch, comfort, breast-feed, indulge, and talk to boys and girls in unequal amounts. Imagine that this behavior has long-term consequences on the children, which include all the documented sex differences and a tendency to treat their children differently from birth. Unless we stationed parenting police in the maternity {52} ward, the circle would be complete and irrevocable. Culture would condemn women to inferiority, and we would be enslaved to the bondage of cultural pessimism, disempowered by self-doubt from undertaking transformative tasks.
Nature does not dictate what we should accept or how we should live our lives. Some feminists and gay activists react with fury to the banal observations that natural selection designed women in part for growing and nursing children and that it designed both men and women for heterosexual sex. They see in those observations the sexist and homophobic message that only traditional sexual roles are “natural” and that alternative lifestyles are to be condemned. For example, the novelist Mary Gordon, mocking a historian's remark that what all women have in common is the ability to bear children, wrote, “If the defining quality of being a woman is the ability to bear children, then not bearing children (as, for instance, Florence Nightingale and Greta Garbo did not) is somehow a failure to fulfill your destiny.” I'm not sure what “the defining quality of being a woman” and “fulfilling your destiny” even mean, but I do know that happiness and virtue have nothing to do with what natural selection designed us to accomplish in the ancestral environment. They are for us to determine. In saying this I am no hypocrite, even though I am a conventional straight white male. Well into my procreating years I am, so far, voluntarily childless, having squandered my biological resources reading and writing, doing research, helping out friends and students, and jogging in circles, ignoring the solemn imperative to spread my genes. By Darwinian standards I am a horrible mistake, a pathetic loser, not one iota less than if I were a card-carrying member of Queer Nation. But I am happy to be that way, and if my genes don't like it, they can go jump in the lake.
Finally, what about blaming bad behavior on our genes? The neuroscientist Steven Rose, in a review of a book by E. O. Wilson in which Wilson wrote that men have a greater desire for polygamy than women, accused him of really saying, “Don't blame your mates for sleeping around, ladies, it's not their fault they are genetically programmed.” The title of Rose's own book with Lewontin and Kamin, Not in Our Genes, is an allusion to Julius Caesar: {53}
Men at some time are masters of their fates:
The fault, dear Brutus, lies not in our stars,
But in ourselves . . .
For Cassius, the programming that was thought to excuse human faults was not genetic but astrological, and that raises a key point. Any cause of behavior, not just the genes, raises the question of free will and responsibility. The difference between explaining behavior and excusing it is an ancient theme of moral reasoning, captured in the saw “To understand is not to forgive.”
In this scientific age, “to understand” means to try to explain behavior as a complex interaction among (1) the genes, (2) the anatomy of the brain, (3) its biochemical state, (4) the person's family upbringing, (5) the way society has treated him or her, and (6) the stimuli that impinge upon the person. Sure enough, every one of these factors, not just the stars or the genes, has been inappropriately invoked as the source of our faults and a claim that we are not masters of our fates.
(1) In 1993 researchers identified a gene that was associated with uncontrollable violent outbursts. (“Think of the implications,” one columnist wrote. “We may someday have a cure for hockey.”) Soon afterward came the inevitable headline: “Man's Genes Have Made Him Kill, His Lawyers Claim.”
(2) In 1982 an expert witness in the insanity defense of John Hinckley, who had shot President Reagan and three other men to impress the actress Jodie Foster, argued that a CAT scan of Hinckley's brain showed widened sulci and enlarged ventricles, a sign of schizophrenia and thus an excusing mental disease or defect. (The judge excluded the evidence, though the insanity defense prevailed.)
(3) In 1978 Dan White, having resigned from the San Francisco Board of Supervisors, walked into Mayor George Moscone's office and begged to be reinstated. When Moscone refused, White shot him dead, walked down the hall into the office of Supervisor Harvey Milk, and shot him dead too. White's lawyers successfully argued that at the time of his crime White had diminished capacity and had not committed a premeditated act because his binges on sugary junk food played havoc with his brain chemistry. White was convicted of voluntary manslaughter and served five years, thanks to the tactic that lives on in infamy as the Twinkie Defense. Similarly, in what is now known as the PMS (premenstrual {54} syndrome) Defense, raging hormones exonerated a surgeon who had assaulted a trooper who stopped her for drunk driving.
(4) In 1989 Lyle and Erik Menendez burst into their millionaire parents’ bedroom and killed them with a shotgun. After several months of showing off their new Porsches and Rolexes, they confessed to the shootings. Their lawyers argued the case to a hung jury by claiming self-defense, despite the fact that the victims had been lying in bed, unarmed, eating strawberries and ice cream. The Menendez boys, the lawyers said, had been traumatized into believing that their parents were going to kill them because they had been physically, sexually, and emotionally abused by the father for years. (In a new trial in 1996 they were convicted of murder and sent to prison for life.)
(5) In 1994 Colin Ferguson boarded a train and began to shoot white people at random, killing six. The radical lawyer William Kunstler was prepared to defend him by invoking the Black Rage Syndrome, in which an African American can suddenly burst under the accumulated pressure of living in a racist society. (Ferguson rejected the offer and argued his own case, unsuccessfully.)
(6) In 1992 a death-row inmate asked an appeals court to reduce his sentence for rape and murder because he had committed his crimes under the influence of pornography. The Pornography-Made-Me-Do-It Defense is an irony for the schools of feminism that argue that biological explanations of rape reduce the rapist's responsibility and that a good tactic to fight violence against women is to blame it on pornography.
As science advances and explanations of behavior become less fanciful, the Specter of Creeping Exculpation, as Dennett calls it, will loom larger. Without a clearer moral philosophy, any cause of behavior could be taken to undermine free will and hence moral responsibility. Science is guaranteed to appear to eat away at the will, regardless of what it finds, because the scientific mode of explanation cannot accommodate the mysterious notion of uncaused causation that underlies the will. If scientists wanted to show that people had free will, what would they look for? Some random neural event that the rest of the brain amplifies into a signal triggering behavior? But a random event does not fit the concept of free will any more than a lawful one does, and could not serve as the long-sought locus of moral responsibility. We would not find someone guilty if his finger pulled the trigger when it was mechanically connected to a roulette wheel; {55} why should it be any different if the roulette wheel is inside his skull? The same problem arises for another unpredictable cause that has been suggested as the source of free will, chaos theory, in which, according to the cliché, a butterfly's flutter can set off a cascade of events culminating in a hurricane. A fluttering in the brain that causes a hurricane of behavior, if it were ever found, would still be a cause of behavior and would not fit the concept of uncaused free will that underlies moral responsibility.
Either we dispense with all morality as an unscientific superstition, or we find a way to reconcile causation (genetic or otherwise) with responsibility and free will. I doubt that our puzzlement will ever be completely assuaged, but we can surely reconcile them in part. Like many philosophers, I believe that science and ethics are two self-contained systems played out among the same entities in the world, just as poker and bridge are different games played with the same fifty-two-card deck. The science game treats people as material objects, and its rules are the physical processes that cause behavior through natural selection and neurophysiology. The ethics game treats people as equivalent, sentient, rational, free-willed agents, and its rules are the calculus that assigns moral value to behavior through the behavior's inherent nature or its consequences.
Free will is an idealization of human beings that makes the ethics game playable. Euclidean geometry requires idealizations like infinite straight lines and perfect circles, and its deductions are sound and useful even though the world does not really have infinite straight lines or perfect circles. The world is close enough to the idealization that the theorems can usefully be applied. Similarly, ethical theory requires idealizations like free, sentient, rational, equivalent agents whose behavior is uncaused, and its conclusions can be sound and useful even though the world, as seen by science, does not really have uncaused events. As long as there is no outright coercion or gross malfunction of reasoning, the world is close enough to the idealization of free will that moral theory can meaningfully be applied to it.
Science and morality are separate spheres of reasoning. Only by recognizing them as separate can we have them both. If discrimination is wrong only if group averages are the same, if war and rape and greed are wrong only if people are never inclined toward them, if people are responsible for their actions only if the actions are mysterious, then either scientists must be prepared to fudge their data or all of us must be prepared to give up our values. Scientific arguments would turn into the {56} National Lampoon cover showing a puppy with a gun at its head and the caption “Buy This Magazine or We'll Shoot the Dog.”
The knife that separates causal explanations of behavior from moral responsibility for behavior cuts both ways. In the latest twist in the human-nature morality play, a chromosomal marker for homosexuality in some men, the so-called gay gene, was identified by the geneticist Dean Hamer. To the bemusement of Science for the People, this time it is the genetic explanation that is politically correct. Supposedly it refutes right-wingers like Dan Quayle, who had said that homosexuality “is more of a choice than a biological situation. It is a wrong choice.” The gay gene has been used to argue that homosexuality is not a choice for which gay people can be held responsible but an involuntary orientation they just can't help. But the reasoning is dangerous. The gay gene could just as easily be said to influence some people to choose homosexuality. And like all good science, Hamer's result might be falsified someday, and then where would we be? Conceding that bigotry against gay people is OK after all? The argument against persecuting gay people must be made not in terms of the gay gene or the gay brain but in terms of people's right to engage in private consensual acts without discrimination or harassment.
The cloistering of scientific and moral reasoning in separate arenas also lies behind my recurring metaphor of the mind as a machine, of people as robots. Does this not dehumanize and objectify people and lead us to treat them as inanimate objects? As one humanistic scholar lucidly put it in an Internet posting, does it not render human experience invalid, reifying a model of relating based on an I-It relationship, and delegitimating all other forms of discourse with fundamentally destructive consequences to society? Only if one is so literal-minded that one cannot shift among different stances in conceptualizing people for different purposes. A human being is simultaneously a machine and a sentient free agent, depending on the purpose of the discussion, just as he is also a taxpayer, an insurance salesman, a dental patient, and two hundred pounds of ballast on a commuter airplane, depending on the purpose of the discussion. The mechanistic stance allows us to understand what makes us tick and how we fit into the physical universe. When those discussions wind down for the day, we go back to talking about each other as free and dignified human beings. {57}
The confusion of scientific psychology with moral and political goals, and the resulting pressure to believe in a structureless mind, have rippled perniciously through the academy and modern intellectual discourse. Many of us have been puzzled by the takeover of humanities departments by the doctrines of postmodernism, poststructuralism, and deconstructionism, according to which objectivity is impossible, meaning is self-contradictory, and reality is socially constructed. The motives become clearer when we consider typical statements like “Human beings have constructed and used gender — human beings can deconstruct and stop using gender,” and “The heterosexual/homosexual binary is not in nature, but is socially constructed, and therefore deconstructable.” Reality is denied to categories, knowledge, and the world itself so that reality can be denied to stereotypes of gender, race, and sexual orientation. The doctrine is basically a convoluted way of getting to the conclusion that oppression of women, gays, and minorities is bad. And the dichotomy between “in nature” and “socially constructed” shows a poverty of the imagination, because it omits a third alternative: that some categories are products of a complex mind designed to mesh with what is in nature.
Mainstream social critics, too, can state any absurdity if it fits the Standard Social Science Model. Little boys are encouraged to argue and fight. Children learn to associate sweets with pleasure because parents use sweets as a reward for eating spinach. Teenagers compete in looks and dress because they follow the example set by spelling bees and award ceremonies. Men are socialized into believing that the goal of sex is an orgasm. Eighty-year-old women are considered less physically attractive than twenty-year-olds because our phallic culture has turned the young girl into the cult object of desire. It's not just that there is no evidence for these astonishing claims, but it is hard to credit that the authors, deep down, believe them themselves. These kinds of claims are uttered without concern for whether they are true; they are part of the secular catechism of our age.
Contemporary social commentary rests on archaic conceptions of the mind. Victims burst under the pressure, boys are conditioned to do this, women are brainwashed to value that, girls are taught to be such-and-such. Where do these explanations come from? From the nineteenth-century hydraulic model of Freud, the drooling dogs and key-pressing {58} vermin of behaviorism, the mind-control plots of bad cold-war movies, the wide-eyed, obedient children of Father Knows Best.
But when we look around us, we sense that these simplistic theories just don't ring true. Our mental life is a noisy parliament of competing factions. In dealing with others, we assume they are as complicated as we are, and we guess what they are guessing we are guessing they are guessing. Children defy their parents from the moment they are born, and confound all expectations thereafter: one overcomes horrific circumstances to lead a satisfying life, another is granted every comfort but grows up a rebel without a cause. A modern state loosens its grip, and its peoples enthusiastically take up the vendettas of their grandparents. And there are no robots.
I believe that a psychology of many computational faculties engineered by natural selection is our best hope for a grasp on how the mind works that does justice to its complexity. But I won't convince you with the opening brief in this chapter. The proof must come from insight into problems ranging from how Magic Eye stereograms work to what makes a landscape beautiful to why we find the thought of eating worms disgusting to why men kill their estranged wives. Whether or not you are persuaded by the arguments so far, I hope they have provoked your thoughts and made you curious about the explanations to come.
<< | {59} | >> |
Like many baby boomers, I was first exposed to problems in philosophy by traveling through another dimension, a dimension not only of sight and sound but of mind, taking a journey into a wondrous land whose boundaries are that of imagination. I am referring to The Twilight Zone, the campy television series by Rod Serling that was popular during my childhood. Philosophers often try to clarify difficult concepts using thought experiments, outlandish hypothetical situations that help us explore the implications of our ideas. The Twilight Zone actually staged them for the camera.
One of the first episodes was called “The Lonely.” James Corry is serving a fifty-year sentence in solitary confinement on a barren asteroid nine million miles from Earth. Allenby, the captain of a supply ship that services the asteroid, takes pity on him and leaves a crate containing “Alicia,” a robot that looks and acts like a woman. At first Corry is repulsed, but of course he soon falls deeply in love. A year later Allenby returns with the news that Corry has been pardoned and he has come to get him. Unfortunately Corry can take only fifteen pounds of gear, and Alicia weighs more than that. When Corry refuses to leave, Allenby reluctantly pulls out a gun and shoots Alicia in the face, exposing a tangle of smoking wires. He tells Corry, “All you're leaving behind is loneliness.” Corry, devastated, mutters, “I must remember that. I must remember to keep that in mind.”
I still remember my horror at the climax, and the episode was much discussed in my pre-teen critics’ circle. (Why didn't he just take her head? asked one commentator.) Our pathos came both from sympathy with Corry {60} for his loss and from the sense that a sentient being had been snuffed out. Of course the directors had manipulated the audience by casting a beautiful actress rather than a heap of tin cans to play Alicia. But in evoking our sympathies they raised two vexing questions. Could a mechanical device ever duplicate human intelligence, the ultimate test being whether it could cause a real human to fall in love with it? And if a humanlike machine could be built, would it actually be conscious — would dismantling it be the act of murder we felt we had witnessed on the small screen?
The two deepest questions about the mind are “What makes intelligence possible?” and “What makes consciousness possible?” With the advent of cognitive science, intelligence has become intelligible. It may not be too outrageous to say that at a very abstract level of analysis the problem has been solved. But consciousness or sentience, the raw sensation of toothaches and redness and saltiness and middle C, is still a riddle wrapped in a mystery inside an enigma. When asked what consciousness is, we have no better answer than Louis Armstrong's when a reporter asked him what jazz is: “Lady, if you have to ask, you'll never know.” But even consciousness is not as thoroughgoing a mystery as it used to be. Parts of the mystery have been pried off and turned into ordinary scientific problems. In this chapter I will first explore what intelligence is, how a physical being like a robot or a brain could achieve it, and how our brains do achieve it. Then I will turn to what we do and do not understand about consciousness.
The Search for Intelligent Life in the Universe is the title of a stage act by the comedian Lily Tomlin, an exploration of human follies and foibles. Tomlin's title plays on the two meanings of “intelligence”: aptitude (as in the famous tongue-in-cheek definition of intelligence as “whatever IQ tests measure”), and rational, humanlike thought. The second meaning is the one I am writing about here.
We may have trouble defining intelligence, but we recognize it when we see it. Perhaps a thought experiment can clarify the concept. Suppose there was an alien being who in every way looked different from us. What would it have to do to make us think it was intelligent? Science-fiction writers, of course, face this problem as part of their job; what better {61} authority could there be on the answer? The author David Alexander Smith gave as good a characterization of intelligence as I have seen when asked by an interviewer, “What makes a good alien?”
One, they have to have intelligent but impenetrable responses to situations. You have to be able to observe the alien's behavior and say, “I don't understand the rules by which the alien is making its decisions, but the alien is acting rationally by some set of rules.” . . . The second requirement is that they have to care about something. They have to want something and pursue it in the face of obstacles.
To make decisions “rationally,” by some set of rules, means to base the decisions on some grounds of truth: correspondence to reality or soundness of inference. An alien who bumped into trees or walked off cliffs, or who went through all the motions of chopping a tree but in fact was hacking at a rock or at empty space, would not seem intelligent. Nor would an alien who saw three predators enter a cave and two leave and then entered the cave as if it were empty.
These rules must be used in service of the second criterion, wanting and pursuing something in the face of obstacles. If we had no fix on what a creature wanted, we could not be impressed when it did something to attain it. For all we know, the creature may have wanted to bump into a tree or bang an ax against a rock, and was brilliantly accomplishing what it wanted. In fact, without a specification of a creature's goals, the very idea of intelligence is meaningless. A toadstool could be given a genius award for accomplishing, with pinpoint precision and unerring reliability, the feat of sitting exactly where it is sitting. Nothing would prevent us from agreeing with the cognitive scientist Zenon Pylyshyn that rocks are smarter than cats because rocks have the sense to go away when you kick them.
Finally, the creature has to use the rational rules to attain the goal in different ways, depending on the obstacles to be overcome. As William James explained:
Romeo wants Juliet as the filings want the magnet; and if no obstacles intervene he moves toward her by as straight a line as they. But Romeo and Juliet, if a wall be built between them, do not remain idiotically pressing their faces against the opposite sides like the magnet and filings with the card. Romeo soon finds a circuitous way, by scaling the wall or otherwise, of touching Juliet's lips directly. With the filings the path is {62} fixed; whether it reaches the end depends on accidents. With the lover it is the end which is fixed; the path may be modified indefinitely.
Intelligence, then, is the ability to attain goals in the face of obstacles by means of decisions based on rational (truth-obeying) rules. The computer scientists Allen Newell and Herbert Simon fleshed this idea out further by noting that intelligence consists of specifying a goal, assessing the current situation to see how it differs from the goal, and applying a set of operations that reduce the difference. Perhaps reassuringly, by this definition human beings, not just aliens, are intelligent. We have desires, and we pursue them using beliefs, which, when all goes well, are at least approximately or probabilistically true.
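The goal-difference-operation loop can be made concrete with a toy sketch. It is only an illustration under stated assumptions, not Newell and Simon's own program: the world is a small hypothetical grid, the "difference" is grid distance to the goal, the four operators and the names `difference` and `pursue` are invented for the example, and the search strategy is a plain greedy best-first search swapped in as a simple stand-in for their method.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]  # hypothetical world: (x, y) squares on a small grid

# Hypothetical operators: each one changes the current situation by one step.
OPERATORS: Dict[str, Cell] = {
    "east": (1, 0),
    "west": (-1, 0),
    "north": (0, 1),
    "south": (0, -1),
}


def difference(cell: Cell, goal: Cell) -> int:
    """Assess how the current situation differs from the goal (grid distance)."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])


def pursue(start: Cell, goal: Cell, walls: Set[Cell], size: int = 5) -> Optional[List[str]]:
    """Return operator names leading from start to goal, or None if unreachable.

    Greedy best-first search: always expand the situation that differs least
    from the goal. The end is fixed; the path bends around whatever is in the way.
    """
    frontier: List[Tuple[int, Cell, List[str]]] = [(difference(start, goal), start, [])]
    seen: Set[Cell] = {start}
    while frontier:
        _, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        for name, (dx, dy) in OPERATORS.items():
            nxt = (cell[0] + dx, cell[1] + dy)
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in walls and nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (difference(nxt, goal), nxt, path + [name]))
    return None


if __name__ == "__main__":
    print(pursue((0, 0), (4, 0), walls=set()))             # nothing intervenes
    print(pursue((0, 0), (4, 0), walls={(2, 0), (2, 1)}))  # a wall is built between them
```

With no wall the sketch heads straight for the goal, as the filings head for the magnet; with a wall in the way it finds a detour rather than pressing against the obstacle, and it reports failure only when the goal is sealed off entirely.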
An explanation of intelligence in terms of beliefs and desires is by no means a foregone conclusion. The old theory of stimulus and response from the school of behaviorism held that beliefs and desires have nothing to do with behavior — indeed, that they are as unscientific as banshees and black magic. Humans and animals emit a response to a stimulus either because it was earlier paired with a reflexive trigger for that response (for example, salivating to a bell that was paired with food) or because the response was rewarded in the presence of that stimulus (for example, pressing a bar that delivers a food pellet). As the famous behaviorist B. F. Skinner said, “The question is not whether machines think, but whether men do.”
Of course, men and women do think; the stimulus-response theory turned out to be wrong. Why did Sally run out of the building? Because she believed it was on fire and did not want to die. Her fleeing was not a predictable response to some stimulus that can be objectively described in the language of physics and chemistry. Perhaps she left when she saw smoke, but perhaps she left in response to a phone call telling her that the building was on fire, or to the sight of arriving fire trucks, or to the sound of a fire alarm. But none of these stimuli would necessarily have sent her out, either. She would not have left if she knew that the smoke was from an English muffin in a toaster, or that the phone call was from a friend practicing lines for a play, or that someone had pulled the alarm switch by accident or as a prank, or that the alarms were being tested by an electrician. The light and sound and particles that physicists can measure do not lawfully predict a person's behavior. What does predict Sally's behavior, and predict it well, is whether she believes herself to be in danger. Sally's beliefs are, of course, related to the stimuli impinging on her, but only in a tortuous, circuitous way, {63} mediated by all the rest of her beliefs about where she is and how the world works. And Sally's behavior depends just as much on whether she wants to escape the danger — if she were a volunteer firefighter, or suicidal, or a zealot who wanted to immolate herself to draw attention to a cause, or had children in the day-care center upstairs, you can bet she would not have fled.
Skinner himself did not pigheadedly insist that measurable stimuli like wavelengths and shapes predicted behavior. Instead, he defined stimuli by his own intuitions. He was perfectly happy calling “danger” — like “praise,” “English,” and “beauty” — a kind of stimulus. That had the advantage of keeping his theory in line with reality, but it was the advantage of theft over honest toil. We understand what it means for a device to respond to a red light or a loud noise — we can even build one that does — but humans are the only devices in the universe that respond to danger, praise, English, and beauty. The ability of a human to respond to something as physically nebulous as praise is part of the puzzle we are trying to solve, not part of the solution to the puzzle. Praise, danger, English, and all the other things we respond to, no less than beauty, are in the eye of the beholder, and the eye of the beholder is what we want to explain. The chasm between what can be measured by a physicist and what can cause behavior is the reason we must credit people with beliefs and desires.
In our daily lives we all predict and explain other people's behavior from what we think they know and what we think they want. Beliefs and desires are the explanatory tools of our own intuitive psychology, and intuitive psychology is still the most useful and complete science of behavior there is. To predict the vast majority of human acts — going to the refrigerator, getting on the bus, reaching into one's wallet — you don't need to crank through a mathematical model, run a computer simulation of a neural network, or hire a professional psychologist; you can just ask your grandmother.
It's not that common sense should have any more authority in psychology than it does in physics or astronomy. But this part of common sense has so much power and precision in predicting, controlling, and explaining everyday behavior, compared to any alternative ever entertained, that the odds are high that it will be incorporated in some form into our best scientific theories. I call an old friend on the other coast and we agree to meet in Chicago at the entrance of a bar in a certain hotel on a particular day two months hence at 7:45 P.M. I predict, he predicts, and everyone who knows us predicts that on that day at that time we will meet up. And we do meet up. That is amazing! In what other domain could laypeople — {64} or scientists, for that matter — predict, months in advance, the trajectories of two objects thousands of miles apart to an accuracy of inches and minutes? And do it from information that can be conveyed in a few seconds of conversation? The calculus behind this forecasting is intuitive psychology: the knowledge that I want to meet my friend and vice versa, and that each of us believes the other will be at a certain place at a certain time and knows a sequence of rides, hikes, and flights that will take us there. No science of mind or brain is ever likely to do better. That does not mean that the intuitive psychology of beliefs and desires is itself a science, but it suggests that scientific psychology will have to explain how a hunk of matter, such as a human being, can have beliefs and desires and how the beliefs and desires work so well.
The traditional explanation of intelligence is that human flesh is suffused with a non-material entity, the soul, usually envisioned as some kind of ghost or spirit. But the theory faces an insurmountable problem: How does the spook interact with solid matter? How does an ethereal nothing respond to flashes, pokes, and beeps and get arms and legs to move? Another problem is the overwhelming evidence that the mind is the activity of the brain. The supposedly immaterial soul, we now know, can be bisected with a knife, altered by chemicals, started or stopped by electricity, and extinguished by a sharp blow or by insufficient oxygen. Under a microscope, the brain has a breathtaking complexity of physical structure fully commensurate with the richness of the mind.
Another explanation is that mind comes from some extraordinary form of matter. Pinocchio was animated by a magical kind of wood found by Geppetto that talked, laughed, and moved on its own. Alas, no one has ever discovered such a wonder substance. At first one might think that the wonder substance is brain tissue. Darwin wrote that the brain “secretes” the mind, and recently the philosopher John Searle has argued that the physico-chemical properties of brain tissue somehow produce the mind just as breast tissue produces milk and plant tissue produces sugar. But recall that the same kinds of membranes, pores, and chemicals are found in brain tissue throughout the animal kingdom, not to mention in brain tumors and cultures in dishes. All of these globs of neural tissue have the same physico-chemical properties, but not all of {65} them accomplish humanlike intelligence. Of course, something about the tissue in the human brain is necessary for our intelligence, but the physical properties are not sufficient, just as the physical properties of bricks are not sufficient to explain architecture and the physical properties of oxide particles are not sufficient to explain music. Something in the patterning of neural tissue is crucial.
Intelligence has often been attributed to some kind of energy flow or force field. Orbs, luminous vapors, auras, vibrations, magnetic fields, and lines of force figure prominently in spiritualism, pseudoscience, and science-fiction kitsch. The school of Gestalt psychology tried to explain visual illusions in terms of electromagnetic force fields on the surface of the brain, but the fields were never found. Occasionally the brain surface has been described as a continuous vibrating medium that supports holograms or other wave interference patterns, but that idea, too, has not panned out. The hydraulic model, with its psychic pressure building up, bursting out, or being diverted through alternative channels, lay at the center of Freud's theory and can be found in dozens of everyday metaphors: anger welling up, letting off steam, exploding under the pressure, blowing one's stack, venting one's feelings, bottling up rage. But even the hottest emotions do not literally correspond to a buildup and discharge of energy (in the physicist's sense) somewhere in the brain. In Chapter 6 I will try to persuade you that the brain does not actually operate by internal pressures but contrives them as a negotiating tactic, like a terrorist with explosives strapped to his body.
A problem with all these ideas is that even if we did discover some gel or vortex or vibration or orb that spoke and plotted mischief like Geppetto's log, or that, more generally, made decisions based on rational rules and pursued a goal in the face of obstacles, we would still be faced with the mystery of how it accomplished those feats.
No, intelligence does not come from a special kind of spirit or matter or energy but from a different commodity, information. Information is a correlation between two things that is produced by a lawful process (as opposed to coming about by sheer chance). We say that the rings in a stump carry information about the age of the tree because their number correlates with the tree's age (the older the tree, the more rings it has), and the correlation is not a coincidence but is caused by the way trees grow. Correlation is a mathematical and logical concept; it is not defined in terms of the stuff that the correlated entities are made of.
Information itself is nothing special; it is found wherever causes leave {66} effects. What is special is information processing. We can regard a piece of matter that carries information about some state of affairs as a symbol; it can “stand for” that state of affairs. But as a piece of matter, it can do other things as well: physical things, whatever that kind of matter in that kind of state can do according to the laws of physics and chemistry. Tree rings carry information about age, but they also reflect light and absorb staining material. Footprints carry information about animal motions, but they also trap water and cause eddies in the wind.
Now here is an idea. Suppose one were to build a machine with parts that are affected by the physical properties of some symbol. Some lever or electric eye or tripwire or magnet is set in motion by the pigment absorbed by a tree ring, or the water trapped by a footprint, or the light reflected by a chalk mark, or the magnetic charge in a bit of oxide. And suppose that the machine then causes something to happen in some other pile of matter. It burns new marks onto a piece of wood, or stamps impressions into nearby dirt, or charges some other bit of oxide. Nothing special has happened so far; all I have described is a chain of physical events accomplished by a pointless contraption.
Here is the special step. Imagine that we now try to interpret the newly arranged piece of matter using the scheme according to which the original piece carried information. Say we count the newly burned wood rings and interpret them as the age of some tree at some time, even though they were not caused by the growth of any tree. And let's say that the machine was carefully designed so that the interpretation of its new markings made sense, that is, so that they carried information about something in the world. For example, imagine a machine that scans the rings in a stump, burns one mark on a nearby plank for each ring, moves over to a smaller stump from a tree that was cut down at the same time, scans its rings, and sands off one mark in the plank for each ring. When we count the marks on the plank, we have the age of the first tree at the time that the second one was planted. We would have a kind of rational machine, a machine that produces true conclusions from true premises, not because of any special kind of matter or energy, or because of any part that was itself intelligent or rational. All we have is a carefully contrived chain of ordinary physical events, whose first link was a configuration of matter that carries information. Our rational machine owes its rationality to two properties glued together in the entity we call a symbol: a symbol carries information, and it causes things to happen. (Tree rings correlate with the age of the tree, and they can absorb the light beam of a scanner.) {67} When the caused things themselves carry information, we call the whole system an information processor, or a computer.
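The stump-and-plank machine can be sketched in a few lines of Python. This is only an illustration: the function name and the ring counts are made up, and the sketch assumes, as the text does, that the second stump has no more rings than the first.

```python
def ring_machine(first_stump_rings, second_stump_rings):
    """Burn one plank mark per ring of the first stump, then sand one off per ring of the second."""
    plank = []                                # the plank starts out blank
    for _ in range(first_stump_rings):
        plank.append("|")                     # burn a mark
    for _ in range(second_stump_rings):
        plank.pop()                           # sand a mark off
    return len(plank)                         # what an observer would count


# Hypothetical ring counts: a 40-ring stump and a smaller 15-ring stump cut at the same time.
print(ring_machine(40, 15))  # 25
```

Nothing in the loop "knows" about trees or ages. The count of 25 is the age of the first tree when the second was planted only because the marks were arranged to preserve that correlation.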
Now, this whole scheme might seem like an unrealizable hope. What guarantee is there that any collection of thingamabobs can be arranged to fall or swing or shine in just the right pattern so that when their effects are interpreted, the interpretation will make sense? (More precisely, so that it will make sense according to some prior law or relationship we find interesting; any heap of stuff can be given a contrived interpretation after the fact.) How confident can we be that some machine will make marks that actually correspond to some meaningful state of the world, like the age of a tree when another tree was planted, or the average age of the tree's offspring, or anything else, as opposed to being a meaningless pattern corresponding to nothing at all?
The guarantee comes from the work of the mathematician Alan Turing. He designed a hypothetical machine whose input symbols and output symbols could correspond, depending on the details of the machine, to any one of a vast number of sensible interpretations. The machine consists of a tape divided into squares, a read-write head that can print or read a symbol on a square and move the tape in either direction, a pointer that can point to a fixed number of tickmarks on the machine, and a set of mechanical reflexes. Each reflex is triggered by the symbol being read and the current position of the pointer, and it prints a symbol on the tape, moves the tape, and/or shifts the pointer. The machine is allowed as much tape as it needs. This design is called a Turing machine.
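To make the parts concrete, here is a minimal Python sketch of such a machine. The names (`run_turing_machine`, `UNARY_ADD`, the pointer positions `scan`, `to_end`, `erase`, `halt`) are assumptions of the sketch, and for convenience it moves the head over the tape rather than the tape under the head. The example reflexes add two numbers written as tally marks.

```python
def run_turing_machine(reflexes, tape_text, pointer="scan", blank="_", max_steps=10_000):
    """Run the reflexes until the pointer reaches 'halt'; return the non-blank tape contents."""
    tape = dict(enumerate(tape_text))         # square number -> symbol; grows as needed
    head = 0
    for _ in range(max_steps):
        if pointer == "halt":
            break
        symbol = tape.get(head, blank)        # read the current square
        write, move, pointer = reflexes[(pointer, symbol)]
        tape[head] = write                    # print a symbol
        head += move                          # shift one square left (-1), right (+1), or stay (0)
    if not tape:
        return ""
    squares = (tape.get(i, blank) for i in range(min(tape), max(tape) + 1))
    return "".join(squares).strip(blank)


# Reflexes: (pointer position, symbol read) -> (symbol to print, move, new pointer position)
UNARY_ADD = {
    ("scan", "1"): ("1", +1, "scan"),         # skip over the first number
    ("scan", "+"): ("1", +1, "to_end"),       # overwrite the plus sign with a tally
    ("to_end", "1"): ("1", +1, "to_end"),     # skip over the second number
    ("to_end", "_"): ("_", -1, "erase"),      # ran off the end; step back
    ("erase", "1"): ("_", 0, "halt"),         # remove the one surplus tally
}

print(run_turing_machine(UNARY_ADD, "111+11"))  # 11111
```

The table of reflexes is the whole "program": swapping in a different table gives a different machine without touching the loop that runs it.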
What can this simple machine do? It can take in symbols standing for a number or a set of numbers, and print out symbols standing for new numbers that are the corresponding value for any mathematical function that can be solved by a step-by-step sequence of operations (addition, multiplication, exponentiation, factoring, and so on — I am being imprecise to convey the importance of Turing's discovery without the technicalities). It can apply the rules of any useful logical system to derive true statements from other true statements. It can apply the rules of any grammar to derive well-formed sentences. The equivalence among Turing machines, calculable mathematical functions, logics, and grammars, led the logician Alonzo Church to conjecture that any well-defined recipe or set of steps that is guaranteed to produce the solution to some problem in a finite amount of time (that is, any algorithm) can be implemented on a Turing machine.
What does this mean? It means that to the extent that the world {68} obeys mathematical equations that can be solved step by step, a machine can be built that simulates the world and makes predictions about it. To the extent that rational thought corresponds to the rules of logic, a machine can be built that carries out rational thought. To the extent that a language can be captured by a set of grammatical rules, a machine can be built that produces grammatical sentences. To the extent that thought consists of applying any set of well-specified rules, a machine can be built that, in some sense, thinks.
Turing showed that rational machines — machines that use the physical properties of symbols to crank out new symbols that make some kind of sense — are buildable, indeed, easily buildable. The computer scientist Joseph Weizenbaum once showed how to build one out of a die, some rocks, and a roll of toilet paper. In fact, one doesn't even need a huge warehouse of these machines, one to do sums, another to do square roots, a third to print English sentences, and so on. One kind of Turing machine is called a universal Turing machine. It can take in a description of any other Turing machine printed on its tape and thereafter mimic that machine exactly. A single machine can be programmed to do anything that any set of rules can do.
Does this mean that the human brain is a Turing machine? Certainly not. There are no Turing machines in use anywhere, let alone in our heads. They are useless in practice: too clumsy, too hard to program, too big, and too slow. But it does not matter. Turing merely wanted to prove that some arrangement of gadgets could function as an intelligent symbol-processor. Not long after his discovery, more practical symbol-processors were designed, some of which became IBM and Univac mainframes and, later, Macintoshes and PCs. But all of them were equivalent to Turing's universal machine. If we ignore size and speed, and give them as much memory storage as they need, we can program them to produce the same outputs in response to the same inputs.
Still other kinds of symbol-processors have been proposed as models of the human mind. These models are often simulated on commercial computers, but that is just a convenience. The commercial computer is first programmed to emulate the hypothetical mental computer (creating what computer scientists call a virtual machine), in much the same way that a Macintosh can be programmed to emulate a PC. Only the virtual mental computer is taken seriously, not the silicon chips that emulate it. Then a program that is meant to model some sort of thinking (solving a problem, understanding a sentence) is run on the virtual mental {69} computer. A new way of understanding human intelligence has been born.
Let me show you how one of these models works. In an age when real computers are so sophisticated that they are almost as incomprehensible to laypeople as minds are, it is enlightening to see an example of computation in slow motion. Only then can one appreciate how simple devices can be wired together to make a symbol-processor that shows real intelligence. A lurching Turing machine is a poor advertisement for the theory that the mind is a computer, so I will use a model with at least a vague claim to resembling our mental computer. I'll show you how it solves a problem from everyday life — kinship relations — that is complex enough that we can be impressed when a machine solves it.
The model we'll use is called a production system. It eliminates the feature of commercial computers that is most starkly unbiological: the ordered list of programming steps that the computer follows single-mindedly, one after another. A production system contains a memory and a set of reflexes, sometimes called “demons” because they are simple, self-contained entities that sit around waiting to spring into action. The memory is like a bulletin board on which notices are posted. Each demon is a knee-jerk reflex that waits for a particular notice on the board and responds by posting a notice of its own. The demons collectively constitute a program. As they are triggered by notices on the memory board and post notices of their own, in turn triggering other demons, and so on, the information in memory changes and eventually contains the correct output for a given input. Some demons are connected to sense organs and are triggered by information in the world rather than information in memory. Others are connected to appendages and respond by moving the appendages rather than by posting more messages in memory.
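The bulletin-board idea can be sketched in Python as follows. Everything here is hypothetical: the function name, the two toy demons, and the notices they watch for are invented for illustration, and the "appendage" demon merely posts a notice that stands in for a movement.

```python
def run_production_system(board, demons, max_cycles=100):
    """Let every demon inspect the board repeatedly until no demon posts anything new."""
    for _ in range(max_cycles):
        posted_something = False
        for demon in demons:
            for notice in demon(board):       # each demon returns the notices it wants posted
                if notice not in board:
                    board.add(notice)
                    posted_something = True
        if not posted_something:              # the board has settled
            break
    return board


def flee_demon(board):
    # Waits for a "danger" notice; its notice stands in for moving an appendage.
    return {"move legs"} if "danger" in board else set()


def danger_demon(board):
    # Waits for a notice posted by a (hypothetical) sense organ.
    return {"danger"} if "smoke seen" in board else set()


# The demons are deliberately listed "out of order": there is no ordered list of steps.
board = run_production_system({"smoke seen"}, [flee_demon, danger_demon])
print(sorted(board))  # ['danger', 'move legs', 'smoke seen']
```

The order in which the demons are listed does not matter. Each one fires whenever its triggering notice happens to be on the board, which is the contrast with a step-by-step program that the text draws.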
Suppose your long-term memory contains knowledge of the immediate families of you and everyone around you. The content of that knowledge is a set of propositions like “Alex is the father of Andrew.” According to the computational theory of mind, that information is embodied in symbols: a collection of physical marks that correlate with the state of the world as it is captured in the propositions.
These symbols cannot be English words and sentences, notwithstanding {70} the popular misconception that we think in our mother tongue. As I showed in The Language Instinct, sentences in a spoken language like English or Japanese are designed for vocal communication between impatient, intelligent social beings. They achieve brevity by leaving out any information that the listener can mentally fill in from the context. In contrast, the “language of thought” in which knowledge is couched can leave nothing to the imagination, because it is the imagination. Another problem with using English as the medium of knowledge is that English sentences can be ambiguous. When the serial killer Ted Bundy wins a stay of execution and the headline reads “Bundy Beats Date with Chair,” we do a double-take because our mind assigns two meanings to the string of words. If one string of words in English can correspond to two meanings in the mind, meanings in the mind cannot be strings of words in English. Finally, sentences in a spoken language are cluttered with articles, prepositions, gender suffixes, and other grammatical boilerplate. They are needed to help get information from one head to another by way of the mouth and the ear, a slow channel, but they are not needed inside a single head where information can be transmitted directly by thick bundles of neurons. So the statements in a knowledge system are not sentences in English but rather inscriptions in a richer language of thought, “mentalese.”
In our example, the portion of mentalese that captures family relations comes in two kinds of statements. An example of the first is Alex father-of Andrew: a name, followed by an immediate family relationship, followed by a name. An example of the second is Alex is-male: a name followed by its sex. Do not be misled by my use of English words and syntax in the mentalese inscriptions. This is a courtesy to you, the reader, to help you keep track of what the symbols stand for. As far as the machine is concerned, they are simply different arrangements of marks. As long as we use each one consistently to stand for someone (so the symbol used for Alex is always used for Alex and never for anyone else), and arrange them according to a consistent plan (so they preserve information about who is the father of whom), they could be any marks in any arrangement at all. You can think of the marks as bar codes recognized by a scanner, or keyholes that admit only one key, or shapes that fit only one template. Of course, in a commercial computer they would be patterns of charges in silicon, and in a brain they would be firings in sets of neurons. The key point is that nothing in the machine understands them the way you or I do; parts of the machine respond to their shapes and are {71} triggered to do something, exactly as a gumball machine responds to the shape and weight of a coin by releasing a gumball.
The example to come is an attempt to demystify computation, to get you to see how the trick is done. To hammer home my explanation of the trick (that symbols both stand for some concept and mechanically cause things to happen), I will step through the activity of our production system and describe everything twice: conceptually, in terms of the content of the problem and the logic that solves it, and mechanically, in terms of the brute sensing and marking motions of the system. The system is intelligent because the two correspond exactly, idea-for-mark, logical-step-for-motion.
Let's call the portion of the system's memory that holds inscriptions about family relationships the Long-Term Memory. Let's identify another part as the Short-Term Memory, a scratchpad for the calculations. A part of the Short-Term Memory is an area for goals; it contains a list of questions that the system will “try” to answer. The system wants to know whether Gordie is its biological uncle. To begin with, the memory looks like this:
Long-Term Memory | Short-Term Memory | Goal
Abel parent-of Me | | Gordie uncle-of Me?
Abel is-male | |
Bella parent-of Me | |
Bella is-female | |
Claudia sibling-of Me | |
Claudia is-female | |
Duddie sibling-of Me | |
Duddie is-male | |
Edgar sibling-of Abel | |
Edgar is-male | |
Fanny sibling-of Abel | |
Fanny is-female | |
Gordie sibling-of Bella | |
Gordie is-male | |
Conceptually speaking, our goal is to find the answer to a question; the answer is affirmative if the fact it asks about is true. Mechanically speaking, the system must determine whether a string of marks in the Goal column followed by a question mark (?) has a counterpart with an identical string of marks somewhere in memory. One of the demons is designed to {72} answer these look-up questions by scanning for identical marks in the Goal and Long-Term Memory columns. When it detects a match, it prints a mark next to the question which indicates that it has been answered affirmatively. For convenience, let's say the mark looks like this: Yes.
IF: Goal = blah-blah-blah?
    Long-Term Memory = blah-blah-blah
THEN: MARK GOAL Yes
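As a rough Python rendering of this look-up demon (the function name and the use of plain strings for inscriptions are assumptions of the sketch), the demon does nothing more than compare strings of marks:

```python
def lookup_demon(goals, long_term_memory):
    """Mark a question 'Yes' when the same marks, minus the '?', appear in Long-Term Memory."""
    marked = []
    for goal in goals:
        if goal.endswith("?") and goal[:-1] in long_term_memory:
            marked.append(goal + " Yes")      # MARK GOAL Yes
        else:
            marked.append(goal)               # no identical inscription found; leave it alone
    return marked


print(lookup_demon(["Gordie uncle-of Me?"], {"Gordie is-male"}))
# ['Gordie uncle-of Me?']
print(lookup_demon(["Gordie uncle-of Me?"], {"Gordie is-male", "Gordie uncle-of Me"}))
# ['Gordie uncle-of Me? Yes']
```

The first call leaves the goal unmarked because no identical inscription exists yet, which is exactly the predicament the system starts in.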
The conceptual challenge faced by the system is that it does not explicitly know who is whose uncle; that knowledge is implicit in the other things it knows. To say the same thing mechanically: there is no uncle-of mark in the Long-Term Memory; there are only marks like sibling-of and parent-of. Conceptually speaking, we need to deduce knowledge of unclehood from knowledge of parenthood and knowledge of siblinghood. Mechanically speaking, we need a demon to print an uncle-of inscription flanked by appropriate marks found in sibling-of and parent-of inscriptions. Conceptually speaking, we need to find out who our parents are, identify their siblings, and then pick the males. Mechanically speaking, we need the following demon, which prints new inscriptions in the Goal area that trigger the appropriate memory searches:
IF: Goal = Q uncle-of P
THEN: ADD GOAL
    Find P's Parents
    Find Parents’ Siblings
    Distinguish Uncles/Aunts
This demon is triggered by an uncle-of inscription in the Goal column. The Goal column indeed has one, so the demon goes to work and adds some new marks to the column:
Long-Term Memory | Short-Term Memory | Goal
Abel parent-of Me | | Gordie uncle-of Me?
Abel is-male | | Find Me's Parents
Bella parent-of Me | | Find Parents’ Siblings
Bella is-female | | Distinguish Uncles/Aunts
Claudia sibling-of Me | |
Claudia is-female | |
Duddie sibling-of Me | |
Duddie is-male | |
Edgar sibling-of Abel | |
Edgar is-male | |
Fanny sibling-of Abel | |
Fanny is-female | |
Gordie sibling-of Bella | |
Gordie is-male | |
... | |
There must also be a device — some other demon, or extra machinery inside this demon — that minds its Ps and Qs. That is, it replaces the P label with a list of the actual labels for names: Me, Abel, Gordie, and so on. I'm hiding these details to keep things simple.
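One way to picture the hidden machinery is as a small matcher that lines up the demon's pattern against an actual inscription and records what each label stands for. The sketch below is an assumption-laden illustration: the names `bind` and `uncle_goal_demon` are invented, and treating single capital letters as labels is a convention of the sketch, not something the text specifies.

```python
def bind(pattern, inscription):
    """Match e.g. 'Q uncle-of P' against 'Gordie uncle-of Me?'; return label bindings or None."""
    pattern_marks = pattern.split()
    actual_marks = inscription.rstrip("?").split()
    if len(pattern_marks) != len(actual_marks):
        return None
    bindings = {}
    for p, a in zip(pattern_marks, actual_marks):
        if len(p) == 1 and p.isupper():       # a label such as P or Q (assumed convention)
            if bindings.setdefault(p, a) != a:
                return None                   # same label cannot stand for two names
        elif p != a:
            return None                       # a fixed mark such as uncle-of must match exactly
    return bindings


def uncle_goal_demon(goals):
    """IF Goal = Q uncle-of P THEN add the three subgoals, with P filled in."""
    new_goals = list(goals)
    for goal in goals:
        found = bind("Q uncle-of P", goal)
        if found is not None:
            for subgoal in (f"Find {found['P']}'s Parents",
                            "Find Parents' Siblings",
                            "Distinguish Uncles/Aunts"):
                if subgoal not in new_goals:
                    new_goals.append(subgoal)
    return new_goals


print(bind("Q uncle-of P", "Gordie uncle-of Me?"))  # {'Q': 'Gordie', 'P': 'Me'}
print(uncle_goal_demon(["Gordie uncle-of Me?"]))
```

The second call yields the four entries shown in the Goal column above: the original question followed by the three new subgoals.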
The new Goal inscriptions prod other dormant demons into action. One of them (conceptually speaking) looks up the system's parents, by (mechanically speaking) copying all the inscriptions containing the names of the parents into Short-Term Memory (unless the inscriptions are already there, of course; this proviso prevents the demon from mindlessly making copy after copy like the Sorcerer's Apprentice):
IF: Goal = Find P's Parents
    Long-Term Memory = X parent-of P
    Short-Term Memory ≠ X parent-of P
THEN: COPY TO Short-Term Memory
    X parent-of P
    ERASE GOAL
Our bulletin board now looks like this:
Long-Term Memory | Short-Term Memory | Goal
Abel parent-of Me | Abel parent-of Me | Gordie uncle-of Me?
Abel is-male | Bella parent-of Me | Find Parents’ Siblings
Bella parent-of Me | | Distinguish Uncles/Aunts
Bella is-female | |
Claudia sibling-of Me | |
Claudia is-female | |
Duddie sibling-of Me | |
Duddie is-male | |
Edgar sibling-of Abel | |
Edgar is-male | |
Fanny sibling-of Abel | |
Fanny is-female | |
Gordie sibling-of Bella | |
Gordie is-male | |
... | |
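The parent-finding demon, including its proviso against duplicate copies, might be sketched in Python like this. The function name, the list-of-strings representation, and the `person` parameter are assumptions made for illustration.

```python
LONG_TERM = [
    "Abel parent-of Me", "Abel is-male", "Bella parent-of Me", "Bella is-female",
    "Claudia sibling-of Me", "Claudia is-female", "Duddie sibling-of Me", "Duddie is-male",
    "Edgar sibling-of Abel", "Edgar is-male", "Fanny sibling-of Abel", "Fanny is-female",
    "Gordie sibling-of Bella", "Gordie is-male",
]


def find_parents_demon(goals, long_term, short_term, person="Me"):
    goal = f"Find {person}'s Parents"
    if goal not in goals:
        return                                # the demon stays dormant
    for inscription in long_term:
        marks = inscription.split()
        is_parent_fact = len(marks) == 3 and marks[1] == "parent-of" and marks[2] == person
        if is_parent_fact and inscription not in short_term:   # the Sorcerer's Apprentice proviso
            short_term.append(inscription)    # COPY TO Short-Term Memory
    goals.remove(goal)                        # ERASE GOAL


goals = ["Gordie uncle-of Me?", "Find Me's Parents",
         "Find Parents' Siblings", "Distinguish Uncles/Aunts"]
short_term = []
find_parents_demon(goals, LONG_TERM, short_term)
print(short_term)  # ['Abel parent-of Me', 'Bella parent-of Me']
print(goals)       # the 'Find Me's Parents' goal has been erased
```

Even if the goal were posted a second time, the `not in short_term` check would keep the demon from copying the same two inscriptions again.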
Now that we know the parents, we can find the parents’ siblings. Mechanically speaking: now that the names of the parents are written in Short-Term Memory, a demon can spring into action that copies inscriptions about the parents’ siblings:
IF: Goal = Find Parents’ Siblings
    Short-Term Memory = X parent-of Y
    Long-Term Memory = Z sibling-of X
    Short-Term Memory ≠ Z sibling-of X
THEN: COPY TO SHORT-TERM MEMORY
    Z sibling-of X
    ERASE GOAL
Here is its handiwork:
Long-Term Memory | Short-Term Memory | Goal
Abel parent-of Me | Abel parent-of Me | Gordie uncle-of Me?
Abel is-male | Bella parent-of Me | Distinguish Uncles/Aunts
Bella parent-of Me | Edgar sibling-of Abel |
Bella is-female | Fanny sibling-of Abel |
Claudia sibling-of Me | Gordie sibling-of Bella |
Claudia is-female | |
Duddie sibling-of Me | |
Duddie is-male | |
Edgar sibling-of Abel | |
Edgar is-male | |
Fanny sibling-of Abel | |
Fanny is-female | |
Gordie sibling-of Bella | |
Gordie is-male | |
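The sibling-finding demon joins two boards: the name in front of a parent-of mark in Short-Term Memory must reappear after a sibling-of mark in Long-Term Memory. A hedged Python sketch, with an invented function name and inscriptions held as plain strings:

```python
LONG_TERM = [
    "Abel parent-of Me", "Abel is-male", "Bella parent-of Me", "Bella is-female",
    "Claudia sibling-of Me", "Claudia is-female", "Duddie sibling-of Me", "Duddie is-male",
    "Edgar sibling-of Abel", "Edgar is-male", "Fanny sibling-of Abel", "Fanny is-female",
    "Gordie sibling-of Bella", "Gordie is-male",
]


def find_parents_siblings_demon(goals, long_term, short_term):
    goal = "Find Parents' Siblings"
    if goal not in goals:
        return
    # X in every "X parent-of Y" inscription currently in Short-Term Memory
    parents = [m[0] for m in map(str.split, short_term)
               if len(m) == 3 and m[1] == "parent-of"]
    for inscription in long_term:
        marks = inscription.split()
        is_sibling_of_parent = (len(marks) == 3 and marks[1] == "sibling-of"
                                and marks[2] in parents)
        if is_sibling_of_parent and inscription not in short_term:
            short_term.append(inscription)    # COPY TO Short-Term Memory
    goals.remove(goal)                        # ERASE GOAL


goals = ["Gordie uncle-of Me?", "Find Parents' Siblings", "Distinguish Uncles/Aunts"]
short_term = ["Abel parent-of Me", "Bella parent-of Me"]
find_parents_siblings_demon(goals, LONG_TERM, short_term)
print(short_term[2:])
# ['Edgar sibling-of Abel', 'Fanny sibling-of Abel', 'Gordie sibling-of Bella']
```

Claudia and Duddie are passed over: their sibling-of inscriptions end in Me, and Me never appears in front of a parent-of mark in Short-Term Memory.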
As it stands, we are considering the aunts and uncles collectively. To separate the uncles from the aunts, we need to find the males. Mechanically speaking, the system needs to see which inscriptions have counterparts in Long-Term Memory with is-male marks next to them. Here is the demon that does the checking:
IF: Goal = Distinguish Uncles/Aunts
    Short-Term Memory = X parent-of Y
    Long-Term Memory = Z sibling-of X
    Long-Term Memory = Z is-male
THEN: STORE IN LONG-TERM MEMORY
    Z uncle-of Y
    ERASE GOAL
This is the demon that most directly embodies the system's knowledge of the meaning of “uncle”: a male sibling of a parent. It adds the unclehood inscription to Long-Term Memory, not Short-Term Memory, because the inscription represents a piece of knowledge that is permanently true:
Long-Term Memory | Short-Term Memory | Goal
Edgar uncle-of Me | Abel parent-of Me | Gordie uncle-of Me?
Gordie uncle-of Me | Bella parent-of Me |
Abel parent-of Me | Edgar sibling-of Abel |
Abel is-male | Fanny sibling-of Abel |
Bella parent-of Me | Gordie sibling-of Bella |
Bella is-female | |
Claudia sibling-of Me | |
Claudia is-female | |
Duddie sibling-of Me | |
Duddie is-male | |
Edgar sibling-of Abel | |
Edgar is-male | |
Fanny sibling-of Abel | |
Fanny is-female | |
Gordie sibling-of Bella | |
Gordie is-male | |
... | |
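The uncle-distinguishing demon can be sketched the same way. As before, the function name and string representation are assumptions, and the sketch appends the new inscriptions to the end of Long-Term Memory rather than the top.

```python
def distinguish_uncles_demon(goals, long_term, short_term):
    goal = "Distinguish Uncles/Aunts"
    if goal not in goals:
        return
    # (X, Y) for every "X parent-of Y" in Short-Term Memory
    parent_pairs = [(m[0], m[2]) for m in map(str.split, short_term)
                    if len(m) == 3 and m[1] == "parent-of"]
    new_inscriptions = []
    for x, y in parent_pairs:
        for inscription in long_term:
            marks = inscription.split()
            if len(marks) == 3 and marks[1] == "sibling-of" and marks[2] == x:
                z = marks[0]
                uncle = f"{z} uncle-of {y}"
                if f"{z} is-male" in long_term and uncle not in long_term + new_inscriptions:
                    new_inscriptions.append(uncle)
    long_term.extend(new_inscriptions)        # STORE IN LONG-TERM MEMORY
    goals.remove(goal)                        # ERASE GOAL


long_term = [
    "Abel parent-of Me", "Abel is-male", "Bella parent-of Me", "Bella is-female",
    "Claudia sibling-of Me", "Claudia is-female", "Duddie sibling-of Me", "Duddie is-male",
    "Edgar sibling-of Abel", "Edgar is-male", "Fanny sibling-of Abel", "Fanny is-female",
    "Gordie sibling-of Bella", "Gordie is-male",
]
short_term = ["Abel parent-of Me", "Bella parent-of Me", "Edgar sibling-of Abel",
              "Fanny sibling-of Abel", "Gordie sibling-of Bella"]
goals = ["Gordie uncle-of Me?", "Distinguish Uncles/Aunts"]
distinguish_uncles_demon(goals, long_term, short_term)
print(long_term[-2:])  # ['Edgar uncle-of Me', 'Gordie uncle-of Me']
```

Fanny produces no inscription because there is no `Fanny is-male` mark to trigger the demon. The word "uncle" is never defined anywhere except in the shape of this rule.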
Conceptually speaking, we have just deduced the fact that we inquired about. Mechanically speaking, we have just created mark-for-mark {76} identical inscriptions in the Goal column and the Long-Term Memory column. The very first demon I mentioned, which scans for such duplicates, is triggered to make the mark that indicates the problem has been solved:
Long-Term Memory | Short-Term Memory | Goal
Edgar uncle-of Me | Abel parent-of Me | Gordie uncle-of Me? Yes
Gordie uncle-of Me | Bella parent-of Me |
Abel parent-of Me | Edgar sibling-of Abel |
Abel is-male | Fanny sibling-of Abel |
Bella parent-of Me | Gordie sibling-of Bella |
Bella is-female | |
Claudia sibling-of Me | |
Claudia is-female | |
Duddie sibling-of Me | |
Duddie is-male | |
Edgar sibling-of Abel | |
Edgar is-male | |
Fanny sibling-of Abel | |
Fanny is-female | |
Gordie sibling-of Bella | |
Gordie is-male | |
... | |
What have we accomplished? We have built a system out of lifeless gumball-machine parts that did something vaguely mindlike: it deduced the truth of a statement that it had never entertained before. From ideas about particular parents and siblings and a knowledge of the meaning of unclehood, it manufactured true ideas about particular uncles. The trick, to repeat, came from the processing of symbols: arrangements of matter that have both representational and causal properties, that is, that simultaneously carry information about something and take part in a chain of physical events. Those events make up a computation, because the machinery was crafted so that if the interpretation of the symbols that trigger the machine is a true statement, then the interpretation of the symbols created by the machine is also a true statement. The computational theory of mind is the hypothesis that intelligence is computation in this sense.
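The truth-preserving property can itself be checked in a small Python sketch. The arrangement is an assumption made for illustration: one part of the code plays the world (who is really related to whom), the other plays the machine (strings of marks and nothing else), and all names are invented.

```python
# The world: facts about people, invisible to the machine (hypothetical ground truth).
PARENTS = {"Me": {"Abel", "Bella"}}
SIBLINGS = {"Abel": {"Edgar", "Fanny"}, "Bella": {"Gordie"}}
MALES = {"Abel", "Duddie", "Edgar", "Gordie"}


def true_uncles(person):
    """What 'uncle' means conceptually: a male sibling of a parent."""
    return {s for parent in PARENTS[person]
            for s in SIBLINGS.get(parent, set()) if s in MALES}


# The machine: nothing but marks whose interpretation happens to be true.
MARKS = {
    "Abel parent-of Me", "Abel is-male", "Bella parent-of Me", "Bella is-female",
    "Edgar sibling-of Abel", "Edgar is-male", "Fanny sibling-of Abel", "Fanny is-female",
    "Gordie sibling-of Bella", "Gordie is-male",
}


def derive_uncle_marks(marks):
    """Mechanical step: build new strings from the shapes of old ones."""
    triples = [m.split() for m in marks if len(m.split()) == 3]
    derived = set()
    for x, rel, y in triples:
        if rel != "parent-of":
            continue
        for z, rel2, x2 in triples:
            if rel2 == "sibling-of" and x2 == x and f"{z} is-male" in marks:
                derived.add(f"{z} uncle-of {y}")
    return derived


derived = derive_uncle_marks(MARKS)
print(sorted(derived))  # ['Edgar uncle-of Me', 'Gordie uncle-of Me']

# Interpret each new inscription and compare it with the world.
for inscription in derived:
    uncle, _, person = inscription.split()
    assert uncle in true_uncles(person)
```

The final loop is the point: the machine never consults `PARENTS`, `SIBLINGS`, or `MALES`, yet every inscription it manufactures is true when interpreted, because its inputs were true and its motions were designed to preserve truth.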
“This sense” is broad, and it shuns some of the baggage found in {77} other definitions of computation. For example, we need not assume that the computation is made up of a sequence of discrete steps, that the symbols must be either completely present or completely absent (as opposed to being stronger or weaker, more active or less active), that a correct answer is guaranteed in a finite amount of time, or that the truth value be “absolutely true” or “absolutely false” rather than a probability or a degree of certainty. The computational theory thus embraces an alternative kind of computer with many elements that are active to a degree corresponding to the probability that some statement is true or false, and in which the activity levels change smoothly to register new and roughly accurate probabilities. (As we shall see, that may be the way the brain works.) The key idea is that the answer to the question “What makes a system smart?” is not the kind of stuff it is made of or the kind of energy flowing through it, but what the parts of the machine stand for and how the patterns of changes inside it are designed to mirror truth-preserving relationships (including probabilistic and fuzzy truths).
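A toy version of such a graded element, in Python, might look like the following. It is a loose sketch under stated assumptions: the function name, the update rule, the rate, and the probabilities are all invented, and the two input statements are treated as independent.

```python
def settle(p_sibling_of_parent, p_male, rate=0.2, steps=60):
    """Let an 'uncle' element drift smoothly toward the probability its inputs support."""
    target = p_sibling_of_parent * p_male     # assumes the two statements are independent
    activity = 0.0                            # the element starts out inactive
    history = []
    for _ in range(steps):
        activity += rate * (target - activity)   # move a fraction of the way toward the target
        history.append(activity)
    return history


# Hypothetical degrees of certainty: 0.9 that Gordie is a parent's sibling, 0.8 that he is male.
history = settle(0.9, 0.8)
print(round(history[0], 3), round(history[-1], 3))  # 0.144 0.72
```

There are no discrete yes-or-no inscriptions here, yet the element's final activity still mirrors a truth-preserving relationship, only a probabilistic one.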
Why should you buy the computational theory of mind? Because it has solved millennia-old problems in philosophy, kicked off the computer revolution, posed the significant questions of neuroscience, and provided psychology with a magnificently fruitful research agenda.
Generations of thinkers have banged their heads against the problem of how mind can interact with matter. As Jerry Fodor has put it, “Self-pity can make one weep, as can onions.” How can our intangible beliefs, desires, images, plans, and goals reflect the world around us and pull the levers by which we, in turn, shape the world? Descartes became the laughingstock of scientists centuries after him (unfairly) because he proposed that mind and matter were different kinds of stuff that somehow interacted in a part of the brain called the pineal gland. The philosopher Gilbert Ryle ridiculed the general idea by calling it the Doctrine of the Ghost in the Machine (a phrase that was later co-opted for book titles by the writer Arthur Koestler and the psychologist Stephen Kosslyn and for an album title by the rock group The Police). Ryle and other philosophers argued that mentalistic terms such as “beliefs,” “desires,” and {78} “images” are meaningless and come from sloppy misunderstandings of language, as if someone heard the expression “for Pete's sake” and went around looking for Pete. Simpatico behaviorist psychologists claimed that these invisible entities were as unscientific as the Tooth Fairy and tried to ban them from psychology.
And then along came computers: fairy-free, fully exorcised hunks of metal that could not be explained without the full lexicon of mentalistic taboo words. “Why isn't my computer printing?” “Because the program doesn't know you replaced your dot-matrix printer with a laser printer. It still thinks it is talking to the dot-matrix and is trying to print the document by asking the printer to acknowledge its message. But the printer doesn't understand the message; it's ignoring it because it expects its input to begin with ‘%!’ The program refuses to give up control while it polls the printer, so you have to get the attention of the monitor so that it can wrest control back from the program. Once the program learns what printer is connected to it, they can communicate.” The more complex the system and the more expert the users, the more their technical conversation sounds like the plot of a soap opera.
Behaviorist philosophers would insist that this is all just loose talk. The machines aren't really understanding or trying anything, they would say; the observers are just being careless in their choice of words and are in danger of being seduced into grave conceptual errors. Now, what is wrong with this picture? The philosophers are accusing the computer scientists of fuzzy thinking? A computer is the most legalistic, persnickety, hard-nosed, unforgiving demander of precision and explicitness in the universe. From the accusation you'd think it was the befuddled computer scientists who call a philosopher when their computer stops working rather than the other way around. A better explanation is that computation has finally demystified mentalistic terms. Beliefs are inscriptions in memory, desires are goal inscriptions, thinking is computation, perceptions are inscriptions triggered by sensors, trying is executing operations triggered by a goal.
(You are objecting that we humans feel something when we have a belief or a desire or a perception, and a mere inscription lacks the power to create such feelings. Fair enough. But try to separate the problem of explaining intelligence from the problem of explaining conscious feelings. So far I'm trying to explain intelligence; we'll get to consciousness later in the chapter.) {79}
The computational theory of mind also rehabilitates once and for all the infamous homunculus. A standard objection to the idea that thoughts are internal representations (an objection popular among scientists trying to show how tough-minded they are) is that a representation would require a little man in the head to look at it, and the little man would require an even littler man to look at the representations inside him, and so on, ad infinitum. But once more we have the spectacle of the theoretician insisting to the electrical engineer that if the engineer is correct his workstation must contain hordes of little elves. Talk of homunculi is indispensable in computer science. Data structures are read and interpreted and examined and recognized and revised all the time, and the subroutines that do so are unashamedly called “agents,” “demons,” “supervisors,” “monitors,” “interpreters,” and “executives.” Why doesn't all this homunculus talk lead to an infinite regress? Because an internal representation is not a lifelike photograph of the world, and the homunculus that “looks at it” is not a miniaturized copy of the entire system, requiring its entire intelligence. That indeed would have explained nothing. Instead, a representation is a set of symbols corresponding to aspects of the world, and each homunculus is required only to react in a few circumscribed ways to some of the symbols, a feat far simpler than what the system as a whole does. The intelligence of the system emerges from the activities of the not-so-intelligent mechanical demons inside it. The point, first made by Jerry Fodor in 1968, has been succinctly put by Daniel Dennett:
Homunculi are bogeymen only if they duplicate entire the talents they are rung in to explain. ... If one can get a team or committee of relatively ignorant, narrow-minded, blind homunculi to produce the intelligent behavior of the whole, this is progress. A flow chart is typically the organizational chart of a committee of homunculi (investigators, librarians, accountants, executives); each box specifies a homunculus by prescribing a function without saying how it is accomplished (one says, in effect: put a little man in there to do the job). If we then look closer at the individual boxes we see that the function of each is accomplished by subdividing it via another flow chart into still smaller, more stupid homunculi. Eventually this nesting of boxes within boxes lands you with homunculi so stupid (all they have to do is remember whether to say yes or no when asked) that they can be, as one says, “replaced by a machine.” One discharges fancy homunculi from one's scheme by organizing armies of idiots to do the work. {80}
You still might wonder how the marks being scribbled and erased by demons inside the computer are supposed to represent or stand for things in the world. Who decides that this mark in the system corresponds to that bit of the world? In the case of a computer, the answer is obvious: we get to decide what the symbols mean, because we built the machine. But who means the meaning of the symbols allegedly inside us? Philosophers call this the problem of “intentionality” (confusingly, because it has nothing to do with intentions). There are two common answers. One is that a symbol is connected to its referent in the world by our sense organs. Your mother's face reflects light, which stimulates your eye, which triggers a cascade of templates or similar circuits, which inscribe the symbol mother in your mind. The other answer is that the unique pattern of symbol manipulations triggered by the first symbol mirrors the unique pattern of relationships between the referent of the first symbol and the referents of the triggered symbols. Once we agree, for whatever reason, to say that mother means mother, uncle means uncle, and so on, the new interlocking kinship statements generated by the demons turn out to be uncannily true, time and again. The device prints Bella mother-of Me, and sure enough, Bella is my mother. Mother means “mother” because it plays a role in inferences about mothers.
These are called the “causal” and the “inferential-role” theories, and philosophers hostile to each have had fun thinking up preposterous thought experiments to refute them. Oedipus didn't want to marry his mother, but he did so anyway. Why? Because his mother triggered the symbol Jocasta in him rather than the symbol Mom, and his desire was couched as “If it's Mom, don't marry her.” The causal effects of Jocasta, the woman who really was Oedipus’ mother, were irrelevant; all that mattered was the inferential role that the symbols Jocasta and Mom played inside Oedipus’ head. A lightning bolt hits a dead tree in the middle of a swamp, and by an amazing coincidence the slime coalesces into a molecule-for-molecule replica of me at this moment, memories included. Swampman has never been in contact with my mother, but most people would say that his mother thoughts are about my mother, just as mine are. Again we conclude that causation by something in the world is not necessary for a symbol to be about something; its inferential role is enough.
But, but, but! Suppose the sequence of information-processing steps {81} in a chess-playing computer turns out, by a remarkable coincidence, to be identical to the battlefield events in the Six-Day War (King's knight = Moshe Dayan, Rook to c7 = Israeli army captures the Golan Heights, and so on). Would the program be “about” the Six-Day War every bit as much as it is “about” the chess game? Suppose that someday we discovered that cats are not animals after all, but lifelike robots controlled from Mars. Any inference rule that computed “If it's a cat, then it must be an animal” would be inoperative. The inferential role of our mental symbol cat would have changed almost beyond recognition. But surely the meaning of cat would be unchanged: you'd still be thinking “cat” when Felix the Robot slunk by. Score two points for the causal theory.
A third view is summarized by the television ad parody on Saturday Night Live: You're both right — it's a floor wax and a dessert topping. Together the causal and inferential roles of a symbol determine what it represents. (On this view, Swampman's thoughts would be about my mother because he has a future-oriented causal connection with her: he can recognize her when he meets her.) Causal and inferential roles tend to be in sync because natural selection designed both our perceptual systems and our inference modules to work accurately, most of the time, in this world. Not all philosophers agree that causation plus inference plus natural selection are enough to nail down a concept of “meaning” that would work perfectly in all worlds. (“Suppose Swampman has an identical twin on another planet . . .”) But if so, one might respond, so much the worse for that concept of meaning. Meaning might make sense only relative to a device that was designed (by engineers or by natural selection) to function in a particular kind of world. In other worlds — Mars, Swampland, the Twilight Zone — all bets are off. Whether or not the causal-plus-inferential theory is completely philosopher-proof, it takes the mystery out of how a symbol in a mind or a machine can mean something.
Another sign that the computational theory of mind is on the right track is the existence of artificial intelligence: computers that perform humanlike intellectual tasks. Any discount store can sell you a computer that surpasses a human's ability to calculate, store and retrieve facts, draft drawings, check spelling, route mail, and set type. A well-stocked software house can sell you programs that play excellent chess and that {82} recognize alphabetic characters and carefully pronounced speech. Clients with deeper pockets can buy programs that respond to questions in English about restricted topics, control robot arms that weld and spray-paint, and duplicate human expertise in hundreds of areas such as picking stocks, diagnosing diseases, prescribing drugs, and troubleshooting equipment breakdowns. In 1996 the computer Deep Blue defeated the world chess champion Garry Kasparov in one game and played him to a draw in two others before losing the match, and it is only a matter of time before a computer defeats a world champion outright. Though there are no Terminator-class robots, there are thousands of smaller-scale artificial intelligence programs in the world, including some hidden in your personal computer, car, and television set. And progress continues.
These low-key successes are worth pointing out because of the emotional debate over What Computers Will-Soon/Won't-Ever Do. One side says robots are just around the corner (showing that the mind is a computer); the other side says it will never happen (showing that it isn't). The debate seems to come right out of the pages of Christopher Cerf and Victor Navasky's The Experts Speak:
Well-informed people know it is impossible to transmit the voice over wires and that were it possible to do so, the thing would be of no practical value.
— Editorial, The Boston Post, 1865
Fifty years hence . . . [w]e shall escape the absurdity of growing a whole chicken in order to eat the breast or wing, by growing these parts separately under a suitable medium.
— Winston Churchill, 1932
Heavier-than-air flying machines are impossible.
— Lord Kelvin, pioneer in thermodynamics and electricity, 1895
[By 1965] the deluxe open-road car will probably be 20 feet long, powered by a gas turbine engine, little brother of the jet engine.
— Leo Cherne, editor-publisher of The Research Institute of America, 1955
Man will never reach the moon, regardless of all future scientific advances.
— Lee Deforest, inventor of the vacuum tube, 1957
Nuclear powered vacuum cleaners will probably be a reality within 10 years.
The one prediction coming out of futurology that is undoubtedly correct is that in the future today's futurologists will look silly. The ultimate attainments of artificial intelligence are unknown, and will depend on countless practical vicissitudes that will be discovered only as one goes along. What is indisputable is that computing machines can be intelligent.
Scientific understanding and technological achievement are only loosely connected. For some time we have understood much about the hip and the heart, but artificial hips are commonplace while artificial hearts are elusive. The pitfalls between theory and application must be kept in mind when we look to artificial intelligence for clues about computers and minds. The proper label for the study of the mind informed by computers is not Artificial Intelligence but Natural Computation.
The computational theory of mind has quietly entrenched itself in neuroscience, the study of the physiology of the brain and nervous system. No corner of the field is untouched by the idea that information processing is the fundamental activity of the brain. Information processing is what makes neuroscientists more interested in neurons than in glial cells, even though the glia take up more room in the brain. The axon (the long output fiber) of a neuron is designed, down to the molecule, to propagate information with high fidelity across long separations, and when its electrical signal is transduced to a chemical one at the synapse (the junction between neurons), the physical format of the information changes while the information itself remains the same. And as we shall see, the tree of dendrites (input fibers) on each neuron appears to perform the basic logical and statistical operations underlying computation. Information-theoretic terms such as “signals,” “codes,” “representations,” “transformations,” and “processing” suffuse the language of neuroscience.
Information processing even defines the legitimate questions of the field. The retinal image is upside down, so how do we manage to see the world right-side up? If the visual cortex is in the back of the brain, why doesn't it feel like we are seeing in the back of our heads? How is it possible that an amputee can feel a phantom limb in the space where his real limb used to be? How can our experience of a green cube arise from {84} neurons that are neither colored green nor in the shape of a cube? Every neuroscientist knows that these are pseudo-questions, but why? Because they are about properties of the brain that make no difference to the transmission and processing of information.
If a scientific theory is only as good as the facts it explains and the discoveries it inspires, the biggest selling point for the computational theory of mind is its impact on psychology. Skinner and other behaviorists insisted that all talk about mental events was sterile speculation; only stimulus-response connections could be studied in the lab and the field. Exactly the opposite turned out to be true. Before computational ideas were imported in the 1950s and 1960s by Newell and Simon and the psychologists George Miller and Donald Broadbent, psychology was dull, dull, dull. The psychology curriculum comprised physiological psychology, which meant reflexes, and perception, which meant beeps, and learning, which meant rats, and memory, which meant nonsense syllables, and intelligence, which meant IQ, and personality, which meant personality tests. Since then psychology has brought the questions of history's deepest thinkers into the laboratory and has made thousands of discoveries, on every aspect of the mind, that could not have been dreamed of a few decades ago.
The blossoming came from a central agenda for psychology set by the computational theory: discovering the form of mental representations (the symbol inscriptions used by the mind) and the processes (the demons) that access them. Plato said that we are trapped inside a cave and know the world only through the shadows it casts on the wall. The skull is our cave, and mental representations are the shadows. The information in an internal representation is all that we can know about the world. Consider, as an analogy, how external representations work. My bank statement lists each deposit as a single sum. If I deposited several checks and some cash, I cannot verify whether a particular check was among them; that information was obliterated in the representation. What's more, the form of a representation determines what can easily be inferred from it, because the symbols and their arrangement are the only things a homunculus stupid enough to be replaced by a machine can respond to. Our representation of numbers is valuable because addition {85} can be performed on the numbers with a few dronelike operations: looking up entries in the addition table and carrying digits. Roman numerals have not survived, except as labels or decorations, because addition operations are far more complicated with them, and multiplication and division operations are practically impossible.
Pinning down mental representations is the route to rigor in psychology. Many explanations of behavior have an airy-fairy feel to them because they explain psychological phenomena in terms of other, equally mysterious psychological phenomena. Why do people have more trouble with this task than with that one? Because the first one is “more difficult.” Why do people generalize a fact about one object to another object? Because the objects are “similar.” Why do people notice this event but not that one? Because the first event is “more salient.” These explanations are scams. Difficulty, similarity, and salience are in the mind of the beholder, which is what we should be trying to explain. A computer finds it more difficult to remember the gist of Little Red Riding Hood than to remember a twenty-digit number; you find it more difficult to remember the number than the gist. You find two crumpled balls of newspaper to be similar, even though their shapes are completely different, and find two people's faces to be different, though their shapes are almost the same. Migrating birds that navigate at night by the stars in the sky find the positions of the constellations at different times of night quite salient; to a typical person, they are barely noticeable.
But if we hop down to the level of representations, we find a firmer sort of entity, which can be rigorously counted and matched. If a theory of psychology is any good, it should predict that the representations required by the “difficult” task contain more symbols (count ‘em) or trigger a longer chain of demons than those of the “easy” task. It should predict that the representations of two “similar” things have more shared symbols and fewer nonshared symbols than the representations of “dissimilar” things. The “salient” entities should have different representations from their neighbors; the “nonsalient” entities should have the same ones.
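The idea that similarity can be counted rather than merely felt can be made concrete with a toy sketch. The code below assumes, purely for illustration, that a representation is a flat set of symbols; the function name and every symbol in the sets are hypothetical and are not drawn from any actual psychological theory.

```python
def shared_and_nonshared(a, b):
    """Return (number of shared symbols, number of nonshared symbols)."""
    return len(a & b), len(a ^ b)

# Hypothetical symbol sets. The two newspaper balls differ only in their
# exact outline; the brick shares almost nothing with either of them.
ball_1 = {"paper", "newsprint", "crumpled", "fist-sized", "outline-A"}
ball_2 = {"paper", "newsprint", "crumpled", "fist-sized", "outline-B"}
brick = {"clay", "rigid", "rectangular", "fist-sized", "outline-C"}

print(shared_and_nonshared(ball_1, ball_2))  # (4, 2)
print(shared_and_nonshared(ball_1, brick))   # (1, 8)
```

On this sketch the two balls count as "similar" because most of their symbols match, even though the symbols for their outlines do not. A real theory would have to say which symbols the mind actually uses; the counting itself is the easy part.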
Research in cognitive psychology has tried to triangulate on the mind's internal representations by measuring people's reports, reaction times, and errors as they remember, solve problems, recognize objects, and generalize from experience. The way people generalize is perhaps the most telltale sign that the mind uses mental representations, and lots of them. {86}
Suppose it takes a while for you to learn to read a fancy new typeface, festooned with curlicues. You have practiced with some words and are now as quick as you are for any other typeface. Now you see a familiar word that was not in your practice set — say, elk. Do you have to relearn that the word is a noun? Do you have to relearn how to pronounce it? Relearn that the referent is an animal? What the referent looks like? That it has mass and breathes and suckles its young? Surely not. But this banal talent of yours tells a story. Your knowledge about the word elk could not have been connected directly to the physical shapes of printed letters. If it had, then when new letters were introduced, your knowledge would have no connection to them and would be unavailable until you learned the connections anew. In reality, your knowledge must have been connected to a node, a number, an address in memory, or an entry in a mental dictionary representing the abstract word elk, and that entry must be neutral with respect to how it is printed or pronounced. When you learned the new typeface, you created a new visual trigger for the letters of the alphabet, which in turn triggered the old elk entry, and everything hooked up to the entry was instantly available, without your having to reconnect, piece by piece, everything you know about elks to the new way of printing elk. This is how we know that your mind contains mental representations specific to abstract entries for words, not just the shapes of the words when they are printed.
These leaps, and the inventory of internal representations they hint at, are the hallmark of human cognition. If you learned that wapiti was another name for an elk, you could take all the facts connected to the word elk and instantly transfer them to wapiti, without having to solder new connections to the word one at a time. Of course, only your zoological knowledge would transfer; you would not expect wapiti to be pronounced like elk. That suggests you have a level of representation specific to the concepts behind the words, not just the words themselves. Your knowledge of facts about elks hangs off the concept; the words elk and wapiti also hang off the concept; and the spelling elk and pronunciation [elk] hang off the word elk.
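The layering just described, with facts hanging off a concept and spelling and pronunciation hanging off a word, can be sketched as two small lookup tables. This is only an illustration under simplifying assumptions: the dictionary layout, the function names, and the pronunciation strings are all hypothetical placeholders.

```python
# Concept level: facts hang off the concept, not off any particular word.
concepts = {
    "ELK": {"kind": "animal", "has_mass": True, "suckles_young": True},
}

# Word level: each word points to a concept and carries its own sound.
words = {
    "elk": {"concept": "ELK", "part_of_speech": "noun", "pronunciation": "elk"},
}

def facts_about(word):
    """Follow the word's pointer to its concept and return the facts there."""
    return concepts[words[word]["concept"]]

def learn_synonym(new_word, pronunciation, existing_word):
    """Add a word that shares the concept of a word already known."""
    old = words[existing_word]
    words[new_word] = {
        "concept": old["concept"],
        "part_of_speech": old["part_of_speech"],
        "pronunciation": pronunciation,  # placeholder string, not real phonetics
    }

learn_synonym("wapiti", "wapiti-sound", "elk")
print(facts_about("wapiti") is facts_about("elk"))  # True
```

One new pointer is enough to make every zoological fact available under the new name, while the pronunciation stays attached to the individual word. No fact had to be copied or reconnected.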
We have moved upward from the typeface; now let's move downward. If you had learned the typeface as black ink on white paper, you wouldn't have to relearn it for white ink on red paper. This unmasks a representation for visual edges. Any color abutting any other color is seen as an edge; edges define strokes; an arrangement of strokes makes up an alphanumeric character. {87}
The various mental representations connected with a concept like an elk can be shown in a single diagram, sometimes called a semantic network, knowledge representation, or propositional database.
This is a fragment of the immense multimedia dictionary, encyclopedia, and how-to manual we keep in our heads. We find these layers upon layers of representations everywhere we look in the mind. Say I asked you to print the word elk in any typeface you wanted, but with your left hand (if you are a righty), or by writing it in the sand with your toe, or by tracing it with a penlight held in your teeth. The printing would be messy but recognizable. You might have to practice to get the motions to be smoother, but you would not have to relearn the strokes composing each letter, let alone the alphabet or the spelling of every English word. This transfer of skill must tap into a level of representation for motor control that specifies a geometric trajectory, not the muscle contractions or limb movements that accomplish it. The trajectory would be translated into actual motions by lower-level control programs for each appendage.
Or recall Sally escaping from the burning building earlier in this chapter. Her desire must have been couched as the abstract representation flee-from-danger. It could not have been couched as run-from-smoke, because the desire could have been triggered by signs other than smoke (and sometimes smoke would not trigger it), and her flight could {88} have been accomplished by many kinds of action, not just running. Yet her behavioral response was put together for the first time there and then. Sally must be modular: one part of her assesses danger, another decides whether to flee, yet another figures out how to flee.
The combinatorics of mentalese, and of other representations composed of parts, explain the inexhaustible repertoire of human thought and action. A few elements and a few rules that combine them can generate an unfathomably vast number of different representations, because the number of possible representations grows exponentially with their size. Language is an obvious example. Say you have ten choices for the word to begin a sentence, ten choices for the second word (yielding a hundred two-word beginnings), ten choices for the third word (yielding a thousand three-word beginnings), and so on. (Ten is in fact the approximate geometric mean of the number of word choices available at each point in assembling a grammatical and sensible sentence.) A little arithmetic shows that the number of sentences of twenty words or less (not an unusual length) is about 10^20: a one followed by twenty zeros, or a hundred million trillion, or a hundred times the number of seconds since the birth of the universe. I bring up the example to impress you not with the vastness of language but with the vastness of thought. Language, after all, is not scat-singing: every sentence expresses a distinct idea. (There are no truly synonymous sentences.) So in addition to whatever ineffable thoughts people might have, they can entertain something like a hundred million trillion different effable thoughts.
The combinatorial immensity of thinkable structures is found in many spheres of human activity. The young John Stuart Mill was alarmed to discover that the finite number of musical notes, together with the maximum practical length of a musical piece, meant that the world would soon run out of melodies. At the time he sank into this melancholy, Brahms, Tchaikovsky, Rachmaninoff, and Stravinsky had not yet been born, to say nothing of the entire genres of ragtime, jazz, Broadway musicals, electric blues, country and western, rock and roll, samba, reggae, and punk. We are unlikely to have a melody shortage anytime soon because music is combinatorial: if each note of a melody can be selected from, say, eight notes on average, there are 64 pairs of notes, 512 motifs of three notes, 4,096 phrases of four notes, and so on, multiplying out to trillions and trillions of musical pieces. {89}
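The arithmetic behind the sentence and melody counts is easy to check. The sketch below assumes, as the text does, a fixed number of choices at every position; the function name is made up for the illustration.

```python
def count_sequences(choices_per_position, max_length):
    """Count all sequences of length 1 through max_length."""
    return sum(choices_per_position ** n for n in range(1, max_length + 1))

# Sentences of twenty words or less, with ten choices for each word.
sentences = count_sequences(10, 20)
print(sentences)            # 111111111111111111110
print(f"{sentences:.2e}")   # 1.11e+20

# Melodic fragments with eight choices for each note.
for length in (2, 3, 4):
    print(length, 8 ** length)   # 2 64, then 3 512, then 4 4096
```

The total is dominated by the longest sequences: the twenty-word sentences alone account for 10^20 of the sum, which is why "about 10^20" is a fair round figure.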
Our everyday ease in generalizing our knowledge is one class of evidence that we have several kinds of data representations inside our heads. Mental representations also reveal themselves in the psychology laboratory. With clever techniques, psychologists can catch a mind in the act of flipping from representation to representation. A nice demonstration comes from the psychologist Michael Posner and colleagues. Volunteers sit in front of a video screen and see pairs of letters flashed briefly: A A, for example. They are asked to press one button if the letters are the same, another button if they are different (say, A b). Sometimes the matching letters are both uppercase or both lowercase (A A or a a); that is, they are physically identical. Sometimes one is uppercase and one is lowercase (A a or a A); they are the same letter of the alphabet, but physically different. When the letters are physically identical, people press the buttons more quickly and accurately than when they are physically different, presumably because the people are processing the letters as visual forms and can simply match them by their geometry, template-style. When one letter is A and the other letter is a, people have to convert them into a format in which they are equivalent, namely “the letter a”; this conversion adds about a tenth of a second to the reaction time. But if one letter is flashed and the other follows seconds later, it doesn't matter whether they were physically identical or not; A-then-A is as slow as A-then-a. Quick template-matching is no longer possible. Apparently after a few seconds the mind automatically converts a visual representation into an alphabetic one, discarding the information about its geometry.
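The two routes to a "same" judgment can be caricatured in a few lines. This is a loose sketch, not a model of the experiment: it assumes that comparing the characters directly stands in for template matching and that lowercasing stands in for conversion to an abstract letter identity, and it says nothing about reaction times. All function names are hypothetical.

```python
def physical_match(a, b):
    """Template-style comparison: the two shapes must be identical."""
    return a == b

def name_match(a, b):
    """Convert both shapes to an abstract letter identity, then compare."""
    return a.lower() == b.lower()

def judge_same(a, b):
    """Return (answer, route), trying the cheap comparison first."""
    if physical_match(a, b):
        return True, "physical"
    if name_match(a, b):
        return True, "name"   # needed the extra conversion step
    return False, "none"

for pair in [("A", "A"), ("A", "a"), ("A", "b")]:
    print(pair, judge_same(*pair))
```

In the sketch, A and a count as the same letter only by way of the second, more abstract format, which is the step the delayed condition seems to make unavoidable.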
Such laboratory legerdemain has revealed that the human brain uses at least four major formats of representation. One format is the visual image, which is like a template in a two-dimensional, picturelike mosaic. (Visual images are discussed in Chapter 4.) Another is a phonological representation, a stretch of syllables that we play in our minds like a tape loop, planning out the mouth movements and imagining what the syllables sound like. This stringlike representation is an important component of our short-term memory, as when we look up a phone number and silently repeat it to ourselves just long enough to dial the number. Phonological short-term memory lasts between one and five seconds and can hold from four to seven “chunks.” (Short-term memory is measured in chunks rather than sounds because each item can be a label that points {90} to a much bigger information structure in long-term memory, such as the content of a phrase or sentence.) A third format is the grammatical representation: nouns and verbs, phrases and clauses, stems and roots, phonemes and syllables, all arranged into hierarchical trees. In The Language Instinct I explained how these representations determine what goes into a sentence and how people communicate and play with language.
The fourth format is mentalese, the language of thought in which our conceptual knowledge is couched. When you put down a book, you forget almost everything about the wording and typeface of the sentences and where they sat on the page. What you take away is their content or gist. (In memory tests, people confidently “recognize” sentences they never saw if they are paraphrases of the sentences they did see.) Mentalese is the medium in which content or gist is captured; I used bits of it in the bulletin board of the production system that identified uncles, and in the “knowledge” and “concept” levels of the semantic network shown in the last diagram. Mentalese is also the mind's lingua franca, the traffic of information among mental modules that allows us to describe what we see, imagine what is described to us, carry out instructions, and so on. This traffic can actually be seen in the anatomy of the brain. The hippocampus and connected structures, which put our memories into long-term storage, and the frontal lobes, which house the circuitry for decision making, are not directly connected to the brain areas that process raw sensory input (the mosaic of edges and colors and the ribbon of changing pitches). Instead, most of their input fibers carry what neuroscientists call “highly processed” input coming from regions one or more stops downstream from the first sensory areas. The input consists of codes for objects, words, and other complex concepts.
Why so many kinds of representations? Wouldn't it be simpler to have an Esperanto of the mind? In fact, it would be hellishly complicated. The modular organization of mental software, with its packaging of knowledge into separate formats, is a nice example of how evolution and engineering converge on similar solutions. Brian Kernighan, a wizard in the software world, wrote a book with P. J. Plauger called The Elements of Programming Style (a play on Strunk and White's famous writing manual, {91} The Elements of Style). They give advice on what makes a program work powerfully, run efficiently, and evolve gracefully. One of their maxims is “Replace repetitive expressions by calls to a common function.” For example, if a program has to compute the areas of three triangles, it should not have three different commands, each with the coordinates of one of the triangles embedded in its own copy of the formula for the area of a triangle. Instead, the program should have the formula spelled out once. There should be a “calculate-triangle-area” function, and it should have slots labeled X, Y, and Z that can stand for any triangle's coordinates. That function can be invoked three times, with the coordinates from the input plugged into the X, Y, and Z slots. This design principle becomes even more important as the function grows from a one-line formula to a multistep subroutine, and it inspired these related maxims, all of which seem to have been followed by natural selection as it designed our modular, multiformat minds:
Modularize.
Use subroutines.
Each module should do one thing well.
Make sure every module hides something.
Localize input and output in subroutines.
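The triangle example that motivates these maxims can be sketched as follows. The text does not give the formula, so the sketch assumes the standard coordinate ("shoelace") formula; the function name and the sample coordinates are hypothetical.

```python
def triangle_area(x, y, z):
    """Area of the triangle whose corners are the points x, y, and z."""
    (x1, y1), (x2, y2), (x3, y3) = x, y, z
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

# The formula is spelled out once and invoked three times, with each
# triangle's coordinates plugged into the X, Y, and Z slots.
triangles = [
    ((0, 0), (4, 0), (0, 3)),
    ((1, 1), (3, 1), (2, 5)),
    ((0, 0), (2, 2), (4, 0)),
]
areas = [triangle_area(x, y, z) for x, y, z in triangles]
print(areas)  # [6.0, 4.0, 4.0]
```

If the formula ever needs correcting, there is one place to correct it, and the callers need to know nothing about how the area is computed.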
A second principle is captured in the maxim
Choose the data representation that makes the program simple.
Kernighan and Plauger give the example of a program that reads in a line of text and then has to print it out centered within a border. The line of text could be stored in many formats (as a string of characters, a list of coordinates, and so on), but one format makes the centering child's play: allocate eighty consecutive memory slots that mirror the eighty positions in the input-output display. The centering can be accomplished in a few steps, without error, for an input of any size; with any other format, the program would have to be more complicated. Presumably the distinct formats of representation used by the human mind — images, phonological loops, hierarchical trees, mentalese — evolved because they allow simple programs (that is, stupid demons or homunculi) to compute useful things from them.
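A minimal sketch of the centering example shows how much work the representation does. It is not Kernighan and Plauger's program; it assumes a list of eighty one-character slots, truncates overlong input as a simplification, and uses made-up function names and border characters.

```python
WIDTH = 80  # one slot per column of the display

def center_line(text, width=WIDTH):
    """Return text centered in a row of exactly `width` slots."""
    slots = [" "] * width               # the row mirrors the display
    text = text[:width]                 # assumption: overlong input is cut off
    start = (width - len(text)) // 2    # leftmost slot the text will occupy
    slots[start:start + len(text)] = list(text)
    return "".join(slots)

def boxed(text, width=WIDTH):
    """Return the centered row surrounded by a simple border."""
    edge = "+" + "-" * width + "+"
    return "\n".join([edge, "|" + center_line(text, width) + "|", edge])

print(boxed("elk", 9))
```

Because the row of slots has the same shape as the display, centering reduces to one subtraction and one division, whatever the length of the input.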
And if you like the intellectual stratosphere in which “complex systems” of all kinds are lumped together, you might be receptive to Herbert {92} Simon's argument that modular design in computers and minds is a special case of modular, hierarchical design in all complex systems. Bodies contain tissues made of cells containing organelles; armed forces comprise armies which contain divisions broken into battalions and eventually platoons; books contain chapters divided into sections, subsections, paragraphs, and sentences; empires are assembled out of countries, provinces, and territories. These “nearly decomposable” systems are defined by rich interactions among the elements belonging to the same component and few interactions among elements belonging to different components. Complex systems are hierarchies of modules because only elements that hang together in modules can remain stable long enough to be assembled into larger and larger modules. Simon gives the analogy of two watchmakers, Hora and Tempus:
The watches the men made consisted of about 1,000 parts each. Tempus had so constructed his that if he had one partly assembled and had to put it down — to answer the phone, say — it immediately fell to pieces and had to be reassembled from the elements. . . .
The watches that Hora made were no less complex than those of Tempus. But he had designed them so that he could put together subassemblies of about ten elements each. Ten of these subassemblies, again, could be put together into a larger subassembly; and a system of ten of the latter subassemblies constituted the whole watch. Hence, when Hora had to put down a partly assembled watch in order to answer the phone, he lost only a small part of his work, and he assembled his watches in only a fraction of the man-hours it took Tempus.
Our complex mental activity follows the wisdom of Hora. As we live our lives, we don't have to attend to every squiggle or plan out every muscle twitch. Thanks to word symbols, any typeface can awaken any bit of knowledge. Thanks to goal symbols, any sign of danger can trigger any means of escape.
The payoff for the long discussion of mental computation and mental representation I have led you through is, I hope, an understanding of the complexity, subtlety, and flexibility that the human mind is capable of even if it is nothing but a machine, nothing but the on-board computer of a robot made of tissue. We don't need spirits or occult forces to explain intelligence. Nor, in an effort to look scientific, do we have to ignore the evidence of our own eyes and claim that human beings are bundles of conditioned associations, puppets of the genes, or followers of brutish {93} instincts. We can have both the agility and discernment of human thought and a mechanistic framework in which to explain it. The later chapters, which try to explain common sense, the emotions, social relations, humor, and the arts, build on the foundation of a complex computational psyche.
Of course, if it was unimaginable that the computational theory of mind was false, that would mean it had no content. In fact, it has been attacked head-on. As one would expect of a theory that has become so indispensable, pea-shooting is not enough; nothing less than undermining the foundations could bring it down. Two flamboyant writers have taken on the challenge. Both have chosen weapons suitable to the occasion, though the weapons are as opposite as can be: one is an appeal to down-home common sense, the other to esoteric physics and mathematics.
The first attack comes from the philosopher John Searle. Searle believes that he refuted the computational theory of mind in 1980 with a thought experiment he adapted from another philosopher, Ned Block (who, ironically, is a major proponent of the computational theory). Searle's version has become famous as the Chinese Room. A man who knows no Chinese is put in a room. Pieces of paper with squiggles on them are slipped under the door. The man has a long list of complicated instructions such as “Whenever you see [squiggle squiggle squiggle], write down [squoggle squoggle squoggle].” Some of the rules tell him to slip his scribbles back out under the door. He gets good at following the instructions. Unknown to him, the squiggles and squoggles are Chinese characters, and the instructions are an artificial intelligence program for answering questions about stories in Chinese. As far as a person on the other side of the door knows, there is a native Chinese speaker in the room. Now, if understanding consists of running a suitable computer program, the guy must understand Chinese, because he is running such a program. But the guy doesn't understand Chinese, not a word of it; he's just manipulating symbols. Therefore, understanding — and, by extension, any aspect of intelligence — is not the same as symbol manipulation or computation. {94}
Searle says that what the program is missing is intentionality, the connection between a symbol and what it means. Many people have interpreted him as saying that the program is missing consciousness, and indeed Searle believes that consciousness and intentionality are closely related because we are conscious of what we mean when we have a thought or use a word. Intentionality, consciousness, and other mental phenomena are caused not by information processing, Searle concludes, but by the “actual physical-chemical properties of actual human brains” (though he never says what those properties are).
The Chinese Room has kicked off a truly unbelievable amount of commentary. More than a hundred published articles have replied to it, and I have found it an excellent reason to take my name off all Internet discussion-group lists. To people who say that the whole room (man plus rule sheet) understands Chinese, Searle replies: Fine, let the guy memorize the rules, do the calculations in his head, and work outdoors. The room is gone, and our symbol-manipulator still does not understand Chinese. To those who say the man lacks any sensorimotor connection to the world, and that is the crucial missing factor, Searle replies: Suppose that the incoming squiggles are the outputs of a television camera and the outgoing squoggles are the commands to a robot arm. He has the connections, but he still doesn't speak the language. To those who say his program does not mirror what the brain does, Searle can invoke Block's parallel distributed counterpart to the Chinese Room, the Chinese Gym: millions of people in a huge gym act as if they are neurons and shout signals to each other over walkie-talkies, duplicating a neural network that answers questions about stories in Chinese. But the gym does not understand Chinese any more than the guy did.
Searle's tactic is to appeal over and over to our common sense. You can almost hear him saying, “Aw, c'mon! You mean to claim that the guy understands Chinese??!!! Geddadahere! He doesn't understand a word!! He's lived in Brooklyn all his life!!” and so on. But the history of science has not been kind to the simple intuitions of common sense, to put it mildly. The philosophers Patricia and Paul Churchland ask us to imagine how Searle's argument might have been used against Maxwell's theory that light consists of electromagnetic waves. A guy holds a magnet in his hand and waves it up and down. The guy is creating electromagnetic radiation, but no light comes out; therefore, light is not an electromagnetic wave. The thought experiment slows down the waves to a range in which we humans no longer see them as light. By trusting our intuitions {95} in the thought experiment, we falsely conclude that rapid waves cannot be light, either. Similarly, Searle has slowed down the mental computation to a range in which we humans no longer think of it as understanding (since understanding is ordinarily much faster). By trusting our intuitions in the thought experiment, we falsely conclude that rapid computation cannot be understanding, either. But if a speeded-up version of Searle's preposterous story could come true, and we met a person who seemed to converse intelligently in Chinese but was really deploying millions of memorized rules in fractions of a second, it is not so clear that we would deny that he understood Chinese.
My own view is that Searle is merely exploring facts about the English word understand. People are reluctant to use the word unless certain stereotypical conditions apply: the rules of the language are used rapidly and unconsciously, and the content of the language is connected to the beliefs of the whole person. If people balk at using the vernacular word understand to embrace exotic conditions that violate the stereotype but preserve the essence of the phenomenon, then nothing, scientifically speaking, is really at stake. We can look for another word, or agree to use the old one in a technical sense; who cares? The explanation of what makes understanding work is the same. Science, after all, is about the principles that make things work, not which things are “really” examples of a familiar word. If a scientist explains the functioning of the human elbow by saying it is a second-class lever, it is no refutation to describe a guy holding a second-class lever made of steel and proclaim, “But look, the guy doesn't have three elbows!!!”
As for the “physical-chemical properties” of the brain, I have already mentioned the problem: brain tumors, the brains of mice, and neural tissue kept alive in a dish don't understand, but their physical-chemical properties are the same as the ones of our brains. The computational theory explains the difference: those hunks of neural tissue are not arranged into patterns of connectivity that carry out the right kind of information processing. For example, they do not have parts that distinguish nouns from verbs, and their activity patterns do not carry out the rules of syntax, semantics, and common sense. Of course, we can always call that a difference in physical-chemical properties (in the same sense that two books differ in their physical-chemical properties), but then the term is meaningless because it can no longer be defined in the language of physics and chemistry.
With thought experiments, turnabout is fair play. Perhaps the ultimate {96} reply to Searle's Chinese Room may be found in a story by the science-fiction writer Terry Bisson, widely circulated on the Internet, which has the incredulity going the other way. It reports a conversation between the leader of an interplanetary explorer fleet and his commander in chief, and begins as follows:
“They're made out of meat.”
“Meat?” . . . “There's no doubt about it. We picked several from different parts of the planet, took them aboard our recon vessels, probed them all the way through. They're completely meat.”
“That's impossible. What about the radio signals? The messages to the stars?”
“They use the radio waves to talk, but the signals don't come from them. The signals come from machines.”
“So who made the machines? That's who we want to contact.”
“They made the machines. That's what I'm trying to tell you. Meat made the machines.”
“That's ridiculous. How can meat make a machine? You're asking me to believe in sentient meat.”
“I'm not asking you, I'm telling you. These creatures are the only sentient race in the sector and they're made out of meat.”
“Maybe they're like the Orfolei. You know, a carbon-based intelligence that goes through a meat stage.”
“Nope. They're born meat and they die meat. We studied them for several of their life spans, which didn't take too long. Do you have any idea [of] the life span of meat?”
“Spare me. Okay, maybe they're only part meat. You know, like the Weddilei. A meat head with an electron plasma brain inside.”
“Nope, we thought of that, since they do have meat heads like the Weddilei. But I told you, we probed them. They're meat all the way through.”
“No brain?”
“Oh, there is a brain all right. It's just that the brain is made out of meat!”
“So . . . what does the thinking?”
“You're not understanding, are you? The brain does the thinking. The meat.”
“Thinking meat! You're asking me to believe in thinking meat!”
“Yes, thinking meat! Conscious meat! Loving meat. Dreaming meat. The meat is the whole deal! Are you getting the picture?” {97}
The other attack on the computational theory of mind comes from the mathematical physicist Roger Penrose, in a best-seller called The Emperor's New Mind (how's that for an in-your-face impugnment!). Penrose draws not on common sense but on abstruse issues in logic and physics. He argues that Gödel's famous theorem implies that mathematicians — and, by extension, all humans — are not computer programs. Roughly, Gödel proved that any formal system (such as a computer program or a set of axioms and rules of inference in mathematics) that is even moderately powerful (powerful enough to state the truths of arithmetic) and consistent (it does not generate contradictory statements) can generate statements that are true but that the system cannot prove to be true. Since we human mathematicians can just see that those statements are true, we are not formal systems like computers. Penrose believes that the mathematician's ability comes from an aspect of consciousness that cannot be explained as computation. In fact, it cannot be explained by the operation of neurons; they're too big. It cannot be explained by Darwin's theory of evolution. It cannot even be explained by physics as we currently understand it. Quantum-mechanical effects, to be explained in an as yet nonexistent theory of quantum gravity, operate in the microtubules that make up the miniature skeleton of neurons. Those effects are so strange that they might be commensurate with the strangeness of consciousness.
Penrose's mathematical argument has been dismissed as fallacious by logicians, and his other claims have been reviewed unkindly by experts in the relevant disciplines. One big problem is that the gifts Penrose attributes to his idealized mathematician are not possessed by real-life mathematicians, such as the certainty that the system of rules being relied on is consistent. Another is that quantum effects almost surely cancel out in nervous tissue. A third is that microtubules are ubiquitous among cells and appear to play no role in how the brain achieves intelligence. A fourth is that there is not even a hint as to how consciousness might arise from quantum mechanics.
The arguments from Penrose and Searle have something in common other than their target. Unlike the theory they attack, they are so unconnected to discovery and explanation in scientific practice that they have been empirically sterile, contributing no insight and inspiring no discoveries on how the mind works. In fact, the most interesting implication of {98} The Emperor's New Mind was pointed out by Dennett. Penrose's denunciation of the computational theory of mind turns out to be a backhanded compliment. The computational theory fits so well into our understanding of the world that, in trying to overthrow it, Penrose had to reject most of contemporary neuroscience, evolutionary biology, and physics!
In Lewis Carroll's story “What the Tortoise Said to Achilles,” the swift-footed warrior has caught up with the plodding tortoise, defying Zeno's paradox in which any head start given to the tortoise should make him uncatchable. (In the time it would take for Achilles to close the gap, the tortoise would have progressed a small amount; in the time it took to close that gap, the tortoise would have moved a bit farther, ad infinitum.) The tortoise offers Achilles a similar paradox from logic. Achilles pulls an enormous notebook and a pencil from his helmet, and the tortoise dictates Euclid's First Proposition:
(A) Things that are equal to the same are equal to each other.
(B) The two sides of this Triangle are things that are equal to the same.
(Z) The two sides of this Triangle are equal to each other.
The tortoise gets Achilles to agree that anyone who accepts A and B and “If A and B then Z” must also accept Z. But now the tortoise disagrees with Achilles’ logic. He says he is entitled to reject conclusion Z, because no one ever wrote down the if-then rule on the list of premises he must accept. He challenges Achilles to force him to conclude Z. Achilles replies by adding C to the list in his notebook:
(C) If A and B are true, Z must be true.
The tortoise replies that he fails to see why he should assume that just because A and B and C are true, Z is true. Achilles adds one more statement —
(D) If A and B and C are true, Z must be true.
— and declares that “Logic [must] take you by the throat, and force you” to accept Z. The tortoise replies, {99}
“Whatever Logic is good enough to tell me is worth writing down. So enter it in your book, please. We will call it
(E) If A and B and C and D are true, Z must be true.”
“I see,” said Achilles; and there was a touch of sadness in his tone.
Here the narrator, having pressing business at the Bank, was obliged to leave the happy pair, and did not again pass the spot until some months afterwards. When he did so, Achilles was still seated on the back of the much-enduring tortoise, and was writing in his notebook, which appeared to be nearly full. The tortoise was saying, “Have you got that last step written down? Unless I've lost count, that makes a thousand and one. There are several millions more to come.”
The solution to the paradox, of course, is that no inference system follows explicit rules all the way down. At some point the system must, as Jerry Rubin (and later the Nike Corporation) said, just do it. That is, the rule must simply be executed by the reflexive, brute-force operation of the system, no more questions asked. At that point the system, if implemented as a machine, would not be following rules but obeying the laws of physics. Similarly, if representations are read and written by demons (rules for replacing symbols with symbols), and the demons have smaller (and stupider) demons inside them, eventually you have to call Ghostbusters and replace the smallest and stupidest demons with machines — in the case of people and animals, machines built from neurons: neural networks. Let's see how our picture of how the mind works can be grounded in simple ideas of how the brain works.
The first hints came from the mathematicians Warren McCulloch and Walter Pitts, who wrote about the “neurological” properties of connected neurons. Neurons are complicated and still not understood, but McCulloch and Pitts and most neural-network modelers since have identified one thing neurons do as the most significant thing. Neurons, in effect, add up a set of quantities, compare the sum to a threshold, and indicate whether the threshold is exceeded. That is a conceptual description of what they do; the corresponding physical description is that a firing neuron is active to varying degrees, and its activity level is influenced by the activity levels of the incoming axons from other neurons attached at synapses to the neuron's dendrites (input structures). A synapse has a strength ranging from positive (excitatory) through zero (no effect) to negative (inhibitory). The activation level of each incoming axon is multiplied by the strength of the synapse. The neuron sums these {100} incoming levels; if the total exceeds a threshold, the neuron will become more active, sending a signal in turn to any neuron connected to it. Though neurons are always firing and incoming signals merely cause them to fire at a detectably faster or slower rate, it is sometimes convenient to describe them as being either off (resting rate) or on (elevated rate).
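The toy neuron just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `toy_neuron`, the particular synapse strengths, and the threshold of 0.5 are invented for the example and are not taken from McCulloch and Pitts.

```python
def toy_neuron(activations, weights, threshold):
    """Multiply each incoming activation by its synapse strength, sum, and compare to the threshold."""
    total = sum(a * w for a, w in zip(activations, weights))
    return 1 if total > threshold else 0  # 1 = "on" (elevated rate), 0 = "off" (resting rate)

# Hypothetical synapses: one excitatory (+0.8), one with no effect (0.0), one inhibitory (-0.5).
weights = [0.8, 0.0, -0.5]

print(toy_neuron([1, 1, 0], weights, threshold=0.5))  # excitatory input arrives unopposed
print(toy_neuron([1, 1, 1], weights, threshold=0.5))  # the inhibitory input pulls the sum back down
```

With these made-up numbers the first call turns the neuron on and the second leaves it off, because the inhibitory synapse drags the sum back below the threshold.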
McCulloch and Pitts showed how these toy neurons could be wired up to make logic gates. Logic gates implement the basic logical relations “and,” “or,” and “not” that underlie simple inferences. “A and B” (conceptually) is true if A is true and if B is true. An AND-gate (mechanically) produces an output if both of its inputs are on. To make an AND-gate out of toy neurons, set the threshold of the output unit to be greater than each of the incoming weights but less than their sum, as in the mini-network on the left below. “A or B” (conceptually) is true if A is true or if B is true. An OR-gate (mechanically) produces an output if either of its inputs is on. To make one, set the threshold to be less than each incoming weight, as in the middle mini-network below. Finally, “not A” (conceptually) is true if A is false, and vice versa. A NOT-gate (mechanically) produces an output when it receives no input, and vice versa. To make one, set the threshold at zero, so the neuron will fire when it gets no input, and make the incoming weight negative, so that an incoming signal will turn the neuron off, as in the mini-network on the right.
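The three recipes can be checked with a short Python sketch. The weights of 1.0 and the thresholds of 1.5 and 0.5 are assumptions chosen to satisfy the stated conditions. The sketch also assumes that a unit fires when its sum reaches the threshold, so that a NOT unit with a threshold of zero fires on no input.

```python
def fires(inputs, weights, threshold):
    """Toy neuron: on (1) when the weighted sum reaches the threshold, else off (0)."""
    return int(sum(i * w for i, w in zip(inputs, weights)) >= threshold)

def and_gate(a, b):
    # Threshold 1.5 is greater than each weight (1.0) but less than their sum (2.0).
    return fires([a, b], [1.0, 1.0], 1.5)

def or_gate(a, b):
    # Threshold 0.5 is less than each incoming weight (1.0).
    return fires([a, b], [1.0, 1.0], 0.5)

def not_gate(a):
    # Threshold 0 fires on no input; the negative weight lets an input switch the unit off.
    return fires([a], [-1.0], 0.0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "and:", and_gate(a, b), "or:", or_gate(a, b))
print("not 0:", not_gate(0), "not 1:", not_gate(1))
```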
Suppose that each toy neuron represents a simple proposition. The mini-networks can be wired together, with the output of one feeding the input to another, to evaluate the truth of a complex proposition. For example, a neural network could evaluate the proposition {[(X chews its cud) and (X has cloven hooves)] or [(X has fins) and (X has scales)]}, a summary of what it takes for an animal to be kosher. In fact, if a network of toy neurons is connected to some kind of extendable memory (such as a roll of paper moving under a rubber stamp and an eraser), it would be a Turing machine, a full-powered computer. {101}
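As a sketch of how the gates compose, the kosher proposition can be wired from two AND units feeding one OR unit. The weights and thresholds are the same illustrative assumptions as before, and the animals in the usage lines are examples added here, not cases from the text.

```python
def unit(inputs, weights, threshold):
    """Toy neuron: on (1) when the weighted sum reaches the threshold."""
    return int(sum(i * w for i, w in zip(inputs, weights)) >= threshold)

def and_gate(a, b):
    return unit([a, b], [1, 1], 1.5)

def or_gate(a, b):
    return unit([a, b], [1, 1], 0.5)

def kosher(chews_cud, cloven_hooves, fins, scales):
    """[(chews cud) and (cloven hooves)] or [(fins) and (scales)]"""
    land_rule = and_gate(chews_cud, cloven_hooves)
    water_rule = and_gate(fins, scales)
    return or_gate(land_rule, water_rule)

print("cow:", kosher(1, 1, 0, 0))      # chews its cud and has cloven hooves
print("pig:", kosher(0, 1, 0, 0))      # cloven hooves only
print("salmon:", kosher(0, 0, 1, 1))   # fins and scales
print("catfish:", kosher(0, 0, 1, 0))  # fins but no scales
```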
It is utterly impractical, though, to represent propositions, or even the concepts composing them, in logic gates, whether those logic gates are made out of neurons or semiconductors. The problem is that every concept and proposition has to be hard-wired in advance as a separate unit. Instead, both computers and brains represent concepts as patterns of activity over sets of units. A simple example is the lowly byte, which represents an alphanumeric character in your computer. The representation of the letter B is 01000010, where the digits (bits) correspond to tiny pieces of silicon laid out in a row. The second and seventh pieces are charged, corresponding to the ones, and the other pieces are uncharged, corresponding to the zeros. A byte can also be built out of toy neurons, and a circuit for recognizing the B pattern can be built as a simple neural network:
You can imagine that this network is one of the parts making up a demon. If the bottom row of toy neurons is connected to short-term memory, the top one detects whether short-term memory contains an instance of the symbol B. And on page 102 is a network for a demon-part that writes the symbol B into memory.
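One way such a detector and writer might be wired is sketched below. The weights are a guess, not a transcription of the book's diagrams: the detector gives +1 to the two bits that should be on and -1 to the six that should be off, with a threshold just under 2, and the writer fans a single intention unit out to the eight memory bits.

```python
B = [0, 1, 0, 0, 0, 0, 1, 0]  # the byte 01000010 from the text

def detects_b(bits):
    """Top unit: fires only if both 'on' bits are on and none of the 'off' bits is on."""
    weights = [1 if b else -1 for b in B]   # assumed weights: excite on expected ones, inhibit on expected zeros
    threshold = sum(B) - 0.5                # just under the number of expected ones
    total = sum(w * x for w, x in zip(weights, bits))
    return int(total > threshold)

def write_b(intention):
    """One intention unit fans out to eight output units, with weight 1 to the bits that should be on."""
    return [int(intention * b > 0.5) for b in B]

print(detects_b([0, 1, 0, 0, 0, 0, 1, 0]))  # the pattern for B
print(detects_b([0, 1, 0, 0, 0, 0, 1, 1]))  # 01000011, the pattern for C
print(write_b(1))
```

The inhibitory weights matter: without them the detector would also fire for any pattern that merely contains B's two charged bits, such as the one for C.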
We are on our way to building a conventional digital computer out of toy neurons, but let's change direction a bit and make a more biomorphic computer. First, we can use the toy neurons to implement not classical logic but fuzzy logic. In many domains people do not have all-or-none convictions about whether something is true. A thing can be a better or a worse example of a category rather than being either in or out. Take the category “vegetable.” Most people agree that celery is a full-fledged {102} vegetable but that garlic is only a so-so example. And if we are to believe the Reagan administration when it justified its parsimonious school lunch program, even ketchup is a kind of vegetable — though after a firestorm of criticism the administration conceded that it is not a very good example of one. Conceptually speaking, we eschew the idea that something either is or is not a vegetable and say that things can be better or worse examples of a vegetable. Mechanically speaking, we no longer insist that a unit representing vegetablehood be either on or off, but allow it to have a value ranging from 0 (for a rock) through 0.1 (for ketchup) through 0.4 (for garlic) to 1.0 (for celery).
We can also scrap the arbitrary code that relates each concept to a meaningless string of bits. Each bit can earn its keep by representing something. One bit might represent greenness, another leafiness, another crunchiness, and so on. Each of these vegetable-property units could be connected with a small weight to the vegetable unit itself. Other units, representing features that vegetables lack, such as “magnetic” or “mobile,” could be connected with negative weights. Conceptually speaking, the more vegetable properties something has, the better an example it is of a vegetable. Mechanically speaking, the more vegetable-property units are turned on, the higher the activation level of the vegetable unit.
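A graded vegetable unit of this kind might look like the following sketch. The particular weights are invented for illustration and are not calibrated to the values quoted for ketchup or garlic. The only claims carried over from the text are that vegetable properties have small positive weights, that non-vegetable properties have negative ones, and that activation ranges from 0 to 1.

```python
# Hypothetical weights from property units to the vegetable unit.
WEIGHTS = {
    "green": 0.3,
    "leafy": 0.3,
    "crunchy": 0.4,
    "magnetic": -0.5,
    "mobile": -0.5,
}

def vegetablehood(properties):
    """Sum the weights of the active property units and keep the result between 0 and 1."""
    total = sum(WEIGHTS[p] for p in properties)
    return max(0.0, min(1.0, total))

print(vegetablehood(["green", "leafy", "crunchy"]))  # many vegetable properties
print(vegetablehood(["crunchy"]))                    # only one
print(vegetablehood(["green", "mobile"]))            # a disqualifying property outweighs greenness
```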
Once a network is allowed to be squishy, it can represent degrees of evidence and probabilities of events and can make statistical decisions. Suppose each unit in a network represents a piece of evidence implicating the butler (fingerprints on the knife, love letters to the victim's wife, and so on). Suppose the top node represents the conclusion that the butler did it. Conceptually speaking, the more clues there are that the butler might have done it, the higher our estimate would be that the butler did {103} do it. Mechanically speaking, the more clue units there are that are turned on, the greater the activation of the conclusion unit. We could implement different statistical procedures in the network by designing the conclusion unit to integrate its inputs in different ways. For example, the conclusion unit could be a threshold unit like the ones in crisp logic gates; that would implement a policy to put out a decision only if the weight of evidence exceeded a critical value (say, “beyond a reasonable doubt”). Or the conclusion unit could increase its activity gradually; its degree of confidence could increase slowly with the first clues trickling in, build quickly as more and more are amassed, and level off at a point of diminishing returns. These are two of the kinds of unit that neural-network modelers like to use.
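The two kinds of conclusion unit can be compared side by side in a small sketch. The S-shaped unit is implemented here with the logistic function, which is one common choice rather than something the text specifies. The critical value of three clues, the midpoint, and the steepness are arbitrary assumptions.

```python
import math

def threshold_unit(n_clues, critical=3):
    """All-or-none verdict: on only once the evidence reaches the critical value."""
    return 1.0 if n_clues >= critical else 0.0

def sigmoid_unit(n_clues, midpoint=3.0, steepness=1.5):
    """Graded confidence: slow at first, fast in the middle, leveling off at the top."""
    return 1.0 / (1.0 + math.exp(-steepness * (n_clues - midpoint)))

# Print both units' responses as clues against the butler accumulate.
for n in range(7):
    print(n, threshold_unit(n), round(sigmoid_unit(n), 2))
```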
We can get even more adventurous, and take inspiration from the fact that with neurons, unlike silicon chips, connections are cheap. Why not connect every unit to every other unit? Such a network would embody not only the knowledge that greenness predicts vegetablehood and crunchiness predicts vegetablehood, but that greenness predicts crunchiness, crunchiness predicts leafiness, greenness predicts lack of mobility, and so on: {104}
With this move, interesting things begin to happen. The network begins to resemble human thought processes in ways that sparsely connected networks do not. For this reason psychologists and artificial intelligence researchers have been using everything-connected-to-everything networks to model many examples of simple pattern recognition. They have built networks for the lines that co-occur in letters, the letters that co-occur in words, the animal parts that co-occur in animals, and the pieces of furniture that co-occur in rooms. Often the decision node at the top is thrown away and only the correlations among the properties are calculated. These networks, sometimes called auto-associators, have five nifty features.
First, an auto-associator is a reconstructive, content-addressable memory. In a commercial computer, the bits themselves are meaningless, and the bytes made out of them have arbitrary addresses, like houses on a street, which have nothing to do with their contents. Memory locations are accessed by their addresses, and to determine whether a pattern has been stored somewhere in memory you have to search them all (or use clever shortcuts). In a content-addressable memory, on the other hand, specifying an item automatically lights up any location in memory containing a copy of the item. Since an item is represented in an auto-associator by turning on the units that represent its properties (in this case celery, greenness, leafiness, and so on), and since those units are connected to one another with strong weights, the activated units will reinforce one another, and after a few rounds in which activation reverberates through the network, all the units pertaining to the item will lock into the “on” position. That indicates that the item has been recognized. In fact, a single auto-associator can accommodate many sets of weights in its battery of connections, not just one, so it can store many items at a time.
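A minimal sketch of such a memory follows, in the style of what modelers call a Hopfield network. Several things here are assumptions made for the sake of a runnable example: units take the values +1 (on) and -1 (off), each stored item raises the weight between units that agree and lowers it between units that disagree, and the second item ("brick") and its properties are invented.

```python
UNITS = ["celery", "green", "leafy", "crunchy", "brick", "red", "heavy", "hard"]

def encode(on_units):
    """Represent an item as +1 for units that are on and -1 for units that are off."""
    return [1 if name in on_units else -1 for name in UNITS]

def store(items):
    """Raise the weight between units that agree in a stored item; lower it where they disagree."""
    n = len(UNITS)
    weights = [[0] * n for _ in range(n)]
    for item in items:
        p = encode(item)
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, cue, rounds=5):
    """Let activation reverberate: each unit repeatedly adopts the sign of its weighted input."""
    state = encode(cue)
    for _ in range(rounds):
        for i in range(len(state)):
            total = sum(weights[i][j] * state[j] for j in range(len(state)))
            if total != 0:
                state[i] = 1 if total > 0 else -1
    return {name for name, s in zip(UNITS, state) if s == 1}

CELERY = {"celery", "green", "leafy", "crunchy"}
BRICK = {"brick", "red", "heavy", "hard"}
memory = store([CELERY, BRICK])

# Turn on only two of celery's units and let the network settle.
print(sorted(recall(memory, {"green", "crunchy"})))
```

No address is ever consulted. The cue is a fragment of the content itself, and the same set of weights holds both items.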
Better yet, the connections are redundant enough that even if only a part of the pattern for an item is presented to the auto-associator, say, greenness and crunchiness alone, the rest of the pattern, leafiness, gets completed automatically. In some ways this is reminiscent of the mind. We do not need predefined retrieval tags for items in memory; almost any aspect of an object can bring the entire object to mind. For example, we can recall “vegetable” upon thinking about things that are green and leafy or green and crunchy or leafy and crunchy. A visual example is our ability to complete a word from a few of its fragments. We do not see this figure as random line segments or even as an arbitrary sequence of letters like MIHB, but as something more probable: {105}
A second selling point, called “graceful degradation,” helps deal with noisy input or hardware failure. Who isn't tempted to throw a shoe through the computer screen when it responds to the command pritn file with the error message pritn: command not found? In Woody Allen's Take the Money and Run, the bank robber Virgil Starkwell is foiled by his penmanship when the teller asks him why he wrote that he is pointing a gub at her. In a Gary Larson cartoon that adorns the office door of many a cognitive psychologist, a pilot flying over a castaway on a desert island reads the message scratched in the sand and shouts into his radio, “Wait! Wait! . . . Cancel that, I guess it says ‘HELF’.” Real-life humans do better, perhaps because we are fitted with auto-associators that use a preponderance of mutually consistent pieces of information to override one unusual piece. “Pritn” would activate the more familiar pattern “print”; “gub” would be warped to “gun,” “HELF” to “HELP.” Similarly, a computer with a single bad bit on its disk, a smidgen of corrosion in one of its sockets, or a brief dip in its supply of power can lock up and crash. But a human being who is tired, hung over, or brain-damaged does not lock up and crash; usually he or she is slower and less accurate but can muster an intelligible response.
A third advantage is that auto-associators can do a simple version of the kind of computation called constraint satisfaction. Many problems that humans solve have a chicken-and-egg character. An example from Chapter 1 is that we compute the lightness of a surface from a guess about its angle and compute the angle of the surface from a guess about its lightness, without knowing either for sure beforehand. These problems abound in perception, language, and common-sense reasoning. Am I looking at a fold or at an edge? Am I hearing the vowel [I] (as in pin) or the vowel [e] (as in pen) with a southern accent? Was I the victim of an act of malice or an act of stupidity? These ambiguities can sometimes be resolved by choosing the interpretation that is consistent with the greatest number of interpretations of other ambiguous events, if they could all be resolved at once. For example, if one speech sound can be interpreted as either send or sinned, and another as either pen or pin, I can resolve the uncertainties if I hear one speaker utter both words with the same vowel sound. He must have intended send and pen, I would reason, {106} because send a pen is the only guess that does not violate some constraint. Sinned and pin would give me sinned a pin, which violates the rules of grammar and plausible meaning; send and pin can be ruled out by the constraint that the two vowels were pronounced identically; sinned and pen can be ruled out because they violate both these constraints.
This kind of reasoning takes a long time if all the compatibilities must be tested one at a time. But in an auto-associator, they are coded beforehand in the connections, and the network can evaluate them all at once. Suppose each interpretation is a toy neuron, one for sinned, one for send, and so on. Suppose that pairs of units whose interpretations are consistent are connected with positive weights and pairs of units whose interpretations are inconsistent are connected with negative weights. Activation will ricochet around the network, and if all goes well, it will settle into a state in which the greatest number of mutually consistent interpretations are active. A good metaphor is a soap bubble that wobbles in eggy and amoeboid shapes as the tugs among its neighboring molecules pull it into a sphere.
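The send/sinned, pen/pin example can be run as a tiny settling network. This is a sketch under simplifying assumptions. Every pair of readings gets a weight of -1 except send and pen, the one pair that violates no constraint, which gets +1, and rival readings of the same sound also inhibit each other. Activations start undecided at 0.5 and are nudged by their net input on every round, staying between 0 and 1. The update rule and rate are arbitrary choices.

```python
READINGS = ["send", "sinned", "pen", "pin"]

def weight(a, b):
    """+1 for the one pair that violates no constraint, -1 for every other pair."""
    return 1.0 if {a, b} == {"send", "pen"} else -1.0

def settle(rounds=30, rate=0.2):
    """All units update in parallel until the network locks into a stable state."""
    activation = {r: 0.5 for r in READINGS}  # every reading starts out equally plausible
    for _ in range(rounds):
        net = {
            r: sum(weight(r, other) * activation[other] for other in READINGS if other != r)
            for r in READINGS
        }
        activation = {
            r: min(1.0, max(0.0, activation[r] + rate * net[r])) for r in READINGS
        }
    return activation

print(settle())
```

All four units sag at first, but sinned and pin, which have no allies, are driven to zero within a couple of rounds. Send and pen then prop each other up to full activation.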
Sometimes a constraint network can have mutually inconsistent but equally stable states. That captures the phenomenon of global ambiguity, in which an entire object, not just its parts, can be interpreted in two ways. If you stare at the drawing of a cube on page 107 (called a Necker cube), your perception will flip from a downward view of its top face to an upward view of its bottom face. When the global flip occurs, the interpretations of all of the local parts are dragged with it. Every near edge becomes a far edge, every convex corner becomes a concave corner, and so on. Or vice versa: if you try to see a convex corner as concave, you can sometimes nudge the whole cube into flipping. The dynamics can be captured in a network, shown below the cube, in which the units represent the interpretations of the parts, and the interpretations that are consistent in a 3-D object excite each other while the ones that are inconsistent inhibit each other.
A fourth advantage comes from a network's ability to generalize automatically. If we had connected our letter-detector (which funneled a bank of input units into a decision unit) to our letter-printer (which had an intention unit fanning out into a bank of output units), we would have made a simple read-write or lookup demon — for example, one that responds to a B by printing a C. But interesting things happen if you skip the middleman and connect the input units directly to the output units. {107}
Instead of a faithful-to-the-letter lookup demon, you have one that can generalize a bit. The network is called a pattern associator.
Suppose the input units at the bottom represent the appearance of animals: “hairy,” “quadrupedal,” “feathered,” “green,” “long-necked,” and so on. With enough units, every animal can be represented by turning on the units for its unique set of properties. A parrot is represented by turning the “feathered” unit on, the “hairy” unit off, and so on. Now suppose the output units at the top stand for zoological facts. One represents the fact that the animal is herbivorous, another that it is warm-blooded, and so on. With no units standing for a particular animal (that is, with no unit for “parrot”), the weights automatically represent statistical knowledge about classes of animals. They embody the knowledge that feathered things tend to be warm-blooded, animals with hair tend to bear live {108} young, and so on. Any fact stored in the connections for one animal (parrots are warm-blooded) automatically transfers to similar animals (budgies are warm-blooded), because the network does not care that the connections belong to an animal at all. The connections merely say which visible properties predict which invisible properties, skipping ideas about species of animals altogether.
Conceptually speaking, a pattern associator captures the idea that if two objects are similar in some ways, they are probably similar in other ways. Mechanically speaking, similar objects are represented by some of the very same units, so any piece of information connected to the units for one object will ipso facto be connected to many of the units for the other. Moreover, classes of different degrees of inclusiveness are superimposed in the same network, because any subset of the units implicitly defines a class. The fewer the units, the larger the class. Say there are input units for “moves,” “breathes,” “hairy,” “barks,” “bites,” and “lifts-leg-at-hydrants.” The connections emanating out of all six trigger facts about dogs. The connections emanating out of the first three trigger facts about mammals. The connections emanating out of the first two trigger facts about animals. With suitable weights, the knowledge programmed in for one animal can be shared with both its immediate and its distant family members.
A fifth trick of neural networks is that they learn from examples, where learning consists of changes in the connection weights. The model-builder (or evolution) does not have to hand-set the thousands of weights needed to get the outputs right. Suppose a “teacher” feeds a pattern associator with an input and also with the correct output. A learning mechanism compares the network's actual output — which at first will be pretty random — with the correct one, and adjusts the weights to minimize the difference between the two. If the network leaves an output node off that the teacher says ought to be on, we want to make it more likely that the current funnel of active inputs will turn it on in the future. So the weights on the active inputs to the recalcitrant output unit are increased slightly. In addition, the output node's own threshold is lowered slightly, to make it more trigger-happy across the board. If the network turns an output node on and the teacher says it should be off, the opposite happens: the weights of the currently active input lines are taken down a notch (possibly driving the weight past zero to a negative value), and the target node's threshold is raised. This all makes the hyperactive output node more likely to turn off in response to those {109} inputs in the future. A whole series of inputs and their outputs is presented to the network, over and over, causing waves of little adjustments of the connection weights, until it gets every output right for every input, at least as well as it can manage to.
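The learning procedure just described can be sketched for a single output unit. The training set is a made-up illustration: three input units ("feathered," "hairy," "scaly") and one output unit ("warm-blooded"), with a parrot, a dog, and a lizard as teachers' examples. The learning rate and number of passes are arbitrary.

```python
def predict(weights, threshold, inputs):
    """Output unit is on when the weighted sum of its inputs exceeds its threshold."""
    return int(sum(w * x for w, x in zip(weights, inputs)) > threshold)

def train(examples, n_inputs, rate=0.1, passes=50):
    weights = [0.0] * n_inputs
    threshold = 0.0
    for _ in range(passes):
        for inputs, target in examples:
            # +1: unit was off but should be on; -1: on but should be off; 0: correct.
            error = target - predict(weights, threshold, inputs)
            for i, x in enumerate(inputs):
                weights[i] += rate * error * x  # only the currently active inputs change
            threshold -= rate * error           # lower it to be more trigger-happy, raise it to be less
    return weights, threshold

# Inputs: (feathered, hairy, scaly); target: warm-blooded?
examples = [
    ((1, 0, 0), 1),  # parrot
    ((0, 1, 0), 1),  # dog
    ((0, 0, 1), 0),  # lizard
]
weights, threshold = train(examples, n_inputs=3)

# A budgie was never presented, but it shares the "feathered" unit with the parrot.
print(predict(weights, threshold, (1, 0, 0)))
```

Because nothing in the network stands for "parrot" as such, what was learned about the parrot applies to any input that turns on the same units.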
A pattern associator equipped with this learning technique is called a perceptron. Perceptrons are interesting but have a big flaw. They are like the chef from hell: they think that if a little of each ingredient is good, a lot of everything must be better. In deciding whether a set of inputs justifies turning on an output, the perceptron weights them and adds them up. Often that gives the wrong answer, even on very simple problems. A textbook example of this flaw is the perceptron's handling of the simple logical operation called exclusive-or (“xor”), which means “A or B, but not both.”
When A is on, the network should turn A-xor-B on. When B is on, the network should turn A-xor-B on. These facts will coax the network into increasing the weight for the connection from A (say, to .6) and increasing the weight for the connection from B (say, to .6), making each one high enough to overcome the output unit's threshold (say, .5). But when A and B are both on, we have too much of a good thing — A-xor-B is screaming its head off just when we want it to shut up. If we try smaller weights or a higher threshold, we can keep it quiet when A and B are both on, but then, unfortunately, it will be quiet when just A or just B is on. You can experiment with your own weights and you will see that nothing works. Exclusive-or is just one of many demons that cannot be built out of perceptrons; others include demons to determine whether an even or an odd number of units are on, to determine whether a string of active units is symmetrical, and to get the answer to a simple addition problem.
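If you would rather let a computer do the experimenting, the following sketch tries every combination of two weights and a threshold on a grid from -2 to 2 in steps of 0.1. The grid and the function names are assumptions of the illustration. A grid search is not a proof, but it agrees with the argument above: settings that compute plain "or" turn up at once, and none computes exclusive-or.

```python
from itertools import product

def fires(w_a, w_b, threshold, a, b):
    """A single output unit: weigh the inputs, add them up, compare to threshold."""
    return w_a * a + w_b * b > threshold

XOR = {(0, 0): False, (0, 1): True, (1, 0): True, (1, 1): False}
OR = {(0, 0): False, (0, 1): True, (1, 0): True, (1, 1): True}

def find_settings(truth_table, grid):
    """Return the first (w_a, w_b, threshold) that reproduces the table, if any."""
    for w_a, w_b, threshold in product(grid, repeat=3):
        if all(fires(w_a, w_b, threshold, a, b) == want
               for (a, b), want in truth_table.items()):
            return (w_a, w_b, threshold)
    return None

GRID = [i / 10 for i in range(-20, 21)]   # -2.0, -1.9, ..., 2.0

print(find_settings(OR, GRID))    # some workable setting
print(find_settings(XOR, GRID))   # None

# The text's example: weights of .6 and a threshold of .5 fire for A alone,
# for B alone, and, unhappily, for A and B together.
print(fires(0.6, 0.6, 0.5, 1, 1))
```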
The solution is to make the network less of a stimulus-response creature and give it an internal representation between the input and output layers. It needs a representation that makes the crucial kinds of information about the inputs explicit, so that each output unit really can just add {110} up its inputs and get the right answer. Here is how it can be done for exclusive-or:
The two hidden units between the input and the output calculate useful intermediate products. The one on the left computes the simple case of “A or B,” which in turn simply excites the output node. The one on the right computes the vexing case of “A and B,” and it inhibits the output node. The output node can simply compute “(A or B) and not (A and B),” which is well within its feeble powers. Note that even at the microscopic level of building the simplest demons out of toy neurons, internal representations are indispensable; stimulus-response connections are not enough.
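The wiring just described can be written out as three threshold units. The particular weights and thresholds below are assumptions chosen for the sketch; any values that make the left unit an "or," the right unit an "and," and the output unit excited by the first and inhibited by the second would serve.

```python
def unit(inputs, weights, threshold):
    """A toy neuron: on if the weighted sum of its inputs exceeds the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) > threshold else 0

def xor_net(a, b):
    either = unit((a, b), (1.0, 1.0), 0.5)   # hidden unit on the left: "A or B"
    both = unit((a, b), (1.0, 1.0), 1.5)     # hidden unit on the right: "A and B"
    # The output unit is excited by "either" and inhibited by "both".
    return unit((either, both), (1.0, -1.0), 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

Each unit still does nothing more than add up its inputs; the work is done by the intermediate representation.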
Even better, a hidden-layer network can be trained to set its own weights, using a fancier version of the perceptron learning procedure. As before, a teacher gives the network the correct output for every input, and the network adjusts the connection weights up or down to try to reduce the difference. But that poses a problem the perceptron did not have to worry about: how to adjust the connections from the input units to the hidden units. It is problematic because the teacher, unless it is a mind reader, has no way of knowing the “correct” states for the hidden units, which are sealed inside the network. The psychologists David Rumelhart, Geoffrey Hinton, and Ronald Williams hit on a clever solution. The output units propagate back to each hidden unit a signal that represents the sum of the hidden unit's errors across all the output units it connects to (“you're sending too much activation,” or “you're sending too little activation,” and by what amount). That signal can serve as a surrogate teaching signal which may be used to adjust the hidden layer's inputs. The connections from the input layer to each hidden unit can be nudged up or down to reduce the hidden unit's tendency to overshoot or undershoot, given the current input pattern. This procedure, called “error back-propagation” or simply “backprop,” can be iterated backwards to any number of layers. {111}
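A bare-bones sketch of the procedure, applied to exclusive-or with two hidden units, might look like this. Several things here are assumptions of the illustration rather than details from the text: the units are smooth (sigmoid) rather than all-or-none, each threshold is folded into a bias term (a bias is just a negated threshold), and the starting weights, learning rate, and number of passes are arbitrary. Training of this kind is not guaranteed to reach a perfect solution from every starting point, so the sketch reports only whether the error has gone down.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # exclusive-or

def new_net():
    # Arbitrary starting weights, fixed by hand only to make the run repeatable.
    return {"w_hidden": [[0.5, -0.4], [0.3, 0.8]], "b_hidden": [0.1, -0.2],
            "w_out": [0.7, -0.6], "b_out": 0.3}

def forward(net, x):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    out = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, out

def mean_squared_error(net):
    return sum((t - forward(net, x)[1]) ** 2 for x, t in DATA) / len(DATA)

def backprop_epoch(net, rate):
    for x, target in DATA:
        hidden, out = forward(net, x)
        # The teacher's signal at the output unit: too little or too much, and by how much.
        delta_out = (target - out) * out * (1 - out)
        # Surrogate teaching signal for each hidden unit: the output error passed
        # back along the connection (one output unit here, so the sum has one term).
        delta_hidden = [delta_out * w * h * (1 - h)
                        for w, h in zip(net["w_out"], hidden)]
        for j, h in enumerate(hidden):
            net["w_out"][j] += rate * delta_out * h
        net["b_out"] += rate * delta_out
        for j, d in enumerate(delta_hidden):
            for i, xi in enumerate(x):
                net["w_hidden"][j][i] += rate * d * xi
            net["b_hidden"][j] += rate * d

net = new_net()
error_before = mean_squared_error(net)
for _ in range(5000):
    backprop_epoch(net, rate=0.5)
error_after = mean_squared_error(net)
print(error_before, error_after)
```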
We have reached what many psychologists treat as the height of the neural-network modeler's art. In a way, we have come full circle, because a hidden-layer network is like the arbitrary road map of logic gates that McCulloch and Pitts proposed as their neuro-logical computer. Conceptually speaking, a hidden-layer network is a way to compose a set of propositions, which can be true or false, into a complicated logical function held together by ands, ors, and nots — though with two twists. One is that the values can be continuous rather than on or off, and hence they can represent the degree of truth or the probability of truth of some statement rather than dealing only with statements that are absolutely true or absolutely false. The second twist is that the network can, in many cases, be trained to take on the right weights by being fed with inputs and their correct outputs. On top of these twists there is an attitude: to take inspiration from the many connections among neurons in the brain and feel no guilt about going crazy with the number of gates and connections put into a network. That ethic allows one to design networks that compute many probabilities and hence that exploit the statistical redundancies among the features of the world. And that, in turn, allows neural networks to generalize from one input to similar inputs without further training, as long as the problem is one in which similar inputs yield similar outputs.
Those are a few ideas on how to implement our smallest demons and their bulletin boards as vaguely neural machines. The ideas serve as a bridge, rickety for now, along the path of explanation that begins in the conceptual realm (Grandma's intuitive psychology and the varieties of knowledge, logic, and probability theory that underlie it), continues on to rules and representations (demons and symbols), and eventually arrives at real neurons. Neural networks also offer some pleasant surprises. In figuring out the mind's software, ultimately we may use only demons stupid enough to be replaced by machines. If we seem to need a smarter demon, someone has to figure out how to build him out of stupider ones. It all goes faster, and sometimes goes differently, when neural-network modelers working from the neurons upward can build an inventory of stock demons that do handy things, like a content-addressable memory or an automatically generalizing pattern associator. The mental software engineers (actually, reverse-engineers) have a good parts catalogue from which they can order smart demons. {112}
Where do the rules and representations in mentalese leave off and the neural networks begin? Most cognitive scientists agree on the extremes. At the highest levels of cognition, where we consciously plod through steps and invoke rules we learned in school or discovered ourselves, the mind is something like a production system, with symbolic inscriptions in memory and demons that carry out procedures. At a lower level, the inscriptions and rules are implemented in something like neural networks, which respond to familiar patterns and associate them with other patterns. But the boundary is in dispute. Do simple neural networks handle the bulk of everyday thought, leaving only the products of book-learning to be handled by explicit rules and propositions? Or are the networks more like building blocks that aren't humanly smart until they are assembled into structured representations and programs?
A school called connectionism, led by the psychologists David Rumelhart and James McClelland, argues that simple networks by themselves can account for most of human intelligence. In its extreme form, connectionism says that the mind is one big hidden-layer back-propagation network, or perhaps a battery of similar or identical ones, and intelligence emerges when a trainer, the environment, tunes the connection weights. The only reason that humans are smarter than rats is that our networks have more hidden layers between stimulus and response and we live in an environment of other humans who serve as network trainers. Rules and symbols might be useful as a rough-and-ready approximation to what is happening in a network for a psychologist who can't keep track of the millions of streams of activation flowing through the connections, but they are no more than that.
The other view — which I favor — is that those neural networks alone cannot do the job. It is the structuring of networks into programs for manipulating symbols that explains much of human intelligence. In particular, symbol manipulation underlies human language and the parts of reasoning that interact with it. That's not all of cognition, but it's a lot of it; it's everything we can talk about to ourselves and others. In my day job as a psycholinguist I have gathered evidence that even the simplest of talents that go into speaking English, such as forming the past tense of verbs (walk into walked, come into came), is too computationally sophisticated to be handled {113} in a single neural network. In this section, I will present a more general class of evidence. Does the content of our common-sense thoughts (the kind of information we exchange in conversation) require a computational device designed to implement a highly structured mentalese, or can it be handled by generic neural-network stuff — what one wag has called connectoplasm? I will show you that our thoughts have a delicate logical structuring that no simple network of homogeneous layers of units can handle.
Why should you care? Because these demonstrations cast doubt on the most influential theory of how the mind works that has ever been proposed. By itself, a perceptron or a hidden-layer network is a high-tech implementation of an ancient doctrine: the association of ideas. The British philosophers John Locke, David Hume, George Berkeley, David Hartley, and John Stuart Mill proposed that thought is governed by two laws. One is contiguity: ideas that are frequently experienced together get associated in the mind. Thereafter, when one is activated, the other is activated too. The other law is resemblance: when two ideas are similar, whatever has been associated with the first idea is automatically associated with the second. As Hume summed up the theory in 1748:
Experience shows us a number of uniform effects, resulting from certain objects. When a new object, endowed with similar sensible qualities, is produced, we expect similar powers and forces, and look for a like effect. From a body of like color and consistence with bread we expect like nourishment and support.
Association by contiguity and resemblance was also thought to be the scrivener that fills the famous blank slate, Locke's metaphor for the neonate mind. The doctrine, called associationism, dominated British and American views of the mind for centuries, and to a large extent still does. When the “ideas” were replaced by stimuli and responses, associationism became behaviorism. The blank slate and the two general-purpose laws of learning are also the psychological underpinnings of the Standard Social Science Model. We hear it in cliches about how our upbringing leads us to “associate” food with love, wealth with happiness, height with power, and so on.
Until recently, associationism was too vague to test. But neural-network models, which are routinely simulated on computers, make the ideas precise. The learning scheme, in which a teacher presents the network with an input and the correct output and the network strives to duplicate the pairing in the future, is a good model of the law of contiguity. {114} The distributed input representation, in which a concept does not get its own unit (“parrot”) but is represented by a pattern of activity over units for its properties (“feathered,” “winged,” and so on), allows for automatic generalization to similar concepts and thus nicely fits the law of association by resemblance. And if all parts of the mind start off as the same kind of network, we have an implementation of the blank slate. Connectionism thus offers an opportunity. In seeing what simple neural-network models can and cannot do, we can put the centuries-old doctrine of the association of ideas to a rigorous test.
Before we begin, we need to set aside some red herrings. Connectionism is not an alternative to the computational theory of mind, but a variety of it, which claims that the main kind of information processing done by the mind is multivariate statistics. Connectionism is not a necessary corrective to the theory that the mind is like a commercial computer, with a high-speed, error-free, serial central processing unit; no one holds that theory. And there is no real-life Achilles who claims that every form of thinking consists of cranking through thousands of rules from a logic textbook. Finally, connectionist networks are not particularly realistic models of the brain, despite the hopeful label “neural networks.” For example, the “synapse” (connection weight) can switch from excitatory to inhibitory, and information can flow in both directions along an “axon” (connection), both anatomically impossible. When there is a choice between getting a job done and mirroring the brain, connectionists often opt for getting the job done; that shows that the networks are used as a form of artificial intelligence based loosely on the metaphor of neurons, and are not a form of neural modeling. The question is, do they perform the right kinds of computations to model the workings of human thought?
Raw connectoplasm has trouble with five feats of everyday thinking. The feats appear to be subtle at first, and were not even suspected of existing until logicians, linguists, and computer scientists began to put the meanings of sentences under a microscope. But the feats give human thought its distinctive precision and power and are, I think, an important part of the answer to the question, How does the mind work? One feat is entertaining the concept of an individual. Let's go back to the first departure of neural networks from computerlike representations. {115} Rather than symbolizing an entity as an arbitrary pattern in a string of bits, we represented it as a pattern in a layer of units, each standing for one of the entity's properties. An immediate problem is that there is no longer a way to tell apart two individuals with identical properties. They are represented in one and the same way, and the system is blind to the fact that they are not the same hunk of matter. We have lost the individual: we can represent vegetableness or horsehood, but not a particular vegetable or a particular horse. Whatever the system learns about one horse melds into what it knows about another, identical one. And there is no natural way to represent two horses. Making the horsey nodes twice as active won't do it, because that is indistinguishable from being twice as confident that the properties of a horse are present or from thinking that the properties of a horse are present to twice the degree.
It is easy to confuse the relationship between a class and a subclass, such as “animal” and “horse” (which a network handles easily), with the relationship between a subclass and an individual, such as “horse” and “Mr. Ed.” The two relationships are, to be sure, similar in one way. In both, any property of the higher entity is inherited by the lower entity. If animals breathe, and horses are animals, then horses breathe; if horses have hooves, and Mr. Ed is a horse, then Mr. Ed has hooves. This can lure a modeler into treating an individual as a very, very specific subclass, using some slight difference between the two entities — a freckle unit that is on for one individual but off for the other — to distinguish near-doppelgangers.
Like many connectionist proposals, the idea dates back to British associationism. Berkeley wrote, “Take away the sensations of softness, moisture, redness, tartness, and you take away the cherry, since it is not a being distinct from sensations. A cherry, I say, is nothing but a congeries of sensible impressions.” But Berkeley's suggestion never did work. Your knowledge of the properties of two objects can be identical and still you can know they are distinct. Imagine a room with two identical chairs. Someone comes in and switches them around. Is the room the same as or different from before? Obviously, everyone understands that it is different. But you know of no feature that distinguishes one chair from the other — except that you can think of one as Chair Number One and the other as Chair Number Two. We are back to arbitrary labels for memory slots, as in the despised digital computer! The same point underlies a joke from the comedian Stephen Wright: “While I was gone, someone stole everything in my apartment and replaced it {116} with an exact replica. When I told my roommate, he said, ‘Do I know you?’”
There is, admittedly, one feature that always distinguishes individuals: they cannot be in the same place at the same time. Perhaps the mind could stamp every object with the time and place and constantly update those coordinates, allowing it to distinguish individuals with identical properties. But even that fails to capture our ability to keep individuals apart in our minds. Suppose an infinite white plane contains nothing but two identical circles. One of them slides over and superimposes itself on the second one for a few moments, then proceeds on its way. I don't think anyone has trouble conceiving of the circles as distinct entities even in the moments in which they are in the same place at the same time. That shows that being in a certain place at a certain time is not our mental definition of “individual.”
The moral is not that individuals cannot be represented in neural networks. It's easy; just dedicate some units to individuals' identities as individuals, independent of the individuals’ properties. One could give each individual its own unit, or give each individual the equivalent of a serial number, coded in a pattern of active units. The moral is that the networks of the mind have to be crafted to implement the abstract logical notion of the individual, analogous to the role played by an arbitrarily labeled memory location in a computer. What does not work is a pattern association restricted to an object's observable properties, a modern instantiation of the Aristotelian dictum that “there is nothing in the intellect that was not previously in the senses.”
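The difference between the two schemes is easy to see in a sketch. The class `Individual`, the helper `new_individual`, and the chair properties are hypothetical names made up for the illustration; the serial number plays the part of the arbitrarily labeled memory location.

```python
from dataclasses import dataclass
from itertools import count

# Scheme 1: an object is nothing but the pattern of its observable properties.
chair_properties = frozenset({"four-legged", "wooden", "brown"})
pattern_one = frozenset(chair_properties)
pattern_two = frozenset(chair_properties)
print(pattern_one == pattern_two)   # True: the two chairs have melded into one

# Scheme 2: some units are dedicated to identity, independent of the properties.
_serial_numbers = count(1)

@dataclass(frozen=True)
class Individual:
    serial: int              # arbitrary label, like a memory address
    properties: frozenset    # what the senses deliver

def new_individual(properties):
    return Individual(next(_serial_numbers), frozenset(properties))

chair_one = new_individual(chair_properties)
chair_two = new_individual(chair_properties)
print(chair_one.properties == chair_two.properties)   # True: identical to the eye
print(chair_one == chair_two)                         # False: still two chairs
```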
Is this discussion just an exercise in logic? Not at all: the concept of the individual is the fundamental particle of our faculties of social reasoning. Let me give you two real-life examples, involving those grand arenas of human interaction, love and justice.
Monozygotic twins share most of their properties. Apart from the physical resemblance, they think alike, feel alike, and act alike. Not identically, of course, and that is a loophole through which one might try to represent them as very narrow subclasses. But any creature representing them as subclasses should at least tend to treat identical twins alike. The creature should transfer its opinions from one to the other, at least probabilistically or to some extent — remember, that is a selling point of associationism and its implementation in connectoplasm. For example, whatever attracts you to one twin — the way he walks, the way he talks, the way he looks, and so on — should attract you to the other. And this {117} should cast identical twins in tales of jealousy and betrayal of truly gothic proportions. In fact, nothing happens. The spouse of one identical twin feels no romantic attraction toward the other twin. Love locks our feelings in to another person as that person, not as a kind of person, no matter how narrow the kind.
On March 10, 1988, someone bit off half the ear of Officer David J. Storton. No one doubts who did it: either Shawn Blick, a twenty-one-year-old man living in Palo Alto, California, or Jonathan Blick, his identical twin brother. Both were scuffling with the officer, and one of them bit off part of his ear. Both were charged with mayhem, attempted burglary, assaulting a police officer, and aggravated mayhem. The aggravated mayhem charge, for the ear biting, carries a life sentence. Officer Storton testified that one of the twins had short hair and the other long, and it was the long-haired man who bit him. Unfortunately, by the time the men surrendered three days later they sported identical crew cuts and weren't talking. Their lawyers argued that neither one could be given the severe sentence for aggravated mayhem. For each brother there is a reasonable doubt as to whether he did it, because it could have been the other. The argument is compelling because our sense of justice picks out the individual who did a deed, not the characteristics of that individual.
Our obsession with individual personhood is not an inexplicable quirk, but probably evolved because every human being we meet, quite apart from any property we can observe, is guaranteed to house an unreplicable collection of memories and desires owing to a unique embryological and biographical history. In Chapter 6, when we reverse-engineer the sense of justice and the emotion of romantic love, we will see that the mental act of registering individual persons is at the heart of their design.
Human beings are not the only class of confusable individuals we have to keep distinct; a shell game is another real-life example. Many animals have to play shell games and thus keep track of individuals. One example is the mother who has to track her offspring, which may look like everyone else's but invisibly carries her genes. Another is the predator of herding animals, who has to track one member of the herd, following the tag-in-the-swimming-pool strategy: if you're “It,” don't switch quarries, giving everyone but yourself time to catch their breath. When zoologists in Kenya tried to make their data collection easier by color-coding the horns of wildebeests they had tranquilized, they found that no matter how carefully they restored the marked animal to vigor before {118} reintroducing it to the herd, it was killed within a day or so by hyenas. One explanation is that the colored marker made it easy for the hyenas to individuate the wildebeest and chase it to the point of exhaustion. Recent thinking about zebra stripes is that they are not for blending in with stripey tall grass — always a dubious explanation — but for turning the zebras into a living shell game, baffling lions and other predators as they try to keep their attention on just one zebra. Of course, we do not know that hyenas or lions have the concept of an individual; perhaps an odd man out just looks more appetizing. But the examples illustrate the computational problem of distinguishing individuals from classes, and they underscore the human mind's facility in solving it.
A second problem for associationism is called compositionality: the ability of a representation to be built out of parts and to have a meaning that comes from the meanings of the parts and from the way they are combined. Compositionality is the quintessential property of all human languages. The meaning of The baby ate the slug can be calculated from the meanings of baby, ate, the, and slug and from their positions in the sentence. The whole is not the sum of the parts; when the words are rearranged into The slug ate the baby, a different idea is conveyed. Since you have never heard either sentence before, you must have interpreted them by applying a set of algorithms (incorporating the rules of syntax) to the strings of words. The end product in each case is a novel thought you assembled on the fly. Equipped with the concepts of babies, slugs, and eating, and with an ability to arrange symbols for them on a mental bulletin board according to a scheme that can be registered by the demons that read it, you can think the thought for the first time in your life.
Journalists say that when a dog bites a man, that is not news, but when a man bites a dog, that is news. The compositionality of mental representations is what allows us to understand news. We can entertain wild and wonderful new ideas, no matter how outlandish. The cow jumped over the moon; the Grinch stole Christmas; the universe began with a big bang; aliens land at Harvard; Michael Jackson married Elvis’ daughter. Thanks to the mathematics of combinatorics, we will never run out of news. There are hundreds of millions of trillions of thinkable thoughts. {119}
You might think it is easy to put compositionality in a neural network: just turn on the units for “baby,” “eats,” and “slug.” But if that was all that happened in your mind, you would be in a fog as to whether the baby ate the slug, the slug ate the baby, or the baby and the slug ate. The concepts must be assigned to roles (what logicians call “arguments”): who is the eater, who is the eaten.
Perhaps, then, one could dedicate a node to each combination of concepts and roles. There would be a baby-eats-slug node and a slug-eats-baby node. The brain contains a massive number of neurons, one might think, so why not do it that way? One reason not to is that there is massive and then there is really massive. The number of combinations grows exponentially with their allowable size, setting off a combinatorial explosion whose numbers surpass even our most generous guess of the brain's capacity. According to legend, the vizier Sissa Ben Dahir claimed a humble reward from King Shirham of India for inventing the game of chess. All he asked for was a grain of wheat to be placed on the first square of a chessboard, two grains of wheat on the second, four on the third, and so on. Well before they reached the sixty-fourth square the king discovered he had unwittingly committed all the wheat in his kingdom. The reward amounted to four trillion bushels, the world's wheat production for two thousand years. Similarly, the combinatorics of thought can overwhelm the number of neurons in the brain. A hundred million trillion sentence meanings cannot be squeezed into a brain with a hundred billion neurons if each meaning must have its own neuron.
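The arithmetic is easy to check. The sketch below counts the grains on the chessboard and then compares the two round figures quoted above, a hundred million trillion meanings against a hundred billion neurons; the vocabulary size in the last lines is a made-up number, used only to show how quickly the combinations grow with length.

```python
# One grain on the first square, doubling on each of the 64 squares.
grains = sum(2 ** square for square in range(64))
print(grains)                    # the same as 2**64 - 1

# The round figures from the text.
meanings = 10 ** 8 * 10 ** 12    # a hundred million trillion
neurons = 10 ** 11               # a hundred billion
print(meanings // neurons)       # meanings that would have to share each neuron

# Exponential growth with allowable size, for a hypothetical 1,000-concept vocabulary.
vocabulary = 1000
for length in range(1, 6):
    print(length, vocabulary ** length)
```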
But even if they did fit, a complex thought is surely not stored whole, one thought per neuron. The clues come from the way our thoughts are related to one another. Imagine that each thought had its own unit. There would have to be separate units for the baby eating the slug, the slug eating the baby, the chicken eating the slug, the chicken eating the baby, the slug eating the chicken, the baby seeing the slug, the slug seeing the baby, the chicken seeing the slug, and so on. Units have to be assigned to all of these thoughts and many more; any human being capable of thinking the thought that the baby saw the chicken is also capable of thinking the thought that the chicken saw the baby. But there is something suspicious about this inventory of thought-units; it is shot through with coincidences. Over and over again we have babies eating, slugs eating, babies seeing, slugs seeing, and so on. The thoughts perfectly slot themselves into the rows, columns, layers, hyper-rows, hyper-columns, and hyper-layers of a vast matrix. But this striking pattern is baffling if thoughts are {120} just a very big collection of separate units; the units could just as easily have represented an inventory of isolated factoids that had nothing to do with one another. When nature presents us with objects that perfectly fill a rectangular bank of pigeonholes, it's telling us that the objects must be built out of smaller components which correspond to the rows and the columns. That's how the periodic table of the elements led to an understanding of the structure of the atom. For similar reasons we can conclude that the warp and weft of our thinkable thoughts are the concepts composing them. Thoughts are assembled out of concepts; they are not stored whole.
Compositionality is surprisingly tricky for connectoplasm. All the obvious tricks turn out to be inadequate halfway measures. Suppose we dedicate each unit to a combination of one concept and one role. Perhaps one unit would stand for baby-eats and another for slug-is-eaten, or perhaps one unit would stand for baby-does-something and another for slug-has-something-done-to-it. This cuts down the number of combinations considerably — but at the cost of reintroducing befuddlement about who did what to whom. The thought “The baby ate the chicken when the poodle ate the slug” would be indistinguishable from the thought “The baby ate the slug when the poodle ate the chicken.” The problem is that a unit for baby-eats does not say what it ate, and a unit for slug-is-eaten does not say who ate it.
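The befuddlement can be shown directly. In this sketch each clause of a thought switches on one unit for the eater and one for the eaten; the function `role_units` and the unit names are hypothetical, patterned on the "baby-eats" and "slug-is-eaten" units just described.

```python
def role_units(clauses):
    """Turn on one concept-plus-role unit for each eater and each thing eaten."""
    units = set()
    for eater, eaten in clauses:
        units.add(f"{eater}-eats")
        units.add(f"{eaten}-is-eaten")
    return units

# A single clause is handled correctly: the two thoughts light up different units.
print(role_units([("baby", "slug")]) == role_units([("slug", "baby")]))   # False

# Two clauses at once: who ate what is lost.
thought_one = [("baby", "chicken"), ("poodle", "slug")]
thought_two = [("baby", "slug"), ("poodle", "chicken")]
print(role_units(thought_one) == role_units(thought_two))                 # True
```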
A step in the right direction is to build into the hardware a distinction between the concepts (baby, slug, and so on) and the roles they play (actor, acted upon, and so on). Suppose we set up separate pools of units, one for the role of actor, one for the action, one for the role of acted upon. To represent a proposition, each pool of units is filled with the pattern for the concept currently playing the role, shunted in from a separate memory store for concepts. If we connected every node to every other node, we would have an auto-associator for propositions, and it could achieve a modicum of facility with combinatorial thoughts. We could store “baby ate slug,” and then when any two of the components were presented as a question (say, “baby” and “slug,” representing the question “What is the relationship between the baby and the slug?”), the network would complete the pattern by turning on the units for the third component (in this case, “ate”). {121}
Or would it? Alas, it would not. Consider these thoughts:
Baby same-as baby.
Baby different-from slug.
Slug different-from baby.
Slug same-as slug.
No set of connection weights that allow “baby” in the first slot and “same-as” in the middle to turn on “baby” in the third slot, and that allow “baby” and “different-from” to turn on “slug,” and that allow “slug” and “different-from” to turn on “baby,” will also allow “slug” and “same-as” to turn on “slug.” It's the exclusive-or problem in a different guise. If the baby-to-baby and same-to-baby links are strong, they will turn on “baby” in response to “baby same-as___” (which is good), but they will also turn on “baby” in response to “baby different-from___” (which is bad) and in response to “slug same-as___” (also bad). Jigger the weights all you want; you will never find ones that work for all four sentences. Since any human can understand the four sentences without getting confused, the human mind must represent propositions with something more sophisticated than a set of concept-to-concept or concept-to-role associations. The mind needs a representation for the proposition itself. In this example, the model needs an extra layer of units — most straightforwardly, a layer dedicated to representing the entire proposition, separately from the concepts and their {122} roles. The bottom of page 121 shows, in simplified form, a model devised by Geoffrey Hinton that does handle the sentences.
The bank of “proposition” units light up in arbitrary patterns, a bit like serial numbers, that label complete thoughts. It acts as a superstructure keeping the concepts in each proposition in their proper slots. Note how closely the architecture of the network implements standard, language-like mentalese! There have been other suggestions for compositional networks that aren't such obvious mimics, but they all have to have some specially engineered parts that separate concepts from their roles and that bind each concept to its role properly. The ingredients of logic such as predicate, argument, and proposition, and the computational machinery to handle them, have to be snuck back in to get a model to do mind-like things; association-stuff by itself is not enough.
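Both halves of the argument can be sketched in code. The first function searches a small grid for direct weights that would let a single threshold unit decide whether "baby" belongs in the third slot of the four sentences, and finds none. The second adds a layer with one unit per stored proposition, each an and-gate over its two cues. That second part is a drastic simplification and an assumption of this sketch: it uses one unit per proposition where the model described above uses arbitrary distributed patterns, and all the names are invented for the illustration.

```python
from itertools import product

PROPOSITIONS = [
    ("baby", "same-as", "baby"),
    ("baby", "different-from", "slug"),
    ("slug", "different-from", "baby"),
    ("slug", "same-as", "slug"),
]
FIRST = ["baby", "slug"]
RELATION = ["same-as", "different-from"]

def encode(first, relation):
    """One unit per first-slot concept and one per relation."""
    return [int(first == f) for f in FIRST] + [int(relation == r) for r in RELATION]

def direct_weights_exist(grid):
    """Can direct links alone tell one unit when to put 'baby' in slot three?"""
    cases = [(encode(f, r), third == "baby") for f, r, third in PROPOSITIONS]
    for *weights, threshold in product(grid, repeat=5):
        if all((sum(w * x for w, x in zip(weights, xs)) > threshold) == want
               for xs, want in cases):
            return True
    return False

def complete(first, relation):
    """Fill the third slot by way of a layer of proposition units."""
    for p_first, p_relation, p_third in PROPOSITIONS:
        # Each proposition unit fires only if both of its cues are on.
        activation = (first == p_first) + (relation == p_relation)
        if activation > 1.5:
            return p_third
    return None

GRID = [i / 2 for i in range(-4, 5)]   # -2.0, -1.5, ..., 2.0
print(direct_weights_exist(GRID))      # False: jigger the weights all you want
for first, relation, third in PROPOSITIONS:
    print(first, relation, complete(first, relation))
```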
Another mental talent that you may never have realized you have is called quantification, or variable-binding. It arises from a combination of the first problem, individuals, with the second, compositionality. Our compositional thoughts are, after all, often about individuals, and it makes a difference how those individuals are linked to the various parts of the thought. The thought that a particular baby ate a particular slug is different from the thought that a particular baby eats slugs in general, or that babies in general eat slugs in general. There is a family of jokes whose humor depends on the listener appreciating that difference. “Every forty-five seconds someone in the United States sustains a head injury.” “Omigod! That poor guy!” When we hear that “Hildegard wants to marry a man with big muscles,” we wonder whether she has a particular he-man lined up or if she is just hanging hopefully around the gym. Abraham Lincoln said, “You may fool all the people some of the time; you can even fool some of the people all the time; but you can't fool all of the people all the time.” Without an ability to compute quantification, we could not understand what he said.
In these examples, we have several sentences, or several readings of an ambiguous sentence, in which the same concepts play the same roles but the ideas as a whole are very different. Hooking up concepts to their roles is not enough. Logicians capture these distinctions with variables and quantifiers. A variable is a place-holding symbol like x or y which stands for the same entity across different propositions or different parts {123} of one proposition. A quantifier is a symbol that can express “There exists a particular x who . . .” and “For all x it is true that ...” A thought can then be captured in a proposition built out of symbols for concepts, roles, quantifiers, and variables, all precisely ordered and bracketed. Compare, for example, “Every forty-five seconds {there exists an X [who gets injured]}” with “There exists an X {who every forty-five seconds [gets injured]}.” Our mentalese must have machinery that does something similar. But so far, we have no hint as to how this can be done in an associative network.
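The two bracketings of the head-injury sentence can be spelled out as a minimal sketch in Python. The toy world below (three invented people and three forty-five-second ticks) is an assumption made up for illustration. Only the order in which the two quantifiers are nested differs between the readings.

```python
# Hypothetical toy world: who sustains a head injury at each 45-second tick.
injuries = {0: "Al", 45: "Bea", 90: "Cy"}
people = {"Al", "Bea", "Cy"}
ticks = sorted(injuries)

# Reading 1: every forty-five seconds {there exists an x [who gets injured]}.
every_tick_someone = all(
    any(injuries[t] == x for x in people) for t in ticks
)

# Reading 2: there exists an x {who every forty-five seconds [gets injured]}.
someone_every_tick = any(
    all(injuries[t] == x for t in ticks) for x in people
)

print(every_tick_someone, someone_every_tick)
```

The same concepts fill the same roles in both expressions. Swapping which quantifier sits outside is what separates the statistic from "that poor guy."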
Not only can a proposition be about an individual, it must be treated as a kind of individual itself, and that gives rise to a new problem. Connectoplasm gets its power from superimposing patterns in a single set of units. Unfortunately, that can breed bizarre chimeras or make a network fall between two stools. It is part of a pervasive bugaboo for connectoplasm called interference or cross-talk.
Here are two examples. The psychologists Neal Cohen and Michael McCloskey trained a network to add two digits. They first trained it to add 1 to the other numbers: when the inputs were “1” and “3,” the network learned to put out “4,” and so on. Then they trained it to add 2 to any other number. Unfortunately, the add-2 problem sucked the connection weights over to values that were optimal for adding 2, and because the network had no hardware set aside for anchoring the knowledge of how to add 1, it became amnesic for how to add 1! The effect is called “catastrophic forgetting” because it is unlike the mild forgetting of everyday life. Another example comes from a network designed by McClelland and his collaborator Alan Kawamoto to assign meanings to ambiguous sentences. For example, A bat broke the window can mean either that a baseball bat was hurled at it or that a winged mammal flew through it. The network came up with the one interpretation that humans do not make: a winged mammal broke the window using a baseball bat!
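Interference of this kind can be seen in miniature in a sketch far simpler than the network described above. The three-weight linear unit, the training range, and the learning rate below are all assumptions chosen for illustration, and the damage here is mild rather than catastrophic. The unit is trained by gradient descent to add 1, then trained only to add 2, with nothing set aside to protect the first lesson.

```python
def predict(w, addend, n):
    """Output of a single linear unit with weights (w_addend, w_n, bias)."""
    wa, wn, b = w
    return wa * addend + wn * n + b

def train(w, addend, steps=3000, lr=0.05):
    """Full-batch gradient descent on squared error for 'n + addend'."""
    wa, wn, b = w
    ns = [1, 2, 3, 4, 5]
    for _ in range(steps):
        ga = gn = gb = 0.0
        for n in ns:
            err = (wa * addend + wn * n + b) - (n + addend)
            ga += err * addend
            gn += err * n
            gb += err
        wa -= lr * ga / len(ns)
        wn -= lr * gn / len(ns)
        b -= lr * gb / len(ns)
    return (wa, wn, b)

w = train((0.0, 0.0, 0.0), addend=1)
before = predict(w, 1, 3)    # 1 + 3 right after the add-1 lessons

w = train(w, addend=2)       # same weights, now trained only on add-2
after = predict(w, 1, 3)     # 1 + 3 again, after the add-2 lessons
add_two = predict(w, 2, 3)   # 2 + 3 with the retrained weights

print(round(before, 3), round(after, 3), round(add_two, 3))
```

The add-2 lessons pull on the very weights that held the add-1 answer, so the unit ends up right about 2 + 3 and wrong about 1 + 3.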
As with any other tool, the features that make connectoplasm good for some things make it bad for other things. A network's ability to generalize comes from its dense interconnectivity and its superposition of inputs. But if you're a unit, it's not always so great to have thousands of other units yammering in your ear and to be buffeted by wave after wave of inputs. Often different hunks of information should be packaged and stored separately, not blended. One way to do this is to give each proposition its own storage slot and address — once again showing that not all aspects of computer design can be dismissed as silicon curiosities. Computers, {124} after all, were not designed as room heaters; they were designed to process information in a way that is meaningful to human users.
The psychologists David Sherry and Dan Schacter have pushed this line of reasoning farther. They note that the different engineering demands on a memory system are often at cross-purposes. Natural selection, they argue, responded by giving organisms specialized memory systems. Each has a computational structure optimized for the demands of one of the tasks the mind of the animal must fulfill. For example, birds that cache seeds to retrieve in leaner times have evolved a capacious memory for the hiding places (ten thousand places, in the case of the Clark's Nutcracker). Birds whose males sing to impress the females or to intimidate other males have evolved a capacious memory for songs (two hundred, in the case of the nightingale). The memory for caches and the memory for songs are in different brain structures and have different patterns of wiring. We humans place two very different demands on our memory system at the same time. We have to remember individual episodes of who did what to whom, when, where, and why, and that requires stamping each episode with a time, a date, and a serial number. But we also must extract generic knowledge about how people work and how the world works. Sherry and Schacter suggest that nature gave us one memory system for each requirement: an “episodic” or autobiographical memory, and a “semantic” or generic-knowledge memory, following a distinction first made by the psychologist Endel Tulving.
The trick that multiplies human thoughts into truly astronomical numbers is not the slotting of concepts into three or four roles but a kind of mental fecundity called recursion. A fixed set of units for each role is not enough. We humans can take an entire proposition and give it a role in some larger proposition. Then we can take the larger proposition and embed it in a still-larger one, creating a hierarchical tree structure of propositions inside propositions. Not only did the baby eat the slug, but the father saw the baby eat the slug, and I wonder whether the father saw the baby eat the slug, and the father knows that I wonder whether he saw the baby eat the slug, and I can guess that the father knows that I wonder whether he saw the baby eat the slug, and so on. Just as an ability {125} to add 1 to a number bestows the ability to generate an infinite set of numbers, the ability to embed a proposition inside another proposition bestows the ability to think an infinite number of thoughts.
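A proposition that can occupy a role inside another proposition is what programmers call a recursive data structure. As a minimal sketch (the class and field names are invented for illustration, and nothing here is a claim about how the brain stores thoughts), one definition suffices for any depth of embedding:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Proposition:
    agent: str
    predicate: str
    # The third slot may hold a plain concept or a whole proposition.
    patient: Union[str, "Proposition"]

def render(p):
    """Spell out a proposition, descending into any embedded one."""
    if isinstance(p.patient, Proposition):
        inner = render(p.patient)
    else:
        inner = p.patient
    return f"{p.agent} {p.predicate} {inner}"

def depth(p):
    """Count how many propositions are nested inside one another."""
    if isinstance(p.patient, Proposition):
        return 1 + depth(p.patient)
    return 1

ate = Proposition("the baby", "ate", "the slug")
saw = Proposition("the father", "saw that", ate)
wonder = Proposition("I", "wonder whether", saw)

print(render(wonder))
print(depth(wonder))
```

Nothing limits the nesting to three levels. Wrapping `wonder` in yet another proposition needs no new machinery, only another use of the same definition.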
To get propositions-inside-propositions out of the network displayed in the preceding diagram, one could add a new layer of connections to the top of the diagram, connecting the bank of units for the whole proposition to the role slot in some bigger proposition; the role might be something like “event-observed.” If we continue to add enough layers, we could accommodate an entire multiply nested proposition by etching a full tree diagram for it in connectoplasm. But this solution is clumsy and raises suspicions. For every kind of recursive structure, there would have to be a different network hard-wired in: one network for a person thinking about a proposition, another for a person thinking about a proposition about a person thinking about a proposition, a third for a person communicating a proposition about some person to another person, and so on.
In computer science and psycholinguistics, a more powerful and flexible mechanism is used. Each simple structure (for a person, an action, a proposition, and so on) is represented in long-term memory once, and a processor shuttles its attention from one structure to another, storing the itinerary of visits in short-term memory to thread the proposition together. This dynamic processor, called a recursive transition network, is especially plausible for sentence understanding, because we hear and read words one at a time rather than inhaling an entire sentence at once. We also seem to chew our complex thoughts piece by piece rather than swallowing or regurgitating them whole, and that suggests that the mind is equipped with a recursive proposition-cruncher for thoughts, not just for sentences. The psychologists Michael Jordan and Jeff Elman have built networks whose output units send out connections that loop back into a set of short-term memory units, triggering a new cycle of activation flow. That looping design provides a glimpse of how iterative information processing might be implemented in neural networks, but it is not enough to interpret or assemble structured propositions. More recently, there have been attempts to combine a looping network with a propositional network to implement a kind of recursive transition network out of pieces of connectoplasm. These attempts show that unless neural networks are specially assembled into a recursive processor, they cannot handle our recursive thoughts. {126}
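The idea of a single sentence network that can re-enter itself can be sketched in a few lines of Python. The tiny vocabulary and the three-part grammar (a noun phrase, a verb, then either a noun phrase or "that" plus a whole new sentence) are assumptions invented for illustration. The language's own call stack plays the part of the itinerary held in short-term memory.

```python
NOUNS = {"baby", "father", "slug"}
VERBS = {"ate", "saw", "know"}

def parse_np(words, i):
    """Consume a noun phrase starting at position i."""
    if words[i] == "I":
        return "I", i + 1
    if words[i] == "the" and words[i + 1] in NOUNS:
        return "the " + words[i + 1], i + 2
    raise ValueError(f"no noun phrase at position {i}")

def parse_s(words, i=0):
    """Consume a sentence word by word, re-entering itself at 'that'."""
    subject, i = parse_np(words, i)
    verb = words[i]
    if verb not in VERBS:
        raise ValueError(f"no verb at position {i}")
    i += 1
    if i < len(words) and words[i] == "that":
        # The same network is visited again for the embedded sentence.
        obj, i = parse_s(words, i + 1)
    else:
        obj, i = parse_np(words, i)
    return (subject, verb, obj), i

sentence = "I know that the father saw that the baby ate the slug"
tree, consumed = parse_s(sentence.split())
print(tree)
```

The sentence structure is written down once and visited three times, in contrast to etching a separate network for each depth of nesting.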
The human mind must be given credit for one more cognitive feat that is difficult to wring out of connectoplasm, and therefore difficult to explain by associationism. Neural networks easily implement a fuzzy logic in which everything is a kind-of something to some degree. To be sure, many common-sense concepts really are fuzzy at their edges and have no clear definitions. The philosopher Ludwig Wittgenstein offered the example of “a game,” whose exemplars (jigsaw puzzles, roller derby, curling, Dungeons and Dragons, cockfighting, and so on) have nothing in common, and earlier I gave you two others, “bachelor” and “vegetable.” The members of a fuzzy category lack a single defining feature; they overlap in many features, much like the members of a family or the strands of a rope, none of which runs the entire length. In the comic strip Bloom County, Opus the Penguin, temporarily amnesic, objects when he is told he is a bird. Birds are svelte and aerodynamic, he points out; he is not. Birds can fly; he cannot. Birds can sing; his performance of “Yesterday” left his listeners gagging. Opus suspects he is really Bullwinkle the Moose. So even concepts like “bird” seem to be organized not around necessary and sufficient conditions but around prototypical members. If you look up bird in the dictionary, it will be illustrated not with a penguin but with Joe Bird, typically a sparrow.
Experiments in cognitive psychology have shown that people are bigots about birds, other animals, vegetables, and tools. People share a stereotype, project it to all the members of a category, recognize the stereotype more quickly than the nonconformists, and even claim to have seen the stereotype when all they really saw were examples similar to it. These responses can be predicted by tallying up the properties that a member shares with other members of the category: the more birdy properties, the better the bird. An auto-associator presented with examples from a category pretty much does the same thing, because it computes correlations among properties. That's a reason to believe that parts of human memory are wired something like an auto-associator.
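The tallying can be made concrete with a minimal sketch. The four birds and their property lists below are invented for illustration, and the score is a plain count of shared properties, not the correlation computed by a real auto-associator.

```python
# Hypothetical property lists for a few members of the category "bird".
birds = {
    "sparrow": {"flies", "sings", "small", "feathers"},
    "robin": {"flies", "sings", "small", "feathers"},
    "eagle": {"flies", "feathers"},
    "penguin": {"feathers", "swims"},
}

def typicality(name):
    """Tally the properties a member shares with every other member."""
    own = birds[name]
    return sum(
        len(own & props) for other, props in birds.items() if other != name
    )

ranking = sorted(birds, key=typicality, reverse=True)
print([(name, typicality(name)) for name in ranking])
```

On these made-up lists the sparrow and the robin come out as the best birds and the penguin as the worst, which is the pattern the stereotype experiments report.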
But there must be more to the mind than that. People are not always fuzzy. We laugh at Opus because a part of us knows that he really is a bird. We may agree on the prototype of a grandmother — the kindly, gray-haired septuagenarian dispensing blueberry muffins or chicken soup (depending on whose stereotype we're talking about) — but at the same time we have no trouble understanding that Tina Turner and Elizabeth {127} Taylor are grandmothers (indeed, a Jewish grandmother, in Taylor's case). When it comes to bachelors, many people — such as immigration authorities, justices of the peace, and health care bureaucrats — are notoriously unfuzzy about who belongs in the category; as we all know, a lot can hinge on a piece of paper. Examples of unfuzzy thinking are everywhere. A judge may free an obviously guilty suspect on a technicality. Bartenders deny beer to a responsible adult the day before his twenty-first birthday. We joke that you can't be a little bit pregnant or a little bit married, and after a Canadian survey reported that married women have sex 1.57 times a week, the cartoonist Terry Mosher drew a woman sitting up in bed beside her dozing husband and muttering, “Well, that was .57.”
In fact, fuzzy and crisp versions of the same category can live side by side in a single head. The psychologists Sharon Armstrong, Henry Gleitman, and Lila Gleitman mischievously gave the standard tests for fuzzy categories to university students but asked them about knife-edged categories like “odd number” and “female.” The subjects happily agreed to daft statements such as that 13 is a better example of an odd number than 23 is, and that a mother is a better example of a female than a comedienne is. Moments later the subjects also claimed that a number either is odd or is even, and that a person either is female or is male, with no gray areas.
People think in two modes. They can form fuzzy stereotypes by uninsightfully soaking up correlations among properties, taking advantage of the fact that things in the world tend to fall into clusters (things that bark also bite and lift their legs at hydrants). But people can also create systems of rules — intuitive theories — that define categories in terms of the rules that apply to them, and that treat all the members of the category equally. All cultures have systems of formal kinship rules, often so precise that one can prove theorems in them. Our own kinship system gives us a crisp version of “grandmother”: the mother of a parent, muffins be damned. Law, arithmetic, folk science, and social conventions (with their rites of passage sharply delineating adults from children and husbands from bachelors) are other rule systems in which people all over the planet reckon. The grammar of a language is yet another.
Rule systems allow us to rise above mere similarity and reach conclusions based on explanations. Hinton, Rumelhart, and McClelland wrote: “People are good at generalizing newly acquired knowledge. If, for example, you learn that chimpanzees like onions you will probably raise your estimate of the probability that gorillas like onions. In a network that uses distributed representations, this kind of generalization is automatic.” {128} Their boast is a twentieth-century echo of Hume's remark that from a body similar to bread in color and consistency we expect a similar degree of nourishment. But the assumption breaks down in any domain in which a person actually knows something. The onion-loving gorilla was intended only as an example, of course, but it is interesting to see how even this simple example underestimates us. Knowing a bit of zoology and not much about gorillas, I would definitely not raise my estimate of the probability that gorillas like onions. Animals can be cross-classified. They may be grouped by genealogy and resemblance into a taxon, such as the great apes, but they also may be grouped into “guilds” that specialize in certain ways of getting food, such as omnivores, herbivores, and carnivores. Knowing this principle leads me to reason as follows. Chimpanzees are omnivores, and it is not surprising that they eat onions; after all, we are omnivores, and we eat them. But gorillas are herbivores, who spend their days munching wild celery, thistles, and other plants. Herbivores are often finicky about which species they feed on, because their digestive systems are optimized to detoxify the poisons in some kinds of plants and not others (the extreme example being koalas, who specialize in eating eucalyptus leaves). So it would not surprise me if gorillas avoided the pungent onion, regardless of what chimpanzees do. 
Depending on which system of explanation I call to mind, chimpanzees and gorillas are either highly similar category-mates or as different as people and cows.
In associationism and its implementation in connectoplasm, the way an object is represented (namely, as a set of properties) automatically commits the system to generalizing in a certain way (unless it is trained out of the generalization with specially provided contrary examples). The alternative I am pushing is that humans can mentally symbolize kinds of objects, and those symbols can be referred to in a number of rule systems we carry around in our heads. (In artificial intelligence, this technique is called explanation-based generalization, and connectionist designs are an example of the technique called similarity-based generalization.) Our rule systems couch knowledge in compositional, quantified, recursive propositions, and collections of these propositions interlock to form modules or intuitive theories about particular domains of experience, such as kinship, intuitive science, intuitive psychology, number, language, and law. Chapter 5 explores some of those domains.
What good are crisp categories and systems of rules? In the social world they can adjudicate between haggling parties each pointing at the fuzzy boundary of a category, one saying something is inside and the {129} other saying it is outside. Rites of passage, the age of majority, diplomas, licenses, and other pieces of legal paper draw sharp lines that all parties can mentally represent, lines that let everyone know where everyone else stands. Similarly, all-or-none rules are a defense against salami tactics, in which a person tries to take advantage of a fuzzy category by claiming one borderline case after another to his advantage.
Rules and abstract categories also help in dealing with the natural world. By sidestepping similarity, they allow us to get beneath the surface and ferret out hidden laws that make things tick. And because they are, in a sense, digital, they give representations stability and precision. If you make a chain of analog copies from an analog tape, the quality declines with each generation of copying. But if you make a chain of digital copies, the last can be as good as the first. Similarly, crisp symbolic representations allow for chains of reasoning in which the symbols are copied verbatim in successive thoughts, forming what logicians call a sorites:
All ravens are crows.
All crows are birds.
All birds are animals.
All animals need oxygen.
A sorites allows a thinker to draw conclusions with confidence despite meager experience. For example, a thinker can conclude that ravens need oxygen even if no one has ever actually deprived a raven of oxygen to see what happens. The thinker can reach that conclusion even if he or she has never witnessed an experiment depriving any animal of oxygen but only heard the statement from a credible expert. But if each step in the deduction were fuzzy or probabilistic or cluttered with the particulars of the category members one step before, the slop would accumulate. The last statement would be as noisy as an nth-generation bootleg tape or as unrecognizable as the last whisper in a game of broken telephone. People in all cultures carry out long chains of reasoning built from links whose truth they could not have observed directly. Philosophers have often pointed out that science is made possible by that ability.
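The contrast between crisp links and accumulating slop can be put in a short sketch. The four statements are the ones in the sorites. The 90 percent reliability figure, and the idea of multiplying it across independent steps, are assumptions added purely for illustration.

```python
# Crisp links, copied from the sorites in the text.
is_a = {"raven": "crow", "crow": "bird", "bird": "animal"}
needs = {"animal": "oxygen"}

def chain(kind):
    """Follow the 'All X are Y' links upward from a starting category."""
    links = [kind]
    while links[-1] in is_a:
        links.append(is_a[links[-1]])
    return links

def needs_oxygen(kind):
    """True if some category on the chain is stated to need oxygen."""
    return any(needs.get(k) == "oxygen" for k in chain(kind))

# Assumption for contrast: suppose each statement were only 90% reliable
# and the steps were independent, so the reliabilities multiply.
steps = len(is_a) + len(needs)
fuzzy_confidence = 0.9 ** steps

print(chain("raven"), needs_oxygen("raven"), round(fuzzy_confidence, 4))
```

With symbols copied verbatim from step to step, the conclusion about ravens is as firm as each premise. Under the invented 90 percent figure, four steps already leave only about two chances in three.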
Like many issues surrounding the mind, the debate over connectionism is often cast as a debate between innateness and learning. And as always, {130} that makes it impossible to think clearly. Certainly learning plays an enormous role in connectionist modeling. Often a modeler, sent back to the drawing board by the problems I have mentioned, will take advantage of a hidden-layer network's ability to learn a set of inputs and outputs and generalize them to new, similar ones. By training the living daylights out of a generic hidden-layer network, one can sometimes get it to do approximately the right thing. But heroic training regimes cannot, by themselves, be the salvation of connectoplasm. That is not because the networks have too little innate structure and too much environmental input. It is because raw connectoplasm is so underpowered that networks must often be built with the worst combination: too much innate structure combined with too much environmental input.
For example, Hinton devised a three-layer network to compute family relationships. (He intended it as a demonstration of how networks work, but other connectionists have treated it as a real theory of psychology.) The input layer had units for a name and units for a relationship, such as “Colin” and “mother.” The output layer had units for the name of the person so related, such as “Victoria.” Since the units and connections are the innate structure of a network, and only the connection weights are learned, taken literally the network corresponds to an innate module in the brain just for spitting out answers to questions about who is related to a named person in a given way. It is not a system for reasoning about kinship in general, because the knowledge is smeared across the connection weights linking the question layer to the answer layer, rather than being stored in a database that can be accessed by different retrieval processes. So the knowledge is useless if the question is changed slightly, such as asking how two people are related or asking for the names and relationships in a person's family. In this sense, the model has too much innate structure; it is tailored to a specific quiz.
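For contrast, here is a minimal sketch of the alternative mentioned above: kinship facts kept in a database that several retrieval processes can consult. Only the Colin, mother, and Victoria triple comes from the example in the text. The second fact, the name "Penelope," and all function names are hypothetical, added for illustration.

```python
# One store of facts: (person, relation) -> relative.
facts = {
    ("Colin", "mother"): "Victoria",
    ("Victoria", "mother"): "Penelope",  # hypothetical extra fact
}

def lookup(person, relation):
    """The original quiz: who is related to this person in this way?"""
    return facts.get((person, relation))

def relation_between(person, other):
    """A changed question: how is `other` related to `person`?"""
    return [r for (p, r), o in facts.items() if p == person and o == other]

def family_of(person):
    """Another changed question: list this person's stored relatives."""
    return {r: o for (p, r), o in facts.items() if p == person}

def maternal_grandmother(person):
    """A crisp rule composed from stored facts: the mother of the mother."""
    mother = lookup(person, "mother")
    return lookup(mother, "mother") if mother else None

print(lookup("Colin", "mother"))
print(relation_between("Colin", "Victoria"))
print(maternal_grandmother("Colin"))
```

Each new kind of question costs one short retrieval routine, and the facts themselves never have to be relearned.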
After training the model to reproduce the relationships in a small, made-up family, Hinton called attention to its ability to generalize to new pairs of kin. But in the fine print we learn that the network had to be trained on 100 of the 104 possible pairs in order to generalize to the remaining 4. And each of the 100 pairs in the training regime had to be fed into the network 1,500 times (150,000 lessons in all)! Obviously children do not learn family relationships in a manner even remotely like this. The numbers are typical of connectionist networks, because they do not cut to the solution by means of rules but need to have most of the examples pounded into them and merely interpolate between the examples. {131} Every substantially different kind of example must be in the training set, or the network will interpolate spuriously, as in the story of the statisticians on a duck hunt: One shoots a yard too high, the second shoots a yard too low, and the third shouts, “We got him!”
Why put connectoplasm under such strong lights? Certainly not because I think neural-network modeling is unimportant — quite the contrary! Without it, my whole edifice on how the mind works would be left levitating in midair. Nor do I think that network modeling is merely subcontracting out the work of building demons and data structures from neural hardware. Many connectionist models offer real surprises about what the simplest steps of mental computation can accomplish. I do think that connectionism has been oversold. Because networks are advertised as soft, parallel, analogical, biological, and continuous, they have acquired a cuddly connotation and a diverse fan club. But neural networks don't perform miracles, only some logical and statistical operations. The choices of an input representation, of the number of networks, of the wiring diagram chosen for each one, and of the data pathways and control structures that interconnect them explain more about what makes a system smart than do the generic powers of the component connectoplasm.
But my main intent is not to show what certain kinds of models cannot do but what the mind can do. The point of this chapter is to give you a feel for the stuff our minds are made of. Thoughts and thinking are no longer ghostly enigmas but mechanical processes that can be studied, and the strengths and weaknesses of different theories can be examined and debated. I find it particularly illuminating to see the shortcomings of the venerable doctrine of the association of ideas, because they highlight the precision, subtlety, complexity, and open-endedness of our everyday thinking. The computational power of human thought has real consequences. It is put to good use in our capacity for love, justice, creativity, literature, music, kinship, law, science, and other activities we will explore in later chapters. But before we get to them, we must return to the other question that opened this chapter.
What about consciousness? What makes us actually suffer the pain of a toothache or see the blue of the sky as blue? The computational theory of {132} mind, even with complete neural underpinnings, offers no clear answer. The symbol blue is inscribed, goal states change, some neurons fire; so what? Consciousness has struck many thinkers as not just a problem but almost a miracle:
Matter can differ from matter only in form, bulk, density, motion and direction of motion: to which of these, however varied or combined, can consciousness be annexed? To be round or square, to be solid or fluid, to be great or little, to be moved slowly or swiftly one way or another, are modes of material existence, all equally alien from the nature of cogitation.
— Samuel Johnson
How it is that anything so remarkable as a state of consciousness comes about as a result of irritating nervous tissue, is just as unaccountable as the appearance of the Djin, when Aladdin rubbed his lamp.
— Thomas Huxley
Somehow, we feel, the water of the physical brain is turned into the wine of consciousness, but we draw a total blank on the nature of this conversion. Neural transmissions just seem like the wrong kind of materials with which to bring consciousness into the world.
— Colin McGinn
Consciousness presents us with puzzle after puzzle. How can a neural event cause consciousness to happen? What good is consciousness? That is, what does the raw sensation of redness add to the train of billiard-ball events taking place in our neural computers? Any effect of perceiving something as red — noticing it against a sea of green, saying out loud, “That's red,” reminiscing about Santa Claus and fire engines, becoming agitated — could be accomplished by pure information processing triggered by a sensor for long-wavelength light. Is consciousness an impotent side effect hovering over the symbols, like the lights flashing on a computer or the thunder that accompanies lightning? And if consciousness is useless — if a creature without it could negotiate the world as well as a creature with it — why would natural selection have favored the conscious one?
Consciousness has recently become the circle that everyone wants to square. Almost every month an article announces that consciousness has been explained at last, often with a raspberry blown at the theologians and humanists who would put boundaries on science and another one {133} for the scientists and philosophers who dismiss the topic as too subjective or muddled to be studyable.
Unfortunately, many of the things that people write about consciousness are almost as puzzling as consciousness itself. Stephen Jay Gould wrote, “Homo sapiens is one small twig [on the tree of life]. . . . Yet our twig, for better or worse, has developed the most extraordinary new quality in all the history of multicellular life since the Cambrian explosion. We have invented consciousness with all its sequelae from Hamlet to Hiroshima.” Gould has denied consciousness to all nonhuman animals; other scientists grant it to some animals but not all. Many test for consciousness by seeing whether an animal recognizes that the image in a mirror is itself and not another animal. By this standard, monkeys, young chimpanzees, old chimpanzees, elephants, and human toddlers are unconscious. The only conscious animals are gorillas, orangutans, chimpanzees in their prime, and, according to Skinner and his student Robert Epstein, properly trained pigeons. Others are even more restrictive than Gould: not even all people are conscious. Julian Jaynes claimed that consciousness is a recent invention. The people of early civilizations, including the Greeks of Homer and the Hebrews of the Old Testament, were unconscious. Dennett is sympathetic to the claim; he believes that consciousness “is largely a product of cultural evolution that gets imparted to brains in early training” and that it is “a huge complex of memes,” meme being Dawkins’ term for a contagious feature of culture, such as a catchy jingle or the latest fashion craze.
Something about the topic of consciousness makes people, like the White Queen in Through the Looking Glass, believe six impossible things before breakfast. Could most animals really be unconscious — sleepwalkers, zombies, automata, out cold? Hath not a dog senses, affections, passions? If you prick them, do they not feel pain? And was Moses really unable to taste salt or see red or enjoy sex? Do children learn to become conscious in the same way that they learn to wear baseball caps turned around?
People who write about consciousness are not crazy, so they must have something different in mind when they use the word. One of the best observations about the concept of consciousness came from Woody Allen in his hypothetical college course catalogue:
Introduction to Psychology: The theory of human behavior. ... Is there a split between mind and body, and, if so, which is better to have? {134}
. . . Special consideration is given to a study of consciousness as opposed to unconsciousness, with many helpful hints on how to remain conscious.
Verbal humor sets readers up with one meaning of an ambiguous word and surprises them with another. Theoreticians also trade on the ambiguity of the word consciousness, not as a joke but as a bait-and-switch: the reader is led to expect a theory for one sense of the word, the hardest to explain, and is given a theory for another sense, the easiest to explain. I don't like to dwell on definitions, but when it comes to consciousness we have no choice but to begin by disentangling the meanings.
Sometimes “consciousness” is just used as a lofty synonym for “intelligence.” Gould, for example, must have been using it in this way. But there are three more-specialized meanings, nicely distinguished by the linguist Ray Jackendoff and the philosopher Ned Block.
One is self-knowledge. Among the various people and objects that an intelligent being can have information about is the being itself. Not only can I feel pain and see red, I can think to myself, “Hey, here I am, Steve Pinker, feeling pain and seeing red!” Oddly enough, this recondite sense of the word is the one that most academic discussions have in mind. Consciousness is typically defined as “building an internal model of the world that contains the self,” “reflecting back on one's own mode of understanding,” and other kinds of navel-gazing that have nothing to do with consciousness as it is commonly understood: being alive and awake and aware.
Self-knowledge, including the ability to use a mirror, is no more mysterious than any other topic in perception and memory. If I have a mental database for people, what's to prevent it from containing an entry for myself? If I can learn to raise my arm and crane my neck to sight a hidden spot on my back, why couldn't I learn to raise a mirror and look up at it to sight a hidden spot on my forehead? And access to information about the self is perfectly easy to model. Any beginning programmer can write a short piece of software that examines, reports on, and even modifies itself. A robot that could recognize itself in a mirror would not be much more difficult to build than a robot that could recognize anything at all. There are, to be sure, good questions to ask about the evolution of self-knowledge, its development in children, and its advantages (and, more interesting, disadvantages, as we shall see in Chapter 6). But self-knowledge is an everyday topic in cognitive science, not the paradox of {135} water becoming wine. Because it is so easy to say something about self-knowledge, writers can crow about their “theory of consciousness.”
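How little is needed can be shown with a minimal sketch. The class, the name "Robbie," and the "mood" field are all invented for illustration. The agent keeps a database of individuals that happens to include an entry for itself, and it can report on that entry and revise it.

```python
class Agent:
    def __init__(self, name):
        self.name = name
        self.mood = "calm"
        # A database of known individuals, with an entry for the agent itself.
        self.known = {name: self}

    def report(self):
        """Examine the agent's own entry and describe it."""
        me = self.known[self.name]
        return f"Here I am, {me.name}, feeling {me.mood}"

    def revise(self, mood):
        """Modify the agent's own entry."""
        self.known[self.name].mood = mood

robot = Agent("Robbie")
first = robot.report()
robot.revise("agitated")
second = robot.report()
print(first)
print(second)
```

The entry for the self is stored and retrieved exactly like an entry for anyone else, which is the sense in which this kind of self-knowledge poses no special puzzle.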
A second sense is access to information. I ask, “A penny for your thoughts?” You reply by telling me the content of your daydreams, your plans for the day, your aches and itches, and the colors, shapes, and sounds in front of you. But you cannot tell me about the enzymes secreted by your stomach, the current settings of your heart and breathing rate, the computations in your brain that recover 3-D shapes from the 2-D retinas, the rules of syntax that order the words as you speak, or the sequence of muscle contractions that allow you to pick up a glass. That shows that the mass of information processing in the nervous system falls into two pools. One pool, which includes the products of vision and the contents of short-term memory, can be accessed by the systems underlying verbal reports, rational thought, and deliberate decision making. The other pool, which includes autonomic (gut-level) responses, the internal calculations behind vision, language, and movement, and repressed desires or memories (if there are any), cannot be accessed by those systems. Sometimes information can pass from the first pool to the second or vice versa. When we first learn how to use a stick shift, every motion has to be thought out, but with practice the skill becomes automatic. With intense concentration and biofeedback, we can focus on a hidden sensation like our heartbeat.
This sense of consciousness, of course, also embraces Freud's distinction between the conscious and the unconscious mind. As with self-knowledge, there is nothing miraculous or even mysterious about it. Indeed, there are obvious analogues in machines. My computer has access to information about whether the printer is working or not working (it is “conscious” of it, in this particular sense) and can print out an error message, Printer not responding. But it has no access to information about why the printer is not working; the signal carried back along the cable from printer to computer does not include the information. The chip inside the printer, in contrast, does have access to that information (it is conscious of it, in this sense); the sensors in different parts of the printer feed into the chip, and the chip can turn on a yellow light if the toner supply is low and a red light if the paper is jammed.
Finally, we come to the most interesting sense of all, sentience: subjective experience, phenomenal awareness, raw feels, first-person present tense, “what it is like” to be or do something, if you have to ask you'll never know. Woody Allen's joke turned on the difference between this sense of consciousness and Freud's sense of it as access to information {136} by the deliberative, language-using parts of the mind. And this sense, sentience, is the one in which consciousness seems like a miracle.
The remainder of the chapter is about consciousness in these last two senses. First I will look at access, at what kinds of information the different parts of the mind make available to one another. In this sense of the word, we really are coming to understand consciousness. Interesting things can be said about how it is implemented in the brain, the role it plays in mental computation, the engineering specs it is designed to meet (and hence the evolutionary pressures that gave rise to it), and how those specs explain the main features of consciousness — sensory awareness, focal attention, emotional coloring, and the will. Finally, I will turn to the problem of sentience.
Someday, probably sooner rather than later, we will have a fine understanding of what in the brain is responsible for consciousness in the sense of access to information. Francis Crick and Christof Koch, for example, have set out straightforward criteria for what we should look for. Most obviously, information from sensation and memory guides behavior only in an awake animal, not an anesthetized one. Therefore some of the neural bases of access-consciousness can be found in whatever brain structures act differently when an animal is awake and when it is in a dreamless sleep or out cold. The lower layers of the cerebral cortex are one candidate for that role. Also, we know that information about an object being perceived is scattered across many parts of the cerebral cortex. Therefore information access requires a mechanism that binds together geographically separated data. Crick and Koch suggest that synchronization of neural firing might be one such mechanism, perhaps entrained by loops from the cortex to the thalamus, the cerebrum's central way-station. They also note that voluntary, planned behavior requires activity in the frontal lobes. Therefore access-consciousness may be determined by the anatomy of the fiber tracts running from various parts of the brain to the frontal lobes. Whether or not they are right, they have shown that the problem can be addressed in the lab.
Access-consciousness is also a mere problem, not a mystery, in our grasp of the computations carried out by the brain. Recall our uncle-detecting production system. It has a communal short-term memory: a {137} workspace or bulletin board visible to all of the demons in the system. In a separate part of the system lies a larger repository of information, a long-term memory, that cannot be read by the demons until pieces of it are copied to the short-term memory. Many cognitive psychologists have pointed out that in these models the short-term memory (communal bulletin board, global workspace) acts just like consciousness. When we are aware of a piece of information, many parts of the mind can act on it. We not only see a ruler in front of us but can describe it, reach for it, deduce that it can prop up a window, or count its markings. As the philosopher Stephen Stich has put it, conscious information is inferentially promiscuous; it makes itself available to a large number of information-processing agents rather than committing itself to one alone. Newell and Simon have made headway in understanding human problem-solving simply by asking a person to think aloud when working on a puzzle. They have nicely simulated the mental activity using a production system where the contents of the bulletin board correspond step for step with the person's report of what he is consciously thinking.
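The bulletin-board arrangement can be sketched in a few lines of Python. This is only a toy under stated assumptions: the fact format (three-item tuples), the names `copy_to_board`, `uncle_demon`, and `run_demons`, and the family facts are all invented here, and the sketch is not the production system described earlier in the book. What it shows is the one property the passage stresses: demons can read the board but not long-term memory, so a fact does nothing until it is copied over.

```python
def copy_to_board(board, long_term, cue):
    """Copy the long-term facts that mention `cue` onto the shared board."""
    return board | {fact for fact in long_term if cue in fact}


def uncle_demon(board):
    """If X is a male sibling of Y and Y is a parent of Z, post (uncle-of, X, Z)."""
    found = set()
    for rel1, x, y in board:
        if rel1 != "sibling-of":
            continue
        for rel2, parent, z in board:
            if rel2 == "parent-of" and parent == y and ("is", x, "male") in board:
                found.add(("uncle-of", x, z))
    return found


def run_demons(board, demons, max_cycles=5):
    """Let every demon read the board and post to it until nothing new appears."""
    board = set(board)
    for _ in range(max_cycles):
        posted = set()
        for demon in demons:
            posted |= demon(board)
        if posted <= board:          # no demon had anything new to say
            break
        board |= posted
    return board


long_term = {
    ("sibling-of", "Bela", "Abel"),
    ("is", "Bela", "male"),
    ("parent-of", "Cora", "Dan"),
}
short_term = {("parent-of", "Abel", "Me")}
```

Run on `short_term` alone, the demon posts nothing. After `copy_to_board(short_term, long_term, "Bela")`, the same demon posts `("uncle-of", "Bela", "Me")`, because the facts it needs are now on the board.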
The engineering specs of information access, and thus the selection pressures that probably gave rise to it, are also becoming clearer. The general principle is that any information processor must be given limited access to information because information has costs as well as benefits.
One cost is space: the hardware to hold the information. The limitation is all too clear to microcomputer owners deciding whether to invest in more RAM. Of course the brain, unlike a computer, comes with vast amounts of parallel hardware for storage. Sometimes theorists infer that the brain can store all contingencies in advance and that thought can be reduced to one-step pattern recognition. But the mathematics of a combinatorial explosion bring to mind the old slogan of MTV: Too much is never enough. Simple calculations show that the number of humanly graspable sentences, sentence meanings, chess games, melodies, seeable objects, and so on can exceed the number of particles in the universe. For example, there are thirty to thirty-five possible moves at each point in a chess game, each of which can be followed by thirty to thirty-five responses, defining about a thousand complete turns. A typical chess game lasts forty turns, yielding 10^120 different chess games. There are about 10^70 particles in the visible universe. So no one can play chess by memorizing all the games and recognizing every sequence of moves. The same is true for sentences, stories, melodies, and so on. Of course, some combinations can be stored, but pretty soon either you run out of brain {138} or you start to superimpose the patterns and get useless chimeras and blends. Rather than storing googols of inputs and their outputs or questions and their answers, an information processor needs rules or algorithms that operate on a subset of information at a time and calculate an answer just when it is needed.
A second cost of information is time. Just as one couldn't store all the chess games in a brain less than the size of the universe, one can't mentally play out all the chess games in a lifetime less than the age of the universe (10^18 seconds). Solving a problem in a hundred years is, practically speaking, the same as not solving it at all. In fact, the requirements on an intelligent agent are even more stringent. Life is a series of deadlines. Perception and behavior take place in real time, such as in hunting an animal or keeping up one's end of a conversation. And since computation itself takes time, information processing can be part of the problem rather than part of the solution. Think about a hiker planning the quickest route back to camp before it gets dark and taking twenty minutes to plot out a path that saves her ten minutes.
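The chess arithmetic is easy to check. The sketch below simply replays the figures given in the passage; the variable names are mine, and the rounding of a turn to "about a thousand" possibilities is the passage's own approximation, not an exact count.

```python
# Figures taken from the passage above.
moves_low, moves_high = 30, 35

# One complete turn is a move followed by a response.
turns_low = moves_low * moves_low       # 900
turns_high = moves_high * moves_high    # 1225
turns_rounded = 1000                    # "about a thousand complete turns"

game_length = 40                        # turns in a typical game
games = turns_rounded ** game_length    # 1000**40, which equals 10**120
particles = 10 ** 70                    # particle count given in the passage

games_per_particle = games // particles # 10**50
```

Even if every particle in the visible universe stored one game, there would still be 10^50 games competing for each particle.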
A third cost is resources. Information processing requires energy. That is obvious to anyone who has stretched out the battery life of a laptop computer by slowing down the processor and restricting its access to information on the disk. Thinking, too, is expensive. The technique of functional imaging of brain activity (PET and MRI) depends on the fact that working brain tissue calls more blood its way and consumes more glucose.
Any intelligent agent incarnated in matter, working in real time, and subject to the laws of thermodynamics must be restricted in its access to information. Only information relevant to the problem at hand should be allowed in. That does not mean that the agent should wear blinkers or become an amnesiac. Information that is irrelevant at one time for one purpose might be relevant at another time for another purpose. So information must be routed. Information that is always irrelevant to a kind of computation should be permanently sealed off from it. Information that is sometimes relevant and sometimes irrelevant should be accessible to a computation when it is relevant, insofar as that can be predicted in advance. This design specification explains why access-consciousness exists in the human mind and also allows us to understand some of its details.
Access-consciousness has four obvious features. First, we are aware, to varying degrees, of a rich field of sensation: the colors and {139} shapes of the world in front of us, the sounds and smells we are bathed in, the pressures and aches of our skin, bone, and muscles. Second, portions of this information can fall under the spotlight of attention, get rotated into and out of short-term memory, and feed our deliberative cogitation. Third, sensations and thoughts come with an emotional flavoring: pleasant or unpleasant, interesting or repellent, exciting or soothing. Finally, an executive, the “I,” appears to make choices and pull the levers of behavior. Each of these features discards some information in the nervous system, defining the highways of access-consciousness. And each has a clear role in the adaptive organization of thought and perception to serve rational decision making and action.
Let's begin with the perceptual field. Jackendoff, after reviewing the levels of mental representation used by various modules, asked which level corresponds to the rich field of present-tense awareness. For example, visual processing runs from the rods and cones in the retina, through intermediate levels representing edges, depths, and surfaces, to a recognition of the objects in front of us. Language understanding proceeds from raw sound up through representations of syllables, words, and phrases, to an understanding of the content of the message.
Jackendoff observed that access-consciousness seems to tap the intermediate levels. People are unaware of the lowest levels of sensation. We do not spend our lives in Proustian contemplation of every crumb of the madeleine and every nuance of the decoction of lime flowers. We literally cannot see the lightness of the coal in the sun, the darkness of the snowball inside, the pale green-gray of the “black” areas on the television screen, or the rubbery parallelograms that a moving square projects on our retinas. What we “see” is a highly processed product: the surfaces of objects, their intrinsic colors and textures, and their depths, slants, and tilts. In the sound wave arriving at our ears, syllables and words are warped and smeared together, but we don't hear that seamless acoustic ribbon; we “hear” a chain of well-demarcated words. Our immediate awareness does not exclusively tap the highest level of representation, either. The highest levels — the contents of the world, or the gist of a message — tend to stick in long-term memory days and years after an experience, but as the experience is unfolding, we are aware of the sights and sounds. We do not just abstractly think “Face!” when we see a face; the shadings and contours are available for scrutiny. {140}
The advantages of intermediate-level awareness are not hard to find. Our perception of a constant shape and lightness across changes in viewing conditions tracks the object's inherent properties: the lump of coal itself stays rigid and black as we move around it or raise the lights, and we experience it as looking the same. The lower levels are not needed, and the higher levels are not enough. The raw data and computational steps behind these constancies are sealed off from our awareness, no doubt because they use the eternal laws of optics and neither need advice from, nor have any insights to offer to, the rest of cognition. The products of the computation are released for general consumption well before the identities of objects are established, because we need more than a terse mise en scene to make our way around the world. Behavior is a game of inches, and the geometry and composition of surfaces must be available to the decision processes that plan the next step or grasp. Similarly, while we are understanding a sentence there is nothing to be gained in peering all the way down to the hisses and hums of the sound wave; they have to be decoded into syllables before they match up with anything meaningful in the mental dictionary. The speech decoder uses a special key with lifelong validity and should be left to do its job without interference from kibbitzers in the rest of the mind. But as with vision, the rest of the mind cannot be satisfied with only the final product, either — in this case the speaker's gist. The choice of words and the tone of voice carry information that allows us to hear between the lines.
The next noteworthy feature of conscious access is the spotlight of attention. It serves as the quintessential demonstration that unconscious parallel processing (in which many inputs are processed at the same time, each by its own mini-processor) can go only so far. An early stage of parallel processing does what it can, and passes along a representation from which a more cramped and plodding processor must select the information it needs. The psychologist Anne Treisman thought up a few simple, now classic demonstrations of where unconscious processing leaves off and conscious processing begins. People are shown a display of colored shapes, like X's and O's, and are asked to press a button if they see a specified target. If the search target is an O and the display shows one O in a sea of X's, the person responds quickly. It doesn't matter how many X's there are; people say the O just pops out. (Pop-out, as the effect is now called, is a nice sign of unconscious parallel processing.) Similarly, a green O pops out from a sea of red O's. But if the experimenter {141} asks the person to find a letter that is both green and an O, and the letter sits somewhere in a mixed sea of green X's and red O's, the person must consciously search the display, letter by letter, checking each one to see if it meets the two-part criterion. The task becomes like the children's comic strip Where's Waldo?, in which the hero in the red-and-white-striped jersey hides in a throng of people wearing red, white, or stripes.
What exactly is happening? Imagine that the visual field is sprinkled with thousands of little processors, each of which detects a color or a simple shape like a curve, an angle, or a line whenever it appears at the processor's location. The output of one set of processors looks like this: red red red red green red red red, and so on. The output of another set looks like this: straight straight straight curved straight straight straight, and so on. Superimposed on these processors is a layer of odd-man-out detectors. Each stands astride a group of line or color detectors and “marks” any spot on the visual field that differs from its neighbors in color or in contour. The green surrounded by reds acquires a little extra flag. All it takes to see a green among reds is to spot the flag, a task within the powers of even the simplest demon. An O among X's can be detected in the same way. But the thousands of processors tiled across the field are too stupid to calculate conjunctions of features: a patch that is green and curved, or red and straight. The conjunctions are detected only by a programmable logic machine that looks at one part of the visual field at a time through a narrow, movable window, and passes on its answer to the rest of cognition.
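The two stages can be caricatured in Python. The sketch makes loud simplifications: a display is a flat list of (color, shape) pairs, the odd-man-out detector flags a value that is unique in the whole display rather than one that differs from its local neighbors, and the function names `pop_out` and `serial_search` are invented for the purpose. It is meant only to show why a single feature can be flagged in one pass while a conjunction forces an item-by-item scan.

```python
from collections import Counter

COLOR, SHAPE = 0, 1   # positions of the two features within each item


def pop_out(display, feature):
    """Parallel stage: flag positions whose value on one feature map is unique."""
    values = [item[feature] for item in display]
    counts = Counter(values)
    return [pos for pos, value in enumerate(values) if counts[value] == 1]


def serial_search(display, target):
    """Serial stage: inspect one location at a time for a conjunction of features.

    Returns (position, items_inspected); position is None if the target is absent.
    """
    for inspected, item in enumerate(display, start=1):
        if item == target:
            return inspected - 1, inspected
    return None, len(display)


# One green X in a sea of red X's: the color map alone flags it.
easy = [("red", "X")] * 5 + [("green", "X")] + [("red", "X")] * 3

# One green O among green X's and red O's: no single feature is unique.
hard = [("green", "X"), ("red", "O"), ("green", "X"),
        ("red", "O"), ("green", "O"), ("red", "O")]
```

On `easy`, `pop_out(easy, COLOR)` flags position 5 no matter how many red X's are added. On `hard`, both feature maps come back empty, and `serial_search(hard, ("green", "O"))` has to inspect five items before it finds the target.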
Why is visual computation divided into an unconscious parallel stage and a conscious serial stage? Conjunctions are combinatorial. It would be impossible to sprinkle conjunction detectors at every location in the visual field because there are too many kinds of conjunctions. There are a million visual locations, so the number of processors needed would be a million multiplied by the number of logically possible conjunctions: the number of colors we can discriminate times the number of contours times the number of depths times the number of directions of motion times the number of velocities, and so on, an astronomical number. Parallel, unconscious computation stops after it labels each location with a color, contour, depth, and motion; the combinations then have to be computed, consciously, at one location at a time.
The theory makes a surprising prediction. If the conscious processor is focused at one location, the features at other locations should float {142} around unglued. For example, a person not deliberately attending to a region should not know whether it contains a red X and a green O or a green X and a red O — the color and shape should float in separate planes until the conscious processor binds them together at a particular spot. Treisman found that that is what happens. When people are distracted from some colored letters, they can report the letters and they can report the colors, but they misreport which color went with which letter. These illusory combinations are a striking demonstration of the limits of unconscious visual computation, and they are not uncommon in everyday life. When words are glimpsed absent-mindedly or out of the corner of the eye, the letters sometimes rearrange themselves. One psychologist began to study the phenomenon after he walked past a coffee machine and wondered why it claimed to be dispensing the World's Worst Coffee. The sign, of course, really said “World's Best Coffee.” One time I did a double-take when driving past a billboard advertising a brothel (actually the Brothers’ Hotel). When flipping through a magazine I once caught sight of a headline about anti-semitic cameras (they were semi-antique).
There are bottlenecks constricting the flow of information from inside the person as well as from outside. When we try to retrieve a memory, the items drip into awareness one at a time, often with agonizing delays if the information is old or uncommon. Ever since Plato invoked the metaphor of soft wax, psychologists have assumed that the neural medium must be inherently resistant to retaining information, fading with time unless the information is pounded in. But the brain can record indelible memories, such as the content of shocking news and a few of the details of the time and place at which one hears it. So the neural medium itself is not necessarily to blame.
The psychologist John Anderson has reverse-engineered human memory retrieval, and has shown that the limits of memory are not a byproduct of a mushy storage medium. As programmers like to say, “It's not a bug, it's a feature.” In an optimally designed information-retrieval system, an item should be recovered only when the relevance of the item outweighs the cost of retrieving it. Anyone who has used a computerized library retrieval system quickly comes to rue the avalanche of titles spilling across the screen. A human expert, despite our allegedly feeble powers of retrieval, vastly outperforms any computer in locating a piece of information from its content. When I need to find articles on a topic in an unfamiliar field, I don't use the library computer; I send email to a pal in the field. {143}
What would it mean for an information-retrieval system to be optimally designed? It should cough up the information most likely to be useful at the time of the request. But how could that be known in advance? The probabilities could be estimated, using general laws about what kinds of information are most likely to be needed. If such laws exist, we should be able to find them in information systems in general, not just human memory; for example, the laws should be visible in the statistics of books requested at a library or the files retrieved in a computer. Information scientists have discovered several of these laws. A piece of information that has been requested many times in the past is more likely to be needed now than a piece that has been requested only rarely. A piece that has been requested recently is more likely to be needed now than a piece that has not been requested for a while. An optimal information-retrieval system should therefore be biased to fetch frequently and recently encountered items. Anderson notes that that is exactly what human memory retrieval does: we remember common and recent events better than rare and long-past events. He found four other classic phenomena in memory research that meet the optimal design criteria independently established for computer information-retrieval systems.
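A retrieval system biased toward frequent and recent items can be sketched as follows. The scoring rule is an assumption chosen for illustration (each past request contributes an amount that shrinks with its age), and the names `need_score`, `retrieve`, and the `decay` parameter are hypothetical; the passage does not say what formula Anderson or the information scientists used.

```python
def need_score(request_times, now, decay=0.5):
    """Estimate how likely an item is to be needed now.

    Assumed form: every past request adds (age ** -decay), so more requests
    raise the score and older requests count for less.
    """
    return sum((now - t) ** -decay for t in request_times if t < now)


def retrieve(history, now, k=1):
    """Return the k items with the highest estimated need."""
    ranked = sorted(history, key=lambda item: need_score(history[item], now),
                    reverse=True)
    return ranked[:k]


# Times at which each item was requested in the past (arbitrary units).
history = {
    "common": [1, 2, 3, 4, 5],   # requested often
    "rare":   [3],               # requested once
    "recent": [9],               # requested once, just now
    "stale":  [1],               # requested once, long ago
}
```

With `now=10`, the frequently requested item outranks the rarely requested one and the recently requested item outranks the stale one, so `retrieve(history, now=10, k=2)` coughs up `"common"` and `"recent"`.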
A third notable feature of access-consciousness is the emotional coloring of experience. We not only register events but register them as pleasurable or painful. That makes us take steps to have more of the former and less of the latter, now and in the future. None of this is a mystery. Computationally speaking, representations trigger goal states, which in turn trigger information-gathering, problem-solving, and behavior-selecting demons that calculate how to attain, shun, or modify the charged situation. And evolutionarily speaking, there is seldom any mystery in why we seek the goals we seek — why, for example, people would rather make love with an attractive partner than get a slap on the belly with a wet fish. The things that become objects of desire are the kinds of things that led, on average, to enhanced odds of survival and reproduction in the environment in which we evolved: water, food, safety, sex, status, mastery over the environment, and the well-being of children, friends, and kin.
The fourth feature of consciousness is the funneling of control to an executive process: something we experience as the self, the will, the “I.” The self has been under assault lately. The mind is a society of agents, according to the artificial intelligence pioneer Marvin Minsky. It's a large {144} collection of partly finished drafts, says Daniel Dennett, who adds, “It's a mistake to look for the President in the Oval Office of the brain.”
The society of mind is a wonderful metaphor, and I will use it with gusto when explaining the emotions. But the theory can be taken too far if it outlaws any system in the brain charged with giving the reins or the floor to one of the agents at a time. The agents of the brain might very well be organized hierarchically into nested subroutines with a set of master decision rules, a computational demon or agent or good-kind-of-homunculus, sitting at the top of the chain of command. It would not be a ghost in the machine, just another set of if-then rules or a neural network that shunts control to the loudest, fastest, or strongest agent one level down.
We even have hints about the brain structures that house the decision-making circuitry. The neurologist Antonio Damasio has noted that damage to the anterior cingulate sulcus, which receives input from many higher perceptual areas and is connected to the higher levels of the motor system, leaves a patient in a seemingly alert but strangely unresponsive state. The report led Francis Crick to proclaim, only partly in jest, that the seat of the will had been discovered. And for many decades neurologists have known that exercising the will — forming and carrying out plans — is a job of the frontal lobes. A sad but typical example came to me from a man who called about his fifteen-year-old son, who had suffered an injury to his frontal lobes in a car accident. The boy would stay in the shower for hours at a time, unable to decide when to get out, and could not leave the house because he kept looping back to his room to check whether he had turned off the lights.
Why would a society of mental agents need an executive at the top? The reason is as clear as the old Yiddish expression “You can't dance at two weddings with only one tuches.” No matter how many agents we have in our minds, we each have exactly one body. Custody of each major part must be granted to a controller that selects a plan from the hubbub of competing agents. The eyes have to point at one object at a time; they can't fixate on the empty space halfway between two interesting objects or wobble between them in a tug-of-war. The limbs must be choreographed to pull the body or objects along a path that attains the goal of just one of the mind's agents. The alternative, a truly egalitarian society of mind, is shown in the wonderfully silly movie All of Me. Lily Tomlin is a hypochondriac heiress who hires a swami to transfer her soul into the body of a woman who doesn't want hers. During the transfer, a {145} chamberpot containing her soul falls out the window and conks a passerby, played by Steve Martin, on the head. Tomlin's dybbuk comes to rest in the right half of his body while he retains control of the left half. He lurches in a zigzag as first his left half strides in one direction and then his right half, pinkie extended, minces in the other.
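The difference between an egalitarian society of mind and one with an executive can be put in a few lines of Python. Everything here is hypothetical: the agents, their urgency values, and the one-dimensional gaze targets are made up, and "hand control to the most urgent agent" is just one of the simple decision rules such an executive might follow.

```python
def blend_gaze(proposals):
    """Egalitarian control: point the eyes at the average of all proposed targets."""
    return sum(target for _, _, target in proposals) / len(proposals)


def executive_gaze(proposals):
    """Executive control: give the eyes to the single most urgent agent."""
    agent, _, target = max(proposals, key=lambda p: p[1])
    return agent, target


# (agent, urgency, where it wants the eyes to point, in degrees left/right)
proposals = [
    ("find-food", 0.6, -10.0),
    ("watch-predator", 0.9, 10.0),
]
```

The blend leaves the eyes fixed on the empty space at 0.0, halfway between the two interesting objects. The executive points them at 10.0 on behalf of `"watch-predator"` and lets the other agent wait its turn.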
So, consciousness in the sense of access is coming to be understood. What about consciousness in the sense of sentience? Sentience and access may be two sides of a single coin. Our subjective experience is also the grist for our reasoning, speech, and action. We do not just experience a toothache; we complain about it and head to the dentist.
Ned Block has tried to clarify the distinction between access and sentience by thinking up scenarios in which access could occur without sentience and vice versa. An example of access without sentience might be found in the strange syndrome called blindsight. When a person has a large blind spot because of damage to his visual cortex, he will adamantly deny that he can see a thing there, but when forced to guess where an object is, he performs well above chance. One interpretation is that the blindsighter has access to the objects but is not sentient of them. Whether or not this is correct, it shows that it is possible to conceive of a difference between access and sentience. Sentience without access might occur when you are engrossed in a conversation and suddenly realize that there is a jackhammer outside the window and that you have been hearing it, but not noticing it, for some time. Prior to the epiphany you were sentient of the noise but had no access to it. But Block admits that the examples are a bit strained, and suspects that in reality access and sentience go together.
So we may not need a separate theory of where sentience occurs in the brain, how it fits into mental computation, or why it evolved. It seems to be an extra quality of some kinds of information access. What we do need is a theory of how the subjective qualities of sentience emerge out of mere information access. To complete the story, then, I must present a theory that addresses questions like these:
• If we could ever duplicate the information processing in the human mind as an enormous computer program, would a computer running the program be conscious? {146}
• What if we took that program and trained a large number of people, say, the population of China, to hold in mind the data and act out the steps? Would there be one gigantic consciousness hovering over China, separate from the consciousnesses of the billion individuals? If they were implementing the brain state for agonizing pain, would there be some entity that really was in pain, even if every citizen was cheerful and light-hearted?
• Suppose the visual receiving area at the back of your brain was surgically severed from the rest and remained alive in your skull, receiving input from the eyes. By every behavioral measure you are blind. Is there a mute but fully aware visual consciousness sealed off in the back of your head? What if it was removed and kept alive in a dish?
• Might your experience of red be the same as my experience of green? Sure, you might label grass as “green” and tomatoes as “red,” just as I do, but perhaps you actually see the grass as having the color that I would describe, if I were in your shoes, as red.
• Could there be zombies? That is, could there be an android rigged up to act as intelligently and as emotionally as you and me, but in which there is “no one home” who is actually feeling or seeing anything? How do I know that you're not a zombie?
• If someone could download the state of my brain and duplicate it in another collection of molecules, would it have my consciousness? If someone destroyed the original, but the duplicate continued to live my life and think my thoughts and feel my feelings, would I have been murdered? Was Captain Kirk snuffed out and replaced by a twin every time he stepped into the transporter room?
• What is it like to be a bat? Do beetles enjoy sex? Does a worm scream silently when a fisherman impales it on a hook?
• Surgeons replace one of your neurons with a microchip that duplicates its input-output functions. You feel and behave exactly as before. Then they replace a second one, and a third one, and so on, until more and more of your brain becomes silicon. Since each microchip does exactly what the neuron did, your behavior and memory never change. Do you even notice the difference? Does it feel like dying? Is some other conscious entity moving in with you?
Beats the heck out of me! I have some prejudices, but no idea of how to begin to look for a defensible answer. And neither does anyone else. The computational theory of mind offers no insight; neither does any {147} finding in neuroscience, once you clear up the usual confusion of sentience with access and self-knowledge.
How can a book called How the Mind Works evade the responsibility of explaining where sentience comes from? I could, I suppose, invoke the doctrine of logical positivism, which holds that if a statement cannot be verified it is literally meaningless. The imponderables in my list ask about the quintessentially unverifiable. Many thinkers, such as Dennett, conclude that worrying about them is simply flaunting one's confusion: sentient experiences (or, as philosophers call them, qualia) are a cognitive illusion. Once we have isolated the computational and neurological correlates of access-consciousness, there is nothing left to explain. It's just irrational to insist that sentience remains unexplained after all the manifestations of sentience have been accounted for, just because the computations don't have anything sentient in them. It's like insisting that wetness remains unexplained even after all the manifestations of wetness have been accounted for, because moving molecules aren't wet.
Most people are uncomfortable with the argument, but it is not easy to find anything wrong with it. The philosopher Georges Rey once told me that he has no sentient experiences. He lost them after a bicycle accident when he was fifteen. Since then, he insists, he has been a zombie. I assume he is speaking tongue-in-cheek, but of course I have no way of knowing, and that is his point.
The qualia-debunkers do have a point. At least for now, we have no scientific purchase on the special extra ingredient that gives rise to sentience. As far as scientific explanation goes, it might as well not exist. It's not just that claims about sentience are perversely untestable; it's that testing them would make no difference to anything anyway. Our incomprehension of sentience does not impede our understanding of how the mind works in the least. Generally the parts of a scientific problem fit together like a crossword puzzle. To reconstruct human evolution, we need physical anthropology to find the bones, archeology to understand the tools, molecular biology to date the split from chimpanzees, and paleobotany to reconstruct the environment from fossil pollen. When any part of the puzzle is blank, such as a lack of chimpanzee fossils or an uncertainty about whether the climate was wet or dry, the gap is sorely felt and everyone waits impatiently for it to be filled. But in the study of the mind, sentience floats in its own plane, high above the causal chains of psychology and neuroscience. If we ever could trace all the neurocomputational steps from perception through reasoning and emotion to {148} behavior, the only thing left missing by the lack of a theory of sentience would be an understanding of sentience itself.
But saying that we have no scientific explanation of sentience is not the same as saying that sentience does not exist at all. I am as certain that I am sentient as I am certain of anything, and I bet you feel the same. Though I concede that my curiosity about sentience may never be satisfied, I refuse to believe that I am just confused when I think I am sentient at all! (Dennett's analogy of unexplained wetness is not decisive: wetness is itself a subjective feeling, so the observer's dissatisfaction is just the problem of sentience all over again.) And we cannot banish sentience from our discourse or reduce it to information access, because moral reasoning depends on it. The concept of sentience underlies our certainty that torture is wrong and that disabling a robot is the destruction of property but disabling a person is murder. It is the reason that the death of a loved one does not impart to us just self-pity at our loss but the uncomprehending pain of knowing that the person's thoughts and pleasures have vanished forever.
If you bear with me to the end of the book, you will learn my own hunch about the mystery of sentience. But the mystery remains a mystery, a topic not for science but for ethics, for late-night dorm-room bull sessions, and, of course, for one other realm:
On a microscopic piece of sand that floats through space is a fragment of a man's life. Left to rust is the place he lived in and the machines he used. Without use, they will disintegrate from the wind and the sand and the years that act upon them; all of Mr. Corry's machines — including the one made in his image, kept alive by love, but now obsolete ... in the Twilight Zone.
<< | {149} | >> |
Somewhere beyond the edge of our solar system, hurtling into interstellar space, is a phonograph and a golden record with hieroglyphic instructions on the sleeve. They are attached to the Voyager 2 space probe, launched in 1977 to transmit photographs and data back to us from the outer planets in our solar system. Now that it has flown by Neptune and its thrilling scientific mission is over, it serves as an interplanetary calling card from us to any spacefaring extraterrestrial that might snag it.
The astronomer Carl Sagan was the record producer, and he chose sights and sounds that captured our species and its accomplishments. He included greetings in fifty-five human languages and one “whale language,” a twelve-minute sound essay made up of a baby's cry, a kiss, and an EEG record of the meditations of a woman in love, and ninety minutes of music sampled from the world's idioms: Mexican mariachi, Peruvian panpipes, Indian raga, a Navajo night chant, a Pygmy girl's initiation song, a Japanese shakuhachi piece, Bach, Beethoven, Mozart, Stravinsky, Louis Armstrong, and Chuck Berry singing “Johnny B. Goode.”
The disk also bore a message of peace from our species to the cosmos. In an unintended bit of black comedy, the message was recited by the secretary-general of the United Nations at the time, Kurt Waldheim. Years later historians discovered that Waldheim had spent World War II as an intelligence officer in a German army unit that carried out brutal reprisals against Balkan partisans and deported the Jewish population of Salonika to Nazi death camps. It is too late to call Voyager back, and this mordant joke on us will circle the center of the Milky Way galaxy forever. {150}
The Voyager phonograph record, in any case, was a fine idea, if only because of the questions it raised. Are we alone? If not, do alien life forms have the intelligence and the desire to develop space travel? If so, would they interpret the sounds and images as we intended, or would they hear the voice as the whine of a modem and see the line drawings of people on the cover as showing a race of wire frames? If they understood it, how would they respond? By ignoring us? By coming over to enslave us or eat us? Or by starting an interplanetary dialogue? In a Saturday Night Live skit, the long-awaited reply from outer space was “Send more Chuck Berry.”
These are not just questions for late-night dorm-room bull sessions. In the early 1990s NASA allocated a hundred million dollars to a ten-year Search for Extraterrestrial Intelligence (SETI). Scientists were to listen with radio antennas for signals that could have come only from intelligent extraterrestrials. Predictably, some congressmen objected. One said it was a waste of federal money “to look for little green men with misshapen heads.” To minimize the “giggle factor,” NASA renamed the project the High-Resolution Microwave Survey, but it was too late to save the project from the congressional ax. Currently it is funded by donations from private sources, including Steven Spielberg.
The opposition to SETI came not just from the know-nothings but from some of the world's most distinguished biologists. Why did they join the discussion? SETI depends on assumptions from evolutionary theory, not just astronomy — in particular, about the evolution of intelligence. Is intelligence inevitable, or was it a fluke? At a famous conference in 1961, the astronomer and SETI enthusiast Frank Drake noted that the number of extraterrestrial civilizations that might contact us can be estimated with the following formula:
(1) (The number of stars in the galaxy) ×
(2) (The fraction of stars with planets) ×
(3) (The number of planets per solar system with a life-supporting environment) ×
(4) (The fraction of these planets on which life actually appears) ×
(5) (The fraction of life-bearing planets on which intelligence emerges) × {151}
(6) (The fraction of intelligent societies willing and able to communicate with other worlds) ×
(7) (The longevity of each technology in the communicative state).
The astronomers, physicists, and engineers at the conference felt unable to estimate factor (6) without a sociologist or a historian. But they felt confident in estimating factor (5), the proportion of life-bearing planets on which intelligence emerges. They decided it was one hundred percent.
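Because the formula is a plain product, the answer scales in direct proportion to every factor, including the one the conference set to one hundred percent. The Python sketch below makes that dependence concrete. It is only an illustration: the function name is hypothetical, and every number in it is an invented placeholder, not an estimate from the conference or from any other source.

```python
from math import prod

def drake_estimate(factors):
    """Multiply the seven factors of the formula, in the order listed in the text."""
    if len(factors) != 7:
        raise ValueError("expected exactly seven factors")
    return prod(factors)

# Placeholder values for factors (1) through (7).
# They are assumptions made up for this sketch, not data.
base = [1e11, 0.5, 1.0, 0.1, 1.0, 0.1, 1e-6]

# Factor (5) is the fraction of life-bearing planets on which intelligence emerges.
certain = drake_estimate(base)                             # factor (5) = 1.0
skeptical = drake_estimate(base[:4] + [1e-6] + base[5:])   # factor (5) = one in a million

# The two estimates differ by exactly the ratio of the two guesses for factor (5).
print(certain / skeptical)
```

Whatever values are chosen for the other six factors, a guess about factor (5) that is off by a factor of a million leaves the final count off by a factor of a million as well.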
Finding intelligent life elsewhere in the cosmos would be the most exciting discovery in human history. So why are the biologists being such grinches? It is because they sense that the SETI enthusiasts are reasoning from a pre-scientific folk belief. Centuries-old religious dogma, the Victorian ideal of progress, and modern secular humanism all lead people to misunderstand evolution as an internal yearning or unfolding toward greater complexity, climaxing in the appearance of man. The pressure builds up, and intelligence emerges like popcorn in a pan.
The religious doctrine was called the Great Chain of Being — amoeba to monkey to man — and even today many scientists thoughtlessly use words like “higher” and “lower” life forms and the evolutionary “scale” and “ladder.” The parade of primates, from gangly-armed gibbon through stoop-shouldered caveman to upright modern man, has become an icon of pop culture, and we all understand what someone means when she says she turned down a date because the guy is not very evolved. In science fiction like H. G. Wells’ The Time Machine, episodes of Star Trek, and stories from Boy's Life, the momentum is extrapolated to our descendants, shown as bald, varicose-veined, bulbous-brained, spindly-bodied homunculi. In The Planet of the Apes and other stories, after we have blown ourselves to smithereens or choked in our pollutants, apes or dolphins rise to the occasion and take on our mantle.
Drake expressed these assumptions in a letter to Science defending SETI against the eminent biologist Ernst Mayr. Mayr had noted that only one of the fifty million species on earth had developed civilizations, so the probability that life on a given planet would include an intelligent species might very well be small. Drake replied:
The first species to develop intelligent civilizations will discover that it is the only such species. Should it be surprised? Someone must be first, and being first says nothing about how many other species had or have the potential to evolve into intelligent civilizations, or may do so in the future. . . . Similarly, among many civilizations, one will be the first, and temporarily the only one, to develop electronic technology. How else {152} could it be? The evidence does suggest that planetary systems need to exist in sufficiently benign circumstances for a few billion years for a technology-using species to evolve.
To see why this thinking runs so afoul of the modern theory of evolution, consider an analogy. The human brain is an exquisitely complex organ that evolved only once. The elephant's trunk, which can stack logs, uproot trees, pick up a dime, remove thorns, powder the elephant with dust, siphon water, serve as a snorkel, and scribble with a pencil, is another complex organ that evolved only once. The brain and the trunk are products of the same evolutionary force, natural selection. Imagine an astronomer on the Planet of the Elephants defending SETT, the Search for Extraterrestrial Trunks:
The first species to develop a trunk will discover that it is the only such species. Should it be surprised? Someone must be first, and being first says nothing about how many other species had or have the potential to evolve trunks, or may do so in the future. . . . Similarly, among many trunk-bearing species, one will be the first, and temporarily the only one, to powder itself with dust. The evidence does suggest that planetary systems need to exist in sufficiently benign circumstances for a few billion years for a trunk-using species to evolve. . . .
This reasoning strikes us as cockeyed because the elephant is assuming that evolution did not just produce the trunk in a species on this planet but was striving to produce it in some lucky species, each waiting and hoping. The elephant is merely “the first,” and “temporarily” the only one; other species have “the potential,” though a few billion years will have to pass for the potential to be realized. Of course, we are not chauvinistic about trunks, so we can see that trunks evolved, but not because a rising tide made it inevitable. Thanks to fortuitous preconditions in the elephants’ ancestors (large size and certain kinds of nostrils and lips), certain selective forces (the problems posed by lifting and lowering a huge head), and luck, the trunk evolved as a workable solution for those organisms at that time. Other animals did not and will not evolve trunks because in their bodies and circumstances it is of no great help. Could it happen again, here or elsewhere? It could, but the proportion of planets on which the necessary hand has been dealt in a given period of time is presumably small. Certainly it is less than one hundred percent.
We are chauvinistic about our brains, thinking them to be the goal of {153} evolution. And that makes no sense, for reasons articulated over the years by Stephen Jay Gould. First, natural selection does nothing even close to striving for intelligence. The process is driven by differences in the survival and reproduction rates of replicating organisms in a particular environment. Over time the organisms acquire designs that adapt them for survival and reproduction in that environment, period; nothing pulls them in any direction other than success there and then. When an organism moves to a new environment, its lineage adapts accordingly, but the organisms who stayed behind in the original environment can prosper unchanged. Life is a densely branching bush, not a scale or a ladder, and living organisms are at the tips of the branches, not on lower rungs. Every organism alive today has had the same amount of time to evolve since the origin of life — the amoeba, the platypus, the rhesus macaque, and, yes, Larry on the answering machine asking for another date.
But, a SETI fan might ask, isn't it true that animals become more complex over time? And wouldn't intelligence be the culmination? In many lineages, of course, animals have become more complex. Life began simple, so the complexity of the most complex creature alive on earth at any time has to increase over the eons. But in many lineages they have not. The organisms reach an optimum and stay put, often for hundreds of millions of years. And those that do become more complex don't always become smarter. They become bigger, or faster, or more poisonous, or more fecund, or more sensitive to smells and sounds, or able to fly higher and farther, or better at building nests or dams — whatever works for them. Evolution is about ends, not means; becoming smart is just one option.
Still, isn't it inevitable that many organisms would take the route to intelligence? Often different lineages converge on a solution, like the forty different groups of animals that evolved complex designs for eyes. Presumably you can't be too rich, too thin, or too smart. Why wouldn't humanlike intelligence be a solution that many organisms, on this planet and elsewhere, might converge on?
Evolution could indeed have converged on humanlike intelligence several times, and perhaps that point could be developed to justify SETI. But in calculating the odds, it is not enough to think about how great it is to be smart. In evolutionary theory, that kind of reasoning merits the accusation that conservatives are always hurling at liberals: they specify a benefit but neglect to factor in the costs. Organisms don't evolve toward {154} every imaginable advantage. If they did, every creature would be faster than a speeding bullet, more powerful than a locomotive, and able to leap tall buildings in a single bound. An organism that devotes some of its matter and energy to one organ must take it away from another. It must have thinner bones or less muscle or fewer eggs. Organs evolve only when their benefits outweigh their costs.
Do you have a Personal Digital Assistant, like the Apple Newton? These are the hand-held devices that recognize handwriting, store phone numbers, edit text, send faxes, keep schedules, and many other feats. They are marvels of engineering and can organize a busy life. But I don't have one, though I am a gadget-lover. Whenever I am tempted to buy a PDA, four things dissuade me. First, they are bulky. Second, they need batteries. Third, they take time to learn to use. Fourth, their sophistication makes simple tasks, like looking up a phone number, slow and cumbersome. I get by with a notebook and a fountain pen.
The same disadvantages would face any creature pondering whether to evolve a humanlike brain. First, the brain is bulky. The female pelvis barely accommodates a baby's outsize head. That design compromise kills many women during childbirth and requires a pivoting gait that makes women biomechanically less efficient walkers than men. Also, a heavy head bobbing around on a neck makes us more vulnerable to fatal injuries in accidents such as falls. Second, the brain needs energy. Neural tissue is metabolically greedy; our brains take up only two percent of our body weight but consume twenty percent of our energy and nutrients. Third, brains take time to learn to use. We spend much of our lives either being children or caring for children. Fourth, simple tasks can be slow. My first graduate advisor was a mathematical psychologist who wanted to model the transmission of information in the brain by measuring reaction times to loud tones. Theoretically, the neuron-to-neuron transmission times should have added up to a few milliseconds. But there were seventy-five milliseconds unaccounted for between stimulus and response — “There's all this cogitation going on, and we just want him to push his finger down,” my advisor grumbled. Lower-tech animals can be much quicker; some insects can bite in less than a millisecond. Perhaps this answers the rhetorical question in the sporting equipment ad: The average man's IQ is 107. The average brown trout's IQ is 4. So why can't a man catch a brown trout?
Intelligence isn't for everyone, any more than a trunk is, and this should give SETI enthusiasts pause. But I am not arguing against the {155} search for extraterrestrial intelligence; my topic is terrestrial intelligence. The fallacy that intelligence is some exalted ambition of evolution is part of the same fallacy that treats it as a divine essence or wonder tissue or all-encompassing mathematical principle. The mind is an organ, a biological gadget. We have our minds because their design attains outcomes whose benefits outweighed the costs in the lives of Plio-Pleistocene African primates. To understand ourselves, we need to know the how, why, where, and when of this episode in history. They are the subject of this chapter.
One evolutionary biologist has made a prediction about extraterrestrial life — not to help us look for life on other planets, but to help us understand life on this planet. Richard Dawkins has ventured that life, anywhere it is found in the universe, will be a product of Darwinian natural selection. That may seem like the most overreaching prognosis ever made from an armchair, but in fact it is a straightforward consequence of the argument for the theory of natural selection. Natural selection is the only explanation we have of how complex life can evolve, putting aside the question of how it did evolve. If Dawkins is right, as I think he is, natural selection is indispensable to understanding the human mind. If it is the only explanation of the evolution of little green men, it certainly is the only explanation of the evolution of big brown and beige ones.
The theory of natural selection — like the other foundation of this book, the computational theory of mind — has an odd status in modern intellectual life. Within its home discipline, it is indispensable, explaining thousands of discoveries in a coherent framework and constantly inspiring new ones. But outside its home, it is misunderstood and reviled. As in Chapter 2, I want to spell out the case for this foundational idea: how it explains a key mystery that its alternatives cannot explain, how it has been verified in the lab and the field, and why some famous arguments against it are wrong.
Natural selection has a special place in science because it alone explains what makes life special. Life fascinates us because of its adaptive complexity or complex design. Living things are not just pretty bits of bric-a-brac, but do amazing things. They fly, or swim, or see, or digest {156} food, or catch prey, or manufacture honey or silk or wood or poison. These are rare accomplishments, beyond the means of puddles, rocks, clouds, and other nonliving things. We would call a heap of extraterrestrial matter “life” only if it achieved comparable feats.
Rare accomplishments come from special structures. Animals can see and rocks can't because animals have eyes, and eyes have precise arrangements of unusual materials capable of forming an image: a cornea that focuses light, a lens that adjusts the focus to the object's depth, an iris that opens and closes to let in the right amount of light, a sphere of transparent jelly that maintains the eye's shape, a retina at the focal plane of the lens, muscles that aim the eyes up-and-down, side-to-side, and in-and-out, rods and cones that transduce light into neural signals, and more, all exquisitely shaped and arranged. The odds are mind-bogglingly stacked against these structures’ being assembled out of raw materials by tornados, landslides, waterfalls, or the lightning bolt vaporizing swamp goo in the philosopher's thought experiment.
The eye has so many parts, arranged so precisely, that it appears to have been designed in advance with the goal of putting together something that sees. The same is true for our other organs. Our joints are lubricated to pivot smoothly, our teeth meet to shear and grind, our hearts pump blood — every organ seems to have been designed with a function in mind. One of the reasons God was invented was to be the mind that formed and executed life's plans. The laws of the world work forwards, not backwards: rain causes the ground to be wet; the ground's benefiting from being wet cannot cause the rain. What else but the plans of God could effect the teleology (goal-directedness) of life on earth?
Darwin showed what else. He identified a forward-causation physical process that mimics the paradoxical appearance of backward causation or teleology. The trick is replication. A replicator is something that can make a copy of itself, with most of its traits duplicated in the copy, including the ability to replicate in turn. Consider two states of affairs, A and B. B can't cause A if A comes first. (Seeing well can't cause an eye to have a clear lens.) {157}
But let's say that A causes B, and B in turn causes the protagonist of A to make a copy of itself — let's call it AA. AA looks just like A, so it appears as if B has caused A. But it hasn't; it has only caused AA, the copy of A. Suppose there are three animals, two with a cloudy lens, one with a clear lens. Having a clear lens (A) causes an eye to see well (B); seeing well causes the animal to reproduce by helping it avoid predators and find mates. The offspring (AA) have clear lenses and can see well, too. It looks as if the offspring have eyes so that they can see well (bad, teleological, backward causation), but that's an illusion. The offspring have eyes because their parents' eyes did see well (good, ordinary, forward causation). Their eyes look like their parents’ eyes, so it's easy to mistake what happened for backward causation.
There's more to an eye than a clear lens, but the special power of a replicator is that its copies can replicate, too. Consider what happens when the clear-lensed daughter of our hypothetical animal reproduces. Some of her offspring will have rounder eyeballs than others, and the round-eyed versions see better because the images are focused from center to edge. Better vision leads to better reproduction, and the next generation has both clear lenses and round eyeballs. They, too, are replicators, and the sharper-visioned of their offspring are more likely to leave a new generation with sharp vision, and so on. In every generation, the traits that lead to good vision are disproportionately passed down to the next generation. That is why a late generation of replicators will have traits that seem to have been designed by an intelligent engineer (see figure on page 158).
I have introduced Darwin's theory in an unorthodox way that highlights its extraordinary contribution: explaining the appearance of design without a designer, using ordinary forward causation as it applies to replicators. The full story runs as follows. In the beginning was a replicator. {158}
This molecule or crystal was a product not of natural selection but of the laws of physics and chemistry. (If it were a product of selection, we would have an infinite regress.) Replicators are wont to multiply, and a single one multiplying unchecked would fill the universe with its great-great-great-. . .-great-grandcopies. But replicators use up materials to make their copies and energy to power the replication. The world is finite, so the replicators will compete for its resources. Because no copying process is one hundred percent perfect, errors will crop up, and not all of the daughters will be exact duplicates. Most of the copying errors will be changes for the worse, causing a less efficient uptake of energy and materials or a slower rate or lower probability of replication. But by dumb luck a few errors will be changes for the better, and the replicators bearing them will proliferate over the generations. Their descendants will accumulate any subsequent errors that are changes for the better, including ones that assemble protective covers and supports, manipulators, catalysts for useful chemical reactions, and other features of what we call bodies. The resulting replicator with its apparently well-engineered body is what we call an organism.
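The story in this paragraph can be run as a toy simulation. The Python sketch below is a minimal illustration, not a model taken from the biological literature: the function name, the cap on population size, the error rate, and the sizes of the good and bad copying errors are all assumptions chosen for readability.

```python
import random

def simulate(generations=400, capacity=1000, error_rate=0.05, seed=0):
    """Track the average replication efficiency of a population of replicators.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    population = [1.0] * capacity  # every replicator starts equally efficient
    history = []
    for _ in range(generations):
        # The world is finite: only `capacity` copies are made per generation,
        # and efficient replicators are proportionally more likely to be copied.
        parents = rng.choices(population, weights=population, k=capacity)
        offspring = []
        for efficiency in parents:
            if rng.random() < error_rate:
                # A copying error: nine times in ten a change for the worse,
                # one time in ten a change for the better.
                if rng.random() < 0.9:
                    efficiency *= 0.9
                else:
                    efficiency *= 1.1
            offspring.append(efficiency)
        population = offspring
        history.append(sum(population) / capacity)
    return history

history = simulate()
print(history[0], history[-1])  # average efficiency, first and last generation
```

Nothing in the loop looks ahead or aims at anything. If the run behaves as the argument predicts, the average efficiency should end higher than it began even though most errors are harmful, because the rare improvements are the ones that get copied most.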
Natural selection is not the only process that changes organisms over time. But it is the only process that seemingly designs organisms over time. Dawkins stuck out his neck about extraterrestrial evolution because he reviewed every alternative to selection that has been proposed in the history of biology and showed that they are impotent to explain the signature of life, complex design.
The folk theory that organisms respond to an urge to unfold into more complex and adaptive forms obviously won't do. The urge — and, more important, the power to achieve its ambitions — is a bit of magic that is left unexplained.
The two principles that have come to be associated with Darwin's predecessor Jean Baptiste Lamarck — use and disuse, and the inheritance {159} of acquired characteristics — are also not up to the job. The problem goes beyond the many demonstrations that Lamarck was wrong in fact. (For example, if acquired traits really could be inherited, several hundred generations of circumcision should have caused Jewish boys today to be born without foreskins.) The deeper problem is that the theory would not be able to explain adaptive complexity even if it had turned out to be correct. First, using an organ does not, by itself, make the organ function better. The photons passing through a lens do not somehow wash it clear, and using a machine does not improve it but wears it out. Now, many parts of organisms do adjust adaptively to use: exercised muscle bulks up, rubbed skin thickens, sunlit skin darkens, rewarded acts increase and punished ones decrease. But these responses are themselves part of the evolved design of the organism, and we need to explain how they arose: no law of physics or chemistry makes rubbed things thicken or illuminated surfaces darken. The inheritance of acquired characteristics is even worse, for most acquired characteristics are cuts, scrapes, scars, decay, weathering, and other assaults by the pitiless world, not improvements. And even if a blow did lead to an improvement, it is mysterious how the size and shape of the helpful wound could be read off the affected flesh and encoded back into DNA instructions in the sperm or egg.
Yet another failed theory is the one that invokes the macromutation: a mammoth copying error that begets a new kind of adapted organism in one fell swoop. The problem here is that the laws of probability astronomically militate against a large random copying error creating a complex functioning organ like the eye out of homogeneous flesh. Small random errors, in contrast, can make an organ a bit more like an eye, as in our example where an imaginable mutation might make a lens a tiny bit clearer or an eyeball a tiny bit rounder. Indeed, way before our scenario begins, a long sequence of small mutations must have accumulated to give the organism an eye at all. By looking at organisms with simpler eyes, Darwin reconstructed how that could have happened. A few mutations made a patch of skin cells light-sensitive, a few more made the underlying tissue opaque, others deepened it into a cup and then a spherical hollow. Subsequent mutations added a thin translucent cover, which subsequently was thickened into a lens, and so on. Each step offered a small improvement in vision. Each mutation was improbable, but not astronomically so. The entire sequence was not astronomically impossible because the mutations were not dealt all at once like a big gin {160} rummy hand; each beneficial mutation was added to a set of prior ones that had been selected over the eons.
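The difference between the gin rummy hand dealt all at once and the hand built up one card at a time is a matter of simple arithmetic. The Python sketch below works it through under loudly simplified assumptions: each of `steps` required changes is treated as an independent event with the same chance `p` of turning up in a given generation, and a change, once it turns up, is assumed to be kept by selection. The function names and the numbers are hypothetical, chosen only for illustration.

```python
def odds_all_at_once(p, steps):
    """Chance that every required change appears together in a single generation."""
    return p ** steps

def expected_wait_stepwise(p, steps):
    """Expected number of generations when each change is kept once it appears.

    Each step is a geometric wait with mean 1/p, and the steps follow in sequence.
    """
    return steps / p

p, steps = 1e-3, 20  # hypothetical values, for illustration only
print(odds_all_at_once(p, steps))        # odds of the single mammoth error
print(expected_wait_stepwise(p, steps))  # generations needed one step at a time
```

With these made-up numbers the all-at-once odds shrink exponentially with the number of steps, while the stepwise wait grows only in proportion to it, which is the sense in which each mutation can be improbable without the whole sequence being astronomically so.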
A fourth alternative is random genetic drift. Beneficial traits are beneficial only on average. Actual creatures suffer the slings and arrows of outrageous fortune. When the number of individuals in a generation is small enough, an advantageous trait can vanish if its bearers are unlucky, and a disadvantageous or neutral one can take over if its bearers are lucky. Genetic drift can, in principle, explain why a population has a simple trait, like being dark or light, or an inconsequential trait, like the sequence of DNA bases in a part of the chromosome that doesn't do anything. But because of its very randomness, random drift cannot explain the appearance of an improbable, useful trait like an ability to see or fly. The required organs need hundreds or thousands of parts to work, and the odds are astronomically stacked against the required genes accumulating by sheer chance.
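The claim that luck can override a real advantage when a generation is small, but not when it is large, can be checked with a toy simulation. The Python sketch below is a minimal illustration under stated assumptions: a fixed population size, half the population carrying the advantageous trait at the start, and a ten percent reproductive edge for carriers. The function name and all the numbers are hypothetical.

```python
import random

def fraction_lost(pop_size, advantage=0.1, trials=100, seed=0):
    """Fraction of trials in which an advantageous trait vanishes by bad luck."""
    rng = random.Random(seed)
    lost = 0
    for _ in range(trials):
        carriers = pop_size // 2  # half the population starts with the trait
        while 0 < carriers < pop_size:
            # Chance that any one offspring descends from a carrier,
            # weighted by the carriers' reproductive advantage.
            weight = carriers * (1 + advantage)
            p = weight / (weight + (pop_size - carriers))
            # Each member of the next generation is an independent draw.
            carriers = sum(rng.random() < p for _ in range(pop_size))
        if carriers == 0:
            lost += 1
    return lost / trials

small = fraction_lost(10)    # a generation of ten individuals
large = fraction_lost(200)   # a generation of two hundred
print(small, large)
```

The sketch covers only the first half of the argument, the fate of one simple trait. It says nothing about assembling an organ with hundreds of parts, which is where drift runs out of explanatory power.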
Dawkins’ argument about extraterrestrial life is a timeless claim about the logic of evolutionary theories, about the power of an explanans to cause the explanandum. And indeed his argument works against two subsequent challenges. One is a variant of Lamarckism called directed or adaptive mutation. Wouldn't it be nice if an organism could react to an environmental challenge with a slew of new mutations, and not wasteful, random ones, but mutations for traits that would allow it to cope? Of course it would be nice, and that's the problem — chemistry has no sense of niceness. The DNA inside the testes and ovaries cannot peer outside and considerately mutate to make fur when it's cold and fins when it's wet and claws when there are trees around, or to put a lens in front of the retina as opposed to between the toes or inside the pancreas. That is why a cornerstone of evolutionary theory — indeed, a cornerstone of the scientific worldview — is that mutations are indifferent overall to the benefits they confer on the organism. They cannot be adaptive in general, though of course a tiny few can be adaptive by chance. The periodic announcements of discoveries of “adaptive mutations” inevitably turn out to be laboratory curiosities or artifacts. No mechanism short of a guardian angel can guide mutations to respond to organisms’ needs in general, there being billions of kinds of organisms, each with thousands of needs.
The other challenge comes from the fans of a new field called the theory of complexity. The theory looks for mathematical principles of order underlying many complex systems: galaxies, crystals, weather {161} systems, cells, organisms, brains, ecosystems, societies, and so on. Dozens of new books have applied these ideas to topics such as AIDS, urban decay, the Bosnian war, and, of course, the stock market. Stuart Kauffman, one of the movement's leaders, suggested that feats like self-organization, order, stability, and coherence may be an “innate property of some complex systems.” Evolution, he suggests, may be a “marriage of selection and self-organization.”
Complexity theory raises interesting issues. Natural selection presupposes that a replicator arose somehow, and complexity theory might help explain the “somehow.” Complexity theory might also pitch in to explain other assumptions. Each body has to hang together long enough to function rather than fly apart or melt into a puddle. And for evolution to happen at all, mutations have to change a body enough to make a difference in its functioning but not so much as to bring it to a chaotic crash. If there are abstract principles that govern whether a web of interacting parts (molecules, genes, cells) has such properties, natural selection would have to work within those principles, just as it works within other constraints of physics and mathematics like the Pythagorean theorem and the law of gravitation.
But many readers have gone much further and conclude that natural selection is now trivial or obsolete, or at best of unknown importance. (Incidentally, the pioneers of complexity theory themselves, such as Kauffman and Murray Gell-Mann, are appalled by that extrapolation.) This letter to the New York Times Book Review is a typical example:
Thanks to recent advances in nonlinear dynamics, nonequilibrium thermodynamics and other disciplines at the boundary between biology and physics, there is every reason to believe that the origin and evolution of life will eventually be placed on a firm scientific footing. As we approach the 21st century, those other two great 19th century prophets — Marx and Freud — have finally been deposed from their pedestals. It is high time we freed the evolutionary debate from the anachronistic and unscientific thrall of Darwin worship as well.
The letter-writer must have reasoned as follows: complexity has always been treated as a fingerprint of natural selection, but now it can be explained by complexity theory; therefore natural selection is obsolete. But the reasoning is based on a pun. The “complexity” that so impresses biologists is not just any old order or stability. Organisms are not just cohesive blobs or pretty spirals or orderly grids. They are {162} machines, and their “complexity” is functional, adaptive design: complexity in the service of accomplishing some interesting outcome. The digestive tract is not just patterned; it is patterned as a factory line for extracting nutrients from ingested tissues. No set of equations applicable to everything from galaxies to Bosnia can explain why teeth are found in the mouth rather than in the ear. And since organisms are collections of digestive tracts, eyes, and other systems organized to attain goals, general laws of complex systems will not suffice. Matter simply does not have an innate tendency to organize itself into broccoli, wombats, and ladybugs. Natural selection remains the only theory that explains how adaptive complexity, not just any old complexity, can arise, because it is the only nonmiraculous, forward-direction theory in which how well something works plays a causal role in how it came to be.
Because there are no alternatives, we would almost have to accept natural selection as the explanation of life on this planet even if there were no evidence for it. Thankfully, the evidence is overwhelming. I don't just mean evidence that life evolved (which is way beyond reasonable doubt, creationists notwithstanding), but that it evolved by natural selection. Darwin himself pointed to the power of selective breeding, a direct analogue of natural selection, in shaping organisms. For example, the differences among dogs — Chihuahuas, greyhounds, Scotties, Saint Bernards, shar-peis — come from selective breeding of wolves for only a few thousand years. In breeding stations, laboratories, and seed company greenhouses, artificial selection has produced catalogues of wonderful new organisms befitting Dr. Seuss.
Natural selection is also readily observable in the wild. In a classic example, the white peppered moth gave way in nineteenth-century Manchester to a dark mutant form after industrial soot covered the lichen on which the moth rested, making the white form conspicuous to birds. When air pollution laws lightened the lichen in the 1950s, the then-rare white form reasserted itself. There are many other examples, perhaps the most pleasing coming from the work of Peter and Rosemary Grant. Darwin was inspired to the theory of natural selection in part by the thirteen species of finches on the Galapagos islands. They clearly were related to a species on the South American mainland, but differed from them and {163} from one another. In particular, their beaks resembled different kinds of pliers: heavy-duty lineman's pliers, high-leverage diagonal pliers, straight needle-nose pliers, curved needle-nose pliers, and so on. Darwin eventually reasoned that one kind of bird was blown to the islands and then differentiated into the thirteen species because of the demands of different ways of life on different parts of the islands, such as stripping bark from trees to get at insects, probing cactus flowers, or cracking tough seeds. But he despaired of ever seeing natural selection happen in real time: “We see nothing of these slow changes in progress, until the hand of time has marked the lapse of ages.” The Grants painstakingly measured the size and toughness of the seeds in different parts of the Galapagos at different times of the year, the length of the finches’ beaks, the time they took to crack the seeds, the numbers and ages of the finches in different parts of the islands, and so on — every variable relevant to natural selection. Their measurements showed the beaks evolving to track changes in the availability of different kinds of seeds, a frame-by-frame analysis of the movie that Darwin could only imagine. 
Selection in action is even more dramatic among faster-breeding organisms, as the world is discovering to its peril in the case of pesticide-resistant insects, drug-resistant bacteria, and the AIDS virus in a single patient.
And two of the prerequisites of natural selection — enough variation and enough time — are there for the having. Populations of naturally living organisms maintain an enormous reservoir of genetic variation that can serve as the raw material for natural selection. And life has had more than three billion years to evolve on earth, complex life a billion years, according to a recent estimate. In The Ascent of Man, Jacob Bronowski wrote:
I remember as a young father tiptoeing to the cradle of my first daughter when she was four or five days old, and thinking, “These marvelous fingers, every joint so perfect, down to the fingernails. I could not have designed that detail in a million years.” But of course it is exactly a million years that it took me, a million years that it took mankind ... to reach its present stage of evolution.
Finally, two kinds of formal modeling have shown that natural selection can work. Mathematical proofs from population genetics show how genes combining according to Gregor Mendel's laws can change in frequency under the pressure of selection. These changes can occur impressively fast. If a mutant produces just 1 percent more offspring {164} than its rivals, it can increase its representation in a population from 0.1 percent to 99.9 percent in just over four thousand generations. A hypothetical mouse subjected to a selection pressure for increased size that is so weak it cannot be measured could nonetheless evolve to the size of an elephant in only twelve thousand generations.
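As a rough illustration of how a small advantage compounds, here is a minimal sketch of the simplest textbook case: a haploid (asexual) population with discrete generations, in which the mutant leaves 1 percent more offspring than its rivals. That simplification is an assumption made for brevity, not the Mendelian model behind the figure quoted above, and the function name and parameters are hypothetical. In this stripped-down model the climb from 0.1 percent to 99.9 percent takes on the order of fourteen hundred generations. Diploid models with dominance generally take longer, which is presumably why the figure in the text is larger.

```python
def generations_to_spread(start=0.001, target=0.999, advantage=0.01):
    """Count generations for a mutant to rise from `start` to `target` frequency.

    Assumes a haploid population with discrete generations, in which the
    mutant leaves (1 + advantage) offspring for every one left by a rival.
    """
    freq = start
    generations = 0
    while freq < target:
        # Mean offspring per individual, weighted by current frequencies.
        mean_fitness = freq * (1 + advantage) + (1 - freq)
        # The mutant's share of the next generation.
        freq = freq * (1 + advantage) / mean_fitness
        generations += 1
    return generations


if __name__ == "__main__":
    print(generations_to_spread())
```

Raising `advantage` or narrowing the gap between `start` and `target` shortens the count; the point of the sketch is only that the mutant's odds against its rivals grow geometrically, so even a 1 percent edge takes over in a geologically trivial span.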
More recently, computer simulations from the new field of Artificial Life have shown the power of natural selection to evolve organisms with complex adaptations. And what better demonstration than everyone's favorite example of a complex adaptation, the eye? The computer scientists Dan Nilsson and Susanne Pelger simulated a three-layer slab of virtual skin resembling a light-sensitive spot on a primitive organism. It was a simple sandwich made up of a layer of pigmented cells on the bottom, a layer of light-sensitive cells above it, and a layer of translucent cells forming a protective cover. The translucent cells could undergo random mutations of their refractive index: their ability to bend light, which in real life often corresponds to density. All the cells could undergo small mutations affecting their size and thickness. In the simulation, the cells in the slab were allowed to mutate randomly, and after each round of mutation the program calculated the spatial resolution of an image projected onto the slab by a nearby object. If a bout of mutations improved the resolution, the mutations were retained as the starting point for the next bout, as if the slab belonged to a lineage of organisms whose survival depended on reacting to looming predators. As in real evolution, there was no master plan or project scheduling. The organism could not put up with a less effective detector in the short run even if its patience would have been rewarded by the best conceivable detector in the long run. Every change it retained had to be an improvement.
Satisfyingly, the model evolved into a complex eye right on the computer screen. The slab indented and then deepened into a cup; the transparent layer thickened to fill the cup and bulged out to form a cornea. Inside the clear filling, a spherical lens with a higher refractive index emerged in just the right place, resembling in many subtle details the excellent optical design of a fish's eye. To estimate how long it would take in real time, rather than in computer time, for an eye to unfold, Nilsson and Pelger built in pessimistic assumptions about heritability, variation in the population, and the size of the selective advantage, and even forced the mutations to take place in only one part of the “eye” each generation. Nonetheless, the entire sequence in which flat skin became a complex eye took only four hundred thousand generations, a geological instant. {165}
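The mutate-and-retain loop described in the two preceding paragraphs can be sketched in a few lines. What follows is a toy stand-in, not Nilsson and Pelger's model: the three shape parameters, the `resolution` function, and the step size are all hypothetical, chosen only to show the rule that a random change is kept if, and only if, it improves performance immediately.

```python
import random


def resolution(design):
    """Hypothetical stand-in for image sharpness: higher is better.

    Peaks when every parameter reaches 1.0. The real study computed
    optical resolution from the geometry of the slab.
    """
    return -sum((1.0 - x) ** 2 for x in design)


def evolve(generations=5000, step=0.01, seed=0):
    """Blind hill climbing: mutate one parameter, keep it only if it helps."""
    rng = random.Random(seed)
    design = [0.0, 0.0, 0.0]  # flat slab: three made-up shape parameters
    score = resolution(design)
    history = [score]
    for _ in range(generations):
        # Propose a small random change to one parameter.
        candidate = list(design)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.uniform(-step, step)
        candidate_score = resolution(candidate)
        # No foresight: a change survives only if it is better right now.
        if candidate_score > score:
            design, score = candidate, candidate_score
        history.append(score)
    return design, history


if __name__ == "__main__":
    final_design, history = evolve()
    print(history[0], history[-1])
```

Because a worse candidate is never accepted, the recorded scores can only stay level or rise, which is the sense in which every retained change "had to be an improvement." The toy landscape has a single smooth peak; whether real optical designs offer a similarly unbroken uphill path is exactly what the original simulation set out to test.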
I have reviewed the modern case for the theory of natural selection because so many people are hostile to it. I don't mean fundamentalists from the Bible Belt, but professors at America's most distinguished universities from coast to coast. Time and again I have heard the objections: the theory is circular, what good is half an eye, how can structure arise from random mutation, there hasn't been enough time, Gould has disproved it, complexity just emerges, physics will make it obsolete someday.
People desperately want Darwinism to be wrong. Dennett's diagnosis in Darwin's Dangerous Idea is that natural selection implies there is no plan to the universe, including human nature. No doubt that is a reason, though another is that people who study the mind would rather not have to think about how it evolved because it would make a hash of cherished theories. Various scholars have claimed that the mind is innately equipped with fifty thousand concepts (including “carburetor” and “trombone”), that capacity limitations prevent the human brain from solving problems that are routinely solved by bees, that language is designed for beauty rather than for use, that tribal people kill their babies to protect the ecosystem from human overpopulation, that children harbor an unconscious wish to copulate with their parents, and that people could just as easily be conditioned to enjoy the thought of their spouse being unfaithful as to be upset by the thought. When advised that these claims are evolutionarily improbable, they attack the theory of evolution rather than rethinking the claim. The efforts that academics have made to impugn Darwinism are truly remarkable.
One claim is that reverse-engineering, the attempt to discover the functions of organs (which I am arguing should be done to the human mind), is a symptom of a disease called “adaptationism.” Apparently if you believe that any aspect of an organism has a function, you absolutely must believe that every aspect has a function, that monkeys are brown to hide amongst the coconuts. The geneticist Richard Lewontin, for example, has defined adaptationism as “that approach to evolutionary studies which assumes without further proof that all aspects of the morphology, physiology and behavior of organisms are adaptive optimal solutions to problems.” Needless to say, there is no such madman. A sane person can believe that a complex organ is an adaptation, that is, a product of natural selection, while also believing that features of an organism that are not {166} complex organs are a product of drift or a by-product of some other adaptation. Everyone acknowledges that the redness of blood was not selected for itself but is a by-product of selection for a molecule that carries oxygen, which just happens to be red. That does not imply that the ability of the eye to see could easily be a by-product of selection for something else.
There also are no benighted fools who fail to realize that animals carry baggage from their evolutionary ancestors. Readers young enough to have had sex education or old enough to be reading articles about the prostate may have noticed that the seminal ducts in men do not lead directly from the testicles to the penis but snake up into the body and pass over the ureter before coming back down. That is because the testes of our reptilian ancestors were inside their bodies. The bodies of mammals are too hot for the production of sperm, so the testes gradually descended into a scrotum. Like a gardener who snags a hose around a tree, natural selection did not have the foresight to plan the shortest route. Again, that does not mean that the entire eye could very well be useless phylogenetic baggage.
Similarly, because adaptationists believe that the laws of physics are not enough to explain the design of animals, they are also imagined to be prohibited from ever appealing to the laws of physics to explain anything. A Darwin critic once defiantly asked me, “Why has no animal evolved the ability to disappear and instantly reappear elsewhere, or to turn into King Kong at will (great for frightening predators)?” I think it is fair to say that “not being able to turn into King Kong at will” and “being able to see” call for different kinds of explanations.
Another accusation is that natural selection is a sterile exercise in after-the-fact storytelling. But if that were true, the history of biology would be a quagmire of effete speculation, with progress having to wait for today's enlightened anti-adaptationists. Quite the opposite has happened. Mayr, the author of a definitive history of biology, wrote,
The adaptationist question, “What is the function of a given structure or organ?” has been for centuries the basis of every advance in physiology. If it had not been for the adaptationist program, we probably would still not yet know the functions of thymus, spleen, pituitary, and pineal. Harvey's question “Why are there valves in the veins?” was a major stepping stone in his discovery of the circulation of blood.
From the shape of an organism's body to the shape of its protein {167} molecules, everything we have learned in biology has come from an understanding, implicit or explicit, that the organized complexity of an organism is in the service of its survival and reproduction. This includes what we have learned about the nonadaptive by-products, because they can be found only in the course of a search for the adaptations. It is the bald claim that a feature is a lucky product of drift or of some poorly understood dynamic that is untestable and post hoc.
Often I have heard it said that animals are not well engineered after all. Natural selection is hobbled by shortsightedness, the dead hand of the past, and crippling constraints on what kinds of structures are biologically and physically possible. Unlike a human engineer, selection is incapable of good design. Animals are clunking jalopies saddled with ancestral junk and occasionally blunder into barely serviceable solutions.
People are so eager to believe this claim that they seldom think it through or check the facts. Where do we find this miraculous human engineer who is not constrained by availability of parts, manufacturing practicality, and the laws of physics? Of course, natural selection does not have the foresight of engineers, but that cuts both ways: it does not have their mental blocks, impoverished imagination, or conformity to bourgeois sensibilities and ruling-class interests, either. Guided only by what works, selection can home in on brilliant, creative solutions. For millennia, biologists have discovered to their astonishment and delight the ingenious contrivances of the living world: the biomechanical perfection of cheetahs, the infrared pinhole cameras of snakes, the sonar of bats, the superglue of barnacles, the steel-strong silk of spiders, the dozens of grips of the human hand, the DNA repair machinery in all complex organisms. After all, entropy and more malevolent forces like predators and parasites are constantly gnawing at an organism's right to life and do not forgive slapdash engineering.
And many of the examples of bad design in the animal kingdom turn out to be old spouses’ tales. Take the remark in a book by a famous cognitive psychologist that natural selection has been powerless to eliminate the wings of any bird, which is why penguins are stuck with wings even though they cannot fly. Wrong twice. The moa had no trace of a wing, and penguins do use their wings to fly — under water. Michael French makes the point in his engineering textbook using a more famous example:
It is an old joke that a camel is a horse designed by a committee, a joke which does grave injustice to a splendid creature and altogether too {168} much honour to the creative power of committees. For a camel is no chimera, no odd collection of bits, but an elegant design of the tightest unity. So far as we can judge, every part is contrived to suit the difficult role of the whole, a large herbivorous animal to live in harsh climates with much soft going, sparse vegetation and very sparse water. The specification for a camel, if it were ever written down, would be a tough one in terms of range, fuel economy and adaptation to difficult terrains and extreme temperatures, and we must not be surprised that the design that meets it appears extreme. Nevertheless, every feature of the camel is of a piece: the large feet to diffuse load, the knobbly knees that derive from some of the design principles of Chapter 7 [bearings and pivots], the hump for storing food and the characteristic profile of the lips have a congruity that derives from function and invests the whole creation with a feeling of style and a certain bizarre elegance, borne out by the beautiful rhythms of its action at a gallop.
Obviously, evolution is constrained by the legacies of ancestors and the kinds of machinery that can be grown out of protein. Birds could not have evolved propellers, even if that had been advantageous. But many claims of biological constraints are howlers. One cognitive scientist has opined that “many properties of organisms, like symmetry, for example, do not really have anything to do with specific selection but just with the ways in which things can exist in the physical world.” In fact, most things that exist in the physical world are not symmetrical, for obvious reasons of probability: among all the possible arrangements of a volume of matter, only a tiny fraction are symmetrical. Even in the living world, the molecules of life are asymmetrical, as are livers, hearts, stomachs, flounders, snails, lobsters, oak trees, and so on. Symmetry has everything to do with selection. Organisms that move in straight lines have bilaterally symmetrical external forms because otherwise they would go in circles. Symmetry is so improbable and difficult to achieve that any disease or defect can disrupt it, and many animals size up the health of prospective mates by checking for minute asymmetries.
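The probability argument can be checked by brute force in a deliberately tiny setting. As an assumption for illustration only, let an "arrangement of matter" be an n-by-n grid of black or white cells, and count how many grids equal their own left-right mirror image. The function names below are hypothetical.

```python
from itertools import product


def mirror_symmetric_fraction(n):
    """Fraction of n-by-n black/white grids equal to their left-right mirror image."""
    symmetric = 0
    total = 0
    for cells in product((0, 1), repeat=n * n):
        rows = [cells[r * n:(r + 1) * n] for r in range(n)]
        total += 1
        # A grid is mirror-symmetric when every row reads the same reversed.
        if all(row == row[::-1] for row in rows):
            symmetric += 1
    return symmetric / total


def predicted_fraction(n):
    """Closed form: each row has n // 2 cells forced to copy their mirror partner."""
    return 2.0 ** -(n * (n // 2))


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, mirror_symmetric_fraction(n), predicted_fraction(n))
```

Enumeration is only feasible for very small grids, but the closed form shows the trend: the symmetric fraction shrinks exponentially with the number of cells, so for anything the size of an animal it is vanishingly small.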
Gould has emphasized that natural selection has only limited freedom to alter basic body plans. Much of the plumbing, wiring, and architecture of the vertebrates, for example, has been unchanged for hundreds of millions of years. Presumably they come from embryological recipes that cannot easily be tinkered with. But the vertebrate body plan accommodates eels, cows, hummingbirds, aardvarks, ostriches, toads, gerbils, seahorses, giraffes, and blue whales. The similarities are important, {169} but the differences are important, too! Developmental constraints only rule out broad classes of options. They cannot, by themselves, force a functioning organ to come into being. An embryological constraint like “Thou shalt grow wings” is an absurdity. The vast majority of hunks of animal flesh do not meet the stringent engineering demands of powered flight, so it is infinitesimally unlikely that the creeping and bumping cells in the microscopic layers of the developing embryo are obliged to align themselves into bones, skin, muscles, and feathers with just the right architecture to get the bird aloft — unless, of course, the developmental program had been shaped to bring about that outcome by the history of successes and failures of the whole body.
Natural selection should not be pitted against developmental, genetic, or phylogenetic constraints, as if the more important one of them is, the less important the others are. Selection versus constraints is a phony dichotomy, as crippling to clear thinking as the dichotomy between innateness and learning. Selection can only select from alternatives that are growable as carbon-based living stuff, but in the absence of selection that stuff could just as easily grow into scar tissue, scum, tumors, warts, tissue cultures, and quivering amorphous protoplasm as into functioning organs. Thus selection and constraints are both important but are answers to different questions. The question “Why does this creature have such-and-such an organ?” by itself is meaningless. It can only be asked when followed by a compared-to-what phrase. Why do birds have wings (as opposed to propellers)? Because you can't grow a vertebrate with propellers. Why do birds have wings (as opposed to forelegs or hands or stumps)? Because selection favored ancestors of birds that could fly.
Another widespread misconception is that if an organ changed its function in the course of evolution, it did not evolve by natural selection. One discovery has been cited over and over in support of the misconception: the wings of insects were not originally used for locomotion. Like a friend-of-a-friend legend, that discovery has mutated in the retelling: wings evolved for something else but happened to be perfectly adapted for flight, and one day the insects just decided to fly with them; the evolution of insect wings refutes Darwin because they would have had to evolve gradually and half a wing is useless; the wings of birds were not originally used for locomotion (probably a misremembering of another fact, that the first feathers evolved not for flight but for insulation). All one has to do is say “the evolution of wings” and audiences will nod knowingly, completing the anti-adaptationist argument for themselves. {170} How can anyone say that any organ was selected for its current function? Maybe it evolved for something else and the animal is only using it for that function now, like the nose holding up spectacles and all that stuff about insect wings that everyone knows about (or was it bird wings?).
Here is what you find when you check the facts. Many organs that we see today have maintained their original function. The eye was always an eye, from light-sensitive spot to image-focusing eyeball. Others changed their function. That is not a new discovery. Darwin gave many examples, such as the pectoral fins of fishes becoming the forelimbs of horses, the flippers of whales, the wings of birds, the digging claws of moles, and the arms of humans. In Darwin's day the similarities were powerful evidence for the fact of evolution, and they still are. Darwin also cited changes in function to explain the problem of “the incipient stages of useful structures,” perennially popular among creationists. How could a complex organ gradually evolve when only the final form is usable? Most often the premise of unusability is just wrong. For example, partial eyes have partial sight, which is better than no sight at all. But sometimes the answer is that before an organ was selected to assume its current form, it was adapted for something else and then went through an intermediate stage in which it accomplished both. The delicate chain of middle-ear bones in mammals (hammer, anvil, stirrup) began as parts of the jaw hinge of reptiles. Reptiles often sense vibrations by lowering their jaws to the ground. Certain bones served both as jaw hinges and as vibration transmitters. That set the stage for the bones to specialize more and more as sound transmitters, causing them to shrink and move into their current shape and role. Darwin called the earlier forms “pre-adaptations,” though he stressed that evolution does not somehow anticipate next year's model.
There is nothing mysterious about the evolution of birds’ wings. Half a wing will not let you soar like an eagle, but it will let you glide or parachute from trees (as many living animals do), and it will let you leap or take off in bursts while running, like a chicken trying to escape a farmer. Paleontologists disagree about which intermediate stage is best supported by the fossil and aerodynamic evidence, but there is nothing here to give comfort to a creationist or a social scientist.
The theory of the evolution of insect wings proposed by Joel Kingsolver and Mimi Koehl, far from being a refutation of adaptationism, is one of its finest moments. Small cold-blooded animals like insects struggle to regulate their temperature. Their high ratio of surface area to volume makes them heat up and cool down quickly. (That is why there are {171} no bugs outside in cold months; winter is the best insecticide.) Perhaps the incipient wings of insects first evolved as adjustable solar panels, which soak up the sun's energy when it is colder out and dissipate heat when it's warmer. Using thermodynamic and aerodynamic analyses, Kingsolver and Koehl showed that proto-wings too small for flight are effective heat exchangers. The larger they grow, the more effective they become at heat regulation, though they reach a point of diminishing returns. That point is in the range of sizes in which the panels could serve as effective wings. Beyond that point, they become more and more useful for flying as they grow larger and larger, up to their present size. Natural selection could have pushed for bigger wings throughout the range from no wings to current wings, with a gradual change of function in the middle sizes.
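The shape of that argument, two payoffs that hand off to each other as size increases, can be made concrete with a toy calculation. The curves below are invented for illustration and are not Kingsolver and Koehl's measurements: `thermal_benefit` is a hypothetical saturating curve, `flight_benefit` a hypothetical curve that is zero for small panels, and size runs from 0 (no wing) to 1 (present-day wing).

```python
import math


def thermal_benefit(size):
    """Hypothetical heat-exchange payoff: rises quickly, then levels off."""
    return 1.0 - math.exp(-size / 0.15)


def flight_benefit(size):
    """Hypothetical aerodynamic payoff: nil for small panels, then growing."""
    return 5.0 * max(0.0, size - 0.4) ** 2


def marginal_gains(size, delta=0.01):
    """Return (thermal, flight) gains from growing the panel by `delta`."""
    thermal = thermal_benefit(size + delta) - thermal_benefit(size)
    flight = flight_benefit(size + delta) - flight_benefit(size)
    return thermal, flight


if __name__ == "__main__":
    for size in (0.1, 0.3, 0.5, 0.7, 0.9):
        thermal, flight = marginal_gains(size)
        reason = "heat regulation" if thermal > flight else "flight"
        print(f"size {size:.1f}: growing pays mainly for {reason}")
```

With curves of this general shape, a slightly larger panel is better than a slightly smaller one at every size, but the reason it is better shifts from heat regulation to flight somewhere in the middle of the range. The numbers carry no empirical weight; only the qualitative handoff does.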
So how did the work get garbled into the preposterous story that one day an ancient insect took off by flapping unmodified solar panels and the rest of them have been doing it ever since? Partly it is a misunderstanding of a term introduced by Gould, exaptation, which refers to the adaptation of an old organ to a new function (Darwin's “pre-adaptation”) or the adaptation of a non-organ (bits of bone or tissue) to an organ with a function. Many readers have interpreted it as a new theory of evolution that has replaced adaptation and natural selection. It's not. Once again, complex design is the reason. Occasionally a machine designed for a complicated, improbable task can be pressed into service to do something simpler. A book of cartoons called 101 Uses for a Dead Computer showed PCs being used as a paperweight, an aquarium, a boat anchor, and so on. The humor comes from the relegation of sophisticated technology to a humble function that cruder devices can fulfill. But there will never be a book of cartoons called 101 Uses for a Dead Paperweight showing one being used as a computer. And so it is with exaptation in the living world. On engineering grounds, the odds are against an organ designed for one purpose being usable out of the box for some other purpose, unless the new purpose is quite simple. (And even then the nervous system of the animal must often be adapted for it to find and keep the new use.) If the new function is at all difficult to accomplish, natural selection must have revamped and retrofitted the part considerably, as it did to give modern insects their wings. A housefly dodging a crazed human can decelerate from rapid flight, hover, turn in its own length, fly upside down, loop, roll, and land on the ceiling, all in less than a second.
As an article entitled “The Mechanical Design of Insect Wings” notes, {172} “Subtle details of engineering and design, which no man-made airfoil can match, reveal how insect wings are remarkably adapted to the acrobatics of flight.” The evolution of insect wings is an argument for natural selection, not against it. A change in selection pressure is not the same as no selection pressure.
Complex design lies at the heart of all these arguments, and that offers a final excuse to dismiss Darwin. Isn't the whole idea a bit squishy? Since no one knows the number of kinds of possible organisms, how can anyone say that an infinitesimal fraction of them have eyes? Perhaps the idea is circular: the things one calls “adaptively complex” are just the things that one believes couldn't have evolved any other way than by natural selection. As Noam Chomsky wrote,
So the thesis is that natural selection is the only physical explanation of design that fulfills a function. Taken literally, that cannot be true. Take my physical design, including the property that I have positive mass. That fulfills some function — namely, it keeps me from drifting into outer space. Plainly, it has a physical explanation which has nothing to do with natural selection. The same is true of less trivial properties, which you can construct at will. So you can't mean what you say literally. I find it hard to impose an interpretation that doesn't turn it into the tautology that where systems have been selected to satisfy some function, then the process is selection.
Claims about functional design, because they cannot be stated in exact numbers, do leave an opening for a skeptic, but a little thought about the magnitudes involved closes it. Selection is not invoked to explain mere usefulness; it's invoked to explain improbable usefulness. The mass that keeps Chomsky from floating into outer space is not an improbable condition, no matter how you measure the probabilities. “Less trivial properties” — to pick an example at random, the vertebrate eye — are improbable conditions, no matter how you measure the probabilities. Take a dip net and scoop up objects from the solar system; go back to life on the planet a billion years ago and sample the organisms; take a collection of molecules and calculate all their physically possible configurations; divide the human body into a grid of one-inch cubes. Calculate the proportion of samples that have positive mass. Now calculate the proportion of samples that can form an optical image. There will be a statistically significant difference in the proportions, and it needs to be explained. {173}
At this point the critic can say that the criterion — seeing versus not seeing — is set a posteriori, after we know what animals can do, so the probability estimates are meaningless. They are like the infinitesimal probability that I would have been dealt whatever poker hand I happened to have been dealt. Most hunks of matter cannot see, but then most hunks of matter cannot flern either, where I hereby define flern as the ability to have the exact size and shape and composition of the rock I just picked up.
Recently I visited an exhibition on spiders at the Smithsonian. As I marveled at the Swiss-watch precision of the joints, the sewing-machine motions by which it drew silk from its spinnerets, the beauty and cunning of the web, I thought to myself, “How could anyone see this and not believe in natural selection!” At that moment a woman standing next to me exclaimed, “How could anyone see this and not believe in God!” We agreed a priori on the facts that need to be explained, though we disagreed about how to explain them. Well before Darwin, theologians such as William Paley pointed to the engineering marvels of nature as proof of the existence of God. Darwin did not invent the facts to be explained, only the explanation.
But what, exactly, are we all so impressed by? Everyone might agree that the Orion constellation looks like a big guy with a belt, but that does not mean we need a special explanation of why stars align themselves into guys with belts. But the intuition that eyes and spiders show “design” and that rocks and Orion don't can be unpacked into explicit criteria. There has to be a heterogeneous structure: the parts or aspects of an object are unpredictably different from one another. And there has to be a unity of function: the different parts are organized to cause the system to achieve some special effect — special because it is improbable for objects lacking that structure, and special because it benefits someone or something. If you can't state the function more economically than you can describe the structure, you don't have design. A lens is different from a diaphragm, which in turn is different from a photopigment, and no unguided physical process would deposit the three in the same object, let alone align them perfectly. But they do have something in common — all are needed for high-fidelity image formation — and that makes sense of why they are found together in an eye. For the flerning rock, in contrast, describing the structure and stating the function are one and the same. The notion of function adds nothing. {174}
And most important, attributing adaptive complexity to natural selection is not just a recognition of design excellence, like the expensive appliances in the Museum of Modern Art. Natural selection is a falsifiable hypothesis about the origin of design and imposes onerous empirical requirements. Remember how it works: from competition among replicators. Anything that showed signs of design but did not come from a long line of replicators could not be explained by — in fact, would refute — the theory of natural selection: natural species that lacked reproductive organs, insects growing like crystals out of rocks, television sets on the moon, eyes spewing out of vents on the ocean floor, caves shaped like hotel rooms down to the details of hangers and ice buckets. Moreover, the beneficial functions all have to be in the ultimate service of reproduction. An organ can be designed for seeing or eating or mating or nursing, but it had better not be designed for the beauty of nature, the harmony of the ecosystem, or instant self-destruction. Finally, the beneficiary of the function has to be the replicator. Darwin pointed out that if horses had evolved saddles, his theory would immediately be falsified.
Rumors and folklore notwithstanding, natural selection remains the heart of explanation in biology. Organisms can be understood only as interactions among adaptations, by-products of adaptations, and noise. The by-products and noise don't rule out the adaptations, nor do they leave us staring blankly, unable to tell them apart. It is exactly what makes organisms so fascinating — their improbable adaptive design — that calls for reverse-engineering them in the light of natural selection. The by-products and noise, because they are defined negatively as un-adaptations, also can be discovered only via reverse-engineering.
This is no less true for human intelligence. The major faculties of the mind, with their feats no robot can duplicate, show the handiwork of selection. That does not mean that every aspect of the mind is adaptive. From low-level features like the sluggishness and noisiness of neurons, to momentous activities like art, music, religion, and dreams, we should expect to find activities of the mind that are not adaptations in the biologists’ sense. But it does mean that our understanding of how the mind works will be woefully incomplete or downright wrong unless it meshes with our understanding of how the mind evolved. That is the topic of the rest of the chapter. {175}
Why did brains evolve to start with? The answer lies in the value of information, which brains have been designed to process.
Every time you buy a newspaper, you are paying for information. Economic theorists have explained why you should: information confers a benefit that is worth paying for. Life is a choice among gambles. One turns left or right at the fork in the road, stays with Rick or leaves with Victor, knowing that neither choice guarantees fortune or happiness; the best one can do is play the odds. Stripped to its essentials, every decision in life amounts to choosing which lottery ticket to buy. Say a ticket costs $1.00 and offers a one-in-four chance of winning $10.00. On average, you will net $1.50 per play ($10.00 divided by 4 equals $2.50, minus $1.00 for the ticket). The other ticket costs $1.00 and offers a one-in-five chance of winning $12.00. On average, you will net $1.40 per play. The two kinds of tickets come in equal numbers, and neither has the odds or winnings marked on it. How much should you pay for someone to tell you which is which? You should pay up to four cents. With no information, you would have to choose at random, and you could expect to make $1.45 on average ($1.50 half the time, $1.40 half the time). If you knew which had the better average payoff, you would make an average of $1.50 each play, so even if you paid four cents you would be ahead by one cent each play.
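The arithmetic of the two lottery tickets can be checked in a few lines. What follows is only a sketch of the calculation in the paragraph above; the function and variable names are illustrative and are not part of the original argument. Exact fractions are used so that the cents come out without rounding error.

```python
from fractions import Fraction


def expected_net(price, win_probability, prize):
    """Average net gain per play: chance of winning times the prize, minus the price."""
    return win_probability * prize - price


# The two tickets described in the text, in dollars.
ticket_a = expected_net(Fraction(1), Fraction(1, 4), Fraction(10))  # 3/2, i.e. $1.50
ticket_b = expected_net(Fraction(1), Fraction(1, 5), Fraction(12))  # 7/5, i.e. $1.40

# With no information you pick at random between equally common tickets.
uninformed = (ticket_a + ticket_b) / 2

# With information you always pick the ticket with the better average payoff.
informed = max(ticket_a, ticket_b)

# The most the information can be worth per play.
value_of_information = informed - uninformed

print(float(uninformed), float(informed), float(value_of_information))
```

The break-even price of the information comes out at five cents per play, so any price below that, such as the four cents in the example, still leaves the buyer ahead.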
Most organisms don't buy lottery tickets, but they all choose between gambles every time their bodies can move in more than one way. They should be willing to “pay” for information — in tissue, energy, and time — if the cost is lower than the expected payoff in food, safety, mating opportunities, and other resources, all ultimately valuated in the expected number of surviving offspring. In multicellular animals the information is gathered and translated into profitable decisions by the nervous system.
Often, more information brings a greater reward and earns back its extra cost. If a treasure chest has been buried somewhere in your neighborhood, the single bit of information that locates it in the north or the south half is helpful, because it cuts your digging time in half. A second bit that told you which quadrant it was in would be even more useful, and so on. The more digits there are in the coordinates, the less time you will waste digging fruitlessly, so you should be willing to pay for more bits, up to {176} the point where you are so close that further subdivision would not be worth the cost. Similarly, if you were trying to crack a combination lock, every number you bought would cut down the number of possibilities to try, and could be worth its cost in the time saved. So very often more information is better, up to a point of diminishing returns, and that is why some lineages of animals have evolved more and more complex nervous systems.
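The point of diminishing returns in the treasure example can be made concrete with a toy calculation. This is a sketch under invented assumptions: the digging time for the whole neighborhood, the cost of a bit, and the rule "buy a bit only while it saves more time than it costs" are hypothetical and chosen purely for illustration.

```python
def digging_time(full_time, bits):
    """Time needed to dig the region that remains after each bit halves it."""
    return full_time / 2 ** bits


def bits_worth_buying(full_time, cost_per_bit):
    """Count the bits bought when a bit is purchased only while the digging
    time it saves exceeds its cost (both in the same hypothetical units)."""
    bits = 0
    while digging_time(full_time, bits) - digging_time(full_time, bits + 1) > cost_per_bit:
        bits += 1
    return bits


# Hypothetical numbers: 64 hours to dig up everything, each bit costs 1 hour.
for n in range(7):
    print(n, digging_time(64, n))
print(bits_worth_buying(64, 1))
```

Each successive bit saves half as much as the one before it (32 hours, then 16, then 8, and so on), so with these made-up numbers the buyer stops after the fifth bit, when the next one would save no more than it costs.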
Natural selection cannot directly endow an organism with information about its environment, or with the computational networks, demons, modules, faculties, representations, or mental organs that process the information. It can only select among genes. But genes build brains, and different genes build brains that process information in different ways. The evolution of information processing has to be accomplished at the nuts-and-bolts level by selection of genes that affect the brain-assembly process.
Many kinds of genes could be the targets of selection for better information processing. Altered genes could lead to different numbers of proliferative units along the walls of the ventricles (the cavities in the center of the brain), which beget the cortical neurons making up the gray matter. Other genes could allow the proliferative units to divide for different numbers of cycles, creating different numbers and kinds of cortical areas. Axons connecting the neurons can be re-routed by shifting the chemical trails and molecular guideposts that coax the axons in particular directions. Genes can change the molecular locks and keys that encourage neurons to connect with other ones. As in the old joke about how to carve a statue of an elephant (remove all the bits that don't look like an elephant), neural circuits can be sculpted by programming certain cells and synapses to commit suicide on cue. Neurons can become active at different points in embryogenesis, and their firing patterns, both spontaneous and programmed, can be interpreted downstream as information about how to wire together. Many of these processes interact in cascades. For example, increasing the size of one area allows it to compete better for real estate downstream. Natural selection does not care how baroque the brain-assembly process is, or how ugly the resulting brain. Modifications are evaluated strictly on how well the brain's algorithms work in guiding the perception, thought, and action of the whole animal. By these processes, natural selection can build a better and better functioning brain.
But could the selection of random variants really improve the design of a nervous system? Or would the variants crash it, like a corrupted byte {177} in a computer program, and the selection merely preserve the systems that do not crash? A new field of computer science called genetic algorithms has shown that Darwinian selection can create increasingly intelligent software. Genetic algorithms are programs that are duplicated to make multiple copies, though with random mutations that make each one a tiny bit different. All the copies have a go at solving a problem, and the ones that do best are allowed to reproduce to furnish the copies for the next round. But first, parts of each program are randomly mutated again, and pairs of programs have sex: each is split in two, and the halves are exchanged. After many cycles of computation, selection, mutation, and reproduction, the surviving programs are often better than anything a human programmer could have designed.
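The cycle of copying, testing, selecting, mutating, and recombining can be sketched in a few dozen lines. The version below is a minimal illustration, not a reconstruction of any particular system: the "programs" are reduced to strings of bits, the "problem" is the toy task of having as many 1 bits as possible, and the population size, mutation rate, and the choice to carry the single best individual over unchanged are all assumptions made for the sketch.

```python
import random

GENOME_LENGTH = 32     # hypothetical size of each "program"
POPULATION_SIZE = 40   # hypothetical number of copies per round
MUTATION_RATE = 0.02   # hypothetical chance that any one bit flips


def fitness(genome):
    """Toy stand-in for how well a program solves the problem: count the 1 bits."""
    return sum(genome)


def mutate(genome, rng):
    """Return a copy in which each bit has been flipped with a small probability."""
    return [bit ^ 1 if rng.random() < MUTATION_RATE else bit for bit in genome]


def crossover(a, b, rng):
    """Split both parents at the same random point and exchange the halves."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def evolve(generations, seed=0):
    """Run the cycle and return the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(POPULATION_SIZE)]
    best_per_generation = []
    for _ in range(generations):
        # All the copies have a go at the problem; rank them by how well they do.
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        # Only the better half is allowed to reproduce.
        survivors = population[:POPULATION_SIZE // 2]
        # Assumption for this sketch: the single best copy is kept unchanged.
        children = [survivors[0]]
        while len(children) < POPULATION_SIZE:
            mother, father = rng.sample(survivors, 2)
            children.extend(crossover(mutate(mother, rng), mutate(father, rng), rng))
        population = children[:POPULATION_SIZE]
    return best_per_generation


if __name__ == "__main__":
    print(evolve(30))
```

Because the best copy is always carried over, the best score in this sketch can never fall from one generation to the next; whether it climbs quickly depends on the made-up parameters above.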
More apropos of how a mind can evolve, genetic algorithms have been applied to neural networks. A network might be given inputs from simulated sense organs and outputs to simulated legs and placed in a virtual environment with scattered “food” and many other networks competing for it. The ones that get the most food leave the most copies before the next round of mutation and selection. The mutations are random changes in the connection weights, sometimes followed by sexual recombination between networks (swapping some of their connection weights). During the early iterations, the “animals” — or, as they are sometimes called, “animats” — wander randomly over the terrain, occasionally bumping into a food source. But as they evolve they come to zip directly from food source to food source. Indeed, a population of networks that is allowed to evolve innate connection weights often does better than a single neural network that is allowed to learn them. That is especially true for networks with multiple hidden layers, which complex animals, especially humans, surely have. If a network can only learn, not evolve, the environmental teaching signal gets diluted as it is propagated backward to the hidden layers and can only nudge the connection weights up and down by minuscule amounts. But if a population of networks can evolve, even if they cannot learn, mutations and recombinations can reprogram the hidden layers directly, and can catapult the network into a combination of innate connections that is much closer to the optimum. Innate structure is selected for.
Evolution and learning can also go on simultaneously, with innate structure evolving in an animal that also learns. A population of networks can be equipped with a generic learning algorithm and can be allowed to evolve the innate parts, which the network designer would ordinarily {178} have built in by guesswork, tradition, or trial and error. The innate specs include how many units there are, how they are connected, what the initial connection weights are, and how much the weights should be nudged up and down on each learning episode. Simulated evolution gives the networks a big head start in their learning careers.
So evolution can guide learning in neural networks. Surprisingly, learning can guide evolution as well. Remember Darwin's discussion of “the incipient stages of useful structures” — the what-good-is-half-an-eye problem. The neural-network theorists Geoffrey Hinton and Steven Nowlan invented a fiendish example. Imagine an animal controlled by a neural network with twenty connections, each either excitatory (on) or neutral (off). But the network is utterly useless unless all twenty connections are correctly set. Not only is it no good to have half a network; it is no good to have ninety-five percent of one. In a population of animals whose connections are determined by random mutation, a fitter mutant, with all the right connections, arises only about once every million (2^20) genetically distinct organisms. Worse, the advantage is immediately lost if the animal reproduces sexually, because after having finally found the magic combination of weights, it swaps half of them away. In simulations of this scenario, no adapted network ever evolved.
But now consider a population of animals whose connections can come in three forms: innately on, innately off, or settable to on or off by learning. Mutations determine which of the three possibilities (on, off, learnable) a given connection has at the animal's birth. In an average animal in these simulations, about half the connections are learnable, the other half on or off. Learning works like this. Each animal, as it lives its life, tries out settings for the learnable connections at random until it hits upon the magic combination. In real life this might be figuring out how to catch prey or crack a nut; whatever it is, the animal senses its good fortune and retains those settings, ceasing the trial and error. From then on it enjoys a higher rate of reproduction. The earlier in life the animal acquires the right settings, the longer it will have to reproduce at the higher rate.
Now with these evolving learners, or learning evolvers, there is an advantage to having less than one hundred percent of the correct network. Take all the animals with ten innate connections. About one in a thousand (2^10) will have all ten correct. (Remember that only one in a million nonlearning animals had all twenty of its innate connections correct.) That well-endowed animal will have some probability of attaining the {179} completely correct network by learning the other ten connections; if it has a thousand occasions to learn, success is fairly likely. The successful animal will reproduce earlier, hence more often. And among its descendants, there are advantages to mutations that make more and more of the connections innately correct, because with more good connections to begin with, it takes less time to learn the rest, and the chances of going through life without having learned them get smaller. In Hinton and Nowlan's simulations, the networks thus evolved more and more innate connections. The connections never became completely innate, however. As more and more of the connections were fixed, the selection pressure to fix the remaining ones tapered off, because with only a few connections to learn, every organism was guaranteed to learn them quickly. Learning leads to the evolution of innateness, but not complete innateness.
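The odds quoted in this example are easy to verify. The sketch below is not Hinton and Nowlan's simulation; it only reproduces the probabilities mentioned in the text, under the simplifying assumption that each learning occasion is an independent random guess at all of the learnable connections. The function names are illustrative.

```python
def chance_all_correct(n_connections):
    """Probability that n independent on/off connections are all set
    correctly when each one is chosen at random."""
    return 0.5 ** n_connections


def chance_of_learning(n_learnable, n_trials):
    """Probability that at least one of n_trials independent random guesses
    sets every learnable connection correctly (an assumption of this sketch)."""
    p_hit = chance_all_correct(n_learnable)
    return 1.0 - (1.0 - p_hit) ** n_trials


# One in 2^20 (about a million) nonlearners gets all twenty connections right.
print(round(1 / chance_all_correct(20)))

# One in 2^10 (about a thousand) gets ten innate connections right.
print(round(1 / chance_all_correct(10)))

# Chance of learning the other ten connections in a thousand tries.
print(chance_of_learning(10, 1000))

# With fewer connections left to learn, success becomes a near certainty.
print(chance_of_learning(5, 1000))
```

Under that assumption the chance of learning ten connections in a thousand tries is a little over sixty percent, which fits "fairly likely," and with only five connections left to learn it is indistinguishable from certainty, which is why the pressure to make the last few connections innate tapers off.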
Hinton and Nowlan submitted the results of their computer simulations to a journal and were told that they had been scooped by a hundred years. The psychologist James Mark Baldwin had proposed that learning could guide evolution in precisely this way, creating an illusion of Lamarckian evolution without there really being Lamarckian evolution. But no one had shown that the idea, known as the Baldwin effect, would really work. Hinton and Nowlan showed why it can. The ability to learn alters the evolutionary problem from looking for a needle in a haystack to looking for the needle with someone telling you when you are getting close.
The Baldwin effect probably played a large role in the evolution of brains. Contrary to standard social science assumptions, learning is not some pinnacle of evolution attained only recently by humans. All but the simplest animals learn. That is why mentally uncomplicated creatures like fruit flies and sea slugs have been convenient subjects for neuroscientists searching for the neural incarnation of learning. If the ability to learn was in place in an early ancestor of the multicellular animals, it could have guided the evolution of nervous systems toward their specialized circuits even when the circuits are so intricate that natural selection could not have found them on its own.
Complex neural circuitry has evolved in many animals, but the common image of animals climbing up some intelligence ladder is wrong. The {180} common view is that lower animals have a few fixed reflexes, and that in higher ones the reflexes can be associated with new stimuli (as in Pavlov's experiments) and the responses can be associated with rewards (as in Skinner's). On this view, the ability to associate gets better in still higher organisms, and eventually it is freed from bodily drives and physical stimuli and responses and can associate ideas directly to each other, reaching an apex in man. But the distribution of intelligence in real animals is nothing like this.
The Tunisian desert ant leaves its nest, travels some distance, and then wanders over the burning sands looking for the carcass of an insect that has keeled over from the heat. When it finds one, it bites off a chunk, turns, and makes a beeline for the nest, a hole one millimeter in diameter as much as fifty meters away. How does it find its way back? The navigation depends on information gathered during the outward journey, not on sensing the nest like a beacon. If someone lifts the ant as it emerges from the nest and plunks it down some distance away, the ant wanders in random circles. If someone moves the ant after it finds food, it runs in a line within a degree or two of the direction of its nest with respect to the abduction site, slightly overshoots the point where the nest should be, does a quick U-turn, and searches for the nonexistent nest. This shows that the ant has somehow measured and stored the direction and distance back to the nest, a form of navigation called path integration or dead reckoning.
This example of information processing in animals, discovered by the biologist Rudiger Wehner, is one of many that the psychologist Randy Gallistel has used to try to get people to stop thinking about learning as the formation of associations. He explains the principle:
Path integration is the integration of the velocity vector with respect to time to obtain the position vector, or some discrete equivalent of this computation. The discrete equivalent in traditional marine navigation is to record the direction and speed of travel (the velocity) at intervals, multiply each recorded velocity by the interval since the previous recording to get interval-by-interval displacements (e.g., making 5 knots on a northeast course for half an hour puts the ship 2.5 nautical miles northeast of where it was), and sum the successive displacements (changes in position) to get the net change in position. These running sums of the longitudinal and latitudinal displacements are the deduced reckoning of the ship's position. {181}
Audiences are incredulous. All that computation inside the little bitty pinhead of an ant? Actually, as computation goes, this is pretty simple stuff; you could build a device to do it for a few dollars out of little parts hanging on the pegboard at Radio Shack. But intuitions about the nervous system have been so impoverished by associationism that a psychologist would be accused of wild, profligate speculation if she were to attribute this machinery to a human brain, let alone an ant brain. Could an ant really do calculus, or even arithmetic? Not overtly, of course, but then neither do we when we exercise our own faculty of dead reckoning, our “sense of direction.” The path integration calculations are done unconsciously, and their output pokes into our awareness — and the ant's, if it has any — as an abstract feeling that home is thataway, yea far.
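To see how little machinery the running sums require, here is the discrete version of path integration from Gallistel's description written out as a short program. It is a sketch for illustration only: the representation of a leg as a heading, a speed, and a duration, the flat-plane approximation, and the function names are assumptions of the example, and nothing here is meant as a model of how an ant's neurons do it.

```python
import math


def dead_reckon(legs):
    """Sum the interval-by-interval displacements.

    Each leg is (heading_degrees, speed, hours), with the heading measured
    clockwise from north. Returns the net (east, north) change in position.
    """
    east = 0.0
    north = 0.0
    for heading_degrees, speed, hours in legs:
        distance = speed * hours               # velocity times interval
        heading = math.radians(heading_degrees)
        east += distance * math.sin(heading)   # running sum, east-west
        north += distance * math.cos(heading)  # running sum, north-south
    return east, north


def home_vector(east, north):
    """Bearing (degrees clockwise from north) and distance back to the start."""
    distance = math.hypot(east, north)
    bearing = math.degrees(math.atan2(-east, -north)) % 360
    return bearing, distance


# The example from the quotation: 5 knots on a northeast course for half an hour.
east, north = dead_reckon([(45, 5, 0.5)])
print(east, north)
print(home_vector(east, north))
```

For the ship in the quotation the net displacement is 2.5 nautical miles to the northeast, so the way home is 2.5 nautical miles on a southwest bearing of 225 degrees. The whole computation is two running sums and a final conversion back to a direction and a distance.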
Other animals execute even more complicated sequences of arithmetic, logic, and data storage and retrieval. Many migratory birds fly thousands of miles at night, maintaining their compass direction by looking at the constellations. As a Cub Scout I was taught how to find the North Star: locate the tip of the handle of the Little Dipper, or extrapolate from the front lip of the Big Dipper a distance seven times its depth. Birds are not born with this knowledge, not because it is unthinkable that it could be innate, but because if it were innate it would soon be obsolete. The earth's axis of rotation, and hence the celestial pole (the point in the sky corresponding to north), wobbles in a 27,000-year cycle called the precession of the equinoxes. The cycle is rapid in an evolutionary timetable, and the birds have responded by evolving a special algorithm for learning where the celestial pole is in the night sky. It all happens while they are still in the nest and cannot fly. The nestlings gaze up at the night sky for hours, watching the slow rotation of the constellations. They find the point around which the stars appear to move, and record its position with respect to several nearby constellations, acquiring the information imparted to me by the Cub Scout manual. Months later they can use any of these constellations to maintain a constant heading — say, keeping north behind them while flying south, or flying into the celestial pole the next spring to return north.
Honeybees perform a dance that tells their hivemates the direction and distance of a food source with respect to the sun. As if that weren't impressive enough, the bees have evolved a variety of calibrations and backup systems to deal with the engineering complexities of solar navigation. The dancer uses an internal clock to compensate for the movement of the sun between the time she discovered the source and the time she {182} passes on the information. If it's cloudy, the other bees estimate the direction using the polarization of light in the sky. These feats are the tip of an iceberg of honeybee ingenuity, documented by Karl von Frisch, James Gould, and others. A psychologist colleague of mine once thought that bees offered a good pedagogical opportunity to convey the sophistication of neural computation to our undergraduates. He devoted the first week of his entry-level course in cognitive science to some of the ingenious experiments. The next year the lectures spilled over to the second week, then the third, and so on, until the students complained that the course had become an Introduction to Bee Cognition.
There are dozens of comparable examples. Many species compute how much time to forage at each patch so as to optimize their rate of return of calories per energy expended in foraging. Some birds learn the ephemeris function, the path of the sun above the horizon over the course of the day and the year, necessary for navigating by the sun. The barn owl uses sub-millisecond discrepancies between the arrival times of a sound at its two ears to swoop down on a rustling mouse in pitch blackness. Caching species place nuts and seeds in unpredictable hiding places to foil thieves, but months later must recall them all. I mentioned in the preceding chapter that the Clark's Nutcracker can remember ten thousand hiding places. Even Pavlovian and operant conditioning, the textbook cases of learning by association, turn out to be not a general stickiness of coinciding stimuli and responses in the brain, but complex algorithms for multivariate, nonstationary time series analysis (predicting when events will occur, based on their history of occurrences).
The moral of this animal show is that animals’ brains are just as specialized and well engineered as their bodies. A brain is a precision instrument that allows a creature to use information to solve the problems presented by its lifestyle. Since organisms’ lifestyles differ, and since they are related to one another in a great bush, not a great chain, species cannot be ranked in IQ or by the percentage of human intelligence they have achieved. Whatever is special about the human mind cannot be just more, or better, or more flexible animal intelligence, because there is no such thing as generic animal intelligence. Each animal has evolved information-processing machinery to solve its problems, and we evolved machinery to solve ours. The sophisticated algorithms found in even the tiniest dabs of nervous tissue serve as yet another eye-opener — joining the difficulty of building a robot, the circumscribed effects of brain damage, and the similarities between twins {183} reared apart — for the hidden complexity we should expect to find in the human mind.
The brains of mammals, like the bodies of mammals, follow a common general plan. Many of the same cell types, chemicals, tissues, sub-organs, way-stations, and pathways are found throughout the class, and the major visible differences come from inflating or shrinking the parts. But under the microscope, differences appear. The number of cortical areas differs widely, from twenty or fewer in rats to fifty or more in humans. Primates differ from other mammals in the number of visual areas, their interconnections, and their hookup to the motor and decision regions of the frontal lobes. When a species has a noteworthy talent, it is reflected in the gross anatomy of its brain, sometimes in ways visible to the naked eye. The takeover of monkeys' brains by visual areas (about one-half the territory) reflects — more accurately, allows — their aptitude for depth, color, motion, and visually guided grasping. Bats that rely on sonar have additional brain areas dedicated to their ultrasonic hearing, and desert mice that cache seeds are born with a bigger hippocampus — a seat of the cognitive map — than closely related species that don't cache.
The human brain, too, tells an evolutionary story. Even a quick side-by-side comparison shows that the primate brain must have been considerably re-engineered to end up as a human brain. Our brains are about three times too big for a generic monkey or ape of our body size. The inflation is accomplished by prolonging fetal brain growth for a year after birth. If our bodies grew proportionally during that period, we would be ten feet tall and weigh half a ton.
The major lobes and patches of the brain have been revamped as well. The olfactory bulbs, which underlie the sense of smell, have shriveled to a third of the expected primate size (already puny by mammalian standards), and the main cortical areas for vision and movement have shrunk proportionally as well. Within the visual system, the first stop for information, the primary visual cortex, takes up a smaller proportion of the whole brain, while the later areas for complex-form processing expand, as do the temporo-parietal areas that shunt visual information to the language and conceptual regions. The areas for hearing, especially {184} for understanding speech, have grown, and the prefrontal lobes, the seat of deliberate thought and planning, have ballooned to twice what a primate our size should have. While the brains of monkeys and apes are subtly asymmetrical, the human brain, especially in the areas devoted to language, is so lopsided that the two hemispheres can be distinguished by shape in the jar. And there have been takeovers of primate brain areas for new functions. Broca's area, involved in speech, has a homologue (evolutionary counterpart) in monkeys, but they obviously don't use it for speech, and they don't even seem to use it to produce shrieks, barks, and other calls.
It's interesting to find these differences, but the human brain could be radically different from an ape's brain even if one looked like a perfect scale model of the other. The real action is in the patterns of connections among neurons, just as the differences in content among different computer programs, microchips, books, or videocassettes lie not in their gross shapes but in the combinatorial arrangements of their tiny constituents. Virtually nothing is known about the functioning microcircuitry of the human brain, because there is a shortage of volunteers willing to give up their brains to science before they are dead. If we could somehow read the code in the neural circuitry of growing humans and apes, we would surely find substantial differences.
Are the marvelous algorithms of animals mere “instincts” that we have lost or risen above? Humans are often said to have no instincts beyond the vegetative functions; we are said to reason and behave flexibly, freed from specialized machinery. The featherless biped surely understands astronomy in a sense that the feathered biped does not! True enough, but it is not because we have fewer instincts than other animals; it is because we have more. Our vaunted flexibility comes from scores of instincts assembled into programs and pitted in competitions. Darwin called human language, the epitome of flexible behavior, “an instinct to acquire an art” (giving me the title for The Language Instinct), and his follower William James pressed the point:
Now, why do the various animals do what seem to us such strange things, in the presence of such outlandish stimuli? Why does the hen, for example, {185} submit herself to the tedium of incubating such a fearfully uninteresting set of objects as a nestful of eggs, unless she have some sort of a prophetic inkling of the result? The only answer is ad hominem. We can only interpret the instincts of brutes by what we know of instincts in ourselves. Why do men always lie down, when they can, on soft beds rather than on hard floors? Why do they sit round the stove on a cold day? Why, in a room, do they place themselves, ninety-nine times out of a hundred, with their faces towards its middle rather than to the wall? Why do they prefer saddle of mutton and champagne to hard-tack and pond-water? Why does the maiden interest the youth so that everything about her seems more important and significant than anything else in the world? Nothing more can be said than that these are human ways, and that every creature likes its own ways, and takes to the following them as a matter of course. Science may come and consider these ways, and find that most of them are useful. But it is not for the sake of their utility that they are followed, but because at the moment of following them we feel that that is the only appropriate and natural thing to do. Not one man in a billion, when taking his dinner, ever thinks of utility. He eats because the food tastes good and makes him want more. If you ask him why he should want to eat more of what tastes like that, instead of revering you as a philosopher he will probably laugh at you for a fool. . . .
And so, probably, does each animal feel about the particular things it tends to do in presence of particular objects. To the broody hen the notion would probably seem monstrous that there should be a creature in the world to whom a nestful of eggs was not the utterly fascinating and precious and never-to-be-too-much sat-upon object which it is to her.
The human reactions described in the passage still may strike you as versions of animal instincts. What about our rational, flexible thought? Can it be explained as a set of instincts? In the preceding chapter I showed how our precision intelligence can be broken down into smaller and smaller agents or networks of information processing. At the lowest levels, the steps have to be as automatic and unanalyzed as the reactions of the most brutish animal. Remember what the tortoise said to Achilles. No rational creature can consult rules all the way down; that way infinite regress lies. At some point a thinker must execute a rule, because he just can't help it: it's the human way, a matter of course, the only appropriate and natural thing to do — in short, an instinct. When all goes well, our reasoning instincts link up into complex programs for rational analysis, but that is not because we somehow commune with a realm of truth and reason. The same instincts can be seduced by sophistry, bump up against {186} paradoxes like Zeno's beguiling demonstrations that motion is impossible, or make us dizzy as they ponder mysteries like sentience and free will. Just as an ethologist unmasks an animal's instincts with clever manipulations of its world, such as slipping a mechanical bee into a hive or rearing a chick in a planetarium, psychologists can unmask human reasoning instincts by couching problems in devilish ways, as we shall see in Chapter 5.
Ambrose Bierce's Devil's Dictionary defines our species as follows:
Man, n. An animal so lost in rapturous contemplation of what he thinks he is as to overlook what he indubitably ought to be. His chief occupation is extermination of other animals and his own species, which, however, multiplies with such insistent rapidity as to infest the whole habitable earth and Canada.
Homo sapiens sapiens is indeed an unprecedented animal, with many zoologically unique or extreme traits. Humans achieve their goals by complex chains of behavior, assembled on the spot and tailored to the situation. They plan the behavior using cognitive models of the causal structure of the world. They learn these models in their lifetimes and communicate them through language, which allows the knowledge to accumulate within a group and over generations. They manufacture and depend upon many kinds of tools. They exchange goods and favors over long periods of time. Food is transported long distances, processed extensively, stored, and shared. Labor is divided between the sexes. Humans form large, structured coalitions, especially among males, and coalitions wage war against each other. Humans use fire. Kinship systems are complex and vary with other aspects of their lifestyles. Mating relations are negotiated by kin, often by groups exchanging daughters. Ovulation is concealed, and females may choose to have sex at any time rather than at certain points in a reproductive cycle.
A few of these traits are found among some of the great apes, but to a much lesser degree, and most are not found at all. And humans have rediscovered traits that are rare among primates but found in other animals. {187} They are bipedal. They live longer than other apes, and bear helpless offspring who stay children (that is, sexually immature) for a substantial part of their lives. Hunting is important, and meat a large part of the diet. Males invest in their offspring: they tote children around, protect them against animals and other humans, and give them food. And as The Devil's Dictionary points out, humans occupy every ecozone on earth.
Aside from the retooling of the skeleton that gives us upright posture and precision manipulation, what makes us unusual is not our body but our behavior and the mental programs that organize it. In the comic strip Calvin and Hobbes, Calvin asks his tiger companion why people are never content with what they have. Hobbes replies, “Are you kidding? Your fingernails are a joke, you've got no fangs, you can't see at night, your pink hides are ridiculous, your reflexes are nil, and you don't even have tails! Of course people aren't content!” But despite these handicaps, humans control the fate of tigers, rather than vice versa. Human evolution is the original revenge of the nerds.
Perhaps recoiling from the image of the pasty-faced, pocket-protected, polyester-clad misfits, theorists on human evolution have looked far and wide for alternative theories. Human ingenuity has been explained away as a by-product of blood vessels in the skull that radiate heat, as a runaway courtship device like the peacock's tail, as a stretching of chimpanzee childhood, and as an escape hatch that saved the species from the evolutionary dead end of bearing fewer and fewer offspring. Even in theories that acknowledge that intelligence itself was selected for, the causes are badly underpowered in comparison with the effects. In various stories the full human mind sprang into existence to solve narrow problems like chipping tools out of stone, cracking open nuts and bones, throwing rocks at animals, keeping track of toddlers, following herds to scavenge their dead, and maintaining social bonds in a large group.
There are grains of truth in these accounts, but they lack the leverage of good reverse-engineering. Natural selection for success in solving a particular problem tends to fashion an idiot savant like the dead-reckoning ants and stargazing birds. We need to know what the more general kinds of intelligence found in our species are good for. That requires a good description of the improbable feats the human mind accomplishes, not just one-word compliments like “flexibility” or “intelligence.” That description must come from the study of the modern mind, cognitive science. And because selection is driven by the fate of the whole individual, {188} it is not enough to explain the evolution of a brain in a vat. A good theory has to connect all the parts of the human lifestyle — all ages, both sexes, anatomy, diet, habitat, and social life. That is, it has to characterize the ecological niche that humans entered.
The only theory that has risen to this challenge comes from John Tooby and the anthropologist Irven DeVore. Tooby and DeVore begin by noting that species evolve at one another's expense. We fantasize about the land of milk and honey, the big rock candy mountain and tangerine trees with marmalade skies, but real ecosystems are different. Except for fruits (which trick hungry animals into dispersing seeds), virtually every food is the body part of some other organism, which would just as soon keep that part for itself. Organisms evolve defenses against being eaten, and would-be diners evolve weapons to overcome these defenses, prodding the would-be meals to evolve better defenses, and so on, in an evolutionary arms race. These weapons and defenses are genetically based and relatively fixed within the lifetime of the individual; therefore they change slowly. The balance between eater and eaten develops only over evolutionary time.
Humans, Tooby and DeVore suggest, entered the “cognitive niche.” Remember the definition of intelligence from Chapter 2: using knowledge of how things work to attain goals in the face of obstacles. By learning which manipulations achieve which goals, humans have mastered the art of the surprise attack. They use novel, goal-oriented courses of action to overcome the Maginot Line defenses of other organisms, which can respond only over evolutionary time. The manipulations can be novel because human knowledge is not just couched in concrete instructions like “how to catch a rabbit.” Humans analyze the world using intuitive theories of objects, forces, paths, places, manners, states, substances, hidden biochemical essences, and, for other animals and people, beliefs and desires. (These intuitive theories are the topic of Chapter 5.) People compose new knowledge and plans by mentally playing out combinatorial interactions among these laws in their mind's eye.
Many theorists have wondered what illiterate foragers do with their capacity for abstract intelligence. The foragers would have better grounds for asking the question about modern couch potatoes. Life for foragers (including our ancestors) is a camping trip that never ends, but without the space blankets, Swiss Army knives, and freeze-dried pasta al pesto. Living by their wits, human groups develop sophisticated {189} technologies and bodies of folk science. All human cultures ever documented have words for the elements of space, time, motion, speed, mental states, tools, flora, fauna, and weather, and logical connectives (not, and, same, opposite, part-whole, and general-particular). They combine the words into grammatical sentences and use the underlying propositions to reason about invisible entities like diseases, meteorological forces, and absent animals. Mental maps represent the locations of thousands of noteworthy sites, and mental calendars represent nested cycles of weather, animal migrations, and the life histories of plants. The anthropologist Louis Liebenberg recounts a typical experience with the !Xo of the central Kalahari Desert:
While tracking down a solitary wildebeest spoor [tracks] of the previous evening !Xo trackers pointed out evidence of trampling which indicated that the animal had slept at that spot. They explained consequently that the spoor leaving the sleeping place had been made early that morning and was therefore relatively fresh. The spoor then followed a straight course, indicating that the animal was on its way to a specific destination. After a while, one tracker started to investigate several sets of footprints in a particular area. He pointed out that these footprints all belonged to the same animal, but were made during the previous days. He explained that the particular area was the feeding ground of that particular wildebeest. Since it was, by that time, about mid-day, it could be expected that the wildebeest may be resting in the shade in the near vicinity.
All foraging peoples manufacture cutters, pounders, containers, cordage, nets, baskets, levers, and spears and other weapons. They use fire, shelters, and medicinal drugs. Their engineering is often ingenious, exploiting poisons, smokeouts, glue traps, gill nets, baited lines, snares, corrals, weirs, camouflaged pits and clifftops, blowguns, bows and arrows, and kites trailing sticky fishing lines made out of spider silk.
The reward is an ability to crack the safes of many other living things: burrowing animals, plants’ underground storage organs, nuts, seeds, bone marrow, tough-skinned animals and plants, birds, fish, shellfish, turtles, poisonous plants (detoxified by peeling, cooking, soaking, parboiling, fermenting, leaching, and other tricks of the kitchen magician), quick animals (which can be ambushed), and large animals (which cooperating groups can drive, exhaust, surround, and dispatch with weapons). Ogden Nash wrote: {190}
The hunter crouches in his blind
‘Neath camouflage of every kind,
And conjures up a quacking noise
To lend allure to his decoys.
This grown-up man, with pluck and luck
Is hoping to outwit a duck.
And outwit it he does. Humans have the unfair advantage of attacking in this lifetime organisms that can beef up their defenses only in subsequent ones. Many species cannot evolve defenses rapidly enough, even over evolutionary time, to defend themselves against humans. That is why species drop like flies whenever humans first enter an ecosystem. And it's not just the snail darters and snowy owls recently threatened by dams and loggers. The reason you have never seen a living mastodon, saber-tooth, giant woolly rhinoceros, or other fantastic Ice Age animal is that humans apparently extinguished them thousands of years ago.
The cognitive niche embraces many of the zoologically unusual features of our species. Tool manufacture and use is the application of knowledge about causes and effects among objects in the effort to bring about goals. Language is a means of exchanging knowledge. It multiplies the benefit of knowledge, which can not only be used but exchanged for other resources, and lowers its cost, because knowledge can be acquired from the hard-won wisdom, strokes of genius, and trial and error of others rather than only from risky exploration and experimentation. Information can be shared at a negligible cost: if I give you a fish, I no longer possess the fish, but if I give you information on how to fish, I still possess the information myself. So an information-exploiting lifestyle goes well with living in groups and pooling expertise — that is, with culture. Cultures differ from one another because they pool bodies of expertise fashioned in different times and places. A prolonged childhood is an apprenticeship for knowledge and skills. That shifts the balance of payoffs for males toward investing time and resources in their offspring and away from competing over sexual access to females (see Chapter 7). And that in turn makes kinship a concern of both sexes and all ages. Human lives are long to repay the investment of a long apprenticeship. New habitats can be colonized because even if their local conditions differ, they obey the laws of physics and biology that are already within humans’ ken, and can be exploited and outsmarted in their turn. {191}
Why did some Miocene ape first enter the cognitive niche? Why not a groundhog, or a catfish, or a tapeworm? It only happened once, so no one knows. But I would guess that our ancestors had four traits that made it especially easy and worth their while to evolve better powers of causal reasoning.
First, primates are visual animals. In monkeys such as the rhesus macaque, half the brain is dedicated to sight. Stereoscopic vision, the use of differences in the vantage points of the two eyes to give a sense of depth, developed early in the primate lineage, allowing early nocturnal primates to move among treacherous fine branches and to grab insects with their hands. Color vision accompanied the switch of the ancestors of monkeys and apes to the day shift and their new taste for fruits, which advertise their ripeness with gaudy hues.
Why would the vision thing make such a difference? Depth perception defines a three-dimensional space filled with movable solid objects. Color makes objects pop out from their backgrounds, and gives us a sensation that corresponds to the stuff an object is made of, distinct from our perception of the shape of the stuff. Together they have pushed the primate brain into splitting the flow of visual information into two streams: a “what” system, for objects and their shapes and compositions, and a “where” system, for their locations and motions. It can't be a coincidence that the human mind grasps the world — even the most abstract, ethereal concepts — as a space filled with movable things and stuff (see Chapters 4 and 5). We say that John went from being sick to being well, even if he didn't move an inch; he could have been in bed the whole time. Mary can give him many pieces of advice, even if they merely talked on the phone and nothing changed hands. Even scientists, when they try to grasp abstract mathematical relationships, plot them in graphs that show them as two- and three-dimensional shapes. Our capacity for abstract thought has co-opted the coordinate system and inventory of objects made available by a well-developed visual system.
It is harder to see how a standard mammal could have moved in that direction. Most mammals hug the ground sniffing the rich chemical tracks and trails left behind by other living things. Anyone who has {192} walked a frisky cocker spaniel as it explores the invisible phantasmagoria on a sidewalk knows that it lives in an olfactory world beyond our understanding. Here is an exaggerated way of stating the difference. Rather than living in a three-dimensional coordinate space hung with movable objects, standard mammals live in a two-dimensional flatland which they explore through a zero-dimensional peephole. Edwin Abbott's Flatland, a mathematical novel about the denizens of a plane, showed that a two-dimensional world differs from our own in ways other than just lacking one third of the usual dimensions. Many geometric arrangements are simply impossible. A full-faced human figure has no way of getting food into his mouth, and a profiled one would be divided into two pieces by his digestive tract. Simple devices like tubes, knots, and wheels with axles are unbuildable. If most mammals think in a cognitive flatland, they would lack the mental models of movable solid objects in 3-D spatial and mechanical relationships that became so essential to our mental life.
A second possible prerequisite, this one found in the common ancestor of humans, chimpanzees, and gorillas, is group living. Most apes and monkeys are gregarious, though most mammals are not. Living together has advantages. A cluster of animals is not much more detectable to a predator than a single animal, and if it is detected, the likelihood that any individual will be picked off is diluted. (Drivers feel less vulnerable speeding when they are in a group of speeders, because chances are the traffic cop will stop someone else.) There are more eyes, ears, and noses to detect a predator, and the attacker can sometimes be mobbed. A second advantage is in foraging efficiency. The advantage is most obvious in cooperative hunting of large animals, such as in wolves and lions, but it also helps in sharing and defending other ephemeral food resources too big to be consumed by the individual who found it, such as a tree laden with ripe fruit. Primates that depend on fruit, and primates that spend time on the ground (where they are more vulnerable to predators), tend to hang out in groups.
Group living could have set the stage for the evolution of humanlike intelligence in two ways. With a group already in place, the value of having better information is multiplied, because information is the one commodity that can be given away and kept at the same time. Therefore a smarter animal living in a group enjoys a double advantage: the benefit of the knowledge and the benefit of whatever it can get in trade for the knowledge.
The other way in which a group can be a crucible of intelligence is that group living itself poses new cognitive challenges. There are also {193} disadvantages to the madding crowd. Neighbors compete over food, water, mates, and nest sites. And there is the risk of exploitation. Hell is other people, said Jean-Paul Sartre, and if baboons were philosophers no doubt they would say that hell is other baboons. Social animals risk theft, cannibalism, cuckoldry, infanticide, extortion, and other treachery.
Every social creature is poised between milking the benefits and suffering the costs of group living. That creates a pressure to stay on the right side of the ledger by becoming smarter. In many kinds of animals, the largest-brained and smartest-behaving species are social: bees, parrots, dolphins, elephants, wolves, sea lions, and, of course, monkeys, gorillas, and chimpanzees. (The orangutan, smart but almost solitary, is a puzzling exception.) Social animals send and receive signals to coordinate predation, defense, foraging, and collective sexual access. They exchange favors, repay and enforce debts, punish cheaters, and join coalitions.
The collective expression for hominoids, “a shrewdness of apes,” tells a story. Primates are sneaky baldfaced liars. They hide from rivals’ eyes to flirt, cry wolf to attract or divert attention, even manipulate their lips into a poker face. Chimpanzees monitor one another's goals, at least crudely, and sometimes appear to use them in pedagogy and deception. One chimp, shown a set of boxes with food and one with a snake, led his companions to the snake, and after they fled screaming, feasted in peace. Vervet monkeys are yentas who keep close track of everyone's comings and goings and friends and enemies. But they are so dense about the nonsocial world that they ignore the tracks of a python and the ominous sight of a carcass in a tree, the unique handiwork of a leopard.
Several theorists have proposed that the human brain is the outcome of a cognitive arms race set in motion by the Machiavellian intelligence of our primate forebears. There's only so much brain power you need to subdue a plant or a rock, the argument goes, but the other guy is about as smart as you are and may use that intelligence against your interests. You had better think about what he is thinking about what you are thinking he is thinking. As far as brain power goes, there's no end to keeping up with the Joneses.
My own guess is that a cognitive arms race by itself was not enough to launch human intelligence. Any social species can begin a never-ending escalation of brain power, but none except ours has, probably because without some other change in lifestyle, the costs of intelligence (brain size, extended childhood, and so on) would damp the positive feedback loop. Humans are exceptional in mechanical and biological, not just {194} social, intelligence. In a species that runs on information, each faculty multiplies the value of the others. (Incidentally, the expansion of the human brain is no evolutionary freak crying out for a runaway positive feedback loop. The brain tripled in size in five million years, but that is leisurely by evolutionary timekeeping. There was enough time in hominid evolution for the brain to shoot up to human size, shrink back down, and shoot up again several times over.)
A third pilot of intelligence, alongside good vision and big groups, is the hand. Primates evolved in trees and have hands to grasp the branches. Monkeys use all four limbs to run along the tops of branches, but apes hang from them, mainly by their arms. They have put their well-developed hands to use in manipulating objects. Gorillas meticulously dissect tough or thorny plants to pick out the edible matter, and chimpanzees use simple tools such as stems to fish out termites, rocks to bash open nuts, and mashed leaves to sponge up water. As Samuel Johnson said about dogs walking on their hind legs, while it is not done well, you are surprised to find it done at all. Hands are levers of influence on the world that make intelligence worth having. Precision hands and precision intelligence co-evolved in the human lineage, and the fossil record shows that hands led the way.
Finely tooled hands are useless if you have to walk on them all the time, and they could not have evolved by themselves. Every bone in our bodies has been reshaped to give us our upright posture, which frees the hands for carrying and manipulating. Once again we have our ape ancestors to thank. Hanging from trees calls for a body plan that is different from the horizontal four-wheel-drive design of most mammals. Apes’ bodies are already tilted upward with arms that differ from their legs, and chimpanzees (and even monkeys) walk upright for short distances to carry food and objects.
Fully upright posture may have evolved under several selection pressures. Bipedal walking is a biomechanically efficient way to retool a tree-hanging body to cover distance on the flat ground of the newly entered savanna. Upright posture also allows one to peer over grass like a marmot. Hominids go out in the midday sun; this zoologically unusual work shift brought in several human adaptations for keeping cool, such as hairlessness and profuse sweating. Upright posture might be another; it is the opposite of lying down to get a tan. But carrying and manipulation must have been crucial inducements. With the hands free, tools could be assembled out of materials from different locations and brought to {195} where they were most useful, and food and children could be carried to safe or productive areas.
A final usher of intelligence was hunting. Hunting, tool use, and bipedalism were for Darwin the special trinity that powered human evolution. “Man the Hunter” was the major archetype in both serious and pop accounts through the 1960s. But the macho image that resonated with the decade of John Glenn and James Bond lost its appeal in the feminist-influenced small planet of the 1970s. A major problem for Man the Hunter was that it credited the growth of intelligence to the teamwork and foresight needed by men in groups to fell large game. But natural selection sums over the lives of both sexes. Women did not wait in the kitchen to cook the mastodon that Dad brought home, nor did they forgo the expansion of intelligence enjoyed by evolving men. The ecology of modern foraging peoples suggests that Woman the Gatherer provided a substantial portion of the calories in the form of highly processed plant foods, and that requires mechanical and biological acumen. And, of course, in a group-living species, social intelligence is as important a weapon as spears and clubs.
But Tooby and DeVore have argued that hunting was nonetheless a major force in human evolution. The key is to ask not what the mind can do for hunting, but what hunting can do for the mind. Hunting provides sporadic packages of concentrated nutrients. We did not always have tofu, and the best natural material for building animal flesh is animal flesh. Though plant foods supply calories and other nutrients, meat is a complete protein containing all twenty amino acids, and provides energy-rich fat and indispensable fatty acids. Across the mammals, carnivores have larger brains for their body size than herbivores, partly because of the greater skill it takes to subdue a rabbit than to subdue grass, and partly because meat can better feed ravenous brain tissue. Even in the most conservative estimates, meat makes up a far greater proportion of foraging humans’ diet than of any other primate's. That may have been one of the reasons we could afford our expensive brains.
Chimpanzees collectively hunt small animals like monkeys and bush pigs, so our common ancestor probably hunted as well. The move to the savanna must have made hunting more appealing. Notwithstanding the teeming wildlife in the Save-the-Rainforest posters, real forests have few large animals. Only so much solar energy falls on a patch of ground, and if the biomass it supports is locked up in wood it is not available to make animals. But grass is like the legendary self-replenishing goblet, growing {196} back as soon as it is grazed. Grasslands can feed vast herds of herbivores, who in turn feed carnivores. Evidence of butchery appears in the fossil record almost two million years ago, the time of Homo habilis. Hunting must be even older, since we know that chimpanzees do it, and their activities would not leave evidence in the fossil record. Once our ancestors increased their hunting, the world opened up. Plant foods are scarce during the winter at higher altitudes and latitudes, but hunters can survive there. There are no vegetarian Eskimos.
Our ancestors have sometimes been characterized as meek scavengers rather than brave hunters, in keeping with today's machismo-puncturing ethos. But while hominids may occasionally have scavenged, they probably could not have made a living from it, and if they did, they were no wimps. Vultures get away with scavenging because they can scan large territories for carcasses and flee on short notice when more formidable competitors show up. Otherwise, scavenging is not for the faint of heart. A carcass is jealously guarded by its hunter or an animal fierce enough to have stolen it. It is attractive to microorganisms, who quickly poison the meat to repel other would-be scavengers. So when modern primates or hunter-gatherers come across a carcass, they usually leave it alone. In a poster widely available in head shops in the early 1970s, one vulture says to the other, “Patience, my ass! I'm going to kill something.” The poster got it right, except for the vulture part: mammals that do scavenge, such as hyenas, also hunt.
Meat is also a major currency of our social life. Imagine a cow who tries to win the favors of a neighbor by dropping a clump of grass at its feet. One could forgive the second cow for thinking, “Thanks, but I can get my own grass.” The nutritional jackpot of a felled animal is another matter. Miss Piggy once advised, “Never eat anything bigger than you can lift.” A hunter with a dead animal larger than he can eat and about to become a putrefying mass is faced with a unique opportunity. Hunting is largely a matter of luck. In the absence of refrigeration, a good place to store meat for leaner times is in the bodies of other hunters who will return the favor when fortunes reverse. This eases the way for the male coalitions and the extensive reciprocity that are ubiquitous in foraging societies.
And there are other markets for a hunter's surplus. Having concentrated food to offer one's offspring changes the relative payoffs for males between investing in their young and competing with other males for access to females. The robin bringing a worm to the nestlings reminds us {197} that most animals that provision their young do so with prey, the only food that repays the effort to obtain it and transport it.
Meat also figures into sexual politics. In all foraging societies, presumably including our ancestors', hunting is overwhelmingly a male activity. Women are encumbered with children, which makes hunting inconvenient, and men are bigger and more adept at killing because of their evolutionary history of killing each other. As a result, males can invest surplus meat in their children by provisioning the children's pregnant or nursing mothers. They also can trade meat with females for plant foods or for sex. Brazen bartering of the carnal for the carnal has been observed in baboons and chimpanzees and is common in foraging peoples. Though people in modern societies are ever-so-more discreet, an exchange of resources for sexual access is still an important part of the interactions between men and women all over the world. (Chapter 7 explores these dynamics and how they originated in differences in reproductive anatomy, though of course anatomy is not destiny in modern ways of life.) In any case, we have not lost the association completely. Miss Manners’ Guide to Excruciatingly Correct Behavior advises:
There are three possible parts to a date, of which at least two must be offered: entertainment, food, and affection. It is customary to begin a series of dates with a great deal of entertainment, a moderate amount of food, and the merest suggestion of affection. As the amount of affection increases, the entertainment can be reduced proportionately. When the affection is the entertainment, we no longer call it dating. Under no circumstances can the food be omitted.
Of course no one really knows whether these four habits formed the base camp for the ascent of human intelligence. And no one knows whether there are other, untried gradients to intelligence in biological design space. But if these traits do explain why our ancestors were the only species out of fifty million to follow that route, it would have sobering implications for the search for extraterrestrial intelligence. A planet with life may not be enough of a launching pad. Its history might have to include a nocturnal predator (to get stereo vision), with descendants that switched to a diurnal lifestyle (for color) in which they depended on fruit {198} and were vulnerable to predators (for group living), which then changed their means of locomotion to swinging beneath branches (for hands and for precursors to upright posture), before a climate shift sent them from the forest into grasslands (for upright posture and hunting). What is the probability that a given planet, even a planet with life, has such a history?
The dry bones of the fossil record tell of a gradual entry into the cognitive niche. A summary of the current evidence on the species thought to be our direct ancestors is shown in the table below.
Species | Date | Height | Physique | Brain | Skull | Teeth | Tools | Distribution |
Chimp-hominid ancestor (if similar to modern chimps) | 8-6 million years ago | 1-1.7 meters | long arms, short thumbs, curved fingers and toes; adapted for knuckle-walking and tree-climbing | 450 cc | very low forehead; projecting face; huge brow ridges | large canines | stone hammers, leaf sponges, stem probes, branch levers | West Africa |
Ardipithecus ramidus | 4.4 million years ago | ? | probably bipedal | ? | ? | chimplike molars but not canines | ? | East Africa |
Australopithecus anamensis | 4.2-3.9 million years ago | ? | bipedal | ? | apelike fragments | chimplike size and placement; humanlike enamel | ? | East Africa |
Australopithecus afarensis (Lucy) | 4-2.5 million years ago | 1-1.2 meters | fully bipedal with modified hands but ape-like features: thorax, long arms, curved fingers and toes | 400-500 cc | low flat forehead; projecting face; big brow ridges | large canines and molars | none? flakes? | East Africa (maybe also west) |
Homo habilis (Handyman) | 2.3-1.6 million years ago | 1-1.5 meters | some specimens: small with long arms; others: robust but human | 500-800 cc | smaller face; rounder skull | smaller molars | flakes, choppers, scrapers | East and South Africa |
Homo erectus | 1.9 million-300,000 (maybe 27,000) years ago | 1.3-1.5 meters | robust but human | 750-1250 cc | thick; large brow ridges (Asia); smaller, protruding face | smaller teeth | symmetrical hand axes | Africa (may be separate species), Asia, Europe |
Archaic Homo sapiens | 400,000-100,000 years ago | ? | robust but modern | 1100-1400 cc | higher skull; smaller, protruding face; large brow ridges | smaller teeth | better hand axes, retouched flakes | Africa, Asia, Europe |
Early Homo sapiens | 130,000-60,000 years ago | 1.6-1.85 meters | robust but modern | 1200-1700 cc | high skull; medium brow ridges; slightly protruding face; chin | smaller teeth | retouched flakes; flake-blades; points | Africa, Western Asia |
Homo sapiens | 45,000-12,000 years ago | 1.6-1.8 meters | modern | 1300-1600 cc (cf. today: 1000-2000, average 1350) | modern | modern | blades; drills; spear throwers; needles; engravers; bone | worldwide |
{199} |
{200} |
Millions of years before our brains billowed out, some descendants of the common ancestor of chimpanzees and humans walked upright. In the 1920s that discovery came as a shock to human chauvinists who imagined that our glorious brains led us up the ladder, perhaps as our ancestors decided at each rung what use to make of their newfound smarts. But natural selection could not have worked that way. Why bulk up your brain if you can't put it to use? The history of paleoanthropology is the discovery of earlier and earlier birthdays for upright posture. The most recent discoveries put it at four or even four and a half million years ago. With hands freed, subsequent species ratchet upward, click by click, in the features that distinguish us: the dexterity of hands, the sophistication of tools, the dependence on hunting, the size of brains, the range of habitats. The teeth and jaw become smaller. The face that opposes it becomes less muzzle-like. The brow ridges that anchor the muscles that close the jaw shrink and disappear. Our delicate faces differ from the brutes’ because tools and technology have taken over from teeth. We slaughter and skin animals with blades, and soften plants and meats with fire. That eases the mechanical demands on the jaw and skull, allowing us to shave bone from our already heavy heads. The sexes come to differ less in size, suggesting that males spent less of their resources beating each other up and perhaps more on their children and the children's mothers.
The stepwise growth of the brain, propelled by hands and feet and manifested in tools, butchered bones, and increased range, is good evidence, if evidence were needed, that intelligence is a product of natural selection for exploitation of the cognitive niche. The package was not an inexorable unfolding of hominid potential. Other species, omitted from the table, spun off in every epoch to occupy slightly different niches: nutcracking and root-gnawing australopithecines, perhaps one of the two habiline subtypes, quite possibly the Asian branches of erectus and archaic sapiens, and probably the Ice Age-adapted Neanderthals. Each species might have been outcompeted when a neighboring, more sapiens-like population had entered far enough into the cognitive niche to duplicate the species’ more specialized feats and do much else besides. The package was also not the gift of a macromutation or random drift — for how could such luck have held up in one lineage for millions of years, over hundreds of thousands of generations, in species after bigger-brained species? Moreover, the bigger brains were no mere ornaments but allowed their owners to make finer tools and infest more of the planet. {201}
According to the standard timetable in paleoanthropology, the human brain evolved to its modern form in a window that began with the appearance of Homo habilis two million years ago and ended with the appearance of “anatomically modern humans,” Homo sapiens sapiens, between 200,000 and 100,000 years ago. I suspect that our ancestors were penetrating the cognitive niche for far longer than that. Both ends of the R&D process might have to be stretched beyond the textbook dates, providing even more time for our fantastic mental adaptations to have evolved.
At one end of the timetable are the four-million-year-old australopithecines like afarensis (the species of the charismatic fossil called Lucy). They are often described as chimpanzees with upright posture because their brain size was in the chimpanzee ballpark and they left no clear evidence of tool use. That implies that cognitive evolution did not begin till two million years later, when larger-brained habilines earned their “handyman” name by chipping choppers.
But that can't be right. First, it is ecologically improbable that a tree-dweller could have moved onto open ground and retooled its anatomy for upright walking without repercussions on every other aspect of its lifestyle and behavior. Modern chimps use tools and transport objects, and would have had much more incentive and success if they could carry them around freely. Second, though australopithecines’ hands retain some apelike curvature of the fingers (and may have been used at times to run up trees for safety), the hands visibly evolved for manipulation. Compared to chimps’ hands, their thumbs are longer and more opposable to the other fingers, and their index and middle fingers are angled to allow cupping the palm to grasp a hammerstone or a ball. Third, it's not so clear that they had a chimp-sized brain, or that they lacked tools. The paleoanthropologist Yves Coppens argues that their brains are thirty to forty percent bigger than expected for a chimpanzee of their body size, and that they left behind modified quartz flakes and other tools. Fourth, skeletons of the tool-using habilines (handymen) have now been found, and they do not look so different from the australopithecines'.
Most important, hominids did not arrange their lives around the convenience of anthropologists. We are lucky that a rock can be carved into a cutter and that it lasts for millions of years, so some of our ancestors inadvertently left us time capsules. But it's much harder to carve a rock {202} into a basket, a baby sling, a boomerang, or a bow and arrow. Contemporary hunter-gatherers use many self-composting implements for every lasting one, and that must have been true of hominids at every stage. The archeological record is bound to underestimate tool use.
So the standard timetable for human brain evolution begins the story too late; I think it also ends the story too early. Modern humans (us) are said to have first arisen between 200,000 and 100,000 years ago in Africa. One kind of evidence is that the mitochondrial DNA (mDNA) of everyone on the planet (which is inherited only from one's mother) can be traced back to an African woman living sometime in that period. (The claim is controversial, but the evidence is growing.) Another is that anatomically modern fossils first appear in Africa more than 100,000 years ago and in the Middle East shortly afterward, around 90,000 years ago. The assumption is that human biological evolution had pretty much stopped then. This leaves an anomaly in the timeline. The anatomically modern early humans had the same toolkit and lifestyle as their doomed Neanderthal neighbors. The most dramatic change in the archeological record, the Upper Paleolithic transition — also called the Great Leap Forward and the Human Revolution — had to wait another 50,000 years. Therefore, it is said, the human revolution must have been a cultural change.
Calling it a revolution is no exaggeration. All other hominids come out of the comic strip B.C., but the Upper Paleolithic people were the Flintstones. More than 45,000 years ago they somehow crossed sixty miles of open ocean to reach Australia, where they left behind hearths, cave paintings, the world's first polished tools, and today's aborigines. Europe (home of the Cro-Magnons) and the Middle East also saw unprecedented arts and technologies, which used new materials like antler, ivory, and bone as well as stone, sometimes transported hundreds of miles. The toolkit included fine blades, needles, awls, many kinds of axes and scrapers, spear points, spear throwers, bows and arrows, fishhooks, engravers, flutes, maybe even calendars. They built shelters, and they slaughtered large animals by the thousands. They decorated everything in sight — tools, cave walls, their bodies — and carved knick-knacks in the shapes of animals and naked women, which archeologists euphemistically call “fertility symbols.” They were us.
Ways of life certainly can shoot off without any biological change, as in the more recent agricultural, industrial, and information revolutions. That is especially true when populations grow to a point where the {203} insights of thousands of inventors can be pooled. But the first human revolution was not a cascade of changes set off by a few key inventions. Ingenuity itself was the invention, manifested in hundreds of innovations tens of thousands of miles and years apart. I find it hard to believe that the people of 100,000 years ago had the same minds as those of the Upper Paleolithic revolutionaries to come — indeed, the same minds as ours — and sat around for 50,000 years without it dawning on a single one of them that you could carve a tool out of bone, or without a single one feeling the urge to make anything look pretty.
And there is no need to believe it — the 50,000-year gap is an illusion. First, the so-called anatomically modern humans of 100,000 years ago may have been more modern than their Neanderthal contemporaries, but no one would mistake them for contemporary humans. They had brow ridges, protruding faces, and heavily built skeletons outside the contemporary range. Their bodies had to evolve to become us, and their brains surely did as well. The myth that they are completely modern grew out of the habit of treating species labels as if they were real entities. When applied to evolving organisms, they are no more than a convenience. No one wants to invent a new species every time a tooth is found, so intermediate forms tend to get shoehorned into the nearest available category. The reality is that hominids must always have come in dozens or hundreds of variants, scattered across a large network of occasionally interacting subpopulations. The tiny fraction of individuals immortalized as fossils at any point were not necessarily our direct ancestors. The “anatomically modern” fossils are closer to us than to anyone else, but either they had more evolving to do or they were away from the hotbed of change.
Second, the revolution probably began well before the commonly cited watershed of 40,000 years ago. That's when fancy artifacts begin to appear in European caves, but Europe has always attracted more attention than it deserves, because it has lots of caves and lots of archeologists. France alone has three hundred well-excavated paleolithic sites, including one whose cave paintings were scrubbed off by an overenthusiastic boy scout troop that mistook them for graffiti. The entire continent of Africa has only two dozen. But one, in Zaire, contains finely crafted bone implements including daggers, shafts, and barbed points, together with grindstones brought from miles away and the remains of thousands of catfish, presumably the victims of these instruments. The collection looks postrevolutionary but is dated at 75,000 years ago. One commentator {204} said it was like finding a Pontiac in Leonardo da Vinci's attic. But as archeologists are starting to explore this continental attic and date its contents, they are finding more and more Pontiacs: fine stone blades, decorated tools, useless but colorful minerals transported hundreds of miles.
Third, the mitochondrial Eve of 200,000 to 100,000 years ago was not a party to any evolutionary event. Contrary to some fantastic misunderstandings, she did not undergo some mutation that left her descendants smarter or more talkative or less brutish. Nor did she mark the end of human evolution. She is merely a mathematical necessity: the most recent common ancestor of all living people along the female-female-female line of great-great-. . .-great-grandmothers. For all the definition says, Eve could have been a fish.
Eve, of course, turned out to be not a fish but an African hominid. Why would anyone assume that she was a special hominid, or even that she lived in special times? One reason is that she made many other times and places non-special. If twentieth-century Europeans’ and Asians’ mDNA is a variant of 200,000-year-old African mDNA, they must be descendants of an African population at the time. Eve's contemporary Europeans and Asians left no mDNA in today's Europeans and Asians, and thus presumably were not their ancestors (at least — and this is a big proviso — not their all-maternal-line ancestors).
But that says nothing about evolution's having stopped with Eve. We can assume that most evolution was done with by the time the ancestors of the modern races separated and stopped exchanging genes, since today we are birds of a feather. But that did not happen as soon as Eve breathed her last. The diaspora of the races, and the end of significant human evolution, must have occurred much later. Eve is not our most recent common ancestor, only our most recent common ancestor in the all-maternal line. The most recent common ancestor along a mixed-sex line of descendants lived much later. You and a first cousin share an ancestor of just two generations ago, your common grandmother or grandfather. But in looking for a shared all-female-line ancestor (your mother's mother's mother, and so on), then except for one kind of cousin (the child of your mother's sister), there's almost no limit to how far back you might have to go. So if someone were to guess the degree of relatedness between you and your cousin based on your most recent ancestor, he would say you were closely related. But if he could check only the most recent all-female-line ancestor, he might guess that you are not {205} related at all! Similarly, the birthday of humanity's most recent common all-female-line ancestor, mitochondrial Eve, overestimates how long ago all of humanity was still interbreeding.
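The cousin argument can be checked on a toy family tree. The sketch below is only an illustration: the pedigree, every name in it (including the stand-in called "eve"), and the two helper functions are invented for the example, and the cousin is assumed to be the child of your father's brother.

```python
# Hypothetical pedigree: person -> (mother, father). None marks an unrecorded parent.
PEDIGREE = {
    "you": ("mom", "dad"),
    "cousin": ("aunt_in_law", "uncle"),      # child of your father's brother
    "dad": ("grandma_p", "grandpa_p"),
    "uncle": ("grandma_p", "grandpa_p"),
    "mom": ("grandma_m", "grandpa_m"),
    "aunt_in_law": ("grandma_x", "grandpa_x"),
    "grandma_m": ("great_grandma_m", None),
    "grandma_x": ("great_grandma_x", None),
    "great_grandma_m": ("eve", None),        # the two maternal lines meet only here
    "great_grandma_x": ("eve", None),
}

def ancestors(person, maternal_only=False):
    """Map each ancestor of person to the fewest generations separating them."""
    depth = {}
    frontier = [(person, 0)]
    while frontier:
        name, d = frontier.pop(0)            # breadth-first, so first visit is shortest
        parents = PEDIGREE.get(name, (None, None))
        if maternal_only:
            parents = parents[:1]            # follow mothers only
        for parent in parents:
            if parent is not None and parent not in depth:
                depth[parent] = d + 1
                frontier.append((parent, d + 1))
    return depth

def generations_to_common_ancestor(a, b, maternal_only=False):
    """Generations back to the most recent ancestor shared by a and b, or None."""
    da, db = ancestors(a, maternal_only), ancestors(b, maternal_only)
    shared = set(da) & set(db)
    if not shared:
        return None
    return min(max(da[s], db[s]) for s in shared)

any_line = generations_to_common_ancestor("you", "cousin")
female_line = generations_to_common_ancestor("you", "cousin", maternal_only=True)
```

In this made-up family the mixed-sex search stops at the shared grandparents, two generations back, while the all-female search has to go back four generations. The same two people look twice as distant when only mothers are counted, and in a real population the gap could be far larger.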
Well after Eve's day, some geneticists think, our ancestors passed through a population bottleneck. According to their scenario, which is based on the remarkable sameness of genes across modern human populations, around 65,000 years ago our ancestors dwindled to a mere ten thousand people, perhaps because of a global cooling triggered by a volcano in Sumatra. The human race was as endangered as mountain gorillas are today. The population then exploded in Africa and spun off small bands that moved to other corners of the world, possibly mating now and again with other early humans in their path. Many geneticists believe that evolution is especially rapid when scattered populations exchange occasional migrants. Natural selection can quickly adapt each group to local conditions, so one or more can cope with any new challenge that arises, and their handy genes will then be imported by the neighbors. Perhaps this period saw a final flowering in the evolution of the human mind.
All reconstructions of our evolutionary history are controversial, and the conventional wisdom changes monthly. But I predict that the closing date of our biological evolution will creep later, and the opening date of the archeological revolution will creep earlier, until they coincide. Our minds and our way of life evolved together.
Are we still evolving? Biologically, probably not much. Evolution has no momentum, so we will not turn into the creepy bloat-heads of science fiction. The modern human condition is not conducive to real evolution either. We infest the whole habitable and not-so-habitable earth, migrate at will, and zigzag from lifestyle to lifestyle. This makes us a nebulous, moving target for natural selection. If the species is evolving at all, it is happening too slowly and unpredictably for us to know the direction.
But Victorian hopes spring eternal. If genuine natural selection cannot improve us, maybe a human-made substitute can. The social sciences are filled with claims that new kinds of adaptation and selection have extended the biological kind. But the claims, I think, are misleading. {206}
The first claim is that the world contains a wonderful process called “adaptation” that causes organisms to solve problems. Now, in Darwin's strict sense, adaptation in the present is caused by selection in the past. Remember how natural selection gives an illusion of teleology: selection may look like it is adapting each organism to its needs in the present, but really it is just favoring the descendants of organisms that were adapted to their own needs in the past. The genes that built the most adaptive bodies and minds among our ancestors got passed down to build the innate bodies and minds of today (including innate abilities to track certain kinds of environmental variation, as in tanning, callusing, and learning).
But for some, that does not go far enough; adaptation happens daily. “Darwinian social scientists” such as Paul Turke and Laura Betzig believe that “modern Darwinian theory predicts that human behavior will be adaptive, that is, designed to promote maximum reproductive success . . . through available descendent and nondescendent relatives.” “Functionalists” such as the psychologists Elizabeth Bates and Brian MacWhinney say that they “view the selectional processes operating during evolution and the selectional processes operating during [learning] as part of one seamless natural fabric.” The implication is that there is no need for specialized mental machinery: if adaptation simply makes organisms do the right thing, who could ask for anything more? The optimal solution to a problem — eating with one's hands, finding the right mate, inventing tools, using grammatical language — is simply inevitable.
The problem with functionalism is that it is Lamarckian. Not in the sense of Lamarck's second principle, the inheritance of acquired characteristics — the giraffes who stretched their necks and begat baby giraffes with necks pre-stretched. Everyone knows to stay away from that. (Well, almost everyone: Freud and Piaget stuck to it long after it was abandoned by biologists.) It is Lamarckian in the sense of his first principle, “felt need” — the giraffes growing their necks when they hungrily eyed the leaves just out of reach. As Lamarck put it, “New needs which establish a necessity for some part really bring about the existence of that part as a result of efforts.” If only it were so! As the saying goes, if wishes were horses, beggars would ride. There are no guardian angels seeing to it that every need is met. They are met only when mutations appear that are capable of building an organ that meets the need, when the organism finds itself in an environment in which meeting the need translates into more surviving babies, and in which that selection pressure persists over {207} thousands of generations. Otherwise, the need goes unmet. Swimmers do not grow webbed fingers; Eskimos do not grow fur. I have studied three-dimensional mirror-images for twenty years, and though I know mathematically that you can convert a left shoe into a right shoe by turning it around in the fourth dimension, I have been unable to grow a 4-D mental space in which to visualize the flip.
Felt need is an alluring idea. Needs really do feel like they bring forth their own solutions. You're hungry, you have hands, the food's in front of you, you eat with your hands; how else could it be? Ah, but you're the last one we should ask. Your brain was fashioned by natural selection so that it would find such problems obvious. Change the mind (to a robot's, or to another animal's, or to a neurological patient's), or change the problem, and it's no longer so obvious what's obvious. Rats can't learn to drop a piece of food for a larger reward. When chimpanzees try to imitate someone raking in an inaccessible snack, they don't notice that the rake has to be held business-end down, even if the role model makes a conspicuous show of aligning it properly. Lest you feel smug, the chapters to come will show how the design of our own minds gives rise to paradoxes, brain-teasers, myopias, illusions, irrationalities, and self-defeating strategies that prevent, rather than guarantee, the meeting of our everyday needs.
But what about the Darwinian imperative to survive and reproduce? As far as day-to-day behavior is concerned, there is no such imperative. People watch pornography when they could be seeking a mate, forgo food to buy heroin, sell their blood to buy movie tickets (in India), postpone childbearing to climb the corporate ladder, and eat themselves into an early grave. Human vice is proof that biological adaptation is, speaking literally, a thing of the past. Our minds are adapted to the small foraging bands in which our family spent ninety-nine percent of its existence, not to the topsy-turvy contingencies we have created since the agricultural and industrial revolutions. Before there was photography, it was adaptive to receive visual images of attractive members of the opposite sex, because those images arose only from light reflecting off fertile bodies. Before opiates came in syringes, they were synthesized in the brain as natural analgesics. Before there were movies, it was adaptive to witness people's emotional struggles, because the only struggles you could witness were among people you had to psych out every day. Before there was contraception, children were unpostponable, and status and wealth could be converted into more children and healthier ones. Before {208} there was a sugar bowl, salt shaker, and butter dish on every table, and when lean years were never far away, one could never get too much sweet, salty, and fatty food. People do not divine what is adaptive for them or their genes; their genes give them thoughts and feelings that were adaptive in the environment in which the genes were selected.
The other extension of adaptation is the seemingly innocuous cliche that “cultural evolution has taken over from biological evolution.” For millions of years, genes were transmitted from body to body and were selected to confer adaptations on organisms. But after humans emerged, units of culture were transmitted from mind to mind and were selected to confer adaptations on cultures. The torch of progress has been passed to a swifter runner. In 2001: A Space Odyssey, a hairy arm hurls a bone into the air, and it fades into a space station.
The premise of cultural evolution is that there is a single phenomenon — the march of progress, the ascent of man, apes to Armageddon — that Darwin explained only up to a point. My own view is that human brains evolved by one set of laws, those of natural selection and genetics, and now interact with one another according to other sets of laws, those of cognitive and social psychology, human ecology, and history. The reshaping of the skull and the rise and fall of empires may have little in common.
Richard Dawkins has drawn the clearest analogy between the selection of genes and the selection of bits of culture, which he dubbed memes. Memes such as tunes, ideas, and stories spread from brain to brain and sometimes mutate in the transmission. New features of a meme that make its recipients more likely to retain and disseminate it, such as being catchy, seductive, funny, or irrefutable, will lead to the meme's becoming more common in the meme pool. In subsequent rounds of retelling, the most spreadworthy memes will spread the most and will eventually take over the population. Ideas will therefore evolve to become better adapted to spreading themselves. Note that we are talking about ideas evolving to become more spreadable, not people evolving to become more knowledgeable.
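The arithmetic behind "the most spreadworthy memes will eventually take over" is easy to sketch. What follows is a bare-bones illustration of differential retelling, not a model taken from Dawkins: the two variants, their starting shares, and their retelling rates are all made-up numbers, and the function name is hypothetical.

```python
def retell(shares, spreadability, rounds):
    """Reweight each variant's share of the meme pool by its retelling rate, round after round."""
    for _ in range(rounds):
        weighted = {m: shares[m] * spreadability[m] for m in shares}
        total = sum(weighted.values())
        shares = {m: w / total for m, w in weighted.items()}   # renormalize to sum to 1
    return shares

pool = {"dull": 0.99, "catchy": 0.01}     # the catchy variant starts out rare
appeal = {"dull": 1.0, "catchy": 1.5}     # assumed: retold half again as often
after = retell(pool, appeal, 30)
```

After thirty rounds the catchy variant, which began as one percent of the pool, holds nearly all of it. Nothing in the loop makes anyone wiser; it only records which version gets repeated more often.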
Dawkins himself used the analogy to illustrate how natural selection pertains to anything that can replicate, not just DNA. Others treat it as a {209} genuine theory of cultural evolution. Taken literally, it predicts that cultural evolution works like this. A meme impels its bearer to broadcast it, and it mutates in some recipient: a sound, a word, or a phrase is randomly altered. Perhaps, as in Monty Python's Life of Brian, the audience of the Sermon on the Mount mishears “Blessed are the peacemakers” as “Blessed are the cheesemakers.” The new version is more memorable and comes to predominate in the majority of minds. It too is mangled by typos and speakos and hearos, and the most spreadable ones accumulate, gradually transforming the sequence of sounds. Eventually they spell out, “That's one small step for a man, one giant leap for mankind.”
I think you'll agree that this is not how cultural change works. A complex meme does not arise from the retention of copying errors. It arises because some person knuckles down, racks his brain, musters his ingenuity, and composes or writes or paints or invents something. Granted, the fabricator is influenced by ideas in the air, and may polish draft after draft, but neither of these progressions is like natural selection. Just compare the input and the output — draft five and draft six, or an artist's inspiration and her oeuvre. They do not differ by a few random substitutions. The value added with each iteration comes from focusing brainpower on improving the product, not from retelling or recopying it hundreds of thousands of times in the hope that some of the malaprops or typos will be useful.
Stop being so literal-minded! respond the fans of cultural evolution. Of course cultural evolution is not an exact replica of the Darwinian version. In cultural evolution, the mutations are directed and the acquired characteristics are inherited. Lamarck, while being wrong about biological evolution, turned out to be right about cultural evolution.
But this won't do. Lamarck, recall, was not just unlucky in his guess about life on this planet. As far as explaining complex design goes, his theory was, and is, a non-starter. It is mute about the beneficent force in the universe or all-knowing voice in the organism that bestows the useful mutations. And it's that force or voice that's doing all the creative work. To say that cultural evolution is Lamarckian is to confess that one has no idea how it works. The striking features of cultural products, namely their ingenuity, beauty, and truth (analogous to organisms’ complex adaptive design), come from the mental computations that “direct” — that is, invent — the “mutations,” and that “acquire” — that is, understand — the “characteristics.”
Models of cultural transmission do offer insight on other features of {210} cultural change, particularly their demographics — how memes can become popular or unpopular. But the analogy is more from epidemiology than from evolution: ideas as contagious diseases that cause epidemics, rather than as advantageous genes that cause adaptations. They explain how ideas become popular, but not where ideas come from.
Many people unfamiliar with cognitive science see cultural evolution as the only hope for grounding wispy notions like ideas and culture in rigorous evolutionary biology. To bring culture into biology, they reason, one shows how it evolved by its own version of natural selection. But that is a non sequitur; the products of evolution don't have to look like evolution. The stomach is firmly grounded in biology, but it does not randomly secrete variants of acids and enzymes, retain the ones that break down food a bit, let them sexually recombine and reproduce, and so on for hundreds of thousands of meals. Natural selection already went through such trial and error in designing the stomach, and now the stomach is an efficient chemical processor, releasing the right acids and enzymes on cue. Likewise, a group of minds does not have to recapitulate the process of natural selection to come up with a good idea. Natural selection designed the mind to be an information processor, and now it perceives, imagines, simulates, and plans. When ideas are passed around, they aren't merely copied with occasional typographical errors; they are evaluated, discussed, improved on, or rejected. Indeed, a mind that passively accepted ambient memes would be a sitting duck for exploitation by others and would have quickly been selected against.
The geneticist Theodosius Dobzhansky famously wrote that nothing in biology makes sense except in the light of evolution. We can add that nothing in culture makes sense except in the light of psychology. Evolution created psychology, and that is how it explains culture. The most important relic of early humans is the modern mind.
{211}
To gaze is to think.
— Salvador Dali
Past decades had hula hoops, black-light posters, CB radios, and Rubik's cube. The craze of the 1990s is the autostereogram, also called Magic Eye, Deep Vision, and Superstereogram. These are the computer-generated squiggles that when viewed with crossed eyes or a distant gaze spring into a vivid illusion of three-dimensional, razor-edged objects majestically suspended in space. The fad is now five years old and autostereograms are everywhere, from postcards to Web pages. They have been featured in editorial cartoons, in the Blondie comic strip, and in situation comedies like Seinfeld and Ellen. In one episode, the comedian Ellen DeGeneres belongs to a reading club that has chosen a stereogram book as its weekly selection. Ashamed that she cannot see the illusions, she sets aside an evening to train herself, without success. In desperation she joins a support group for people who cannot “get” stereograms.
Visual illusions fascinated people long before the psychologist Christopher Tyler inadvertently created this sensation in his research on binocular (two-eyed) vision. Simpler illusions made up of parallel lines that seem to converge and congruent lines that look unequal have long appeared in cereal-box reading material, Crackerjack prizes, children's museums, and psychology courses. Their fascination is obvious. “Who are you going to believe, me or your own eyes?” says Groucho Marx to Margaret Dumont, playing on our faith that vision is a certain route to knowledge. As the sayings go: I call them as I see them; Seeing is believing; We have an eyewitness; I saw it with my own eyes. But if a devilish {212} display can make us see things that aren't there, how can we trust our eyes at other times?
Illusions are no mere curiosities; they set the intellectual agenda for centuries of Western thought. Skeptical philosophy, as old as philosophy itself, impugns our ability to know anything by rubbing our faces in illusions: the oar in the water that appears bent, the round tower that from a distance looks flat, the cold finger that perceives tepid water as hot while the hot finger perceives it as cold. Many of the great ideas of the Enlightenment were escape hatches from the depressing conclusions skeptical philosophers drew from illusions. We can know by faith, we can know by science, we can know by reason, we can know that we think and therefore that we are.
Perception scientists take a lighter view. Vision may not work all the time, but we should marvel that it works at all. Most of the time we don't bump into walls, bite into plastic fruit, or fail to recognize our mothers. The robot challenge shows that this is no mean feat. The medieval philosophers were wrong when they thought that objects conveniently spray tiny copies of themselves in all directions and the eye captures a few and grasps their shape directly. We can imagine a science-fiction creature that embraces an object with calipers, prods it with probes and dipsticks, makes rubber molds, drills core samples, and snips off bits for biopsies. But real organisms don't have these luxuries. When they apprehend the world by sight, they have to use the splash of light reflected off its objects, projected as a two-dimensional kaleidoscope of throbbing, heaving streaks on each retina. The brain somehow analyzes the moving collages and arrives at an impressively accurate sense of the objects out there that gave rise to them.
The accuracy is impressive because the problems the brain is solving are literally unsolvable. Recall from Chapter 1 that inverse optics, the deduction of an object's shape and substance from its projection, is an “ill-posed problem,” a problem that, as stated, has no unique solution. An elliptical shape on the retina could have come from an oval viewed head-on or a circle viewed at a slant. A patch of gray could have come from a snowball in the shade or a lump of coal in the sun. Vision has evolved to convert these ill-posed problems into solvable ones by adding premises: assumptions about how the world we evolved in is, on average, put together. For example, I will explain how the human visual system “assumes” that matter is cohesive, surfaces are uniformly colored, and objects don't go out of their way to line up in confusing arrangements. {213} When the current world resembles the average ancestral environment, we see the world as it is. When we land in an exotic world where the assumptions are violated — because of a chain of unlucky coincidences or because a sneaky psychologist concocted the world to violate the assumptions — we fall prey to an illusion. That is why psychologists are obsessed with illusions. They unmask the assumptions that natural selection installed to allow us to solve unsolvable problems and know, much of the time, what is out there.
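The two examples of ambiguity can be put into numbers. The sketch below makes simplifying assumptions that are not in the text: the shape is projected orthographically (no perspective), the slant is a rotation about the vertical axis, and the light reaching the eye is simply reflectance times illumination. The sizes, reflectances, and light levels are invented for illustration.

```python
import math

def projected_axes(width, height, slant_deg):
    """Image-plane width and height of a flat ellipse slanted about its vertical axis."""
    return (width * math.cos(math.radians(slant_deg)), height)

def luminance(reflectance, illumination):
    """Light sent to the eye by a matte surface, in arbitrary units."""
    return reflectance * illumination

# Shape: a circle seen at a slant versus an oval seen head-on.
circle_slanted = projected_axes(2.0, 2.0, slant_deg=60)
oval_head_on = projected_axes(1.0, 2.0, slant_deg=0)

# Shade: a snowball in dim light versus a lump of coal in bright light.
snowball_in_shade = luminance(reflectance=0.9, illumination=10)
coal_in_sun = luminance(reflectance=0.09, illumination=100)

same_shape = all(math.isclose(a, b) for a, b in zip(circle_slanted, oval_head_on))
same_gray = math.isclose(snowball_in_shade, coal_in_sun)
```

Both comparisons come out equal: two different worlds, one image. Going from the image back to the world means picking one of the candidates, and that choice is what the added assumptions are for.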
Perception is the only branch of psychology that has been consistently adaptation-minded, seeing its task as reverse-engineering. The visual system is not there to entertain us with pretty patterns and colors; it is contrived to deliver a sense of the true forms and materials in the world. The selective advantage is obvious: animals that know where the food, the predators, and the cliffs are can put the food in their stomachs, keep themselves out of the stomachs of others, and stay on the right side of the clifftop.
The grandest vision of vision has come from the late artificial intelligence researcher David Marr. Marr was the first to describe vision as solving ill-posed problems by adding assumptions about the world, and was a forceful defender of the computational theory of mind. He also offered the clearest statement of what vision is for. Vision, he said, “is a process that produces from images of the external world a description that is useful to the viewer and not cluttered with irrelevant information.”
It may seem strange to read that the goal of vision is a “description.” After all, we don't walk around muttering a play-by-play narration of everything we see. But Marr was referring not to a publicly spoken description in English but to an internal, abstract one in mentalese. What does it mean to see the world? We can describe it in words, of course, but we can also negotiate it, manipulate it physically and mentally, or file it away in memory for future reference. All these feats depend on construing the world as real things and stuff, not as the psychedelia of the retinal image. We call a book “rectangular,” not “trapezoidal,” though it projects a trapezoid on the retina. We mold our fingers into a rectangular (not trapezoidal) posture as we reach for it. We build rectangular (not trapezoidal) shelves to hold it, and we deduce that it can support a broken couch by fitting into the rectangular space beneath it. Somewhere in the mind there must be a mental symbol for “rectangle,” delivered by vision but available at once to the rest of the verbal and non-verbal {214} mind. That mental symbol, and the mental propositions that capture the spatial relations among objects (“book lying face down on shelf near door”), are examples of the “description” that Marr charged vision with computing.
If vision did not deliver a description, every mental faculty — language, walking, grasping, planning, imagining — would need its own procedure for deducing that the trapezoid on the retina is a rectangle in the world. That alternative predicts that a person who can call a slanted rectangle a “rectangle” may still have to learn how to hold it as a rectangle, how to predict that it will fit into rectangular spaces, and so on. That seems unlikely. When vision deduces the shape of an object that gave rise to a pattern on the retina, all parts of the mind can exploit the discovery. Though some parts of the visual system siphon off information to motor-control circuits that need to react quickly to moving targets, the system as a whole is not dedicated to any one kind of behavior. It creates a description or representation of the world, couched in objects and 3-D coordinates rather than retinal images, and inscribes it on a blackboard readable by all the mental modules.
This chapter explores how vision turns retinal depictions into mental descriptions. We will work our way up from splashes of light to concepts of objects, and beyond them to a kind of interaction between seeing and thinking known as mental imagery. The repercussions reach to the rest of the psyche. We are primates — highly visual creatures — with minds that evolved around this remarkable sense.
Let's begin with the stereograms. How do they work, and why, for some people, don't they work? Despite all the posters, books, and jigsaw puzzles, I have not seen a single attempt at explaining them to the millions of curious consumers. Understanding stereograms is not only a good way to grasp the workings of perception but it is also a treat for the intellect. Stereograms are yet another example of the marvelous contrivances of natural selection, this one inside our own heads.
Autostereograms exploit not one but four discoveries on how to trick the eye. The first, strange to say, is the picture. We are so jaded by photographs, drawings, television, and movies that we forget that they are a {215} benign illusion. Smears of ink or flickering phosphor dots can make us laugh, cry, even become sexually aroused. Humans have made pictures for at least thirty thousand years, and contrary to some social-science folklore, the ability to see them as depictions is universal. The psychologist Paul Ekman created a furor in anthropology by showing that isolated New Guinean highlanders could recognize the facial expressions in photographs of Berkeley students. (Emotions, like everything else, were thought to be culturally relative.) Lost in the brouhaha was a more basic discovery: that the New Guineans were seeing things in the photographs at all rather than treating them as blotchy gray paper.
The picture exploits projection, the optical law that makes perception such a hard problem. Vision begins when a photon (unit of light energy) is reflected off a surface and zips along a line through the pupil to stimulate one of the photoreceptors (rods and cones) lining the curved inner surface of the eyeball. The receptor passes a neural signal up to the brain, and the brain's first task is to figure out where in the world that photon came from. Unfortunately, the ray defining the photon's path extends out to infinity, and all the brain knows is that the originating patch lies somewhere along the ray. For all the brain knows, it could be a foot away, a mile away, or many light-years away; information about the third dimension, distance from the eye, has been lost in the process of projection. The ambiguity is multiplied combinatorially by the million other receptors in the retina, each fundamentally confused about how far away its stimulating patch lies. Any retinal image, then, could have been produced by an infinite number of arrangements of three-dimensional surfaces in the world (see the diagram on p. 9).
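The loss of the third dimension can be made concrete with a minimal sketch. It assumes an idealized pinhole eye at the origin looking down the z axis; the function name `project` and the unit focal length are illustrative choices, not anything measured from a real eye. Three patches at very different distances along one ray land on exactly the same image coordinates.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3-D point (x, y, z) onto an image plane.

    Assumes an idealized pinhole at the origin looking down the z axis.
    z is the distance from the eye and must be positive.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the eye")
    return (focal_length * x / z, focal_length * y / z)


# Three patches of surface on the same ray, at very different distances.
near = (0.5, 0.25, 1.0)
middle = (5.0, 2.5, 10.0)
far = (5000.0, 2500.0, 10000.0)

# All three entries are the same image point: distance has been discarded.
images = [project(p) for p in (near, middle, far)]
```

Nothing in the returned pair says which of the three patches produced it, which is the ambiguity each receptor faces.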
Of course, we don't perceive infinite possibilities; we home in on one, generally close to the correct one. And here is an opening for a crafter of illusions. Arrange some matter so that it projects the same retinal image as an object the brain is biased to recognize, and the brain should have no way of telling the difference. A simple example is the Victorian novelty in which a peephole in a door revealed a sumptuously furnished room, but when the door was opened the room was empty. The sumptuous room was in a dollhouse nailed to the door over the peephole.
The painter-turned-psychologist Adelbert Ames, Jr., made a career out of carpentering even stranger illusory rooms. In one, rods and slabs were suspended from wires higgledy-piggledy throughout the room. But when the room was seen from outside through a peephole in a wall, the rods and slabs lined up into a projection of a kitchen chair. In another {216} room, the rear wall slanted away from left to right, but it had crazy angles that made its left side just short enough to cancel its expansion in perspective, and its right side just tall enough to cancel its contraction. Through a peephole on the opposite side, the wall projected a rectangle. The visual system hates coincidences: it assumes that a regular image comes from something that really is regular and that it doesn't just look that way because of the fortuitous alignment of an irregular shape. Ames did align an irregular shape to give a regular image, and he reinforced his cunning trick with crooked windows and floor tiles. When a child stands in the near corner and her mother stands in the far one, the child projects a larger retinal image. The brain takes depth into account when assessing size; that's why a looming toddler never seems to dominate her distant parent in everyday life. But now the viewer's sense of depth is a victim of its distaste for coincidence. Every inch of the wall appears the same distance away, so the retinal images of the bodies are interpreted at face value, and Junior towers over Mom. When they change places by walking along the rear wall, Junior shrinks to lapdog size and Mom becomes Wilt Chamberlain. Ames’ room has been built in several museums of science, such as the Exploratorium in San Francisco, and you can see (or be seen in) this astonishing illusion for yourself.
Now, a picture is nothing but a more convenient way of arranging matter so that it projects a pattern identical to real objects. The mimicking matter sits on a flat surface, rather than in a dollhouse or suspended by wires, and it is formed by smearing pigments rather than by cutting shapes out of wood. The shapes of the smears can be determined without the twisted ingenuity of an Ames. The trick was stated succinctly by Leonardo da Vinci: “Perspective is nothing else than seeing a place behind a pane of glass, quite transparent, on the surface of which the objects behind the glass are drawn.” If the painter sights the scene from {217} a fixed viewing position and copies the contours faithfully, down to the last hair of the dog, a person who then views the painting from the position of the painter would have his eye impaled by the same sheaf of light rays that the original scene projected. In that part of the visual field the painting and the world would be indistinguishable. Whatever assumptions impel the brain to see the world as the world and not as smeared pigment will impel it to see the painting as the world and not as smeared pigment.
What are those assumptions? We'll explore them later, but here is a preview. Surfaces are evenly colored and textured (that is, covered with regular grain, weave, or pockmarking), so a gradual change in the markings on a surface is caused by lighting and perspective. The world often contains parallel, symmetrical, regular, right-angled figures lying on the flat ground, which only appear to taper in tandem; the tapering is written off as an effect of perspective. Objects have regular, compact silhouettes, so if Object A has a bite taken out that is filled by Object B, A is behind B; accidents don't happen in which a bulge in B fits flush into the bite in A. You can feel the force of the assumptions in these line drawings, which convey an impression of depth.
In practice, realist painters do not daub paint on windows but use visual images from memory and a host of tricks to accomplish the same thing on a canvas. They use grids made of wire or etched in glass, taut strings running from the scene through pinholes in the canvas to a viewing reticle, the camera obscura, the camera lucida, and now the camera Nikon. And, of course, no painter reproduces every hair of the dog. Brush strokes, the texture of the canvas, and the shape of the frame make a painting depart from the idealization of Leonardo's window. Also, we almost always see a painting from a vantage point different from the one the painter assumed in front of his window, and this makes the sheaf of light rays impaling the eye different from the one the real scene would send out. That is why paintings are only partly illusory: we see what the painting depicts, but we simultaneously see it as a painting, not as reality. The canvas and frame tip us off, and remarkably, we use these very clues about picturehood to ascertain our vantage point relative to the painting {218} and to compensate for its difference from the painter's. We undo the distortion of the picture as if seeing it from the painter's perspective, and interpret the adjusted shapes correctly. The compensation works only up to a point. When we arrive late to a movie and sit in the front row, the difference between our vantage point and the camera's (analogous to the painter at Leonardo's window) is too much of a stretch, and we see warped actors slithering across a trapezoid.
There is another difference between art and life. The painter had to sight the scene from a single vantage point. The viewer peeps at the world from two vantage points: his left eye's and his right eye's. Hold out a finger and remain still while you close one eye, then the other. The finger obscures different parts of the world behind it. The two eyes have slightly different views, a fact of geometry called binocular parallax.
Many kinds of animals have two eyes, and whenever they aim forward, so that their fields overlap (rather than aiming outward for a panoramic view), natural selection must have faced the problem of combining their pictures into a unified image that the rest of the brain can use. That hypothetical image is named after a mythical creature with a single eye in the middle of its forehead: the Cyclops, a member of a race of monocular giants encountered by Odysseus in his travels. The problem in making a cyclopean image is that there is no direct way to overlay the views of the two eyes. Most objects fall on different places in the two images, and the difference depends on how far away they are: the closer the object, the farther apart its facsimiles lie in the two eyes' projections. Imagine looking at an apple on a table, with a lemon behind it and cherries in front. {219}
Your eyes are aimed at the apple, so its image lands on each eye's fovea (the dead center of the retina, where vision is sharpest). The apple is at six o'clock in both retinas. Now look at the projections of the cherries, which are nearer. In the left eye they sit at seven o'clock, but in the right eye they sit at five o'clock, not seven. The lemon, which is farther, projects an image at five-thirty in the left eye but at six-thirty in the right eye. Objects closer than the point of fixation wander outward toward the temples; objects that are farther squeeze inward toward the nose.
But the impossibility of a simple overlay presented evolution with an opportunity. With a bit of high school trigonometry, one can use the difference in an object's projection in the two eyes, together with the angle formed by the two eyes’ gaze and their separation in the skull, to calculate how far away the object is. If natural selection could wire up a neural computer to do the trig, a two-eyed creature could shatter Leonardo's window and sense an object's depth. The mechanism is called stereoscopic vision, stereo for short.
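The high school trigonometry can be sketched in a few lines. This is an illustration under simplifying assumptions, not a claim about how neurons do it: the object is taken to lie straight ahead, midway between the eyes, and the eye separation of 6.35 cm (two and a half inches) is an assumed typical value. The function names are invented for the example.

```python
import math

EYE_SEPARATION_CM = 6.35  # two and a half inches; an assumed typical value


def distance_from_vergence(vergence_rad, eye_separation=EYE_SEPARATION_CM):
    """Distance to the point where the two lines of sight cross.

    Assumes the point lies straight ahead, midway between the eyes, so each
    line of sight forms a right triangle with half the eye separation.
    """
    return (eye_separation / 2) / math.tan(vergence_rad / 2)


def vergence_for_distance(distance, eye_separation=EYE_SEPARATION_CM):
    """Inverse of distance_from_vergence: the angle between the lines of sight."""
    return 2 * math.atan((eye_separation / 2) / distance)


def object_distance(fixation_vergence_rad, disparity_rad,
                    eye_separation=EYE_SEPARATION_CM):
    """Distance of an object given the gaze angle and the object's disparity.

    disparity_rad is the extra angle the object's lines of sight make compared
    with the fixation point: positive for nearer objects, negative for farther.
    """
    return distance_from_vergence(fixation_vergence_rad + disparity_rad,
                                  eye_separation)


# Eyes fixated on an apple 50 cm away, with cherries at 40 cm and a lemon at 60 cm.
gaze = vergence_for_distance(50.0)
cherry_disparity = vergence_for_distance(40.0) - gaze  # positive: nearer
lemon_disparity = vergence_for_distance(60.0) - gaze   # negative: farther
```

Feeding the gaze angle and each disparity back into `object_distance` recovers the 40 and 60 centimeters. The disparity alone would not be enough; the gaze angle and the eye separation are needed to turn it into a distance.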
Incredibly, for thousands of years no one noticed. Scientists thought that animals have two eyes for the same reason they have two kidneys: as a by-product of a bilaterally symmetrical body plan, and perhaps so that one could serve as a spare if the other got damaged. The possibility of stereo vision escaped Euclid, Archimedes, and Newton, and even Leonardo did not fully appreciate it. He did notice that the two eyes have different views of a sphere, the left eye seeing slightly farther around it on the left and the right eye seeing farther around it on the right. If only he had used a cube in his example instead of a sphere, he would have noticed that the shapes on the retinas are different. Stereo vision was not discovered until 1838, by Charles Wheatstone, a physicist and inventor after whom the “Wheatstone bridge” electrical circuit is named. Wheatstone wrote:
It will now be obvious why it is impossible for the artist to give a faithful representation of any near solid object, that is, to produce a painting which shall not be distinguished in the mind from the object itself. When the painting and the object are seen with both eyes, in the case of the painting two similar pictures are projected on the retinae, in the case of the solid object the two pictures are dissimilar; there is therefore an essential difference between the impressions on the organs of sensation in the two cases, and consequently between the perceptions formed in the mind; the painting therefore cannot be confounded with the solid object. {220}
The late discovery of stereo vision is surprising, because it is not hard to notice in everyday experience. Keep one eye closed for a few minutes as you walk around. The world is a flatter place, and you might find yourself grazing doorways and spooning sugar into your lap. Of course, the world does not flatten completely. The brain still has the kinds of information that are present in pictures and television, like tapering, occlusion, placement on the ground, and gradients of texture. Most important, it has motion. As you move around, your vantage point changes continuously, making nearby objects whiz by and farther ones budge more slowly. The brain interprets the flow pattern as a three-dimensional world going by. The perception of structure from optical flow is obvious in Star Trek, Star Wars, and popular computer screen-savers where white dots fleeing the center of the monitor convey a vivid impression of flying through space (though real stars would be too far away to give that impression to a real-life starfleet crew). All these monocular cues to depth allow people who are blind in one eye to get around pretty well, including the aviator Wiley Post and a wide receiver for the New York Giants football team in the 1970s. The brain is an opportunistic and mathematically adroit consumer of information, and perhaps that is why its use of one cue, binocular disparity, eluded scientists for so long.
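Why nearby objects whiz by while distant ones barely budge can be shown with a small sketch. It assumes an observer moving in a straight line at constant speed past a stationary object; the function name `bearing_rate` and the sample numbers are illustrative only.

```python
def bearing_rate(speed, distance_from_path, distance_ahead=0.0):
    """Rate (radians per unit time) at which an object's direction changes.

    Assumes the observer moves in a straight line at constant speed past a
    stationary object. distance_from_path is the object's perpendicular
    distance from the line of travel; distance_ahead is how far along that
    line the object still lies (0 means it is directly abeam).
    """
    return (speed * distance_from_path
            / (distance_ahead ** 2 + distance_from_path ** 2))


walking_speed = 1.4  # meters per second; an assumed ordinary walking pace
doorframe = bearing_rate(walking_speed, 1.0)    # an object 1 m to the side
hillside = bearing_rate(walking_speed, 1000.0)  # an object 1 km to the side
```

With these numbers the doorframe sweeps across the visual field a thousand times faster than the hillside, and anything as remote as a star would not appear to move at all.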
Wheatstone proved that the mind turns trigonometry into consciousness when he designed the first fully three-dimensional picture, the stereogram. The idea is simple. Capture a scene using two of Leonardo's windows, or, more practically, two cameras, each positioned where one eye would be. Place the right picture in front of a person's right eye and the left picture in front of his left eye. If the brain assumes that the two eyes look at one three-dimensional world, with differences in the views coming from binocular parallax, it should be fooled by the pictures and combine them into a cyclopean image in which objects appear at different depths. {221}
But here Wheatstone ran into a problem, one that still challenges all stereoscopic gadgets. The brain physically adjusts the eyes to the depth of a surface in two ways. First, though I have been describing the pupil as if it were a pinhole, in fact it has a lens to accumulate many rays of light emanating from a point in the world and to focus them all at a point on the retina. The closer the object, the more the rays have to be bent for them to converge to a point rather than to a blurry disk, and the fatter the lens of the eye has to be. Muscles inside the eyeball have to thicken the lens to focus nearby objects and flatten it to focus distant objects.
The squeezing is controlled by the focusing reflex, a feedback loop that adjusts the shape of the lens until the fine detail on the retina is at a maximum. (The circuit is similar to the one used in some autofocus cameras.) Poorly focused movies are annoying to watch because the brain keeps trying to eliminate the blur by accommodating the lens, a futile gesture.
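The link between an object's distance and the shape of the lens can be sketched with the thin-lens equation from introductory optics. This is a deliberate simplification: the real eye is not a single thin lens in air, and the 17 mm lens-to-retina distance is an assumed round figure. The function name is invented for the example.

```python
def required_focal_length(object_distance_mm, image_distance_mm=17.0):
    """Focal length that focuses an object onto a retina at a fixed distance.

    Uses the thin-lens equation 1/f = 1/d_object + 1/d_image. The 17 mm
    lens-to-retina distance is an assumed round figure for a simplified eye.
    """
    return 1.0 / (1.0 / object_distance_mm + 1.0 / image_distance_mm)


reading = required_focal_length(250.0)      # a page at 25 cm
horizon = required_focal_length(100_000.0)  # something 100 m away
```

The nearby page calls for the shorter focal length, which means more bending of the rays and a fatter lens, while the distant object lets the focal length approach the full lens-to-retina distance.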
The second physical adjustment is to aim the two eyes, which are about two and a half inches apart, at the same spot in the world. The closer the object, the more the eyes must be crossed. {222}
The eyes are crossed and uncrossed by muscles attached to their sides; the muscles are controlled by a brain circuit that tries to eliminate double images. (Seeing double is often a sign that the brain has been poisoned, suffocated, or bruised.) The circuit is similar to the rangefinders in old cameras, in which a prism superimposes the views from two viewfinder windows and the photographer angles the prism (which is geared to the camera lens) until the images line up. The brain uses the rangefinder principle as another source of information about depth, perhaps an indispensable one. Stereo vision gives information only about relative depth — depth in front of or behind the point on which the eyes have converged — and feedback from eyeball direction must be used to anchor a sense of absolute depth.
Now here's the problem for the stereoscope designer. The focusing reflex and the eye-crossing reflex are coupled. If you focus on a nearby point to eliminate blur, the eyes converge; if you focus on a distant one, they become parallel. If you converge your eyes on a nearby point to eliminate double vision, the eyes squeeze the lens to close-up focus; if you diverge your eyes on a distant point, they relax for distant focus. The coupling defeats the most straightforward design for a stereoscope, in which a small picture is placed in front of each eye and both eyes point straight ahead, each at its own picture. Pointing the eyes straight ahead is what you do for distant objects, and it drags the focus of each eye to distance vision, blurring the pictures. Focusing the pictures then brings the eyes together, so the eyes are pointing at the same picture rather than each eye aiming at a different one, and that's no good, either. The eyes bob in and out and the lenses thicken and flatten, but not at the right times. To get a stereoscopic illusion, something has to give.
One solution is to uncouple the responses. Many experimental psychologists have trained themselves like fakirs to wrest control of their reflexes and to “free-fuse” stereograms by an act of will. Some cross their eyes at an imaginary point in front of the picture, so that the left eye is staring at the right picture and vice versa, while they focus each eye on the picture behind the imaginary point. Others lock their eyes straight ahead to infinity while maintaining focus. I once took an afternoon out to train myself to do this after I learned that William James said it was a skill every good psychologist should master. But people with lives cannot be expected to show such dedication.
Wheatstone's invention was a bit ungainly because he faced a second problem: the drawings and daguerreotypes of his age were too big to fit {223} in front of the eyes without overlapping, and people could not point their eyes outward to gaze at one on each side like fish. So he put one picture off to each side, the two facing each other like bookends, and between them he placed two mirrors glued together like the cover of an open book, each mirror reflecting a picture. He then put a prism in front of each mirror and adjusted them so that the two mirrors appeared to be superimposed. When people looked through the prisms and saw the superimposed reflections of the two pictures, the scene in the pictures leapt into three-dimensionality. The advent of better cameras and smaller film led to a simpler, hand-held design that is still with us. Small pictures — as always, photographed from two vantage points positioned like the eyes — are placed side by side with a perpendicular blinker between them and a glass lens in front of each eye. The glass lens relieves the eye of having to focus its nearby picture, and the eye can relax to its infinity setting. That spreads the eyes so they are pointing straight ahead, one at each picture, and the pictures easily fuse.
The stereoscope became the television of the nineteenth century. Victorian-era families and friends spent cozy hours taking turns to view stereo photographs of Parisian boulevards, Egyptian pyramids, or Niagara Falls. Beautiful wooden stereoscopes and the software for them (cards with side-by-side photographs) are still sold in antique stores to avid collectors. A modern version is the ViewMaster, available at tourist traps the world over: an inexpensive viewer that displays a ring of stereo slides of the local attractions.
A different technique, the anaglyph, overlays the two images on one surface and uses clever tricks so that each eye sees only the image intended for it. A familiar example is the notorious red-and-green cardboard eyeglasses associated with the 3-D movie craze of the early 1950s. The left eye's image is projected in red and the right eye's image is projected in green onto a single white screen. The left eye peers at the screen through a green filter, which makes the white background look green and the green lines intended for the other eye invisible; the red lines intended for the left eye stand out as black. Similarly, the red filter over the right eye makes the background red, the red lines invisible, and the green lines black. Each eye gets its own image, and the Sludge Monsters from Alpha Centauri rise out of the screen. An unfortunate side effect is that when the two eyes see very different patterns like the red and green backgrounds, the brain cannot fuse them. It carves the visual field into a patchwork and seesaws between seeing each patch as green or red, a {224} disconcerting effect called binocular rivalry. You can experience a milder case by holding a finger a few inches in front of you with both eyes open gazing into the distance so you get a double image. If you pay attention to one of the double images, you will notice that portions slowly become opaque, dissolve into transparency, fill in again, and so on.
A better kind of anaglyph puts polarizing filters, rather than colored filters, over two projector lenses and in the cardboard glasses. The image intended for the left eye is projected from the left projector in light waves that oscillate in a diagonal plane, like this: /. The light can pass through a filter in front of the left eye which has microscopic slits that are also in that orientation, but cannot pass through a filter in front of the right eye with slits in the opposite orientation, like this: \. Conversely, the filter in front of the right eye allows in only the light coming from the right projector. The superimposed images can be in color, and they do not incite rivalry between the eyes. The technique was used to excellent effect by Alfred Hitchcock in Dial “M” for Murder in the scene in which Grace Kelly reaches out for the scissors to stab her would-be strangler. The same cannot be said for the film adaptation of Cole Porter's Kiss Me Kate, in which a dancer belts out “Too Darn Hot” on a coffee table while flinging scarves at the camera.
Modern anaglyph glasses have panes made of liquid crystal displays (like the numbers on a digital watch) which act as silent, electrically controlled shutters. At any moment one shutter is transparent and the other is opaque, forcing the eyes to take turns at seeing a computer screen in front of them. The glasses are synchronized with the screen, which shows the left eye's image while the left shutter is open and the right eye's image while the right shutter is open. The views alternate too quickly for the eyes to notice the flicker. The technology is used in some virtual reality displays. But the state of the art in virtual reality is a high-tech version of the Victorian stereoscope. A computer displays each image on a little LCD screen with a lens in front of it, mounted in front of each eye on the inside of a helmet or visor.
These technologies all force the viewer to don or peer through some kind of apparatus. The illusionist's dream is a stereogram that can be seen with the naked eye — an autostereogram. {225}
The principle was discovered a century and a half ago by David Brewster, the Scottish physicist who also studied polarized light and invented the kaleidoscope and the Victorian-era stereoscope. Brewster noticed that the repeating patterns on wallpaper can leap out in depth. Adjacent copies of the pattern, say a flower, can each lure one eye into fixating on it. That can happen because identical flowers are positioned at the same places on the two retinas, so the double image looks like a single image. In fact, like a misbuttoned shirt, a whole parade of double images can falsely mesh into a single image, except for the unpaired members at each end. The brain, seeing no double image, is prematurely satisfied that it has converged the eyes properly, and locks them into the false alignment. This leaves the eyes aimed at an imaginary point behind the wall, and the flowers seem to float in space at that distance. They also seem inflated, because the brain does its trigonometry and calculates how big the flower would have to be at that depth to project its current retinal image.
An easy way to experience the wallpaper effect is to stare at a tile wall a few inches away, too close to focus and converge on comfortably. (Many men rediscover the effect as they stand at a urinal.) The tiles in front of each eye easily fuse, creating the surreal impression of a very large tile wall a great distance away. The wall bows outward, and as the head moves from side to side the wall rocks in the opposite direction. Both would have to happen in the world if the wall were really at that distance while projecting the current retinal image. The brain creates those illusions in its headlong attempt to keep the geometry of the whole hallucination consistent. {226}
Brewster also noticed that any irregularity in the spacing of a pair of copies makes them protrude or recess from the rest. Imagine that the flowers pierced by the lines of sight in the diagram are printed a bit closer to each other. The lines of sight are brought together and cross each other closer to the eyes. The images on the retina will splay out to the temples, and the brain sees the imaginary flower as being nearer. Similarly, if the flowers had been printed a bit farther apart, the lines of sight will cross farther away, and their retinal projections will crowd toward the nose. The brain hallucinates the ghost object at a slightly greater distance.
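The geometry of the wallpaper effect can be worked out with similar triangles. The sketch below assumes that each eye fixates one of two adjacent copies, that the pair is centered in front of the viewer, and that the copies are closer together than the eyes are; the function names and the 6.35 cm eye separation are assumptions made for illustration.

```python
def fused_depth(wall_distance, spacing, eye_separation=6.35):
    """Distance at which the lines of sight cross behind the wall.

    Assumes each eye fixates one of two adjacent copies of the pattern,
    the copies are centered in front of the viewer, and their spacing is
    smaller than the eye separation (so the lines of sight cross behind
    the wall rather than diverging). All lengths share one unit.
    """
    if not 0 < spacing < eye_separation:
        raise ValueError("spacing must be between 0 and the eye separation")
    return wall_distance * eye_separation / (eye_separation - spacing)


def apparent_magnification(wall_distance, spacing, eye_separation=6.35):
    """How much larger the ghost pattern must be to project the same image."""
    return fused_depth(wall_distance, spacing, eye_separation) / wall_distance


# Wallpaper 30 cm away, with copies printed at three slightly different spacings.
tighter = fused_depth(30.0, 2.5)
regular = fused_depth(30.0, 3.0)
looser = fused_depth(30.0, 3.5)
```

Under these assumptions the ghost flower always lies behind the wall and is magnified by the same ratio as its distance. Copies printed closer together fuse at a nearer depth and copies printed farther apart fuse at a greater one.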
We have now arrived at a simple kind of “magic eye” illusion, the wallpaper autostereogram. Some of the stereograms in the books and greeting cards show rows of repeating figures — trees, clouds, mountains, people. When you view the stereogram, each tier of objects drifts in or out and lands at its own depth (although in these autostereograms, unlike the squiggly ones, no new shape emerges; we'll come to those soon). Here is an example, designed by Ilavenil Subbiah.
It is like Brewster's wallpaper, but with the unequal separations put in deliberately rather than by a paperhanger's sloppiness. The picture accommodates seven sailboats because they are closely packed, but only five arches because they are spaced farther apart. When you look behind the picture, the sailboats seem closer than the arches because their misbuttoned lines of sight meet in a nearer plane.
If you don't already know how to fuse stereograms, try holding the book right up to your eyes. It is too close to focus; just let your eyes point straight ahead, seeing double. Slowly move the book away while keeping your eyes relaxed and “looking through” the book to an imaginary point {227} beyond it. (Some people place a pane of glass or a transparency on top of the stereogram, so they can focus on the reflections of distant objects.) You should still be seeing double. The trick is to let one of the double images drift on top of the other, and then to keep them there as if they were magnets. Try to keep the images aligned. The superimposed shapes should gradually come into focus and pop in or out to different depths. As Tyler has noted, stereo vision is like love: if you're not sure, you're not experiencing it.
Some people have better luck holding a finger a few centimeters in front of the stereogram, focusing on the finger, and then removing it while keeping the eyes converged to that depth. With this technique, the false fusion comes from the eyes crossing so that the left eye sights a boat on the right while the right eye sights a boat on the left. Don't worry about what your mother said; your eyes will not freeze into that position forever. Whether you can fuse stereograms with your eyes crossed too much or not enough probably depends on whether you are slightly crosseyed or wall-eyed to begin with.
With practice, most people can fuse wallpaper autostereograms. They do not need the yogi-like concentration of the psychologists who free-fuse the two-picture stereograms, because they do not have to uncouple their focusing reflex from their convergence reflex to the same degree. Free-fusing a two-picture stereogram requires jamming your eyes far enough apart that each eye remains aimed at one of the pictures. Fusing a wallpaper stereogram requires merely keeping the eyes far enough apart that each eye remains aimed at neighboring clones inside a single picture. The clones are close enough together that the convergence angle is not too far out of line from what the focusing reflex wants it to be. It shouldn't be too hard for you to exploit this small wiggle in the mesh between the two reflexes and focus a wee bit closer than your eyes converge. If it is, Ellen DeGeneres may be able to get you into her support group.
The trick behind the wallpaper stereogram — identical drawings luring the eyes into mismatching their views — uncovers a fundamental problem the brain has to solve to see in stereo. Before it can measure the positions of a spot on the two retinas, the brain has to be sure that the {228} spot on one retina came from the same mark in the world as the spot on the other retina. If the world had only one mark in it, it would be easy. But add a second mark, and their retinal images can be matched in two ways: spot 1 in the left eye with spot 1 in the right eye, and spot 2 in the left eye with spot 2 in the right eye — the correct matchup — or spot 1 in the left eye with spot 2 in the right eye, and spot 2 in the left eye with spot 1 in the right eye — a mismatch that would lead to the hallucination of two ghost marks instead.
Add more marks, and the matching problems multiply. With three marks, there are six ghost matches; with ten marks, ninety; with a hundred marks, almost ten thousand. This “correspondence problem” was noticed in the sixteenth century by the astronomer Johannes Kepler, who thought about how stargazing eyes match up their thousands of white dots and how an object's position in space could be determined from its multiple projections. The wallpaper stereogram works by coaxing the brain to accept a plausible but false solution to the correspondence problem.
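The arithmetic behind those figures is easy to check. The sketch below assumes that each of the n marks in one eye could in principle be paired with any of the n marks in the other, giving n × n pairings of which only n are correct; the function name `ghost_matches` is invented here for illustration.

```python
def ghost_matches(n: int) -> int:
    """Count the false pairings among n marks seen by two eyes.

    Assumes every mark in the left eye can be paired with every mark in
    the right eye (n * n pairings) and that exactly n of them are correct.
    """
    return n * n - n


for n in (2, 3, 10, 100):
    print(n, "marks:", ghost_matches(n), "ghost matches")
```

For a hundred marks the count is 9,900, which is the "almost ten thousand" of the text.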
Until recently, everyone thought that the brain solved the correspondence problem in everyday scenes by first recognizing the objects in each eye and then matching up images of the same object. Lemon in left eye goes with lemon in right eye, cherries in left eye go with cherries in right eye. Stereo vision, guided by the intelligence of the whole person, could head off the mismatches by only joining up points that came from the same kind of object. A typical scene may contain millions of dots, but it will contain far fewer lemons, maybe only one. So if the brain matched whole objects, there would be fewer ways for it to go wrong.
But nature did not opt for that solution. The first hint came from another of Ames’ wacky rooms. This time the indefatigable Ames built an ordinary rectangular room but glued leaves on every inch of its floor, {229} walls, and ceiling. When the room was viewed with one eye through a peephole, it looked like an amorphous sea of green. But when it was viewed with both eyes, it sprang into its correct three-dimensional shape. Ames had built a world that could be seen only by the mythical cyclopean eye, not by the left eye or the right eye alone. But how could the brain have matched up the two eyes' views if it had to depend on recognizing and linking the objects in each one? The left eye's view was “leaf leaf leaf leaf leaf leaf leaf leaf.” The right eye's view was “leaf leaf leaf leaf leaf leaf leaf leaf.” The brain was faced with the hardest correspondence problem imaginable. Nonetheless it effortlessly coupled the views and conjured up a cyclopean vision.
The demonstration is not airtight. What if the edges and corners of the room were not perfectly masked by the leaves? Perhaps each eye had a rough inkling of the room's shape, and when the brain fused the two images it became more confident that the inklings were accurate. The airtight proof that the brain can solve the correspondence problem without recognizing objects came from an ingenious early use of computer graphics by the psychologist Bela Julesz. Before he fled Hungary for the United States in 1956, Julesz was a radar engineer with an interest in aerial reconnaissance. Spying from the air uses a clever trick: stereo views penetrate camouflage. A camouflaged object is covered with markings resembling the background it lies on, making the boundary between the object and its background invisible. But as long as the object is not pancake-flat, when it is viewed from two vantage points its markings will appear in slightly different positions in the two views, whereas the background markings will not have moved quite as much because they are farther away. The trick in aerial reconnaissance is to photograph the land, let the plane fly a bit, and photograph it again. The pictures are placed side by side and then fed into a hypersensitive detector of disparity in two images: a human being. A person literally looks at the photographs with a stereo viewer, as if he were a giant peering down from the sky with one eye at each position from which the airplane took a picture, and the camouflaged objects pop out in depth. Since a camouflaged object, by definition, is near-invisible in a single view, we have another example of the cyclopean eye seeing what neither real eye can see.
The proof had to come from perfect camouflage, and here Julesz went to the computer. For the left eye's view, he had the computer make a square covered with random dots, like television snow. Julesz then had {230} the computer make a copy for the right eye, but with one twist: he shifted a patch of dots a bit over to the left, and inserted a new stripe of random dots into the gap at the right so the shifted patch would be perfectly camouflaged. Each picture on its own looked like pepper. But when put in the stereoscope, the patch levitated into the air.
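The recipe is simple enough to sketch in a few lines. What follows is an illustration under assumptions of my own, not a reconstruction of Julesz's program: the function name `random_dot_pair`, the image size, the patch coordinates, and the two-dot shift are all made up for the example.

```python
import random


def random_dot_pair(size=40, patch=(10, 30, 10, 30), shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0/1 dots.

    patch is (top, bottom, first_col, last_col_exclusive). All parameter
    values are illustrative assumptions; first_col must be >= shift.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]          # start as an exact copy
    top, bottom, first, last = patch
    for r in range(top, bottom):
        # Slide the patch `shift` dots to the left in the right eye's view.
        right[r][first - shift:last - shift] = left[r][first:last]
        # Fill the gap opened at the patch's right edge with fresh dots.
        right[r][last - shift:last] = [rng.randint(0, 1) for _ in range(shift)]
    return left, right


left, right = random_dot_pair()
for row in left[:3]:
    print("".join("#" if dot else "." for dot in row))
```

Each image on its own is still uniform noise; the patch exists only in the relation between the two.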
Many authorities on stereo vision at the time refused to believe it because the correspondence problem the brain had to solve was just too hard. They suspected that Julesz had somehow left little cut marks behind in one of the pictures. But of course the computer did no such thing. Anyone who sees a random-dot stereogram is immediately convinced.
All it took for Julesz’ occasional collaborator, Christopher Tyler, to invent the magic-eye autostereogram was to combine the wallpaper autostereogram with the random-dot stereogram. The computer generates a vertical stripe of dots and lays copies of it side by side, creating random-dot wallpaper. Say each stripe is ten dots wide, and we number the dots from 1 to 10 (using “0” for 10):
123456789012345678901234567890123456789012345678901234567890 123456789012345678901234567890123456789012345678901234567890 123456789012345678901234567890123456789012345678901234567890 |
— and so on. Any clump of dots — say, “5678” — repeats itself every ten spaces. When the eyes fixate on neighboring stripes, they falsely fuse, just as they do with a wallpaper stereogram, except that the brain is superimposing stretches of random dots rather than flowers. Remember that in a wallpaper stereogram, copies of a pattern that have been squashed closer together will float above the rest because their lines of sight cross closer to the viewer. To make a patch float out of a magic-eye {231} autostereogram, the designer identifies the patch and makes each clump of dots inside it closer to the nearest copy of itself. In the picture below, I want to make a floating rectangle. So I snip out two copies of dot 4 in the stretch between the arrows; you can spot the snipped rows because they are now two spaces shorter. Inside the rectangle, every clump of dots, like “5678,” repeats itself every nine spaces instead of every ten. The brain interprets copies that are closer together as coming from nearer objects, so the rectangle levitates. The diagram, by the way, not only shows how autostereograms are made, but it works as a passable autostereogram itself. If you fuse it like wallpaper, a rectangle should arise. (The asterisks at the top are there to help you fuse it; let your eyes drift until you have a double image with four asterisks and slowly try to bring the images together until the middle two asterisks fuse and you are seeing three asterisks in a row rather than four. Carefully look down at the diagram without re-aiming your eyes, and you may see the floating rectangle.)
* * ¯ ¯ 1234567890123456789012345678901234567890123456789012345678901234567890 1234567890123456789012345678901234567890123456789012345678901234567890 1234567890123456789012345678901234567890123456789012345678901234567890 12345678901234567890123456789012356789012356789012345678901234567890 12345678901234567890123456789012356789012356789012345678901234567890 12345678901234567890123456789012356789012356789012345678901234567890 12345678901234567890123456789012356789012356789012345678901234567890 12345678901234567890123456789012356789012356789012345678901234567890 12345678901234567890123456789012356789012356789012345678901234567890 1234567890123456789012345678901234567890123456789012345678901234567890 1234567890123456789012345678901234567890123456789012345678901234567890 1234567890123456789012345678901234567890123456789012345678901234567890 123456789012345678901234567890123X4567890123X456789012345678901234567890 123456789012345678901234567890123X4567890123X456789012345678901234567890 123456789012345678901234567890123X4567890123X456789012345678901234567890 123456789012345678901234567890123X4567890123X456789012345678901234567890 123456789012345678901234567890123X4567890123X456789012345678901234567890 123456789012345678901234567890123X4567890123X456789012345678901234567890 1234567890123456789012345678901234567890123456789012345678901234567890 1234567890123456789012345678901234567890123456789012345678901234567890 1234567890123456789012345678901234567890123456789012345678901234567890 |
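The snipping can be reproduced mechanically. The following is a small sketch, with the helper names `wallpaper_row` and `snip` and the starting column invented for the purpose; it rebuilds one of the shortened rows and measures how far apart the copies of “5678” now sit.

```python
def wallpaper_row(repeats=7):
    """Digits 1-9 then 0 (standing for dot 10), repeated `repeats` times."""
    return "1234567890" * repeats


def snip(row, dot="4", start=30, copies=2):
    """Delete `copies` occurrences of `dot`, searching from column `start`."""
    for _ in range(copies):
        i = row.index(dot, start)
        row = row[:i] + row[i + 1:]
    return row


plain = wallpaper_row()
snipped = snip(plain)
print(plain)
print(snipped)

# Distance between successive copies of the clump "5678" in the snipped row.
starts = [i for i in range(len(snipped)) if snipped.startswith("5678", i)]
print([b - a for a, b in zip(starts, starts[1:])])  # [10, 10, 9, 9, 10, 10]
```

The two gaps of nine mark the stretch where the copies have been squeezed together.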
You should also see a cutout window lower in the picture. I made it by picking out a rectangular patch and doing the opposite of what I did before: I stuffed an extra dot (labeled “X”) next to every copy of dot 4 inside the patch. That pushes the clumps of dots farther apart, so they {232} repeat themselves every eleven spaces. (The stuffed rows, you will notice, are longer than the rest.) Copies that are more widely spaced equals a surface that is more distant. A real random-dot autostereogram, of course, is made of dots, not numbers, so you don't notice the snipped-out or stuffed-in material, and the uneven lines are filled out with extra dots. Here is an example. The fun in viewing a real random-dot autostereogram is that the moment of pop-out surprises the viewer with previously invisible shapes:
When the autostereogram craze hit Japan, it soon developed into an art form. Dots are not necessary; any tapestry of small contours rich enough to fool the brain into locking the eyes on neighboring stripes will do. The first commercial autostereograms used colored squiggles, and the Japanese ones use flowers, ocean waves, and, taking a leaf out of Ames’ book, leaves. Thanks to the computer, the shapes don't have to be flat cutouts like in a diorama. By reading in the three-dimensional coordinates of the points on a surface, the computer can shift every dot by a slightly different amount to sculpt the solid shape in cyclopean space, rather than shifting the entire patch rigidly. Smooth, bulbous {233} shapes materialize, looking as if they are shrink-wrapped in leaves or flowers.
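Shifting each dot by its own amount can be sketched for a single row as follows. This is a simplified, commonly described approach and not necessarily the one used by Tyler or by commercial designers. The function name `autostereogram_row`, the ten-dot period, and the convention that depth is counted in dots nearer than the background are all assumptions made for the example.

```python
import random


def autostereogram_row(depths, period=10, rng=None):
    """Build one row of dots whose repeat distance encodes depth.

    depths[x] says how many dots nearer than the background the surface
    is at column x (0 = background, negative = farther away). Each dot
    copies the dot `period - depths[x]` columns to its left, so nearer
    regions repeat at a shorter interval. Assumes depths[x] < period.
    """
    rng = rng or random.Random(0)
    row = []
    for x, depth in enumerate(depths):
        separation = period - depth
        if x < separation:
            row.append(rng.randint(0, 1))    # nothing to copy yet: fresh dot
        else:
            row.append(row[x - separation])  # copy the dot one repeat back
    return row


# A raised strip (depth 1) in the middle of a flat background (depth 0).
depths = [0] * 20 + [1] * 10 + [0] * 10
print("".join("#" if dot else "." for dot in autostereogram_row(depths)))
```

Feeding in a different list of depths for every row, read off a three-dimensional surface, sculpts a solid shape rather than a flat cutout.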
Why did natural selection equip us with true Cyclopean vision — an ability to see shapes in stereo that neither eye can see in mono — rather than with a simpler stereo system that would match up the lemons and cherries that are seeable by each eye? Tyler points out that our ancestors really did live in Ames’ leaf room. Primates evolved in trees and had to negotiate a network of branches masked by a veil of foliage. The price of failure was a long drop to the forest floor below. Building a stereo computer into these two-eyed creatures must have been irresistible to natural selection, but it could have worked only if the disparities were calculated over thousands of bits of visual texture. Single objects that allow unambiguous matches were just too few and far between.
Julesz points out another advantage of cyclopean vision. Camouflage was discovered by animals long before it was discovered by armies. The earliest primates were similar to today's prosimians, the lemurs and tarsiers of Madagascar, who snatch insects off trees. Many insects hide from predators by freezing, which defeats the hunter's motion detectors, and by camouflage, which defeats its contour detectors. Cyclopean vision is an effective countermeasure, revealing the prey just as aerial reconnaissance reveals tanks and planes. Advances in weaponry spawn arms races in nature no less than in war. Some insects have outwitted their predators’ stereo vision by flattening their bodies and lying flush against the background, or by turning into living sculptures of leaves and twigs, a kind of three-dimensional camouflage.
How does the cyclopean eye work? The correspondence problem — matching up the marks in one eye with their counterparts in the other — is a fearsome chicken-and-egg riddle. You can't measure the stereo disparity of a pair of marks until you have picked a pair of marks to measure. But in a leaf room or a random-dot stereogram, there are thousands of candidates for the matchmaker. If you knew how far away the surface was, you would know where to look on the left retina to find the mate of a mark on the right. But if you knew that, there would be no need to do the stereo computation; you would already have the answer. How does the mind do it? {234}
David Marr noted that built-in assumptions about the world we evolved in can come to the rescue. Among the n² possible matches of n points, not all are likely to have come from this goodly frame, the earth. A well-engineered matcher should consider only the matchups that are physically likely.
First, every mark in the world is anchored to one position on one surface at one time. So a legitimate match must pair up identical points in the two eyes that came from a single splotch in the world. A black dot in one eye should match a black dot in the other, not a white dot, because the matchup has to represent a single position on some surface, and that position cannot be a black splotch and a white splotch at the same time. Conversely, if a black dot does match a black dot, they must come from a single position on some surface in the world. (That is the assumption violated by autostereograms: each of their splotches appears in several positions.)
Second, a dot in one eye should be matched with no more than one dot in the other. That means that a line of sight from one eye is assumed to end at a splotch on one and only one surface in the world. At first glance it looks as if the assumption rules out a line of sight passing through a transparent surface to an opaque one, like the bottom of a shallow lake. But the assumption is more subtle; it only rules out the coincidence in which two identical splotches, one on the lake's surface and one on the bottom, line up one behind the other from the left eye's vantage point while both being visible from the right eye's.
Third, matter is cohesive and smooth. Most of the time a line of sight will end up on a surface in the world that is not drastically closer or farther than the surface hit by the neighboring line of sight. That is, neighboring patches of the world tend to lie on the same smooth surface. Of course, at the boundary of an object the assumption is violated: the edge of the back cover of this book is a couple of feet away from you, but if you glance just to its right you might be looking at the moon a quarter of a million miles away. But boundaries make up a small portion of the visual field (you need much less ink to sketch a line drawing than to color it in), and these exceptions can be tolerated. What the assumption rules out is a world made up of dust storms, swarms of gnats, fine wires, deep crevasses between craggy peaks, beds of nails viewed point-on, and so on.
The assumptions sound reasonable in the abstract, but something still has to find the matches that satisfy them. Chicken-and-egg problems {235} can sometimes be solved with the technique called constraint satisfaction that we met in Chapter 2 when looking at Necker cubes and accented speech. When the parts of a puzzle cannot be solved one at a time, the puzzle-solver can keep in mind several guesses for each one, compare the guesses for the different parts of the puzzle, and see which ones are mutually consistent. A good analogy is working on a crossword puzzle with a pencil and an eraser. Often a clue for a horizontal word is so vague that several words can be penciled in, and a clue for a vertical word is so vague that several words can be penciled in. But if only one of the vertical guesses shares a letter with any of the horizontal guesses, that pair of words is kept and the others are erased. Imagine doing that for all the clues and squares at once and you have the idea of constraint satisfaction. In the case of solving the correspondence problem in stereo vision, the dots are the clues, the matchups and their depths are the guesses, and the three assumptions about the world are like the rules that say that every letter of every word must sit in a box, every box must have a letter in it, and all the sequences of letters must spell out words.
Constraint satisfaction can sometimes be implemented in a constraint network like the one I presented on page 107. Marr and the theoretical neuroscientist Tomaso Poggio designed one for stereo vision. The input units stand for points, such as the black and white squares of a random-dot stereogram. They feed into an array of units that represent all of the n × n possible matchups of a point in the left eye with some other point in the right eye. When one of these units turns on, the network is guessing that there is a splotch at a particular depth in the world (relative to where the eyes have converged). Here is a bird's-eye view of one plane of the network, showing a fraction of the units. {236}
The model works as follows. A unit turns on only if it gets the same inputs from the two eyes (black or white), embodying the first assumption (each mark anchored to a surface). Because the units are interconnected, the activation of one unit nudges the activations of its neighbors up or down. Units for different matches lying along the same line of sight inhibit one another, embodying the second assumption (no coincidental markings aligned along a line of sight). Units for neighboring points at nearby depths excite one another, embodying the third assumption (matter is cohesive). The activations reverberate around the network, and it eventually stabilizes, with the activated units tracing out a contour in depth. In the diagram, the filled-in units are showing an edge hovering over its background.
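To make the three rules concrete, here is a stripped-down sketch of such a network for a single row of dots. It is an illustration in the spirit of the model, not Marr and Poggio's published network: the function name `cooperative_stereo`, the one-dimensional layout, and every parameter value (neighborhood radius, inhibition weight, threshold, number of steps) are assumptions chosen for the example. Excitation here runs only between neighbors at the same depth, a narrower rule than "nearby depths."

```python
def cooperative_stereo(left, right, disparities=(-1, 0, 1),
                       radius=3, inhibition=1.0, threshold=2.0, steps=10):
    """Simplified one-row cooperative stereo network.

    A unit (x, d) stands for the guess "left[x] and right[x - d] are
    images of the same mark". Returns a dict mapping (x, d) to 0 or 1.
    """
    n = len(left)
    # Assumption 1: a unit may be on only if both eyes deliver the same input.
    compatible = {(x, d): int(0 <= x - d < n and left[x] == right[x - d])
                  for x in range(n) for d in disparities}
    state = dict(compatible)
    for _ in range(steps):
        new = {}
        for (x, d), ok in compatible.items():
            # Assumption 3: support from neighbors at the same depth.
            excite = sum(state.get((x + k, d), 0)
                         for k in range(-radius, radius + 1) if k != 0)
            # Assumption 2: rivals on the same line of sight from either eye.
            inhibit = sum(state[(x, e)] + state.get((x - d + e, e), 0)
                          for e in disparities if e != d)
            new[(x, d)] = int(ok and excite - inhibition * inhibit >= threshold)
        if new == state:      # the network has settled
            break
        state = new
    return state


left = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1]
right = left[1:] + [0]        # every mark sits one dot farther left
state = cooperative_stereo(left, right)
winners = {x: [d for d in (-1, 0, 1) if state[(x, d)]] for x in range(len(left))}
print(winners)
```

With only black and white dots and a single row, a toy like this can leave some false matches standing or switch off some true ones; the real model works over two-dimensional neighborhoods, where the true surface gathers far more support than the accidental matches.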
The constraint-satisfaction technique, in which thousands of processors make tentative guesses and hash it out among themselves until a global solution emerges, is consistent with the general idea that the brain works with lots of interconnected processors computing in parallel. It captures some of the psychology, too. When viewing a complicated random-dot stereogram, often you don't see the hidden figure erupt instantaneously. A bit of edge might pop out from the pepper, which then lifts up a sheet, which cleans and straightens a fuzzy border on the other side, and so on until the whole shape coalesces. We experience the solution emerging, but not the struggle of the processors to come up with it. The experience is a good reminder that as we see and think, dozens of iterations of information processing go on beneath the level of consciousness.
The Marr-Poggio model captures the flavor of the brain's computation of stereo vision, but our real circuitry is surely more sophisticated. Experiments have shown that when people are put in artificial worlds that violate assumptions about uniqueness and smoothness, they don't see as badly as the model predicts. The brain must be using additional kinds of information to help solve the matchup problem. For one thing, the world is not made up of random dots. The brain can match up all the little diagonals, T's, zigzags, inkblots, and other jots and tittles in the two eyes’ views (which even a random-dot stereogram has in abundance). There are far fewer false matches among jots and tittles than there are among dots, so the number of matches that have to be ruled out is radically shaved.
Another matchmaking trick is to exploit a different geometric consequence of having two eyes, the one noticed by Leonardo: there are parts of an object that one eye can see but that the other eye cannot. Hold a {237} pen vertically in front of you, with the clip facing away at eleven o'clock. When you close each eye in turn, you will notice that only the left eye can see the clip; it is hidden from the right eye by the rest of the pen. Was natural selection as astute as Leonardo when it designed the brain, letting it use this valuable clue to an object's boundary? Or does the brain ignore the clue, grudgingly chalking up each mismatch as an exception to the cohesive-matter assumption? The psychologists Ken Nakayama and Shinsuke Shimojo have shown that natural selection did not ignore the clue. They created a random-dot stereogram whose depth information lay not in shifted dots but in dots that were visible in one eye's view and absent in the other's. Those dots lay at the corners of an imaginary square, with dots at the top and bottom right corners only in the right eye's picture, and dots in the top and bottom left corners only in the left eye's picture. When people view the stereogram, they see a floating square defined by the four points, showing that the brain indeed interprets features visible to only one eye as coming from an edge in space. Nakayama and the psychologist Barton Anderson suggest that there are neurons that detect these occlusions; they would respond to a pair of marks in one eye, one of which can be matched with a mark in the other eye and the other of which cannot be matched. These 3-D boundary detectors would help a stereo network home in on the outlines of the floating patches.
Stereo vision does not come free with the two eyes; the circuitry has to be wired into the brain. We know this because about two percent of the population can see perfectly well out of each eyeball but not with the cyclopean eye; random-dot stereograms remain flat. Another four percent can see stereo only poorly. An even larger minority has more selective deficits. Some can't see stereo depth behind the point of fixation; others can't see it in front. Whitman Richards, who discovered these forms of stereoblindness, hypothesized that the brain has three pools of neurons that detect differences in the position of a spot in the two eyes. One pool is for pairs of spots that coincide exactly or almost exactly, for fine-grained depth perception at the point of focus. Another is for pairs of spots flanking the nose, for farther objects. A third is for pairs of spots approaching the temples, for nearer objects. Neurons with all these {238} properties have since been found in the brains of monkeys and cats. The different kinds of stereoblindness appear to be genetically determined, suggesting that each pool of neurons is installed by a different combination of genes.
Stereo vision is not present at birth, and it can be permanently damaged in children or young animals if one of the eyes is temporarily deprived of input by a cataract or a patch. So far, this sounds like the tiresome lesson that stereo vision, like everything else, is a mixture of nature and nurture. But a better way of thinking about it is that the brain has to be assembled, and the assembly requires project scheduling over an extended timetable. The timetable does not care about when the organism is extruded from the womb; the installation sequence can carry on after birth. The process also requires, at critical junctures, the intake of information that the genes cannot predict.
Stereo vision appears abruptly in infants. When newborns are brought into a lab at regular intervals, for week after week they are unimpressed by stereograms, and then suddenly they are captivated. Close to that epochal week, usually around three or four months of age, the babies converge their eyes properly for the first time (for example, they smoothly track a toy brought up to their nose), and they find rivalrous displays — a different pattern in each eye — annoying, whereas before they had found them interesting.
It is not that babies “learn to see in stereo,” whatever that would mean. The psychologist Richard Held has a simpler explanation. When infants are born, every neuron in the receiving layer of the visual cortex adds up the inputs from corresponding locations in the two eyes rather than keeping them separate. The brain can't tell which eye a given bit of pattern came from, and simply melts one eye's view on top of the other's in a 2-D overlay. Without information about which eye a squiggle came from, stereo vision, convergence, and rivalry are logically impossible. Around the three-month mark each neuron settles on a favorite eye to respond to. The neurons lying one connection downstream can now know when a mark falls on one spot in one eye and on the same spot, or a slightly shifted-over spot, in the other eye — the grist for stereo vision.
In cats and monkeys, whose brains have been studied directly, this is indeed what happens. As soon as the animal's cortex can tell the eyes apart, the animal sees stereograms in depth. That suggests that when the inputs are first tagged “left eye” or “right eye,” the circuitry for stereo computation one layer downstream is already installed and functioning. {239} In monkeys it's all over in two months; by then each neuron has a favorite eye and the baby monkeys see in depth. Compared with other primates, humans are “altricial”: babies are born early and helpless, and complete their development outside the womb. Because human infants are born earlier than monkeys in proportion to the length of their childhood, the installation of their binocular circuitry appears at a later age as measured from the date of birth. More generally, when biologists compare the milestones of the maturation of the visual systems of different animals, some born early and helpless, others born late and seeing, they find that the sequence is pretty much the same whether the later steps take place in the womb or in the world.
The emergence of the crucial left-eye and right-eye neurons can be disrupted by experience. When the neurobiologists David Hubel and Torsten Wiesel raised kittens and baby monkeys with one eye covered, the input neurons of the cortex all tuned themselves to the other eye, making the animal functionally blind in the eye that was covered. The damage was permanent, even with only brief deprivation, if the eye was covered in a critical period in the animal's development. In monkeys, the visual system is especially vulnerable during the first two weeks of life, and the vulnerability tapers off during the first year. Covering the eye of an adult monkey, even for four years, does no harm.
At first this all looked like a case of “use it or lose it,” but a surprise was in store. When Hubel and Wiesel covered both eyes, the brain did not show twice the damage; half the cells showed no damage at all. The havoc in the single-eyepatch experiment came about not because a neuron destined for the covered eye was starved of input but because the input signals from the uncovered eye elbowed the covered eye's inputs out of the way. The eyes compete for real estate in the input layer of the cortex. Each neuron begins with a slight bias for one eye or the other, and the input from that eye exaggerates the bias until the neuron responds to it alone. The inputs do not even have to originate in the world; waves of activation from intermediate way-stations, a kind of internally generated test pattern, can do the trick. The developmental saga, though it is sensitive to changes in the animal's experience, is not exactly “learning,” in the sense of registering information from the world. Like an architect who hands a rough sketch to a low-level draftsman to straighten out the lines, the genes build eye-specific neurons crudely and then kick off a process that is guaranteed to sharpen them unless a neurobiologist meddles. {240}
Once the brain has segregated the left eye's image from the right eye's, subsequent layers of neurons can compare them for the minute disparities that signal depth. These circuits, too, can be modified by the animal's experience, though again in surprising ways. If an experimenter makes an animal cross-eyed or wall-eyed by cutting one of the eye muscles, the eyes point in different directions and never see the same thing on the two retinas at the same time. Of course, the eyes don't point 180 degrees apart, so in theory the brain could learn to match the out-of-whack segments that do overlap. But apparently it is not equipped for matches that stretch more than a few degrees across the two eyes; the animal grows up stereoblind, and often functionally blind in one of the two eyes as well, a condition called amblyopia. (Amblyopia is sometimes called "lazy eye," but that is misleading. It is the brain, not the eye, that is insensitive, and the insensitivity is caused by the brain actively suppressing one eye's input in a kind of permanent rivalry, not by the brain lazily ignoring it.)
The same thing can happen in children. If one of the eyes is more far-sighted than the other, the child habitually strains to focus on nearby objects, and the reflex that couples focusing and convergence draws that eye inward. The two eyes point in different directions (a condition called strabismus), and their views don't align closely enough for the brain to use the disparity information in them. The child will grow up amblyopic and stereoblind unless early surgery on the eye muscles lines the eyeballs up. Until Hubel and Wiesel discovered these effects in monkeys and Held found similar ones in children, surgery for strabismus was considered cosmetic and done only on school-aged children. But there is a critical period for the proper alignment of two-eye neurons, a bit longer than the one for one-eye neurons but probably fading out near the age of one or two. Surgery after that point is often too late.
Why is there a critical period, as opposed to rigid hard-wiring or life-long openness to experience? In kittens, monkeys, and human babies, the face keeps growing after birth, and the eyes get pushed farther apart. Their relative vantage points change, and the neurons must keep up by retuning the range of intereye disparities they detect. Genes cannot anticipate the degree of spreading of the vantage points, because it depends on other genes, nutrition, and various accidents. So the neurons track the drifting eyes during the window of growth. When the eyes arrive at their grownup separation in the skull, the need disappears, and the critical period ends. Some animals, like rabbits, have {241} precocious babies whose eyes are set in adult positions within faces that grow very little. (These tend to be prey animals, which don't have the luxury of a long, helpless childhood.) The neurons that receive inputs from the two eyes don't need to retune themselves, and in fact these animals are wired at birth and do without a critical period of sensitivity to the input.
The discoveries about the tunability of binocular vision in different species offer a new way of thinking about learning in general. Learning is often described as an indispensable shaper of amorphous brain tissue. Instead it might be an innate adaptation to the project-scheduling demands of a self-assembling animal. The genome builds as much of the animal as it can, and for the parts of the animal that cannot be specified in advance (such as the proper wiring for two eyes that are moving apart at an unpredictable rate), the genome turns on an information-gathering mechanism at the time in development at which it is most needed. In The Language Instinct I develop a similar explanation for the critical period for learning language in childhood.
I have led you through magic-eye stereograms not just because it is fun to understand how the magic works. I think stereo vision is one of the glories of nature and a paradigm of how other parts of the mind might work. Stereo vision is information processing that we experience as a particular flavor of consciousness, a connection between mental computation and awareness that is so lawful that computer programmers can manipulate it to enchant millions. It is a module in several senses: it works without the rest of the mind (not needing recognizable objects), the rest of the mind works without it (getting by, if it has to, with other depth analyzers), it imposes particular demands on the wiring of the brain, and it depends on principles specific to its problem (the geometry of binocular parallax). Though stereo vision develops in childhood and is sensitive to experience, it is not insightfully described as “learned” or as “a mixture of nature and nurture”; the development is part of an assembly schedule and the sensitivity to experience is a circumscribed intake of information by a structured system. Stereo vision shows off the engineering acumen of natural selection, exploiting subtle theorems in optics rediscovered millions of years later by the likes of Leonardo da Vinci, {242} Kepler, Wheatstone, and aerial reconnaissance engineers. It evolved in response to identifiable selection pressures in the ecology of our ancestors. And it solves unsolvable problems by making tacit assumptions about the world that were true when we evolved but are not always true now.
Stereo vision is part of a crucial early stage of vision that figures out the depths and materials of surfaces, but it is not the only part. Seeing in three dimensions doesn't require two eyes. You can get a sense of shape and substance from the meagerest hints in a picture. Consider these drawings, designed by the psychologist Edward Adelson.
The left one appears to be white cardboard with a gray vertical stripe, folded horizontally and lit from above. The right one appears to be white cardboard with a gray horizontal stripe, folded vertically and lit from the side. (If you stare long enough, either might flip in depth, like a Necker cube; let's ignore that for now.) But the ink on the page (and the projection on your retina) is virtually the same in the two pictures. Each is a zigzag tic-tac-toe box with some of the squares shaded in. In both drawings, the corner squares are white, the top and side squares are light gray, and the middle square is a darker gray. Somehow the combination of shading and zigzagging pops them into the third dimension and colorizes each square, but in different ways. The borders labeled “1” are physically the same in the two drawings. But in the left drawing the border looks like a paint boundary — a white stripe next to a gray one — and in the right drawing it looks like a shape-and-shading boundary — a white stripe falling into a shadow on the other side of a fold. The borders labeled “2” are also identical, but you see them in the opposite way: shadow in the {243} left drawing, paint stripe in the right one. All these differences come from one box zigging where the other one zags!
To see so much world in so little image, you have to undo three laws that make images from the world. Each needs a mental “expert” to do the undoing. Like stereo vision, these experts work to give us an accurate grasp of the world's surfaces, but they run on different kinds of information, solve different kinds of problems, and make different kinds of assumptions about the world.
The first problem is perspective: a 3-D object gets projected into a 2-D shape on the retina. Unfortunately, any projection could have come from an infinite number of objects, so there is no way to recover a shape from its projection alone (as Ames reminded his viewers). “So,” evolution seems to have said, “no one's perfect.” Our shape analyzer plays the odds and makes us see the most probable state of the world, given the retinal image.
How can a visual system calculate the most probable state of the world from the evidence on the retina? Probability theory offers a simple answer: Bayes' theorem, the most straightforward way of assigning a probability to a hypothesis based on some evidence. Bayes' theorem says that the odds favoring one hypothesis over another can be calculated from just two numbers for each hypothesis. One is the prior probability: how confident are you in the hypothesis before you even look at the evidence? The other is the likelihood: if the hypothesis were true, what is the probability that the evidence as you are seeing it now would have appeared? Multiply the prior probability of Hypothesis 1 by the likelihood of the evidence under Hypothesis 1. Multiply the prior probability of Hypothesis 2 by the likelihood of the evidence under Hypothesis 2. Take the ratio of the two numbers. You now have the odds in favor of the first hypothesis.
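The recipe above can be written out in a few lines. This is only a sketch: the function name `posterior_odds` and the numbers fed to it are illustrative assumptions, not figures from any experiment.

```python
def posterior_odds(prior_1, likelihood_1, prior_2, likelihood_2):
    """Odds favoring Hypothesis 1 over Hypothesis 2, given the evidence.

    Each hypothesis contributes two numbers: its prior probability and
    the likelihood of the evidence if that hypothesis were true.
    """
    return (prior_1 * likelihood_1) / (prior_2 * likelihood_2)


# Hypothetical numbers: the two hypotheses are equally plausible beforehand,
# but the evidence is far more likely under the first than under the second.
odds = posterior_odds(prior_1=0.5, likelihood_1=0.9,
                      prior_2=0.5, likelihood_2=0.01)
print(f"Odds in favor of Hypothesis 1: about {odds:.0f} to 1")
```

Note that a strong likelihood can be undone by a weak prior: shrinking `prior_1` by a factor of a thousand shrinks the odds by the same factor.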
How does our 3-D line analyzer use Bayes' theorem? It puts its money on the object that has the greatest likelihood of producing those lines if it were really in the scene, and that has a good chance of being in scenes in general. It assumes, as Einstein once said about God, that the world is subtle but not malicious.
So the shape analyzer must be equipped with some probabilities {244} about projection (how objects appear in perspective) and some probabilities about the world (what kinds of objects it has). Some of the probabilities about projection are very good indeed. A penny, theoretically, can project to a thin line, but it does so only when it is viewed edge-on. If there's a penny in the scene, what is the probability that you are viewing it edge-on? Unless someone has choreographed the two of you, not very high. The vast majority of viewpoints will make the penny project an ellipse instead. The analyzer assumes the current viewpoint is generic — not poised with pinpoint accuracy to line things up, Ames-style — and places its chips accordingly. A matchstick, on the other hand, will project to a straight line almost all the time, so if there is a line in an image, a stick is a better guess than a disk, all else being equal.
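To put a rough number on "not very high," here is a small simulation of the penny. It rests on assumptions that are mine rather than the text's: the viewpoint is drawn uniformly from all directions, the projection is orthographic, and "edge-on" means the projected ellipse is less than one hundredth as wide as it is long. The function name `edge_on_fraction` is hypothetical.

```python
import random


def edge_on_fraction(n_views=100_000, max_aspect=0.01, seed=0):
    """Fraction of random viewpoints from which a disk looks like a thin line."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_views):
        # For a viewing direction drawn uniformly from the sphere, the cosine
        # of the angle between that direction and the disk's normal is
        # uniformly distributed on [-1, 1].
        cos_tilt = rng.uniform(-1.0, 1.0)
        # Under orthographic projection the disk appears as an ellipse whose
        # minor/major axis ratio is |cos(tilt)|; 0 means exactly edge-on.
        aspect = abs(cos_tilt)
        if aspect < max_aspect:
            hits += 1
    return hits / n_views


print(f"Fraction of viewpoints that are nearly edge-on: {edge_on_fraction():.4f}")
```

Under these assumptions only about one viewpoint in a hundred flattens the penny that far, which is why a line in the image is poor evidence for a disk.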
A collection of lines in an image can narrow the odds even further. For example, a set of parallel or near-parallel lines is seldom an accident. Nonparallel lines in the world rarely project near-parallel lines in an image: most pairs of sticks strewn on a floor cross at moderate to sharp angles. But lines that are parallel in the world, such as the edges of a telephone pole, almost always project near-parallel lines. So if there are near-parallel lines in an image, the odds favor parallel edges in the world. There are many other rules of thumb that say what kinds of sculptings of the world can be counted on to give off various markings in an image. Little T's, Y's, angles, arrows, crow's feet, and parallel wiggles are the fingerprints of various straight edges, corners, right angles, and symmetrical shapes. Cartoonists have exploited those rules for millennia, and a wily shape analyzer can run them backwards when betting on what is in the world.
But of course running a likelihood backwards — saying that parallel stuff usually projects near-parallel images, therefore near-parallel images imply parallel stuff — is unsound. It is like hearing hoofbeats outside your window and concluding that they came from a zebra, because zebras often make hoofbeats. The prior probability that the world contains some entity — how many zebras are out there, how many parallel edges are out there — has to be multiplied in. For an odds-playing shape analyzer to work, the world had better contain lots of the straight, regular, symmetrical, compact kinds of objects that it likes to guess. Does it? A romantic might think that the natural world is organic and soft, its hard edges bulldozed in by the Army Corps of Engineers. As a literature professor recently declared to his class, “Straight lines on the landscape are put there by man.” A skeptical student, Gail Jensen Sanford, published {245} a list of straight lines in nature, recently reprinted in Harper's magazine:
line along the top of a breaking wave; distant edge of a prairie; paths of hard rain and hail; snow-covered fields; patterns in crystals; lines of white quartz in a granite surface; icicles, stalactites, stalagmites; surface of a calm lake; markings on zebras and tigers; bill of a duck; legs of a sandpiper; angle of migrating birds; dive of a raptor; new frond of a fern; spikes of a cactus; trunks of young, fast-growing trees; pine needles; silk strands woven by spiders; cracks in the surface of ice; strata of metamorphic rock; sides of a volcano; wisp of windblown altocumulus clouds; inside edge of a half-moon.
Some of these are arguable, and others will do a shape guesser more harm than good. (The horizon of a lake or prairie and the edge of a half-moon do not come from lines in the world.) But the point is right. Many laws of the world give it nice, analyzable shapes. Motion, tension, and gravity make straight lines. Gravity makes right angles. Cohesion makes smooth contours. Organisms that move evolve to be symmetrical. Natural selection shapes their body parts into tools, duplicating the human engineer's demand for well-machined parts. Large surfaces collect patterns with roughly equal sizes, shapes, and spacing: cracks, leaves, pebbles, sand, ripples, needles. Not only are the seemingly carpentered and wallpapered parts of the world the parts most recoverable by a shape analyzer; they are the parts most worth recovering. They are the telltale signs of potent forces that fill and shape the environment at hand, and are more worthy of attention than heaps of random detritus.
Even the best line analyzer is equipped only for a cartoon world. Surfaces are not just bounded by lines; they are composed of material. Our sense of lightness and color is a way of assaying materials. We avoid biting into a plaster apple because the color tips us off that it is not made of fruit flesh.
Analyzing matter from the light it reflects is a job for a reflectance specialist. Different kinds of matter reflect back different wavelengths of light in different amounts. (To keep things simple, I'll stay in black and white; color is, roughly, the same problem multiplied by three.) {246} Unfortunately, a given amount of reflected light could have come from an infinite number of combinations of matter and lighting. One hundred units of light could have come from coal reflecting back 10% of the light of 1,000 candles or from snow reflecting back 90% of the light of 111 candles. So there is no foolproof way to deduce an object's material from its reflected light. The lightness analyzer must somehow factor out the level of illumination. This is another ill-posed problem, exactly equivalent to this one: I give you a number, you tell me which two numbers were multiplied to get it. The problem can be solved only by adding in assumptions.
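The coal-and-snow arithmetic can be checked directly. The sketch below treats reflected light as reflectance times illumination, as the passage does; the function names are hypothetical, and the list of candidate reflectances is an arbitrary illustration.

```python
def reflected_light(reflectance, illumination):
    """Light reaching the eye: the fraction reflected times the light arriving."""
    return reflectance * illumination


def illumination_needed(light, reflectance):
    """Illumination that would make a surface of this reflectance send out `light`."""
    return light / reflectance


coal = reflected_light(0.10, 1000)   # dark material, bright lighting
snow = reflected_light(0.90, 111)    # light material, dim lighting
print(f"coal: {coal:.1f} units, snow: {snow:.1f} units")

# The same 100 units are consistent with any reflectance at all,
# provided the illumination is adjusted to match.
for r in (0.05, 0.10, 0.50, 0.90):
    print(f"reflectance {r:.2f} needs {illumination_needed(100, r):.0f} candles")
```

Every row of the loop is an equally valid explanation of the same reading, which is what makes the problem ill-posed.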
A camera is faced with the same task — how to render the snowball as white whether it is indoors or out. A camera's meter, which controls the amount of light reaching the film, embodies two assumptions. The first is that lighting is uniform: the whole scene is in sun, or in shade, or under a lightbulb. When the assumption is violated, the snap-shooter is disappointed. Aunt Mimi is a muddy silhouette against the blue sky because the camera is fooled by her face being in shade while the sky is lit directly by the sun. The second assumption is that the scene is, on average, medium gray. If you throw together a random collection of objects, their many colors and lightnesses will usually average out to a medium shade of gray that reflects back 18% of the light. The camera “assumes” it is looking at an average scene and lets in just enough light to make the middle of the range of lightnesses in the scene come out as medium gray on the film. Patches that are lighter than the middle are rendered pale gray and white; patches that are darker, deep gray and black. But when the assumption is wrong and the scene does not really average out to gray, the camera is fooled. A picture of a black cat on black velvet comes out medium gray, a picture of a polar bear on the snow comes out medium gray, and so on. A skilled photographer analyzes how a scene differs from the average scene and uses various tricks to compensate. A crude but effective one is to carry around a standard medium gray card (which reflects back exactly 18% of the light), lean it on the subject, and aim the meter at the card. The camera's assumption about the world is now satisfied, and its estimate of the ambient illumination level (made by dividing the light reflecting off the card by 18%) is guaranteed to be correct.
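Here is a toy version of the meter's reasoning, with and without the gray card. Only the 18% figure comes from the text; the function names, the scene (a black cat on black velvet reflecting 3% and 4% of the light) and the illumination of 1,000 units are made-up assumptions.

```python
GRAY_CARD_REFLECTANCE = 0.18  # the medium-gray figure quoted in the text


def illumination_from_card(light_from_card):
    """Estimate the illumination by metering a card known to reflect 18%."""
    return light_from_card / GRAY_CARD_REFLECTANCE


def illumination_from_average(patch_lights):
    """Estimate the illumination by assuming the scene averages to medium gray."""
    mean_light = sum(patch_lights) / len(patch_lights)
    return mean_light / GRAY_CARD_REFLECTANCE


# Hypothetical scene: black cat (3%) on black velvet (4%) under 1,000 units.
true_illumination = 1000.0
patch_lights = [0.03 * true_illumination, 0.04 * true_illumination]

naive = illumination_from_average(patch_lights)
carded = illumination_from_card(GRAY_CARD_REFLECTANCE * true_illumination)

for label, estimate in (("average-scene guess", naive), ("gray-card reading", carded)):
    shades = [light / estimate for light in patch_lights]
    print(label, [f"{s:.2f}" for s in shades])
```

The average-scene guess underestimates the lighting and so reads the cat and the velvet as mid-grays; the card reading recovers the true dark values.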
Edwin Land, inventor of the polarizing filter and the instant Polaroid Land camera, was challenged by this problem, which is all the more {247} vexing in color photography. Light from lightbulbs is orange; light from fluorescents is olive; light from the sun is yellow; light from the sky is blue. Our brain somehow factors out the color of the illumination, just as it factors out the intensity of the illumination, and sees an object in its correct color in all those lights. Cameras don't. Unless they send out their own white light from a flash, they render an indoor scene with a thick rusty cast, a shady scene as pasty blue, and so on. A knowledgeable photographer can buy special film or screw a filter on the lens to compensate, and a good lab technician can correct the color when printing the photograph, but an instant camera obviously cannot. So Land had a practical interest in how to remove the intensity and color of the illumination, a problem called color constancy.
But he was also a self-taught, ingenious perception scientist, curious about how the brain solves the problem. He set up a color perception lab and developed a clever theory of color constancy. His idea, called the Retinex theory, gave the perceiver several assumptions. One is that earthly illumination is a rich mixture of wavelengths. (The exception that proves the rule is the sodium vapor lamp, the energy-saving fixture found in parking lots. It sends out a narrow range of wavelengths which our perception system can't factor out; cars and faces are dyed a ghastly yellow.) The second assumption is that gradual changes in brightness and color across the visual field probably come from the way the scene is illuminated, whereas abrupt transitions probably come from the boundary where one object ends and another begins. To keep things simple, he tested people and his model on artificial worlds composed of 2-D rectangular patches, which he called Mondrians, after the Dutch painter. In a Mondrian lit from the side, a yellow patch at one edge can reflect very different light from the same yellow patch at the other. But people see them both as yellow, and the Retinex model, which removes the lighting gradient from edge to edge, does too.
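The second assumption lends itself to a one-dimensional toy. The sketch below is not Land's Retinex model; it is a simplified illustration of the gradual-versus-abrupt idea, and the function name `retinex_1d`, the threshold, and the test scene are all assumptions of mine. It takes logarithms so that ratios become differences, throws away small steps (blamed on lighting), keeps large steps (blamed on object boundaries), and adds the survivors back up.

```python
import math


def retinex_1d(luminances, threshold=0.1):
    """Recover relative lightness along a row of samples.

    Returns lightness relative to the first sample.
    """
    logs = [math.log(v) for v in luminances]
    steps = [b - a for a, b in zip(logs, logs[1:])]
    # Gradual changes are attributed to illumination and discarded;
    # abrupt changes are attributed to surface boundaries and kept.
    kept = [s if abs(s) > threshold else 0.0 for s in steps]
    relative = [0.0]
    for s in kept:
        relative.append(relative[-1] + s)
    return [math.exp(r) for r in relative]


# A strip of three patches (light, dark, light) lit from the side,
# so the illumination falls off by 2% per sample.
reflectance = [0.8] * 5 + [0.2] * 5 + [0.8] * 5
illumination = [100 * 0.98 ** i for i in range(len(reflectance))]
luminance = [r * l for r, l in zip(reflectance, illumination)]
recovered = retinex_1d(luminance)

print(f"raw light, first vs last patch: {luminance[0]:.1f} vs {luminance[-1]:.1f}")
print(f"recovered, first vs last patch: {recovered[0]:.2f} vs {recovered[-1]:.2f}")
```

The two light patches send quite different amounts of light to the eye, yet come out nearly equal after the gradient is removed. The small leftover difference arises because the lighting also changes a little across the two kept boundary steps.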
The Retinex theory was a good start, but it turned out to be too simple. One problem is the assumption that the world is a Mondrian, a big flat plane. Go back to Adelson's drawings on page 242, which are zigzag Mondrians. The Retinex model treats all sharp boundaries alike, interpreting Edge 1 in the left drawing like Edge 1 in the right drawing. But to you, the left one looks like a border between stripes of different colors, and the right one looks like a single stripe that is folded and partly in shade. The difference comes from your interpretation of 3-D shape. Your shape analyzer has bent the Mondrians into striped room dividers, but {248} the Retinex model sees them as the same old checkerboard. Obviously, it is missing something.
That something is the effect of slant on shading, the third law that turns a scene into an image. A surface facing a light source head-on reflects back a lot of light, because the light smacks into the surface and rebounds right back. A surface angled to almost parallel to the source reflects much less, because most of the light grazes off it and continues on its way. If you are positioned near the light source, your eye picks up more light when the surface faces you than when it faces almost sideways. You may be able to see the difference by shining a flashlight at a piece of gray cardboard and tilting the cardboard.
How might our shading analyzer run the law backwards and figure out how a surface is slanted based on how much light it reflects? The benefits go beyond estimating the slant of a panel. Many objects, like cubes and gems, are composed of slanted faces, so recovering the slants is a way to ascertain their shape. In fact, any shape can be thought of as a carving made up of millions of tiny facets. Even when the surface is smoothly curved so the “facets” shrink to points, the shading law applies to the light coming off each point. If the law could be run backwards, our shading analyzer could apprehend the shape of a surface by registering the slant of the tangent plane resting on each point.
Unfortunately, a given amount of light reflecting off a patch could have come from a dark surface angled toward the light or from a light surface angled away. So there is no foolproof way to recover a surface's angle from the light it reflects without making additional assumptions.
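The ambiguity is easy to demonstrate if we adopt a specific shading law. The sketch below assumes Lambert's cosine law for an ideal matte surface, which the text does not name; the function name `matte_brightness` and the particular reflectances and angles are likewise assumptions chosen for illustration.

```python
import math


def matte_brightness(reflectance, illumination, slant_degrees):
    """Light reflected by an ideal matte surface (Lambert's cosine law).

    `slant_degrees` is the angle between the surface normal and the
    direction of the light: 0 means the surface faces the light head-on.
    """
    return reflectance * illumination * math.cos(math.radians(slant_degrees))


# A dark surface facing the light head-on...
dark_facing = matte_brightness(reflectance=0.4, illumination=100, slant_degrees=0)
# ...and a light surface turned 60 degrees away from it.
light_slanted = matte_brightness(reflectance=0.8, illumination=100, slant_degrees=60)

print(f"dark, head-on: {dark_facing:.1f}   light, slanted: {light_slanted:.1f}")
```

Both patches send the same amount of light to the eye, so the reading alone cannot say which one is out there.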
A first assumption is that surface lightness is uniform: the world is made of plaster. When surfaces are unevenly pigmented, the assumption is violated, and our shading analyzer should be fooled. It is. Paintings and photographs are the most obvious example. A less obvious one is countershading in animal camouflage. The hides of many animals lighten from back to belly in a gradient that cancels out the effects of light on their 3-D shapes. This flattens the animal, making it harder to detect by the assumption-making, shape-from-shading analyzer in the brain of a predator. Makeup is another example. When applied in sub-Tammy Faye Bakker amounts, pigment on the skin can fool the beholder into seeing {249} the flesh and bone as having a more ideal shape. Dark blush on the sides of the nose makes them look as if they are at a shallower angle to the light, which makes the nose appear narrower. White powder on the upper lip works the other way: the lip seems to intercept the light source head-on as if it were fuller, bestowing that desirable pouty look.
The shape-from-shading analyzer has to make other assumptions, too. Surfaces in the world are made of thousands of materials, and light bounces off their slanted surfaces in very different ways. A matte surface like chalk or dull paper follows a simple law, and the brain's shading analyzer often seems to assume that the world is matte. Surfaces with glosses, patinas, fuzz, pits, and prickles do other, stranger things with light, and they can fool the eye.
A famous example is the full moon. It looks like a flat disk, but of course it is a sphere. We have no trouble seeing other spheres from their shading, like ping-pong balls, and any good artist can sketch a sphere with charcoal. The problem with the moon is that it is pockmarked with craters of all sizes, most too small to be discerned from the earth, and they combine into a surface that behaves differently from the matte ideal that our shading analyzer takes for granted. The center of the full moon faces the viewer flat-on, so it should be brightest, but it has little nooks and crannies whose walls are seen edge-on from the viewer's earthly vantage point, making the center of the moon look darker. The surfaces near the perimeter of the moon graze the line of sight and should look darker, but they present their canyon walls face-on and reflect back lots of light, making the perimeter look lighter. Over the whole moon, the angle of its surface and the angles of the facets of its craters cancel out. All portions reflect back the same amount of light, and the eye sees it as a disk.
If we had to depend on any one of these analyzers, we would be eating bark and stepping off cliffs. Each analyzer makes assumptions, but those assumptions are often contradicted by other analyzers. Angle, shape, material, lighting — they're all scrambled together, but somehow we unscramble them and see one shape, with one color, at one angle, in one kind of light. What's the trick?
Adelson, together with the psychologist Alex Pentland, used his zigzag {250} illusion in a little parable. You are a designer who must build a stage set that looks just like the right-hand diagram. You go to a workshop where specialists build scenery for dramatic productions. One is a lighting designer. Another is a painter. A third is a sheet-metal worker. You show them the picture and ask them to build a scene that looks like it. In effect, they have to do what the visual system does: given an image, figure out the arrangement of matter and lighting that could have brought it about.
There are many ways the specialists can satisfy you. Each could almost do it alone. The painter could simply paint the arrangement of parallelograms on a flat sheet of metal and ask the lighting designer to illuminate it with a single flood:
The lighting designer could take a plain white sheet and set up nine custom spotlights, each with a special mask and filter, aimed just right to project nine parallelograms onto the sheet (six of the spotlights are shown here):
The sheet-metal worker could bend some metal into special shapes that when illuminated and viewed from just the right angle give rise to the image: {251}
Finally, the figure could be produced by the specialists cooperating. The painter would paint a stripe across the middle of a square sheet of metal, the sheet-metal worker would bend it into a zigzag, and the lighting designer would illuminate the piece with a floodlight. That, of course, is how a human being interprets the image.
Our brain faces the same embarrassment of riches as the set designer in the parable. Once we allow in a mental “expert” that can hypothesize pigmented surfaces out there, it could explain everything in the image as paint: the world would be seen as a masterful trompe l'oeil. Likewise, a lighting expert in the head could tell us that the world is a movie. Since these interpretations are undesirable, the mental specialists should somehow be discouraged from making them. One way would be to force them to stick with their assumptions, come what may (color and lighting are even, shapes are regular and parallel), but that's too extreme. The world is not always a pile of blocks on a sunny day; sometimes it does have complicated pigments and lighting, and we see them. We don't want the experts to deny that the world can be complex. We want them to propose exactly as much complexity as there is in the world, and no more. The problem now is how to get them all to do it.
Return now to the parable. Suppose the set design department is on a budget. The specialists charge for their services, using a fee schedule that reflects how difficult and unusual a request is. Simple and common operations are cheap; complex and unusual operations are expensive. {252}
Painter Fees:
  Paint a rectangular patch: $5 each
  Paint a regular polygon: $5 per side

Sheet-Metal Worker Fees:
  Right-angle cuts: $2 each
  Odd-angle cuts: $5 each
  Right-angle bends: $2 each
  Odd-angle bends: $5 each

Lighting Designer Fees:
  Floodlight: $5 each
  Custom spotlight: $30 each
We need one more specialist: a supervisor, who decides how to contract out the job.
Supervisor Fees:
  Consultation: $30 per job
The prices for the four solutions will differ. Here are the estimates:
Painter's Solution:
  Paint 9 polygons: $180
  Set up 1 floodlight: $5
  Cut 1 rectangle: $8
  Total: $193

Lighting Designer's Solution:
  Cut 1 rectangle: $8
  Set up 9 custom spotlights: $270
  Total: $278

Sheet-Metal Worker's Solution:
  Cut 24 odd angles: $120
  Bend 6 odd angles: $30
  Set up 1 floodlight: $5
  Total: $155

Supervisor's Solution:
  Cut 1 rectangle: $8
  Bend 2 right angles: $4
  Paint 3 rectangles: $15
  Set up 1 floodlight: $5
  Supervisor's fee: $30
  Total: $62
The supervisor's solution is the cheapest because it uses each specialist optimally, and the savings more than make up for the supervisor's fee. The moral is that the specialists must be coordinated, not necessarily by a homunculus or demon, but by some arrangement that minimizes the {253} costs, where cheap equals simple equals probable. In the parable, simple operations are easier to perform; in the visual system, simpler descriptions correspond to likelier arrangements in the world.
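The bookkeeping in the parable can be reproduced with a small cost table. The fees and job lists are taken from the estimates above; two readings are my own inferences from the quoted prices, namely that each of the nine polygons has four sides (36 sides at $5 gives $180) and that cutting one rectangle takes four right-angle cuts (4 at $2 gives $8). The dictionary keys and the function name `total_cost` are hypothetical.

```python
FEES = {
    "paint_rectangle": 5,
    "paint_polygon_side": 5,
    "right_angle_cut": 2,
    "odd_angle_cut": 5,
    "right_angle_bend": 2,
    "odd_angle_bend": 5,
    "floodlight": 5,
    "custom_spotlight": 30,
    "supervisor_consultation": 30,
}

# Operation counts for each way of building the stage set.
SOLUTIONS = {
    "painter": {"paint_polygon_side": 9 * 4, "floodlight": 1, "right_angle_cut": 4},
    "lighting designer": {"right_angle_cut": 4, "custom_spotlight": 9},
    "sheet-metal worker": {"odd_angle_cut": 24, "odd_angle_bend": 6, "floodlight": 1},
    "supervisor": {
        "right_angle_cut": 4,
        "right_angle_bend": 2,
        "paint_rectangle": 3,
        "floodlight": 1,
        "supervisor_consultation": 1,
    },
}


def total_cost(operations):
    """Sum the fee for every operation in a job."""
    return sum(FEES[name] * count for name, count in operations.items())


totals = {who: total_cost(ops) for who, ops in SOLUTIONS.items()}
cheapest = min(totals, key=totals.get)

for who, cost in totals.items():
    print(f"{who}: ${cost}")
print(f"cheapest: {cheapest}")
```

Picking the minimum-cost combination plays the part of the supervisor: no single specialist decides, but the pricing makes the simple, probable explanation win.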
Adelson and Pentland have brought their parable to life by programming a computer simulation of vision that is designed to interpret scenes with painted polygons much as we do. First, a shape analyzer (a software version of the sheet-metal worker) strives for the most regular shape that duplicates the image. Take the simple shape on the left in this diagram, which people see as a folded sheet, like a book held sideways.
The shape specialist tries to assemble a 3-D model of the input shape, shown on the right. When it begins, all it knows is that the corners and edges in the model have to line up with the dots and lines in the image; it does not know how far away they are in depth. The model's vertices are beads sliding on rods (like rays of projection), and the lines between them are infinitely elastic strings. The specialist slides the beads around until it arrives at a shape with the following desiderata. Each polygon making up the shape should be as regular as possible; that is, a polygon's angles should not be too different. For example, if the polygon has four sides, the specialist strives for a rectangle. The polygon should be as planar as possible, as if the polygon is filled in with a plastic panel that is hard to bend. And the polygons should be as compact as possible, rather than elongated along the line of sight, as if the plastic panel is also hard to stretch.
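The first desideratum, regularity, can be illustrated with a toy cost function. This is not Adelson and Pentland's program: it scores only the spread of a polygon's corner angles, leaves out planarity and compactness, performs no search, and assumes orthographic projection, so that each bead slides along a rod parallel to the line of sight. All names and coordinates are hypothetical.

```python
import math


def corner_angles(vertices):
    """Interior angle, in degrees, at each corner of a closed 3-D polygon."""
    n = len(vertices)
    angles = []
    for i in range(n):
        here = vertices[i]
        u = [a - b for a, b in zip(vertices[i - 1], here)]
        v = [a - b for a, b in zip(vertices[(i + 1) % n], here)]
        dot = sum(a * b for a, b in zip(u, v))
        norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        cosine = max(-1.0, min(1.0, dot / norms))  # guard against rounding
        angles.append(math.degrees(math.acos(cosine)))
    return angles


def irregularity(vertices):
    """Variance of the corner angles: zero when all angles are equal."""
    angles = corner_angles(vertices)
    mean = sum(angles) / len(angles)
    return sum((a - mean) ** 2 for a in angles) / len(angles)


def lift(image_points, depths):
    """Slide each bead along its rod: keep (x, y) fixed, assign a depth z."""
    return [(x, y, z) for (x, y), z in zip(image_points, depths)]


# A parallelogram in the image, with corner angles of 45 and 135 degrees.
IMAGE = [(0.0, 0.0), (1.0, 0.0), (1.5, 0.5), (0.5, 0.5)]

flat = lift(IMAGE, [0.0, 0.0, 0.0, 0.0])       # a skewed shape lying in the picture plane
tilted = lift(IMAGE, [0.0, 1.0, 0.5, -0.5])    # a rectangle slanted in depth

print(f"flat reading:   irregularity {irregularity(flat):.1f}")
print(f"tilted reading: irregularity {irregularity(tilted):.1f}")
```

Both readings project to exactly the same image, but the slanted rectangle has the lower cost, so a specialist that prizes regularity would settle on it.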
When the shape specialist is done, it passes on a rigid assembly of white panels to the lighting specialist. The lighting specialist knows the laws that dictate how reflected light depends on the illumination, the lightness of the surface, and the angle of the surface. The specialist is allowed to move a single distant light source around to illuminate the model from various directions. The optimal direction is the one that makes each pair of panels meeting at an edge look as much as possible {254} like their counterparts in the image, requiring as little gray paint as possible to finish the job.
Finally, the reflectance specialist — the painter — gets the model. It is the specialist of last resort, and its task is to take care of any remaining discrepancy between the image and the model. It finishes the job by proposing different shades of pigmentation for the various surfaces.
Does the program work? Adelson and Pentland presented it with a fanfold object and let it rip. The program displays its current guess about the object's shape (first column), its current guess about the direction of the light source (second column), its current guess about where the shadows fall (third column), and its current guess about how the object is painted (fourth column). The program's very first guesses are shown in the top row.
The program initially assumed that the object was flat, like a 2-D painting lying on a table, as in the top of the first column. (It is hard to depict this for you, because your brain insists on seeing a zigzag shape as being folded in depth. The sketch is trying to show some lines sitting flat on the page.) The program assumed the light source was head-on, from the direction of the eye (top of the second column). With this flat lighting, there are no shadows (top of the third column). The reflectance specialist bears all the responsibility for the {255} image, and it just paints it in. The program thinks it is looking at a painting.
Once the program has a chance to adjust its guesses, it settles into the interpretation shown in the middle row. The shape specialist finds the most regular 3-D shape (shown in side view in the left column): square panels joined at right angles. The lighting specialist finds that by shining the light from above, it can make the play of shadows look something like the image. Finally the reflectance specialist touches up the model with paint. The four columns — zigzag 3-D shape, lighting from above, shadow in the middle, light stripe next to a darker one — correspond to how people interpret the original image.
Does the program do anything else reminiscent of humans? Remember how the fanfold flips in depth like a Necker cube. The outer fold becomes an inner one, and vice versa. The program, in a way, can see the flip, too; the flipped interpretation is shown in the bottom row. The program assigned the same costs to the two interpretations and arrived at one or the other randomly. When people see a 3-D shape flip, they usually see the direction of the light source flip, too: top fold out, light from above; bottom fold out, light from below. The program does the same. Unlike a person, the program does not actually flip between the two interpretations, but if Adelson and Pentland had had the specialists pass around their guesses in a constraint network (like the Necker cube network on p. 107 or the stereo vision model), rather than in an assembly line, it might have done so.
The workshop parable clarifies the idea that the mind is a collection of modules, a system of organs, or a society of experts. Experts are needed because expertise is needed: the mind's problems are too technical and specialized to be solved by a jack-of-all-trades. And most of the information needed by one expert is irrelevant to another and would only interfere with its job. But working in isolation, an expert can consider too many solutions or doggedly pursue an unlikely one; at some point the experts must confer. The many experts are trying to make sense of a single world, and that world is indifferent to their travails, neither offering easy solutions nor going out of its way to befuddle. So a supervisory scheme should aim to keep the experts within a budget in which improbable guesses are more expensive. That forces them to cooperate in assembling the most likely overall guess about the state of the world. {256}
Once the experts have completed their work, what do they post on the blackboard that the rest of the brain accesses? If we could somehow show the visual field from a rest-of-the-brain's-eye view, like the hypothetical camera behind the eye of the Terminator, what would it look like? The very question may sound like a thick-witted little-man-in-the-head fallacy, but it is not. It is about the information in one of the brain's data representations and the form the information takes. Indeed, taking the question seriously sends a bracing shock to our naive intuitions about the mind's eye.
The experts in stereo, motion, contour, and shading have worked hard to recover the third dimension. It would be natural to use the fruits of their labors to build a three-dimensional representation of the world. The retinal mosaic in which the scene is depicted gives way to a mental sandbox in which it is sculpted; the picture becomes a scale model. A 3-D model would correspond to our ultimate understanding of the world. When a child looms up to us and then shrinks away, we know we are not in Wonderland, where one pill makes you larger and one pill makes you small. And unlike the proverbial (and apocryphal) ostrich, we do not think that objects vanish when we look away or cover them up. We negotiate reality because our thought and action are guided by knowledge of a large, stable, solid world. Perhaps vision gives us that knowledge in the form of a scale model.
There is nothing inherently fishy about the scale-model theory. Many computer-aided design programs use software models of solid objects, and CAT-scan and MRI machines use sophisticated algorithms to assemble them. A 3-D model might have a list of the millions of coordinates of the tiny cubes that make up a solid object, called volume elements or “voxels” by analogy to the picture elements or “pixels” making up a picture. Each coordinate-triplet is paired with a piece of information, such as the density of the tissue at that spot in the body. Of course, if the brain stored voxels, they would not have to be arranged in a 3-D cube in the head, any more than voxels are arranged in a 3-D cube inside a computer. All that matters is that each voxel have a consistent set of neurons dedicated to it, so the patterns of firing can register the contents of the voxel. {257}
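A voxel store of this kind is simple to sketch. The example below uses a dictionary keyed by coordinate triplets; the names `voxels` and `lookup` and the density values are invented for illustration. The entries sit in no particular spatial arrangement in memory, which mirrors the point that voxels need not be laid out as a 3-D cube.

```python
# Each coordinate triplet is paired with a piece of information, here a density.
voxels = {
    (0, 0, 0): 1.0,
    (0, 0, 1): 1.0,
    (0, 1, 0): 0.4,
    (1, 0, 0): 0.4,
}


def lookup(x, y, z, empty=0.0):
    """Coordinates of a voxel in, contents of the voxel out."""
    return voxels.get((x, y, z), empty)


print(lookup(0, 0, 1))   # a stored voxel
print(lookup(5, 5, 5))   # nothing stored there, so the default is returned
```

The lookup has no vantage point: any voxel is as easy to reach as any other, whether it lies on the surface of the object or deep inside it.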
But now is the time to be vigilant about the homunculus. There is no problem with the idea that some software demon or look-up algorithm or neural network accesses information from a scale model, as long as we are clear that it accesses the information directly: coordinates of a voxel in, contents of the voxel out. Just don't think about the look-up algorithm seeing the scale model. It's pitch black in there, and the looker-upper doesn't have a lens or a retina or even a vantage point; it is anywhere and everywhere. There is no projection, no perspective, no field of view, no occlusion. Indeed, the whole point of the scale model is to eliminate these nuisances. If you want to think of a homunculus at all, imagine exploring a room-sized scale model of a city in the dark. You can wander through it, coming at a building from any direction, palpating its exterior or sticking fingers through windows and doors to probe its insides. When you grasp a building, its sides are always parallel, whether you are at arm's length or up close. Or think about feeling the shape of a small toy in your hands, or a candy in your mouth.
But vision — even the 3-D, illusion-free vision that the brain works so hard to achieve — is nothing like that! At best, we have an abstract appreciation of the stable structure of the world around us; the immediate, resplendent sense of color and form that fills our awareness when our eyes are open is completely different.
First, vision is not a theater in the round. We vividly experience only what is in front of our eyes; the world beyond the perimeter of the visual field and behind the head is known only in a vague, almost intellectual way. (I know there is a bookshelf behind me and a window in front of me, but I see only the window, not the bookshelf.) Worse, the eyes flit from spot to spot several times a second, and outside the crosshairs of the fovea the view is surprisingly coarse. (Hold your hand a few inches from your line of sight; it is impossible to count the fingers.) I am not just reviewing the anatomy of the eyeball. One could imagine the brain assembling a collage out of the snapshots taken at each glimpse, like the panoramic cameras that expose a frame of film, pan a precise amount, expose the adjacent stretch of film, and so on, yielding a seamless wide-angle picture. But the brain is not a panoramic camera. Laboratory studies have shown that when people move their eyes or head, they immediately lose the graphic details of what they were looking at.
Second, we don't have x-ray vision. We see surfaces, not volumes. If you watch me put an object inside a box or behind a tree, you know it's there but don't see it there and cannot report its details. Once again, this {258} is not just a reminder that you are not Superman. We mortals could have been equipped with a photographic memory that updates a 3-D model by pasting in information from previous views wherever it belongs. But we were not so equipped. When it comes to rich visual detail, out of sight is out of mind.
Third, we see in perspective. When you stand between railroad tracks, they seem to converge toward the horizon. Of course you know they do not really converge; if they did, the train would derail. But it's impossible not to see them as converging, even though your sense of depth provides plenty of information that your brain could use to cancel the effect. We also are aware that moving objects loom, shrink, and foreshorten. In a genuine scale model, none of this can happen. To be sure, the visual system eliminates perspective to a certain degree. People other than artists have trouble seeing that the near corner of a desk projects an acute angle and the far corner an obtuse angle; both look like the right angles they are in reality. But the railroad tracks show that perspective is not completely eliminated.
Fourth, in a strict geometric sense we see in two dimensions, not three. The mathematician Henri Poincaré came up with an easy way to determine the number of dimensions of some entity. Find an object that can divide the entity into two pieces, then count the dimensions of the divider and add one. A point cannot be divided at all; therefore, it has zero dimensions. A line has one dimension, because it can be severed by a point. A plane has two dimensions, because it can be rent by a line, though not by a point. A sphere has three, because nothing less than a two-dimensional blade can cleave it; a pellet or a needle leaves it whole. What about the visual field? It can be sundered by a line. The horizon, for example, divides the visual field in two. When we stand in front of a taut cable, everything we see is on one side or the other. The perimeter of a round table also partitions the visual field: every point is either within it or outside. Add one to the one-dimensionality of a line, and you get two. By this criterion, the visual field is two-dimensional. Incidentally, this does not mean that the visual field is flat. Two-dimensional surfaces can be curved in the third dimension, like a rubber mold or a blister package.
Fifth, we don't immediately see “objects,” the movable hunks of matter that we count, classify, and label with nouns. As far as vision is concerned, it's not even clear what an object is. When David Marr considered how to design a computer vision system that finds objects, he was forced to ask: {259}
Is a nose an object? Is a head one? Is it still one if it is attached to a body? What about a man on horseback? These questions show that the difficulties in trying to formulate what should be recovered as a region from an image are so great as to amount almost to philosophical problems. There is really no answer to them — all these things can be an object if you want to think of them that way, or they can be part of a larger object.
A drop of Krazy Glue can turn two objects into one, but the visual system has no way of knowing that.
We have, however, an almost palpable sense of surfaces and the boundaries between them. The most famous illusions in psychology come from the brain's unflagging struggle to carve the visual field into surfaces and to decide which is in front of the other. One example is the Rubin face-vase, which flips between a goblet and a pair of profiles tête-à-tête. The faces and vase cannot be seen at the same time (even if one imagines two men holding up a goblet between their noses), and whichever shape predominates “owns” the border as its demarcating line, relegating the other patch to an amorphous backdrop.
Another is the Kanizsa triangle, a stretch of nothingness that blocks out a shape as real as if it had inscribed it in ink.
The faces, vase, and triangle are familiar objects, but the illusions do not depend on their familiarity; meaningless blobs are just as compelling. {260}
We perceive surfaces involuntarily, impelled by information surging up from our retinas; contrary to popular belief, we do not see what we expect to see.
So what is the product of vision? Marr called it a 2½-D sketch; others call it a visible surface representation. Depth is whimsically downgraded to half a dimension because it does not define the medium in which visual information is held (unlike the left-right and high-low dimensions); it is just a piece of information held in that medium. Think of the toy made of hundreds of sliding pins which you press against a 3-D surface (such as a face), forming a template of the surface in the contour of the pins on the other side. The contour has three dimensions, but they are not created equal. Position from side to side and position from top to bottom are defined by particular pins; position in depth is defined by how far a pin protrudes. For any depth there may be many pins; for any pin there is only one depth.
The 2½-D sketch looks a bit like this:
It is a mosaic of cells or pixels, each dedicated to a line of sight from the cyclopean eye's vantage point. It is wider than it is tall because our two eyes sit side by side in our skulls rather than one being above the other. The cells are smaller in the center of the visual field than in the periphery because our resolution is greater in the center. Each cell can represent information about a surface or about an edge, as if it had two kinds {261} of forms with blanks to be filled in. The form for a piece of surface has blanks for depth, for slant (how much the surface leans backward or forward), for tilt (how much it lists left or right), and for color, plus a label for which surface it is seen as belonging to. The form for a piece of edge has boxes to be checked, indicating whether it is at the boundary of an object, a groove, or a ridge, plus a dial for its orientation, which also shows (in the case of an object boundary) which side belongs to the surface that “owns” the boundary and which side is merely the backdrop. Of course, we won't literally find bureaucratic forms in the head. The diagram is a composite that depicts the kinds of information in the 2½-D sketch. The brain presumably uses clusters of neurons and their activities to hold the information, and they may be distributed across different patches of cortex as a collection of maps that are accessed in register.
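The two kinds of forms can be written down as a data structure. The Python sketch below is one possible rendering, not Marr's notation: the class names `SurfaceForm`, `EdgeForm`, and `Cell`, the helper `blank_sketch`, and the choice of degrees as units are all assumptions made for illustration. It also uses a uniform grid, ignoring the fact that the real cells are smaller in the center of the field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SurfaceForm:
    depth: float      # distance along this cell's line of sight
    slant: float      # lean backward or forward (degrees, assumed unit)
    tilt: float       # list to the left or right (degrees, assumed unit)
    color: str
    surface_id: int   # label for the surface the patch is seen as part of


@dataclass
class EdgeForm:
    kind: str                         # "object boundary", "groove", or "ridge"
    orientation: float                # the dial setting, in degrees
    owner_side: Optional[str] = None  # for an object boundary: which side owns it


@dataclass
class Cell:
    """One line of sight; either form may be left blank."""
    surface: Optional[SurfaceForm] = None
    edge: Optional[EdgeForm] = None


def blank_sketch(rows, cols):
    """A grid of empty cells, one independent Cell per line of sight."""
    return [[Cell() for _ in range(cols)] for _ in range(rows)]


sketch = blank_sketch(rows=3, cols=5)  # wider than it is tall
sketch[1][2].surface = SurfaceForm(depth=2.0, slant=10.0, tilt=0.0,
                                   color="red", surface_id=1)
sketch[1][3].edge = EdgeForm(kind="object boundary", orientation=90.0,
                             owner_side="left")
```

Note that depth is just one field inside a cell, whereas left-right and up-down position are given by where the cell sits in the grid. That asymmetry is the "half" in two and a half dimensions.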
Why do we see in two and a half dimensions? Why not a model in the head? The costs and benefits of storage give part of the answer. Any computer user knows that graphics files are voracious consumers of storage space. Rather than agglomerating the incoming gigabytes into a composite model, which would be obsolete as soon as anything moved, the brain lets the world itself store the information that falls outside a glance. Our heads crane, our eyes flit, and a new, up-to-date sketch is loaded in. As for the second-class status of the third dimension, it is almost inevitable. Unlike the other two dimensions, which announce themselves in the rods and cones that are currently active, depth must be painstakingly wrung out of the data. The stereo, contour, shading, and motion experts that work on computing depth are equipped to send along information about distance, slant, tilt, and occlusion relative to the viewer, not 3-D coordinates in the world. The best they can do is to pool their efforts to give us a two-and-a-half-dimensional acquaintance with the surfaces in front of our eyes. It's up to the rest of the brain to figure out how to use it.
The 2½-D sketch is the masterwork of the ingeniously designed, harmoniously running machinery of the visual system. It has only one problem. As delivered, it is useless.
Information in the 2½-D array is specified in a retinal frame of reference, a coordinate system centered on the viewer. If a particular cell {262} says, “There's an edge here,” what “here” means is the position of that cell on the retina — say, dead straight ahead where you're looking. That would be fine if you were a tree looking at another tree, but as soon as something moves — your eyes, your head, your body, a sighted object — the information lurches to a new resting place in the array. Any part of the brain being guided by information in the array would find that its information is now defunct. If your hand was being guided toward the center of the visual field because that spot had contained an apple, the hand would now be heading toward empty space. If yesterday you memorized an image of your car as you were looking at its door handle, today the image would not match your view of the fender; the two views would barely overlap. You can't even make simple judgments like whether two lines are parallel; remember the converging railroad tracks.
These problems make one long for a scale model in the head, but that isn't what vision delivers. The key to using visual information is not to remold it but to access it properly, and that calls for a useful reference frame or coordinate system. Reference frames are inextricable from the very idea of location. How do you answer the question “Where is it?” By naming an object that the asker already knows — the frame of reference — and describing how far and in what direction the “it” is, relative to the frame. A description in words like “next to the fridge,” a street address, compass directions, latitude and longitude, Global Positioning System satellite coordinates — they all indicate distance and direction relative to a reference frame. Einstein built his theory of relativity by questioning Newton's fictitious reference frame that was somehow anchored in empty space, independent of anything in it.
The frame of reference packaged with the 2½-D sketch is position on the retina. Since the retinas constantly gyrate, it is as useless as directions like “Meet me next to the beige Pontiac that's stopped here at the light.” We need a reference frame that stays put as the eyes rock and roll. Suppose there is a circuit that can slide an invisible reference frame over the visual field, like the crosshairs of a rifle sight sliding over a landscape. And suppose that any mechanism that scoops information out of the visual field is locked onto positions defined by the rifle sight (for example, at the hair-crossing, two notches above them, or a notch to the left). Computer displays have a vaguely similar device, the cursor. The commands that read and write information do so relative to a special point that can be positioned at will over the screen, and when the material on the screen scrolls around, the cursor moves with it, glued to its piece of {263} text or graphics. For the brain to use the contents of the 2½-D sketch, it must employ a similar mechanism, indeed, several of them.
The simplest reference frame that moves over the 2½-D sketch is one that stays riveted to the head. Thanks to the laws of optics, when the eyes move right, the image of the apple scoots left. But suppose the neural command to the eye muscles is cc'd to the visual field, and is used to shift the crosshairs over by the same amount in the opposite direction. The crosshairs will stay on the apple, and so will any mental process that funnels information through the crosshairs. The process can happily continue as if nothing had happened, even though the contents of the visual field have slid around.
Here's a simple demonstration of the cc'ing. Move your eyes; the world stands still. Now close one eye and nudge the other one with your finger; the world jumps. In both cases the eye moves, and in both cases the retinal image moves, but only when the eye is moved by a finger do you see the movement. When you move your eyes by deciding to look somewhere, the command to the eye muscles is copied to a mechanism that moves the reference frame together with the sliding images so as to cancel your subjective sense of motion. But when you move your eye by poking it with your finger, the frame-shifter is bypassed, the frame is not shifted, and you interpret the jerking image as coming from a jerking world.
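The logic of the demonstration can be played out in a toy one-dimensional model. This is a sketch under heavy simplifying assumptions: positions are single numbers in arbitrary units, and the names `retinal_position`, `HeadCenteredFrame`, and `on_eye_command` are invented for the example rather than drawn from any model in the vision literature.

```python
def retinal_position(world_pos, eye_pos):
    """Where a point in the world lands on the retina (1-D, arbitrary units)."""
    return world_pos - eye_pos


class HeadCenteredFrame:
    """Crosshairs that slide over the visual field."""

    def __init__(self, crosshair=0.0):
        self.crosshair = crosshair  # retinal position the frame is locked onto

    def on_eye_command(self, delta):
        # A copy of the command to the eye muscles shifts the crosshairs
        # by the same amount in the opposite direction.
        self.crosshair -= delta


apple = 3.0
eye = 0.0
frame = HeadCenteredFrame(crosshair=retinal_position(apple, eye))

# Case 1: a voluntary eye movement. The command is copied to the frame.
eye += 2.0
frame.on_eye_command(2.0)
voluntary_error = retinal_position(apple, eye) - frame.crosshair

# Case 2: a finger nudges the eye. The eye moves, but no command is copied.
eye += 1.0
poked_error = retinal_position(apple, eye) - frame.crosshair

print(voluntary_error)  # zero: the crosshairs are still on the apple
print(poked_error)      # nonzero: the apple appears to have jumped
```

In the first case the image and the crosshairs slide together, so the mismatch is zero and nothing seems to move. In the second the image slides alone, and the leftover mismatch is what gets read as motion in the world.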
There may also be reference frames that compensate for movements of the head and body. They give each bit of surface in the visual field a fixed address relative to the room or relative to the ground; the address stays the same as the body moves. These frame shifts might be driven by copies of commands to the neck and body muscles, though they may also be driven by circuitry that tracks the slippage of the contents of the visual field.
Another handy overlay would be a trapezoidal mental grid that marked out equal-sized extents in the world. A gridmark near our feet would cover a large stretch of the visual field; a gridmark near the horizon would cover a smaller stretch of the visual field but the same number of inches as measured along the ground. Since the 2½-D sketch contains depth values at every point, the gridmarks would be easy for the brain to {264} calculate. This world-aligned reference frame would allow us to judge the genuine angles and extents of the matter outside our skin. The perceptual psychologist J. J. Gibson argued that we do have this sense of real-world scale superimposed on the retinal projection, and we can mentally flip between not using it and using it. Standing between the railroad tracks, we can assume one frame of mind in which we see the tracks converge, or another in which we see them as parallel. These two attitudes, which Gibson called “the visual field” and “the visual world,” come from accessing the same information by either the retinal frame or a world-aligned frame.
Yet another invisible frame is the direction of gravity. The mental plumb bob comes from the vestibular system of the inner ear, a labyrinth of chambers that includes three semicircular canals oriented at right angles to each other. If anyone doubts that natural selection uses principles of engineering rediscovered by humans, let them behold the XYZ Cartesian coordinate axes etched into the bones of the skull! As the head pitches, rolls, and yaws, fluid in the canals sloshes around and triggers neural signals registering the motion. A heavy mass of grit pressing down on other membranes registers linear motion and the direction of gravity. These signals can be used to rotate the mental crosshairs so they are always correctly pointing “up.” That is why the world does not seem to list even though people's heads are seldom plumb perpendicular. (The eyes themselves tilt clockwise and counterclockwise in the head, but only enough to undo small head tilts.) Oddly enough, our brains do not compensate for gravity very much. If the compensation were perfect, the world would look normal when we are lying sideways or even standing on our heads. Of course, it does not. It's hard to watch television lying on your side unless you prop your head on your hand, and it's impossible to read unless you hold the book sideways. Perhaps because we are terrestrial creatures, we use the gravity signal mostly to keep our bodies upright rather than to compensate for out-of-kilter visual input when they are not.
The coordination of the retina's frame with the inner ear's frame affects our lives in a surprising way: it causes motion sickness. Ordinarily, when you move about, two signals work in synchrony: the swoops of texture and color in the visual field, and the messages about gravity and inertia sent by the inner ear. But if you are moving inside a container like a car, a boat, or a sedan chair — evolutionarily unprecedented ways to get around — the inner ear says, “You're moving,” but the walls and floor say, {265} “You're staying put.” Motion sickness is triggered by this mismatch, and the standard treatments have you eliminate it: don't read; look out the window; stare at the horizon.
Many astronauts are chronically space-sick, because there is no gravitational signal, a rather extreme mismatch between gravity and vision. (Space-sickness is measured in garns, a unit named after the Republican senator from Utah, Jake Garn, who parlayed his position on the NASA appropriations subcommittee into the ultimate junket, a trip into space. Space Cadet Garn made history as the all-time champion upchucker.) Worse, spacecraft interiors do not give the astronauts a world-aligned frame of reference, because the designers figure that without gravity the concepts “floor,” “ceiling,” and “walls” are meaningless, so they might as well put instruments on all six surfaces. The astronauts, unfortunately, carry their terrestrial brains with them and literally get lost unless they stop and say to themselves, “I'm going to pretend that thataway is ‘up,’ thataway is ‘forward,’” and so on. It works for a while, but if they look out the window and see terra firma above them, or catch sight of a crew-mate floating upside down, a wave of nausea slams them. Space sickness is a concern to NASA, and not only because of the decline in productivity during expensive flight time; you can well imagine the complications of vomiting in zero gravity. It will also affect the burgeoning technology of virtual reality, in which a person wears a wide-field helmet showing a synthetic world whizzing by. Newsweek's assessment: “The most barfogenic invention since the Tilt-a-Whirl. We prefer Budweiser.”
Why on earth — or space — should a mismatch between vision and gravity or inertia lead, of all things, to nausea? What does up-and-down have to do with the gut? The psychologist Michel Treisman has come up with a plausible though still unproven explanation. Animals vomit to expel toxins they have eaten before the toxins do further harm. Many naturally occurring toxins act on the nervous system. This raises the problem faced by Ingrid Bergman in Notorious: how do you know when you have been poisoned? Your judgment would be addled, but that would affect your judgment about whether your judgment has been addled! More generally, how could a malfunction detector distinguish between the brain's malfunctioning and its accurately registering an unusual situation? (Old bumper sticker: “The world is experiencing technical difficulties. Do not adjust your mind.”) Gravity, of course, is the most stable, predictable feature of the world. If two parts of the brain have different opinions about it, chances are that one or both is malfunctioning or that the signals they are {266} getting have been delayed or garbled. The rule would be: if you think gravity is acting up, you've been poisoned; jettison the rest of the poison, now.
The mental up-down axis is also a powerful organizer of our sense of shape and form. What do we have here?
Few people recognize that it is an outline of Africa rotated ninety degrees, even if they tilt their heads counterclockwise. The mental representation of a shape — how our minds “describe” it — does not just reflect its Euclidean geometry, which remains unchanged as a shape is turned. It reflects the geometry relative to our up-down reference frame. Our minds think of Africa as a thing with a fat bit “at the top” and a skinny bit “at the bottom.” Change what's at the top and what's at the bottom, and it's no longer Africa, even if not a jot of coastline has been altered.
The psychologist Irvin Rock has found many other examples, including this simple one:
People see the drawings as two different shapes, a square and a diamond. But as far as a geometer is concerned, they are one and the same shape. They are pegs that fit the same holes; every angle and line is the same. The only difference is in how they are aligned with respect to the viewer's up-and-down reference frame, and that difference is enough to earn them different words in the English language. A square is flat on top, a diamond is pointy on top; there's no avoiding the “on top.” It is even hard to see that the diamond is made of right angles. Finally, objects themselves can plot out reference frames: {267}
The shape at the top right flips between looking like a square and looking like a diamond, depending on whether you mentally group it with the three shapes to its left or the eight shapes below. The imaginary lines aligned with the rows of shapes have become Cartesian reference frames — one frame aligned with the retinal up-down, the other tilted diagonally — and a shape looks different when it is mentally described within one or the other.
And in case you are still skeptical about all these colorless, odorless, and tasteless reference frames allegedly overlaying the visual field, I give you a wonderfully simple demonstration from the psychologist Fred Attneave. What is going on in the triangles on the left?
Look at them long enough, and they snap from one appearance to another. They don't move around, they don't reverse in depth, but something changes. People describe the change as “which way they point.” What is leaping around the page is not the triangles themselves but a mental frame of reference overlaying the triangles. The frame comes not from the retina, the head, the body, the room, the page, or gravity, but from an axis of symmetry of the triangles. The triangles have three such axes, and they take turns dominating. Each axis has the equivalent of a north and a south pole, which grant the feeling that the triangles are pointing. The triangles flip en masse, as if in a chorus line; the brain likes its reference frames to embrace entire neighborhoods of shapes. The triangles in the right diagram are even more jumpy, hopping among six impressions. They can be interpreted either as obtuse triangles lying flat on the page or as right-angle triangles standing in depth, each with a reference frame that can sit three ways. {268}
The ability of objects to attract reference frames to themselves helps to solve one of the great problems in vision, the next problem we face as we continue our climb from the retina to abstract thought. How do people recognize shapes? An average adult knows names for about ten thousand things, most of them distinguished by shape. Even a six-year-old knows names for a few thousand, having learned them at a rate of one every few hours for years. Of course, objects can be recognized from many giveaways. Some can be recognized by their sounds and smells, and others, such as shirts in a hamper, can be identified only by their color and material. But most objects can be recognized by their shapes. When we recognize an object's shape, we are acting as pure geometers, surveying the distribution of matter in space and finding the closest match in memory. The mental geometer must be acute indeed, for a three-year-old can look through a box of animal crackers or a pile of garish plastic chips and rattle off the names of exotic fauna from their silhouettes.
The diagram at the bottom of page 9 introduced you to why the problem is so hard. When an object or the viewer moves, the contours in the 2½-D sketch change. If your memory for the shape — say, a suitcase — was a copy of the 2½-D sketch when you first saw it, the moved version would no longer match. Your memory of a suitcase would be “a rectangular slab and a horizontal handle at twelve o'clock,” but the handle you are now looking at is not horizontal and not at twelve o'clock. You would stare blankly, not knowing what it is. {269}
But suppose that instead of using the retinal reference frame, your memory file uses a frame aligned with the object itself. Your memory would be “a rectangular slab with a handle parallel to the edge of the slab, at the top of the slab.” The “of the slab” part means that you remember the positions of the parts relative to the object itself, not relative to the visual field. Then, when you see an unidentified object, your visual system would automatically align a 3-D reference frame on it, just as it did with Attneave's chorus line of squares and triangles. Now when you match what you see with what you remember, the two coincide, regardless of how the suitcase is oriented. You recognize your luggage.
That, in a nutshell, is how Marr explained shape recognition. The key idea is that a shape memory is not a copy of the 2½-D sketch but rather is stored in a format that differs from it in two ways. First, the coordinate system is centered on the object — not, as in the 2½-D sketch, on the viewer. To recognize an object, the brain aligns a reference frame on its axes of elongation and symmetry and measures the positions and angles of the parts in that reference frame. Only then are vision and memory matched. The second difference is that the matcher does not compare vision and memory pixel by pixel, as if placing a jigsaw puzzle piece in a gap. If it did, shapes that ought to match still might not. Real objects {270} have dents and wobbles and come in different styles and models. No two suitcases have identical dimensions, and some have rounded or beefed-up corners and fat or skinny handles. So the representation of the shape about to be matched shouldn't be an exact mold of every hill and valley. It should be couched in forgiving categories like “slab” and “U-shaped thingy.” The attachments, too, can't be specified to the millimeter but have to allow for some slop: the handles of different cups are all “on the side,” but they can be a bit higher or lower from cup to cup.
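The first of these two differences, the object-centered coordinate system, can be illustrated with a little geometry. The Python sketch below is a deliberately flattened version of the idea: it works in two dimensions rather than three, it assumes the object's main axis and its direction are already known, and the function names `rotate` and `object_centered` and the suitcase measurements are made up for the example. It shows that the handle's position changes in the viewer's coordinates when the suitcase is turned, but not in coordinates laid along the slab itself.

```python
import math


def rotate(point, angle):
    """Rotate a 2-D point about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * point[0] - s * point[1], s * point[0] + c * point[1])


def object_centered(part, origin, axis_end):
    """Express `part` in a frame centered on `origin` whose first axis
    points from `origin` toward `axis_end` (the object's long axis)."""
    ax, ay = axis_end[0] - origin[0], axis_end[1] - origin[1]
    length = math.hypot(ax, ay)
    ux, uy = ax / length, ay / length            # unit vector along the axis
    dx, dy = part[0] - origin[0], part[1] - origin[1]
    along = dx * ux + dy * uy                    # distance along the axis
    across = -dx * uy + dy * ux                  # distance to the side of it
    return (along, across)


# A remembered suitcase: slab centered at the origin, long axis toward (2, 0),
# handle sitting above the middle of the slab.
center, axis_end, handle = (0.0, 0.0), (2.0, 0.0), (0.0, 1.2)
remembered = object_centered(handle, center, axis_end)

# The same suitcase seen turned by 40 degrees.
angle = math.radians(40)
seen_center = rotate(center, angle)
seen_axis_end = rotate(axis_end, angle)
seen_handle = rotate(handle, angle)
seen = object_centered(seen_handle, seen_center, seen_axis_end)

print(seen_handle)       # viewer-centered: no longer where memory says
print(remembered, seen)  # object-centered: the same up to rounding error
```

The second difference, matching in forgiving categories rather than point by point, is not captured here; a tolerance on the comparison would be the crudest stand-in for it.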
The psychologist Irv Biederman has fleshed out Marr's two ideas with an inventory of simple geometric parts that he calls “geons” (by analogy to the protons and electrons making up atoms). Here are five geons along with some combinations:
Biederman proposes twenty-four geons altogether, including a cone, a megaphone, a football, a tube, a cube, and a piece of elbow macaroni. (Technically, they are all just different kinds of cones. If an ice cream cone is the surface swept out by an expanding circle as its center is moved along a line, geons are the surfaces swept out by other 2-D shapes as they expand or contract while moving along straight or curved lines.) Geons can be assembled into objects with a few attachment relations like “above,” “beside,” “end to end,” “end to off-center,” and “parallel.” These relations are defined in a frame of reference centered on the object, of course, not the visual field; “above” means “above the main geon,” not “above the fovea.” So the relations stay the same when the object or viewer moves.
Geons are combinatorial, like grammar. Obviously we don't describe shapes to ourselves in words, but geon assemblies are a kind of internal language, a dialect of mentalese. Elements from a fixed vocabulary are fitted together into larger structures, like words in a phrase or sentence. A sentence is not the sum of its words but depends on their syntactic arrangement; A man bites a dog is not the same as A dog bites a man. Likewise, an object is not the sum of its geons but depends on their spatial arrangement; a cylinder with an elbow on the side is a cup, while a {271} cylinder with an elbow at the top is a pail. And just as a small number of words and rules combine into an astronomical number of sentences, a small number of geons and attachments combine into an astronomical number of objects. According to Biederman, each of the twenty-four geons comes in fifteen sizes and builds (a bit fatter, a bit skinnier), and there are eighty-one ways to join them. That allows for 10,497,600 objects built out of two geons, and 306 billion objects made of three geons. In theory, that should be more than enough to fit the tens of thousands of shapes we know. In practice, it's easy to build instantly recognizable models of everyday objects out of three, and often only two, geons.
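The arithmetic behind those figures is easy to check. The short Python calculation below reproduces them under one assumption that the text does not spell out: that a three-geon object involves two attachments, each chosen independently from the eighty-one. The variable names are, of course, just labels for the example.

```python
geons = 24   # distinct geon types
builds = 15  # sizes and builds per geon
joins = 81   # ways to attach one geon to another

variants = geons * builds                      # distinct geon "words"

two_geon_objects = variants ** 2 * joins       # two parts, one attachment
three_geon_objects = variants ** 3 * joins ** 2  # three parts, two attachments (assumed)

print(f"{variants:,}")
print(f"{two_geon_objects:,}")
print(f"{three_geon_objects:,}")
```

With 360 geon variants, the two-geon count comes to exactly 10,497,600, and the three-geon count to a little over 306 billion.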
Language and complex shapes even seem to be neighbors in the brain. The left hemisphere is not only the seat of language but also the seat of the ability to recognize and imagine shapes defined by arrangements of parts. A neurological patient who had suffered a stroke to his left hemisphere reported, “When I try to imagine a plant, an animal, an object, I can recall but one part. My inner vision is fleeting, fragmented; if I'm asked to imagine the head of a cow, I know it has ears and horns, but I can't revisualize their places.” The right hemisphere, in contrast, is good for measuring whole shapes; it can easily judge whether a rectangle is taller than it is wide or whether a dot lies more or less than an inch from an object.
One advantage of the geon theory is that its demands on the 2½-D sketch are not unreasonable. Carving objects into parts, labeling the parts as geons, and ascertaining their arrangement are not insurmountable problems, and vision researchers have developed models of how the brain might solve them. Another advantage is that a description of an object's anatomy helps the mind to think about objects, not just to blurt out their names. People understand how objects work and what they are for by analyzing the shapes and arrangements of their parts.
The geon theory says that at the highest levels of perception the mind “sees” objects and parts as idealized geometric solids. That would explain a curious and long-noted fact about human visual aesthetics. Anyone who has been to a figure-drawing class or a nude beach quickly learns that real human bodies do not live up to our sweet imaginations. Most of us look better in clothes. In his history of fashion, the art historian Quentin Bell gives an explanation that could have come right out of the geon theory: {272}
If we wrap an object in some kind of envelope, so that the eyes infer rather than see the object that is enclosed, the inferred or imagined form is likely to be more perfect than it would appear if it were uncovered. Thus a square box covered with brown paper will be imagined as a perfect square. Unless the mind is given some very strong clue it is unlikely to visualize holes, dents, cracks, or other accidental qualities. In the same way, if we cast a drapery over a thigh, a leg, an arm or a breast, the imagination supposes a perfectly formed member; it does not and usually cannot envisage the irregularities and the imperfections which experience should lead us to expect.
. . . We know what [a body] is probably like from experience, and yet we are willing to suspend our disbelief in favour of the fictions of [the person's] wardrobe. Indeed I think that we are ready to go further in the way of self-deception. When we slip on our best jacket and see our deplorably unimpressive shoulders artfully magnified and idealised we do, for a moment, rise in our own esteem.
Geons are not good for everything. Many natural objects, such as mountains and trees, have complicated fractal shapes, but geons turn them into pyramids and lollipops. And though geons can be built into a passable generic human face, like a snowman or Mr. Potato Head, it is almost impossible to build a model of a particular face — John's face, your grandmother's face — that is different enough from other faces not to confuse them, but stable enough across smiles, frowns, weight gains, and aging to identify that person every time. Many psychologists believe that face recognition is special. In a social species like ours, faces are so important that natural selection gave us a processor that registers the kinds of geometric contours and ratios needed to tell them apart. Babies lock onto facelike patterns, but not onto other complex and symmetrical arrangements, when they are only thirty minutes old, and quickly learn to recognize their mothers, perhaps as early as the second day of life.
Face recognition may even use distinct parts of the brain. An inability to recognize faces is called prosopagnosia. It is not the same as Oliver Sacks’ famous man who mistook his wife for a hat: prosopagnosics can tell a face from a hat; they just can't tell whose face it is. But many of them can recognize hats and almost everything else. For example, the patient “LH” was tested by the psychologists Nancy Etcoff and Kyle Cave and the neurologist Roy Freeman. LH is an intelligent, knowledgeable man who suffered head injuries in a car accident twenty years {273} before the tests. Since the accident he has been utterly unable to recognize faces. He cannot recognize his wife and children (except by voice, scent, or gait), his own face in a mirror, or celebrities in photographs (unless they have a visual trademark like Einstein, Hitler, and the Beatles in their moptop days). It was not that he had trouble making out the details of a face; he could match full faces with their profiles, even in arty sidelighting, and assess their age, sex, and beauty. And he was virtually normal at recognizing complicated objects that were not faces, including words, clothing, hairstyles, vehicles, tools, vegetables, musical instruments, office chairs, eyeglasses, dot patterns, and television antenna-like shapes. There were only two kinds of shapes he had trouble with. He was embarrassed that he could not name his children's animal crackers; similarly, in the lab he was below average at naming drawings of animals. And he had some trouble recognizing facial expressions such as frowns, sneers, and fearful looks. But neither animals nor facial expressions were as hard for him as faces, which drew utter blanks.
It's not that faces are the hardest things our brains are ever called upon to recognize, so that if a brain is not running on all eight cylinders, face recognition will be the first thing to suffer. The psychologists Marlene Behrmann, Morris Moscovitch, and Gordon Winocur studied a young man who had been hit on the head by the rear-view mirror of a passing truck. He has trouble recognizing everyday objects but no trouble recognizing faces, even when the faces are disguised with glasses, wigs, or mustaches. His syndrome is the opposite of prosopagnosia, and it proves that face recognition is different from object recognition, not just harder.
So do prosopagnosics have a broken face-recognition module? Some psychologists, noting that LH and other prosopagnosics have some trouble with some other shapes, would rather say that prosopagnosics have trouble processing the kinds of geometric features that are most useful in recognizing faces, though also useful in recognizing certain other kinds of shapes. I think the distinction between recognizing faces and recognizing objects with the geometry of faces is meaningless. From the brain's point of view, nothing is a face until it has been recognized as a face. The only thing that can be special about a perception module is the kind of geometry it pays attention to, such as the distance between symmetrical blobs, or the curvature pattern of 2-D elastic surfaces that are drawn over a 3-D skeleton and filled out by underlying soft pads and connectors. If objects other than faces (animals, facial expressions, or {274} even cars) have some of these geometric features, the module will have no choice but to analyze them, even if they are most useful for faces. To call a module a face-recognizer is not to say it can handle only faces; it is to say that it is optimized for the geometric features that distinguish faces because the organism was selected in its evolutionary history for an ability to recognize them.
The geon theory is lovely, but is it true? Certainly not in its purest form, in which every object would get one description of its 3-D geometry, uncontaminated by the vagaries of vantage point. Most objects are opaque, with some surfaces obscuring others. That makes it literally impossible to arrive at the same description of the object from every vantage point. For example, you can't know what the back of a house looks like when you are standing in front of it. Marr got around the problem by ignoring surfaces altogether and analyzing animals’ shapes as if they were built out of pipe cleaners. Biederman's version concedes the problem and gives each object several geon models in the mental shape catalogue, one for each view required to reveal all its surfaces.
But this concession opens the door to a completely different way of doing shape recognition. Why not go all the way and give each shape a large number of memory files, one for every vantage point? Then the files wouldn't need a fancy object-centered reference frame; they could use the retinal coordinates available free in the 2½-D sketch, as long as there were enough files to cover all the angles of view. For many years this idea was dismissed out of hand. If the continuum of viewing angles were chopped into one-degree differences, one would need forty thousand files for every object to cover them all (and those are just to cover the viewing angles; they don't embrace the viewing positions at which the object is not dead-center, or the different viewing distances). One cannot skimp by specifying a few views, like an architect's plan and elevation, because in principle any of the views might be crucial. (Simple proof: Imagine a shape consisting of a hollow sphere with a toy glued on the inside and a small hole drilled opposite it. Only by sighting the toy exactly through the hole can the entire shape be seen.) But recently the idea has made a comeback. By choosing views judiciously, and using a pattern-associator neural network to interpolate between them when an {275} object doesn't match a view spot-on, one can get away with storing a manageable number of views per object, forty at most.
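The forty-thousand figure can be checked with a little arithmetic. The sketch below assumes that "one-degree differences" means tiling the whole sphere of viewing directions into patches of one square degree; that reading, and the function name `views_on_sphere`, are assumptions of this illustration rather than anything spelled out above.

```python
import math

def views_on_sphere(step_degrees: float = 1.0) -> float:
    """Approximate count of step-by-step degree patches covering every viewing direction."""
    steradians = 4 * math.pi                          # solid angle of a full sphere
    square_degrees = steradians * math.degrees(1) ** 2  # one steradian is (180/pi)^2 square degrees
    return square_degrees / step_degrees ** 2

print(round(views_on_sphere()))   # roughly forty thousand one-degree views
```

Under that reading the count comes to a little over forty-one thousand, which is where "forty thousand files" plausibly comes from.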
It still seems unlikely that people have to see an object from forty different angles to recognize it thereafter, but another trick is available. Remember that people rely on the up-down direction to construe shapes: squares aren't diamonds, sideways Africa goes unrecognized. This introduces another contamination of the pure geon theory: relations like “above” and “top” must come from the retina (with some adjustment from gravity), not from the object. That concession may be inevitable, because there's often no way of pinpointing the “top” of an object before you've recognized it. But the real problem comes from what people do with sideways objects they don't recognize at first. If you tell people that a shape has been turned sideways, they recognize it quickly, as you surely did when I told you that the Africa drawing was on its side. People can mentally rotate a shape to the upright and then recognize the rotated image. With a mental image-rotator available, the object-centered frame of the geon theory becomes even less necessary. People could store some 2½-D views from a few standard vantage points, like police mug shots, and if an object in front of them didn't match one of the shots, they would mentally rotate it until it did. Some combination of multiple views and a mental rotator would make geon models in object-centered reference frames unnecessary.
With all these options for shape recognition, how can we tell what the mind actually does? The only way is to study real human beings recognizing shapes in the laboratory. One famous set of experiments pointed to mental rotation as a key. The psychologists Lynn Cooper and Roger Shepard showed people letters of the alphabet at different orientations — upright, tilted 45 degrees, sideways, tilted 135 degrees, and upside down. Cooper and Shepard didn't have people blurt out the letter's name because they were worried about shortcuts: a distinctive squiggle like a loop or a tail might be detectable in any orientation and give away the answer. So they forced their subjects to analyze the full geometry of each letter by showing either the letter or its mirror image, and having the subjects press one button if the letter was normal and the other if it was mirror-reversed. {276}
When Cooper and Shepard measured how long it took people to press the button, they observed a clear signature of mental rotation. The farther the letter was misoriented from the upright, the longer people took. That's exactly what you would expect if people gradually dialed an image of the letter to the upright; the more it has to be turned, the longer the turning takes. Maybe, then, people recognize shapes by turning them over in their minds.
But maybe not. People were not just recognizing shapes; they were discriminating them from their mirror images. Mirror images are special. It is fitting that the sequel to Alice's Adventures in Wonderland was called Through the Looking-Glass. The relation of a shape to its mirror image gives rise to surprises, even paradoxes, in many branches of science. (They are explored in fascinating books by Martin Gardner and by Michael Corballis and Ivan Beale.) Consider the detached right and left hands of a mannequin. In one sense they are identical: each has four fingers and a thumb attached to a palm and a wrist. In another sense they are utterly different; one shape cannot be superimposed on the other. The difference lies only in how the parts are aligned with respect to a frame of reference in which all three axes are labeled with directions: up-down, frontward-backward, left-right. When a right hand is pointing fingers-up palm-frontward (as in a “halt” gesture), its thumb points left; when a left hand is pointing fingers-up palm-frontward, its thumb points right. That's the only difference, but it is real. The molecules of life have a handedness; their mirror images often do not exist in nature and would not work in bodies.
A fundamental discovery of twentieth-century physics is that the universe has a handedness, too. At first that sounds absurd. For any object and event in the cosmos, you have no way of knowing whether you are seeing the actual event or its reflection in a mirror. You may protest that organic molecules and human-made objects like letters of the alphabet are an exception. The standard versions are all over the place and familiar; the mirror images are rare and can easily be recognized. But for a physicist, they don't count, because their handedness is a historical accident, not something ruled out by the laws of physics. On another planet, or on this one if we could rewind the tape of evolution and let it happen again, they could just as easily go the other way. Physicists used to think that this was true for everything in the universe. Wolfgang Pauli wrote, “I do not believe that the Lord is a weak left-hander,” and Richard Feynman bet fifty dollars to one (he was unwilling to bet a hundred) that no experiment {277} would ever reveal a law of nature that looked different through the looking glass. He lost. The cobalt 60 nucleus is said to spin counterclockwise if you look down on its north pole, but that description by itself is circular because “north pole” is simply what we call the end of the axis from which a rotation looks counterclockwise. The logical circle would be broken if something else differentiated the so-called north pole from the so-called south pole. Here is the something else: when the atom decays, electrons are more likely to be flung out of the end we call south. “North” versus “south” and “clockwise” versus “counterclockwise” are no longer arbitrary labels but can be distinguished relative to the electron spurt. The decay, hence the universe, would look different in the mirror. God is not ambidextrous after all.
So right- and left-handed versions of things, from subatomic particles to the raw material of life to the spin of the earth, are fundamentally different. But the mind usually treats them as if they were the same:
Pooh looked at his two paws. He knew that one of them was the right, and he knew that when you had decided which one of them was the right, then the other one was the left, but he never could remember how to begin.
None of us is good at remembering how to begin. Left and right shoes look so alike that children must be taught tricks to distinguish them, like placing the shoes side by side and sizing up the gap. Which way is Abraham Lincoln facing on the American one-cent piece? There is only a fifty percent chance you will get the answer right, the same as if you had answered by flipping the penny. What about Whistler's famous painting, Arrangement in Black and Gray: The Artist's Mother? Even the English language likes to collapse left and right: beside and next to denote side-by-side without specifying who's on the left, but there is no word like bebove or aneath that denotes up-and-down without specifying who's on top. Our obliviousness to left-and-right stands in stark contrast to our hypersensitivity to up-and-down and front-and-back. Apparently the human mind does not have a preexisting label for the third dimension of its object-centered reference frame. When it sees a hand, it can align the wrist-fingertip axis with “down-up,” and the back-palm axis with “backward-forward,” but the direction of the pinkie-thumb axis is up for grabs. The mind calls it, say, “thumbward,” and the left and right hands become mental synonyms. Our indecisiveness about left and right needs an {278} explanation, because a geometer would say they are no different from up and down or front and back.
The explanation is that mirror-image confusions come naturally to a bilaterally symmetrical animal. A perfectly symmetrical creature is logically incapable of telling left from right (unless it could react to the decay of cobalt 60!). Natural selection had little incentive to build animals asymmetrically so that they could mentally represent shapes differently from their reflections. Actually, this puts it backwards: natural selection had every incentive to build animals symmetrically so that they would not represent shapes differently from their reflections. In the intermediate-sized world in which animals spend their days (bigger than subatomic particles and organic molecules, smaller than a weather front), left and right make no difference. Objects from dandelions to mountains have tops that differ conspicuously from their bottoms, and most things that move have fronts that differ conspicuously from their behinds. But no natural object has a left side that differs nonrandomly from its right, making its mirror-image version behave differently. If a predator comes from the right, next time it might come from the left. Anything learned from the first encounter should generalize to the mirror-image version. Another way of putting it is that if you took a photographic slide of any natural scene, it would be obvious if someone had turned it upside down, but you wouldn't notice if someone had flipped it left-to-right, unless the scene contained a human-made object like a car or writing.
And that brings us back to letters and mental rotation. In a few human activities, like driving and writing, left and right do make a difference, and we learn to tell them apart. How? The human brain and body are slightly asymmetrical. One hand is dominant, owing to the asymmetry of the brain, and we can feel the difference. (Older dictionaries used to define “right” as the side of the body with the stronger hand, based on the assumption that people are righties. More recent dictionaries, perhaps out of respect for an oppressed minority, use a different asymmetrical object, the earth, and define “right” as east when you are facing north.) The usual way that people tell an object from its mirror image is by turning it so it faces up and forward and looking at which side of their body — the side with the dominant hand or the side with the nondominant hand — the distinctive part is pointing to. The person's body is used as the asymmetrical frame of reference that makes the distinction between a shape and its mirror image logically possible. {279} Now, Cooper and Shepard's subjects may have been doing the same thing, except that they were rotating the shape in their minds instead of in the world. To decide whether they were seeing a normal or a backwards R, they mentally rotated an image of the shape until it was upright, and then judged whether the imaginary loop was on their right side or their left side.
So Cooper and Shepard have demonstrated that the mind can rotate objects, and they have demonstrated that one aspect of an object's intrinsic shape — its handedness — is not stored in a 3-D geon model. But for all its fascination, handedness is such a peculiar feature of the universe that we cannot conclude much about shape recognition in general from the experiments on mental rotation. For all we know, the mind could overlay objects with a 3-D reference frame (for geon matching), specified up to, but not including, which way to put the arrow on the side-to-side axis. As they say, more research is needed.
The psychologist Michael Tarr and I did some more research. We created our own little world of shapes and despotically controlled people's exposure to them, aiming at clean tests of the three hypotheses on the table.
The shapes were similar enough that people could not use shortcuts like a telltale squiggle. None was a mirror image of any other, so we would not get sidetracked by the peculiarities of the world in the looking glass. Each shape had a giveaway little foot, so people would never have a problem finding the top and the bottom. We gave each person three shapes to learn, and then asked them to identify the shapes by pressing one of three buttons whenever a shape flashed on a computer screen. Each shape appeared at a few orientations over and over. For example, Shape 3 might appear with its top at four o'clock hundreds of times, and with its top at seven o'clock hundreds of times. (All the shapes and tilts were mixed up in a random order.) People thus had the opportunity to learn what each shape looked like in a few views. {280} Finally, we hit them with a flurry of new trials in which every shape appeared at twenty-four evenly spaced orientations (again randomly ordered). We wanted to see how people dealt with the old shapes at the new orientations. Every button-press was timed to the thousandth of a second.
According to the multiple-view theory, people should create a separate memory file for every orientation in which an object commonly appeared. For example, they would set up a file showing what Shape 3 looks like right-side up (which is how they learned it), and then a second file for what it looks like at four o'clock and a third for seven o'clock. The people should soon recognize Shape 3 at these orientations very quickly. When we then surprised them with the same shapes at new orientations, however, they should take much longer, because they would have to interpolate a new view between the familiar ones to accommodate it. The new orientations should all take an extra increment of time.
According to the mental-rotation theory, people should be quick to recognize the shape when it is upright, and slower and slower the farther it has been misoriented. An upside-down shape should take the longest, because it needs a full 180-degree turn; the four o'clock shape should be quicker, for it needs only 120 degrees, and so on.
According to the geon theory, orientation shouldn't matter at all. People would learn the objects by mentally describing the various arms and crosses in a coordinate system centered on the object. Then, when a test shape flashed on the screen, it should make no difference if it was sideways, tilted, or upside down. Overlaying a frame should be quick and foolproof, and the shape's description relative to the frame would match the memory model every time.
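The three predictions can be laid side by side in a few lines of code. This is only a sketch: the function names, the arbitrary cost units (degrees for rotation, a flat 1.0 for interpolation), and the convention that twelve o'clock is 0 degrees with each clock hour worth 30 degrees are all assumptions made for illustration, not the procedure used in the experiment.

```python
def angular_distance(a: float, b: float) -> float:
    """Smallest turn, in degrees (0 to 180), that carries orientation a onto b."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def multiple_views_cost(test: float, familiar: list) -> float:
    """Stored views are fast; every other orientation pays one fixed increment."""
    return 0.0 if any(angular_distance(test, f) == 0 for f in familiar) else 1.0

def mental_rotation_cost(test: float, upright: float = 0.0) -> float:
    """Cost grows with the angle the shape must be turned to reach upright."""
    return angular_distance(test, upright)

def geon_cost(test: float) -> float:
    """An object-centered description ignores orientation altogether."""
    return 0.0

# Upright, four o'clock, and seven o'clock, expressed in degrees.
FAMILIAR = [0.0, 120.0, 210.0]

# Twenty-four evenly spaced test orientations, fifteen degrees apart.
for test in range(0, 360, 15):
    print(test,
          multiple_views_cost(test, FAMILIAR),
          mental_rotation_cost(test),
          geon_cost(test))
```

Each theory leaves a different fingerprint across the twenty-four orientations: a step, a ramp that peaks at upside down, or a flat line.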
The envelope, please. And the winner is ...
All of the above. People definitely stored several views: when a shape appeared in one of its habitual orientations, people were very quick to identify it.
And people definitely rotate shapes in their minds. When a shape appeared at a new, unfamiliar orientation, the farther it would have to be rotated to be aligned with the nearest familiar view, the more time people took.
And at least for some shapes, people use an object-centered reference frame, as in the geon theory. Tarr and I ran a variant of the experiment in which the shapes had simpler geometries: {281}
The shapes were symmetrical or nearly symmetrical, or always had the same kinds of frills on each side, so people would never have to describe the parts’ up-down and side-to-side arrangements in the same reference frame. With these shapes, people were uniformly quick at identifying them in all their orientations; upside down was no slower than right-side up.
So people use all the tricks. If a shape's sides are not too different, they store it as a 3-D geon model centered on the object's own axes. If the shape is more complicated, they store a copy of what it looks like at each orientation they see it in. When the shape appears at an unfamiliar orientation, they mentally rotate it into the nearest familiar one. Perhaps we shouldn't be surprised. Shape recognition is such a hard problem that a single, general-purpose algorithm may not work for every shape under every viewing condition.
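The combined picture can be written as one small rule. The sketch below is a hypothetical summary, not a fitted model: `recognition_cost`, the `simple_shape` flag, and the use of degrees as a stand-in for time are assumptions introduced here.

```python
def angular_distance(a: float, b: float) -> float:
    """Smallest turn, in degrees (0 to 180), between two orientations."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def recognition_cost(test: float, familiar_views: list, simple_shape: bool = False) -> float:
    """Degrees of mental rotation needed before the shape can be matched."""
    if simple_shape:
        # Symmetrical or nearly symmetrical shapes: object-centered match, no rotation.
        return 0.0
    # Otherwise rotate to whichever stored view is closest.
    return min(angular_distance(test, view) for view in familiar_views)

views = [0.0, 120.0, 210.0]                              # upright, four o'clock, seven o'clock
print(recognition_cost(120, views))                     # a stored view needs no rotation
print(recognition_cost(165, views))                     # turned to the nearest stored view
print(recognition_cost(165, views, simple_shape=True))  # orientation is ignored
```

The point of the design is that the stored views act as several "uprights," so the rotation needed is measured to the nearest one rather than to a single canonical orientation.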
Let me finish the story with my happiest moment as an experimenter. You may be skeptical about the mental turntable. All we know is that tilted shapes are recognized more slowly. I've glibly written that people rotate an image, but maybe tilted shapes are just harder to analyze for other reasons. Is there any evidence that people actually simulate a physical rotation in real time, degree by degree? Does their behavior show some signature of the geometry of rotation that could convince us that they play a movie in their minds?
Tarr and I had been baffled by one of our findings. In a different experiment, we had tested people both on the shapes they had studied and on their mirror images, at a variety of orientations: {282}
It wasn't a mirror-image test, like the Cooper and Shepard experiments; people were told to treat the two versions the same, just as they use the same word for a left and a right glove. This, of course, is just people's natural tendency. But somehow our subjects were treating them differently. For the standard versions (top row), people took longer when the shape was tilted farther: every picture in the top row took a bit longer than the one before. But for the reflected versions (bottom row), tilt made no difference: every orientation took the same time. It looked as if people mentally rotated the standard shapes but not their mirror images. Tarr and I glumly wrote up a paper begging the reader to believe that people use a different strategy to recognize mirror images. (In psychology, invoking “strategies” to explain funny data is the last refuge of the clueless.) But just as we were touching up the final draft for publication, an idea hit.
We remembered a theorem of the geometry of motion: a 2-D shape can always be aligned with its mirror image by a rotation of no more than 180 degrees, as long as the rotation can be in the third dimension around an optimal axis. In principle, any of our mirror-reversed shapes could be flipped in depth to match the standard upright shape, and the flip would take the same amount of time. The mirror image at 0 degrees would simply swivel around a vertical axis like a revolving door. The upside-down shape at 180 degrees could turn like a chicken on a rotisserie. The sideways shape could pivot around a diagonal axis, like this: look at the back of your right hand, fingertips up; now look at your palm, fingertips left. Different tilted axes could serve as the hinge for the other misoriented shapes; in every case, the rotation would be exactly 180 degrees. It would fit the data perfectly: people may have been mentally rotating all the shapes but were optimal rotators, dialing the standard shapes in the picture plane and flipping the mirror-reversed shapes in depth around the best axis.
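The theorem is easy to check numerically. The sketch below assumes a particular convention that is not in the text: angles are measured counterclockwise from the horizontal, the mirror image is made by reflecting across the vertical axis, and the hinge for a mirror image tilted by `tilt` degrees lies in the picture plane at `90 + tilt/2` degrees. All function names are hypothetical.

```python
import math

def rotate_in_plane(p, theta_deg):
    """Turn a 2-D point counterclockwise in the picture plane."""
    t = math.radians(theta_deg)
    x, y = p
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))

def mirror(p):
    """Reflect a 2-D point across the vertical axis (x becomes -x)."""
    return (-p[0], p[1])

def flip_in_depth(p, axis_deg):
    """Rotate (x, y, 0) by 180 degrees about an axis lying in the picture plane."""
    a = math.radians(axis_deg)
    u = (math.cos(a), math.sin(a), 0.0)
    v = (p[0], p[1], 0.0)
    dot = sum(ui * vi for ui, vi in zip(u, v))
    # A half-turn about the unit axis u sends v to 2(u.v)u - v.
    return tuple(2 * dot * ui - vi for ui, vi in zip(u, v))

def hinge_angle(tilt_deg):
    """In-plane direction of the hinge for a mirror image tilted by tilt_deg."""
    return (90.0 + tilt_deg / 2.0) % 180.0

point = (1.0, 2.0)
for tilt in (0, 90, 180):
    target = rotate_in_plane(mirror(point), tilt)     # tilted mirror image
    flipped = flip_in_depth(point, hinge_angle(tilt))  # one 180-degree flip
    print(tilt, hinge_angle(tilt), target, flipped[:2])
```

Under these assumptions the hinge comes out vertical for the upright mirror image, horizontal for the upside-down one, and diagonal for the sideways one, and in each case the flipped point lands on the tilted mirror image while staying in the picture plane.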
We could scarcely believe it. Could people have found the optimal axis before even knowing what the shape was? We knew it was mathematically possible: by identifying just three non-collinear landmarks in each of two views of a shape, one can calculate the axis of rotation that would align one with the other. But can people really do this calculation? We convinced ourselves with a bit of computer animation. Roger Shepard once showed that if people see a shape alternating with a tilted copy, they see it rock back and forth. So we showed ourselves the standard upright shape alternating with one of its mirror images, back and forth {283} once a second. The perception of flipping was so obvious that we didn't bother to recruit volunteers to confirm it. When the shape alternated with its upright reflection, it seemed to pivot like a washing machine agitator. When it alternated with its upside-down reflection, it did back-flips. When it alternated with its sideways reflection, it swooped back and forth around a diagonal axis, and so on. The brain finds the axis every time. The subjects in our experiment were smarter than we were.
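For the special case at issue here, a flat shape and its flipped mirror image, the landmark calculation is short. The sketch below is a simplification and not the general three-dimensional method: it assumes the two views differ only by a 180-degree flip plus a shift in position, and it relies on the fact that each landmark is displaced at right angles to the hinge. The name `hinge_from_landmarks` is hypothetical.

```python
import math

def centered(points):
    """Subtract the centroid so a shift in position does not matter."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(x - cx, y - cy) for x, y in points]

def hinge_from_landmarks(standard, flipped):
    """Estimate the in-plane hinge direction (degrees, 0 to 180) from matched landmarks."""
    best = (0.0, 0.0)
    for (px, py), (qx, qy) in zip(centered(standard), centered(flipped)):
        d = (qx - px, qy - py)           # displacement is perpendicular to the hinge
        if math.hypot(*d) > math.hypot(*best):
            best = d                     # keep the longest, most reliable displacement
    if math.hypot(*best) == 0.0:
        raise ValueError("landmarks all lie on the hinge; the axis is undetermined")
    return (math.degrees(math.atan2(best[1], best[0])) + 90.0) % 180.0

standard = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]      # three non-collinear landmarks
sideways_mirror = [(5.0, -3.0), (5.0, -5.0), (4.0, -3.0)]  # mirrored, turned 90 degrees, shifted
print(hinge_from_landmarks(standard, sideways_mirror))
```

Three non-collinear landmarks guarantee that at least one of them lies off the hinge, which is what keeps the estimate from being undefined.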
The clincher came from Tarr's thesis. He had replicated our experiments using three-dimensional shapes and their mirror images, rotated in the picture plane (shown below) and in depth:
Everything came out the same as for the 2-D shapes, except what people did with the mirror images. Just as a misoriented 2-D shape can be matched to the standard orientation by a rotation in the 2-D picture plane, and its mirror image can be rotated to the standard orientation by a 180-degree flip in the third dimension, a misoriented 3-D shape (top row) can be rotated to the standard orientation in 3-D space, and its mirror image (bottom row) can be rotated to the standard by a 180-degree flip in the fourth dimension. (In H. G. Wells’ “The Plattner Story,” an explosion blows the hero into four-dimensional space. When he returns, his heart is on the right side and he writes backwards with his left hand.) The only difference is that mere mortals should not be able to mentally rotate a shape in the fourth dimension, our mental space being strictly 3-D. All the versions should show an effect of tilt, unlike what we had found for 2-D shapes, where the mirror images did not. That's what happened. The subtle difference between two- and three-dimensional {284} objects sewed up the case: the brain rotates shapes around an optimal axis in three dimensions, but no more than three dimensions. Mental rotation is clearly one of the tricks behind our ability to recognize objects.
Mental rotation is another talent of our gifted visual systems, with a special twist. It does not merely analyze the contours coming in from the world, but creates some of its own in the form of a ghostly moving image. This brings us to a final topic in the psychology of vision.
What shape are a beagle's ears? How many windows are in your living room? What's darker, a Christmas tree or a frozen pea? What's larger, a guinea pig or a gerbil? Does a lobster have a mouth? When a person stands up straight, is her navel above her wrist? If the letter D is turned on its back and put on top of a J, what does the combination remind you of?
Most people say that they answer these questions using a “mental image.” They visualize the shape, which feels like conjuring up a picture available for inspection in the mind's eye. The feeling is quite unlike the experience of answering abstract questions, such as “What is your mother's maiden name?” or “What is more important, civil liberties or a lower rate of crime?”
Mental imagery is the engine that drives our thinking about objects in space. To load a car with suitcases or rearrange the furniture, we imagine the different spatial arrangements before we try them. The anthropologist Napoleon Chagnon described an ingenious use of mental imagery by the Yanomamo Indians of the Amazon rainforest. They had blown smoke down the opening of an armadillo hole to asphyxiate the animal, and then had to figure out where to dig to extract it from its tunnel, which could run underground for hundreds of feet. One of the Yanomamo men hit on the idea of threading a long vine with a knot at the end down the hole as far as it would go. The other men kept their ears to the ground listening for the knot bumping the sides of the burrow so they could get a sense of the direction in which the burrow ran. The first man broke off the vine, pulled it out, laid it along the ground, and began to dig where the end of the vine lay. A few feet down they struck armadillo. Without {285} an ability to visualize the tunnel and the vine and armadillo inside it, the men would not have connected a sequence of threading, listening, yanking, breaking, measuring, and digging actions to an expectation of finding an animal corpse. In a joke we used to tell as children, two carpenters are hammering nails into the side of a house, and one asks the other why he is examining each nail as he takes it out of the box and throwing half of them away. “They're defective,” replies the second carpenter, holding one up. “The pointy end is facing the wrong way.” “You fool!” shouts the first carpenter. “Those are for the other side of the house!”
But people do not use imagery just to rearrange the furniture or dig up armadillos. The eminent psychologist D. O. Hebb once wrote, “You can hardly turn around in psychology without bumping into the image.” Give people a list of nouns to memorize, and they will imagine them interacting in bizarre images. Give them factual questions like “Does a flea have a mouth?” and they will visualize the flea and “look for” the mouth. And, of course, give them a complex shape at an unfamiliar orientation, and they will rotate its image to a familiar one.
Many creative people claim to “see” the solution to a problem in an image. Faraday and Maxwell visualized electromagnetic fields as tiny tubes filled with fluid. Kekule saw the benzene ring in a reverie of snakes biting their tails. Watson and Crick mentally rotated models of what was to become the double helix. Einstein imagined what it would be like to ride on a beam of light or drop a penny in a plummeting elevator. He once wrote, “My particular ability does not lie in mathematical calculation, but rather in visualizing effects, possibilities, and consequences.” Painters and sculptors try out ideas in their minds, and even novelists visualize scenes and plots in their mind's eye before putting pen to paper.
Images drive the emotions as well as the intellect. Hemingway wrote, “Cowardice, as distinguished from panic, is almost always simply a lack of ability to suspend the functioning of the imagination.” Ambition, anxiety, sexual arousal, and jealous rage can all be triggered by images of what isn't there. In one experiment, volunteers were hooked up to electrodes and asked to imagine their mates being unfaithful. The authors report, “Their skin conductance increased 1.5 microSiemens, the corrugator muscle in their brow showed 7.75 microvolts units of contraction, and their heart rates accelerated by five beats per minute, equivalent to drinking three cups of coffee at one sitting.” Of course, the imagination revives many experiences at a time, not just seeing, but the visual image makes a mental simulation especially vivid. {286}
Imagery is an industry. Courses on How to Improve Your Memory teach age-old tricks like imagining items in the rooms of your house and then mentally walking through it, or finding a visual allusion in a person's name and linking it to his face (if you were introduced to me, you would imagine me in a cerise leisure suit). Phobias are often treated by a kind of mental Pavlovian conditioning where an image substitutes for the bell. The patient relaxes deeply and then imagines the snake or spider, until the image — and, by extension, the real thing — is associated with the relaxation. Highly paid “sports psychologists” have athletes relax in a comfy chair and visualize the perfect swing. Many of these techniques work, though some are downright flaky. I am skeptical of cancer therapies in which patients visualize their antibodies munching the tumor, even more so when it is the patient's support group that does the visualizing. (A woman once called to ask if I thought it would work over the Internet.)
But what is a mental image? Many philosophers with behaviorist leanings think the whole idea is a terrible blunder. An image is supposed to be a picture in the head, but then you would need a little man et cetera, et cetera, et cetera. In fact, the computational theory of mind makes the notion perfectly straightforward. We already know that the visual system uses a 2½-D sketch which is picturelike in several respects. It is a mosaic of elements that stand for points in the visual field. The elements are arranged in two dimensions so that neighboring elements in the array stand for neighboring points in the visual field. Shapes are represented by filling in some of the elements in a pattern that matches the shape's projected contours. Shape-analysis mechanisms — not little men — process information in the sketch by imposing reference frames, finding geons, and so on. A mental image is simply a pattern in the 2½-D sketch that is loaded from long-term memory rather than from the eyes. A number of artificial intelligence programs for reasoning about space are designed in exactly this way.
A depiction like the 2½-D sketch contrasts starkly with a description in a language-like representation like a geon model, a semantic network, a sentence in English, or a proposition in mentalese. In the proposition A symmetrical triangle is above a circle, the words do not stand for points in the visual field, and they are not arranged so that nearby words represent nearby points. Words like symmetrical and above can't be pinned to any piece of the visual field; they denote complicated relationships among the filled-in pieces. {287}
One can even make an educated guess about the anatomy of mental imagery. The incarnation of a 2½-D sketch in neurons is called a topographically organized cortical map: a patch of cortex in which each neuron responds to contours in one part of the visual field, and in which neighboring neurons respond to neighboring parts. The primate brain has at least fifteen of these maps, and in a very real sense they are pictures in the head. Neuroscientists can inject a monkey with a radioactive isotope of glucose while it stares at a bull's-eye. The glucose is taken up by the active neurons, and one can literally develop the monkey's brain as if it were a piece of film. It comes out of the darkroom with a distorted bull's-eye laid out over the visual cortex. Of course, nothing “looks at” the cortex from above; connectivity is all that matters, and the activity pattern is interpreted by networks of neurons plugged into each cortical map. Presumably space in the world is represented by space on the cortex because neurons are connected to their neighbors, and it is handy for nearby bits of the world to be analyzed together. For example, edges are not scattered across the visual field like rice but snake along a line, and most surfaces are not archipelagos but cohesive masses. In a cortical map, lines and surfaces can be handled by neurons that are highly interconnected.
The brain is also ready for the second computational demand of an imagery system, information flowing down from memory instead of up from the eyes. The fiber pathways to the visual areas of the brain are two-way. They carry as much information down from the higher, conceptual levels as up from the lower, sensory levels. No one knows what these top-down connections are for, but they could be there to download memory images into visual maps.
So mental images could be pictures in the head. Are they? There are two ways to find out. One is to see if thinking in images engages the visual parts of the brain. The other is to see if thinking in images works more like computing with graphics or more like computing with a database of propositions.
In the first act of Richard II, the exiled Bolingbroke pines for his native England. He is not consoled by a friend's suggestion to fantasize that he is in more idyllic surroundings: {288}
O, who can hold a fire in his hand
By thinking on the frosty Caucasus?
Or cloy the hungry edge of appetite
By bare imagination of a feast?
Or wallow naked in December snow
By thinking on fantastic summer's heat?
Clearly an image is different from an experience of the real thing. William James said that images are “devoid of pungency and tang.” But in a 1910 Ph.D. thesis, the psychologist Cheves W. Perky tried to show that images were like very faint experiences. She asked her subjects to form a mental image, say of a banana, on a blank wall. The wall was actually a rear-projection screen, and Perky surreptitiously projected a real but dim slide on it. Anyone coming into the room at that point would have seen the slide, but none of the subjects noticed it. Perky claimed that they had incorporated the slide into their mental image, and indeed, the subjects reported details in their image that could only have come from the slide, such as the banana's standing on end. It was not a great experiment by modern standards, but state-of-the-art methods have borne out the crux of the finding, now called the Perky effect: holding a mental image interferes with seeing faint and fine visual details.
Imagery can affect perception in gross ways, too. When people answer questions about shapes from memory, like counting off the right angles in a block letter, their visual-motor coordination suffers. (Since learning about these experiments I try not to get too caught up in a hockey game on the radio while I am driving.) Mental images of lines can affect perception just as real lines do: they make it easier to judge alignment and can even induce visual illusions. When people see some shapes and imagine others, later they sometimes have trouble remembering which was which.
So do imagery and vision share space in the brain? The neuropsychologists Edoardo Bisiach and Claudio Luzzatti studied two Milanese patients with damage to their right parietal lobes that left them with visual neglect syndrome. Their eyes register the whole visual field, but they attend only to the right half: they ignore the cutlery to the left of the plate, draw a face with no left eye or nostril, and when describing a room, ignore large details — like a piano — on their left. Bisiach and Luzzatti asked the patients to imagine standing in the Piazza del Duomo in Milan facing the cathedral and to name the buildings in the piazza. The {289} patients named only the buildings that would be visible on the right — neglecting the left half of imaginary space! Then the patients were asked to mentally walk across the square and stand on the cathedral steps facing the piazza and describe what was in it. They mentioned the buildings that they had left out the first time, and left out the buildings that they had mentioned. Each mental image depicted the scene from one vantage point, and the patients’ lopsided window of attention examined the image exactly as it examined real visual inputs.
These discoveries implicate the visual brain as the seat of imagery, and recently there has been a positive identification. The psychologist Stephen Kosslyn and his colleagues used Positron Emission Tomography (PET scanning) to see which parts of the brain are most active when people have mental images. Each subject lay with his head in a ring of detectors, closed his eyes, and answered questions about uppercase letters of the alphabet, such as whether B has any curves. The occipital lobe or visual cortex, the first gray matter that processes visual input, lit up. The visual cortex is topographically mapped — it forms a picture, if you will. In some runs, the subjects visualized large letters, in others, small letters. Pondering large letters activated the parts of the cortex representing the periphery of the visual field; pondering small letters activated the parts representing the fovea. Images really do seem to be laid across the cortical surface.
Could the activation be just a spillover of activity from other parts of the brain, where the real computation is being done? The psychologist Martha Farah showed that it isn't. She tested a woman's ability to form mental images before and after surgery that removed her visual cortex in one hemisphere. After the surgery, her mental images shrank to half their normal width. Mental images live in the visual cortex; indeed, parts of images take up parts of cortex, just as parts of scenes take up parts of pictures.
Still, an image is not an instant replay. It lacks that pungency and tang, though not because it has been bleached or watered down: imagining red is not like seeing pink. And curiously, in the PET studies the mental image sometimes caused more activation of the visual cortex than a real display, not less. Visual images, though they share brain areas with perception, are somehow different, and perhaps that is not surprising. Donald Symons notes that reactivating a visual experience may well have benefits, but it also has costs: the risk of confusing imagination with reality. Within moments of awakening from a dream, our memory for its plot is wiped out, presumably to avoid contaminating autobiographical memory {290} with bizarre confabulations. Similarly, our voluntary, waking mental images might be hobbled to keep them from becoming hallucinations or false memories.
Knowing where mental images are says little about what they are or how they work. Are mental images really patterns of pixels in a 2½-D array (or patterns of active neurons in a cortical map)? If they are, how do we think with them, and what would make imagery different from any other form of thought?
Let's compare an array or sketch to its rival as a model of imagery, symbolic propositions in mentalese (similar to geon models and to semantic networks). The array is on the left, the propositional model on the right. The diagram collapses many propositions, like “A bear has a head” and “The bear has the size XL,” into a single network.
The array is straightforward. Each pixel represents a small piece of surface or boundary, period; anything more global or abstract is only implicit in the pattern of filled pixels. The propositional representation is quite different. First, it is schematic, filled with qualitative relations like “attached to”; not every detail of the geometry is represented. Second, the spatial properties are factored apart and listed explicitly. Shape (the arrangement of an object's parts or geons), size, location, and orientation get their own symbols, and each can be looked up independently of the others. Third, propositions mix spatial information, like parts and their positions, with conceptual information, like bearhood and membership in the carnivore class.
Of the two data structures, it is the pictorial array that best captures the flavor of imagery. First, images are thumpingly concrete. Consider {291} this request: Visualize a lemon and a banana next to each other, but don't imagine the lemon either to the right or to the left, just next to the banana. You will protest that the request is impossible; if the lemon and banana are next to each other in an image, one or the other has to be on the left. The contrast between a proposition and an array is stark. Propositions can represent cats without grins, grins without cats, or any other disembodied abstraction: squares of no particular size, symmetry with no particular shape, attachment with no particular place, and so on. That is the beauty of a proposition: it is an austere statement of some abstract fact, uncluttered with irrelevant details. Spatial arrays, because they consist only of filled and unfilled patches, commit one to a concrete arrangement of matter in space. And so do mental images: forming an image of “symmetry,” without imagining a something or other that is symmetrical, can't be done.
The concreteness of mental images allows them to be co-opted as a handy analogue computer. Amy is richer than Abigail; Alicia is not as rich as Abigail; who's the richest? Many people solve these syllogisms by lining up the characters in a mental image from least rich to richest. Why should this work? The medium underlying imagery comes with cells dedicated to each location, fixed in a two-dimensional arrangement. That supplies many truths of geometry for free. For example, left-to-right arrangement in space is transitive: if A is to the left of B, and B is to the left of C, then A is to the left of C. Any lookup mechanism that finds the locations of shapes in the array will automatically respect transitivity; the architecture of the medium leaves it no choice.
Suppose the reasoning centers of the brain can get their hands on the mechanisms that plop shapes into the array and that read their locations out of it. Those reasoning demons can exploit the geometry of the array as a surrogate for keeping certain logical constraints in mind. Wealth, like location on a line, is transitive: if A is richer than B, and B is richer than C, then A is richer than C. By using location in an image to symbolize wealth, the thinker takes advantage of the transitivity of location built into the array, and does not have to enter it into a chain of deductive steps. The problem becomes a matter of plop down and look up. It is a fine example of how the form of a mental representation determines what is easy or hard to think.
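The plop-down-and-look-up idea can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions: a one-dimensional row of cells stands in for the array, poorer people are placed to the left, and the names `plop`, `is_left_of`, and `richest` are hypothetical, not taken from any published imagery model. Nothing in the code states that "richer than" is transitive; the ordering of cells in the row supplies that for free.

```python
def plop(row, richer, poorer):
    """Place two names in the row so that `poorer` sits to the left of `richer`.

    Simplifying assumption: a new name is put right next to the one already
    placed. That is enough for a simple linear chain of premises like the
    example here; premises that leave the order undetermined are not handled.
    """
    if richer not in row and poorer not in row:
        row.extend([poorer, richer])
    elif richer in row and poorer not in row:
        row.insert(row.index(richer), poorer)      # just left of `richer`
    elif poorer in row and richer not in row:
        row.insert(row.index(poorer) + 1, richer)  # just right of `poorer`
    return row


def is_left_of(row, a, b):
    """Look up two positions and compare them; no deduction involved."""
    return row.index(a) < row.index(b)


def richest(row):
    """The richest person is whoever ended up in the rightmost cell."""
    return row[-1]


row = []
plop(row, "Amy", "Abigail")     # Amy is richer than Abigail
plop(row, "Abigail", "Alicia")  # Alicia is not as rich as Abigail

print(row)           # ['Alicia', 'Abigail', 'Amy']
print(richest(row))  # Amy
```

No premise ever compared Alicia with Amy, yet `is_left_of(row, "Alicia", "Amy")` comes out true simply because of where the names landed.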
Mental images also resemble arrays in shmooshing together size, shape, location, and orientation into one pattern of contours, rather {292} than neatly factoring them into separate assertions. Mental rotation is a good example. In assessing an object's shape, a person cannot ignore its orientation — which would be a simple matter if orientation were sequestered in its own statement. Instead, the person must nudge the orientation gradually and watch as the shape changes. The orientation is not re-computed in one step like a matrix multiplication in a digital computer; the farther a shape is dialed, the longer the dialing takes. There must be a rotator network overlaid on the array that shifts the contents of cells a few degrees around its center. Larger rotations require iterating the rotator, bucket-brigade style. Experiments on how people solve spatial problems have uncovered a well-stocked mental toolbox of graphic operations, such as zooming, shrinking, panning, scanning, tracing, and coloring. Visual thinking, such as judging whether two objects lie along the same line or whether two blobs of different sizes have the same shape, strings these operations into mental animation sequences.
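The contrast between one-step recomputation and gradual dialing can be sketched in a few lines of Python. This is a toy under loud assumptions: the shape is a list of points with continuous coordinates rather than cells in a grid, the nudge size `STEP_DEGREES` is an arbitrary value picked for illustration, and the function names are hypothetical. The only point is that when a rotation must be built out of small nudges, the number of nudges grows with the angle.

```python
import math

STEP_DEGREES = 5  # assumed size of one nudge; arbitrary, for illustration only


def nudge(points, degrees):
    """Rotate every point counterclockwise about the origin by a small angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def rotate_by_steps(points, target_degrees):
    """Reach a non-negative target angle by repeated nudges.

    Returns the rotated points and the number of nudges that were needed.
    """
    steps = 0
    turned = 0
    while turned < target_degrees:
        step = min(STEP_DEGREES, target_degrees - turned)
        points = nudge(points, step)
        turned += step
        steps += 1
    return points, steps


shape = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
for angle in (30, 60, 120):
    _, steps = rotate_by_steps(shape, angle)
    print(angle, steps)  # 30 6, then 60 12, then 120 24
```

Doubling the angle doubles the number of passes through the rotator, which is the pattern the rotation experiments show in people's response times.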
Finally, images capture the geometry of an object, not just its meaning. The surefire way of getting people to experience imagery is to ask them about obscure details of an object's shape or coloring — the beagle's ears, the curves in the B, the shade of frozen peas. When a feature is noteworthy — cats have claws, bees have stingers — we file it away as an explicit statement in our conceptual database, available later for instant lookup. But when it is not, we call up a memory of the appearance of the object and run our shape analyzers over the image. Checking for previously unnoticed geometric properties of absent objects is one of the main functions of imagery, and Kosslyn has shown that this mental process differs from dredging up explicit facts. When he asked people questions about well-rehearsed facts, like whether a cat has claws or a lobster has a tail, the speed of the answer depended on how strongly the object and its part were associated in memory. People must have retrieved the answer from a mental database. But when the questions were more unusual, like whether a cat has a head or a lobster has a mouth, and people consulted a mental image, the speed of the answer depended on the size of the part; smaller parts were slower to verify. Since size and shape are mixed together in an image, smaller shape details are harder to resolve.
For decades, philosophers have suggested that the perfect test of whether mental images are depictions or descriptions was whether people can reinterpret ambiguous shapes, like the duck-rabbit: {293}
If the mind stores only descriptions, then a person who sees the duck-rabbit as a rabbit should tuck away only the label “rabbit.” Nothing in the label captures anything about ducks, so later on, the rabbit-seers should be at a loss when asked whether some other animal lurked in the shape; the ambiguous geometric information has been sloughed off. But if the mind stores images, the geometry is still available, and people should be able to call back the image and inspect it for new interpretations. The duck-rabbit itself turns out to be a hard case, because people store shapes with a front-back frame of reference attached, and reinterpreting the duck-rabbit requires reversing the frame. But with some gentle nudging (such as encouraging people to concentrate on the curve at the back of the head), many people do see the duck in the rabbit image or vice versa. Almost everyone can flip simpler ambiguous images. The psychologist Ronald Finke, Martha Farah, and I got people to reinterpret images from verbal descriptions alone, which we read aloud while their eyes were closed. What object can you “see” in each of these descriptions?
Imagine the letter D. Rotate it 90 degrees to the right. Put the number 4 above it. Now remove the horizontal segment of the 4 to the right of the vertical line.
Imagine the letter B. Rotate it 90 degrees to the left. Put a triangle directly below it having the same width and pointing down. Remove the horizontal line.
Imagine the letter K. Place a square next to it on the left side. Put a circle inside the square. Now rotate the figure 90 degrees to the left.
Most people had no trouble reporting the sailboat, the valentine, and the television set that were implicit in the verbiage. {294}
Imagery is a wonderful faculty, but we must not get carried away with the idea of pictures in the head.
For one thing, people cannot reconstruct an image of an entire visual scene. Images are fragmentary. We recall glimpses of parts, arrange them in a mental tableau, and then do a juggling act to refresh each part as it fades. Worse, each glimpse records only the surfaces visible from one vantage point, distorted by perspective. (A simple demonstration is the railroad track paradox — most people see the tracks converge in their mental image, not just in real life.) To remember an object, we turn it over or walk around it, and that means our memory for it is an album of separate views. An image of the whole object is a slide show or pastiche.
That explains why perspective in art took so long to be invented, even though everyone sees in perspective. Paintings without Renaissance craftsmanship look unrealistic, but not because they lack perspective outright. (Even Cro-Magnon cave paintings have a measure of accurate perspective.) Usually, distant objects are smaller, opaque objects hide their backgrounds and take bites out of objects behind them, and many tilted surfaces are foreshortened. The problem is that different parts of the painting are shown as they would appear from different vantage points, rather than from the fixed viewing reticle behind Leonardo's window. No incarnate perceiver, chained to one place at one time, can experience a scene from several vantage points at once, so the painting does not correspond to anything a person ever sees. The imagination, of course, is not chained to one place at one time, and paintings without true perspective may, strangely enough, be evocative renditions of our mental imagery. Cubist and surrealist painters, who were avid consumers of psychology, used multiple perspectives in a painting deliberately, perhaps to awaken photograph-jaded viewers to the evanescence of the mind's eye.
A second limitation is that images are slaves to the organization of memory. Our knowledge of the world could not possibly fit into one big picture or map. There are too many scales, from mountains to fleas, to fit into one medium with a fixed grain size. And our visual memory could not very well be a shoebox stuffed with photographs, either. There would be no way to find the one you need without examining each one to recognize what's in it. (Photo and video archives face a similar problem.) {295} Memory images must be labeled and organized within a propositional superstructure, perhaps a bit like hypermedia, where graphics files are linked to attachment points within a large text or database.
Visual thinking is often driven more strongly by the conceptual knowledge we use to organize our images than by the contents of the images themselves. Chess masters are known for their remarkable memory for the pieces on a chessboard. But it's not because people with photographic memories become chess masters. The masters are no better than beginners when remembering a board of randomly arranged pieces. Their memory captures meaningful relations among the pieces, such as threats and defenses, not just their distribution in space.
Another example comes from a wonderfully low-tech experiment by the psychologists Raymond Nickerson and Marilyn Adams. They asked people to draw both sides of a penny, which everyone has seen thousands of times, from memory. (Try it before you read on.) The results are sobering. An American penny has eight features: Abraham Lincoln's profile, IN GOD WE TRUST, a year, and LIBERTY on one side, and the Lincoln Memorial, UNITED STATES OF AMERICA, E PLURIBUS UNUM, and ONE CENT on the other. Only five percent of the subjects drew all eight. The median number remembered was three, and half were in the wrong place. Intruding into the drawings were ONE PENNY, laurel wreaths, sheaves of wheat, the Washington monument, and Lincoln sitting in a chair. People did better when asked to tick off the features in a penny from a list. (Thankfully, no one selected MADE IN TAIWAN.) But when they were shown fifteen drawings of possible pennies, fewer than half the people picked out the correct one. Obviously, visual memories are not accurate pictures of whole objects.
And if you did get the penny right, try this quiz. Which of these statements are true?
Madrid is farther north than Washington, D.C.
Seattle is farther north than Montreal.
Portland, Oregon, is farther north than Toronto.
Reno is farther west than San Diego.
The Atlantic entrance to the Panama Canal is farther west than the Pacific entrance.
They are all true. Almost everyone gets them wrong, reasoning along these lines: Nevada is east of California; San Diego is in California; Reno is in Nevada; therefore Reno is east of San Diego. Of course, this kind of {296} syllogism is invalid whenever regions don't form a checkerboard. Our geographic knowledge is not a big mental map but a set of smaller maps, organized by assertions about how they are related.
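The Reno error can be spelled out as a small Python sketch. It assumes a deliberately crude model: each city is filed under a state, the only stored spatial fact is an assertion relating the two states, and the longitudes are approximate values added for the comparison. The function and table names are hypothetical.

```python
# Approximate longitudes in degrees west (a larger number is farther west).
LONGITUDE_WEST = {"Reno": 119.8, "San Diego": 117.2}

# The small "maps": which state each city is filed under.
CITY_STATE = {"Reno": "Nevada", "San Diego": "California"}

# The remembered assertion relating the maps: Nevada is east of California.
STATE_EAST_OF = {("Nevada", "California")}


def east_of_by_region(a, b):
    """Answer by consulting the assertion about the containing states."""
    return (CITY_STATE[a], CITY_STATE[b]) in STATE_EAST_OF


def east_of_by_coordinates(a, b):
    """Answer by comparing positions, as a single big map would allow."""
    return LONGITUDE_WEST[a] < LONGITUDE_WEST[b]


print(east_of_by_region("Reno", "San Diego"))       # True: the common wrong answer
print(east_of_by_coordinates("Reno", "San Diego"))  # False: Reno is farther west
```

The region-based lookup is quick and usually right, which is presumably why memory is organized that way; it fails only where the borders refuse to behave like a checkerboard.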
Finally, images cannot serve as our concepts, nor can they serve as the meanings of words in the mental dictionary. A long tradition in empiricist philosophy and psychology tried to argue that they could, since it fit the dogma that there is nothing in the intellect that was not previously in the senses. Images were supposed to be degraded or superimposed copies of visual sensations, the sharp edges sanded off and the colors blended together so that they could stand for entire categories rather than individual objects. As long as you don't think too hard about what these composite images look like, the idea has a ring of plausibility. But then, how would one represent abstract ideas, even something as simple as the concept of a triangle? A triangle is any three-sided polygon. But any image of a triangle must be isosceles, scalene, or equilateral. John Locke made the enigmatic claim that our image of a triangle is “all and none of these at once.” Berkeley called him on it, challenging his readers to form a mental image of a triangle that was isosceles, scalene, equilateral, and none of the above, all at the same time. But rather than abandoning the theory that abstract ideas are images, Berkeley concluded that we don't have abstract ideas!
Early in the twentieth century, Edward Titchener, one of America's first experimental psychologists, rose to the challenge. By carefully introspecting on his own images, he argued that they could represent any idea, no matter how abstract:
I can quite well get Locke's picture, the triangle that is no triangle and all triangles at one and the same time. It is a flashy thing, come and gone from moment to moment; it hints two or three red angles, with the red lines deepening into black, seen on a dark green ground. It is not there long enough for me to say whether the angles join to form the complete figure, or even whether all three of the necessary angles are given.
Horse is, to me, a double curve and a rampant posture with a touch of mane about it; cow is a longish rectangle with a certain facial expression, a sort of exaggerated pout.
I have been ideating meanings all my life. And not only meanings, but meaning also. Meaning in general is represented in my consciousness by another of these impressionistic pictures. I see meaning as the blue-grey tip of a kind of scoop, which has a bit of yellow above it (probably a part of the handle), and which is just digging into a dark mass of {297} what appears to be plastic material. I was educated on classical lines; and it is conceivable that this picture is an echo of the oft-repeated admonition to “dig out the meaning” of some passage of Greek or Latin.
Exaggerated pout indeed! Titchener's Cheshire Cow, his triangle with red angles that don't even join, and his meaning shovel could not possibly be the concepts underlying his thoughts. Surely he did not believe that cows are rectangular or that triangles can do just fine without one of their angles. Something else in his head, not an image, must have embodied that knowledge.
And that is the problem with other claims that all thoughts are images. Suppose I try to represent the concept “man” by an image of a prototypical man — say, Fred MacMurray. The problem is, what makes the image serve as the concept “man” as opposed to, say, the concept “Fred MacMurray”? Or the concept “tall man,” “adult,” “human,” “American,” or “actor who plays an insurance salesman seduced into murder by Barbara Stanwyck”? You have no trouble distinguishing among a particular man, men in general, Americans in general, vamp-victims in general, and so on, so you must have more than a picture of a prototypical man in your head.
And how could a concrete image represent an abstract concept, like “freedom”? The Statue of Liberty is already taken; presumably it is representing the concept “the Statue of Liberty.” What would you use for negative concepts, like “not a giraffe”? An image of a giraffe with a red diagonal line through it? Then what would represent the concept “a giraffe with a red diagonal line through it”? How about disjunctive concepts, like “either a cat or a bird,” or propositions, like “All men are mortal”?
Pictures are ambiguous, but thoughts, virtually by definition, cannot be ambiguous. Your common sense makes distinctions that pictures by themselves do not; therefore your common sense is not just a collection of pictures. If a mental picture is used to represent a thought, it needs to be accompanied by a caption, a set of instructions for how to interpret the picture — what to pay attention to and what to ignore. The captions cannot themselves be pictures, or we would be back where we started. When vision leaves off and thought begins, there's no getting around the need for abstract symbols and propositions that pick out aspects of an object for the mind to manipulate.
Incidentally, the ambiguity of pictures has been lost on the designers of graphical computer interfaces and other icon-encrusted consumer {298} products. My computer screen is festooned with little cartoons that do various things when selected by a click of the mouse. For the life of me I can't remember what the tiny binoculars, eyedropper, and silver platter are supposed to do. A picture is worth a thousand words, but that is not always such a good thing. At some point between gazing and thinking, images must give way to ideas.
{299}
“I hope you have not murdered too completely your own and my child.” So wrote Darwin to Alfred Russel Wallace, the biologist who had independently discovered natural selection. What prompted the purple prose? Darwin and Wallace were mutual admirers, so like-minded that they had been inspired by the same author (Malthus) to forge the same theory in almost the same words. What divided these comrades was the human mind. Darwin had coyly predicted that “psychology will be placed on a new foundation,” and in his notebooks was positively grandiose about how evolutionary theory would revolutionize the study of mind:
Origin of man now proved. — Metaphysics must flourish. — He who understand baboon would do more toward metaphysics than Locke.
Plato says . . . that our “imaginary ideas” arise from the preexistence of the soul, are not derivable from experience — read monkeys for preexistence.
He went on to write two books on the evolution of human thoughts and feelings, The Descent of Man and The Expression of the Emotions in Man and Animals.
But Wallace reached the opposite conclusion. The mind, he said, is overdesigned for the needs of evolving humans and cannot be explained by natural selection. Instead, “a superior intelligence has guided the {300} development of man in a definite direction, and for a special purpose.” Et tu!
Wallace became a creationist when he noted that foragers — “savages,” in nineteenth-century parlance — were biologically equal to modern Europeans. Their brains were the same size, and they could easily adapt to the intellectual demands of modern life. But in the foragers’ way of life, which was also the life of our evolutionary ancestors, that level of intelligence was not needed, and there was no occasion to show it off. How, then, could it have evolved in response to the needs of a foraging lifestyle? Wallace wrote:
Our law, our government, and our science continually require us to reason through a variety of complicated phenomena to the expected result. Even our games, such as chess, compel us to exercise all these faculties in a remarkable degree. Compare this with the savage languages, which contain no words for abstract conceptions; the utter want of foresight of the savage man beyond his simplest necessities; his inability to combine, or to compare, or to reason on any general subject that does not immediately appeal to his senses. . . .
... A brain one-half larger than that of the gorilla would . . . fully have sufficed for the limited mental development of the savage; and we must therefore admit that the large brain he actually possesses could never have been solely developed by any of those laws of evolution, whose essence is, that they lead to a degree of organization exactly proportionate to the wants of each species, never beyond those wants. . . . Natural selection could only have endowed savage man with a brain a few degrees superior to that of an ape, whereas he actually possesses one very little inferior to that of a philosopher.
Wallace's paradox, the apparent evolutionary uselessness of human intelligence, is a central problem of psychology, biology, and the scientific worldview. Even today, scientists such as the astronomer Paul Davies think that the “overkill” of human intelligence refutes Darwinism and calls for some other agent of a “progressive evolutionary trend,” perhaps a self-organizing process that will be explained someday by complexity theory. Unfortunately this is barely more satisfying than Wallace's idea of a superior intelligence guiding the development of man in a definite direction. Much of this book, and this chapter in particular, is aimed at demoting Wallace's paradox from a foundation-shaking mystery to a challenging but otherwise ordinary research problem in the human sciences. {301}
Stephen Jay Gould, in an illuminating essay on Darwin and Wallace, sees Wallace as an extreme adaptationist who ignores the possibility of exaptations: adaptive structures that are “fortuitously suited to other roles if elaborated” (such as jaw bones becoming middle-ear bones) and “features that arise without functions . . . but remain available for later co-optation” (such as the panda's thumb, which is really a jury-rigged wristbone).
Objects designed for definite purposes can, as a result of their structural complexity, perform many other tasks as well. A factory may install a computer only to issue the monthly pay checks, but such a machine can also analyze the election returns or whip anyone's ass (or at least perpetually tie them) in tic-tac-toe.
I agree with Gould that the brain has been exapted for novelties like calculus or chess, but this is just an avowal of faith by people like us who believe in natural selection; it can hardly fail to be true. It raises the question of who or what is doing the elaborating and co-opting, and why the original structures were suited to being co-opted. The factory analogy is not helpful. A computer that issues paychecks cannot also analyze election returns or play tic-tac-toe, unless someone has reprogrammed it first.
Wallace went off the tracks not because he was too much of an adaptationist but because he was a lousy linguist, psychologist, and anthropologist (to judge him, unfairly, by modern standards). He saw a chasm between the simple, concrete, here-and-now thinking of foraging peoples and the abstract rationality exercised in modern pursuits like science, mathematics, and chess. But there is no chasm. Wallace, to give him his due, was ahead of his time in realizing that foragers were not on the lower rungs of some biological ladder. But he was wrong about their language, thought, and lifestyle. Prospering as a forager is a more difficult problem than doing calculus or playing chess. As we saw in Chapter 3, people in all societies have words for abstract conceptions, have foresight beyond simple necessities, and combine, compare, and reason on general subjects that do not immediately appeal to their senses. And people everywhere put these abilities to good use in outwitting the defenses of the local flora and fauna. We will soon see that all people, right from the cradle, engage in a kind of scientific thinking. We are all intuitive physicists, biologists, engineers, psychologists, and mathematicians. Thanks to these inborn talents, we outperform robots and have wreaked havoc on the planet.
On the other hand, our intuitive science is different from what the people in white coats do. Though most of us would not agree with Lucy {302} in Peanuts that fir trees give us fur, sparrows grow into eagles that we eat on Thanksgiving, and you can tell a tree's age by counting its leaves, our beliefs are sometimes just as daffy. Children insist that a piece of styrofoam weighs nothing and that people know the outcome of events they did not witness or hear about. They grow into adults who think that a ball flying out of a spiral tube will continue in a spiral path and that a string of heads makes a coin more likely to land tails.
This chapter is about human reasoning: how people make sense of their world. To reverse-engineer our faculties of reasoning, we must begin with Wallace's paradox. To dissolve it, we have to distinguish the intuitive science and mathematics that is part of the human birthright from the modern, institutionalized version that most people find so hard. Then we can explore how our intuitions work, where they came from, and how they are elaborated and polished to give the virtuoso performances of modern civilization.
Ever since the Swiss psychologist Jean Piaget likened children to little scientists, psychologists have compared the person in the street, young and old, to the person in the lab. The analogy is reasonable up to a point. Both scientists and children have to make sense of the world, and children are curious investigators striving to turn their observations into valid generalizations. Once I had family and friends staying over, and a three-year-old boy accompanied my sister as she bathed my infant niece. After staring quietly for several minutes he announced, “Babies don't have penises.” The boy deserves our admiration, if not for the accuracy of his conclusion, then for the keenness of his scientific spirit.
Natural selection, however, did not shape us to earn good grades in science class or to publish in refereed journals. It shaped us to master the local environment, and that led to discrepancies between how we naturally think and what is demanded in the academy.
For many years the psychologist Michael Cole and his colleagues studied a Liberian people called the Kpelle. They are an articulate group, enjoying argument and debate. Most are illiterate and unschooled, and they do poorly on tests that seem easy to us. This dialogue shows why: {303}
Experimenter: Flumo and Yakpalo always drink cane juice [rum] together. Flumo is drinking cane juice. Is Yakpalo drinking cane juice?
Subject: Flumo and Yakpalo drink cane juice together, but the time Flumo was drinking the first one Yakpalo was not there on that day.
Experimenter: But I told you that Flumo and Yakpalo always drink cane juice together. One day Flumo was drinking cane juice. Was Yakpalo drinking cane juice?
Subject: The day Flumo was drinking the cane juice Yakpalo was not there on that day.
Experimenter: What is the reason?
Subject: The reason is that Yakpalo went to his farm on that day and Flumo remained in town on that day.
The example is not atypical; Cole's subjects often say things like “Yakpalo isn't here at the moment; why don't you go and ask him about the matter?” The psychologist Ulric Neisser, who excerpted this dialogue, notes that these answers are by no means stupid. They are just not answers to the experimenter's question.
A ground rule when you solve a problem at school is to base your reasoning on the premises mentioned in a question, ignoring everything else you know. The attitude is important in modern schooling. In the few thousand years since the emergence of civilizations, a division of labor has allowed a class of knowledge professionals to develop methods of inference that are widely applicable and can be disseminated by writing and formal instruction. These methods literally have no content. Long division can calculate miles per gallon, or it can calculate income per capita. Logic can tell you that Socrates is mortal, or, in the examples in Lewis Carroll's logic textbook, that no lamb is accustomed to smoking cigars, all pale people are phlegmatic, and a lame puppy would not say “thank you” if you offered to lend it a skipping rope. The statistical tools of experimental psychology were borrowed from agronomy, where they were invented to gauge the effects of different fertilizers on crop yields. The tools work just fine in psychology, even though, as one psychological statistician wrote, “we do not deal in manure, at least not knowingly.” The power of these tools is that they can be applied to any problem — how color vision works, how to put a man on the moon, whether mitochondrial Eve was an African — no matter how ignorant one is at the outset. To master the techniques, students must feign the ignorance they will later be saddled with when solving problems in their professional lives. A high {304} school student doing Euclidean geometry gets no credit for pulling out a ruler and measuring the triangle, even though that guarantees a correct answer. The point of the lesson is to inculcate a method that later can be used to calculate the unmeasurable, such as the distance to the moon.
But outside of school, of course, it never makes sense to ignore what you know. A Kpelle could be forgiven for asking, Look, do you want to know whether Yakpalo is drinking cane juice, or don't you? That is true for both the knowledge acquired by an individual and the knowledge acquired by the species. No organism needs content-free algorithms applicable to any problem no matter how esoteric. Our ancestors encountered certain problems for hundreds of thousands or millions of years — recognizing objects, making tools, learning the local language, finding a mate, predicting an animal's movement, finding their way — and encountered certain other problems never — putting a man on the moon, growing better popcorn, proving Fermat's last theorem. The knowledge that solves a familiar kind of problem is often irrelevant to any other one. The effect of slant on luminance is useful in calculating shape but not in assessing the fidelity of a potential mate. The effects of lying on tone of voice help with fidelity but not with shape. Natural selection does not care about the ideals of a liberal education and should have no qualms about building parochial inference modules that exploit eons-old regularities in their own subject matters. Tooby and Cosmides call the subject-specific intelligence of our species “ecological rationality.”
A second reason we did not evolve into true scientists is the cost of knowledge. Science is expensive, and not just the superconducting supercollider, but the elementary analysis of cause and effect in John Stuart Mill's canons of induction. Recently I was dissatisfied with the bread I had been baking because it was too dry and fluffy. So I increased the water, decreased the yeast, and lowered the temperature. To this day I don't know which of these manipulations made the difference. The scientist in me knew that the proper procedure would have been to try out all eight logical combinations in a factorial design: more water, same yeast, same temperature; more water, more yeast, same temperature; more water, same yeast, lower temperature; and so on. But the experiment would have taken eight days (twenty-seven if I wanted to test two increments of each factor, sixty-four if I wanted to test three) and required a notebook and a calculator. I wanted tasty bread, not a contribution to the archives of human knowledge, so my multiply-confounded one-shot was enough. In a large society with writing and institutionalized science, the cost of an exponential number of tests is repaid by the benefit of the {305} resulting laws to a large number of people. That is why taxpayers are willing to fund scientific research. But for the provincial interests of a single individual or even a small band, good science isn't worth the trouble.
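The arithmetic behind those counts is easy to check. The following Python sketch is only an illustration, not a record of any actual baking experiment: it enumerates every combination of the three factors, and the factor names and numeric level labels are assumptions made up for the example.

```python
from itertools import product

# The three things varied in the bread example.
FACTORS = ("water", "yeast", "temperature")

def factorial_design(levels_per_factor):
    """Return every combination of levels for the three factors.

    Levels are labeled 0 (the original recipe), 1, 2, ... These labels
    are hypothetical; they only stand for "no change", "first increment",
    and so on.
    """
    levels = range(levels_per_factor)
    return [dict(zip(FACTORS, combo))
            for combo in product(levels, repeat=len(FACTORS))]

# 2 levels = original plus one change; 3 and 4 levels add further increments.
for levels in (2, 3, 4):
    runs = factorial_design(levels)
    print(f"{levels} levels per factor -> {len(runs)} loaves to bake")
```

The number of runs is the number of levels raised to the number of factors, which is why the cost of a full design grows exponentially as factors are added.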
A third reason we are so-so scientists is that our brains were shaped for fitness, not for truth. Sometimes the truth is adaptive, but sometimes it is not. Conflicts of interest are inherent to the human condition (see Chapters 6 and 7), and we are apt to want our version of the truth, rather than the truth itself, to prevail.
For example, in all societies, expertise is distributed unevenly. Our mental apparatus for understanding the world, even for understanding the meanings of simple words, is designed to work in a society in which we can consult an expert when we have to. The philosopher Hilary Putnam confesses that, like most people, he has no idea how an elm differs from a beech. But the words aren't synonyms for him or for us; we all know that they refer to different kinds of trees, and that there are experts out there who could tell us which is which if we ever had to know. Experts are invaluable and are usually rewarded in esteem and wealth. But our reliance on experts puts temptation in their path. The experts can allude to a world of wonders — occult forces, angry gods, magical potions — that is inscrutable to mere mortals but reachable through their services. Tribal shamans are flimflam artists who supplement their considerable practical knowledge with stage magic, drug-induced trances, and other cheap tricks. Like the Wizard of Oz, they have to keep their beseechers from looking at the man behind the curtain, and that conflicts with the disinterested search for the truth.
In a complex society, a dependence on experts leaves us even more vulnerable to quacks, from carnival snake-oil salesmen to the mandarins who advise governments to adopt programs implemented by mandarins. Modern scientific practices like peer review, competitive funding, and open mutual criticism are meant to minimize scientists’ conflicts of interest in principle, and sometimes do so in practice. The stultification of good science by nervous authorities in closed societies is a familiar theme in history, from Catholic southern Europe after Galileo to the Soviet Union in the twentieth century.
It is not only science that can suffer under the thumb of those in power. The anthropologist Donald Brown was puzzled to learn that over the millennia the Hindus of India produced virtually no histories, while the neighboring Chinese had produced libraries full. He suspected that the potentates of a hereditary caste society realized that no good could come from a scholar nosing around in records of the past where he might {306} stumble upon evidence undermining their claims to have descended from heroes and gods. Brown looked at twenty-five civilizations and compared the ones organized by hereditary castes with the others. None of the caste societies had developed a tradition of writing accurate depictions of the past; instead of history they had myth and legend. The caste societies were also distinguished by an absence of political science, social science, natural science, biography, realistic portraiture, and uniform education.
Good science is pedantic, expensive, and subversive. It was an unlikely selection pressure within illiterate foraging bands like our ancestors', and we should expect people's native “scientific” abilities to differ from the genuine article.
The humorist Robert Benchley said that there are two classes of people in the world: those who divide the people of the world into two classes, and those who do not. In Chapter 2, when I asked why the mind keeps track of individuals, I took it for granted that the mind forms categories. But the habit of categorizing deserves scrutiny as well. People put things and other people into mental boxes, give each box a name, and thereafter treat the contents of a box the same. But if our fellow humans are as unique as their fingerprints and no two snowflakes are alike, why the urge to classify?
Psychology textbooks typically give two explanations, neither of which makes sense. One is that memory cannot hold all the events that bombard our senses; by storing only their categories, we cut down on the load. But the brain, with its trillion synapses, hardly seems short of storage space. It's reasonable to say that entities cannot fit in memory when the entities are combinatorial — English sentences, chess games, all shapes in all colors and sizes at all locations — because the numbers from combinatorial explosions can exceed the number of particles in the universe and overwhelm even the most generous reckoning of the brain's capacity. But people live for a paltry two billion seconds, and there is no known reason why the brain could not record every object and event we experience if it had to. Also, we often remember both a category and its members, such as months, family members, continents, and baseball teams, so the category adds to the memory load. {307}
The other putative reason is that the brain is compelled to organize; without categories, mental life would be chaos. But organization for its own sake is useless. I have a compulsive friend whose wife tells callers that he cannot come to the phone because he is alphabetizing his shirts. Occasionally I receive lengthy manuscripts from theoreticians who have discovered that everything in the universe falls into classes of three: the Father, the Son, and the Holy Ghost; protons, neutrons, and electrons; masculine, feminine, and neuter; Huey, Dewey, and Louie; and so on, for page after page. Jorge Luis Borges writes of a Chinese encyclopedia that divided animals into: (a) those that belong to the Emperor, (b) embalmed ones, (c) those that are trained, (d) suckling pigs, (e) mermaids, (f) fabulous ones, (g) stray dogs, (h) those that are included in this classification, (i) those that tremble as if they were mad, (j) innumerable ones, (k) those drawn with a very fine camel's hair brush, (l) others, (m) those that have just broken a flower vase, (n) those that resemble flies from a distance.
No, the mind has to get something out of forming categories, and that something is inference. Obviously we can't know everything about every object. But we can observe some of its properties, assign it to a category, and from the category predict properties that we have not observed. If Mopsy has long ears, he is a rabbit; if he is a rabbit, he should eat carrots, go hippety-hop, and breed like, well, a rabbit. The smaller the category, the better the prediction. Knowing that Peter is a cottontail, we can predict that he grows, breathes, moves, was suckled, inhabits open country or woodland clearings, spreads tularemia, and can contract myxomatosis. If we knew only that he was a mammal, the list would include only growing, breathing, moving, and being suckled. If we knew only that he was an animal, it would shrink to growing, breathing, and moving.
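The trade-off between the size of a category and the richness of its predictions can be sketched in a few lines of Python. This is a toy model under stated assumptions: the dictionary layout and the function name are invented for illustration, and the property lists are simply the ones given in the paragraph above.

```python
# Toy taxonomy: each category names its parent and the properties it adds.
TAXONOMY = {
    "animal": (None, ["grows", "breathes", "moves"]),
    "mammal": ("animal", ["was suckled"]),
    "cottontail": ("mammal", [
        "inhabits open country or woodland clearings",
        "spreads tularemia",
        "can contract myxomatosis",
    ]),
}

def predicted_properties(category):
    """Collect the properties of a category and of every category above it."""
    properties = []
    while category is not None:
        parent, own = TAXONOMY[category]
        properties = own + properties  # keep the more general properties first
        category = parent
    return properties

for name in ("animal", "mammal", "cottontail"):
    print(name, "->", len(predicted_properties(name)), "predictions")
```

Walking up the chain shows why the narrower label licenses more inferences: a cottontail inherits everything predicted of mammals and animals and adds its own.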
On the other hand, it's much harder to tag Peter as a cottontail than as a mammal or an animal. To tag him as a mammal we need only notice that he is furry and moving, but to tag him as a cottontail we have to notice that he has long ears, a short tail, long hind legs, and white on the underside of his tail. To identify very specific categories we have to examine so many properties that there would be few left to predict. Most of our everyday categories are somewhere in the middle: “rabbit,” not mammal or cottontail; “car,” not vehicle or Ford Tempo; “chair,” not furniture or Barcalounger. They represent a compromise between how hard it is to identify the category and how much good the category does you. The psychologist Eleanor Rosch called them basic-level categories. They are {308} the first words children learn for objects and generally the first mental label we assign when seeing them.
What makes a category like “mammal” or “rabbit” better than a category like “shirts made by companies beginning with H” or “animals drawn with a very fine camel's hair brush”? Many anthropologists and philosophers believe that categories are arbitrary conventions that we learn along with the other cultural accidents standardized in our language. Deconstructionism, poststructuralism, and postmodernism in the humanities take this view to an extreme. But categories would be useful only if they meshed with the way the world works. Fortunately for us, the world's objects are not evenly sprinkled throughout the rows and columns of the inventory list defined by the properties we notice. The world's inventory is lumpy. Creatures with cotton tails tend to have long ears and live in woodland clearings; creatures with fins tend to have scales and live in the water. Other than in the children's books with split pages for assembling do-it-yourself chimeras, there are no finned cottontails or floppy-eared fish. Mental boxes work because things come in clusters that fit the boxes.
What makes the birds of a feather cluster together? The world is sculpted and sorted by laws that science and mathematics aim to discover. The laws of physics dictate that objects denser than water are found on the bottom of a lake, not its surface. Laws of natural selection and physics dictate that objects that move swiftly through fluids have streamlined shapes. The laws of genetics make offspring resemble their parents. Laws of anatomy, physics, and human intentions force chairs to have shapes and materials that make them stable supports.
People form two kinds of categories, as we saw in Chapter 2. We treat games and vegetables as categories that have stereotypes, fuzzy boundaries, and family-like resemblances. That kind of category falls naturally out of pattern-associator neural networks. We treat odd numbers and females as categories that have definitions, in-or-out boundaries, and common threads running through the members. That kind of category is naturally computed by systems of rules. We put some things into both kinds of mental categories — we think of “a grandmother” as a gray-haired muffin dispenser; we also think of “a grandmother” as the female parent of a parent. {309}
Now we can explain what these two ways of thinking are for. Fuzzy categories come from examining objects and uninsightfully recording the correlations among their features. Their predictive power comes from similarity: if A shares some features with B, it probably shares others. They work by recording the clusters in reality. Well-defined categories, in contrast, work by ferreting out the laws that put the clusters there. They fall out of the intuitive theories that capture people's best guess about what makes the world tick. Their predictive power comes from deduction: if A implies B, and A is true, then B is true.
Real science is famous for transcending fuzzy feelings of similarity and getting at underlying laws. Whales are not fish; people are apes; solid matter is mostly empty space. Though ordinary people don't think exactly like scientists, they too let their theories override similarity when they reason about how the world works. Which two out of three belong together: white hair, gray hair, black hair? How about white cloud, gray cloud, black cloud? Most people say that black is the odd hair out, because aging hair turns gray and then white, but that white is the odd cloud out, because gray and black clouds give rain. Say I tell you I have a three-inch disk. Which is it more similar to, a quarter or a pizza? Which is it more likely to be, a quarter or a pizza? Most people say it is more similar to a quarter but more likely to be a pizza. They reason that quarters have to be standardized but pizzas can vary. On a trip to an unexplored forest, you discover a centipede, a caterpillar that looks like it, and a butterfly that the caterpillar turns into. How many kinds of animals have you found, and which belong together? Most people feel, along with biologists, that the caterpillar and the butterfly are the same animal, but the caterpillar and the centipede are not, despite appearances to the contrary. During your first basketball game, you see blond players with green jerseys run toward the east basket with the ball, and black players with yellow jerseys run toward the west basket with the ball. The whistle blows and a black player with a green jersey enters. Which basket will he run to? Everyone knows it is the east basket.
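The quarter-or-pizza judgment can be mimicked with a small Python sketch. It is only one way to model the intuition, not a claim about how people actually compute it. The diameters and spreads below are rough, assumed figures, and the bell-curve (Gaussian) model of size variation is likewise an assumption.

```python
import math

# Assumed figures: (typical diameter, spread) in inches.
CATEGORIES = {
    "quarter": (0.955, 0.01),  # standardized, so almost no variation
    "pizza": (12.0, 3.0),      # sizes vary widely
}

def distance(diameter, category):
    """Similarity stand-in: how far the disk is from the typical member."""
    mean, _ = CATEGORIES[category]
    return abs(diameter - mean)

def likelihood(diameter, category):
    """Gaussian density of seeing this diameter, given the category."""
    mean, spread = CATEGORIES[category]
    z = (diameter - mean) / spread
    return math.exp(-0.5 * z * z) / (spread * math.sqrt(2 * math.pi))

disk = 3.0
more_similar = min(CATEGORIES, key=lambda c: distance(disk, c))
more_likely = max(CATEGORIES, key=lambda c: likelihood(disk, c))
print("more similar to:", more_similar)
print("more likely to be:", more_likely)
```

The two answers come apart because likelihood depends on how much a category is allowed to vary, which is a piece of theory about coins and pizzas rather than a fact about resemblance.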
These similarity-defying guesses come from intuitive theories about aging, weather, economic exchange, biology, and social coalitions. They belong to larger systems of tacit assumptions about kinds of things and the laws governing them. The laws can be played out combinatorially in the mind to get predictions and inferences about events unseen. People everywhere have homespun ideas about physics, to predict how objects roll and bounce; psychology, to predict what other people think and do; {310} logic, to derive some truths from others; arithmetic, to predict the effects of aggregating; biology, to reason about living things and their powers; kinship, to reason about relatedness and inheritance; and a variety of social and legal rule systems. The bulk of this chapter explores those intuitive theories. But first we must ask: when does the world allow theories (scientific or intuitive) to work, and when does it force us all to fall back on fuzzy categories defined by similarity and stereotypes?
Where do our fuzzy similarity clusters come from? Are they just the parts of the world that we understand so poorly that the underlying laws escape us? Or does the world really have fuzzy categories even in our best scientific understanding? The answer depends on what part of the world we look at. Mathematics, physics, and chemistry trade in crisp categories that obey theorems and laws, such as triangles and electrons. But in any realm in which history plays a role, such as biology, members drift in and out of lawful categories over time, leaving their boundaries ragged. Some of the categories are definable, but others really are fuzzy.
Most biologists consider species to be lawful categories: they are populations that have become reproductively isolated and adapted to their local environment. Adaptation to a niche and inbreeding homogenize the population, so a species at a given time is a real category in the world that taxonomists can identify using well-defined criteria. But a higher taxonomic category, representing the descendants of an ancestral species, is not as well behaved. When the ancestral organisms dispersed and their descendants lost touch and adopted new homelands, the original pretty picture became a palimpsest. Robins, penguins, and ostriches share some features, like feathers, because they are great-great-grandchildren of a single population adapted to flight. They differ because ostriches are African and adapted to running and penguins are Antarctic and adapted to swimming. Flying, once a badge of all the birds, is now merely part of their stereotype.
For birds, at least, there is a kind of crisp biological category into which they can be fitted: a clade, exactly one branch of the genealogical tree of organisms. The branch represents the descendants of a single ancestral population. But not all of our familiar animal categories can be pegged onto one branch. Sometimes the descendants of a species {311} diverge so unevenly that some of their scions are almost unrecognizable. Those branchlets have to be hacked off to keep the category as we know it, and the main branch is disfigured by jagged stumps. It turns into a fuzzy category whose boundaries are defined by similarity, without a crisp scientific definition.
Fish, for example, do not occupy one branch in the tree of life. One of their kind, a lungfish, begot the amphibians, whose descendants embrace the reptiles, whose descendants embrace the birds and the mammals. There is no definition that picks out all and only the fish, no branch of the tree of life that includes salmon and lungfish but excludes lizards and cows. Taxonomists fiercely debate what to do with categories like fish that are obvious to any child but have no scientific definition because they are neither species nor clades. Some insist that there is no such thing as a fish; it is merely a layperson's stereotype. Others try to rehabilitate everyday categories like fish using computer algorithms that sort creatures into clusters sharing properties. Still others wonder what the fuss is about; they see categories like families and orders as matters of convenience and taste — which similarities are important for the discussion at hand.
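The claim that no branch holds all and only the fish can be checked mechanically on a toy tree. The Python sketch below uses a deliberately oversimplified genealogy with five animals chosen for illustration; the tree shape follows the description above, but it is an assumption and not a real phylogeny.

```python
# A deliberately simplified family tree as nested tuples; leaves are strings.
TREE = ("salmon", ("lungfish", ("frog", ("lizard", "cow"))))

def leaves(node):
    """Return the set of animals at the tips of this branch."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result

def is_clade(group, node=TREE):
    """True if `group` is exactly the set of leaves under some branch."""
    if leaves(node) == set(group):
        return True
    if isinstance(node, str):
        return False
    return any(is_clade(group, child) for child in node)

print(is_clade({"lizard", "cow"}))      # one branch holds exactly these two
print(is_clade({"salmon", "lungfish"})) # the "fish" do not fill any one branch
```

Any branch that contains both the salmon and the lungfish also contains the frog, the lizard, and the cow, so the only way to get "fish" is to cut those branchlets off.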
Classification is particularly fuzzy at the stump where a branch was hacked off, that is, the extinct species that became the inauspicious ancestor of a new group. The fossil Archaeopteryx, thought to be the ancestor of the birds, has been described by one paleontologist as “a piss-poor reptile, and not very much of a bird.” The anachronistic shoehorning of extinct animals into the modern categories they spawned was a bad habit of early paleontologists, dramatically recounted in Gould's Wonderful Life.
So the world sometimes presents us with fuzzy categories, and registering their similarities is the best we can do. Now we may turn the question around. Does the world ever present us with crisp categories?
In his book Women, Fire, and Dangerous Things, named after a fuzzy grammatical category in an Australian language, the linguist George Lakoff argues that pristine categories are fictions. They are artifacts of the bad habit of seeking definitions, a habit that we inherited from Aristotle and now must shake off. He defies his readers to find a sharp-edged category in {312} the world. Crank up the microscope, and the boundaries turn to fuzz. Take a textbook example, “mother,” a category with the seemingly straightforward definition “female parent.” Oh, yeah? What about surrogate mothers? Adoptive mothers? Foster mothers? Egg donors? Or take species. A species, unlike the controversial larger categories like “fish,” is supposed to have a clear definition: usually, a population of organisms whose members can mate to form fertile offspring. But even that vaporizes under scrutiny. There are widely dispersed, gradually varying species in which an animal from the western edge of the range can mate with an animal from the center, and an animal from the center can mate with an animal from the east, but an animal from the west cannot mate with an animal from the east.
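The trouble with the gradually varying species is that "can mate with" fails to be transitive, so it cannot carve a population into clean, non-overlapping groups. A minimal Python sketch makes the point; the three populations, their positions along the range, and the one-step mating threshold are all hypothetical.

```python
# Hypothetical positions of three populations along the species' range.
POSITIONS = {"west": 0, "center": 1, "east": 2}

def can_mate(a, b, max_gap=1):
    """Assume two populations can interbreed if they are at most one step apart."""
    return abs(POSITIONS[a] - POSITIONS[b]) <= max_gap

def is_transitive(populations):
    """Check: whenever a mates with b and b with c, does a mate with c?"""
    return all(
        can_mate(a, c)
        for a in populations
        for b in populations
        for c in populations
        if can_mate(a, b) and can_mate(b, c)
    )

print(can_mate("west", "center"), can_mate("center", "east"))
print(can_mate("west", "east"))
print(is_transitive(POSITIONS))
```

A definition of species built on interbreeding yields sharp boundaries only when the relation behaves like an equivalence; here it does not.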
The observations are interesting, but I think they miss an important point. Systems of rules are idealizations that abstract away from complicating aspects of reality. They are never visible in pure form, but are no less real for all that. No one has ever actually sighted a triangle without thickness, a frictionless plane, a point mass, an ideal gas, or an infinite, randomly interbreeding population. That is not because they are useless figments but because they are masked by the complexity and finiteness of the world and by many layers of noise. The concept of “mother” is perfectly well defined within a number of idealized theories. In mammalian genetics, a mother is the source of the sex cell that always carries an X chromosome. In evolutionary biology, she is the producer of the larger gamete. In mammalian physiology, she is the site of prenatal growth and birth; in genealogy, the immediate female ancestor; in some legal contexts, the guardian of the child and the spouse of the child's father. The omnibus concept “mother” depends on an idealization of the idealizations in which all the systems pick out the same entities: the contributor of the egg nurtures the embryo, bears the offspring, raises it, and marries the sperm donor. Just as friction does not refute Newton, exotic disruptions of the idealized alignment of genetics, physiology, and law do not make “mother” any fuzzier within each of these systems. Our theories, both folk and scientific, can idealize away from the messiness of the world and lay bare its underlying causal forces.
It's hard to read about the human mind's tendency to put things in boxes organized around a stereotype without pondering the tragedy of racism. {313} If people form stereotypes even about rabbits and fish, does racism come naturally to us? And if racism is both natural and irrational, does that make the love of stereotypes a bug in our cognitive software? Many social and cognitive psychologists would answer yes. They link ethnic stereotypes to an overeagerness to form categories and to an insensitivity to the laws of statistics that would show the stereotypes to be false. An Internet discussion group for neural-network modelers once debated what kinds of learning algorithms would best model Archie Bunker. The discussants assumed that people are racists when their neural networks perform poorly or are deprived of good training examples. If only our networks could use a proper learning rule and take in enough data, they would transcend false stereotypes and correctly register the facts of human equality.
Some ethnic stereotypes are indeed based on bad statistics or none at all; they are a product of a coalitional psychology that automatically denigrates outsiders (see Chapter 7). Others may be based on good statistics about nonexistent people, the virtual characters we meet every day on the big and small screens: Italian goodfellas, Arab terrorists, black drug dealers, Asian kung fu masters, British spies, and so on.
But sadly, some stereotypes may be based on good statistics about real people. In the United States at present, there are real and large differences among ethnic and racial groups in their average performance in school and in their rates of committing violent crimes. (The statistics, of course, say nothing about heredity or any other putative cause.) Ordinary people's estimates of these differences are fairly accurate, and in some cases, people with more contact with a minority group, such as social workers, have more pessimistic, and unfortunately more accurate, estimates of the frequency of negative traits such as illegitimacy and welfare dependency. A good statistical category-maker could develop racial stereotypes and use them to make actuarially sound but morally repugnant decisions about individual cases. This behavior is racist not because it is irrational (in the sense of statistically inaccurate) but because it flouts the moral principle that it is wrong to judge an individual using the statistics of a racial or ethnic group. The argument against bigotry, then, does not come from the design specs for a rational statistical categorizer. It comes from a rule system, in this case a rule of ethics, that tells us when to turn our statistical categorizers off. {314}
You have channel-surfed to a rerun of L.A. Law, and you want to know why the harpy lawyer Rosalind Shays is weeping on the witness stand. If someone began to explain that the fluid in her tear ducts had increased in volume until the pressure exceeded the surface tension by such and such an amount, you would squelch the lecture. What you want to find out is that she hopes to win a lawsuit against her former employers and is shedding crocodile tears to convince the jury that when the firm fired her she was devastated. But if you saw the next episode and wanted to know why she plummeted to the bottom of an elevator shaft after she accidentally stepped through the open door, her motives would be irrelevant to anyone but a Freudian gone mad. The explanation is that matter in free fall, Rosalind Shays included, accelerates at a rate of 9.8 meters per second per second.
There are many ways to explain an event, and some are better than others. Even if neuroscientists someday decode the entire wiring diagram of the brain, human behavior makes the most sense when it is explained in terms of beliefs and desires, not in terms of volts and grams. Physics provides no insight into the machinations of a crafty lawyer, and even fails to enlighten us about many simpler acts of living things. As Richard Dawkins observed, “If you throw a dead bird into the air it will describe a graceful parabola, exactly as physics books say it should, then come to rest on the ground and stay there. It behaves as a solid body of a particular mass and wind resistance ought to behave. But if you throw a live bird in the air it will not describe a parabola and come to rest on the ground. It will fly away, and may not touch land this side of the county boundary.” We understand birds and plants in terms of their innards. To know why they move and grow, we cut them open and put bits under a microscope. We need yet another kind of explanation for artifacts like a chair and a crowbar: a statement of the function the object is intended to perform. It would be silly to try to understand why chairs have a stable horizontal surface by cutting them open and putting bits of them under a microscope. The explanation is that someone designed the chair to hold up a human behind.
Many cognitive scientists believe that the mind is equipped with innate intuitive theories or modules for the major ways of making sense {315} of the world. There are modules for objects and forces, for animate beings, for artifacts, for minds, and for natural kinds like animals, plants, and minerals. Don't take the “theory” idiom literally; as we have seen, people don't really work like scientists. Don't take the “module” metaphor too seriously, either; people can mix and match their ways of knowing. A concept like “throwing,” for example, welds an intention (intuitive psychology) to a motion (intuitive physics). And we often apply modes of thinking to subject matters they were not designed for, such as in slapstick humor (person as object), animistic religion (tree or mountain as having a mind), and anthropomorphic animal stories (animals with human minds). As I have mentioned, I prefer to think of the ways of knowing in anatomical terms, as mental systems, organs, and tissues, like the immune system, blood, or skin. They accomplish specialized functions, thanks to their specialized structures, but don't necessarily come in encapsulated packages. I would also add that the list of intuitive theories or modules or ways of knowing is surely too short. Cognitive scientists think of people as Mr. Spock without the funny ears. A more realistic inventory would include modes of thought and feeling for danger, contamination, status, dominance, fairness, love, friendship, sexuality, children, relatives, and the self. They are explored in later chapters.
Saying that the different ways of knowing are innate is different from saying that knowledge is innate. Obviously we have to learn about Frisbees, butterflies, and lawyers. Talking about innate modules is not meant to minimize learning but to explain it. Learning involves more than recording experience; learning requires couching the records of experience so that they generalize in useful ways. A VCR is excellent at recording, but no one would look to this modern version of the blank slate as a paradigm of intelligence. When we watch lawyers in action, we draw conclusions about their goals and values, not their tongue and limb trajectories. Goals and values are one of the vocabularies in which we mentally couch our experiences. They cannot be built out of simpler concepts from our physical knowledge the way “momentum” can be built out of mass and velocity or “power” can be built out of energy and time. They are primitive or irreducible, and higher-level concepts are defined in terms of them. To understand learning in other domains, we have to find their vocabularies, too.
Because a combinatorial system like a vocabulary can generate a vast number of combinations, one might wonder whether human thoughts can be generated by a single system, a general-purpose Esperanto of the mind. But even a very powerful combinatorial system has its limits. A {316} calculator can add and multiply a vast number of vast numbers, but it will never spell a sentence. A dedicated word processor can type Borges’ infinite library of books with all combinations of characters, but it can never add the numbers it spells out. Modern digital computers can do a lot with a little, but that “little” still includes distinct, hard-wired vocabularies for text, graphics, logic, and several kinds of numbers. When the computers are programmed into artificial intelligence reasoning systems, they have to be innately endowed with an understanding of the basic categories of the world: objects, which can't be in two places at once, animals, which live for a single interval of time, people, who don't like pain, and so on. That is no less true of the human mind. Even a dozen innate mental vocabularies — a wild and crazy idea, according to critics — would be a small number with which to spell the entirety of human thought and feeling, from the meanings of the 500,000 words in the Oxford English Dictionary to the plots of Scheherazade's 1,001 tales.
We live in the material world, and one of the first things in life we must figure out is how objects bump into each other and fall down elevator shafts. Until recently, everyone thought that the infant's world was a kaleidoscope of sensations, a “blooming, buzzing confusion,” in William James’ memorable words. Piaget claimed that infants were sensorimotor creatures, unaware that objects cohere and persist and that the world works by external laws rather than the infants’ actions. Infants would be like the man in the famous limerick about Berkeley's idealist philosophy:
There once was a man who said, “God
Must think it exceedingly odd
If he finds that this tree
Continues to be
When there's no one about in the Quad.”
Philosophers are fond of pointing out that the belief that the world is a hallucination or that objects do not exist when you aren't looking at them is not refutable by any observation. A baby could experience the blooming and buzzing all its life unless it was equipped with a mental mechanism that interpreted the blooms and buzzes as the outward signs {317} of persisting objects that follow mechanical laws. We should expect infants to show some appreciation of physics from the start.
Only careful laboratory studies can tell us what it is like — rather, what it was like — to be a baby. Unfortunately, infants are difficult experimental subjects, worse than rats and sophomores. They can't easily be conditioned, and they don't talk. But an ingenious technique, refined by the psychologists Elizabeth Spelke and Renee Baillargeon, capitalizes on one feat that infants are good at: getting bored. When infants see the same old thing again and again, they signal their boredom by looking away. If a new thing appears, they perk up and stare. Now, “old thing” and “new thing” are in the mind of the beholder. By seeing what revives babies’ interest and what prolongs their ennui, we can guess at what things they see as the same and what things they see as different — that is, how they categorize experience. It's especially informative when a screen first blocks part of the infant's view and then falls away, for we can try to tell what the babies were thinking about the invisible part of their world. If the baby's eyes are only momentarily attracted and then wander off, we can infer that the scene was in the baby's mind's eye all along. If the baby stares longer, we can infer that the scene came as a surprise.
Three- to four-month-old infants are usually the youngest tested, both because they are better behaved than younger babies and because their stereo vision, motion perception, visual attention, and acuity have just matured. The tests cannot, by themselves, establish what is and is not innate. Three-month-olds were not born yesterday, so anything they know they could, in theory, have learned. And three-month-olds still have a lot of maturing to do, so anything they come to know later could emerge without learning, just as teeth and pubic hair do. But by telling us what babies know at what age, the findings narrow the options.
Spelke and Philip Kelman wanted to see what infants treated as an object. Remember from Chapter 4 that it is not easy, even for an adult, to say what an “object” is. An object can be defined as a stretch of the visual field with a smooth silhouette, a stretch with a homogeneous color and texture, or a collection of patches with a common motion. Often these definitions pick out the same pieces, but when they don't, it is common motion that wins the day. When pieces move together, we see them as a single object; when pieces go their separate ways, we see them as separate objects. The concept of an object is useful because bits of matter that are attached to one another usually move together. Bicycles and {318} grapevines and snails may be jagged agglomerations of different materials, but if you pick up one end, the other end comes along for the ride.
Kelman and Spelke bored babies with two sticks poking out from behind the top and bottom edges of a wide screen. The question was whether the babies would see the sticks as part of a single object. When the screen was removed, the babies saw either one long stick or two short ones with a gap between them. If the babies had visualized a single object, then seeing a single object would be a bore, and two would come as a surprise. If they had thought of each piece as its own object, then seeing a single object would be a surprise, and two a bore. Control experiments measured how long infants looked at one versus two objects without having seen anything beforehand; these baseline times were subtracted out.
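The logic of the baseline correction can be put in a few lines of code. This is only a sketch of the reasoning, not the experimenters' actual analysis: the function names are hypothetical and the looking times (in seconds) are invented for illustration.

```python
# Hypothetical looking times in seconds; the numbers are invented for illustration.

def corrected_looking(test_looks, baseline_looks):
    """Subtract each display's baseline looking time from its test looking time."""
    return {display: test_looks[display] - baseline_looks[display]
            for display in test_looks}

def expected_display(test_looks, baseline_looks):
    """Return the display that drew the least extra looking, i.e. the apparent bore."""
    corrected = corrected_looking(test_looks, baseline_looks)
    return min(corrected, key=corrected.get)

# Looking times after the screen is removed (assumed values).
test_looks = {"one_stick": 9.0, "two_sticks": 16.0}
# Looking times from control infants who saw nothing beforehand (assumed values).
baseline_looks = {"one_stick": 8.0, "two_sticks": 9.0}

print(corrected_looking(test_looks, baseline_looks))
print(expected_display(test_looks, baseline_looks))
```

With these made-up numbers, the extra looking goes to the two short sticks, so the single long stick counts as the bore, which is the pattern one would read as the baby having visualized one object behind the screen.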
Infants might have been expected to see the two pieces as two pieces, or, if they had mentally united them at all, to have used all the correlations among the features of an object as criteria: smooth silhouettes, common colors, common textures, and common motions. But apparently infants have an idea of objecthood early in life, and it is the core of the adult concept: parts moving together. When two sticks peeking out from behind the screen moved back and forth in tandem, babies saw them as a single object and were surprised if the raised screen revealed two. When they didn't move, babies did not expect them to be a single object, even though the visible pieces had the same color and texture. When a stick peeked out from behind the top edge and a red jagged polygon peeked out from behind the bottom edge, and they moved back and forth in tandem, babies expected them to be connected, even though they had nothing in common but motion.
The child is parent to the adult in other principles of intuitive physics. One is that an object cannot pass through another object like a ghost. Renee Baillargeon has shown that four-month-old infants are surprised when a panel just in front of a cube somehow manages to fall back flat to the ground, right through the space that the cube should be occupying. Spelke and company have shown that infants don't expect an object to pass through a barrier or through a gap that is narrower than the object is.
A second principle is that objects move along continuous trajectories: they cannot disappear from one place and materialize in another, as in the transporter room of the Enterprise. When an infant sees an object pass behind the left edge of a left screen and then seem to reappear from {319} behind the right edge of a right screen without moving through the gap between the screens, she assumes she is seeing two objects. When she sees an object pass behind the left screen, reappear at the other edge of the screen, cross the gap, and then pass behind the right screen, she assumes she is seeing one object.
A third principle is that objects are cohesive. Infants are surprised when a hand picks up what looks like an object but part of the object stays behind.
A fourth principle is that objects move each other by contact only — no action at a distance. After repeatedly seeing an object pass behind a screen and another object pop out, babies expect to see one launching the other like billiard balls. They are surprised when the screen reveals one ball stopping short and the second just up and leaving.
So three- to four-month-old infants see objects, remember them, and expect them to obey the laws of continuity, cohesion, and contact as they move. Babies are not as stoned as James, Piaget, Freud, and others thought. As the psychologist David Geary has said, James’ “blooming, buzzing confusion” is a good description of the parents' life, not the infant's. The discovery also overturns the suggestion that babies stop their world from spinning by manipulating objects, walking around them, talking about them, or hearing them talked about. Three-month-olds can barely orient, see, touch, and reach, let alone manipulate, walk, talk, and understand. They could not have learned anything by the standard techniques of interaction, feedback, and language. Nonetheless, they are sagely understanding a stable and lawful world.
Proud parents should not call MIT admissions just yet. Small babies have an uncertain grasp, at best, of gravity. They are surprised when a hand pushes a box off a table and it remains hovering in midair, but the slightest contact with the edge of the table or a fingertip is enough for them to act as if nothing were amiss. And they are not fazed when a screen rises to reveal a falling object that has defied gravity by coming to rest in midair. Nor are they nonplussed when a ball rolls right over a large hole in a table without falling through. Infants don't quite have inertia down, either. For example, they don't care when a ball rolls toward one corner of a covered box and then is shown to have ended up in the other corner.
But then, adults’ grasp of gravity and inertia is not so firm, either. The psychologists Michael McCloskey, Alfonso Caramazza, and Bert Green asked college students what would happen when a ball shot out of a {320} curved tube or when a whirling tetherball was cut loose. A depressingly large minority, including many who had taken physics, guessed that it would continue in a curving path. (Newton's first law states that a moving object continues to move in a straight line unless a force acts on it.) The students explained that the object acquires a “force” or “momentum” (some students, remembering the lingo but not the concept, called it “angular momentum”), which propels it along the curve until the momentum gets used up and the path straightens out. Their beliefs come right out of the medieval theory in which an object is impressed with an “impetus” that maintains the object's motion and gradually dissipates.
These howlers come from conscious theorizing; they are not what people are prepared to see. When people view their paper-and-pencil answer as a computer animation, they burst out laughing as if watching Wile E. Coyote chasing the Road Runner over a cliff and stopping in midair before plunging straight down. But the cognitive misconceptions do run deep. I toss a ball straight up. After it leaves my hand, which forces act on it on the way up, at the apogee, and on the way down? It's almost impossible not to think that momentum carries the ball up against gravity, the forces equal out, and then gravity is stronger and pushes it back down. The correct answer is that gravity is the only force and that it applies the whole time. The linguist Leonard Talmy points out that the impetus theory infuses our language. When we say The ball kept rolling because the wind blew on it, we are construing the ball as having an inherent tendency toward rest. When we say The ridge kept the pencil on the table, we are imbuing the pencil with a tendency toward motion, not to mention flouting Newton's third law (action equals reaction) by imputing a greater force to the ridge. Talmy, like most cognitive scientists, believes that the conceptions drive the language, not the other way around.
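For readers who want the ball toss spelled out, here is the textbook calculation, ignoring air resistance. The symbols are assumptions introduced for illustration: upward is positive, m is the ball's mass, g is the acceleration of gravity (about 9.8 meters per second per second), and v_0 is the speed at which the ball leaves the hand.

```latex
% After release, gravity is the only force, and it never changes:
F = -mg \quad\Longrightarrow\quad a = \frac{F}{m} = -g
\quad \text{(on the way up, at the apogee, and on the way down)}

% The velocity therefore shrinks steadily, passes through zero, and turns negative:
v(t) = v_0 - g\,t

% At the apogee the velocity is zero, but the acceleration is still -g:
v(t_{\text{top}}) = 0 \quad\text{at}\quad t_{\text{top}} = \frac{v_0}{g},
\qquad a(t_{\text{top}}) = -g
```

Nothing balances out at the top. The ball rises only because it was already moving upward, and the same unopposed pull that slows its climb then speeds its fall.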
When it comes to more complicated motions, even perception fails us. The psychologists Dennis Proffitt and David Gilden have asked people simple questions about spinning tops, wheels rolling down ramps, colliding balls, and Archimedes-in-the-bathtub displacements. Even physics professors guess the wrong outcome if they are not allowed to fiddle with equations on paper. (If they are, they spend a quarter of an hour working it out and then announce that the problem is “trivial.”) When it comes to these motions, video animations of impossible events look quite natural. Indeed, possible events look unnatural: a spinning {321} top, which leans without falling, is an object of wonder to all of us, even physicists.
It is not surprising to find that the mind is non-Newtonian. The idealized motions of classical mechanics are visible only in perfectly elastic point masses moving in vacuums on frictionless planes. In the real world, Newton's laws are masked by friction from the air, the ground, and the objects’ own molecules. With friction slowing everything that moves and keeping stationary objects in place, it's natural to conceive of objects as having an inherent tendency toward rest. As historians of science have noted, it would be hard to convince a medieval European struggling to free an oxcart from the mud that an object in motion continues at a constant speed along a straight line unless acted upon by an external force. Complicated motions like spinning tops and rolling wheels have a double disadvantage. They depend on evolutionarily unprecedented machines with negligible friction, and their motions are governed by complex equations that relate many variables at once; our perceptual system can handle only one at a time even in the best of circumstances.
Even the brainiest baby has a lot to learn. Children grow up in a world of sand, Velcro, glue, Nerf balls, rubbed balloons, dandelion seeds, boomerangs, television remote controls, objects suspended by near-invisible fishing line, and countless other objects whose idiosyncratic properties overwhelm the generic predictions of Newton's laws. The precociousness that infants show in the lab does not absolve them of learning about objects; it makes the learning possible. If children did not carve the world into objects, or if they were prepared to believe that objects could magically disappear and reappear anywhere, they would have no pegs on which to hang their discoveries of stickiness, fluffiness, squishiness, and so on. Nor could they develop the intuitions captured in Aristotle's theory, the impetus theory, Newton's theory, or Wile E. Coyote's theory. An intuitive physics relevant to our middle-sized world has to refer to enduring matter and its lawful motions, and infants see the world in those terms from the beginning.
Here is the plot of a movie. A protagonist strives to attain a goal. An antagonist interferes. Thanks to a helper, the protagonist finally succeeds. This movie does not feature a swashbuckling hero aided by a {322} romantic interest to foil a dastardly villain. Its stars are three dots. One dot moves some distance up an inclined line, back down, and up again, until it is almost at the top. Another abruptly collides with it, and it moves back down. A third gently touches it and moves together with it to the top of the incline. It is impossible not to see the first dot as trying to get up the hill, the second as hindering it, and the third as helping it reach its goal.
The social psychologists Fritz Heider and M. Simmel were the filmmakers. Together with many developmental psychologists, they conclude that people interpret certain motions not as special cases in their intuitive physics (perhaps as weird springy objects) but as a different kind of entity altogether. People construe certain objects as animate agents. Agents are recognized by their ability to violate intuitive physics by starting, stopping, swerving, or speeding up without an external nudge, especially when they persistently approach or avoid some other object. The agents are thought to have an internal and renewable source of energy, force, impetus, or oomph, which they use to propel themselves, usually in service of a goal.
These agents are animals, of course, including humans. Science tells us that they follow physical laws, just like everything else in the universe; it's just that the matter in motion consists of tiny little molecules in muscles and brains. But outside the neurophysiology lab ordinary thinkers have to assign them to a different category of uncaused causers.
Infants divide the world into the animate and the inert early in life. Three-month-olds are upset by a face that suddenly goes still but not by an object that suddenly stops moving. They try to bring objects toward them by pushing things, but try to bring people toward them by making noise. By six or seven months, babies distinguish between how hands act upon objects and how other objects act upon objects. They have opposite expectations about what makes people move and what makes objects move: objects launch each other by collisions; people start and stop on their own. By twelve months, babies interpret cartoons of moving dots as if the dots were seeking goals. For example, the babies are not surprised when a dot that hops over a barrier on its way to another dot makes a beeline after the barrier is removed. Three-year-olds describe dot cartoons much as we do, and have no trouble distinguishing things that move on their own, like animals, from things that don't, like dolls, statues, and lifelike animal figurines.
Intuitions about self-propelled agents overlap with three other major {323} ways of knowing. Most agents are animals, and animals, like plants and minerals, are categories that we sense are given by nature. Some self-propelled things, like cars and windup dolls, are artifacts. And many agents do not merely approach and avoid goals but act out of beliefs and desires; that is, they have minds. Let's look at each of them.
People everywhere are fine amateur biologists. They enjoy looking at animals and plants, classify them into groups that biologists recognize, predict their movements and life cycles, and use their juices as medicines, poisons, food additives, and recreational drugs. These talents, which have adapted us to the cognitive niche, come from a mode of understanding the world called folk biology, though “folk natural history” may be a more apt term. People have certain intuitions about natural kinds — roughly, the sorts of things found in a museum of natural history, such as animals, plants, and minerals — that they don't apply to artifacts, such as coffeepots, or to kinds stipulated directly by rules, such as triangles and prime ministers.
What is the definition of lion? You might say “a large, ferocious cat that lives in Africa.” But suppose you learned that a decade ago lions were hunted to extinction in Africa and survive only in American zoos. Suppose scientists discovered that lions weren't innately ferocious; they get that way in a dysfunctional family but otherwise grow up like Bert Lahr in the Wizard of Oz. Suppose it turned out that they were not even cats. I had a teacher who insisted that lions really belonged in the dog family, and though she was wrong, she could have been right, just as whales turned out to be mammals, not fish. But if this thought experiment turned out to be true, you would probably feel that these gentle American dogs were still really lions, even if not a word of the definition survived. Lions just don't have definitions. They are not even picked out by the picture of a lion in the dictionary next to the definition of the word. A lifelike mechanical lion wouldn't count as the real thing, and one can imagine breeding a striped lion that looked more like a tiger but would still count as a lion.
Philosophers say that the meaning of a natural-kind term comes from an intuition of a hidden trait or essence that the members share with one another and with the first examples dubbed with the term. People don't {324} need to know what the essence is, just that there is one. Some people probably think that lionhood is in the blood; others might mumble something about DNA; still others would have no idea but would sense that lions all have it, whatever it is, and pass it to their offspring. Even when an essence is known, it is not a definition. Physicists tell us that gold is matter with atomic number 79, as good an essence as we can hope for. But if they had miscalculated and it turned out that gold was 78 and platinum 79, we would not think that the word gold now refers to platinum or experience much of a change in the way we think about gold. Compare these intuitions with our feelings about artifacts like coffeepots. Coffeepots are pots for making coffee. The possibility that all coffeepots have an essence, that scientists might someday discover it, or that we might have been wrong about coffeepots all along and that they are really pots for making tea are worthy of Monty Python's Flying Circus.
If the driving intuition behind folk physics is the continuous solid object, and the driving intuition behind animacy is an internal and renewable source of oomph, then the driving intuition behind natural kinds is a hidden essence. Folk biology is said to be essentialistic. The essence has something in common with the oomph that powers animals’ motions, but it also is sensed to give the animal its form, to drive its growth, and to orchestrate its vegetative processes like breathing and digestion. Of course, today we know that this elan vital is really just a tiny data tape and chemical factory inside every cell.
Intuitions about essences can be found long ago and far away. Even before Darwin, the Linnaean classification system used by professional biologists was guided by a sense of proper categories based not on similarity but on underlying constitution. Peacocks and peahens were classified as the same animal, as were a caterpillar and the butterfly it turned into. Some similar animals (monarch and viceroy butterflies, mice and shrews) were put into different groups because of subtle differences in their internal structure or embryonic forms. The classification was hierarchical: every living thing belonged to one species, every species belonged to one genus, and so on up through families, orders, classes, and phyla to the plant and animal kingdoms, all in one tree of life. Again, compare this system with the classification of artifacts, such as the tapes in a video store. They can be arranged by genre, such as dramas and musicals, by period, such as new releases and classics, by alphabetical order, by country of origin, or by various cross-classifications such as foreign {325} new releases or classic musicals. There is no single correct tree of videotapes.
The anthropologists Brent Berlin and Scott Atran have discovered that folk taxonomies all over the world work the same way as the Linnaean tree. People group all the local plants and animals into kinds that correspond to the biologist's “genus.” Since there is usually only one species per genus in a locality, their categories usually match the biologist's “species” as well. Every folk genus belongs to a single “life form,” such as mammals, birds, mushrooms, herbs, insects, or reptiles. The life forms are in turn either animals or plants. People override appearances when classifying living things; for example, they lump frogs and tadpoles. They use their classes to reason about how animals work, such as who can breed with whom.
One of Darwin's best arguments for evolution was that it explained why living things are hierarchically grouped. The tree of life is a family tree. The members of a species seem to share an essence because they are descendants of a common ancestor that passed it on. Species fall into groups within groups because they diverged from even earlier common ancestors. Embryonic and internal features are more sensible criteria than surface appearance because they better reflect degree of relatedness.
Darwin had to fight his contemporaries’ intuitive essentialism because, taken to an extreme, it implied that species could not change. A reptile has a reptilian essence and can no more evolve into a bird than the number seven can evolve into an even number. As recently as the 1940s, the philosopher Mortimer Adler argued that just as there can be no three-and-a-half-sided triangle, there can be nothing intermediate between an animal and a human, so humans could not have evolved. Darwin pointed out that species are populations, not ideal types, with members that vary; in the past they could have shaded into in-between forms.
Today we have gone to the other extreme, and in modern academic life “essentialist” is just about the worst thing you can call someone. In the sciences, essentialism is tantamount to creationism. In the humanities, the label implies that the person subscribes to insane beliefs such as that the sexes are not socially constructed, there are universal human emotions, a real world exists, and so on. And in the social sciences, “essentialism” has joined “reductionism,” “determinism,” and “reification” as a term of abuse hurled at anyone who tries to explain human {326} thought and behavior rather than redescribe it. I think it is unfortunate that “essentialism” has become an epithet, because at heart it is just the ordinary human curiosity to find out what makes natural things work. Essentialism is behind the success of chemistry, physiology, and genetics, and even today biologists routinely embrace the essentialist heresy when they work on the Human Genome Project (but everyone has a different genome!) or open up Gray's Anatomy (but bodies vary!).
How deeply rooted is essentialist thinking? The psychologists Frank Keil, Susan Gelman, and Henry Wellman have taken the philosophers’ thought experiments about natural kinds and given them to children. Doctors take a tiger, bleach its fur, and sew on a mane. Is it a lion or a tiger? Seven-year-olds say it's still a tiger, but five-year-olds say it's now a lion. This finding, taken at face value, suggests that older children are essentialists about animals but younger ones are not. (At no age are children essentialists about artifacts — if you make a coffeepot look like a birdfeeder, children, like adults, say it just is a birdfeeder.)
But with deeper probing, one can find evidence for essentialist intuitions about living things even in preschoolers. Five-year-olds deny that an animal can be made to cross the deeper boundary into plants or artifacts. For example, they say that a porcupine that looks as if it has been turned into a cactus or a hairbrush in fact has not. And preschoolers think that one species can be turned into another only when the transformation affects a permanent part of the animal's constitution, not when it merely alters appearance. For example, they deny that a lion costume turns a tiger into a lion. They claim that if you remove the innards of a dog, the shell that remains, while looking like a dog, is not a dog and can't bark or eat dogfood. But if you remove the outsides of a dog, leaving something that doesn't look like a dog at all, it's still a dog and does doggy things. Preschoolers even have a crude sense of inheritance. Told that a piglet is being raised by cows, they know it will grow up to oink, not moo.
Children do not merely sort animals like baseball cards but use their categories to reason about how animals work. In one experiment, three-year-olds were shown pictures of a flamingo, a blackbird, and a bat that looked a lot like the blackbird. The children were told that flamingos feed their babies mashed-up food but bats feed their babies milk, and were asked what they thought the blackbird feeds its babies. With no further information, children went with appearances and said that blackbirds, like bats, give milk. But if they were told that a flamingo is a bird, the children thought of them as working like blackbirds, despite their {327} different appearance, and guessed that blackbirds provide their babies with mashed-up food, too.
Children also have a sense that a living thing's properties are there to keep it alive and help it function. Three-year-olds say that a rose has thorns because it helps the rose, but not that barbed wire has barbs to help the wire. They say that claws are good for the lobster, but not that jaws are good for the pliers. This sense of fitness or adaptation is not just a confusion between psychological wants and biological functions. The psychologists Giyoo Hatano and Kayoko Inagaki have shown that children have a clear sense that bodily processes are involuntary. They know that a boy can't digest dinner more quickly to make room for dessert, nor can he make himself fat by wishing alone.
Is essentialism learned? Biological processes are too slow and hidden to show to a bored baby, but testing babies is only one way to show knowledge in the absence of experience. Another is to measure the source of the experience itself. Three-year-olds haven't taken biology, and they have few opportunities to experiment with the innards or the heritability of animals. Whatever they have learned about essences has presumably come from their parents. Gelman and her students analyzed more than four thousand sentences from mothers talking to their children about animals and artifacts. The parents virtually never talked about innards, origins, or essences, and the few times they did, it was about the innards of artifacts. Children are essentialists without their parents’ help.
Artifacts come with being human. We make tools, and as we evolved our tools made us. One-year-old babies are fascinated by what objects can do for them. They tinker obsessively with sticks for pushing, cloth and strings for pulling, and supports for holding things up. As soon as they can be tested on tool use, around eighteen months, children show an understanding that tools have to contact their material and that a tool's rigidity and shape are more important than its color or ornamentation. Some patients with brain damage cannot name natural objects but can name artifacts, or vice versa, suggesting that artifacts and natural kinds might even be stored in different ways in the brain.
What is an artifact? An artifact is an object suitable for attaining some {328} end that a person intends to be used for attaining that end. The mixture of mechanics and psychology makes artifacts a strange category. Artifacts can't be defined by their shape or their constitution, only by what they can do and by what someone, somewhere, wants them to do. A store in my neighborhood sells nothing but chairs, but its inventory is as varied as a department store's. It has stools, high-backed dining chairs, recliners, beanbags, elastics and wires stretched over frames, hammocks, wooden cubes, plastic S's, and foam-rubber cylinders. We call them all chairs because they are designed to hold people up. A stump or an elephant's foot can become a chair if someone decides to use it as one. Probably somewhere in the forests of the world there is a knot of branches that uncannily resembles a chair. But like the proverbial falling tree that makes no sound, it is not a chair until someone decides to treat it as one. Keil's young subjects who happily let coffeepots turn into birdfeeders get the idea.
An extraterrestrial physicist or geometer, unless it had our psychology, would be baffled by some of the things we think exist in the world when these things are artifacts. Chomsky points out that we can say that the book John is writing will weigh five pounds when it is published: “the book” is both a stream of ideas in John's head and an object with mass. We talk about a house burning down to nothing and being rebuilt; somehow, it's the same house. Consider what kind of object “a city” must be, given that we can say London is so unhappy, ugly, and polluted that it should be destroyed and rebuilt a hundred miles away.
When Atran claimed that folk biology mirrors professional biology, he was criticized because folk categories like “vegetable” and “pet” match no Linnaean taxon. He replies that they are artifacts. Not only are they defined by the needs they serve (savory, succulent food; tractable companions), but they are, quite literally, human products. Millennia of selective breeding have created corn out of a grass and carrots out of a root. One has only to imagine packs of French poodles roaming the primeval forests to realize that most pets are human creations, too.
Daniel Dennett proposes that the mind adopts a “design stance” when dealing with artifacts, complementing its “physical stance” for objects like rocks and its “intentional stance” for minds. In the design stance, one imputes an intention to a real or hypothetical designer. Some objects are so suited to accomplishing an improbable outcome that the attribution is easy. As Dennett writes, “There can be little doubt what an axe is, or what a telephone is for; we hardly need to consult Alexander {329} Graham Bell's biography for clues about what he had in mind.” Others are notoriously open to rival interpretations, like paintings and sculpture, which are sometimes designed to have an inscrutable design. Still others, like Stonehenge or an assembly of gears found in a shipwreck, probably have a function, though we don't know what it is. Artifacts, because they depend on human intentions, are subject to interpretation and criticism just as if they were works of art, an activity Dennett calls “artifact hermeneutics.”
And now we come to the mind's way of knowing other minds. We are all psychologists. We analyze minds not just to follow soap-opera connivings but to understand the simplest human actions.
The psychologist Simon Baron-Cohen makes the point with a story. Mary walked into the bedroom, walked around, and walked out. How do you explain it? Maybe you'd say that Mary was looking for something she wanted to find and thought it was in the bedroom. Maybe you'd say Mary heard something in the bedroom and wanted to know what made the noise. Or maybe you'd say that Mary forgot where she was going; maybe she really intended to go downstairs. But you certainly would not say that Mary just does this every day at this time: she just walks into the bedroom, walks around, and walks out again. It would be unnatural to explain human behavior in the physicist's language of time, distance, and mass, and it would also be wrong; if you came back tomorrow to test the hypothesis, it would surely fail. Our minds explain other people's behavior by their beliefs and desires because other people's behavior is in fact caused by their beliefs and desires. The behaviorists were wrong, and everyone intuitively knows it.
Mental states are invisible and weightless. Philosophers define them as “a relation between a person and a proposition.” The relation is an attitude like believes-that, desires-that, hopes-that, pretends-that. The proposition is the content of the belief, something very roughly like the meaning of a sentence — for example, Mary finds the keys, or The keys are in the bedroom. The content of a belief lives in a different realm from the facts of the world. There are unicorns grazing in Cambridge Common is false, but John thinks there are unicorns grazing in Cambridge Common could very well be true. To ascribe a belief to someone, we can't just {330} think a thought in the ordinary way, or we wouldn't be able to learn that John believes in unicorns without believing in them ourselves. We have to take a thought, set it aside in mental quotation marks, and think, “That is what John thinks” (or wants, or hopes for, or guesses). Moreover, anything we can think is also something we can think that someone else thinks (Mary knows that John thinks that there are unicorns . . .). These onionlike thoughts-inside-thoughts need a special computational architecture (see Chapter 2) and, when we communicate them to others, the recursive grammar proposed by Chomsky and explained in The Language Instinct.
We mortals can't read other people's minds directly. But we make good guesses from what they say, what we read between the lines, what they show in their face and eyes, and what best explains their behavior. It is our species’ most remarkable talent. After reading the chapter on vision you might be amazed that people can recognize a dog. Now think about what it takes to recognize the dog in a pantomime of walking one.
But somehow children do it. The skills behind mind reading are first exercised in the crib. Two-month-olds stare at eyes; six-month-olds know when they're staring back; one-year-olds look at what a parent is staring at, and check a parent's eyes when they are uncertain why the parent is doing something. Between eighteen and twenty-four months, children begin to separate the contents of other people's minds from their own beliefs. They show that ability off in a deceptively simple feat: pretending. When a toddler plays along with his mother who tells him the phone is ringing and hands him a banana, he is separating the contents of their pretense (the banana is a telephone) from the contents of his own belief (the banana is a banana). Two-year-olds use mental verbs like see and want, and three-year-olds use verbs like think, know, and remember. They know that a looker generally wants what he is looking at. And they grasp the idea of “idea.” For example, they know that you can't eat the memory of an apple and that a person can tell what's in a box only by looking into it.
By four, children pass a very stringent test of knowledge about other minds: they can attribute to others beliefs they themselves know to be false. In a typical experiment, children open a Smarties box and are surprised to find pencils inside. (Smarties, the British psychologists explain to American audiences, are like M&M's, only better.) Then the children are asked what a person coming into the room expects to find. Though the children know that the box contains pencils, they sequester the {331} knowledge, put themselves in the newcomer's shoes, and say, “Smarties.” Three-year-olds have more trouble keeping their knowledge out of the picture; they insist that the newcomer will expect to find pencils in the candy box. But it's unlikely that they lack the very idea of other minds; when the wrong answer is made less alluring or the children are induced to think a bit harder, they attribute false beliefs to others, too. The results come out the same in every country in which children have been tested.
Thinking of other minds comes so naturally that it almost seems like part and parcel of intelligence itself. Can we even imagine what it would be like not to think of other people as having minds? The psychologist Alison Gopnik imagines it would be like this:
At the top of my field of vision is a blurry edge of nose, in front are waving hands . . . Around me bags of skin are draped over chairs, and stuffed into pieces of cloth; they shift and protrude in unexpected ways. . . . Two dark spots near the top of them swivel restlessly back and forth. A hole beneath the spots fills with food and from it comes a stream of noises. . . . The noisy skin-bags suddenly [move] toward you, and their noises [grow] loud, and you [have] no idea why. . . .
Baron-Cohen, Alan Leslie, and Uta Frith have proposed that there really are people who think like this. They are the people we call autistic.
Autism affects about one in a thousand children. They are said to “draw into a shell and live within themselves.” When taken into a room, they disregard people and go for the objects. When someone offers a hand, they play with it like a mechanical toy. Cuddly dolls and stuffed animals hold little interest. They pay little attention to their parents and don't respond when called. In public, they touch, smell, and walk over people as if they were furniture. They don't play with other children. But the intellectual and perceptual abilities of some autistic children are legendary (especially after Dustin Hoffman's performance in Rain Man). Some of them learn multiplication tables, put together jigsaw puzzles (even upside down), disassemble and reassemble appliances, read distant license plates, or instantly calculate the day of the week on which any given date in the past or future falls.
Like many psychology undergraduates, I learned about autism from a famous Scientific American reprint, “Joey: A Mechanical Boy,” by the psychoanalyst Bruno Bettelheim. Bettelheim explained that Joey's autism was caused by emotionally distant parents (“icebox mother” became the {332} favored term) and early, rigid toilet training. He wrote, “It is unlikely that Joey's calamity could befall a child in any time and culture but our own.” According to Bettelheim, postwar parents had such an easy time providing their children with creature comforts that they took no pleasure in it, and the children did not develop a feeling of worth from having their basic needs satisfied. Bettelheim claimed to have cured Joey, at first by letting him use a wastebasket instead of the toilet. (He allowed that the therapy “entailed some hardship for his counselors.”)
Today we know that autism occurs in every country and social class, lasts a lifetime (though sometimes with improvement), and cannot be blamed on mothers. It almost certainly has neurological and genetic causes, though they have not been pinpointed. Baron-Cohen, Frith, and Leslie suggest that autistic children are mind-blind: their module for attributing minds to others is damaged. Autistic children almost never pretend, can't explain the difference between an apple and a memory of an apple, don't distinguish between someone's looking into a box and someone's touching it, know where a cartoon face is looking but do not guess that it wants what it is looking at, and fail the Smarties (false-belief) task. Remarkably, they pass a test that is logically the same as the false-belief task but not about minds. The experimenter lifts Rubber Ducky out of the bathtub and puts it on the bed, takes a Polaroid snapshot, and then puts it back in the bathtub. Normal three-year-olds believe that the photo will somehow show the duck in the tub. Autistic children know it does not.
Mind-blindness is not caused by real blindness, nor by mental retardation such as Down's syndrome. It is a vivid reminder that the contents of the world are not just there for the knowing but have to be grasped with suitable mental machinery. In a sense, autistic children are right: the universe is nothing but matter in motion. My “normal” mental equipment leaves me chronically dumbfounded at the fact that a microdot and a spoonful of semen can bring about a site of thinking and feeling and that a blood clot or a metal slug can end it. It gives me the delusion that London and chairs and vegetables are on the inventory of the world's objects. Even the objects themselves are a kind of delusion. Buckminster Fuller once wrote: “Everything you've learned ... as ‘obvious’ becomes less and less obvious as you begin to study the universe. For example, there are no solids in the universe. There's not even a suggestion of a solid. There are no absolute continuums. There are no surfaces. There are no straight lines.” {333}
In another sense, of course, the world does have surfaces and chairs and rabbits and minds. They are knots and patterns and vortices of matter and energy that obey their own laws and ripple through the sector of space-time in which we spend our days. They are not social constructions, nor the bits of undigested beef that Scrooge blamed for his vision of Marley's ghost. But to a mind unequipped to find them, they might as well not exist at all. As the psychologist George Miller has put it, “The crowning intellectual accomplishment of the brain is the real world. . . . [A]ll [the] fundamental aspects of the real world of our experience are adaptive interpretations of the really real world of physics.”
The medieval curriculum comprised seven liberal arts, divided into the lower-level trivium (grammar, logic, and rhetoric) and the upper-level quadrivium (geometry, astronomy, arithmetic, and music). Trivium originally meant three roads, then it meant crossroads, then commonplace (since common people hang around crossroads), and finally trifling or immaterial. The etymology is, in a sense, apt: with the exception of astronomy, none of the liberal arts is about anything. They don't explain plants or animals or rocks or people; rather, they are intellectual tools that can be applied in any realm. Like the students who complain that algebra will never help them in the real world, one can wonder whether these abstract tools are useful enough in nature for natural selection to have inculcated them in the brain. Let's look at a modified trivium: logic, arithmetic, and probability.
“Contrariwise,” continued Tweedledee, “if it was so, it might be, and if it were so, it would be; but as it isn't, it ain't. That's logic!”
Logic, in the technical sense, refers not to rationality in general but to inferring the truth of one statement from the truth of other statements based only on their form, not their content. I am using logic when I reason as follows. P is true, P implies Q, therefore Q is true. P and Q are {334} true, therefore P is true. P or Q is true, P is false, therefore Q is true. P implies Q, Q is false, therefore P is false. I can derive all these truths not knowing whether P means “There is a unicorn in the garden,” “Iowa grows soybeans,” or “My car has been eaten by rats.”
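The claim that these inferences hold whatever P and Q mean can be checked mechanically. The sketch below is only an illustration, not anything from the research described here: it runs through every combination of truth values and confirms that the conclusion holds whenever the premises do. The labels attached to the four forms are the usual textbook names, supplied as an assumption, and the fifth form (P implies Q, Q is true, therefore P) is added as a contrast case that fails.

```python
from itertools import product

def implies(p, q):
    # Material implication: false only when p is true and q is false.
    return (not p) or q

def valid(premises, conclusion):
    """An argument form is valid if the conclusion is true in every
    truth-table row in which all of the premises are true."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )

# The four forms from the text, with their conventional (assumed) labels.
forms = {
    "modus ponens": ([lambda p, q: p, lambda p, q: implies(p, q)],
                     lambda p, q: q),
    "and-elimination": ([lambda p, q: p and q],
                        lambda p, q: p),
    "disjunctive syllogism": ([lambda p, q: p or q, lambda p, q: not p],
                              lambda p, q: q),
    "modus tollens": ([lambda p, q: implies(p, q), lambda p, q: not q],
                      lambda p, q: not p),
}
results = {name: valid(prems, concl) for name, (prems, concl) in forms.items()}

# Contrast case, not in the text: "P implies Q, Q, therefore P" is not valid.
affirming_the_consequent = valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
)

for name, ok in results.items():
    print(f"{name}: {'valid' if ok else 'invalid'}")
print("affirming the consequent:",
      "valid" if affirming_the_consequent else "invalid")
```

Nothing in the check refers to unicorns, soybeans, or rats, which is the point: validity depends on form alone.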
Does the brain do logic? College students’ performance on logic problems is not a pretty sight. There are some archeologists, biologists, and chess players in a room. None of the archeologists are biologists. All of the biologists are chess players. What, if anything, follows? A majority of students conclude that none of the archeologists are chess players, which is not valid. None of them conclude that some of the chess players are not archeologists, which is valid. In fact, a fifth claim that the premises allow no valid inferences.
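One way to see which conclusion follows is to try every possible room. The sketch below is an illustration under stated assumptions: each person is reduced to three yes-or-no properties, and a "room" is any nonempty collection of such person-types. A conclusion follows only if it holds in every room that satisfies the premises. The function and variable names are invented for the example.

```python
from itertools import product

# A person-type is a triple: (archeologist, biologist, chess player).
TYPES = list(product([False, True], repeat=3))

def rooms():
    # Every nonempty set of person-types that could be present in the room.
    for mask in range(1, 2 ** len(TYPES)):
        yield [t for i, t in enumerate(TYPES) if (mask >> i) & 1]

def premises(room):
    # "There are some archeologists, biologists, and chess players."
    some_of_each = all(any(person[k] for person in room) for k in range(3))
    # "None of the archeologists are biologists."
    no_arch_is_bio = not any(a and b for a, b, c in room)
    # "All of the biologists are chess players."
    all_bio_play = all(c for a, b, c in room if b)
    return some_of_each and no_arch_is_bio and all_bio_play

def follows(conclusion):
    # Valid means: true in every room where the premises are true.
    return all(conclusion(room) for room in rooms() if premises(room))

# The popular answer: "None of the archeologists are chess players."
no_arch_plays_chess = follows(lambda room: not any(a and c for a, b, c in room))
# The overlooked answer: "Some of the chess players are not archeologists."
some_player_not_arch = follows(lambda room: any(c and not a for a, b, c in room))

print("none of the archeologists are chess players:", no_arch_plays_chess)
print("some chess players are not archeologists:", some_player_not_arch)
```

The popular answer fails because nothing stops an archeologist from playing chess; the overlooked one holds because the biologists, who must exist, are chess players and are not archeologists.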
Spock always did say that humans are illogical. But as the psychologist John Macnamara has argued, that idea itself is barely logical. The rules of logic were originally seen as a formalization of the laws of thought. That went a bit overboard; logical truths are true regardless of how people think. But it is hard to imagine a species discovering logic if its brain did not give it a feeling of certitude when it found a logical truth. There is something peculiarly compelling, even irresistible, about P, P implies Q, therefore Q. With enough time and patience, we discover why our own logical errors are erroneous. We come to agree with one another on which truths are necessary. And we teach others not by force of authority but socratically, by causing the pupils to recognize truths by their own standards.
People surely do use some kind of logic. All languages have logical terms like not, and, same, equivalent, and opposite. Children use and, not, or, and if appropriately before they turn three, not only in English but in half a dozen other languages that have been studied. Logical inferences are ubiquitous in human thought, particularly when we understand language. Here is a simple example from the psychologist Martin Braine:
John went in for lunch. The menu showed a soup-and-salad special, with free beer or coffee. Also, with the steak you got a free glass of red wine. John chose the soup-and-salad special with coffee, along with something else to drink.
(a) Did John get a free beer? (Yes, No, Can't Tell)
(b) Did John get a free glass of wine? (Yes, No, Can't Tell)
Virtually everyone deduces that the answer to (a) is no. Our knowledge of restaurant menus tells us that the or in free beer or coffee implies “not {335} both” — you get only one of them free; if you want the other, you have to pay for it. Farther along, we learn that John chose coffee. From the premises “not both free beer and free coffee” and “free coffee,” we derive “not free beer” by a logical inference. The answer to question (b) is also no. Our knowledge of restaurants reminds us that food and beverages are not free unless explicitly offered as such by the menu. We therefore add the conditional “if not steak, then no free red wine.” John chose the soup and salad, which suggests he did not choose steak; we conclude, using a logical inference, that he did not get a free glass of wine.
Logic is indispensable in inferring true things about the world from piecemeal facts acquired from other people via language or from one's own generalizations. Why, then, do people seem to flout logic in stories about archeologists, biologists, and chess players?
One reason is that logical words in everyday languages like English are ambiguous, often denoting several formal logical concepts. The English word or can sometimes mean the logical connective OR (A or B or both) and can sometimes mean the logical connective XOR (exclusive or: A or B but not both). The context often makes it clear which one the speaker intended, but in bare puzzles coming out of the blue, readers can make the wrong guess.
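The two readings of or can be laid side by side. This small table is a sketch for illustration only; it shows that the inclusive and exclusive connectives disagree in exactly one case, the one in which both A and B are true.

```python
# Each row: (A, B, inclusive OR, exclusive OR).
rows = [(a, b, a or b, a != b)
        for a in (False, True)
        for b in (False, True)]

# The cases in which the two readings of "or" give different answers.
disagreements = [(a, b) for a, b, inclusive, exclusive in rows
                 if inclusive != exclusive]

for a, b, inclusive, exclusive in rows:
    print(f"A={a!s:5} B={b!s:5} OR={inclusive!s:5} XOR={exclusive!s:5}")
print("readings differ when:", disagreements)
```

A menu offering free beer or coffee invites the exclusive reading; a bare puzzle gives the reader no such cue.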
Another reason is that logical inferences cannot be drawn out willy-nilly. Any true statement can spawn an infinite number of true but useless new ones. From “Iowa grows soybeans,” we can derive “Iowa grows soybeans or the cow jumped over the moon,” “Iowa grows soybeans and either the cow jumped over the moon or it didn't,” ad infinitum. (This is an example of the “frame problem” introduced in Chapter 1.) Unless it has all the time in the world, even the best logical inferencer has to guess which implications to explore and which are likely to be blind alleys. Some rules have to be inhibited, so valid inferences will inevitably be missed. The guessing can't itself come from logic; generally it comes from assuming that the speaker is a cooperative conversational partner conveying relevant information and not, say, a hostile lawyer or a tough-grading logic professor trying to trip one up.
Perhaps the most important impediment is that mental logic is not a hand-held calculator ready to accept any A's and B's and C's as input. It is enmeshed with our system of knowledge about the world. A particular step of mental logic, once set into motion, does not depend on world knowledge, but its inputs and outputs are piped directly into that knowledge. {336} In the restaurant story, for example, the links of inference alternate between knowledge of menus and applications of logic.
Some areas of knowledge have their own inference rules that can either reinforce or work at cross-purposes with the rules of logic. A famous example comes from the psychologist Peter Wason. Wason was inspired by the philosopher Karl Popper's ideal of scientific reasoning: a hypothesis is accepted if attempts to falsify it fail. Wason wanted to see how ordinary people do at falsifying hypotheses. He told them that a set of cards had letters on one side and numbers on the other, and asked them to test the rule “If a card has a D on one side, it has a 3 on the other,” a simple P-implies-Q statement. The subjects were shown four cards and were asked which ones they would have to turn over to see if the rule was true. Try it:
Most people choose either the D card or the D card and the 3 card. The correct answer is D and 7. “P implies Q” is false only if P is true and Q is false. The 3 card is irrelevant; the rule said that D's have 3's, not that 3's have D's. The 7 card is crucial; if it had a D on the other side, the rule would be dead. Only about five to ten percent of the people who are given the test select the right cards. Even people who have taken logic courses get it wrong. (Incidentally, it's not that people interpret “If D then 3” as “If D then 3 and vice versa.” If they did interpret it that way but otherwise behaved like logicians, they would turn over all four cards.) Dire implications were seen. John Q. Public was irrational, unscientific, prone to confirming his prejudices rather than seeking evidence that could falsify them.
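The reasoning behind the correct answer can be spelled out as a search for possible violations: a card needs turning only if something on its hidden side could falsify the rule. The sketch below is an illustration under assumptions. The text names the D, 3, and 7 cards; the fourth card is assumed here to show some other letter, written as F, and each hidden face is assumed to be one of the two letters or two numbers in play.

```python
LETTERS = ["D", "F"]   # "F" is an assumed stand-in for "a letter other than D"
NUMBERS = ["3", "7"]

def breaks_if_then(letter, number):
    # "If D then 3" is violated only by a D paired with a non-3.
    return letter == "D" and number != "3"

def breaks_if_and_only_if(letter, number):
    # "If D then 3 and vice versa" is violated whenever exactly one side matches.
    return (letter == "D") != (number == "3")

def must_turn(visible, breaks):
    # Turn a card only if some possible hidden face would violate the rule.
    if visible in LETTERS:
        return any(breaks(visible, n) for n in NUMBERS)
    return any(breaks(l, visible) for l in LETTERS)

cards = ["D", "F", "3", "7"]
logical_choice = [c for c in cards if must_turn(c, breaks_if_then)]
vice_versa_choice = [c for c in cards if must_turn(c, breaks_if_and_only_if)]

print("if-then reading:", logical_choice)
print("if-and-only-if reading:", vice_versa_choice)
```

Under the plain if-then reading only D and 7 can hide a violation; under the "vice versa" reading every card can, which is why that misreading would lead a logician to turn over all four.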
But when the arid numbers and letters are replaced with real-world events, sometimes — though only sometimes — people turn into logicians. You are a bouncer in a bar, and are enforcing the rule “If a person is drinking beer, he must be eighteen or older.” You may check what people are drinking or how old they are. Which do you have to check: a beer drinker, a Coke drinker, a twenty-five-year-old, a sixteen-year-old? Most people correctly select the beer drinker and the sixteen-year-old. But mere concreteness is not enough. The rule “If a person eats hot chili peppers, then he drinks cold beer” is no easier to falsify than the D's and 3's.
Leda Cosmides discovered that people get the answer right when the {337} rule is a contract, an exchange of benefits. In those circumstances, showing that the rule is false is equivalent to finding cheaters. A contract is an implication of the form “If you take a benefit, you must meet a requirement”; cheaters take the benefit without meeting the requirement. Beer in a bar is a benefit that one earns by proof of maturity, and cheaters are underage drinkers. Beer after chili peppers is mere cause and effect, so Coke drinking (which logically must be checked) doesn't seem relevant. Cosmides showed that people do the logical thing whenever they construe the P's and Q's as benefits and costs, even when the events are exotic, like eating duiker meat and finding ostrich eggshells. It's not that a logic module is being switched on, but that people are using a different set of rules. These rules, appropriate to detecting cheaters, sometimes coincide with logical rules and sometimes don't. When the cost and benefit terms are flipped, as in “If a person pays $20, he receives a watch,” people still choose the cheater card (he receives the watch, he doesn't pay $20) — a choice that is neither logically correct nor the typical error made with meaningless cards. In fact, the very same story can draw out logical or nonlogical choices depending on the reader's interpretation of who, if anyone, is a cheater. “If an employee gets a pension, he has worked for ten years. Who is violating the rule?” If people take the employee's point of view, they seek the twelve-year workers without pensions; if they take the employer's point of view, they seek the eight-year workers who hold them. The basic findings have been replicated among the Shiwiar, a foraging people in Ecuador.
The mind seems to have a cheater-detector with a logic of its own. When standard logic and cheater-detector logic coincide, people act like logicians; when they part company, people still look for cheaters. What gave Cosmides the idea to look for this mental mechanism? It was the evolutionary analysis of altruism (see Chapters 6 and 7). Natural selection does not select public-mindedness; a selfish mutant would quickly outreproduce its altruistic competitors. Any selfless behavior in the natural world needs a special explanation. One explanation is reciprocation: a creature can extend help in return for help expected in the future. But favor-trading is always vulnerable to cheaters. For it to have evolved, it must be accompanied by a cognitive apparatus that remembers who has taken and that ensures that they give in return. The evolutionary biologist Robert Trivers had predicted that humans, the most conspicuous altruists in the animal kingdom, should have evolved a hypertrophied cheater-detector algorithm. Cosmides appears to have found it. {338}
So is the mind logical in the logician's sense? Sometimes yes, sometimes no. A better question is, Is the mind well-designed in the biologist's sense? Here the “yes” can be a bit stronger. Logic by itself can spin off trivial truths and miss consequential ones. The mind does seem to use logical rules, but they are recruited by the processes of language understanding, mixed with world knowledge, and supplemented or superseded by special inference rules appropriate to the content.
Mathematics is part of our birthright. One-week-old babies perk up when a scene changes from two to three items or vice versa. Infants in their first ten months notice how many items (up to four) are in a display, and it doesn't matter whether the items are homogeneous or heterogeneous, bunched together or spread out, dots or household objects, even whether they are objects or sounds. According to recent experiments by the psychologist Karen Wynn, five-month-old infants even do simple arithmetic. They are shown Mickey Mouse, a screen covers him up, and a second Mickey is placed behind it. The babies expect to see two Mickeys when the screen falls and are surprised if it reveals only one. Other babies are shown two Mickeys and one is removed from behind the screen. These babies expect to see one Mickey and are surprised to find two. By eighteen months children know that numbers not only differ but fall into an order; for example, the children can be taught to choose the picture with fewer dots. Some of these abilities are found in, or can be taught to, some kinds of animals.
Can infants and animals really count? The question may sound absurd because these creatures have no words. But registering quantities does not depend on language. Imagine opening a faucet for one second every time you hear a drumbeat. The amount of water in the glass would represent the number of beats. The brain might have a similar mechanism, which would accumulate not water but neural pulses or the number of active neurons. Infants and many animals appear to be equipped with this simple kind of counter. It would have many potential selective advantages, which depend on the animal's niche. They range from estimating the rate of return of foraging in different patches to solving problems such as “Three bears went into the cave; two came out. Should I go in?” {339}
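The faucet-and-glass idea can be mimicked in a few lines. This is a toy sketch, not a model of any actual neural mechanism: the function name, the size of each pulse, and the bear example are assumptions chosen for illustration. The counter never uses a number word; it simply adds a fixed amount to a level for each event, and two levels can then be compared.

```python
def accumulate(events, pulse=1.0):
    """Add one fixed-size pulse to a running level for every event.
    The final level stands in for 'how many', with no number words involved."""
    level = 0.0
    for _ in events:
        level += pulse
    return level

# "Three bears went into the cave; two came out."
went_in = accumulate(["bear", "bear", "bear"])
came_out = accumulate(["bear", "bear"])
still_inside = went_in - came_out

should_enter = still_inside <= 0
print("level left over:", still_inside)
print("safe to go in?", should_enter)
```

Because the level is a continuous quantity, a counter of this kind would also be expected to blur at larger numbers, though the sketch does not attempt to show that.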
Human adults use several mental representations of quantity. One is analogue — a sense of “how much” — which can be translated into mental images such as an image of a number line. But we also assign number words to quantities and use the words and the concepts to measure, to count more accurately, and to count, add, and subtract larger numbers. All cultures have words for numbers, though sometimes only “one,” “two,” and “many.” Before you snicker, remember that the concept of number has nothing to do with the size of a number vocabulary. Whether or not people know words for big numbers (like “four” or “quintillion”), they can know that if two sets are the same, and you add 1 to one of them, that set is now larger. That is true whether the sets have four items or a quintillion items. People know that they can compare the size of two sets by pairing off their members and checking for leftovers; even mathematicians are forced to that technique when they make strange claims about the relative sizes of infinite sets. Cultures without words for big numbers often use tricks like holding up fingers, pointing to parts of the body in sequence, or grabbing or lining up objects in twos and threes.
Children as young as two enjoy counting, lining up sets, and other activities guided by a sense of number. Preschoolers count small sets, even when they have to mix kinds of objects, or have to mix objects, actions, and sounds. Before they really get the hang of counting and measuring, they appreciate much of its logic. For example, they will try to distribute a hot dog equitably by cutting it up and giving everyone two pieces (though the pieces may be of different sizes), and they yell at a counting puppet who misses an item or counts it twice, though their own counting is riddled with the same kinds of errors.
Formal mathematics is an extension of our mathematical intuitions. Arithmetic obviously grew out of our sense of number, and geometry out of our sense of shape and space. The eminent mathematician Saunders Mac Lane speculated that basic human activities were the inspiration for every branch of mathematics:
Counting → arithmetic and number theory
Measuring → real numbers, calculus, analysis
Shaping → geometry, topology
Forming (as in architecture) → symmetry, group theory
Estimating → probability, measure theory, statistics
Moving → mechanics, calculus, dynamics
Calculating → algebra, numerical analysis
Proving → logic
Puzzling → combinatorics, number theory
Grouping → set theory, combinatorics
Mac Lane suggests that “mathematics starts from a variety of human activities, disentangles from them a number of notions which are generic and not arbitrary, then formalizes these notions and their manifold interrelations.” The power of mathematics is that the formal rule systems can then “codify deeper and nonobvious properties of the various originating human activities.” Everyone — even a blind toddler — instinctively knows that the path from A straight ahead to B and then right to C is longer than the shortcut from A to C. Everyone also visualizes how a line can define the edge of a square and how shapes can be abutted to form bigger shapes. But it takes a mathematician to show that the square on the hypotenuse is equal to the sum of the squares on the other two sides, so one can calculate the savings of the shortcut without traversing it.
To say that school mathematics comes out of intuitive mathematics is not to say that it comes out easily. David Geary has suggested that natural selection gave children some basic mathematical abilities: determining the quantity of small sets, understanding relations like “more than” and “less than” and the ordering of small numbers, adding and subtracting small sets, and using number words for simple counting, measurement, and arithmetic. But that's where it stopped. Children, he suggests, are not biologically designed to command large number words, large sets, the base-10 system, fractions, multicolumn addition and subtraction, carrying, borrowing, multiplication, division, radicals, and exponents. These skills develop slowly, unevenly, or not at all.
On evolutionary grounds it would be surprising if children were mentally equipped for school mathematics. These tools were invented recently in history and only in a few cultures, too late and too local to stamp the human genome. The mothers of these inventions were the recording and trading of farming surpluses in the first agricultural civilizations. Thanks to formal schooling and written language (itself a recent, noninstinctive invention), the inventions could accumulate over the millennia, and simple mathematical operations could be assembled into more and more complicated ones. Written symbols could serve as a medium of computation that surmounted the limitations of short-term memory, just as silicon chips do today.
How can people use their Stone Age minds to wield high-tech mathematical instruments? The first way is to set mental modules to work on {341} objects other than the ones they were designed for. Ordinarily, lines and shapes are analyzed by imagery and other components of our spatial sense, and heaps of things are analyzed by our number faculty. But to accomplish Mac Lane's ideal of disentangling the generic from the parochial (for example, disentangling the generic concept of quantity from the parochial concept of the number of rocks in a heap), people might have to apply their sense of number to an entity that, at first, feels like the wrong kind of subject matter. For example, people might have to analyze a line in the sand not by the habitual imagery operations of continuous scanning and shifting, but by counting off imaginary segments from one end to the other.
The second way to get to mathematical competence is similar to the way to get to Carnegie Hall: practice. Mathematical concepts come from snapping together old concepts in a useful new arrangement. But those old concepts are assemblies of still older concepts. Each subassembly hangs together by the mental rivets called chunking and automaticity: with copious practice, concepts adhere into larger concepts, and sequences of steps are compiled into a single step. Just as bicycles are assembled out of frames and wheels, not tubes and spokes, and recipes say how to make sauces, not how to grasp spoons and open jars, mathematics is learned by fitting together overlearned routines. Calculus teachers lament that students find the subject difficult not because derivatives and integrals are abstruse concepts — they're just rate and accumulation — but because you can't do calculus unless algebraic operations are second nature, and most students enter the course without having learned the algebra properly and need to concentrate every drop of mental energy on that. Mathematics is ruthlessly cumulative, all the way back to counting to ten.
Evolutionary psychology has implications for pedagogy which are particularly clear in the teaching of mathematics. American children are among the worst performers in the industrialized world on tests of mathematical achievement. They are not born dunces; the problem is that the educational establishment is ignorant of evolution. The ascendant philosophy of mathematical education in the United States is constructivism, a mixture of Piaget's psychology with counterculture and postmodernist ideology. Children must actively construct mathematical knowledge for themselves in a social enterprise driven by disagreements about the meanings of concepts. The teacher provides the materials and the social milieu but does not lecture or guide the discussion. Drill and {342} practice, the routes to automaticity, are called “mechanistic” and seen as detrimental to understanding. As one pedagogue lucidly explained, “A zone of potential construction of a specific mathematical concept is determined by the modifications of the concept children might make in, or as a result of, interactive communication in the mathematical learning environment.” The result, another declared, is that “it is possible for students to construct for themselves the mathematical practices that, historically, took several thousand years to evolve.”
As Geary points out, constructivism has merit when it comes to the intuitions of small numbers and simple arithmetic that arise naturally in all children. But it ignores the difference between our factory-installed equipment and the accessories that civilization bolts on afterward. Setting our mental modules to work on material they were not designed for is hard. Children do not spontaneously see a string of beads as elements in a set, or points on a line as numbers. If you give them a bunch of blocks and tell them to do something together, they will exercise their intuitive physics and intuitive psychology for all they're worth, but not necessarily their intuitive sense of number. (The better curricula explicitly point out connections across ways of knowing. Children might be told to do every arithmetic problem three different ways: by counting, by drawing diagrams, and by moving segments along a number line.) And without the practice that compiles a halting sequence of steps into a mental reflex, a learner will always be building mathematical structures out of the tiniest nuts and bolts, like the watchmaker who never made subassemblies and had to start from scratch every time he put down a watch to answer the phone.
Mastery of mathematics is deeply satisfying, but it is a reward for hard work that is not itself always pleasurable. Without the esteem for hard-won mathematical skills that is common in other cultures, the mastery is unlikely to blossom. Sadly, the same story is being played out in American reading instruction. In the dominant technique, called “whole language,” the insight that language is a naturally developing human instinct has been garbled into the evolutionarily improbable claim that reading is a naturally developing human instinct. Old-fashioned practice at connecting letters to sounds is replaced by immersion in a text-rich social environment, and the children don't learn to read. Without an understanding of what the mind was designed to do in the environment in which we evolved, the unnatural activity called formal education is unlikely to succeed. {343}
“I shall never believe that God plays dice with the world,” Einstein famously said. Whether or not he was right about quantum mechanics and the cosmos, his statement is certainly not true of the games people play in their daily lives. Life is not chess but backgammon, with a throw of the dice at every turn. As a result, it is hard to make predictions, especially about the future (as Yogi Berra allegedly said). But in a universe with any regularities at all, decisions informed by the past are better than decisions made at random. That has always been true, and we would expect organisms, especially informavores such as humans, to have evolved acute intuitions about probability. The founders of probability theory, like the founders of logic, assumed that they were just formalizing common sense. But then why do people often seem to be “probability-blind,” in the words of Massimo Piattelli-Palmarini? Many mathematicians and scientists have bemoaned the innumeracy of ordinary people when they reason about risk. The psychologists Amos Tversky and Daniel Kahneman have amassed ingenious demonstrations of how people's intuitive grasp of chance appears to flout the elementary canons of probability theory. Here are some famous examples.
• People gamble and buy state lottery tickets, sometimes called “the stupidity tax.” But since the house must profit, the players, on average, must lose.
• People fear planes more than cars, especially after news of a gory plane crash, though plane travel is statistically far safer. They fear nuclear power, though more people are crippled and killed by coal. Every year a thousand Americans are accidentally electrocuted, but rock stars don't campaign to reduce the household voltage. People clamor for bans on pesticide residues and food additives, though they pose trivial risks of cancer compared to the thousands of natural carcinogens that plants have evolved to deter the bugs that eat them.
• People feel that if a roulette wheel has stopped at black six times in a row, it's due to stop at red, though of course the wheel has no memory and every spin is independent. A large industry of self-anointed seers hallucinate trends in the random walk of the stock market. Hoop fans believe that basketball players get a “hot hand,” making baskets in clusters, though their strings of swishes and bricks are indistinguishable from coin flips. {344}
• This problem was given to sixty students and staff members at Harvard Medical School: “If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person's symptoms or signs?” The most popular answer was .95. The average answer was .56. The correct answer is .02, and only eighteen percent of the experts guessed it. The answer, according to Bayes’ theorem, may be calculated as the prevalence or base rate (1/1000) times the test's sensitivity or hit rate (proportion of sick people who test positive, presumably 1), divided by the overall incidence of positive test results (the percentage of the time the test comes out positive, collapsing over sick and healthy people — that is, the sum of the sick people who test positive, 1/1000 × 1, and the healthy people who test positive, 999/1000 × .05). One bugaboo in the problem is that many people misinterpret “false positive rate” as the proportion of positive results that come from healthy people, instead of interpreting it as the proportion of healthy people who test positive. But the biggest problem is that people ignore the base rate (1/1000), which ought to have reminded them that the disease is rare and hence improbable for a given patient even if the test comes out positive. (They apparently commit the fallacy that because zebras make hoofbeats, hoofbeats imply zebras.) Surveys have shown that many doctors needlessly terrify their patients who test positive for a rare disease.
• Try this: “Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations. What is the probability that Linda is a bankteller? What is the probability that Linda is a bankteller and is active in the feminist movement?” People sometimes give a higher estimate to the probability that she is a feminist bankteller than to the probability that she is a bankteller. But it's impossible for “A and B” to be more likely than “A” alone.
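The arithmetic behind the medical-test problem can be checked in a few lines. What follows is only a minimal sketch in Python, not anything from the original study: the function name `posterior` and its parameter names are invented for illustration, and the sensitivity of 1 is the assumption stated in the solution above.

```python
from fractions import Fraction


def posterior(prevalence, sensitivity, false_positive_rate):
    """Chance of disease given a positive test, by Bayes' theorem.

    All names here are illustrative assumptions, not from the study.
    """
    # Sick people who test positive (base rate times hit rate).
    sick_and_positive = prevalence * sensitivity
    # Healthy people who test positive.
    healthy_and_positive = (1 - prevalence) * false_positive_rate
    # Share of all positive results that come from sick people.
    return sick_and_positive / (sick_and_positive + healthy_and_positive)


# Figures from the problem: prevalence 1/1000, false positive rate 5%,
# sensitivity presumed to be 1.
p = posterior(Fraction(1, 1000), Fraction(1), Fraction(5, 100))
print(p)                   # exact fraction
print(round(float(p), 2))  # about 0.02
```

Exact fractions are used instead of floating-point numbers so that the result can be read off as a ratio of people: twenty true positives for every 1,019 positive results.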
When I presented these findings in class, a student cried out, “I'm ashamed for my species!” Many others feel the disgrace, if not about themselves, then about the person in the street. Tversky, Kahneman, Gould, Piattelli-Palmarini, and many social psychologists have concluded that the mind is not designed to grasp the laws of probability, even though the laws rule the universe. The brain can process limited amounts of information, so instead of computing theorems it uses crude {345} rules of thumb. One rule is: the more memorable an event, the more likely it is to happen. (I can remember a recent gory plane crash, therefore planes are unsafe.) Another is: the more an individual resembles a stereotype, the more likely he is to belong to that category. (Linda fits my image of a feminist bankteller better than she fits my image of a bankteller, so she's more likely to be a feminist bankteller.) Popular books with lurid titles have spread the bad news: Irrationality: The Enemy Within; Inevitable Illusions: How Mistakes of Reason Rule Our Minds; How We Know What Isn't So: The Fallibility of Human Reason in Everyday Life. The sad history of human folly and prejudice is explained by our ineptness as intuitive statisticians.
Tversky and Kahneman's demonstrations are among the most thought-provoking in psychology, and the research has drawn attention to the depressingly low intellectual quality of our public discourse about societal and personal risk. But in a probabilistic world, could the human mind really be oblivious to probability? The solutions to the problems that people flub can be computed with a few keystrokes on a cheap calculator. Many animals, even bees, compute accurate probabilities as they forage. Could those computations really exceed the information-processing capacity of the trillion-synapse human brain? It is hard to believe, and one does not have to believe it. People's reasoning is not as stupid as it might first appear.
To begin with, many risky choices are just that, choices, and cannot be gainsaid. Take the gamblers, plane phobics, and chemical avoiders. Are they really irrational? Some people take pleasure in awaiting the outcomes of events that could radically improve their lives. Some people dislike being strapped in a tube and flooded with reminders of a terrifying way to die. Some people dislike eating foods deliberately laced with poison (just as some people might choose not to eat a hamburger fortified with harmless worm meat). There is nothing irrational in any of these choices, any more than in preferring vanilla over chocolate ice cream.
The psychologist Gerd Gigerenzer, along with Cosmides and Tooby, have noted that even when people's judgments of probability depart from the truth, their reasoning may not be illogical. No mental faculty is omniscient. Color vision is fooled by sodium vapor streetlights, but that does not mean it is badly designed. It is demonstrably well designed, far better than any camera at registering constant colors with changing illumination (see Chapter 4). But it owes its success at this unsolvable problem {346} to tacit assumptions about the world. When the assumptions are violated in an artificial world, color vision fails. The same may be true of our probability-estimators.
Take the notorious “gambler's fallacy”: expecting that a run of heads increases the chance of a tail, as if the coin had a memory and a desire to be fair. I remember to my shame an incident during a family vacation when I was a teenager. My father mentioned that we had suffered through several days of rain and were due for good weather, and I corrected him, accusing him of the gambler's fallacy. But long-suffering Dad was right, and his know-it-all son was wrong. Cold fronts aren't raked off the earth at day's end and replaced with new ones the next morning. A cloud cover must have some average size, speed, and direction, and it would not surprise me (now) if a week of clouds really did predict that the trailing edge was near and the sun was about to be unmasked, just as the hundredth railroad car on a passing train portends the caboose with greater likelihood than the third car.
Many events work like that. They have a characteristic life history, a changing probability of occurring over time which statisticians call a hazard function. An astute observer should commit the gambler's fallacy and try to predict the next occurrence of an event from its history so far, a kind of statistics called time-series analysis. There is one exception: devices that are designed to deliver events independently of their history. What kind of device would do that? We call them gambling machines. Their reason for being is to foil an observer who likes to turn patterns into predictions. If our love of patterns were misbegotten because randomness is everywhere, gambling machines should be easy to build and gamblers easy to fool. In fact, roulette wheels, slot machines, even dice, cards, and coins are precision instruments; they are demanding to manufacture and easy to defeat. Card counters who “commit the gambler's fallacy” in blackjack by remembering the dealt cards and betting they won't turn up again soon are the pests of Las Vegas.
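The contrast between an event with a life history and a gambling device can be made concrete. The sketch below, in Python, computes a hazard (the chance that a spell ends at step t, given that it has lasted that long) for a fair coin and for a rainy spell. The rain figures are invented purely for illustration, and the function and variable names are likewise assumptions, not anything from the statistical literature.

```python
from fractions import Fraction


def hazard(survival, t):
    """Chance that a spell ends at step t, given it has lasted at least t steps.

    `survival(t)` must return the probability that the spell lasts
    at least t steps.
    """
    return 1 - survival(t + 1) / survival(t)


def coin_run_survival(t):
    # A run of heads on a fair coin reaches t flips with probability (1/2)^(t-1).
    return Fraction(1, 2) ** (t - 1)


# Hypothetical distribution of rainy-spell lengths in days (made-up numbers).
RAIN_SPELL_PMF = {
    1: Fraction(1, 10),
    2: Fraction(2, 10),
    3: Fraction(4, 10),
    4: Fraction(2, 10),
    5: Fraction(1, 10),
}


def rain_survival(t):
    # Probability that a rainy spell lasts at least t days.
    return sum(p for days, p in RAIN_SPELL_PMF.items() if days >= t)


coin = [hazard(coin_run_survival, t) for t in range(1, 6)]
rain = [hazard(rain_survival, t) for t in range(1, 6)]
print("coin:", [str(h) for h in coin])  # the same value at every step
print("rain:", [str(h) for h in rain])  # rises as the spell drags on
```

Under these assumed numbers the coin's hazard never budges from one half, so its history is useless for prediction, while the rainy spell's hazard climbs with every passing day, so an observer who expects the sun after a long wet stretch is reasoning correctly.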
So in any world but a casino, the gambler's fallacy is rarely a fallacy. Indeed, calling our intuitive predictions fallacious because they fail on gambling devices is backwards. A gambling device is, by definition, a machine designed to defeat our intuitive predictions. It is like calling our hands badly designed because they make it hard to get out of handcuffs. The same is true of the hot-hand illusion and other fallacies among sports fans. If basketball shots were easily predictable, we would no longer call basketball a sport. An efficient stock market is another invention {347} designed to defeat human pattern detection. It is set up to let traders quickly capitalize on, hence nullify, deviations from a random walk.
Other so-called fallacies may also be triggered by evolutionary novelties that trick our probability calculators, rather than arising from crippling design defects. “Probability” has many meanings. One is relative frequency in the long run. “The probability that the penny will land heads is .5” would mean that in a hundred coin flips, fifty will be heads. Another meaning is subjective confidence about the outcome of a single event. In this sense, “the probability that the penny will land heads is .5” would mean that on a scale of 0 to 1, your confidence that the next flip will be heads is halfway between certainty that it will happen and certainty that it won't.
Numbers referring to the probability of a single event, which only make sense as estimates of subjective confidence, are commonplace nowadays: there is a thirty percent chance of rain tomorrow; the Canadiens are favored to beat the Mighty Ducks tonight with odds of five to three. But the mind may have evolved to think of probabilities as relative frequencies in the long run, not as numbers expressing confidence in a single event. The mathematics of probability was invented only in the seventeenth century, and the use of proportions or percentages to express them arose even later. (Percentages came in after the French Revolution with the rest of the metric system and were initially used for interest and tax rates.) Still more modern is the input to the formulas for probability: data gathered by teams, recorded in writing, checked for errors, accumulated in archives, and tallied and scaled to yield numbers. The closest equivalent for our ancestors would have been hearsay of unknown validity, transmitted with coarse labels like probably. Our ancestors’ usable probabilities must have come from their own experience, and that means they were frequencies: over the years, five out of the eight people who came down with a purple rash died the following day.
Gigerenzer, Cosmides, Tooby, and the psychologist Klaus Fiedler noticed that the medical decision problem and the Linda problem ask for single-event probabilities: how likely is it that this patient is sick, how likely is it that Linda is a bankteller. A probability instinct that worked in relative frequencies might find the questions beyond its ken. There's only one Linda, and either she is a bankteller or she isn't. “The probability that she is a bankteller” is uncomputable. So they gave people the vexing problems but stated them in terms of frequencies, not single-event {348} probabilities. One out of a thousand Americans has the disease; fifty out of a thousand healthy people test positive; we assembled a thousand Americans; how many who test positive have the disease? A hundred people fit Linda's description; how many are banktellers; how many are feminist banktellers? Now a majority of people — up to ninety-two percent — behave like good statisticians.
This cognitive therapy has enormous implications. Many men who test positive for HIV (the AIDS virus) assume they are doomed. Some have taken extreme measures, including suicide, despite their surely knowing that most men don't have AIDS (especially men who do not fall into a known risk group) and that no test is perfect. But it is hard for doctors and patients to use that knowledge to calibrate the chance of being infected, even when the probabilities are known. For example, in recent years the prevalence of HIV in German men who do not belong to a risk group is 0.01%, the sensitivity (hit rate) of a typical HIV test is 99.99%, and the false positive rate is perhaps 0.01%. The prospects of a patient who has tested positive do not sound very good. But now imagine that a doctor counseled a patient as follows: “Think of 10,000 heterosexual men like you. We expect one to be infected with the virus, and he will almost certainly test positive. Of the 9,999 men who are not infected, one additional man will test positive. Thus we get two who test positive, but only one of them actually has the virus. All we know at this point is that you have tested positive. So the chance that you actually have the virus is about 50-50.” Gigerenzer has found that when probabilities are presented in this way (as frequencies), people, including specialists, are vastly more accurate at estimating the probability of a disease following a medical test. The same is true for other judgments under uncertainty, such as guilt in a criminal trial.
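The doctor's head count can be reproduced mechanically. The following is a rough sketch in Python under the figures quoted above (prevalence 0.01%, sensitivity 99.99%, false positive rate 0.01%); the function name `natural_frequencies` and its parameters are hypothetical, and rounding to whole men is a simplification that mirrors the counseling script rather than an exact calculation.

```python
from fractions import Fraction


def natural_frequencies(population, prevalence, sensitivity, false_positive_rate):
    """Turn rates into whole-person counts, as in the counseling script.

    Returns (true_positives, false_positives). Counts are rounded to
    the nearest person, which is a deliberate simplification.
    """
    infected = round(population * prevalence)
    true_positives = round(infected * sensitivity)
    healthy = population - infected
    false_positives = round(healthy * false_positive_rate)
    return true_positives, false_positives


tp, fp = natural_frequencies(
    10_000,
    prevalence=Fraction(1, 10_000),          # 0.01%
    sensitivity=Fraction(9_999, 10_000),     # 99.99%
    false_positive_rate=Fraction(1, 10_000)  # 0.01%
)
chance = Fraction(tp, tp + fp)
print(tp, fp, chance)  # one true positive, one false positive, chance 1/2
```

Working in counts of people rather than decimals is the whole point of the frequency format: the final answer is simply the infected men who test positive divided by all the men who test positive.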
Gigerenzer argues that people's intuitive equation of probability with frequency not only makes them calculate like statisticians, it makes them think like statisticians about the concept of probability itself — a surprisingly slippery and paradoxical notion. What does the probability of a single event even mean? Bookmakers are willing to make up inscrutable numbers such as that the odds that Michael Jackson and LaToya Jackson are the same person are 500 to 1, or that the odds that {349} circles in cornfields emanate from Phobos (one of the moons of Mars) are 1,000 to 1. I once saw a tabloid headline announcing that the chances that Mikhail Gorbachev is the Antichrist are one in eight trillion. Are these statements true? False? Approximately true? How could we tell? A colleague tells me that there is a ninety-five percent chance he will show up at my talk. He doesn't come. Was he lying?
You may be thinking: granted, a single-event probability is just subjective confidence, but isn't it rational to calibrate confidence by relative frequency? If everyday people don't do it that way, wouldn't they be irrational? Ah, but the relative frequency of what? To count frequencies you have to decide on a class of events to count up, and a single event belongs to an infinite number of classes. Richard von Mises, a pioneer of probability theory, gives an example.
In a sample of American women between the ages of 35 and 50, 4 out of 100 develop breast cancer within a year. Does Mrs. Smith, a 49-year-old American woman, therefore have a 4% chance of getting breast cancer in the next year? There is no answer. Suppose that in a sample of women between the ages of 45 and 90 — a class to which Mrs. Smith also belongs — 11 out of 100 develop breast cancer in a year. Are Mrs. Smith's chances 4%, or are they 11%? Suppose that her mother had breast cancer, and 22 out of 100 women between 45 and 90 whose mothers had the disease will develop it. Are her chances 4%, 11%, or 22%? She also smokes, lives in California, had two children before the age of 25 and one after 40, is of Greek descent . . . What group should we compare her with to figure out the “true” odds? You might think, the more specific the class, the better — but the more specific the class, the smaller its size and the less reliable the frequency. If there were only two people in the world very much like Mrs. Smith, and one developed breast cancer, would anyone say that Mrs. Smith's chances are 50%? In the limit, the only class that is truly comparable with Mrs. Smith in all her details is the class containing Mrs. Smith herself. But in a class of one, “relative frequency” makes no sense.
These philosophical questions about the meaning of probability are not academic; they affect every decision we make. When a smoker rationalizes that his ninety-year-old parents have been puffing a pack a day for decades, so the nationwide odds don't apply to him, he might very well be right. In the 1996 presidential election, the advanced age of the Republican candidate became an issue. The New Republic published the following letter: {350}
To the Editors:
In your editorial “Is Dole Too Old?” (April 1) your actuarial information was misleading. The average 72-year-old white man may suffer a 27 percent risk of dying within five years, but more than health and gender must be considered. Those still in the work force, as is Senator Bob Dole, have a much greater longevity. In addition, statistics show that greater wealth correlates to a longer life. Taking these characteristics into consideration, the average 73-year-old (the age that Dole would be if he takes office as president) has a 12.7 percent chance of dying within four years.
Yes, and what about the average seventy-three-year-old wealthy working white male who hails from Kansas, doesn't smoke, and was strong enough to survive an artillery shell? An even more dramatic difference surfaced during the murder trial of O.J. Simpson in 1995. The lawyer Alan Dershowitz, who was consulting for the defense, said on television that among men who batter their wives, only one-tenth of one percent go on to murder them. In a letter to Nature, a statistician then pointed out that among men who batter their wives and whose wives are then murdered by someone, more than half are the murderers.
Many probability theorists conclude that the probability of a single event cannot be computed; the whole business is meaningless. Single-event probabilities are “utter nonsense,” said one mathematician. They should be handled “by psychoanalysis, not probability theory,” sniffed another. It's not that people can believe anything they want about a single event. The statements that I am more likely to lose a fight against Mike Tyson than to win one, or that I am not likely to be abducted by aliens tonight, are not meaningless. But they are not mathematical statements that are precisely true or false, and people who question them have not committed an elementary fallacy. Statements about single events can't be decided by a calculator; they have to be hashed out by weighing the evidence, evaluating the persuasiveness of arguments, recasting the statements to make them easier to evaluate, and all the other fallible processes by which mortal beings make inductive guesses about an unknowable future.
So even the ditziest performance in the Homo sapiens hall of shame — saying that Linda is more likely to be a feminist bankteller than a bankteller — is not a fallacy, according to many mathematicians. Since a single-event probability is mathematically meaningless, people are forced to make sense of the question as best they can. Gigerenzer suggests that since frequencies are moot and people don't intuitively {351} give numbers to single events, they may switch to a third, nonmathematical definition of probability, “degree of belief warranted by the information just presented.” That definition is found in many dictionaries and is used in courts of law, where it corresponds to concepts such as probable cause, weight of evidence, and reasonable doubt. If questions about single-event probabilities nudge people into that definition — a natural interpretation for subjects to have made if they assumed, quite reasonably, that the experimenter had included the sketch of Linda for some purpose — they would have interpreted the question as, To what extent does the information given about Linda warrant the conclusion that she is a bankteller? And a reasonable answer is, not very much.
A final mind-bending ingredient of the concept of probability is the belief in a stable world. A probabilistic inference is a prediction today based on frequencies gathered yesterday. But that was then; this is now. How do you know that the world hasn't changed in the interim? Philosophers of probability debate whether any beliefs in probabilities are truly rational in a changing world. Actuaries and insurance companies worry even more — insurance companies go bankrupt when a current event or a change in lifestyles makes their tables obsolete. Social psychologists point to the schlemiel who avoids buying a car with excellent repair statistics after hearing that a neighbor's model broke down yesterday. Gigerenzer offers the comparison of a person who avoids letting his child play in a river with no previous fatalities after hearing that a neighbor's child was attacked there by a crocodile that morning. The difference between the scenarios (aside from the drastic consequences) is that we judge that the car world is stable, so the old statistics apply, but the river world has changed, so the old statistics are moot. The person in the street who gives a recent anecdote greater weight than a ream of statistics is not necessarily being irrational.
Of course, people sometimes reason fallaciously, especially in today's data deluge. And, of course, everyone should learn probability and statistics. But a species that had no instinct for probability could not learn the subject, let alone invent it. And when people are given information in a format that meshes with the way they naturally think about probability, they can be remarkably accurate. The claim that our species is blind to chance is, as they say, unlikely to be true. {352}
We are almost ready to dissolve Wallace's paradox that a forager's mind is capable of calculus. The human mind, we see, is not equipped with an evolutionarily frivolous faculty for doing Western science, mathematics, chess, or other diversions. It is equipped with faculties to master the local environment and outwit its denizens. People form concepts that find the clumps in the correlational texture of the world. They have several ways of knowing, or intuitive theories, adapted to the major kinds of entities in human experience: objects, animate things, natural kinds, artifacts, minds, and the social bonds and forces we will explore in the next two chapters. They wield inferential tools like the elements of logic, arithmetic, and probability. What we now want to know is where these faculties came from and how they can be applied to modern intellectual challenges.
Here is an idea, inspired by a discovery in linguistics. Ray Jackendoff points to sentences like the following:
The messenger went from Paris to Istanbul.
The inheritance finally went to Fred.
The light went from green to red.
The meeting went from 3:00 to 4:00.
The first sentence is straightforward: someone moves from place to place. But in the others, things stay put. Fred could have become a millionaire when the will was read even if no cash changed hands but a bank account was signed over. Traffic signals are set in pavement and don't travel, and meetings aren't even things that could travel. We are using space and motion as a metaphor for more abstract ideas. In the Fred sentence, possessions are objects, owners are places, and giving is moving. For the traffic light, a changeable thing is the object, its states (red and green) are places, and changing is moving. For the meeting, time is a line, the present is a moving point, events are journeys, beginnings and ends are origins and destinations.
The spatial metaphor is found not only in talk about changes but in talk about unchanging states. Belonging, being, and scheduling are construed as if they were landmarks situated at a place: {353}
The messenger is in Istanbul.
The money is Fred's.
The light is red.
The meeting is at 3:00.
The metaphor also works in sentences about causing something to remain in a state:
The gang kept the messenger in Istanbul.
Fred kept the money.
The cop kept the light red.
Emilio kept the meeting on Monday.
Why do we make these analogies? It is not just to co-opt words but to co-opt their inferential machinery. Some deductions that apply to motion and space also apply nicely to possession, circumstances, and time. That allows the deductive machinery for space to be borrowed for reasoning about other subjects. For example, if we know that X went to Y, we can infer that X was not at Y beforehand but is there now. By analogy, if we know that a possession goes to a person, we can infer that the person did not own the possession beforehand but owns it now. The analogy is close, though it is never exact: as a messenger travels he occupies a series of locations between Paris and Istanbul, but as Fred inherits the money it does not gradually come into his possession to varying degrees as the will is being read; the transfer is instantaneous. So the concept of location must not be allowed to merge with the concepts of possession, circumstance, and time, but it can lend them some of its inferential rules. This sharing is what makes the analogies between location and other concepts good for something, and not just resemblances that catch our eye.
The mind couches abstract concepts in concrete terms. It is not only words that are borrowed for metaphors, but entire grammatical constructions. The double-object construction — Minnie sent Mary the marbles — is dedicated to sentences about giving. But the construction can be co-opted for talking about communication:
Minnie told Mary a story.
Alex asked Annie a question.
Ideas are gifts, communication is giving, the speaker is the sender, the audience is the recipient, knowing is having.
Location in space is one of the two fundamental metaphors in language, used for thousands of meanings. The other is force, agency, and causation. Leonard Talmy points out that in each of the following pairs, the two sentences refer to the same event, but the events feel different to us:
The ball was rolling along the grass.
The ball kept on rolling along the grass.
John doesn't go out of the house.
John can't go out of the house.
Larry didn't close the door.
Larry refrained from closing the door.
Shirley is polite to him.
Shirley is civil to him.
Margie's got to go to the park.
Margie gets to go to the park.
The difference is that the second sentence makes us think of an agent exerting force to overcome resistance or overpower some other force. With the second ball-in-the-grass sentence, the force is literally a physical force. But with John, the force is a desire: a desire to go out which has been restrained. Similarly, the second Larry seems to house one psychic force impelling him to close the door and another that overpowers it. For Shirley, those psychodynamics are conveyed by the mere choice of the adjective civil. In the first Margie sentence, she is impelled to the park by an external force in spite of an internal resistance. In the second, she is propelled by an internal force that overcomes an external resistance.
The metaphor of force and resistance is even more explicit in this family of sentences:
Fran forced the door to open.
Fran forced Sally to go.
The very same word, force, is being used literally and metaphorically, with a common thread of meaning that we easily appreciate. Sentences about motion and sentences about desire both allude to a billiard-ball dynamics in which an agonist has an intrinsic tendency to motion or rest, and is opposed by a weaker or stronger antagonist, causing one or both to stop or proceed. It is the impetus theory I discussed earlier in the chapter, the core of people's intuitive theory of physics.
Space and force pervade language. Many cognitive scientists (including me) have concluded from their research on language that a handful of concepts about places, paths, motions, agency, and causation underlie the literal or figurative meanings of tens of thousands of words and constructions, not only in English but in every other language that has been studied. The thought underlying the sentence Minnie gave the house to Mary would be something like “Minnie cause [house go-possessionally from Minnie to Mary].” These concepts and relations appear to be the vocabulary and syntax of mentalese, the language of thought. Because the language of thought is combinatorial, these elementary concepts may be combined into more and more complex ideas. The discovery of portions of the vocabulary and syntax of mentalese is a vindication of Leibniz’ “remarkable thought”: “that a kind of alphabet of human thoughts can be worked out and that everything can be discovered and judged by comparison of the letters of this alphabet and an analysis of the words made from them.” And the discovery that the elements of mentalese are based on places and projectiles has implications for both where the language of thought came from and how we put it to use in modern times.
Other primates may not think about stories, inheritances, meetings, and traffic lights, but they do think about rocks, sticks, and burrows. Evolutionary change often works by copying body parts and tinkering with the copy. For example, insects’ mouth parts are modified legs. A similar process may have given us our language of thought. Suppose ancestral circuits for reasoning about space and force were copied, the copy's connections to the eyes and muscles were severed, and references to the physical world were bleached out. The circuits could serve as a scaffolding whose slots are filled with symbols for more abstract concerns like states, possessions, ideas, and desires. The circuits would retain their {356} computational abilities, continuing to reckon about entities being in one state at a time, shifting from state to state, and overcoming entities with opposite valence. When the new, abstract domain has a logical structure that mirrors objects in motion — a traffic light has one color at a time but flips between them; contested social interactions are determined by the stronger of two wills — the old circuits can do useful inferential work. They divulge their ancestry as space- and force-simulators by the metaphors they invite, a kind of vestigial cognitive organ.
Are there any reasons to believe that this is how our language of thought evolved? A few. Chimpanzees, and presumably their common ancestor with our species, are curious manipulators of objects. When they are trained to use symbols or gestures, they can make them stand for the event of going to a place or putting an object in a location. The psychologist David Premack has shown that chimpanzees can isolate causes. Given a pair of before-and-after pictures, like an apple and a pair of apple halves or a scribbled sheet of paper next to a clean one, they pick out the object that wreaked the change, a knife in the first case and an eraser in the second. So not only do chimpanzees maneuver in the physical world, but they have freestanding thoughts about it. Perhaps the circuitry behind those thoughts was co-opted in our lineage for more abstract kinds of causation.
How do we know that the minds of living human beings really appreciate the parallels between, say, social and physical pressure, or between space and time? How do we know that people aren't just using dead metaphors uncomprehendingly, as when we talk of breakfast without thinking of it as breaking a fast? For one thing, space and force metaphors have been reinvented time and again, in dozens of language families across the globe. Even more suggestive evidence comes from my own main field of research, child language acquisition. The psychologist Melissa Bowerman discovered that preschool children spontaneously coin their own metaphors in which space and motion symbolize possession, circumstance, time, and causation:
You put me just bread and butter.
Mother takes ball away from boy and puts it to girl.
I'm taking these cracks bigger [while shelling a peanut].
I putted part of the sleeve blue so I crossed it out with red [while coloring]. {357}
Can I have any reading behind the dinner?
Today we'll be packing because tomorrow there won't be enough space to pack.
Friday is covering Saturday and Sunday so I can't have Saturday and Sunday if I don't go through Friday.
My dolly is scrunched from someone . . . but not from me.
They had to stop from a red light.
The children could not have inherited the metaphors from earlier speakers; the equation of space with abstract ideas has come naturally to them.
Space and force are so basic to language that they are hardly metaphors at all, at least not in the sense of the literary devices used in poetry and prose. There is no way to talk about possession, circumstance, and time in ordinary conversation without using words like going, keeping, and being at. And the words don't trigger the sense of incongruity that drives a genuine literary metaphor. We all know when we are faced with a figure of speech. As Jackendoff points out, it's natural to say, “Of course, the world isn't really a stage, but if it were, you might say that infancy is the first act.” But it would be bizarre to say, “Of course, meetings aren't really points in motion, but if they were, you might say that this one went from 3:00 to 4:00.” Models of space and force don't act like figures of speech intended to convey new insights; they seem closer to the medium of thought itself. I suspect that parts of our mental equipment for time, animate beings, minds, and social relations were copied and modified in the course of our evolution from the module for intuitive physics that we partly share with chimpanzees.
Metaphors can be built out of metaphors, and we continue to borrow from concrete thoughts when we stretch our ideas and words to encompass new domains. Somewhere between the basic constructions for space and time in English and the glories of Shakespeare there is a vast inventory of everyday metaphors that express the bulk of our experience. George Lakoff and the linguist Mark Johnson have assembled a list of the “metaphors we live by” — mental equations that embrace dozens of expressions:
ARGUMENT IS WAR:
Your claims are indefensible.
He attacked every weak point in my argument. {358}
Her criticisms were right on target.
I've never won an argument with him.
VIRTUE IS UP:
He is high-minded.
She is an upstanding citizen.
That was a low trick.
Don't be underhanded.
I wouldn't stoop to that; it is beneath me.
LOVE IS A PATIENT:
This is a sick relationship.
They have a healthy marriage.
This marriage is dead — it can't be revived.
It's a tired affair.
IDEAS ARE FOOD:
What he said left a bad taste in my mouth.
All this paper has are half-baked ideas and warmed-over theories.
I can't swallow that claim.
That's food for thought.
Once you begin to notice this pedestrian poetry, you find it everywhere. Ideas are not only food but buildings, people, plants, products, commodities, money, tools, and fashions. Love is a force, madness, magic, and war. The visual field is a container, self-esteem is a brittle object, time is money, life is a game of chance.
The ubiquity of metaphor brings us closer to a resolution to Wallace's paradox. The answer to the question “Why is the human mind adapted to think about arbitrary abstract entities?” is that it really isn't. Unlike computers and the rules of mathematical logic, we don't think in F's and x's and y's. We have inherited a pad of forms that capture the key features of encounters among objects and forces, and the features of other consequential themes of the human condition such as fighting, food, and health. By erasing the contents and filling in the blanks with new symbols, we can adapt our inherited forms to more abstruse domains. Some of these revisions may have taken place in our evolution, giving us basic {359} mental categories like ownership, time, and will out of forms originally designed for intuitive physics. Other revisions take place as we live our lives and grapple with new realms of knowledge.
Even the most recondite scientific reasoning is an assembly of down-home mental metaphors. We pry our faculties loose from the domains they were designed to work in, and use their machinery to make sense of new domains that abstractly resemble the old ones. The metaphors we think in are lifted not only from basic scenarios like moving and bumping but from entire ways of knowing. To do academic biology, we take our way of understanding artifacts and apply it to organisms. To do chemistry, we treat the essence of a natural kind as a collection of tiny, bouncy, sticky objects. To do psychology, we treat the mind as a natural kind.
Mathematical reasoning both takes from and gives to the other parts of the mind. Thanks to graphs, we primates grasp mathematics with our eyes and our mind's eye. Functions are shapes (linear, flat, steep, crossing, smooth), and operating is doodling in mental imagery (rotating, extrapolating, filling, tracing). In return, mathematical thinking offers new ways to understand the world. Galileo wrote that “the book of nature is written in the language of mathematics; without its help it is impossible to comprehend a single word of it.”
Galileo's dictum applies not only to equation-filled blackboards in the physics department but to elementary truths we take for granted. The psychologists Carol Smith and Susan Carey have found that children have odd beliefs about matter. Children know that a heap of rice weighs something but claim that a grain of rice weighs nothing. When asked to imagine cutting a piece of steel in half repeatedly, they say that one will finally arrive at a piece so small that it no longer takes up space or has any steel inside it. They are not of unsound mind. Every physical event has a threshold below which no person or device can detect it. Repeated division of an object results in objects too small to detect; a collection of objects each of which falls below the threshold may be detectable en masse. Smith and Carey note that we find children's beliefs silly because we can construe matter using our concept of number. Only in the realm of mathematics does repeated division of a positive quantity always yield a positive quantity, and repeated addition of zero always yields zero. Our understanding of the physical world is more sophisticated than children's because we have merged our intuitions about objects with our intuitions about number.
So vision was co-opted for mathematical thinking, which helps us see {360} the world. Educated understanding is an enormous contraption of parts within parts. Each part is built out of basic mental models or ways of knowing that are copied, bleached of their original content, connected to other models, and packaged into larger parts, which can be packaged into still larger parts without limit. Because human thoughts are combinatorial (simple parts combine) and recursive (parts can be embedded within parts), breathtaking expanses of knowledge can be explored with a finite inventory of mental tools.
And what about the genius? How can natural selection explain a Shakespeare, a Mozart, an Einstein, an Abdul-Jabbar? How would Jane Austen, Vincent van Gogh, or Thelonious Monk have earned their keep on the Pleistocene savanna?
All of us are creative. Every time we stick a handy object under the leg of a wobbly table or think up a new way to bribe a child into his pajamas, we have used our faculties to create a novel outcome. But creative geniuses are distinguished not just by their extraordinary works but by their extraordinary way of working; they are not supposed to think like you and me. They burst on the scene as prodigies, enfants terribles, young turks. They listen to their muse and defy the conventional wisdom. They work when the inspiration hits, and leap with insight while the rest of us plod in baby steps along well-worn paths. They put a problem aside and let it incubate in the unconscious; then, without warning, a bulb lights up and a fully formed solution presents itself. Aha! The genius leaves us with masterpieces, a legacy of the unrepressed creativity of the unconscious. Woody Allen captures the image in his hypothetical letters from Vincent van Gogh in the story “If the Impressionists Had Been Dentists.” Vincent writes to his brother in anguish and despair, “Mrs. Sol Schwimmer is suing me because I made her bridge as I felt it and not to fit her ridiculous mouth! That's right! I can't work to order like a common tradesman! I decided her bridge should be enormous and billowing, with wild, explosive teeth flaring up in every direction like fire! Now she is upset because it won't fit in her mouth! ... I tried forcing the false plate in but it sticks out like a star burst chandelier. Still, I find it beautiful.” {361}
The image came out of the Romantic movement two hundred years ago and is now firmly entrenched. Creativity consultants take millions of dollars from corporations for Dilbertesque workshops on brainstorming, lateral thinking, and flow from the right side of the brain, guaranteed to turn every manager into an Edison. Elaborate theories have been built to explain the uncanny problem-solving power of the dreamy unconscious. Like Alfred Russel Wallace, some have concluded that there can be no natural explanation. Mozart's manuscripts were said to have no corrections. The pieces must have come from the mind of God, who had chosen to express his voice through Mozart.
Unfortunately, creative people are at their most creative when writing their autobiographies. Historians have scrutinized their diaries, notebooks, manuscripts, and correspondence looking for signs of the temperamental seer periodically struck by bolts from the unconscious. Alas, they have found that the creative genius is more Salieri than Amadeus.
Geniuses are wonks. The typical genius pays dues for at least ten years before contributing anything of lasting value. (Mozart composed symphonies at eight, but they weren't very good; his first masterwork came in the twelfth year of his career.) During the apprenticeship, geniuses immerse themselves in their genre. They absorb tens of thousands of problems and solutions, so no challenge is completely new and they can draw on a vast repertoire of motifs and strategies. They keep an eye on the competition and a finger to the wind, and are either discriminating or lucky in their choice of problems. (The unlucky ones, however talented, aren't remembered as geniuses.) They are mindful of the esteem of others and of their place in history. (The physicist Richard Feynman wrote two books describing how brilliant, irreverent, and admired he was and called one of them What Do You Care What Other People Think?) They work day and night, and leave us with many works of subgenius. (Wallace spent the end of his career trying to communicate with the dead.) Their interludes away from a problem are helpful not because it ferments in the unconscious but because they are exhausted and need the rest (and possibly so they can forget blind alleys). They do not repress a problem but engage in “creative worrying,” and the epiphany is not a masterstroke but a tweaking of an earlier attempt. They revise endlessly, gradually closing in on their ideal.
Geniuses, of course, may also have been dealt a genetic hand with {362} four aces. But they are not freaks with minds utterly unlike ours or unlike anything we can imagine evolving in a species that has always lived by its wits. The genius creates good ideas because we all create good ideas; that is what our combinatorial, adapted minds are for.
<< | {363} | >> |
On March 13, 1996, Thomas Hamilton walked into an elementary school in Dunblane, Scotland, carrying two revolvers and two semiautomatic pistols. After wounding staff members who tried to tackle him, he ran to the gymnasium, where a kindergarten class was playing. There he shot twenty-eight children, sixteen fatally, and killed their teacher before turning the gun on himself. “Evil visited us yesterday, and we don't know why,” said the school's headmaster the next day. “We don't understand it and I don't think we ever will.”
We probably never will understand what made Hamilton commit his vile final acts. But the report of pointless revenge by an embittered loner is disturbingly familiar. Hamilton was a suspected pedophile who had been forced to resign as a Scout leader and then formed his own youth groups so he could continue working with boys. One group held its meetings in the Dunblane school's gymnasium until school officials, responding to parents’ complaints about his odd behavior, forced him out. Hamilton was the target of ridicule and gossip, and was known in the area, undoubtedly for good reasons, as “Mr. Creepy.” Days before his rampage he had sent letters to the media and to Queen Elizabeth defending his reputation and pleading for reinstatement in the scouting movement.
The Dunblane tragedy was particularly shocking because no one thought it could happen there. Dunblane is an idyllic, close-knit village where serious crime was unknown. It is far from America, land of the wackos, where there are as many guns as people and where murderous rampages by disgruntled postal workers are so common (a dozen {364} incidents in a dozen years) that a slang term for losing one's temper is “going postal.” But running amok is not unique to America, to Western nations, or even to modern societies. Amok is a Malay word for the homicidal sprees occasionally undertaken by lonely Indochinese men who have suffered a loss of love, a loss of money, or a loss of face. The syndrome has been described in a culture even more remote from the West: the stone-age foragers of Papua New Guinea.
The amok man is patently out of his mind, an automaton oblivious to his surroundings and unreachable by appeals or threats. But his rampage is preceded by lengthy brooding over failure, and is carefully planned as a means of deliverance from an unbearable situation. The amok state is chillingly cognitive. It is triggered not by a stimulus, not by a tumor, not by a random spurt of brain chemicals, but by an idea. The idea is so standard that the following summary of the amok mind-set, composed in 1968 by a psychiatrist who had interviewed seven hospitalized amoks in Papua New Guinea, is an apt description of the thoughts of mass murderers continents and decades away:
I am not an important or “big man.” I possess only my personal sense of dignity. My life has been reduced to nothing by an intolerable insult. Therefore, I have nothing to lose except my life, which is nothing, so I trade my life for yours, as your life is favoured. The exchange is in my favour, so I shall not only kill you, but I shall kill many of you, and at the same time rehabilitate myself in the eyes of the group of which I am a member, even though I might be killed in the process.
The amok syndrome is an extreme instance of the puzzle of the human emotions. Exotic at first glance, upon scrutiny they turn out to be universal; quintessentially irrational, they are tightly interwoven with abstract thought and have a cold logic of their own.
A familiar tactic for flaunting one's worldliness is to inform listeners that some culture lacks an emotion we have or has an emotion we lack. Allegedly the Utku-Inuit Eskimos have no word for anger and do not feel the emotion. Tahitians supposedly do not recognize guilt, sadness, longing, or loneliness; they describe what we would call grief as fatigue, sickness, or {365} bodily distress. Spartan mothers were said to smile upon hearing that their sons died in combat. In Latin cultures, machismo reigns, whereas the Japanese are driven by a fear of shaming the family. In interviews on language I have been asked, Who but the Jews would have a word, naches, for luminous pride in a child's accomplishments? And does it not say something profound about the Teutonic psyche that the German language has the word Schadenfreude, pleasure in another's misfortunes?
Cultures surely differ in how often their members express, talk about, and act on various emotions. But that says nothing about what their people feel. The evidence suggests that the emotions of all normal members of our species are played on the same keyboard.
The most accessible signs of emotions are candid facial expressions. In preparing The Expression of the Emotions in Man and Animals, Darwin circulated a questionnaire to people who interacted with aboriginal populations on five continents, including populations that had had little contact with Europeans. Urging them to answer in detail and from observation rather than memory, Darwin asked how the natives expressed astonishment, shame, indignation, concentration, grief, good spirits, contempt, obstinacy, disgust, fear, resignation, sulkiness, guilt, slyness, jealousy, and “yes” and “no.” For example:
(5.) When in low spirits, are the corners of the mouth depressed, and the inner corner of the eyebrows raised by that muscle which the French call the “Grief muscle”? The eyebrow in this state becomes slightly oblique, with a little swelling at the inner end; and the forehead is transversely wrinkled in the middle part, but not across the whole breadth, as when the eyebrows are raised in surprise.
Darwin summed up the responses: “The same state of mind is expressed throughout the world with remarkable uniformity; and this fact is in itself interesting as evidence of the close similarity in bodily structure and mental disposition of all the races of mankind.”
Though Darwin may have biased his informants with leading questions, contemporary research has borne out his conclusion. When the psychologist Paul Ekman began to study emotions in the 1960s, facial expressions were thought to be arbitrary signs that the infant learns when its random grimaces are rewarded and punished. If expressions appeared universal, it was thought, that was because Western models had become universal; no culture was beyond the reach of John Wayne and Charlie Chaplin. Ekman assembled photographs of people expressing {366} six emotions. He showed them to people from many cultures, including the isolated Fore foragers of Papua New Guinea, and asked them to label the emotion or make up a story about what the person had gone through. Everyone recognized happiness, sadness, anger, fear, disgust, and surprise. For example, a Fore subject said that the American showing fear in the photograph must have just seen a boar. Reversing the procedure, Ekman photographed his Fore informants as they acted out scenarios such as “Your friend has come and you are happy,” “Your child has died,” “You are angry and about to fight,” and “You see a dead pig that has been lying there for a long time.” The expressions in the photographs are unmistakable.
When Ekman began to present his findings at a meeting of anthropologists in the late 1960s, he met with outrage. One prominent anthropologist rose from the audience shouting that Ekman should not be allowed to continue to speak because his claims were fascist. On another occasion an African American activist called him a racist for saying that black facial expressions were no different from white ones. Ekman was bewildered because he had thought that if the work had any political moral it was unity and brotherhood. In any case, the conclusions have been replicated and are now widely accepted in some form (though there are controversies over which expressions belong on the universal list, how much context is needed to interpret them, and how reflexively they are tied to each emotion). And another observation by Darwin has been corroborated: children who are blind and deaf from birth display virtually the full gamut of emotions on their faces.
Why, then, do so many people think that emotions differ from culture to culture? Their evidence is much more indirect than Darwin's informants and Ekman's experiments. It comes from two sources that cannot be trusted at all as readouts of people's minds: their language and their opinions.
The common remark that a language does or doesn't have a word for an emotion means little. In The Language Instinct I argued that the influence of language on thought has been exaggerated, and that is all the more true for the influence of language on feeling. Whether a language appears to have a word for an emotion depends on the skill of the translator and on quirks of the language's grammar and history. A language accumulates a large vocabulary, including words for emotions, when it has had influential wordsmiths, contact with other languages, rules for forming new words out of old ones, and widespread literacy, which {367} allows new coinages to become epidemic. When a language has not had these stimulants, people describe how they feel with circumlocutions, metaphors, metonyms, and synecdoches. When a Tahitian woman says, “My husband died and I feel sick,” her emotional state is hardly mysterious; we can bet she is not complaining about acid indigestion. Even a language with a copious vocabulary has words for only a fraction of emotional experience. The author G. K. Chesterton wrote,
Man knows that there are in the soul tints more bewildering, more numberless, and more nameless than the colours of an autumn forest; . . . Yet he seriously believes that these things can every one of them, in all their tones and semitones, in all their blends and unions, be accurately represented by an arbitrary system of grunts and squeals. He believes that an ordinary civilized stockbroker can really produce out of his own inside noises which denote all the mysteries of memory and all the agonies of desire.
When English-speakers hear the word Schadenfreude for the first time, their reaction is not, “Let me see . . . Pleasure in another's misfortunes . . . What could that possibly be? I cannot grasp the concept; my language and culture have not provided me with such a category.” Their reaction is, “You mean there's a word for it? Cool!” That is surely what went through the minds of the writers who introduced Schadenfreude into written English a century ago. New emotion words catch on quickly, without tortuous definitions; they come from other languages (ennui, angst, naches, amok), from subcultures such as those of musicians and drug addicts (blues, funk, juiced, wasted, rush, high, freaked out), and from general slang (pissed, bummed, grossed out, blown away). I have never heard a foreign emotion word whose meaning was not instantly recognizable.
People's emotions are so alike that it takes a philosopher to craft a genuinely alien one. In an essay called “Mad Pain and Martian Pain,” David Lewis defines mad pain as follows:
There might be a strange man who sometimes feels pain, just as we do, but whose pain differs greatly from ours in its causes and effects. Our pain is typically caused by cuts, burns, pressure, and the like; his is caused by moderate exercise on an empty stomach. Our pain is generally distracting; his turns his mind to mathematics, facilitating concentration on that but distracting him from anything else. Intense pain has no {368} tendency whatever to cause him to groan or writhe, but does cause him to cross his legs and snap his fingers. He is not in the least motivated to prevent pain or to get rid of it.
Have anthropologists discovered a people that feels mad pain or something equally weird? It might seem that way if you look only at stimulus and response. The anthropologist Richard Shweder points out, “It is a trivial exercise for any anthropologist to generate long lists of antecedent events (ingesting cow urine, eating chicken five days after your father dies, kissing the genitals of an infant boy, being complimented about your pregnancy, caning a child, touching someone's foot or shoulder, being addressed by your first name by your wife, ad infinitum) about which the emotional judgments of a Western observer would not correspond to the native's evaluative response.” True enough, but if you look a bit deeper and ask how people categorize these stimuli, the emotions elicited by the categories make you feel at home. To us, cow urine is a contaminant and cow mammary secretions are a nutrient; in another culture, the categories may be reversed, but we all feel disgust for contaminants. To us, being addressed by your first name by a spouse is not disrespectful, but being addressed by your first name by a stranger might be, and being addressed by your religion by your spouse might be, too. In all the cases, disrespect triggers anger.
But what about the claims of native informants that they just don't have one of our emotions? Do our emotions seem like mad pain to them? Probably not. The Utku-Inuits’ claim that they do not feel anger is belied by their behavior: they recognize anger in foreigners, beat their dogs to discipline them, squeeze their children painfully hard, and occasionally get “heated up.” Margaret Mead disseminated the incredible claim that Samoans have no passions — no anger between parents and children or between a cuckold and a seducer, no revenge, no lasting love or bereavement, no maternal caring, no tension about sex, no adolescent turmoil. Derek Freeman and other anthropologists found that Samoan society in fact had widespread adolescent resentment and delinquency, a cult of virginity, frequent rape, reprisals by the rape victim's family, frigidity, harsh punishment of children, sexual jealousy, and strong religious feeling.
We should not be surprised at these discrepancies. The anthropologist Renato Rosaldo has noted, “A traditional anthropological description is like a book of etiquette. What you get isn't so much the deep cultural wisdom as the cultural cliches, the wisdom of Polonius, conventions in {369} the trivial rather than the informing sense. It may tell you the official rules, but it won't tell you how life is lived.” Emotions, in particular, are often regulated by the official rules, because they are assertions of a person's interests. To me it's a confession of my innermost feelings, but to you it's bitching and moaning, and you may very well tell me to put a lid on it. And to those in power, other people's emotions are even more annoying — they lead to nuisances such as women wanting men as husbands and sons rather than as cannon fodder, men fighting each other when they could be fighting the enemy, and children falling in love with a soulmate instead of accepting a betrothed who cements an important deal. Many societies deal with these nuisances by trying to regulate emotions and spreading the disinformation that they don't exist.
Ekman has shown that cultures differ the most in how the emotions are expressed in public. He secretly filmed the expressions of American and Japanese students as they watched gruesome footage of a primitive puberty rite. (Emotion researchers have extensive collections of gross-out material.) If a white-coated experimenter was in the room interviewing them, the Japanese students smiled politely during scenes that made the Americans recoil in horror. But when the subjects were alone, the Japanese and American faces were equally horrified.
The Romantic movement in philosophy, literature, and art began about two hundred years ago, and since then the emotions and the intellect have been assigned to different realms. The emotions come from nature and live in the body. They are hot, irrational impulses and intuitions, which follow the imperatives of biology. The intellect comes from civilization and lives in the mind. It is a cool deliberator that follows the interests of self and society by keeping the emotions in check. Romantics believe that the emotions are the source of wisdom, innocence, authenticity, and creativity, and should not be repressed by individuals or society. Often Romantics acknowledge a dark side, the price we must pay for artistic greatness. When the antihero in Anthony Burgess' A Clockwork Orange has his violent impulses conditioned out of him, he loses his taste for Beethoven. Romanticism dominates contemporary American popular culture, as in the Dionysian ethos of rock music, the {370} pop psychology imperative to get in touch with your feelings, and the Hollywood formulas about wise simpletons and about uptight yuppies taking a walk on the wild side.
Most scientists tacitly accept the premises of Romanticism even when they disagree with its morals. The irrational emotions and the repressing intellect keep reappearing in scientific guises: the id and the superego, biological drives and cultural norms, the right hemisphere and the left hemisphere, the limbic system and the cerebral cortex, the evolutionary baggage of our animal ancestors and the general intelligence that propelled us to civilization.
In this chapter I present a distinctly unromantic theory of the emotions. It combines the computational theory of mind, which says that the lifeblood of the psyche is information rather than energy, with the modern theory of evolution, which calls for reverse-engineering the complex design of biological systems. I will show that the emotions are adaptations, well-engineered software modules that work in harmony with the intellect and are indispensable to the functioning of the whole mind. The problem with the emotions is not that they are untamed forces or vestiges of our animal past; it is that they were designed to propagate copies of the genes that built them rather than to promote happiness, wisdom, or moral values. We often call an act “emotional” when it is harmful to the social group, damaging to the actor's happiness in the long run, uncontrollable and impervious to persuasion, or a product of self-delusion. Sad to say, these outcomes are not malfunctions but precisely what we would expect from well-engineered emotions.
The emotions are another part of the mind that has been prematurely written off as nonadaptive baggage. The neuroscientist Paul MacLean took the Romantic doctrine of the emotions and translated it into a famous but incorrect theory known as the Triune Brain. He described the human cerebrum as an evolutionary palimpsest of three layers. At the bottom are the basal ganglia or Reptilian Brain, the seat of the primitive and selfish emotions driving the “Four Fs”: feeding, fighting, fleeing, and sexual behavior. Grafted onto it is the limbic system or Primitive Mammalian Brain, which is dedicated to the kinder, gentler, social emotions, like those behind parenting. Wrapped around that is {371} the Modern Mammalian Brain, the neocortex that grew wild in human evolution and that houses the intellect. The belief that the emotions are animal legacies is also familiar from pop ethology documentaries in which snarling baboons segue into rioting soccer hooligans as the voice-over frets about whether we will rise above our instincts and stave off nuclear doom.
One problem for the triune theory is that the forces of evolution do not just heap layers on an unchanged foundation. Natural selection has to work with what is already around, but it can modify what it finds. Most parts of the human body came from ancient mammals and before them ancient reptiles, but the parts were heavily modified to fit features of the human lifestyle, such as upright posture. Though our bodies carry vestiges of the past, they have few parts that were unmodifiable and adapted only to the needs of older species. Even the appendix is currently put to use, by the immune system. The circuitry for the emotions was not left untouched, either.
Admittedly, some traits are so much a part of the architectural plan of an organism that selection is powerless to tinker with them. Might the software for the emotions be burned so deeply into the brain that organisms are condemned to feel as their remote ancestors did? The evidence says no; the emotions are easy to reprogram. Emotional repertoires vary wildly among animals depending on their species, sex, and age. Within the mammals, we find the lion and the lamb. Even within dogs (a single species), a few millennia of selective breeding have given us pit bulls and Saint Bernards. The genus closest to ours embraces common chimpanzees, in which gangs of males massacre rival gangs and females can murder one another's babies, and the pygmy chimpanzees (bonobos), whose philosophy is “Make love not war.” Of course, some reactions are widely shared across species — say, panic when one is confined — but the reactions may have been retained because they are adaptive for everyone. Natural selection may not have had complete freedom to reprogram the emotions, but it had a lot.
And the human cerebral cortex does not ride piggyback on an ancient limbic system, or serve as the terminus of a processing stream beginning there. The systems work in tandem, integrated by many two-way connections. The amygdala, an almond-shaped organ buried in each temporal lobe, houses the main circuits that color our experience with emotions. It receives not just simple signals (such as of loud noises) from the lower stations of the brain, but abstract, complex information from the brain's {372} highest centers. The amygdala in turn sends signals to virtually every other part of the brain, including the decision-making circuitry of the frontal lobes.
The anatomy mirrors the psychology. Emotion is not just running away from a bear. It can be set off by the most sophisticated information processing the mind is capable of, such as reading a Dear John letter or coming home to find an ambulance in the driveway. And the emotions help to connive intricate plots for escape, revenge, ambition, and courtship. As Samuel Johnson wrote, “Depend upon it, sir, when a man knows he is to be hanged in a fortnight, it concentrates his mind wonderfully.”
The first step in reverse-engineering the emotions is to try to imagine what a mind would be like without them. Supposedly Mr. Spock, the Vulcan mastermind, didn't have emotions (except for occasional intrusions from his human side and a seven-year itch that drove him back to Vulcan to spawn). But Spock's emotionlessness really just amounted to his being in control, not losing his head, coolly voicing unpleasant truths, and so on. He must have been driven by some motives or goals. Something must have kept Spock from spending his days calculating pi to a quadrillion digits or memorizing the Manhattan telephone directory. Something must have impelled him to explore strange new worlds, to seek out new civilizations, and to boldly go where no man had gone before. Presumably it was intellectual curiosity, a drive to set and solve problems, and solidarity with allies — emotions all. And what would Spock have done when faced with a predator or an invading Klingon? Do a headstand? Prove the four-color map theorem? Presumably a part of his brain quickly mobilized his faculties to scope out how to flee and to take steps to avoid the vulnerable predicament in the future. That is, he had fear. Spock may not have been impulsive or demonstrative, but he must have had drives that impelled him to deploy his intellect in pursuit of certain goals rather than others.
A conventional computer program is a list of instructions that the machine executes until it reaches STOP. But the intelligence of aliens, robots, and animals needs a more flexible method of control. Recall that intelligence is the pursuit of goals in the face of obstacles. Without goals, the very concept of intelligence is meaningless. To get into my locked {373} apartment, I can force open a window, call the landlord, or try to reach the latch through the mail slot. Each of these goals is attained by a chain of subgoals. My fingers won't reach the latch, so the subgoal is to find pliers. But my pliers are inside, so I set up a sub-subgoal of finding a store and buying new pliers. And so on. Most artificial intelligence systems are built around means and ends, like the production system in Chapter 2 with its stack of goal symbols displayed on a bulletin board and the software demons that respond to them.
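The locked-apartment example can be written out as a toy program. What follows is only a minimal sketch in Python, not the production system of Chapter 2: the goal names, the PRECONDITIONS table, and the achieve function are all hypothetical, invented here for illustration. It keeps the unfinished goals on a stack and works on whichever one is on top, pushing a subgoal whenever the current goal cannot yet be carried out.

```python
# Hypothetical sketch of means-ends control with a goal stack.
# The goal names and this table are illustrative assumptions, not taken
# from any real system: each goal maps to the subgoals it depends on.
PRECONDITIONS = {
    "enter apartment": ["reach latch through mail slot"],
    "reach latch through mail slot": ["have pliers"],
    "have pliers": ["find a store", "buy pliers"],
    "find a store": [],
    "buy pliers": [],
}


def achieve(top_goal, preconditions):
    """Return the goals in the order they are completed.

    Assumes the table has no circular dependencies.
    """
    stack = [top_goal]  # unfinished goals; the current one is on top
    done = []           # goals already achieved, in order
    while stack:
        goal = stack[-1]
        pending = [g for g in preconditions.get(goal, []) if g not in done]
        if pending:
            # The goal is blocked: push its first unmet subgoal.
            stack.append(pending[0])
        else:
            # Nothing stands in the way: carry it out and pop it.
            done.append(stack.pop())
    return done


if __name__ == "__main__":
    for step in achieve("enter apartment", PRECONDITIONS):
        print(step)
```

The deepest subgoals are finished first and the topmost goal last, which is the sense in which each goal is attained by a chain of subgoals.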
But where does the topmost goal, the one that the rest of the program tries to attain, come from? For artificial intelligence systems, it comes from the programmer. The programmer designs it to diagnose soybean diseases or predict the next day's Dow Jones Industrial Average. For organisms, it comes from natural selection. The brain strives to put its owner in circumstances like those that caused its ancestors to reproduce. (The brain's goal is not reproduction itself; animals don't know the facts of life, and people who do know them are happy to subvert them, such as when they use contraception.) The goals installed in Homo sapiens, that problem-solving, social species, are not just the Four Fs. High on the list are understanding the environment and securing the cooperation of others.
And here is the key to why we have emotions. An animal cannot pursue all its goals at once. If an animal is both hungry and thirsty, it should not stand halfway between a berry bush and a lake, as in the fable about the indecisive ass who starved between two haystacks. Nor should it nibble a berry, walk over and take a sip from the lake, walk back to nibble another berry, and so on. The animal must commit its body to one goal at a time, and the goals have to be matched with the best moments for achieving them. Ecclesiastes says that to every thing there is a season, and a time to every purpose under heaven: a time to weep, and a time to laugh; a time to love, and a time to hate. Different goals are appropriate when a lion has you in its sights, when your child shows up in tears, or when a rival calls you an idiot in public.
The emotions are mechanisms that set the brain's highest-level goals. Once triggered by a propitious moment, an emotion triggers the cascade of subgoals and sub-subgoals that we call thinking and acting. Because the goals and means are woven into a multiply nested control structure of subgoals within subgoals within subgoals, no sharp line divides thinking from feeling, nor does thinking inevitably precede feeling or vice versa (notwithstanding the century of debate within psychology over {374} which comes first). For example, fear is triggered by a signal of impending harm like a predator, a clifftop, or a spoken threat. It lights up the short-term goal of fleeing, subduing, or deflecting the danger, and gives the goal high priority, which we experience as a sense of urgency. It also lights up the longer-term goals of avoiding the hazard in the future and remembering how we got out of it this time, triggered by the state we experience as relief. Most artificial intelligence researchers believe that freely behaving robots (as opposed to the ones bolted to the side of an assembly line) will have to be programmed with something like emotions merely for them to know at every moment what to do next. (Whether the robots would be sentient of these emotions is another question, as we saw in Chapter 2.)
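To make the idea concrete, here is a minimal sketch in Python of how a robot designer might let something like emotions choose the topmost goal. Everything in it is a hypothetical illustration: the Emotion class, the cue strings, the urgency numbers, and the default goal are assumptions made for the example, not a model drawn from the research described here. Each emotion watches for one triggering cue and, when several are active at once, the most urgent one installs its goal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emotion:
    """A hypothetical goal-setter: one trigger, one goal, one priority."""
    name: str
    trigger: str   # the cue that switches the emotion on
    goal: str      # the top-level goal it installs
    urgency: int   # assumed priority; the larger number wins


# Illustrative repertoire; the urgency values are arbitrary assumptions.
EMOTIONS = [
    Emotion("fear", "lion in sight", "escape the lion", 10),
    Emotion("sympathy", "child in tears", "comfort the child", 6),
    Emotion("anger", "public insult", "answer the insult", 4),
]


def select_goal(cues, emotions=EMOTIONS, default="explore the surroundings"):
    """Commit to a single top-level goal given the cues present right now."""
    active = [e for e in emotions if e.trigger in cues]
    if not active:
        return default  # nothing pressing, so fall back on an assumed default
    return max(active, key=lambda e: e.urgency).goal


if __name__ == "__main__":
    print(select_goal({"public insult", "child in tears"}))
    print(select_goal({"public insult", "lion in sight"}))
```

The point of the design is that the selector returns exactly one goal at a time, so the body is committed to a single course of action rather than dithering among several.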
Fear also presses a button that readies the body for action, the so-called fight-or-flight response. (The nickname is misleading because the response prepares us for any time-sensitive action, such as grabbing a baby who is crawling toward the top of a stairwell.) The heart thumps to send blood to the muscles. Blood is rerouted from the gut and skin, leaving butterflies and clamminess. Rapid breathing takes in oxygen. Adrenaline releases fuel from the liver and helps the blood to clot. And it gives our face that universal deer-in-the-headlights look.
Each human emotion mobilizes the mind and body to meet one of the challenges of living and reproducing in the cognitive niche. Some challenges are posed by physical things, and the emotions that deal with them, like disgust, fear, and appreciation of natural beauty, work in straightforward ways. Others are posed by people. The problem in dealing with people is that people can deal back. The emotions that evolved in response to other people's emotions, like anger, gratitude, shame, and romantic love, are played on a complicated chessboard, and they spawn the passion and intrigue that misleads the Romantic. First let's explore emotions about things, then emotions about people.
The expression “a fish out of water” reminds us that every animal is adapted to a habitat. Humans are no exception. We tend to think that animals just go where they belong, like heat-seeking missiles, but the animals must experience these drives as emotions not unlike ours. Some {375} places are inviting, calming, or beautiful; others are depressing or scary. The topic in biology called “habitat selection” is, in the case of Homo sapiens, the same as the topic in geography and architecture called “environmental aesthetics”: what kinds of places we enjoy being in.
Until very recently our ancestors were nomads, leaving a site when they had used up its edible plants and animals. The decision of where to go next was no small matter. Cosmides and Tooby write:
Imagine that you are on a camping trip that lasts a lifetime. Having to carry water from a stream and firewood from the trees, one quickly learns to appreciate the advantages of some campsites over others. Dealing with exposure on a daily basis quickly gives one an appreciation for sheltered sites, out of the wind, snow, or rain. For hunter-gatherers, there is no escape from this way of life: no opportunities to pick up food at the grocery store, no telephones, no emergency services, no artificial water supplies, no fuel deliveries, no cages, guns, or animal control officers to protect one from the predatory animals. In these circumstances, one's life depends on the operation of mechanisms that cause one to prefer habitats that provide sufficient food, water, shelter, information, and safety to support human life, and that cause one to avoid those that do not.
Homo sapiens is adapted to two habitats. One is the African savanna, in which most of our evolution took place. For an omnivore like our ancestors, the savanna is a hospitable place compared with other ecosystems. Deserts have little biomass because they have little water. Temperate forests lock up much of their biomass in wood. Rainforests — or, as they used to be called, jungles — place it high in the canopy, relegating omnivores on the ground to being scavengers who gather the bits that fall from above. But the savanna — grasslands dotted with clumps of trees — is rich in biomass, much of it in the flesh of large animals, because grass replenishes itself quickly when grazed. And most of the biomass is conveniently placed a meter or two from the ground. Savannas also offer expansive views, so predators, water, and paths can be spotted from afar. Their trees provide shade and an escape from carnivores.
Our second-choice habitat is the rest of the world. Our ancestors, after evolving on the African savannas, wandered into almost every nook and cranny of the planet. Some were pioneers who left the savanna and then other areas in turn, as the population expanded or the climate changed. Others were refugees in search of safety. Foraging tribes can't {376} stand one another. They frequently raid neighboring territories and kill any stranger who blunders into theirs.
We could afford this wanderlust because of our intellect. People explore a new landscape and draw up a mental resource map, rich in details about water, plants, animals, routes, and shelter. And if they can, they make their new homeland into a savanna. Native Americans and Australian aborigines used to burn huge swaths of woodland, opening them up for colonization by grasses. The ersatz savanna attracted grazing animals, which were easy to hunt, and exposed visitors before they got too close.
The biologist George Orians, an expert on the behavioral ecology of birds, recently turned his eye to the behavioral ecology of humans. With Judith Heerwagen, Stephen Kaplan, Rachel Kaplan, and others, he argues that our sense of natural beauty is the mechanism that drove our ancestors into suitable habitats. We innately find savannas beautiful, but we also like a landscape that is easy to explore and remember, and that we have lived in long enough to know its ins and outs.
In experiments on human habitat preference, American children and adults are shown slides of landscapes and asked how much they would like to visit or live in them. The children prefer savannas, even though they have never been to one. The adults like the savannas, too, but they like the deciduous and coniferous forests — which resemble much of the habitable United States — just as much. No one likes the deserts and the rainforests. One interpretation is that the children are revealing our species’ default habitat preference, and the adults supplement it with the land with which they have grown familiar.
Of course, people do not have a mystical longing for ancient homelands. They are merely pleased by the landscape features that savannas tend to have. Orians and Heerwagen surveyed the professional wisdom of gardeners, photographers, and painters to learn what kinds of landscapes people find beautiful. They treated it as a second kind of data on human tastes in habitats, supplementing the experiments on people's reactions to slides. The landscapes thought to be the loveliest, they found, are dead ringers for an optimal savanna: semi-open space (neither completely exposed, which leaves one vulnerable, nor overgrown, which impedes vision and movement), even ground cover, views to the horizon, large trees, water, changes in elevation, and multiple paths leading out. The geographer Jay Appleton succinctly captured what makes a landscape appealing: prospect and refuge, or seeing without being seen. The combination allows us to learn the lay of the land safely. {377}
The land itself must be legible, too. Anyone who has lost a trail in a dense forest or seen footage of sand dunes or snow drifts in all directions knows the terror of an environment lacking a frame of reference. A landscape is just a very big object, and we recognize complex objects by locating their parts in a reference frame belonging to the object (see Chapter 4). The reference frames in a mental map are big landmarks, like trees, rocks, and ponds, and long paths or boundaries, like rivers and mountain ranges. A vista without these guideposts is unsettling. Kaplan and Kaplan found another key to natural beauty, which they call mystery. Paths bending around hills, meandering streams, gaps in foliage, undulating land, and partly blocked views grab our interest by hinting that the land may have important features that could be discovered by further exploration.
People also love to look at animals and plants, especially flowers. If you are reading this book at home or in other pleasant but artificial surroundings, chances are you can look up and find animal, plant, or flower motifs in the decorations. Our fascination with animals is obvious. We eat them, they eat us. But our love of flowers, which we don't eat except in salads in overpriced restaurants, needs an explanation. We ran into it in Chapters 3 and 5. People are intuitive botanists, and a flower is a rich source of data. Plants blend into a sea of green and often can be identified only by their flowers. Flowers are harbingers of growth, marking the site of future fruit, nuts, or tubers for creatures smart enough to remember them.
Some natural happenings are deeply evocative, like sunsets, thunder, gathering clouds, and fire. Orians and Heerwagen note that they tell of an imminent and consequential change: darkness, a storm, a blaze. The emotions evoked are arresting, forcing one to stop, take notice, and prepare for what's to come.
Environmental aesthetics is a major factor in our lives. Mood depends on surroundings: think of being in a bus terminal waiting room or a lakeside cottage. People's biggest purchase is their home, and the three rules of home buying — location, location, and location — pertain, apart from nearness to amenities, to grassland, trees, bodies of water, and prospect (views). The value of the house itself depends on its refuge (cozy spaces) and mystery (nooks, bends, windows, multiple levels). And people in the unlikeliest of ecosystems strive for a patch of savanna to call their own. In New England, any land that is left alone quickly turns into a scruffy deciduous forest. During my interlude in suburbia, every weekend my fellow burghers and I would drag out our lawn mowers, leaf blowers, {378} weed whackers, limb loppers, branch pruners, stem snippers, hedge clippers, and wood chippers in a Sisyphean effort to hold the forest at bay. Here in Santa Barbara, the land wants to be an arid chaparral, but decades ago the city fathers dammed wilderness creeks and tunneled through mountains to bring water to thirsty lawns. During a recent drought, homeowners were so desperate for verdant vistas that they sprayed their dusty yards with green paint.
Great green gobs of greasy grimy gopher guts,
Mutilated monkey meat,
Concentrated chicken feet.
Jars and jars of petrified porpoise pus,
And me without my spoon!
— fondly remembered camp song, sung to the
tune of “The Old Gray Mare”; lyricist unknown
Disgust is a universal human emotion, signaled with its own facial expression and codified everywhere in food taboos. Like all the emotions, disgust has profound effects on human affairs. During World War II, American pilots in the Pacific went hungry rather than eat the toads and bugs that they had been taught were perfectly safe. Food aversions are tenacious ethnic markers, persisting long after other traditions have been abandoned.
Judged by the standards of modern science, disgust is manifestly irrational. People who are sickened by the thought of eating a disgusting object will say it is unsanitary or harmful. But they find a sterilized cockroach every bit as revolting as one fresh from the cupboard, and if the sterilized roach is briefly dunked into a beverage, they will refuse to drink it. People won't drink juice that has been stored in a brand-new urine collection bottle; hospital kitchens have found this an excellent way to stop pilferage. People won't eat soup if it is served in a brand-new bedpan or if it has been stirred with a new comb or fly-swatter. You can't pay most people to eat fudge baked in the shape of dog feces or to hold rubber vomit from a novelty store between their lips. One's own saliva is not disgusting as long as it is in one's mouth, {379} but most people won't eat from a bowl of soup into which they have spat.
Most Westerners cannot stomach the thought of eating insects, worms, toads, maggots, caterpillars, or grubs, but these are all highly nutritious and have been eaten by the majority of peoples throughout history. None of our rationalizations makes sense. You say that insects are contaminated because they touch feces or garbage? But many insects are quite sanitary. Termites, for example, just munch wood, but Westerners feel no better about eating them. Compare them with chickens, the epitome of palatability (“Try it — it tastes like chicken!”), which commonly eat garbage and feces. And we all savor tomatoes made plump and juicy from being fertilized with manure. Insects carry disease? So does all animal flesh. Just do what the rest of the world does — cook them. Insects have indigestible wings and legs? Pull them off, as you do with peel-and-eat shrimp, or stick to grubs and maggots. Insects taste bad? Here is a report from a British entomologist who was studying Laotian foodways and acquired a firsthand knowledge of his subject matter:
None distasteful, a few quite palatable, notably the giant waterbug. For the most part they were insipid, with a faint vegetable flavour, but would not anyone tasting bread, for instance, for the first time, wonder why we eat such a flavourless food? A toasted dungbeetle or soft-bodied spider has a nice crisp exterior and soft interior of souffle consistency which is by no means unpleasant. Salt is usually added, sometimes chili or the leaves of scented herbs, and sometimes they are eaten with rice or added to sauces or curry. Flavour is exceptionally hard to define, but lettuce would, I think, best describe the taste of termites, cicadas, and crickets; lettuce and raw potato that of the giant Nephila spider, and concentrated Gorgonzola cheese that of the giant waterbug (Lethocerus indicus). I suffered no ill effects from the eating of these insects.
The psychologist Paul Rozin has masterfully captured the psychology of disgust. Disgust is a fear of incorporating an offending substance into one's body. Eating is the most direct way to incorporate a substance, and as my camp song shows, it is the most horrific thought that a disgusting substance can arouse. Smelling or touching it is also unappealing. Disgust deters people from eating certain things, or, if it's too late, makes them spit or vomit them out. The facial expression says it all: the nose is wrinkled, constricting the nostrils, and the mouth is opened and the tongue pushed forward as if to squeegee offending material out. {380}
Disgusting things come from animals. They include whole animals, parts of animals (particularly parts of carnivores and scavengers), and body products, especially viscous substances like mucus and pus and, most of all, feces, universally considered disgusting. Decaying animals and their parts are particularly revolting. In contrast, plants are sometimes distasteful, but distaste is different from disgust. When people avoid plant products — say, lima beans or broccoli — it is because they taste bitter or pungent. Unlike disgusting animal products, they are not felt to be unspeakably vile and polluting. Probably the most complicated thought anyone ever had about a disfavored vegetable was Clarence Darrow's: “I don't like spinach, and I'm glad I don't, because if I liked it I'd eat it, and I just hate it.” Inorganic and non-nutritive stuff like sand, cloth, and bark are simply avoided, without strong feelings.
Not only are disgusting things always from animals, but things from animals are almost always disgusting. The nondisgusting animal parts are the exception. Of all the parts of all the animals in creation, people eat an infinitesimal fraction, and everything else is untouchable. Many Americans eat only the skeletal muscle of cattle, chickens, swine, and a few fish. Other parts, like guts, brains, kidneys, eyes, and feet, are beyond the pale, and so is any part of any animal not on the list: dogs, pigeons, jellyfish, slugs, toads, insects, and the other millions of animal species. Some Americans are even pickier, and are repulsed by the dark meat of chicken or chicken on the bone. Even adventurous eaters are willing to sample only a small fraction of the animal kingdom. And it is not just pampered Americans who are squeamish about unfamiliar animal parts. Napoleon Chagnon safeguarded his supply of peanut butter and hot dogs from his begging Yanomamo informants by telling them they were the feces and penises of cattle. The Yanomamo, who are hearty eaters of caterpillars and grubs, had no idea what cattle were but lost their appetite and left him to eat in peace.
A disgusting object contaminates everything it touches, no matter how brief the contact or how invisible the effects. The intuition behind not drinking a beverage that has been stirred with a flyswatter or dunked with a sterilized roach is that invisible contaminating bits — children call them cooties — have been left behind. Some objects, such as a new comb or bedpan, are tainted merely because they are designed to touch something disgusting, and others, such as a chocolate dog turd, are tainted by mere resemblance. Rozin observes that the psychology of disgust obeys {381} the two laws of sympathetic magic — voodoo — found in many traditional cultures: the law of contagion (once in contact, always in contact) and the law of similarity (like produces like).
Though disgust is universal, the list of nondisgusting animals differs from culture to culture, and that implies a learning process. As every parent knows, children younger than two put everything in their mouths, and psychoanalysts have had a field day interpreting their lack of revulsion for feces. Rozin and his colleagues studied the development of disgust by offering children various foods that American adults find disgusting. To the horror of their onlooking parents, sixty-two percent of toddlers ate imitation dog feces (“realistically crafted from peanut butter and odorous cheese”), and thirty-one percent ate a grasshopper.
Rozin suggests that disgust is learned in the middle school-age years, perhaps when children are scolded by their parents or they see the look on their parents’ faces when they approach a disgusting object. But I find that unlikely. First, all the subjects older than toddlers behaved virtually the same as the adults did. For example, four-year-olds wouldn't eat imitation feces or drink juice with a grasshopper in it; the only difference between them and the adults was that the children were less sensitive to contamination by brief contact. (Not until the age of eight did the children reject juice briefly dipped with a grasshopper or with imitation dog feces.) Second, children above the age of two are notoriously finicky, and their parents struggle to get them to eat new substances, not to avoid old ones. (The anthropologist Elizabeth Cashdan has documented that children's willingness to try new foods plummets after the third birthday.) Third, if children had to learn what to avoid, then all animals would be palatable except for the few that are proscribed. But as Rozin himself points out, all animals are disgusting except for a few that are permitted. No child has to be taught to revile greasy grimy gopher guts or mutilated monkey meat.
Cashdan has a better idea. The first two years, she proposes, are a sensitive period for learning about food. During those years mothers control children's food intake and children eat whatever they are permitted. Then their tastes spontaneously shrink, and they stomach only the foods they were given during the sensitive period. Those distastes can last to adulthood, though adults occasionally overcome them from a variety of motives: to dine with others, to appear macho or sophisticated, to seek thrills, or to avert starvation when familiar fare is scarce. {382}
What is disgust for? Rozin points out that the human species faces “the omnivore's dilemma.” Unlike, say, koalas, who mainly eat eucalyptus leaves and are vulnerable when those become scarce, omnivores choose from a vast menu of potential foods. The downside is that many are poison. Many fish, amphibians, and invertebrates contain potent neurotoxins. Meats that are ordinarily harmless can house parasites like tapeworms, and when they spoil, meats can be downright deadly, because the microorganisms that cause putrefaction release toxins to deter scavengers and thereby keep the meat for themselves. Even in industrialized countries food contamination is a major danger. Until recently anthrax and trichinosis were serious hazards, and today public health experts recommend draconian sanitary measures so people won't contract salmonella poisoning from their next chicken salad sandwich. In 1996 a world crisis was set off by the discovery that Mad Cow Disease, a pathology found in some British cattle that makes their brains spongy, might do the same to people who eat the cattle.
Rozin ventured that disgust is an adaptation that deterred our ancestors from eating dangerous animal stuff. Feces, carrion, and soft, wet animal parts are home to harmful microorganisms and ought to be kept outside the body. The dynamics of learning about food in childhood fit right in. Which animal parts are safe depends on the local species and their endemic diseases, so particular tastes cannot be innate. Children use their older relatives the way kings used food tasters: if they ate something and lived, it is not poison. Thus very young children are receptive to whatever their parents let them eat, and when they are old enough to forage on their own, they avoid everything else.
But how can one explain the irrational effects of similarity — the revulsion for rubber vomit, chocolate dog turds, and sterilized roaches? The answer is that these items were crafted to evoke the same reaction in people that the objects themselves evoke. That is why novelty shops sell rubber vomit. The similarity effect merely shows that reassurance by an authority or by one's own beliefs does not disconnect an emotional response. It is no more irrational than other reactions to modern simulacra, such as being engrossed by a movie, aroused by pornography, or terrified on a roller coaster.
What about our feeling that disgusting things contaminate everything {383} they touch? It is a straightforward adaptation to a basic fact about the living world: germs multiply. Microorganisms are fundamentally different from chemical poisons such as those manufactured by plants. The danger of a chemical depends on its dose. Poisonous plants are bitter-tasting because both the plant and the plant-eater have an interest in the plant-eater stopping after the first bite. But there is no safe dose for microorganisms, because they reproduce exponentially. A single, invisible, untastable germ can multiply and quickly saturate a substance of any size. Since germs are, of course, transmittable by contact, it is no surprise that anything that touches a yucky substance is itself forever yucky, even if it looks and tastes the same. Disgust is intuitive microbiology.
Why are insects and other small creatures like worms and toads — what Latin Americans call “animalitos” — so easy to revile? The anthropologist Marvin Harris has shown that cultures avoid animalitos when larger animals are available, and eat them when they are not. The explanation has nothing to do with sanitation, since bugs are safer than meat. It comes from optimal foraging theory, the analysis of how animals ought to — and usually do — allocate their time to maximize the rate of nutrients they consume. Animalitos are small and dispersed, and it takes a lot of catching and preparing to get a pound of protein. A large mammal is hundreds of pounds of meat on the hoof, available all at once. (In 1978 a rumor circulated that McDonald's was extending the meat in Big Macs with earthworms. But if the corporation were as avaricious as the rumor was meant to imply, the rumor could not be true: worm meat is far more expensive than beef.) In most environments it is not only more efficient to eat larger animals, but the small ones should be avoided altogether — the time to gather them would be better spent hunting for a bigger payoff. Animalitos are thus absent from the diets of cultures that have bigger fish to fry, and since, in the minds of eaters, whatever is not permitted is forbidden, those cultures find them disgusting.
What about food taboos? Why, for example, are Hindus forbidden to eat beef? Why are Jews forbidden to eat pork and shellfish and to mix meat with milk? For thousands of years, rabbis have offered ingenious justifications of the Jewish dietary laws. Here are a few listed in the Encyclopedia Judaica: {384}
From Aristeas, first century BC: “The dietary laws are ethical in intent, since abstention from the consumption of blood tames man's instinct for violence by instilling in him a horror of bloodshed. . . . The injunction against the consumption of birds of prey was intended to demonstrate that man should not prey on others.”
From Isaac ben Moses Arama: “The reason behind all the dietary prohibitions is not that any harm may be caused to the body, but that these foods defile and pollute the soul and blunt the intellectual powers, thus leading to confused opinions and a lust for perverse and brutish appetites which lead men to destruction, thus defeating the purpose of creation.”
From Maimonides: “All the food which the Torah has forbidden us to eat have some bad and damaging effect on the body. . . . The principal reason why the Law forbids swine's flesh is to be found in the circumstances that its habits and its food are very dirty and loathsome. . . . The fat of the intestines is forbidden because it fattens and destroys the abdomen and creates cold and clammy blood. . . . Meat boiled in milk is undoubtedly gross food, and makes a person feel overfull.”
From Abraham ibn Ezra: “I believe it is a matter of cruelty to cook a kid in its mother's milk.”
From Nahmanides: “Now the reason for specifying fins and scales is that fish which have fins and scales get nearer to the surface of the water and are found more generally in freshwater areas. . . . Those without fins and scales usually live in the lower muddy strata which are exceedingly moist and where there is no heat. They breed in musty swamps and eating them can be injurious to health.”
With all due respect to rabbinical wisdom, these arguments can be demolished by any bright twelve-year-old, and as a former temple Sunday School teacher I can attest that they regularly are. Many Jewish adults still believe that pork was banned as a public health measure, to prevent trichinosis. But as Harris points out, if that were true the law would have been a simple advisory against undercooking pork: “Flesh of swine thou shalt not eat until the pink has been cooked from it.”
Harris observes that food taboos often make ecological and economic sense. The Hebrews and the Muslims were desert tribes, and pigs are animals of the forest. They compete with people for water and nutritious foods like nuts, fruits, and vegetables. Kosher animals, in contrast, are {385} ruminants like sheep, cattle, and goats, which can live off scraggly desert plants. In India, cattle are too precious to slaughter because they are used for milk, manure, and pulling plows. Harris’ theory is as ingenious as the rabbis’ and far more plausible, though he admits that it can't explain everything. Ancient tribes wandering the parched Judaean sands were hardly in danger of squandering their resources by herding shrimp and oysters, and it is unclear why the inhabitants of a Polish shtetl or a Brooklyn neighborhood should obsess over the feeding habits of desert ruminants.
Food taboos are obviously an ethnic marker, but by itself that observation explains nothing. Why do people wear ethnic badges to begin with, let alone a costly one like banning a source of nutrients? The social sciences assume without question that people submerge their interests to the group, but on evolutionary grounds that is unlikely (as we shall see later in the chapter). I take a more cynical view.
In any group, the younger, poorer, and disenfranchised members may be tempted to defect to other groups. The powerful, especially parents, have an interest in keeping them in. People everywhere form alliances by eating together, from potlatches and feasts to business lunches and dates. If I can't eat with you, I can't become your friend. Food taboos often prohibit a favorite food of a neighboring tribe; that is true, for example, of many of the Jewish dietary laws. That suggests that they are weapons to keep potential defectors in. First, they make the merest prelude to cooperation with outsiders — breaking bread together — an unmistakable act of defiance. Even better, they exploit the psychology of disgust. Taboo foods are absent during the sensitive period for learning food preferences, and that is enough to make children grow up to find them disgusting. That deters them from becoming intimate with the enemy (“He invited me over, but what will I do if they serve . . . EEEEUUUUW!!”). Indeed, the tactic is self-perpetuating because children grow up into parents who don't feed the disgusting things to their children. The practical effects of food taboos have often been noticed. A familiar theme in novels about the immigrant experience is the protagonist's torment over sampling taboo foods. Crossing the line offers a modicum of integration into the new world but provokes open conflict with parents and community. (In Portnoy's Complaint, Alex describes his mother as pronouncing hamburger as if it were Hitler.) But since the elders have no desire for the community to see the taboos in this light, they cloak them in talmudic sophistry and bafflegab. {386}
Language-lovers know that there is a word for every fear. Are you afraid of wine? Then you have oenophobia. Tremulous about train travel? You suffer from siderodromophobia. Having misgivings about your mother-in-law is pentheraphobia, and being petrified of peanut butter sticking to the roof of your mouth is arachibutyrophobia. And then there's Franklin Delano Roosevelt's affliction, the fear of fear itself, or phobophobia.
But just as not having a word for an emotion doesn't mean that it doesn't exist, having a word for an emotion doesn't mean that it does exist. Word-watchers, verbivores, and sesquipedalians love a challenge. Their idea of a good time is to find the shortest word that contains all the vowels in alphabetical order or to write a novel without the letter e. Yet another joy of lex is finding names for hypothetical fears. That is where these improbable phobias come from. Real people do not tremble at the referent of every euphonious Greek or Latin root. Fears and phobias fall into a short and universal list.
Snakes and spiders are always scary. They are the most common objects of fear and loathing in studies of college students’ phobias, and have been so for a long time in our evolutionary history. D. O. Hebb found that chimpanzees born in captivity scream in terror when they first see a snake, and the primatologist Marc Hauser found that his laboratory-bred cotton-top tamarins (a South American monkey) screamed out alarm calls when they saw a piece of plastic tubing on the floor. The reaction of foraging peoples is succinctly put by Irven DeVore: “Hunter-gatherers will not suffer a snake to live.” In cultures that revere snakes, people still treat them with great wariness. Even Indiana Jones was afraid of them!
The other common fears are of heights, storms, large carnivores, darkness, blood, strangers, confinement, deep water, social scrutiny, and leaving home alone. The common thread is obvious. These are the situations that put our evolutionary ancestors in danger. Spiders and snakes are often venomous, especially in Africa, and most of the others are obvious hazards to a forager's health, or, in the case of social scrutiny, status. Fear is the emotion that motivated our ancestors to cope with the dangers they were likely to face.
Fear is probably several emotions. Phobias of physical things, of social {387} scrutiny, and of leaving home respond to different kinds of drugs, suggesting that they are computed by different brain circuits. The psychiatrist Isaac Marks has shown that people react in different ways to different frightening things, each reaction appropriate to the hazard. An animal triggers an urge to flee, but a precipice causes one to freeze. Social threats lead to shyness and gestures of appeasement. People really do faint at the sight of blood, because their blood pressure drops, presumably a response that would minimize the further loss of one's own blood. The best evidence that fears are adaptations and not just bugs in the nervous system is that animals that have evolved on islan