SEHR, volume 4, issue 1: Bridging the Gap
Updated 8 April 1995

Professor Simon's reply appears here in two parts.

reply to commentaries (2nd and last part)

literary criticism: a cognitive approach

Herbert A. Simon

Understanding, Ah-Ha's, and Intuition

Hayles (62) says that she has "difficulty imagining a computer having the 'Ah-ha' experience that I associate with grasping meaning." As with the definition of meaning itself, I would suggest that we first elucidate how we detect humans having an "ah-ha," and then see what the corresponding event might be for a computer. To us humans, an "ah-ha" means that at a particular point in time we do not have the solution of a problem, or even an understanding of it, but a moment (a few hundred milliseconds?) later we have a solution or a clear idea of how to reach one, and a feeling that we know why it will work.

Human oral "ah-ha's" are not infrequently heard in the psychological laboratory. I have encountered at least three varieties. In the simplest cases, a person given a problem-solving task recognizes it as belonging to a familiar species, one known from previous experience, and recovers from memory many of the things previously associated with it, including, for example, a solution or solution path, or operators known to lead to a solution. At that moment, he or she often says "ah-ha!"

It is typical of all acts of recognition (a.k.a. intuition and insight), including the most common, that they take only a fraction of a second in the presence of the appropriate familiar cues and that the persons experiencing them are unable to report what led to the recognition. We are, for example, unable to report the particular cues that led us to recognize an old friend; that is why we cannot describe the friend's appearance in any detail to another person. Most routine professional work (e.g., medical diagnosis) is accomplished by the expert largely by acts of recognition, with only a little accessory problem-solving search.

A second kind of "ah-ha" that has been observed involves planning. A problem solver abstracts from the details of a problem, finds a skeletal path to the solution in the abstracted planning space, utters an "ah-ha," and then returns to the original problem space and attempts to implement the plan. Newell and I reported such phenomena in Human Problem Solving (1972). The "ah-ha" reflects the subject's confidence that he or she can now solve the problem and has a clear picture of the (abstracted) problem structure that shows the way to the solution. Interestingly enough, the plan does not always work, because details omitted from the abstraction can turn out to be essential.

A third kind of "ah-ha" has been observed in the case of very difficult problems where reaching a solution requires discovering a radically different representation of the problem than the one presented initially to the solver. The "mutilated checkerboard" is a classical problem of this kind. I will not try to describe it here, but simply refer to Kaplan and Simon (1990). The "ah-ha" is uttered when a subject discovers the representation needed to reach a solution and almost immediately sees the very short and direct solution path. Few subjects find this solution without several hours' work, and usually even then only with the help of hints. The "ah-ha!" is correspondingly emphatic.
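
The change of representation involved can be sketched briefly. In the standard statement of the puzzle (not reproduced in this reply), two diagonally opposite corners are removed from an 8 x 8 checkerboard, and the question is whether 31 dominoes can cover the 62 remaining squares. The sketch below is an illustration only, not a program from Kaplan and Simon (1990); the function names and the (row, column) encoding of squares are assumptions made for the example.

```python
def square_color(row, col):
    """Return 0 or 1, the checkerboard color of the square at (row, col)."""
    return (row + col) % 2


def color_counts(size=8, removed=((0, 0), (7, 7))):
    """Count the remaining squares of each color after removing some squares.

    The default removes two diagonally opposite corners (hypothetical encoding).
    """
    removed = set(removed)
    counts = [0, 0]
    for row in range(size):
        for col in range(size):
            if (row, col) not in removed:
                counts[square_color(row, col)] += 1
    return tuple(counts)


def parity_allows_tiling(size=8, removed=((0, 0), (7, 7))):
    """Check the necessary condition for a domino covering.

    Every domino covers one square of each color, so a covering can exist
    only if the two color counts are equal. Equal counts do not by
    themselves guarantee that a covering exists.
    """
    first, second = color_counts(size, removed)
    return first == second
```

Under these assumptions, color_counts() leaves 30 squares of one color and 32 of the other, so parity_allows_tiling() reports that no covering is possible. Once the board is represented by color counts rather than by domino placements, the answer follows in a single step.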

All of these "ah-ha's" have three things in common: suddenness, accompanying surprise, and resulting awareness not only of a problem solution, but usually of the structure of the path leading to the solution (that is, an "understanding" of the problem). There may be other sorts of "ah-ha's," but these are probably the ones that we most often experience and observe. All three have been simulated by computer: the first by EPAM, the second by a version of the General Problem Solver that had abstracting and planning capabilities, and the third by a program written by Kaplan (unpublished), and in some respects by another program, KEKADA, written by Kulkarni (Kulkarni and Simon 1988) to simulate the experimental strategies of human scientists.

I do not expect that I have answered all of Professor Hayles' problems with the concepts of "ah-ha" and understanding, but I hope I have sketched a strategy for studying the corresponding phenomena in the psychological laboratory and then modeling the mechanisms in computer programs. I would propose the same strategy in answer to Murray's (88) questions about how we would model a new gestalt or understanding puns. I quite agree with her suggestion that "the most promising part of the cognitive model is its invitation to literary critics to think in terms of process rather than product." I will take up her concerns with embodiment in the next section.

Johnston, who is generally sympathetic to my paper, observes (67) that, "We needn't wait for artificial minds to come into being. . . .for the work to proceed." Indeed, not. I would only add: first, if the artificial minds are not indispensable to progress, still they are a great help to it; and second, we surely don't need to "wait for artificial minds to come into being," because artificial minds, however much they fall short of what we would ultimately hope them to be, are already here and have been thinking at a growing rate since the middle 1950s.

Body and Mind

Closely related to the topic of affect, a number of commentators remind us that the human mind is embodied (Murray, 88). Turner observes (110) that "a human brain resides in a human body in a human environment which that brain must make intelligible if it is to survive." Hayles (62) describes computers as "lacking embodiment." Dreyfus (51) argues that the meaning of texts "depends on what human beings find important. That in turn depends partly on having bodies."

I can only agree with all but the third of these statements, for they do not in any way contradict my paper. The notion that there is no way of representing embodiment (e.g., of emotions) in computer programs was refuted in the previous section of my reply. Hence, I will confine myself here to a few additional remarks about embodiment and meaning in preparation for the discussion of situated action in the next section.

Independently of whether computers can think, and disregarding them entirely for the present, we can ask whether the kind of analysis of cognition and literary criticism carried out in my paper ignored, to its detriment, the fact that people have bodies. In part, this raises again the questions of how to treat affect. But I have already shown that I did not at all ignore motives or emotions or esthetic sensibilities. The fact that we have bodies leads to the particular affective responses we experience. A representation of our body also plays a crucial role in our mental models of ourselves, which we carry about as part of our awareness of relations with our social and physical environment.

We have seen that the mechanisms I postulated for evoking symbols in the process of reading are fully capable of evoking affect and affective responses, and at the same time evoking our images of self, including awareness of our bodies in relation to the situations in which they and we find ourselves. We frequently empathize with characters in the stories we read. I would suppose this means that the novelist's placing a character in a particular situation evokes in us, among other things, some of the feelings we would have if placed in the same situation. Moreover, evocation can enable us to view the situation (including the empathizing self) from "outside" as well as inside.

An important part of our life, with its accompanying thoughts, focuses upon the protection of our body, the pleasures of our body, the appearance of our body, the comfort of our body, the powers of our body, the autonomy of our body. So we are embodied, responding to text (among other things) in terms of that embodiment. Nothing in embodiment is antithetical to analyzing writing as symbolic process that takes meanings into words and reading as symbolic process that takes words into meanings.

Several of the commentators are proponents of the currently popular positions labeled "situated action" and "literary and cultural theory." These two viewpoints are not, of course, identical; but they share a large area of overlap, and can conveniently be juxtaposed.

Situated Action

The underlying idea of situated action is that human behavior can only be understood fully if it is embedded in the rich physical and especially social context that surrounds it (e.g., Rotman, 99). I know no one who disagrees with that position. Nowadays, however, the phrase "situated action" is often used to denote more specific and debatable ideas. Alonso Vera and I, in a recent analysis of the literature (Vera and Simon, 1993), identified six themes that often fly the banner of situated action, although with significant variation in the emphases placed on each. For our present purposes, these six strands can be woven together into two main claims: that it is unprofitable and/or impossible to analyze human action in symbol processing terms, and that human action can be analyzed profitably only in its total context.

Regarding the first of these, Keil-Slawik (72), for example, says that "more and more empirical studies and laboratory observations of the programming process reveal that the meaning created in the communicative processes among programmers, designers, users, and managers, cannot be captured by documents. . . . We have to distinguish between programming as discourse, which is a meaning-creating activity, and programs as text (his italics)." From this assertion, he concludes that it is impossible to formalize "the relations between software and the usage context." He also claims as a consequence that "meaning cannot be located in a text as long as the interpretation of the respective text requires some kind of learning." It is ironic that Turner (110), evoking Derrida, Foucault, and Barthes, proclaims the exact opposite: that the meaning is only in the text and that, in Barthes' words, "the modern scriptor is born simultaneously with the text." I will let Keil-Slawik and Turner fight that issue out, hoping that both texts and human beings will survive the mêlée.

As Keil-Slawik claims that his supposed facts are revealed by empirical studies, we can leave to such studies the ultimate determination of whether he is correct. I have already pointed out his error in supposing that learning processes cannot be modeled symbolically (and are not carried out symbolically by people), for there is a lot of empirical evidence that they can be (and are). But this simply gets us back to our earlier discussions of the processes by means of which meanings are evoked, and I have nothing more to add here. Readers who are interested in the relation of situated action to symbolic representation of thought can find a rather detailed discussion of that topic in my paper with Vera, referenced above.

The more interesting side of the situated-action position is its insistence on looking at everything in context. I found it rather astonishing that a number of commentators interpreted me as claiming that human beings are not situated. For example, Wynter (124) thinks that I "reenact the central assumptions of the premise of aculturalism." As one other example of this misconception, Velotti (115) says, "Evidently, Simon thinks that we are not situated beings, but disembodied spirits." He arrives at this curious view because, in my paper, I state that "we can read the Bible (or Homer, or Chaucer) in the context of the ideas of Medieval Europe or 16th century China." He interprets that sentence (out of context) as claiming that we can wholly divorce ourselves from our own cultural baggage and insert ourselves completely into another culture.

Of course that is nonsense. What is true, however, and what any reader seeking my intent would easily conclude I meant, is that by acquiring knowledge of another culture we can reasonably infer how a member of that culture would extract meaning from a text. Cultural historians will not find that claim either unfamiliar or doubtful, for it is the foundation of their scholarship. (See, for example, Biddick's essay on pages 35-38 of this volume.) "Whiggishness" is just one of the crimes of which a historian may be accused when he fails to interpret historical events in the context of the culture in which they occurred.

Some very strong methodological conclusions, one for research and one for education, are often drawn from the postulate that action is situated. The conclusion for research is that laboratory experiments under simplified and controlled conditions are largely worthless, for they abstract away from the all-important context. The conclusion for education is that classroom education of the usual kinds is worthless, for it abstracts knowledge from the all-important contexts in which it will be used.

Stated in less extreme terms, neither of these claims is without merit. One might restate the first in this way: in applying the findings of laboratory experiments to real world situations, we must take account of the multitude of variables present in the world that were absent from the laboratory or held constant. We should not suppose that test-tube experiments on various mixtures of atmospheric gases will unravel all of the complexities of the Earth's atmosphere and their implications for air quality and climate. No atmospheric scientist thinks they will.

Similarly, the second claim might be restated: if we want things that are taught to be used in real world situations, then we must give careful attention to assisting students to recognize when they are relevant in the real world and how they must be modified and adjusted to be applied fruitfully. I am not acquainted with any researchers or classroom teachers who take exception to these modified statements. We can all agree (unless we take an extreme "situated" position) that neither the laboratory nor the classroom is a complete substitute for the world, and vice versa (my italics).

On the other hand, one must also be careful not to exaggerate how far the laboratory (or the classroom) is isolated from the real world, particularly when the subjects or students are human. Every human subject and student carries into laboratory and classroom a well-stocked memory, and this memory is a product of everything that person has learned through a short or long lifetime of interaction with the physical and social environment. Subjects typically come to my laboratory already speaking English, most of them native speakers. Decortication of subjects and students being frowned on, how can we speak of running experiments or classes without context?

To be sure, the stimuli we present in the laboratory and classroom environments can be quite lean in context, but it is not, in fact, difficult to surround these materials with rich context evoked from subjects' or students' memories. I once participated with a research team in running a large and elaborate experiment simulating an entire Early Warning Station of the U.S. Air Force staffed by a military complement of about thirty men and officers. When, due to a mistaken judgment by the airmen who were monitoring the simulated radar screen to detect unidentified planes, Seattle was "bombed," a number of the participants wept. It was hard for them to remember that the bombing was only a simulated event, not the real thing.

The importance of drama as an art form in most cultures testifies to the readiness with which people, both actors and audience, deal with rather lean simulated environments, supplying the missing elements from their own evoked memories. The real difficulties in running experiments with human subjects may lie in our ignorance of exactly what beliefs and attitudes they bring with them into the laboratory; and the real difficulties of teaching effectively in a school environment may lie in our ignorance of the students' knowledge bases that we are seeking to extend. The difficulties of applying the findings of experiments or the lessons learned in school may be due less to the austerity of the artificial environments in which the testing or learning occurs than to the richness of thoughts (often erroneous) and feelings that subjects and students bring into these situations.

I do not want to discount entirely the "lean context" concern, but it is important to challenge the more extreme claims advanced by the advocates of situated action. When Velotti says (116) that "evidently, Simon thinks that we are not situated beings, but disembodied spirits," he has clearly managed to avoid reading or understanding everything that I said about context and about the origins of memory contents in the social and physical world outside.

Literary and Critical Theory

A number of the commentators take deconstructionism and other recent movements in literary and critical theory as their starting point. I have already noted Turner's proposal to locate meaning in texts, hence outside the mind. Similarly, Van Brakel (113) asks, "if memory contents find their origin in social contents, wouldn't it be better to look there for the meaning of meaning instead of talking about frameworks that model part of the mechanism that underlies the use of (social) meanings?" But of course his premise is wrong. Memory contents do not find their origin in social contents; they find their origins in interactions of the memory's owner with a social and physical world. The mind does not receive passively what that world provides, but actively filters and reconstructs its inputs (see Holland, 65).

Talking about the environment or the text while dismissing the mind ignores the whole rich process by which a human being's mind absorbs and transforms many elements from the social and physical world in which he or she lives, while remaining quite distinct from that world. How boring it would be if we were all simply small replicas of our society (or of our class, our gender, our ethnic group, or what not). Fortunately, we are influenced by but not copies of the environment.

Van Brakel also raises the problem of circularity ("One pattern, say 'meaning,' points to another pattern, say 'meaning.' How do we get out of this circle?"). We get out of it in the way in which every dynamic system gets out of its circularity, so that chickens can emerge, after a time, from eggs, and other eggs, still later, from the chickens. There is nothing about van Brakel's statement that time subscripts won't fix. Van Brakel fills his comment with "deep" questions of this sort, the sort that enliven introductory philosophy courses. I will leave the rest of them as exercises for the reader.

The other comments coming from the adherents to LCT are a miscellaneous lot, not easily summarized. I have tried, as far as possible, to dissect them into the specific issues raised and have discussed many of these in previous sections of this reply. I might just mention a couple of themes that run through many of the commentaries, without dealing with them in detail.

First, there is a frequently expressed abhorrence of "reductionism." (For example, see Wild, 121.) In its traditions, humanism has always been tugged by analysis on one side, and synthesis on the other. As the sciences, with their relatively analytic predilections, were successively spun off from the humanities, the disciplines that remained increasingly stressed holism and the togetherness of things. Schleifer (103) observes that "the mode of much literary criticism is not so much measurement and comparison aimed at simplification and generalization but, as Gaston Bachelard. . . .says of the 'new' science of the twentieth century, the 'complexification of what appeared to be simple.'"

Clearly, my paper took the analytic route. The correctness of my arguments rests heavily on the premise that systems, even complex systems, can in principle be analyzed without losing the relations of wholes to parts or ignoring the system-level, holistic phenomena that depend on these relations. This is not the place to reargue the case for analysis and reductionism; I simply acknowledge what my position is on this issue.

Second, many objections are based on assumptions of cultural and epistemological relativism, ranging from the moderate to the extreme. As already noted, some commentators challenge my view of cognitive science on the ground that other views are held by other cognitive scientists (e.g., by connectionists). Although this is true, I do not take it to imply that my view is incorrect, but only that more evidence will have to be accumulated before consensus is achieved. Until that time, I will have to be forgiven for proceeding from the views I think correct.

Other relativists challenge the objectivity of cognitive science and my claims for its hegemony over literature. I rather like Wild's (121) way of putting it: "In short, is it not that literary theory could just as well underlie cognitive science and provide the principles of its functioning?" My reply is that, as the principles of functioning to be provided are matters of fact, only a discipline that has a strongly established methodology for establishing facts can provide sound principles. I would claim (this is another essay) that literary theory has a weak discipline of empirical methodology, and I would submit into evidence the whole set of commentaries on my essay, which rarely cite a finding of empirical research or provide references to places where such results can be found. Cognitive psychology does have such a methodology, and it is for this reason that it may be of some help in understanding the psychological processes with which literary theory is concerned.

Yet other relativists pronounce a kind of politically correct anathema on imperialism in general and cognitive imperialism in particular. Perhaps the clearest example is Biddick (35-38): "Simon's model of memory. . . .represents this architecture crafted in the post-Holocaust, post-Hiroshima, post-colonial days of the 1950s." This remarkable statement is followed by a footnote that erroneously asserts that "von Neumann computer architecture separates memory from processing and is built according to nineteenth-century notions of localizing brain functions." Before we can even discuss statements like these, we have to return from the realms of fantasy to facts. When we do, there is little left to discuss.

A central theme of my argument is that literary criticism is concerned with a great many empirical questions of the kinds that cognitive psychology commonly deals with and has much to say about, substantively and methodologically. As words are written and read by people, there could hardly fail to be a close relation between understanding the processes of writing and reading and understanding human behavior in general. The connection is fortified by the fact that most writing, especially the kind of writing we call "literature," is about people: their actions and interactions, and the thoughts and feelings that lead to them.

It is quite natural to ask, therefore, what methods of inquiry are available to us for studying these questions and whether it might be profitable to borrow methods back and forth between literature and psychology. This is, of course, a very old idea, which hardly would have needed stating during the ages when psychology was not separated from philosophy and rhetoric.

With the revival of rhetoric in recent times, the opportunity to re-form the link has emerged again, and some scholars have been taking advantage of that opportunity. An example is the well-known research on the processes of expository writing of my colleagues Linda Flower of our English Department and John R. Hayes of the Psychology Department.

I was pleased to see methodological questions of this kind addressed by several of the commentators. Currie (48-49) draws a convincing picture of what we can learn about the cognitive processes used in handling irony and metaphor, respectively, from close study of the behavior of autistic children, an area that has been extensively cultivated by psychologists. Such research can perform a dual function in giving us a deeper understanding of autism at the same time that it casts light on what is involved in literary uses of irony and metaphor.

Again, Miall (82-84) suggests that students of literature might very well play a major role in gathering evidence about cognitive processes, for example, by collecting "all the evidence we can from as many different kinds of readers as we can about what is actually taking place during literary reading." Such activities would be greatly welcomed by psychologists who stand on the other end of the bridge.

There are a number of topics addressed in the commentaries that do not fall under any of the headings used above. I would like to comment on just a few of them that especially caught my attention, and that seem to me to raise questions of particular interest.

The Mind's Eye

Vinograd (118-120) has some interesting things to say about visual perception, although I cannot agree with all of them. He does not like the hypothesis of a "mind's eye" because it makes visual perceptions, real pictures, and visual memories share the same representation within the head. But that is exactly the reason why a mind's eye has been postulated by Kosslyn (1980) and others as the basic structure for processing non-verbal visual information. The evidence now is quite extensive and consistent that information perceived visually does have nearly the same internal representations as visual memories and is processed in the same way; and moreover, that when someone makes a drawing from such a "mind's eye" representation, there is near-isomorphism between the internal objects and relations and the objects and relations in the drawing. Some of the evidence is presented in Kosslyn's book; additional evidence can be found in papers based on work that Qin, Tabachneck, and I have done in the past five years. On the basis of all this (and other) evidence, we must take the "mind's eye" seriously.

Vinograd proposes an alternative view of perception, in terms of the "affordances" proposed by J. J. Gibson, and quite popular today among the advocates of situated action. As Vinograd's discussion of affordances (119) shows, they are not a mechanism that accounts for perception; they are a denial of the need to postulate such a mechanism. For a more extensive critique of affordances and the reason why I do not think that they provide any explanation of the mechanisms of perception, see my essay with Vera, referred to earlier.

But in spite of my disagreements with Vinograd, I am glad that he raised the topic of pictorial representations, which are all but ignored in my paper. (They are mentioned briefly on page 13.) A great deal of human thinking employs a pictorial rather than a verbal modality. Often, even when a verbal text is presented, one of the first steps that the reader takes (after "parsing") is to convert the meaning into a mental picture or diagram. Some kinds of reasoning are easier in verbal form, some in pictorial form, and the availability of both representations is critical in many kinds of problem solving. Einstein consistently denied (as do many other scientists) thinking "in words." Today, we are just beginning to learn what these distinctions mean and how thinking proceeds in different modalities.

Petitot (96) also comments on visual imagery, endorsing the concept of the "mind's eye," but draws the conclusion "that we must henceforth model linguistic structures using mathematical models which generalize those of computational vision." This conclusion seems to me correct as applied to the first steps in visual perception, in particular, the extraction of features from the visual scene. It does not seem to me to entail, as far as the more central processes are concerned, a "mathematical" ("Marr-like") representation of the mind's eye, as distinct from the symbolic (not "linguistic" or "logical") representations we now use. But this is an issue that calls for a lengthier discussion than can be provided here and will be settled only when much more evidence has been gathered than is now available about the successive stages of encoding information from the eye.

Meanings in Music

Palma (92) makes the interesting comment: "Music seems to me to evoke meanings purely via its syntax and we should try to develop a comprehension of what the contemplation of syntax brings about in terms of meanings attached to meaningless texts." In recent years, there have been a number of efforts to carry forward precisely this undertaking of discovering how people "understand" music in terms of its syntactical structure. Some of this work lies within the framework of Chomskyan linguistics; other work relates musical pattern to other forms of pattern that have been studied in psychology. See, for example, Krumhansl (1990), and the references there to Smoliar, Simon and Sumner, Longuet-Higgins, and others.

Is Cognitive Science a Monolith?

Finally, a number of the commentators scolded me for treating warring schools of literary criticism quite differently from the way I treated disagreements among cognitive scientists (e.g., Rotman, 99; Palma, 92; Patel, 94). An excellent example is Miers' response to what he regards as

Simon's monolithic vision of cognitive science as a unified paradigm capable of bringing harmony to the warring schools of literary criticism. The problem here is not the criticism of criticism. . . .The difficulty lies, rather, with the problem of the science itself, with the now wholly indefensible claim that one version of cognitive science, the AI symbol processing model, is somehow the only possible account of human cognition. (85)

How can cognitive science bring peace to literary criticism if it is itself engaged in civil war? In my replies, I have already undertaken to answer several parts of that question. First, the "war" between the serial symbolic and connectionist views is to be settled (except in the eyes of some philosophers who have volunteered for service on one side or the other) by the gradual accumulation of evidence, and such evidence is now being gathered patiently by all parties. Second, among the active researchers actually engaged in the enterprise there is considerable (not complete) agreement that what will ultimately emerge is a division of labor (more or less along the central-peripheral lines I suggested earlier) between the two paradigms. Third, while the issues are being decided in this way, the empirical findings themselves provide an important source of insight, available to literary critics, on the processes of extracting meaning from text.

With respect to the second "war," that between the serial symbolic and connectionist views, on the one hand, and situated action or cultural relativism on the other, my answer would refer again to the essay that Vera and I have published. In that essay, we showed that there is no such contradiction between symbolic and "situated" views as some of the proponents of the latter have claimed. Three of the four discussants of our article, all of them prominent in espousing situated action, expressed general agreement that the incompatibility of the two views has been exaggerated.

There is, of course, a much more serious conflict between the "scientific" views, in general, and the extreme view that Feyerabend and others have put forward: that there is no objectivity in science, and therefore, "anything goes." It remains true that, although we all have our identifications with class, gender, and ethnos, and these identifications color our interpretations of evidence, human beings, when we allow ourselves to be exposed to a wide range of empirical evidence, do tend to converge on interpretations consistent with that evidence. Even in the Soviet Union, Lysenkoism did not long thrive, nor McCarthyism in the United States, nor Derridaism anywhere. If that were not so, I wonder why we would bother even to talk to each other?

This reply already represents a very condensed summary of the thoughts that were evoked in me by the 33 highly individual commentaries written in response to my initial paper. Trying to summarize my reply would be like making a microfilm of a whole set of microfilms. An essay devoting an average of only one page to every four pages of comment cannot do justice to all of the ideas that were put forth. I have focused on principal points of disagreement, on the grounds that agreement requires no comment.

I have organized my reply around the broad issues that are indicated by my section headings, creating the danger (indeed, the certainty) that other issues have been missed. In general, I expect I have spent more effort in replying to criticisms that were presented thoughtfully and quietly than those that were more combative in tone. This reflects a long-held belief that the amount of knowledge exchanged in a discussion like this is inversely proportional to the brilliance of the fireworks displayed. But by following this relatively peaceful course (with some lapses), I may have ignored issues that were worth taking up.

None of us who have participated in this exercise have any illusion, I expect, that we have settled the issues. In lieu of turning my reply into a book (perhaps even a whole shelf of books), I have tried to point readers to other books and papers where I have discussed these same issues, or similar ones, at greater length, and have presented concrete evidence for my views.

Finally, I would like once more to thank the commentators for joining with me in this exploration of the relations between literary theory and cognitive psychology. What we have learned from and about each other, both in agreement and disagreement, will surely facilitate the passage of all of us forth and back across the bridge between the two cultures.