Thursday, January 31, 2008

Metal Representation in the Brain

Before I start with my ‘background check’ for what we need for the creation and maintenance of a shared perspectival systemic space, I want to address two core issues of perspectivity. I already alluded to the first issue in this post's title, which is an amusing typo to be found on page 92 of Talmy Givón's (2005) book "Context as Other Minds: The Pragmatics of Sociality, Cognition and Communication".

1. Mental Representations

The first issue is largely an epistemic one. In a speculative account of what might have happened evolutionarily so that we have the ability to entertain a cognitive perspective, I would begin like this:

  1. The first necessary component for perspective is the ability to form mental representations, i.e. internal representations of outside reality.

The evolution from reactive systems ‘not having internal perspective’ (NIP) to systems ‘having internal perspective’, i.e. systems that are able to predict and, later on, even plan future events based on internal models of reality, can be seen as the first step from non-cognitive to cognitive organisms (Cruse 2003).

Some authors, such as Ray Jackendoff, are unhappy with the term ‘mental representation.’ In Jackendoff (2007: 5-7), for example, he argues that we should discard the term ‘mental representation’ and instead speak of ‘mental structures’. His argument is that, from an epistemological point of view, we cannot say that some neuronal activity represents a state in the world, i.e. is ‘about’ something in the real world (what philosophers call ‘aboutness’ or ‘Intentionality’). Also, he argues, ‘representation’ would imply that there is actually someone perceiving or interpreting these mental structures. But the mental functions of the ‘mind/brain’ bear only an indirect relation to the ‘real world’, mediated by long chains of neuronal connections and neurons interacting with each other. Mental capacities and the ‘real world’ sit at opposite ends of a long chain of indirect, limited and intertwined neuronal interactions.

But I think it is important that at this point we are still talking “Beanbag Semantics” (Dennett 2003, Dennett 1995): Nowadays we know, for example, that the relation between single genes and the actual phenotypic traits an organism develops is not as simple as Mendelian beanbag genetics, but is in fact incredibly complex and depends on a variety of factors. Yet in some contexts it still makes sense for a biologist to speak of a ‘gene for x’, which then only implies that there seems to be a causal connection between two things, which can later be investigated more closely. Whereas in biology we by now have a pretty good idea how genes work, we still don’t really have a clue what ‘mental representations’ or ‘mental structures’ actually are, how they work, and how they are implemented in the brain. We are still at the ‘beanbag’ level, so to speak, and thus it seems perfectly alright to attribute ‘mental representations’ until we have found out more about how this process actually works.

“There is, then, nothing wrong with using a term like ‘mental representation’ to label an entity whose physical properties we are only beginning to understand” (Cheney & Seyfarth 2007: 240).

So of course, even if I follow Bühler and Köller in my notation of the “origo of the deictic sphere”/systemic space as a two-dimensional graph, it is important to note that this conceptualization is above all a “‘metaphor’ for individual systems of orientation” (Bredel 2002: 176, Graumann 1994: 61).

2. The Objective Rationality of Perspectives

Another fascinating facet of any approach to perspectivity is the question of how rational decisions are grounded in the perspectives of an individual and result in perspectival actions and choices.

To be more precise: on the one hand, the frame of reference of an individual is one of the most basic parts of her perspective (Bischof-Köhler 2000, Bischof-Köhler & Bischof 2007). On the other hand, we can construe individuals as rational agents who try to act according to their “Natural Rationality”, based on multiple interacting processes of decision-making, trying to act as rationally as possible given their precise perspective and choices, and acting on clearly calculated incentives.

In this fascinating Bloggingheads discussion, Tim Harford presents his new book “The Logic of Life: The Rational Economics of an Irrational World”, in which he argues that even when a prostitute agrees to have sex without a condom, or someone starts smoking, these decisions are based on rational calculation, given the current perspective of these economically acting agents.

Similarly, Benoit Hardy-Vallée, on his cool blog Natural Rationality, regularly posts on decision-making in humans and non-human animals, based on the theoretical assumption that natural selection is a powerful player able to implement rational decision-making strategies in organisms, and that many irrational-seeming decisions are based on calculations which seem perfectly rational from the agent’s present perspective. Hardy-Vallée cites Dan Gilbert’s work on affective forecasting as an example of this tendency to base rational decisions on faulty information. In this case, our predictions are imperfect because we are unable to predict our own future psychological states correctly. This happens because we often simulate future events based on memories that represent the past only inaccurately. These memories are ‘essentialized’, only containing the core features of past events, and are abbreviated,

“which means that mental simulations tend to overrepresent the moments that evoke the most intense pleasure or pain.” (Gilbert & Wilson 2007: 1353)

The TED-Website features a very entertaining talk by Dan Gilbert on his research on our flawed prospective system.

Now we can add another important factor to the study of perspectivity: A rational agent able to consider multiple overlapping frames of reference, who is able to adopt multiple perspectives on a situation, is clearly at an advantage when it comes to making decisions.

A virtual systemic space thus is a key property of our ability to make advanced rational decisions, even if these still seem to be seriously flawed in some respects.

So it seems that shared systemic spaces do not only enable us to communicate, cooperate and share intentions about these mental representations. Neither do they only enable us to form mental models of reality and to base our predictions on these models by taking the physical, design, or intentional stance towards certain referents in these internal models.

On a more general level, in the optimal condition they enable us to base our decisions on our natural multiperspectival rationality.

As I have to start studying for my exams, I'm afraid that I'll have to pause my inquiry into the cognitive properties of perspective for a short time. In the next two weeks, I'll return to a matter I'm a bit more familiar with: The Evolution of Language, which of course also has many converging points with the study of the ontogenetic and phylogenetic rise of perspective. I'll follow the example of the Language Evolution Blog and post about some people or papers holding - I think - interesting or important views on this topic.


Bischof-Köhler, Doris. 2000. Kinder auf Zeitreise: Theory of Mind, Zeitverständnis und Handlungsorganisation. Bern et al.: Hans Huber.

Bischof-Köhler, Doris & Norbert Bischof. 2007. Is Mental Time Travel a Frame of Reference Issue? Behavioral and Brain Sciences 30.3: 316-317.

Bredel, Ursula. 2002. “You can say you to yourself.” Establishing Perspectives with Personal Pronouns. In: Carl F. Graumann & Werner Kallmeyer (Eds.): Perspective and Perspectivation in Discourse. Amsterdam/Philadelphia: John Benjamins Publishing Company. 167-180.

Cheney, Dorothy L. and Robert M. Seyfarth. 2007. Baboon Metaphysics: The Evolution of a Social Mind. Chicago: University of Chicago Press.

Cruse, Holk. 2003. The Evolution of Cognition – A Hypothesis. Cognitive Science 27: 135–155.

Dennett, Daniel C. 1995. Darwin's Dangerous Idea: Evolution and the Meanings of Life. New York: Simon & Schuster.

Dennett, Daniel C. 2003. Beyond Beanbag Semantics. Behavioral and Brain Sciences 26.6: 673-674.

Gilbert, Daniel T. & Timothy D. Wilson. 2007. Prospection: Experiencing the Future. Science 317: 1351-1354.

Givón, Talmy. 2005. Context as Other Minds: The Pragmatics of Sociality, Cognition and Communication. John Benjamins.

Graumann, Carl F. 1994. Wieviel Zeigen steckt im Nennen? Zur Situiertheit des Sprachgebrauchs. In: H.J. Kornadt, J. Grabowski, & R. Mangold-Allwinn (Eds.) Sprache und Kognition. Perspektiven moderner Sprachpsychologie. Heidelberg: Spektrum. 55-69.

Jackendoff, Ray. 2007. Language, Consciousness, Culture: Essays on Mental Structure. Cambridge, MA: MIT Press.

Monday, January 28, 2008

The Cognitive Foundations of Perspective III

In this post I continue to elaborate on my inquiry into the cognitive structure of the shared systemic space. As we have seen, Discourse Representation Theory, File Change Semantics, the Theory of Visual Indexes, and the theory of object files together constitute a useful methodology for a research program interested in the structure of mental representation.

Hurford himself uses Kamp & Reyle’s (1993) box-notation of Discourse Representation Structures to describe the mental representations of non-human animals, because the observations I discussed in my last post led him to conclude that
“it is natural to assume that human language evolved by building upon pre-existing representational schemes in animals” (Hurford 2007: 140).
There is one thing we have to make clear before we can make the findings I described in my last post fruitful for research into the properties of the systemic space: the contributions of global and local attention. Basically, when we perceive a visual scene,
“An initial rapid pass through the visual hierarchy provides the global framework and gist of the scene and primes competing identities through the features that are detected. Attention is then focused back to early areas to allow a serial check of the initial rough bindings and to form the representations of objects and events that are consciously experienced.” (Treisman 2005: 541)

To give you an example adapted from Hurford (2007: 152), if we want to represent the results of global and local scans toward a scene in Hurford’s adapted notation of Kamp & Reyle’s (1993) Discourse Representation Structures, the result of a quick global scan would look like this:
whereas the result of focal attention to the individual objects in the scene would have the following mental representation:
This process is closely related to profiling, that is, the distinction between ‘figure’, the
“integrated visual experience that ‘stands out’ in the center of attention” (Coren et al. 1999: 564),
and 'ground',
“the background against which figures appear” (Coren et al. 1999: 565).
The most famous illustration of this principle is Rubin’s reversible face-vase figure, where we can either see a white vase as the figure which stands in front of a black ground or two black faces that are in front of a white background (Goldstein 1999: 187).

As there is additional “evidence that imagery engages brain mechanisms that are used in perception and action” (Kosslyn et al. 2001: 635), as “perceptual representations are routinely activated during comprehension” (Zwaan 2004), and as we have already established that we can indeed draw an analogy between these two areas, we are able to make an analogy between this “primitive example of perceptual organization” (Coren et al. 1999: 296) and language comprehension.

Thus we can say that the mental representations underlying language comprehension, i.e. the structure of the systemic space, are probably subject to the same principle of global and local attention/figure and ground. What I mean by this is that when we create a discourse universe, we create layers of meaning on a variety of planes, such as temporal layers (Before I started studying, I… But now… etc.), Theory of Mind layers (I thought that he knew that she knew that I…), degrees of relevance, and so on.
In such a structured systemic space, some aspects remain in the background, as ‘background knowledge’ so to speak, whereas other aspects are in the foreground and are brought into the spotlight of our attention. They represent the ‘figure’ of the message against the ‘ground’ of context (Köller 2004: 442f).
Köller (2004: 442ff.), following the German linguist Harald Weinrich, calls this power of language to establish such layered and meta-structured systemic spaces its ability to create reliefs.

Summarizing these considerations, there are now some additional properties of the systemic space, and we have gained additional insight into how language can locate things in the coordinate system of subjective orientation:

In my next post I'll speculate a bit about the ontogenetic as well as phylogenetic pathway that may have led to our modern ability to take perspectives.

On a related note, the admin of the Language Evolution Blog has posted the first post on "Major Language Evolution Papers", this time about Tomasello et al.'s (2005) great paper on the "Origins of Cultural Cognition" - go check it out!


Coren, Stanley, Lawrence M. Ward and James T. Enns. Sensation and Perception. 5th ed. Fort Worth: Harcourt Brace, 1999.

Goldstein, E. Bruce. Sensation & Perception. 5th ed. Pacific Grove: Brooks/Cole, 1999.

Hurford, James M. 2007. The Origins of Meaning: Language in the Light of Evolution. Oxford: OUP.

Kosslyn, Stephen M., Giorgio Ganis and William L. Thompson. “Neural Foundations of Imagery.” Nature Reviews Neuroscience 2 (2001): 635-642.

Kamp, Hans and Uwe Reyle. 1993. From Discourse to Logic: Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Dordrecht, Holland: Kluwer Academic.

Köller, Wilhelm. 2004. Perspektivität und Sprache. Zur Struktur von Objektivierungsformen in Bildern, im Denken und in der Sprache. Berlin/ New York: de Gruyter.

Treisman, Anne (2005). Psychological issues in selective attention. In Michael A. Gazzaniga (Ed.), The Cognitive Neurosciences III. Cambridge, MA: MIT Press: 529–544.

Zwaan, Rolf A. (2004). The immersed experiencer: toward an embodied theory of language comprehension. In: B.H. Ross (Ed.), The Psychology of Learning and Motivation, Vol. 44. New York: Academic Press.

Thursday, January 24, 2008

The Cognitive Foundations of Perspective II

So in my last post I summarized a bunch of overlapping scientific research programs which can be combined under the notion of mental representation as the creation of a virtual “systemic space”. This is exemplified by the idea, used frequently in the cognitive sciences, of short-term memory as a “workbench” where you can store and manipulate virtual objects.

Quite strikingly, when we use Wittgenstein’s (1953) notion of shared activities as ‘games’ and the creation of a shared systemic space as a ‘language game’, there are also some interesting implications of Pinker et al.’s (2008) statement that language serves as a
“reference point in coordination games.” (Pinker et al. 2008).
Thus we can also locate their definition of “focal point” i.e.,
"a salient location that two rational agents can agree on when they would be better off coordinating their behavior than acting independently.” (Pinker et al. 2008: 837)
on Bühler’s coordinate system of subjective orientation, or our notion of a shared systemic space.
On this view we can regard a discourse as the negotiation of the exact focal position of a proposition in the coordinate system of the shared systemic space (what Pinker et al. call the ‘problem space’). According to Pinker et al., such negotiation is mostly used
“to negotiate the type of relationship holding between speaker and hearer (in particular, dominance, communality, or reciprocity)” (Pinker et al. 2008: 833)
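The idea of a focal point in a coordination game can be made a bit more concrete with a toy example. The following Python sketch is only an illustration under my own assumptions: the three meeting places, the salience numbers, and the helper names `payoff` and `choose` are all made up and do not come from Pinker et al. (2008).

```python
def payoff(choice_a, choice_b):
    """Pure coordination game: both agents gain only if they pick the same option."""
    return (1, 1) if choice_a == choice_b else (0, 0)


def choose(salience):
    """Each agent simply picks whatever option is most salient to it."""
    return max(salience, key=salience.get)


# Without a shared reference point, the agents' private salience may diverge
# (hypothetical numbers).
private_a = {"station": 0.2, "cafe": 0.7, "library": 0.1}
private_b = {"station": 0.3, "cafe": 0.1, "library": 0.6}
uncoordinated = payoff(choose(private_a), choose(private_b))

# A shared convention makes the same option salient for both agents:
# this shared salient option plays the role of the focal point.
shared = {"station": 0.9, "cafe": 0.05, "library": 0.05}
coordinated = payoff(choose(shared), choose(shared))

print(uncoordinated, coordinated)
```

In this toy setting neither option is intrinsically better than the others. What matters is only that both agents end up at the same point, which is exactly what a shared reference point makes possible.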
Hurford (2007) reviews further approaches which can be subsumed under this notion:
The first he mentions is Discourse Representation Theory (DRT: Kamp and Reyle 1993). DRT is concerned with describing from a semantic point of view how in discourse we build up a universe consisting of the things we mention. In Kamp and Reyle’s notation this ‘discourse universe’ is represented by a box, which they call a Discourse Representation Structure (DRS). The simplest kind of structure in such a DRS would look something like this, with the set of ‘discourse referents’ (x, y, z, etc.) at the top of the box:

This, of course, is basically what I, following Köller (2004), would call a systemic space.
According to Kamp and Reyle, the process of semantic representation is the following: On hearing a sentence (S1), we create a DRS. When the next sentence (S2) is uttered in discourse, this sentence contributes new information to the already constructed DRS, or in my notation, new propositions are transferred into the systemic space or old ones are transformed. This process goes on and on with every new sentence.
Thus, the hearer relates the new sentence to the informational structure already obtained, thereby dynamically construing and manipulating the shared systemic space (Kamp & Reyle 1993: 59). We could say that with every sentence we change from one mental model of the discourse to another (i.e. M1 → M2 → M3, etc.) (Kamp & Reyle 1993: 96).
Kamp & Reyle also have something very interesting to say about the cognitive and attentional underpinnings of the representation of shared systemic spaces: When you hear a proper name in discourse, you assign it an index (like x, y, z, etc.) to make temporary reference possible and to keep track of the discourse referent. They call this an ‘external anchor’ for a discourse referent (x) which maps x onto some real individual (like, say, Zombie-Scientist George if he really existed) (Kamp & Reyle 1993: 248).
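To make this updating process a bit more tangible, here is a minimal Python sketch of a DRS that grows sentence by sentence. It is only a toy under stated assumptions: the class `DRS`, the function `update`, and the hand-coded analyses of the two example sentences are my own hypothetical illustrations, not Kamp & Reyle's formalism, and no actual parsing takes place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRS:
    """Toy Discourse Representation Structure (hypothetical illustration)."""
    referents: tuple = ()   # discourse referents at the top of the box: x, y, ...
    conditions: tuple = ()  # conditions on the referents, e.g. ("scientist", "x")
    anchors: tuple = ()     # external anchors: (referent, real individual) pairs


def update(drs, referents=(), conditions=(), anchors=()):
    """Return a new DRS that extends the old one with a sentence's contribution.

    The old DRS is left untouched, so each call models one step M1 -> M2 -> M3.
    """
    return DRS(
        referents=drs.referents + tuple(referents),
        conditions=drs.conditions + tuple(conditions),
        anchors=drs.anchors + tuple(anchors),
    )


# S1: "George is a scientist." (hand-coded analysis, assumed for illustration)
m1 = update(DRS(), ["x"], [("scientist", "x")], [("x", "George")])

# S2: "He owns a lab." The pronoun is resolved to the existing referent x.
m2 = update(m1, ["y"], [("lab", "y"), ("owns", "x", "y")])

print(m2.referents)
print(m2.conditions)
```

Returning a fresh structure at every step, instead of modifying the old one, is just one way to mirror the idea of moving from one model of the discourse to the next.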

This proposal is closely related to some other theories of cognition. According to Zenon Pylyshyn’s theory of FINST, regarding the ability to keep track of moving objects in a visual scene,
“a small number of visual objects can be preattentively indexed or tagged and thereby accessed more rapidly by a subsequent attentional process (e.g., the traditional "spotlight of attention")” (Sears & Pylyshyn 2000: 1)
These visual indices can be seen as mental labels that can be attached to objects in order to keep track of them (Hurford 2007: 92).

Another related theory, highlighted by Hurford (2007: 139), is ‘File Change Semantics’, according to which
“A listener’s task of understanding what is being said in the course of a conversation bears relevant similarities to a file clerk’s task. Speaking metaphorically, let me say that to understand an utterance is to keep a file which, at every time in the course of the utterance, contains the information that has so far been conveyed by the utterance.” (Heim, 1983:167)
Again, there is a psychological theory which closely echoes this assessment in the visual domain. According to Kahneman & Treisman (1992), we set up ‘object files’
“as a temporary episodic representation, within which successive states of an object are linked and integrated” (Kahneman & Treisman 1992: 175)
These files are constantly updated by new information about the target’s features or location.
The attentional limit on the number of things we can consciously be aware of seems to lie at 4 target objects (Hurford 2007: 93, Cowan 2001), and this limit seems to hold true for the perceptual space as well as for the systemic space.
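The combination of object files and a small attentional limit can be sketched in a few lines of Python. This is a toy model under my own assumptions: the class `ObjectFiles`, its method `observe`, and the rule that a fifth object is simply not tracked are hypothetical simplifications, not claims made by Kahneman & Treisman or by Pylyshyn.

```python
CAPACITY = 4  # assumed limit, following the figure of about four objects cited above


class ObjectFiles:
    """Toy store of object files: one temporary record per tracked object."""

    def __init__(self, capacity=CAPACITY):
        self.capacity = capacity
        self.files = {}  # index -> list of successive states of that object

    def observe(self, index, state):
        """Update an existing file, or open a new one if an index is still free.

        Returns False when all indexes are in use and the object goes untracked
        (a simplifying assumption of this sketch).
        """
        if index in self.files:
            self.files[index].append(state)
            return True
        if len(self.files) >= self.capacity:
            return False
        self.files[index] = [state]
        return True


store = ObjectFiles()
for name in ["a", "b", "c", "d"]:
    store.observe(name, {"position": (0, 0)})

store.observe("a", {"position": (1, 0)})                   # same object, new location
fifth_tracked = store.observe("e", {"position": (5, 5)})   # no free index left

print(len(store.files), fifth_tracked)
```

Each file keeps the successive states of one object linked together, so the history of object "a" stays in a single record even though its location changes.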

This research about visual indexes as
“a means of setting attentional priorities when multiple stimuli compete for attention” (Sears and Pylyshyn 2000: 2),
as well as the idea of information ‘files’, goes very well with our notion of language as a means to pilot attention toward certain propositions in the systemic space.

In sum, we see that there are overlapping theories concerning mental representations of the perceptual space as well as of the virtual systemic space. Hurford (2007) argues that this independent convergence of several areas of research indicates that:
“one bit of language-processing machinery has been co-opted (and probably adapted somewhat) from pre-existing visual scene processing machinery.” (Hurford 2007: 140)


Cowan, Nelson. 2001. The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences 24(1), 87–114.

Heim, Irene. 1983. File change semantics and the familiarity theory of definiteness. In R. Bäuerle, C. Schwarze, and A. von Stechow (Eds.), Meaning, Use, and Interpretation of Language. Berlin: Walter de Gruyter: 164– 189.

Hurford, James M. 2007. The Origins of Meaning: Language in the Light of Evolution. Oxford: OUP.

Kahneman, Daniel and Anne Treisman. 1992. The reviewing of object files: object-specific integration of information. Cognitive Psychology 24, 175–219.

Kamp, Hans and Uwe Reyle. 1993. From Discourse to Logic: Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Dordrecht, Holland: Kluwer Academic.

Köller, Wilhelm. 2004. Perspektivität und Sprache. Zur Struktur von Objektivierungsformen in Bildern, im Denken und in der Sprache. Berlin/ New York: de Gruyter.

Pinker, Steven, Martin A. Nowak and James J. Lee. 2008. The logic of indirect speech. Proceedings of the National Academy of Sciences 105.3: 833–838.

Sears, Christopher R. and Zenon W. Pylyshyn. 2000. “Multiple object tracking and attentional processing.” Canadian Journal of Experimental Psychology 54(1), 1–14.

Wittgenstein, Ludwig. 1953. Philosophical Investigations. Oxford: Basil Blackwell. (Translated by G. E. M. Anscombe)

Monday, January 21, 2008

The Cognitive Foundations of Perspective

In my last three posts I have highlighted convergences between the arguments made by German linguist Wilhelm Köller in his 2004 book “Perspektivität und Sprache” (“Perspectivity and Language”) and findings in the general area of cognitive science.

In this post, I want to dig a little deeper, and ask what happens when we engage in discourse with regard to the cognitive representations that are established. Köller argues that through the use of language we create a shared ‘systemic space’ (a term he takes over from the German art historian Erwin Panofsky), a virtual, conceptual model of what we are talking about. We can then create and direct attention to certain propositions in this shared systemic space, thereby advertising a certain perspective on the world, and highlighting relevant information. Köller bases this view of language on German psychologist and linguist Karl Bühler’s (1934) notion of the “Deictic/Symbolic field” which we construct, and wherein we place and transport referential semantic messages, and which basically can be illustrated like this:
Bühler calls this the ‘coordinate system of subjective orientation’. The O represents the ‘Origopoint’, the point of origin for all referential messages, and is also called the “I-here-now-Origo”, because from this point of view/perspective the sender refers to things in the world, and introduces new referents into the discourse. Köller adopts this view of language, but adds to it the perspectival and attention-piloting nature of discourse.

In sum, in Köller’s view language is:
  1. a means to create a shared, ‘virtual’ systemic space
  2. a means to transfer relevant propositions into the systemic space and
  3. a means to focus and pilot attention and thus to
  4. create perspective

This view of mental representations as virtual, internal models of some state of affairs in the world is echoed in a variety of other proposals in cognitive science.

First, and most importantly, Michael Tomasello and his colleagues from our beloved Max Planck Institute for Evolutionary Anthropology argue for the importance of Shared Intentionality in human cognition, that is,
“the ability to participate with others in collaborative activities with shared goals and intentions” (Tomasello et al. 2005: 675).
In order to communicate or collaborate, it is necessary that
“each participant cognitively represent both roles of the collaboration in a single representational format – holistically, from a ‘bird’s-eye view’” (Tomasello et al. 2005: 681).

Thus, we need to have the knowledge that there exists a “shared space of common psychological ground” (Tomasello & Carpenter 2007: 121).
This means that the most fundamental aspect of human cognition that enables joint attention and shared intentional actions is the ability to cognitively represent a “shared space of meaning” (Moll & Tomasello 2007)/a ‘systemic space’ from a central perspective (Köller 2004).

Dan Dennett (1996) also holds that one of the major feats of human cognition is the ability to build virtual models of reality, especially the ability to build rich intentional models of yourself and others. In this virtual model, you can take the physical stance, the design stance, or the intentional stance toward the systems created and basically see what happens.

In addition, Bickerton (1990) argues that the most special trait of human cognition is the ability to entertain counterfactual propositions. To give you an example in the vein of Köller (2004): If I say something like:
“There aren’t any Zombies in this room,”
I introduce not just the concept of ‘Zombies’ into the systemic space, but negated Zombies:

Such counterfactual propositions needn’t necessarily be linguistic. In his 2006 Jean Nicod Prize lecture, Michael Tomasello gave the example of a child who always wears a belt when she goes for a walk. When the child and her mother leave the house, and the mother sees that the child doesn’t have her belt on and merely points to the child’s waistband, the child goes ‘Ooops’, goes back inside and fetches the belt. Mother and child thus communicate nonverbally about a counterfactual proposition based on mutual knowledge/common ground about a conventionalized set of actions. Bickerton proposes that this ability is syntactically structured and enables us to manipulate physical and virtual events, objects, and processes in general.

Don Ross (2007), reviewing Dennett’s and Bickerton’s proposals, argues that on this view language is a means to stabilize ‘referential fixed points’ in these virtual spaces. So on the one hand language can be seen as a cognitive aid or artifact which enhances cognitive representations via conceptual labeling, and on the other hand it can be seen as a means of introducing fixed points into the virtual shared systemic space of joint attentional discourse.
Ross then draws our attention to the fact that we also introduce ourselves into these virtual spaces, thus ‘narrating’, and thereby creating, our selfhood and our identity both in private mental representation and shared intentional discourse.

Suddendorf & Corballis (2007) have similar ideas about the importance of self- and other-projection into virtual spaces as a method of ‘Mental Time Travel’ (MTT), which enables us to plan for and adapt to possible future states of the world as well as future needs by learning from and re-experiencing the past and projecting ourselves into future situations where we can practice various plans of action. They liken this process to a theater production, as the necessary precondition for virtual planning and decoupled thinking processes is a virtual systemic space, or, if you go with their theater metaphor, a stage. According to Suddendorf & Corballis (2007), further necessary components of Mental Time Travel are:
  • a declarative database or script from which you can infer how to act in a given situation (the playwright),
  • the ability to represent yourself and others realistically, ergo a Theory of Mind (the actors),
  • the ability to create an adequate physical context in which the mental representation can operate (the set),
  • the motivation and ability to practice and rehearse future actions in the virtual space (the director),
  • the ability to voluntarily control and execute the ‘best plan’ to achieve the future goal, which is relatively ‘rational’, and decoupled from present stimuli and needs (the executive producer),
  • and additionally, many MTTs are expressed publicly via language: "More generally, humans use language to exchange and complement their mental travels into the past and their ideas about future events, as well as to cooperatively coordinate plans and strategies" (Suddendorf & Corballis 2007: 310) (the broadcaster).

Virtual internal modeling can of course also be seen as a solution to the exploration-exploitation trade-off as well as to the unstable environment Homo is said to have emerged from.

Interestingly, there is additional support for this internal modeling hypothesis, namely from robotics.
As you may recall, in my first post I briefly described an experiment done by Floreano and Mondada (1996) in which a robot evolved to internally represent the area surrounding it, thus enabling it to act in relation to the virtual internal body-related map it had created.
Even more intriguing, Bongard et al. (2006) describe a robot which adapts to an unstable environment as well as injuries (such as losing a leg) by continuous internal self-modelling.

Of course, the notion of internal systemic representation is practically all-pervasive throughout the field of cognitive science. It is echoed, for example, in the controversy between Theory-Theory and Simulation Theory proponents in Theory of Mind research, in the ongoing controversies regarding the relation between mirror neurons and mental imagery, and in practically every aspect of cognition. But I find the combination of internal representational spaces, shared intentionality and perspective incredibly interesting, and I will write a bit more about their cognitive and linguistic foundations in my next post. I’ll also try to stay more down to earth in my next post.


Bickerton, Derek. 1990. Language and Species. Chicago: University of Chicago Press.

Bongard, J., V. Zykov, & H. Lipson. 2006. “Resilient Machines Through Continuous Self-Modeling.” Science 314: 1118-1121.

Bühler, Karl. 1934. Sprachtheorie. Die Darstellungsfunktion der Sprache. Jena: Gustav Fischer.

Dennett, Daniel C. 1996. Kinds of Minds. New York: Basic Books.

Floreano, Dario, and Francesco Mondada. 1996. “Evolution of homing navigation in a real mobile robot”, IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics 26:396–407.

Köller, Wilhelm. 2004. Perspektivität und Sprache. Zur Struktur von Objektivierungsformen in Bildern, im Denken und in der Sprache. Berlin/ New York: de Gruyter.

Moll, Henrike, & Michael Tomasello. 2007. Co-operation and human cognition: The Vygotskian intelligence hypothesis. Philosophical Transactions of the Royal Society 362: 639-648.

Ross, Don. 2007. H. sapiens as ecologically special: what does language contribute? Language Sciences 29.5: 710-731.

Tomasello, Michael. 1999. The Cultural Origins of Human Cognition. Cambridge, Massachusetts; London, England: Harvard University Press.

Tomasello, Michael and Malinda Carpenter. 2007. Shared Intentionality. Developmental Science 10:1:121-125.

Tomasello, Michael, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll. 2005. “Understanding and Sharing Intentions: The Origins of Cultural Cognition.” Behavioral and Brain Sciences 28.4: 675-735.

Suddendorf, Thomas & Michael C. Corballis. 2007. The Evolution of Foresight: What is mental time travel, and is it unique to humans? Behavioral and Brain Sciences 30.3: 299-313.

Thursday, January 17, 2008

Wilhelm Köller’s Lack of "Perspective"

In my last post I showed some of the foundations we need in order to get a communication system off the ground, so that a shared symbolic storage is able to evolve. But how are we able to stabilize a shared lexicon?
Köller (2004) quotes G.H. Mead, who wrote in his 1932 essay “The Objective Reality of Perspectives” that:
“it is only insofar as the individual acts not only in his own perspective but also in the perspective of others, especially in the common perspective of a group, that a society arises […]. The limitation of social organisation is found in the inability of individuals to place themselves in the perspectives of others, to take their points of view. […] we find here an actual organisation of perspectives […] This principle is that the individual enters into the perspectives of others, insofar as he is able to take their attitudes, or occupy their points of view” (Mead 1932)
Flavell (1985) argues for the importance of role-taking (inferring psychological processes in other people) in the development of communication skills in a similar fashion.

As we have seen in the robot experiment by Luc Steels & Martin Loetzsch (2007) that I described in my last post, in order to arrive at a stabilization of a shared lexicon, it is crucial for the communicating agents to be able to take the perspective of another agent in order to reconstruct the scene from his viewpoint and thus check for all possible referents and meanings of a communicative utterance.
In this experiment, the perspective-taking described is of course only a visual alignment of perspectives, but one can conjecture that the same holds true for more ‘abstract’, conceptually driven, higher-order communicative acts, and that in order to stabilize a full-fledged, semantically laden language system such as a human language, the same process must be carried out with respect to the mental states and thoughts of others. Thus, in order to check for possible interpretations of an utterance in a pragmatic context, we have to be able to attribute mental states to others, i.e. have a Theory of Mind which enables us to take into account not only the visual perspective of another person, but also his beliefs, desires, knowledge, etc.
It is likely that visual perspective-taking precedes full-fledged mental perspective-taking, and Flavell’s distinction between Level 1 and Level 2 acts proves very useful in this context:
“At Level 1, [the child] is capable of nonegocentrically inferring that [someone else] sees an object presently nonvisible to [the child] himself. At Level 2, [the child] is also capable of nonegocentrically inferring how an object that both currently see appears to [someone else], that is, how it looks from his particular spatial perspective.” (Masangkay et al. 1974: 357)
Level 1 perspective-taking seems to be mastered at 24 months of age. At this age children are able to infer that they can see a toy which the adult himself cannot see because his line of sight is occluded (Moll & Tomasello 2006).

The classic test for Level 2 perspective-taking is Piaget & Inhelder’s (1956) ‘three-mountain experiment’, in which a child has to choose a photograph which corresponds to another person’s viewpoint (in this case a doll's) of three toy mountains. Köller uncritically adopts Piaget & Inhelder’s assessment that only at the age of 8 years were children able to transcend their own egocentric perspective. Before that age, they would insist that the doll saw the mountains in exactly the same way as they did (Köller 2004: 148).

However, as subsequent research has shown, children already succeed in more child-friendly versions of the task at 4-5 years of age (Masangkay et al. 1974, Flavell et al. 1981, Light & Nix 1983). Although these advances in research on perspective-taking are more than 25 years old by now, Köller doesn’t mention them.

There is another, much more elaborate paradigm of cognitive development research which gives insights into what children know about the mental representations of others, namely the Theory of Mind paradigm. Originally the term stems from primatology: in 1978, David Premack & Guy Woodruff asked “Does the chimpanzee have a Theory of Mind?” ToM was defined as the ability to impute mental states to oneself and others. (The question is still hotly debated, and I’m really interested in how the next Behavioral and Brain Sciences battle will turn out when it is published along with its target articles.)

In 1983 Heinz Wimmer & Josef Perner developed the following experimental paradigm to check for ToM in human children:
“A story character, Maxi, puts chocolate into a cupboard x. In his absence his mother displaces the chocolate from x into cupboard y. Subjects have to indicate the box where Maxi will look for the chocolate when he returns. Only when they are able to represent Maxi’s wrong belief (‘Chocolate is in x’) apart from what they themselves know to be the case (‘Chocolate is in y’) will they be able to point correctly to box x.” (Wimmer & Perner 1983: 106)
In Wimmer & Perner’s experiment, ToM performance rose significantly around the ages of 4 to 6 years.
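The logic of the task can be made explicit in a small sketch. The Python below is only an illustration under my own assumptions (the `Agent` class, its `witness` method and the `run_maxi_story` function are hypothetical names, not anything taken from Wimmer & Perner): an agent's belief about where an object is gets updated only by events that agent actually witnesses, so the belief can come apart from reality.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A hypothetical agent that tracks where it believes objects are."""
    name: str
    present: bool = True
    beliefs: dict = field(default_factory=dict)

    def witness(self, obj: str, location: str) -> None:
        # Beliefs only change when the agent is there to see the event.
        if self.present:
            self.beliefs[obj] = location


def run_maxi_story():
    world = {}                     # what is actually the case
    maxi = Agent("Maxi")
    observer = Agent("child")      # the child who watches the whole story

    def move(obj, location):
        world[obj] = location
        for agent in (maxi, observer):
            agent.witness(obj, location)

    move("chocolate", "x")         # Maxi puts the chocolate into cupboard x
    maxi.present = False           # Maxi leaves
    move("chocolate", "y")         # his mother moves it to cupboard y
    maxi.present = True            # Maxi returns

    return world, maxi, observer


world, maxi, observer = run_maxi_story()
# Passing the task means answering with Maxi's belief, not with reality.
print("Reality:", world["chocolate"])
print("Child knows:", observer.beliefs["chocolate"])
print("Maxi will look in:", maxi.beliefs["chocolate"])
```

In this toy version, a child who passes the test reports Maxi's stored belief, while a child who fails it answers with the state of the world instead.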

In 1985, Simon Baron-Cohen and his colleagues further popularized the ToM-paradigm, with this experiment:
“There were two doll protagonists, Sally and Anne. […] Sally first placed a marble into her basket. Then she left the scene, and the marble was transferred by Anne and hidden in her box. Then, when Sally returned, the experimenter asked the critical Belief Question: ‘Where will Sally look for her marble?’. If the children point to the previous location of the marble, then they pass the Belief Question by appreciating the doll’s now false belief. If however, they point to the marble’s current location, then they fail the question by not taking into account the doll’s belief.” (Baron-Cohen et al. 1985: 41)
Baron-Cohen and his colleagues tested children with autism, as well as children with Down syndrome and normal children. Whereas most normal children succeeded in the test - as did the children with Down syndrome - most children with autism failed the test.

Typically, normal children pass these ‘classic’ ToM tests somewhere between the ages of 3 and 5, depending on a variety of factors (Wellman et al. 2001).

Level 2 perspective-taking seems to be closely associated with Theory of Mind, because it includes the “realization that minds can take different perspectives on the world because they represent it differently” (Aichhorn et al. 2006: 1067). Thus it isn’t surprising that Level 2 perspective-taking is mastered around the age of 4, which is the same time children succeed in Theory of Mind tests (Aichhorn et al. 2006: 1059).
Thus, in order to answer the question posed at the beginning of this post, in order to stabilize a shared lexicon, not only do we need the ability for purely ‘physical’ perspective shifts, but we also need to be able to put ourselves in the ‘cognitive shoes’ (Tomasello 1999) of another person to check for the possible referents of an utterance. Or, as Michael Tomasello puts it:
“As the child masters the linguistic symbols of her culture she thereby acquires the ability to adopt multiple perspectives simultaneously on one and the same perceptual situation. As perspectivally based cognitive representations, then, linguistic symbols are based not on the recording of direct sensory or motor experiences, as are the cognitive representations of other animal species and human infants, but rather on the ways in which individuals choose to construe things out of a number of other ways they might have construed them, as embodied in the other available linguistic symbols that they might have chosen, but did not. Linguistic symbols thus free human cognition from the immediate perceptual situation not simply by enabling reference to things outside this situation […] but rather by enabling multiple simultaneous representations of each and every, indeed all possible, perceptual situations.” (Tomasello 1999: 9)
Sadly, this approach is completely absent from Köller’s account of children’s cognitive and perspectival development, and in my next post I’ll write a little more about the cognitive foundations of perspectivity that I plan to write my term paper about.

P.S.: All in all, of course, that's not to say that I don't think "Perspektivität und Sprache" is a great book - arguably it is - but like every book, it has its weaknesses. In other parts of the book, on the perspectival implications of lying, for example, he does in fact cite primatological evidence, referring to work done by evolutionary anthropologist Volker Sommer as well as arguments made by Dan Dennett regarding Machiavellian Intelligence.


Aichhorn, Markus, Josef Perner, Martin Kronbichler, Wolfgang Staffen & Gunther Ladurner. 2006. “Do visual perspective tasks need theory of mind?” Neuroimage 30: 1059-1068.

Baron-Cohen, Simon, Alan M. Leslie & Uta Frith. 1985. Does the autistic child have a “theory of mind”? Cognition 21: 27-46.

Flavell, John H. 1985. Cognitive Development. 2nd ed. Englewood Cliffs, N.J.: Prentice-Hall.

Flavell, John H., Barbara Abrahams Everett, Karen Croft, & Eleanor R. Flavell. 1981. Young Children’s Knowledge about Visual Perception: Further Evidence for the Level 1 – Level 2 Distinction. Developmental Psychology 17: 99-103.

Köller, Wilhelm. 2004. Perspektivität und Sprache. Zur Struktur von Objektivierungsformen in Bildern, im Denken und in der Sprache. Berlin/ New York: de Gruyter.

Light, Paul and Carolyn Nix. 1983. Own View versus Good View in a Perspective-Taking Task. Child Development 54.2: 480-483.

Masangkay, Zenaida, Kathleen A. McCluskey, Curtis W. McIntyre, Judith Sims-Knight, Brian E. Vaughn, and John H. Flavell. 1974. The Early Development of Inferences about the Visual Percepts of Others. Child Development 45: 357-366.

Mead, George Herbert. 1932. The Philosophy of the Present. Edited by Arthur E. Murphy. La Salle, Ill.: Open Court.

Moll, Henrike and Michael Tomasello. 2006. Level 1 Perspective-Taking at 24 Months of Age. British Journal of Developmental Psychology 24: 603-613.

Piaget, Jean & Bärbel Inhelder. 1956. The child’s conception of space. London: Routledge.

Premack, David and Guy Woodruff. 1978. Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences 1: 515-526.

Steels, Luc and Martin Loetzsch. 2007. “Perspective Alignment in Spatial Language.” Spatial Language and Dialogue. Eds. K.R. Coventry, T. Tenbrink, and J.A. Bateman. Oxford: Oxford University Press.

Tomasello, Michael. 1999. The Cultural Origins of Human Cognition. Cambridge, Massachusetts; London, England: Harvard University Press.

Wimmer, Heinz and Josef Perner. 1983. Beliefs about Beliefs: Representation and Constraining Function of Wrong Beliefs in Young Children’s Understanding of Deception. Cognition 13: 103-128.

Monday, January 14, 2008

A Second Perspective

So in my last post I criticized German linguist Wilhelm Köller’s book “Perspectivity and Language” for not taking into account the research done in much of biology and cognitive science over the last 25 years or so. (To be honest, I’m afraid this book is never going to be translated into any other language, and even I, as a native speaker of German, sometimes have a hard time extracting the meaning from his convoluted sentences.) There is definitely a need to integrate such data into an account of perspectivity and especially perspective-taking, and without it Köller’s book seems (to me at least) awkwardly incomplete.
Yet I also stressed that much of what Köller has to say is nevertheless in accordance with recent work done in the field of cognitive science.

To give you some examples:
Köller claims that perspectivity is essentially a context-dependent, dynamic and systematic process of meaning construal. The meaning of metaphors, for example, is in this view created by variable reciprocal semantic interactions of dynamic structural patterns. And indeed, recent work in cognitive poetics has shown that the mere act of understanding sentences requires perspective shifts (Gibbs 2003: 35ff.). Köller’s claim is further supported by views of cognition which do not treat meaning and categories as static amodal entities, but as instantiated products of “dynamic meaning construal” (Gibbs 2003).
Thus Köller’s approach is basically compatible with Larry Barsalou’s (1999) notion of conceptualization as the integrated interaction of multimodal perceptual symbol systems, that is, a simulation which employs stored ‘captured states’ of specific percepts (such as the groan of a zombie, the smell of rotting flesh, the sight of the white lab coat, and the introspective response of fear) in order to form a fully-fledged mental representation of, say, zombie-scientist George. Of course this assessment needs “caveatting”, as the two employ wholly different levels of explanation of perspectives. Yet they also concur in their emphasis on the aspectual and embodied nature of human cognition as one of its main constituents.
Köller also operates under the old philosophical assumption that what we perceive and communicate about actually is just a cognitively constructed model. And indeed there is a lot of evidence from neuroscience and cognitive science that supports such a view (Metzinger 2004).

When Köller treats perspectivity as a basic semiotic category of human cognition which is deeply ingrained into the structure of human perception, categorization, and communication, he also converges with much recent work done in the field of language evolution.
The potential available to express and yield perspectives is an intrinsic property of the language system. It is there because it represents the accumulated attempts of generations of speakers who tried to perspectivize their environment in distinct ways, who tried to communicate with each other and directed attention to certain aspects of the world around them in particular ways. Köller also emphasizes the importance of the direction of attention via language, especially that a main function of language is to bring others to focus on distinct properties relevant in a specific socio-pragmatic context.
These observations are in the same vein as the proposals of “Relevance Theory”, first developed by Dan Sperber and Deirdre Wilson in their seminal book “Relevance: Communication and Cognition” (1986), whose main assumption is that
“Human cognition tends to be geared to the maximisation of relevance.” (Wilson & Sperber 2004)
The medium by which
“attention and processing resources are allocated to information that seems relevant.” (Wilson 1999)
is language.

Relevance and communicative success are also seen as the major driving forces in the evolution of language in a communicating group of agents, and a lot of research has been done in this vein (for examples of computer modeling see Kirby & Christiansen 2003, for examples of mathematical modeling see Nowak et al. 2002, and for examples of work done in evolutionary robotics see Lipson 2007.)
The most impressive example of this approach I know of is presented by Luc Steels and Martin Loetzsch (2007). They developed
“situated embodied agents that self-organise a communication system for dialoging about the position and movement of real world objects in their immediate surroundings.”
Steels and Loetzsch had two AIBOs play robot soccer with an orange ball and coordinate their actions by communicating (via a highly complex language processing software) about the movements and whereabouts of the ball.

They came to the conclusion that
“Perspective alignment is possible when the agents are endowed with two abilities: (i) to see where the other one is located, and (ii) to perform a geometric transformation known as Egocentric Perspective Transform.”
But what was even more remarkable was that under these conditions not only were the agents able to stabilize a shared vocabulary and generate a successful communication system because they were able to verify the possible meaning of a signal from both their own perspective and that of the other, but, given a certain cognitive architecture, their language system also developed perspective markers, which reduced the cognitive effort of perspective alignment and perspective-taking!
I find this a pretty darn incredible result and I’ll try to blog a bit more on it in the future.
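To make the second ability a bit more concrete, here is a minimal sketch of what such a transform could look like in two dimensions. It is not the implementation Steels and Loetzsch used; the function names, the coordinate convention (the viewer sits at the origin, faces along +x, and positive y is to the left) and the example numbers are all my own assumptions for illustration.

```python
import math


def egocentric_transform(obj_xy, other_xy, other_heading):
    """Re-express a point seen from my viewpoint in another agent's frame.

    obj_xy        -- object position in MY frame (I am at the origin, facing +x)
    other_xy      -- the other agent's position in my frame
    other_heading -- direction the other agent faces, in radians, in my frame
    """
    # Shift the origin to the other agent ...
    dx = obj_xy[0] - other_xy[0]
    dy = obj_xy[1] - other_xy[1]
    # ... then rotate by minus its heading so its facing direction becomes +x.
    cos_h, sin_h = math.cos(other_heading), math.sin(other_heading)
    return (cos_h * dx + sin_h * dy, -sin_h * dx + cos_h * dy)


def side(point_xy):
    """Label a point 'left' or 'right' of the viewer (y > 0 is left)."""
    return "left" if point_xy[1] > 0 else "right"


# The ball is one metre ahead of me and slightly to my left.
ball_for_me = (1.0, 0.5)
# The other robot stands two metres ahead of me and faces back toward me.
ball_for_other = egocentric_transform(ball_for_me, (2.0, 0.0), math.pi)

print(side(ball_for_me), side(ball_for_other))
```

With two agents facing each other, the same ball ends up on opposite sides in the two frames, which is exactly the kind of ambiguity a word like "left" carries unless the hearer can work out whose perspective is meant.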

Still there are some things in Köller’s book which I’m really unhappy about, mainly his treatment of cognitive development, and I’ll come back to that in my next post.


Barsalou, Lawrence W. 1999. "Perceptual Symbol Systems." Behavioral and Brain Sciences 22.4: 577-660.

Gibbs, Raymond W., Jr. 2003. “Prototypes in Dynamic Meaning Construal.” Cognitive Poetics in Practice. Eds. Joanna Gavins and Gerard Steen. London: Routledge. 27-40.

Kirby, Simon and Morten H. Christiansen. 2003. “From language learning to language evolution.” Language Evolution. Eds. M. Christiansen and S. Kirby. Oxford: Oxford University Press. 272-294.

Köller, Wilhelm. 2004. Perspektivität und Sprache. Zur Struktur von Objektivierungsformen in Bildern, im Denken und in der Sprache. Berlin/ New York: de Gruyter.

Metzinger, Thomas. 2004. “The Subjectivity of Subjective Experience: A Representationalist Analysis of the First-Person Perspective.” Networks 3-4: 33-64.

Lipson, Hod. 2007. Evolutionary Robotics: Emergence of Communication. Current Biology 17.9: 330-332.

Nowak, Martin, Natalia L. Komarova & Partha Niyogi. 2002. “Computational and evolutionary aspects of language” Nature 417: 611-617.

Sperber, Dan, and Deirdre Wilson. 1986. Relevance: Communication and Cognition. Oxford: Blackwell.

Steels, Luc and Martin Loetzsch. 2007. “Perspective Alignment in Spatial Language.” Spatial Language and Dialogue. Eds. K.R., Coventry, T. Tenbrink, and J.A. Bateman. Oxford: Oxford University Press.

Wilson, Deirdre. 1999. Relevance Theory. 719-722.

Wilson, Deirdre & Dan Sperber. 2004. “Relevance Theory.” The Handbook of Pragmatics. Eds. L. Horn & G. Ward. Oxford: Blackwell. 607-632.

Thursday, January 10, 2008

Back and From a Different Perspective I

So I'm back from my internetless Christmas holidays (my newsfeed told me that I had about 300 unread posts - Argh!) and I think that I’ll write a little bit about what I “do for a living” for a change (or rather, what I do in order to be able to do something for a living somewhere in the distant future), that is, study German and English Philology.

This semester I’ll be writing two term papers and I’ll expand a bit on which issues I’d like to write about.
In German, I’m currently taking a course entitled “Perspectivity in language from a grammatical point of view”, whose main focus lies on the magnum opus of the German linguist Wilhelm Köller, called “Perspektivität und Sprache” (“Perspectivity and Language”; the subtitle is too hard for me to translate…).
Köller’s main idea is that when we talk we not only express that we’ve taken a certain perspective toward a situation or set of facts in the world, but that we also try to bring others to take the same perspective. When we use language, we advertise a certain point-of-view from which to interpret and process information. By our words, we stratify and structure information in certain ways and try to bring others to take a certain perspective toward what we talk about.
In his 900-page monstrum, Köller examines in detail the various means by which we express perspectivity in, say, conversations, talks, or texts, and how we direct attentional focus toward certain aspects of a state in the world.
According to Köller, perspectivity is an intrinsic property of language in general, and he shows how such things as the case system, tenses, verbs in general, conjunctions, negations, etc., all possess a potential for expressing perspectivity and advertising certain points-of-view. Metaphors, for example, he treats as heuristic tools for the generation of meaning and orientation, bringing the world into focus in variable ways (Köller 2004: 600).
Köller’s book is divided into four parts: (A) a general introduction, (B) perspectivity in the visual domain, (C) perspectivity in the cognitive domain, and (D) perspectivity in the linguistic domain. Interestingly, Köller comments on some of the key themes of discussions about human cognition and its evolution, although he is oblivious to most of the recent research done in the English-speaking scientific community. For someone whose focus is primarily a linguistic/philosophical one, Köller takes a fairly interdisciplinary approach.
In his chapter on “perspectivity in the visual domain”, he stresses the importance of the fact that our “minds have bodies that are situated in environments” (Poirier et al. 2005: 741), which he describes as a preconditional a priori of all our experience that defines our point-of-view and demarcates what we can and cannot perceive (Köller 2004: 133). As the quote from Poirier et al.’s 2005 paper already shows, Köller shares with them and other modern researchers the idea that embodiment is an important aspect of all cognitive processes.
In his chapter on “perspectivity in the cognitive domain” Köller addresses perspectivity as a primeval anthropological problem, and tries to gain insight into the phylogenetic history of perspectivity by looking at the ontogenetic cognitive development of children and drawing conclusions from it. He sums up the theories of Jerome Bruner, Lev Vygotsky, Alexander Luria, Jean Piaget and John H. Flavell and comments on their implications for a theory of perspective-taking.

Sadly, this is as far as he goes on the scientific timeline. The most recent work he cites on the topic of cognitive development stems from 1976. No mention of Theory of Mind, or any of the experiments trying to infer when this ability really kicks off in children, no Tomasello etc.
Thus the main problems of this section are that, firstly, Köller seems to swallow Piaget’s idea of egocentric speech whole (which, I think, is clearly refuted by the data accumulated by, e.g., Tomasello et al. 2005).
Secondly, he uncritically adopts Piaget’s take on the development of perspective-taking, which, as a ton of research done since then (e.g. Premack & Premack 1997, Hamlin et al. 2007, Rakoczy et al. 2007, Surian et al. 2007) clearly shows, gives children way too little credit for their cognitive achievements in their early years, and generally sets the development of cognitive traits such as Theory of Mind way too late. I really can't understand why Köller only describes Piaget & Inhelder's (1956) "Three-Mountain" experiments, which established that only at the age of eight were children able to consider that someone else saw a mountain from a different angle, without ever mentioning the important revisions made by Masangkay et al. (1974), Flavell et al. (1981), and Light & Nix (1983), who argued and presented evidence that children were already able to succeed at more child-friendly versions of this task at the age of 4-5.
Thirdly, Köller doesn’t cite any primatological or comparative ethological research in order to gain insight into the evolution of perspective-taking in humans (e.g., again, Tomasello et al. 2005), which is especially startling given the slight phonetic resemblance between the names Wilhelm Köller and Wolfgang Köhler.
So this is what I’d like to do in my term paper: present a cognitive science update of Köller’s inquiries into the phylogeny and ontogeny of perspective-taking.
I’ll come back to that in my next post.

On a related note, Benoit Hardy-Vallée has posted a cool summary about "Embodied, Situated and Distributed Cognition", which is really worth checking out.


Flavell, John H., Barbara Abrahams Everett, Karen Croft, & Eleanor R. Flavell. 1981. Young Children’s Knowledge about Visual Perception: Further Evidence for the Level 1 – Level 2 Distinction. Developmental Psychology 17: 99-103.

Hamlin, J. Kiley, Karen Wynn & Paul Bloom. 2007. “Social evaluation by preverbal infants.” Nature 450: 557-560.

Köller, Wilhelm. 2004. Perspektivität und Sprache. Zur Struktur von Objektivierungsformen in Bildern, im Denken und in der Sprache. Berlin/ New York: de Gruyter.

Light, Paul and Carolyn Nix. 1983. Own View versus Good View in a Perspective-Taking Task. Child Development 54.2: 480-483.

Masangkay, Zenaida, Kathleen A. McCluskey, Curtis W. McIntyre, Judith Sims-Knight, Brian E. Vaughn, and John H. Flavell. 1974. The Early Development of Inferences about the Visual Percepts of Others. Child Development 45: 357-366.

Poirier, Pierre, Benoit Hardy-Vallée and Jean-Frédéric Depasquale. 2005. “Embodied Categorization.” Handbook of Categorization in Cognitive Science. Eds. Henri Cohen and Claire Lefebvre. Amsterdam: Elsevier.

Premack, David and Ann James Premack. 1997. Infants Attribute Value to the Goal-Directed Actions of Self-Propelled Objects. Journal of Cognitive Neuroscience 9.6: 848-856.

Rakoczy, Hannes, Felix Warneken, and Michael Tomasello. 2007. “‘This way!’, ‘No! That way!’ - 3-year olds know that two people can have mutually incompatible desires.” Cognitive Development 22: 47-68.

Surian, Luca, Stefania Caldi, and Dan Sperber. 2007. “Attribution of Beliefs by 13-Month-Old Infants.” Psychological Science 18.7: 580-586.

Tomasello, Michael, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll. 2005. “Understanding and Sharing Intentions: The Origins of Cultural Cognition.” Behavioral and Brain Sciences 28.4: 675-735.