Although I knew I wanted to do cognitive neuroscience from early on in my graduate training, it would be years before I actually started taking pictures of brains with a giant magnet. Part of this was the availability of the equipment. When I was in graduate school it was not de rigueur for research-intensive psychology departments to have easy and consistent access to an MRI scanner, so the intrepid psychologists trying to do brain imaging at my school were dependent on filling in awkward time slots on a clinical MRI housed in a trailer on the medical campus, about a twenty-minute drive from our department.
In the long run this was probably good for me. Careers can be a bit like that Belle and Sebastian song that goes “if you dance for much longer, you’ll be known as the boy who’s always dancing.” The technique you train on as a student often determines what you do for at least the early part of your career. If you become “an imager,” the questions you will be expected to ask will always have answers in terms of pseudocolored maps of brains.
My training was a little different, because I was encouraged to try out a range of different techniques in pursuit of interesting questions. In that way, I wound up accumulating a set of computational models, behavioral experiments on humans, and a neuroethological study of zebra finches over my graduate career. The thread that tied this odd assortment of papers together was the issue of how plasticity changes over the life span. So when I finally started to use neuroimaging, in my post-doctoral training, it was with the goal of asking questions that I had developed while working with computational modeling, neuroethology, and experimental psychology. I had both scientific and practical reasons for going in this direction. I wanted to understand why people lose the ability to learn new speech categories as they grow up and learn to speak, and it seemed that looking at the brain could teach us something new about this. But I also wanted to be employable, and since I was already leaning in the direction of neuroimaging research, I got a not-so-gentle push from the exigencies of the job market. There were many openings for people with imaging backgrounds, and essentially none for computational modeling or neuroethology.
The phenomenon I focused on in my first neuroimaging studies — the “sensitive period” for speech — is part of several larger debates about how plasticity changes over the lifespan in general, and about the biological specialness of speech itself. My plan to deal with these issues depended on identifying and adapting an appropriate set of tools. I would spend the first year or so of my post-doctoral training developing the argument that the best tools for this research involved observing brain responses to passive presentation of speech sounds. The argument went like this: behavioral research on speech categorization necessarily uses tasks that involve making some sort of decision; making decisions is complicated, and involves lots of mechanisms that are not part of how we usually process speech; so we should be looking for ways to study speech without asking people to make these decisions. What could be simpler, under the circumstances, than just asking people to listen passively to speech sounds, and measuring their brain responses?
The most popular way to study second-language speech perception behaviorally is with a “perceptual assimilation task.” This involves asking two different questions. First, we want to understand how the second-language sounds relate to the more familiar, well-learned categories of a listener’s native language, so we ask them to listen to the sounds and write down what they think they heard. For example, we want to know why English speakers are so bad at distinguishing Mandarin alveolo-palatal fricatives and affricates. This is the sound pair that, if you’re a Mandarin speaker, distinguishes “qi” — which, for those of you who are kung fu fans, traditional Chinese medicine enthusiasts, or Scrabble players, is familiar as a kind of supernatural life force — from “xi,” which, because of the rampant homophony of Chinese, could mean “west,” “to wash,” or “to smoke,” among other things.
To native English speakers, “qi” sounds a bit like “cheese” without the “s” sound at the end, and “xi” sounds a bit like “she.” But these relationships are only approximate, and if you asked a lot of English speakers to identify these sounds multiple times, you’d find that there isn’t much agreement about which category each sound belongs to. Rather, people might assign them more or less at random to two or more different categories. This is interesting to know because it predicts how well they can tell these sounds apart, and how easily the new categories can be learned. We can study how easily people distinguish these sounds from one another very straightforwardly, by having them listen to pairs of sounds and press one key if they think the two sounds are the same, and another if they think they are different. The general pattern of results from such studies is that people have a more difficult time distinguishing sounds that are identified as members of the same native-language category than sounds that are identified as members of different categories. So, for that Mandarin pair, if you thought both sounds were “chee,” you would have a hard time telling them apart.
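The logic tying identification to discrimination can be made concrete. Here is a minimal sketch (my own illustration, with made-up numbers, not anything taken from the studies described), under the simplifying assumption that listeners can only tell two sounds apart when they assign them different native-language category labels:

```python
# Hypothetical sketch: predicting same-different discrimination from
# identification data. Assumption (mine, for illustration): two sounds
# are discriminable on a trial only if they receive different labels.

def predicted_discriminability(p_a, p_b):
    """p_a and p_b map category labels to the probability that sounds A
    and B, respectively, are identified as members of that category.
    Returns the probability that a listener labels A and B differently
    on a given trial, a crude proxy for discriminability."""
    categories = set(p_a) | set(p_b)
    p_same_label = sum(p_a.get(c, 0.0) * p_b.get(c, 0.0) for c in categories)
    return 1.0 - p_same_label

# Made-up identification probabilities for an English listener who hears
# both Mandarin sounds as "chee" most of the time:
qi = {"chee": 0.8, "she": 0.2}
xi = {"chee": 0.7, "she": 0.3}
print(predicted_discriminability(qi, xi))  # low (~0.38): a hard pair

# A pair that consistently gets different labels should be easy:
print(predicted_discriminability({"bee": 1.0}, {"she": 1.0}))  # 1.0
```

On this simple account, the more consistently the two sounds are assimilated to the same native category, the worse the predicted discrimination, which is exactly the qualitative pattern the same-different studies report.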
Most people have had the experience of trying to repeat a foreign word to a native speaker:
“So, it’s chee?”
“Almost! It’s chee.”
“Oh. I think I get it: chee.”
“No, it was better the last time: chee.”
“But that’s exactly what I said! chee chee chee!”
“Those were three different words!”
Leaving aside difficulties of pronunciation, a big part of the problem here is that we don’t even seem to hear the difference between what our would-be teacher is saying and our futile attempts to duplicate it. Asking people to categorize these sounds and discriminate them from one another is a very direct and intuitive way to investigate this phenomenon in the lab. And indeed, there are a number of models, based on data from such studies, of how perceiving second-language speech sounds in terms of native-language categories impacts learning. At the same time, this approach has a couple of problems. The behavioral measures — especially the identification phase — depend on meta-linguistic judgments about speech sounds. Some theorists would go so far as to say that these judgments have more to do with how we reason about speech based on our experience with text than with how we actually use speech.
That’s a somewhat extreme position, but everyone agrees that the relationship between what’s observed in these laboratory studies and what’s going on in our brains when we listen to speech in a second language is a bit complicated. This is a general problem in measuring perception. Even if you’ve never been in a psychology experiment, you probably have some experience with this kind of issue from having your vision measured. If you’ve ever been tested for corrective lenses, you’ve had the experience of being asked to make psychometric decisions about relatively complex perceptual phenomena. Think about all of the things that have to happen between the adjustment of a lens to compensate for your cornea’s refractive error and your telling the doctor or technician what letter you’re seeing, and you get a taste of what the psychometrician is up against.
First, consider that the image on your retina has to be transmitted to a part of the brain that produces some response that can be used to determine what letter you’re looking at. It seems reasonable to expect that, given the wide variety of sizes and shapes letters can take on, there might be some specialized bit of cortex that responds similarly whenever we encounter an “A,” whether it is presented in upper or lower case, in Helvetica, English Gothic, or my barely decipherable handwriting. But in the case of a vision test like the Snellen acuity chart, these variables are held constant, so there is also useful information about what letter you’re viewing at a much lower level of description. Expectations based on what you know about the shapes of letters might be impacting the very earliest cortical responses.
There might even be individual differences in how people exploit these different sources of information to make decisions about what to report to the eye doctor. There is also room for personality to have an impact. If I’m not entirely sure whether something is a “P” or an “F,” for example, I will usually say so. I have no idea what the doctor does with this information. If I were, say, a salesperson, or an executive, rather than a scientist — i.e., someone who is used to projecting confidence under conditions of ambiguity — I might be more inclined to make a guess, and if, by chance, I happened to be correct in all of my guesses, I would end up with a weaker prescription than if I were incorrect in all of them.
Then there are all kinds of situational variables: Do the letters seem easier to read because you’ve been reading them over and over again for the last couple of minutes? Are they getting harder because your eyes are tired? Are you biased by your previous responses somehow? Are you trying to wrap things up because you’re already late for another appointment across town? None of these things are particularly relevant to what the optometrist is hoping to measure, but they all have an impact on the final choice of lenses. This process is good enough to pick a pair of glasses — if you can’t tell the difference between two lenses, they are probably both equally good for day-to-day use. But it leaves us without a satisfying answer to the narrower question the whole experimental apparatus is supposedly built around, i.e., what is the patient’s corneal refractive error?
The questions we ask when studying second-language speech perception are often cast in reductionist terms, as if the ability to distinguish two speech sounds from one another were a stable fact about a listener, like her corneal refractive error. But of course many of these issues — about how perceptual systems are deployed, about how decisions are made, and about how task demands evolve over the course of measurement — also arise in any behavioral test of speech perception. Because of these problems with measuring perception, the “seductive allure of brain imaging” for me was not so much that it would explain the “neural correlates” of anything. Rather, the idea was that the brain activity elicited under passive conditions would be a better measure of the limits of perception than behavioral tasks, because the tasks themselves were interfering unacceptably with what we wanted to observe. This is a bit like the observer effect in physics: you can’t make an observation without in some way altering the thing you are hoping to observe. Our idea was to use a general property of neural responses — repetition suppression — to get around this difficulty.
It seemed like a good idea at the time…
Image credit: Multilingual Snellen charts from India.