I just finished my first neuroscience postgraduate course at the Open University, on Neural Networks & Cognitive Neuropsychology. (Another reason why I have not been posting as often.) I passed, which might come as a shock to those who have so far dismissed my arguments on the basis that I only have a PhD in physics while they have studied X, Y, Z. In fact, I got 90% in my final essay and 75% in my assignments. I am posting my essay as evidence! I hope you enjoy it. Please feel free to comment.
ECA DS 871
Author: Tom Weidig
Word count: 4820 (including references)
The advent of connectionist modelling represented a considerable blow to proponents of the neuropsychological approach.
Imagine you have a wrinkly, dirty, fatty pulp in front of you. You can study the fine microscopic details, find the nerve and glia cells, and still have only increased your fascination for the brain. This pulp is a dead fellow human's brain, which analysed sensory input and listened to Chopin's nocturnes, generated mental processes and devised a plan to rob a bank, stored information and remembered a childhood memory of the little creek behind the house, and controlled a body and gave an opponent a bloody nose. But you have no clue how the brain did what it did. What do you do? You might also be a doctor and witness the diverse, strange behaviour of patients with strokes or blows to the head. He comprehends speech but cannot speak any more. She can speak but can no longer comprehend speech. He doesn't like his left arm any more and wants to get rid of it. Her memory span is restricted to five minutes; she tells you the same joke over and over again and laughs at your same joke every single time. There must be a pattern behind the functional breakdowns that gives clues to the organisation of the brain. You cannot look inside a living brain, but you can do a post-mortem anatomical examination, looking for abnormal brain structure and relating your findings to the observed deficiencies.
That's pretty much the atmosphere and the constraints within which cognitive neuropsychology emerged. Clinicians were faced with a plethora of deficiencies caused by strokes, brain injury or brain disease, and the science-minded ones wanted to know what caused A to lose only speech comprehension and B to lose only speech, in the hope that the study of these special brains would shed light on the normally functioning brain. The French physician Paul Broca's research is one of the earliest examples of how you can understand the normal brain better by looking at the damaged brain. He systematically studied aphasic patients (patients with speech production issues) post-mortem, and found lesions in the inferior frontal lobe of the left hemisphere - an area now known as Broca's area. His life's work had a great influence on the understanding of the brain, specifically on the lateralisation and localisation of speech functions, and he inspired others to follow and refine his approach. For example, the German physician Carl Wernicke looked at patients with an inability to understand speech. He was able to identify more posterior regions of the left hemisphere that were critical for language comprehension - a lack of understanding of language prevents attaching meaning to perceived speech sounds. Their prominent work in the 1860s and 1870s inspired many others to find similar relationships between physically localised lesions and functional deficiencies. The approach also reveals the interdependence of deficiencies and therefore gives information on whether the corresponding underlying processes share resources or information. A good example of double dissociation is given above: he comprehends speech but cannot speak any more; she can speak but can no longer comprehend speech. Patients with pure Broca's aphasia and with pure Wernicke's aphasia suggest that speech production and language comprehension are two independent processes that do not share resources or information. Scientists started to realise the importance of searching for case studies with double dissociation.
Intensive study of damaged brains gave rise to many interesting relationships and independences. Scientists started to order the new knowledge into flow (box and arrow) diagrams. Such frameworks are enormously helpful because they make explicit the qualitative relationships between different functions and processes. A constructive and focused debate then becomes possible, because a rejection of a framework can no longer be categorical and vague. Your counterpart needs to identify the disputed box or arrow and provide solid counterarguments based on experimental findings or logical arguments, or challenge your experimental evidence and identify flaws. A good example is the field of visual agnosia. Based on case studies, Lissauer (1890) proposed a simple framework for visual agnosia: high-level visual processing involves visual analysis first and then attribution of meaning to the perceived object. He labelled the corresponding deficits "apperceptive agnosia" (an inability to create a stable percept from intact incoming low-level visual information) and "associative agnosia" (an inability to associate meaning from semantic memory with the percept). Lissauer's scheme is a good first-order approximation and as such is still clinically useful today. A simple copying test can distinguish between the two types of agnosia: if you are able to copy an object in a drawing but unable to identify the meaning of the copied object, you have associative agnosia. However, later, more carefully designed and detailed studies identified cases with different types of apperceptive agnosia: some patients pass the copying test but still have apperceptive agnosia. For example, Kartsounis and Warrington (1991) found that Mrs FRG was able to pass visual acuity tests but was unable to discriminate between object and background. Warrington's (1985) model uses the variety of different types of agnosia to suggest the existence of sub-stages of visual processing and proposes refinements to Lissauer's model: visual analysis divides into shape coding, figure-ground segmentation, and perceptual classification. Another example of the box and arrow diagram approach is Ellis and Young's (1988) functional architecture for hearing and speaking words. Again, the framework was developed over many years, based on sharper and sharper studies that used older frameworks as guiding lights.
As with every experimental paradigm, you try to push as far and as deep as possible until you reach insurmountable limits. And limits were found. Very often the physical and functional damage is broad and fuzzy; strokes especially can affect many regions and range from total to minor physical damage. The notable exceptions are purposeful lesions in animals and surgical interventions to ease severe epileptic attacks - and possibly the case of NA, whose roommate's sword pierced through his nose into his brain, causing a pure form of amnesia! Moreover, patients rarely show a single clear deficiency but rather a range of deficiencies, and brain plasticity is a non-negligible factor, too. The rarity of clear case studies linking lesion and dysfunction raises statistical issues, because the findings might have arisen by chance; with hundreds of cases, random fluctuations would average out. Also, at least before the imaging age, anatomical studies could only be done after the patient's death, when the brain would have aged and might have adapted somewhat. Finally, experiments are hard to control and all kinds of biases can creep in, even for the most experienced and clever scientist. A good example is the debate on whether the brain handles living and non-living things differently. After the initial excitement, other explanations emerged. First, pictures of living things are generally more complex: compare, for example, the picture of a fly to the picture of a cup. Any visual system should have more difficulty processing a complex picture than a simple one, especially if damaged. Second, living things look similar to each other whereas non-living things do not. For example, there are many animals with four legs, a head, and a tail, but tools are all different, because each tool has a different function and we would therefore expect them to look different. A damaged visual system should have more difficulty distinguishing between similar objects. Third, familiarity could play a role. Humans see and handle tools on a daily basis, and the visual system processes pictures of familiar objects much more often. It is easy to see how a visual system tuned to familiar objects retains some abilities after damage. No doubt all of these factors can influence performance on living versus non-living things, and the brain might well not think in terms of living and non-living at all. Therefore, Stewart et al. (1992) and Funnell and de Mornay-Davies (1996) asked for more careful designs to control for these factors.
Connectionist modelling takes a completely different approach. Remember: at the beginning, we found the fine microscopic details of the brain, the nerve and glia cells, but this new knowledge had only increased our fascination with the brain. Cognitive neuropsychology ignores the foundations on which brain processes must ultimately rest - the neuron: it is a top-down approach. Connectionist modelling, on the other hand, is a bottom-up approach. With the advance in computing power, connectionist modelling was able to take the neuron seriously. Let us take a physics analogy. A gas is macroscopically well described by the ideal gas law relating pressure, temperature, and volume, but microscopically a gas consists of billions of atoms colliding like fast-moving gluey tennis balls. Computer simulations are now able to derive the macroscopic properties of a gas from the microscopic properties of the individual atoms. In neuroscience, the microscopic world of the individual neuron is reasonably well described, and so is the macroscopic behaviour of the brain, like reading or recalling memories. However, the holy grail of understanding remains out of reach: to describe the brain's macroscopic behaviour in terms of neurons. Connectionist modelling takes up this challenge: to directly simulate the interactions of neurons, however difficult that may be. No one would doubt that cognitive neuropsychologists, when drawing their boxes and arrows, knew of the ultimate incompleteness of their approach. They, especially the clinicians, would rightfully argue that theirs is a functional approach, trying to understand the broad flows of information from process to process - a first-order approximation sufficient to understand and treat their case load. Surely, the neural networks would only confirm their boxes and arrows as an emergent structure of neurons, for they had used a scientific approach to gain knowledge. However, I believe that neuropsychologists committed a forgivable logical fallacy: if a theory fits all known observations well, it must be correct. I am not referring to the Popperian ideal of a falsifiable theory, but to the possibility that several theories (of a very different nature) could fit all known observations - and how do you then decide which one is actually implemented by nature? Imagine you dance at a night club and no one dances with you. Your theory that the girls are just too shy fits your observation, but of course there is an alternative theory, namely that you are a bad dancer or that no one likes you! Connectionist modelling gave birth to new insights, new and different ways of getting the same output but with a brainy feel. Neural networks strikingly show brain-like properties, to name a few: resistance to damage, tolerance to noisy inputs, retrieval by content, distributed memory, parallel processing, typicality effects in memory retrieval (i.e. the ability to hold a prototype of a set of similar objects), and the ability to deal with new inputs based on experience. An amazing feat, considering that many features of neuronal activity are left out: the glia cells, the neurotransmitter levels at the synapses, the thousands of connections, firing for no reason, non-linear effects. And now you can start asking interesting questions. Which model is correct: the box and arrow one or the neural network one? Are box and arrow diagrams not too deterministic and computer-like? If indeed the brain is like the neural network, how can we ever model memory with box and arrow diagrams?
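To make this bottom-up flavour concrete, here is a minimal sketch (my own illustration in Python/NumPy, not part of the coursework models) of the basic ingredient of a connectionist model: each artificial neuron computes a weighted sum of its inputs and squashes it through an activation function, and the "knowledge" of the network lives entirely in the weights. Even this toy network, with arbitrary random weights, hints at two of the brain-like properties listed above: tolerance to noisy input and graceful degradation after damage.

```python
import numpy as np

def sigmoid(x):
    """Activation (squashing) function: maps any input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# A toy two-layer network with fixed random weights (purely illustrative).
W_hidden = rng.normal(size=(4, 3))   # 3 input units -> 4 hidden units
W_output = rng.normal(size=(1, 4))   # 4 hidden units -> 1 output unit

def forward(x):
    """Propagate an input pattern through the network."""
    hidden = sigmoid(W_hidden @ x)     # hidden layer activations
    return sigmoid(W_output @ hidden)  # output unit activation

x_clean = np.array([1.0, 0.0, 1.0])                # a clean input pattern
x_noisy = x_clean + rng.normal(scale=0.1, size=3)  # the same pattern plus noise
print(forward(x_clean), forward(x_noisy))          # outputs stay close: noise tolerance

W_hidden[0, :] = 0.0                               # "lesion" one hidden unit's incoming weights
print(forward(x_clean))                            # output shifts gradually, not catastrophically
```

Because the representation is spread across all the weights, zeroing a single connection nudges the output rather than destroying it - the kind of resistance to damage that a box and arrow diagram has nothing to say about.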
Moreover, neural networks have brought a quantitative era to neuroscience. It is no longer good enough to run a qualitative approach with flow diagrams; you need to quantify and explain exactly what is going on. Let's take Lissauer's model of visual processing. He assumes perceptual analysis first and then attachment of meaning to the perceived object. It does not really help you to know this if you want to build a visual system yourself. It is like telling an overweight person to eat a bit less and do some sport. Correct, but how do you implement this strategy? Only by going through the steps, writing down an algorithm, and constructing a neural network can you find out whether what looks good on paper is also implementable. Connectionist modelling is a useful tool to check whether realistic models can, at least in principle, be built using the current theoretical understanding of the system.
Here is an example of how connectionist modelling became a rival in love. Two very different models, one neuropsychological and one connectionist, generate the same output: the past tense in the English language. Many verbs have regular past tense endings (show -> showed), but there are also irregular ones (go -> went). How does the brain conjugate verbs? Maybe the brain of a child has learned the rule: if you want the past tense of a verb, just add -ed to the end. And how do you deal with went, came, fought? Maybe the child has learned the exceptions to the rule by heart and added them to a list: go -> went, come -> came, fight -> fought. For every verb, the brain might then look in the exception list first, use the exception if present, and apply the rule if absent. But that's a box and arrow model. How does the brain actually do it? Pinker (1994) argues in favour of such a dual-route model based on a double dissociation argument: in some disorders the regular form is impaired and in others the irregular one. Pinker and Prince (1988) suggest that there is a rule-based route (adding -ed to the ending) and a route with a list of exceptions that blocks the creation of a regular form for an irregular verb. (I never understood why the list has to block the rule. Is it not simpler to first look up the exception and, if it is not found, use the rule?) In any case, the over-regularisation argument is key here. Children often make errors like goed instead of went, even though they have used the correct past tense before. The argument goes that at first children learn the past tense by heart, then learn a rule on how to form the past tense, and this is where they make over-regularisation mistakes until the exception list strengthens. Connectionist modelling challenges this neat and clear-cut computational approach to brain processes. For example, Plunkett and Marchman (1996) created a neural network that is able to conjugate verbs correctly and, most importantly, without any rules or list of exceptions. It is just a neural network with many weights trained on regular and irregular verbs. Suddenly it is at least feasible that a single-route model with a messy structure, very unlike a computer, handles both functions, which means that both functions share the same resources.
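For concreteness, here is the dual-route idea written out as the "look up first, rule otherwise" procedure described above (a toy sketch of my own in Python, with a made-up three-verb exception list; Pinker and Prince's actual proposal is of course far richer):

```python
# Toy dual-route model of the English past tense (illustrative only).
# Route 1: a rote-learned list of exceptions. Route 2: the regular "-ed" rule.

EXCEPTIONS = {"go": "went", "come": "came", "fight": "fought"}

def past_tense(verb: str) -> str:
    """Look the verb up in the exception list first; otherwise apply the rule."""
    if verb in EXCEPTIONS:      # irregular route
        return EXCEPTIONS[verb]
    return verb + "ed"          # regular route

print(past_tense("show"))   # showed
print(past_tense("go"))     # went
# A child whose exception list is still weak would produce "goed" - over-regularisation.
```

The whole point of Plunkett and Marchman's network is that nothing like this list or this rule appears anywhere in it; both behaviours fall out of one set of weights.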
But why was Pinker able to argue for a dual-route neuropsychological model supported by a double dissociation argument if the real thing could be a single-route neural network instead? Juola and Plunkett (1998) showed that it is possible to construct a neural network that gives the appearance of a double dissociation. Double dissociations can arise from nothing more than extreme cases of the natural variation of a lesioned network, possibly amplified by publication bias (only interesting and rare cases are published). Again, a network was trained to conjugate verbs. The information about regular and irregular verbs is distributed across the weights of the connections between neurons. However, by chance some weights are more responsible for regular verbs and some weights are more responsible for the irregular ones. If you selectively damage or remove a connection, the deterioration in performance will not be evenly distributed between regular and irregular verbs. I saw this phenomenon in my TMA02 work: depending on which weight I set to zero, a different horse's name was affected. Now assume that you have 100 people with a stroke. In the majority of cases, weights contributing to both types of past tense will be affected. But in a few cases only the "purer" weights are affected: either the weights most responsible for regular verbs are hit by chance, or the weights mostly dedicated to irregular verbs. So you can get a few patients with an inability to do regular verbs and a few patients with an inability to do irregular verbs - a typical double dissociation, but based on a single route. This opened a Pandora's box regarding neuropsychology's use of double dissociation and modularity concepts. Is the ability to perform a certain function distributed across the whole brain or an interaction of different brain regions? Or is there only one specific brain area responsible? Is a double dissociation evidence for anatomically distinct modules, or only evidence that the processes underlying the two functions do not share resources? However, we should not forget that learning the past tense in English is a cultural thing and cannot be an innate module. In Norwegian, you have two regular verb forms plus irregular verbs. So are there three routes? Today we think of the brain as composed of interacting modules shaped by evolutionary pressures. I would argue that the failure of the double dissociation argument is restricted to learned functions, rather than applying to wide categories of related functions or to innate functions. The interesting question is what is innate: the past tense is not! But is the language ability as such an innate module, or an interaction of more general-purpose modules arising from interaction with the environment, especially culture?
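The logic of that argument is easy to reproduce in a toy simulation (my own sketch, not Juola and Plunkett's actual model: the "regular" and "irregular" verbs are just random patterns, the network is a simple linear associator, and all sizes and lesion rates are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "past-tense" network: a single linear mapping from input patterns to
# output patterns. 20 items stand in for regular verbs, 20 for irregular ones.
n_in, n_out, n_reg, n_irr = 40, 30, 20, 20
X = rng.choice([0.0, 1.0], size=(n_in, n_reg + n_irr))   # input codes, one column per item
Y = rng.choice([0.0, 1.0], size=(n_out, n_reg + n_irr))  # target codes

# "Training": the least-squares solution, so knowledge is spread across all weights.
W = Y @ np.linalg.pinv(X)

def accuracy(W, cols):
    """Fraction of the given items whose output is closest to its own target."""
    out = W @ X[:, cols]                                             # (n_out, n_items)
    dists = np.linalg.norm(out[:, :, None] - Y[:, None, :], axis=0)  # item x candidate target
    return float(np.mean(np.argmin(dists, axis=1) == np.array(cols)))

regular = list(range(n_reg))
irregular = list(range(n_reg, n_reg + n_irr))

# Simulate 100 "patients", each losing a random 30% of the connections.
results = []
for _ in range(100):
    lesioned = np.where(rng.random(W.shape) < 0.3, 0.0, W)
    results.append((accuracy(lesioned, regular), accuracy(lesioned, irregular)))
results = np.array(results)

gap = results[:, 0] - results[:, 1]   # regular minus irregular accuracy
print("most 'regular-spared' patient (reg, irr):", results[np.argmax(gap)])
print("most 'irregular-spared' patient (reg, irr):", results[np.argmin(gap)])
```

Most simulated patients show mixed impairment, but the extreme tails of the distribution look exactly like the selective deficits a neuropsychologist would publish as a double dissociation - even though there is only one undifferentiated set of weights.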
Connectionist modelling also shows interesting alternative learning paths similar to real development, and thereby highlights the inability of neuropsychological models to adequately model the development of functions in children in any detail. Neural networks are not just possible models of how a skill is performed but also of how the skill is learned. Above, we came across the past tense. Plunkett and Marchman's (1996) model is not just interesting as an end product; the training of the model can be viewed as a model of how children learn. Children learn in a U-shaped curve: first they perform well because they have learned the past tense by heart, then they make mistakes while at the same time learning the rule for regular verbs, and finally they master the past tense. Plunkett and Marchman (1996) trained the net by feeding it regular and irregular verbs in the same way as children encounter verbs. They compared the performance to real data from a children's study by Marcus et al. (1992) and found similar results: early learning is error-free, over-regularisation errors happen but are rare for common irregular verbs, and irregularisation errors (conjugating a regular verb irregularly) are rare. To conclude, a blank-slate neural network is all you need to learn the past tense; no innate module is required - at least not for this task. Again, we should probably make a distinction between evolutionarily recent skills and ancient skills needed for spreading the genes (like survival).
However, neural networks have an Achilles' heel: they do not include a realistic model of learning. In a sense, neural networks do exactly what drove them away from computationalism in the first place, namely mathematically describing a phenomenon rather than showing how it emerges. The weights of the connections between neurons are artificially determined by mathematical rules like gradient descent, with the help of artificial concepts like error backpropagation. How can connectionist modelling claim to build up the brain from scratch if the learning process is not well modelled? In fact, in my opinion, the lack of realistic learning leaves neural networks open to a serious loophole: data mining. Neural networks have a huge number of weights that need to be trained, and there are billions upon billions of possible constellations. In a sense it is not surprising that they can model any system with a bit of fine-tuning, trial and error, and little constraint on learning, because they have so many degrees of freedom. Let's take an analogy. This essay has about 5000 words, and there is an unimaginably large variety of possible texts of that length, including a scene from Hamlet and the author's eulogy! I could easily "train" my document by adjusting the words of the essay to match those of a passage in Hamlet. Surely we would not say that this is in any way a success of my document itself. So would we not expect neural networks to fit any system? Yes, neural networks are trained on a training set and then tested on a different set, but often the training set implicitly contains the rules. I believe the artificial learning might well leave the system too unconstrained. On the other hand, you can argue that this is precisely the secret of the brain - its enormous flexibility. And maybe most of its activity is indeed not real understanding but just re-coding and fitting output to expected output.
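For readers who have never seen what "determining the weights by gradient descent" actually looks like, here is a minimal sketch for a single artificial neuron (my own illustration; the task and learning rate are arbitrary, and full backpropagation just pushes the same error-derived correction back through several layers). Nothing in this update rule pretends to resemble synaptic biology, which is exactly the point of the criticism above.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Arbitrary toy task: output 1 if at least two of three binary inputs are on.
X = rng.choice([0.0, 1.0], size=(100, 3))          # 100 training patterns
X = np.hstack([X, np.ones((100, 1))])              # constant extra input acts as a bias
targets = (X[:, :3].sum(axis=1) > 1.5).astype(float)

w = rng.normal(scale=0.1, size=4)                  # weights to be "trained"
lr = 0.5                                           # learning rate, chosen by hand

for epoch in range(5000):
    y = sigmoid(X @ w)                             # forward pass: the neuron's outputs
    error = y - targets                            # how wrong each output is
    grad = X.T @ (error * y * (1.0 - y)) / len(X)  # gradient of the squared error w.r.t. the weights
    w -= lr * grad                                 # gradient descent step

preds = (sigmoid(X @ w) > 0.5).astype(float)
print("training accuracy:", (preds == targets).mean())   # should end up at or near 1.0
```

There is no evidence of anything like this global, mathematically derived error signal being propagated backwards through real cortex, which is why the essay treats backpropagation as a mathematical convenience rather than a model of learning.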
Not all is lost for cognitive neuropsychology. Sure, connectionist modelling has become a rival in love, challenges many assumptions, and provides food for thought, and the neuropsychological tools are not as sharp as previously thought. However, the advent of new, revolutionary tools to explore the brain has rejuvenated the field and will keep everyone busy for many years to come. First of all, imaging techniques have become very powerful, allowing real-time insight into the structure and functioning of the living brain. Instead of having experimental results now and a post-mortem study years later, you have experimental results and imaging data at the same time. You now have three points of reference: the task performance, the functional imaging data, and the structural data. And, very importantly, the resulting quantitative data allow quantitative comparison with control brains and proper group-difference statistical tests. For example, the advent of CAT and MRI scans has made it possible to see the structure in vivo and to detect blood clots, squeezed brain regions, and dead tissue at high resolution. The DTI technique is also interesting, because it allows you to track fibre structures in the brain and see whether there is disruption between different regions. Functional imaging tools add another twist to the study of damaged brains. You can monitor the indirect BOLD (blood oxygenation level dependent) signal or, more directly, the electrical activity of neurons. fMRI works on the BOLD signal, while PET tracks blood flow and metabolism using radioactive tracers. The idea is that functionally active regions with plenty of firing neurons consume more oxygenated blood to recharge their batteries, so to speak, for the next action potentials. Functional imaging data is revolutionary in the sense that you can take a time series of scans over several seconds and see the brain at work. Or you can scan damaged brains over a longer period of time and monitor recovery. Of course, this can also be done for the normal brain, but a damaged brain gives extra information. However, the blood-flow-based functional methods are reaching their limits, because blood flow is by nature fuzzy, not 100% correlated with neuronal activity, and does not have a high time resolution - and of course because the real stuff is the electrical activity of the billions of neurons. Two techniques stand out here. EEG is more of a surface electrical activity measurement tool, whereas MEG aims to give a 3D view of electrical activity in the brain using magnetic fields. MEG has technical but probably solvable issues: from a 2D surface map you need to deduce a 3D activation pattern - an inverse problem. The imaging field is maturing in terms of resolution and technology (with the notable exception of MEG, which requires more work), and progress will mostly come from improved signal analysis and interpretation, using several techniques at once, lighter scanners, and improvements in experimental design. Very importantly, the critique of using case studies is also dramatically weakened, because we now have so much more information on a single damaged brain. Surely, a competitor in the flat-screen industry only needs to look at one of your screens to understand your new, innovative technology, because he can take it apart.
Another area of improvement is purposeful temporary lesioning in humans without ethical concerns. The technique, TMS (transcranial magnetic stimulation), sends out magnetic pulses that temporarily disturb the electrical activity in the targeted brain region, knocking out that region for a brief period of time. Experimental subjects can then be tested on tasks and their performance compared to a control group. For example, you ask a subject to talk and then aim the TMS at Broca's area: the subject will suddenly not be able to talk any more, whereas aiming TMS at other areas does not have this effect. However, TMS is a rather clumsy method and it is difficult to control the target area. Therefore, animal models are still very important for our understanding of the brain. Scientists are able to experimentally control the degree and area of damage by knocking out genes, making well-controlled surgical lesions, or giving targeted medication, all inside a multi-technique scanner. Open brain surgery inspired by Penfield's work is also still done and perfected. We should not overlook the importance of more indirect tools like genetics either. Our understanding of the human genome and of some genes' impact on brain function has become a powerful tool. Many genes play a crucial role in the building of the brain via protein coding. Research into many brain-based disorders has revealed a genetic component, and current research is revealing the impact of specific genes on proper brain functioning. Clearly, gene research is not relevant to patients with strokes and brain injuries, but it is relevant for brains with a permanent dysfunction. A good example is the FOXP2 gene: a specific language disorder, developmental verbal dyspraxia, has been linked to mutations in this gene. So the study of damaged brains can now be seen from yet another angle and tackle developmental brain disorders. The experimental tasks given to these patients are then linked to the genetic and imaging data. In fact, the impact is even wider, because evolutionary biological considerations come into play. If the gene is needed for proper language, when did it evolve? Do apes have it, too? Is it just helping out other genes? FOXP2 is an impressive example of how the study of damaged brains can open up new avenues into understanding our brain. Finally, neuropharmacology is also adding new tools to explore the damaged brain or to induce temporary damage to a region or pathway. How does cocaine impair the brain? Do dopamine-receptor blocking agents modify the functioning of a damaged brain?
To conclude, did the advent of connectionist modelling represent a considerable blow to proponents of the neuropsychological approach? Let me first be pedantic and religiously point out that science is not about people but about statements, arguments supporting a statement, and counterarguments! So did connectionist modelling represent a considerable blow to the neuropsychological approach? No, it is more of a wake-up call. There are very different ways to model a brain. You need to explain every step quantitatively and in detail. You have to be more careful with double dissociation arguments. However, the core idea of neuropsychology is still alive: you can find out a lot by studying what went wrong. Neither approach is better than the other; like cities, neither Paris nor London is better, they are just different ways of living. In fact, I argued above that the revolution in imaging and genetics considerably helps neuropsychology to re-invent itself. You can zoom in much further on case studies. But maybe, due to the genetics revolution, there should be more emphasis on developmentally damaged brains, as in dyslexia, stuttering, and Tourette syndrome, as opposed to brains messily damaged by stroke or injury. These are exciting times, with so much new information from many different tools, each exploring a completely different aspect of the brain. But of course there is only one brain, and at some point they all need to come together in meta-analysis - a challenging task, and maybe even an impossible one. I like to compare the brain to a big city with many interacting objects: factories, highways of information, objects and people, executive branches shaping the city, construction plans as genes, neurotransmitters as the weather, and so on. There are many different ways to explore a city: by foot, by listening to people's stories, by buying Time Out, by satellite pictures, by statistics. Even though I have information on all of it, I still find it difficult to say exactly what, for example, London is. Where does it start and where does it end? If I take Westminster away, is it still London? Whatever happens, at least the clinical side of neuropsychology will never die. Patients with damaged brains will always exist, they have specific deficiencies, and a good clinician must ask what causes these deficiencies in order to devise a plan for the best possible recovery. But clinicians don't care about neurons or networks; they want simple box and arrow models that are roughly right. Still, I want to understand, from the neuron onwards, why someone doesn't like his left arm any more and wants to get rid of it.
Kartsounis, L.D. and Warrington, E.K. (1991). Failure of object recognition due to a breakdown of figure-ground discrimination in a patient with normal acuity. Neuropsychologia, 29, 969-80.
Warrington, E.K. (1985). Agnosia: the impairment of object recognition. In P.J. Vinken, G.W. Bruyn, and H.L. Klawans (eds), Handbook of Clinical Neurology (vol. 45, pp. 333-49). Amsterdam: Elsevier Science Publishers.
Ellis, A.W. and Young, A. W. (1988). Human Cognitive Neuropsychology. London: Lawrence Erlbaum Associates.
Lissauer, H. (1890). Ein Fall von Seelenblindheit nebst einem Beitrage zur Theorie derselben. Archiv für Psychiatrie und Nervenkrankheiten, 21, 222-70.
Stewart, F., Parkin, A. J., and Hunkin, N. M. (1992). Naming impairments following recovery from Herpes Simplex Encephalitis: Category-specific? The Quarterly Journal of Experimental Psychology, 44A, 261-84.
Funnell, E. and de Mornay-Davies, P.D. (1996). JBR: A re-assessment of concept familiarity and a category-specific disorder for living things. Neurocase, 2, 135-53.
Pinker, S. (1994). The Language Instinct: How the Mind Creates Language. New York: William Morrow.
Pinker, S. and Prince, A. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73-193.
Plunkett, K. and Marchman, V. A. (1996). Learning from a connectionist model of the acquisition of the English past tense. Cognition, 61, 299-308.
Farah, M.J. and McClelland, J.L. (1991), in Cohen, G., Johnston, R. and Plunkett, K. (eds) (2000). Exploring Cognition: Damaged Brains and Neural Nets. Hove: Psychology Press (Taylor & Francis).
Devlin, in Cohen, G., Johnston, R. and Plunkett, K. (eds) (2000). Exploring Cognition: Damaged Brains and Neural Nets. Hove: Psychology Press (Taylor & Francis).
4 comments:
Congratulations!
Congratulations.
As it happens, I'm doing a masters in physics at the moment (just for fun), and have just today received my assessment for my 2 essays (9000 words each). The official overall results will be released in 2 weeks, but it looks like I scored over 90% for both subjects. I'm due to finish the course in the middle of next year, and I'd like to do a neuroscience course after that. I'd be very interested in the course you're doing, because I have been unable to find a good neuroscience course anywhere.
Congratulations Dr. Weidig! Well done!
Congrats! Glad you've passed :) (although I didn't doubt you would gain anything but a fantastic score!)