The psychological validity of collocation and related association measures



In this presentation, I discuss the results of four experiments which combine methods from corpus linguistics and cognitive neuroscience in order to investigate the psychological validity of collocation and of different measures of collocation strength. For each experiment, I extracted collocational adjective-noun bigrams from the BNC1994. I then constructed matched non-collocational bigrams which are absent from the BNC1994, and examined concordance lines to find a suitable sentence context for each bigram pair. Participants read these sentences on a computer screen one word at a time while their brain activity was recorded using scalp electrodes. This method of detecting the electrical activity of the brain by placing electrodes across the scalp is known as electroencephalography (EEG). More specifically, I used the Event-Related Potential (ERP) technique of analysing brainwave data, in which brain activity is measured in response to particular stimuli.
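As a rough illustration of the ERP idea, the sketch below cuts a fixed window around each stimulus onset, subtracts a pre-stimulus baseline, and averages across trials. It is a minimal single-channel sketch, not the pipeline used in these experiments: the function names, the window sizes `pre` and `post` (in samples), and the toy signal are all hypothetical, and real analyses also involve filtering and artefact rejection.

```python
from statistics import mean


def extract_epochs(signal, onsets, pre, post):
    """Cut a window of `pre` samples before and `post` samples from each onset."""
    epochs = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal):
            continue  # skip windows that run off the recording
        epochs.append(signal[start:stop])
    return epochs


def baseline_correct(epoch, pre):
    """Subtract the mean of the pre-stimulus samples (assumes pre >= 1)."""
    baseline = mean(epoch[:pre])
    return [value - baseline for value in epoch]


def average_erp(signal, onsets, pre, post):
    """Average baseline-corrected epochs sample by sample across trials."""
    epochs = [baseline_correct(e, pre)
              for e in extract_epochs(signal, onsets, pre, post)]
    if not epochs:
        raise ValueError("no complete epochs in the recording")
    return [mean(samples) for samples in zip(*epochs)]


# Toy single-channel recording (hypothetical values): a deflection of 5.0
# appears one sample after each of two stimulus onsets.
toy_signal = [0.0] * 20
toy_onsets = [5, 12]
for onset in toy_onsets:
    toy_signal[onset + 1] = 5.0

toy_erp = average_erp(toy_signal, toy_onsets, pre=2, post=3)
```

Averaging across trials is what lets the stimulus-locked response stand out from background activity that is not time-locked to the stimulus.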

The aim of Experiment 1 was to pilot this procedure for determining whether there is a neurophysiological difference in the way that the native speaker brain processes collocational adjective-noun bigrams (e.g. clinical trials) compared to matched non-collocational adjective-noun bigrams (e.g. clinical devices). The aim of Experiment 2 was to replicate the results of Experiment 1 in another group of native English speakers, and the aim of Experiment 3 was to investigate the same phenomenon in non-native speakers of English (specifically, native speakers of Mandarin Chinese). Finally, in Experiment 4, I treated collocationality as a continuous rather than a dichotomous variable in order to investigate the gradience of the ERP response. I also aimed to investigate the psychological validity of the following association measures: transition probability, mutual information, log-likelihood, z-score, t-score, Dice coefficient, MI3, and raw frequency. The results of this research have important implications for the field of corpus linguistics.
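For readers unfamiliar with the association measures named above, the following sketch computes them for a single bigram from four counts. It uses common textbook formulations and is not necessarily the exact variant used in Experiment 4; in particular, transition probability is taken here as the forward probability of the noun given the adjective, which is an assumption. The function name and the counts in the example are hypothetical and are not taken from the BNC1994.

```python
import math


def association_measures(o11, f1, f2, n):
    """Association scores for a bigram (w1, w2).

    o11: frequency of the bigram, f1: frequency of w1,
    f2: frequency of w2, n: total number of bigrams in the corpus.
    """
    # Remaining observed cells of the 2x2 contingency table.
    o12 = f1 - o11           # w1 followed by something other than w2
    o21 = f2 - o11           # w2 preceded by something other than w1
    o22 = n - f1 - f2 + o11  # neither w1 nor w2
    if o11 <= 0 or min(o12, o21, o22) < 0:
        raise ValueError("counts are not consistent with a contingency table")

    # Expected cell frequencies under independence.
    e11 = f1 * f2 / n
    e12 = f1 * (n - f2) / n
    e21 = (n - f1) * f2 / n
    e22 = (n - f1) * (n - f2) / n

    def ll_term(observed, expected):
        # An empty cell contributes nothing to the log-likelihood sum.
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    return {
        "raw_frequency": o11,
        "transition_probability": o11 / f1,  # forward: P(w2 | w1)
        "mutual_information": math.log2(o11 / e11),
        "mi3": math.log2(o11 ** 3 / e11),
        "t_score": (o11 - e11) / math.sqrt(o11),
        "z_score": (o11 - e11) / math.sqrt(e11),
        "dice": 2 * o11 / (f1 + f2),
        "log_likelihood": 2 * (ll_term(o11, e11) + ll_term(o12, e12)
                               + ll_term(o21, e21) + ll_term(o22, e22)),
    }


# Hypothetical counts for illustration only.
scores = association_measures(o11=10, f1=20, f2=50, n=1000)
```

Because the measures weight raw co-occurrence frequency and the frequencies of the individual words differently, they can rank the same set of bigrams in different orders, which is why their psychological validity can be compared.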