Research seminars

Wednesday 19th October 2016

Using a core word to identify different realisations of semantically related formulaic sequences as a marker of authorship.

Samuel Larner (Manchester Metropolitan University, UK) 

In some crimes, the issue of authorship (who wrote a text) may constitute a crucial piece of evidence, such as with a disputed will, a forged suicide note, or an anonymous terrorist threat. Forensic linguists analyse the language in these texts to see if they can identify the likely author. The problem is that authors can attempt to disguise their style (e.g. Shuy, 2001). Stronger markers of style are likely to be those which move beyond relatively surface-level features such as non-standard spellings, and instead focus on features which authors may less easily disguise. In previous research, I have argued that formulaic sequences (prefabricated sequences of words believed to be stored as holistic units) should make an excellent marker of style because authors are unlikely to be aware of the individual words contained within them (Larner, 2014). However, there is no clear-cut way to robustly identify all, and only, formulaic sequences in text. This research argues that if one particular word which occurs frequently in formulaic sequences (a core word) can be isolated, then a reasonable sub-set of word sequences will be identified, the majority of which can be expected to be formulaic. Using the core word 'way', which occurs in many formulaic sequences (e.g. 'in a way', 'by the way', 'by way of'), the first aim of this research is to establish whether individual authors use different way-phrases from each other. Secondly, the method attempts to establish whether, on the occasions that the authors express a particular meaning linked to a way-phrase, they use way-phrases or alternatives that do not include this core word. The results indicate that for one author, the phrase 'in a way' appeared to be used distinctively. Therefore, there is potential for formulaic sequences to be used as a marker of authorship, albeit for only one author out of twenty, which diminishes the usefulness of such a marker in a forensic context.
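As a rough illustration of the core-word idea described above, the following Python sketch collects candidate way-phrases from each author's texts and counts them. It is not the procedure used in the talk: the function names, the simple tokeniser, and the fixed two-word left context are all assumptions made for this example, and the output would still need manual checking to decide which candidates are genuinely formulaic.

```python
import re
from collections import Counter

CORE_WORD = "way"  # the core word discussed in the abstract


def core_word_phrases(text, core=CORE_WORD, left=2, right=0):
    """Return candidate phrases: each occurrence of the core word
    together with `left` preceding and `right` following tokens.
    (Hypothetical helper; the window sizes are illustrative.)"""
    tokens = re.findall(r"[a-z']+", text.lower())
    phrases = []
    for i, token in enumerate(tokens):
        if token == core:
            start = max(0, i - left)
            phrases.append(" ".join(tokens[start:i + right + 1]))
    return phrases


def phrase_profile(texts_by_author):
    """Count candidate core-word phrases for each author.
    `texts_by_author` maps an author label to a list of texts."""
    return {
        author: Counter(
            phrase for text in texts for phrase in core_word_phrases(text)
        )
        for author, texts in texts_by_author.items()
    }
```

Comparing the resulting counts across authors gives a first indication of whether any one author favours a particular way-phrase, which is the kind of distinctiveness the abstract reports for 'in a way'.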


Shuy, R. 2001. 'DARE's role in linguistic profiling', DARE Newsletter, 4 (3, Summer), pp. 1–5.

Larner, S. 2014. 'A preliminary investigation into the use of fixed formulaic sequences as a marker of authorship', The International Journal of Speech, Language and the Law, 21 (1), pp. 1–22.

Hazel Price