This module introduces students to the foundations of Computational Linguistics, Natural Language Processing and Corpus Linguistics. Natural language is considered from the point of view of how it can be understood, produced and, more generally, processed by computers, and its rich diversity and complexity are shown. Corpus Linguistics applications are presented as a practical complement to Computational Linguistics.
Students completing this module will be able to formulate and carry out an experiment using an appropriate methodology (Natural Language Processing, Corpus Linguistics). In addition to becoming acquainted with formal and computational approaches to natural language, students will learn how computers can be used successfully in document processing and in work with corpora. Since a user-orientated approach is adopted, students will be expected to familiarise themselves with new computer-based language technologies and tools (programs).
- K. Aijmer and B. Altenberg. 1996. English Corpus Linguistics. Longman.
- D. Arnold, L. Balkan, R. Lee Humphreys, S. Meijer and L. Sadler. 1994. Machine Translation: An Introductory Guide. Blackwell Publishers.
- J. Allen. 1995. Natural Language Understanding. Benjamin/Cummings Publishing Company.
- R. Grishman. 1986. Computational Linguistics. Cambridge University Press.
- R. Mitkov. 2002. Anaphora Resolution. Longman.
- R. Mitkov (Ed.). 2003. The Oxford Handbook of Computational Linguistics. Oxford University Press.
- MARS: Mitkov's Anaphora Resolution System: http://clg.wlv.ac.uk/MARS/index.php
- Animacy recogniser: http://clg.wlv.ac.uk/demos/gender/
- Tools for investigating genre differences using frequency lists: http://www.clg.wlv.ac.uk/projects/style/corpus/index.php
- Automatic summarisation systems: http://clg.wlv.ac.uk/projects/CAST/demo/
- Online English parser based on link grammar: http://www.link.cs.cmu.edu/link/submit-sentence-4.html
- Other corpus resources: http://clg.wlv.ac.uk/~dinel/corpora.html
- An online multilingual summarisation system developed by Pertinence
- Summ-It, a summariser developed at the University of Surrey
- SweSum, a text summariser for English, Danish and Swedish