LN4004: Machine Translation and Other NLP applications
Module leader: Dr. Georgiana Marsic
Semester: 2
The module explores the main theoretical and practical aspects of a range of Natural Language Processing (NLP) applications, including Machine Translation (MT), i.e. the use of computers in translation, as well as other language technologies such as Question Answering, Information Extraction, Text Summarisation, Text Retrieval, Text Simplification, Web/Opinion Mining, Dialogue Systems and Speech Recognition.
Students familiarise themselves with popular language processing software, look at freely available solutions, and learn the rationale behind such software and its main applications.
On successful completion of the module, students will be able to demonstrate:
- Thorough understanding of MT and other key NLP applications
- Knowledge of how MT and the different NLP applications work and the challenges associated with them
- Knowledge about the areas where NLP could have a beneficial impact
- The ability to evaluate, compare and critically analyse NLP systems
Recommended reading for this module
- D. Arnold, L. Balkan, R. Lee Humphreys, S. Meijer and L. Sadler. 1994. Machine Translation: An Introductory Guide. Blackwell Publishers. (Online version in PS and PDF format available at http://www.essex.ac.uk/linguistics/clmt/MTbook/)
- W. John Hutchins and Harold L. Somers. An Introduction to Machine Translation. Academic Press, 1992
- Y. Wilks. Machine Translation: Its Scope and Limits. Springer, 2009
- P. Koehn. Statistical Machine Translation. Cambridge University Press, 2010
- C. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press, 1999
- D. Jurafsky and J. Martin. Speech and Language Processing. 2nd Edition. Prentice-Hall, 2008
- A. Lopez. Statistical Machine Translation. In ACM Computing Surveys 40(3), Article 8, pp. 1-49, 2008 (http://homepages.inf.ed.ac.uk/alopez/papers/survey.pdf)
Electronic resources for this module
- Moses: Statistical MT system: http://www.statmt.org/moses
- Joshua: Hierarchical Statistical MT system: http://cs.jhu.edu/~ccb/joshua/
- Corpora for translation: http://www.statmt.org/europarl/
- CAST: Automatic summarisation system: http://clg.wlv.ac.uk/projects/CAST/demo/
- Cunei: example-based MT platform: http://www.cunei.org/about/
- GATE: Framework for Information Extraction: http://gate.ac.uk/
- Google: Information Retrieval system: http://www.google.com
- Text categorization demo/system: http://alias-i.com/lingpipe/demos/tutorial/classify/read-me.html
- Ferret: Plagiarism detection system: http://homepages.feis.herts.ac.uk/~pdgroup/
- HTK: Speech recognition software: http://htk.eng.cam.ac.uk/