BiRD - Research Questions

An automatic system to Build Resource Databases for researchers



Research questions and problems to be addressed

This project will investigate the extent to which methods from computational linguistics can be used to compile lists of resources automatically. In addition to the system to be developed, the project will provide new insights into the discourse structure of email messages and web pages. It will also create a corpus of emails and web pages containing information about resources, annotated for inter- and intra-document coreference and for important notions such as names. Emerging fields of computational linguistics, such as cross-document coreference and multi-document summarisation, will be investigated, and a new evaluation methodology will be developed.

Templates will be used to encode information about resources. Templates are normally built by experts, which makes them expensive, so the project will seek a semi-automatic, corpus-based template acquisition process. All the modules to be developed will be tuned to process web pages, which will benefit other researchers processing similar texts.
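To make the idea of a template more concrete, the following is a minimal sketch in Python of how information about one resource might be encoded as a set of slots. It is only an illustration under stated assumptions: the class name `ResourceTemplate`, the slot names (`name`, `resource_type`, `url`, `sources`) and the `merge` method are all hypothetical and are not taken from the project. The `merge` step loosely mirrors the role of cross-document coreference, where an email and a web page that mention the same resource each contribute part of the information.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResourceTemplate:
    """Hypothetical template for one resource; slot names are illustrative only."""

    name: Optional[str] = None
    resource_type: Optional[str] = None
    url: Optional[str] = None
    # Identifiers of the documents (emails, web pages) the slots were filled from.
    sources: List[str] = field(default_factory=list)

    def merge(self, other: "ResourceTemplate") -> "ResourceTemplate":
        """Combine two partially filled templates assumed to describe the same resource.

        For each slot, the value from `self` is kept if present;
        otherwise the value from `other` is used.
        """
        return ResourceTemplate(
            name=self.name or other.name,
            resource_type=self.resource_type or other.resource_type,
            url=self.url or other.url,
            sources=self.sources + other.sources,
        )


# Invented example data: one template filled from an email, one from a web page.
from_email = ResourceTemplate(name="Example Corpus", sources=["email-001"])
from_web = ResourceTemplate(
    name="Example Corpus",
    url="http://example.org/corpus",
    sources=["web-001"],
)
merged = from_email.merge(from_web)
```

In this sketch, slots that no document fills stay empty (`None`) rather than being guessed, and the `sources` list records where each piece of information came from. Deciding that two templates really describe the same resource is the hard part and is assumed here rather than implemented.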

Motivation
The system to be implemented




This page is maintained by Richard Evans