Demos of LT applications

  • Syntactic Complexity Sign tagger

    This demo is an implementation of the machine learning method presented in Dornescu et al. (2013) to classify signs of syntactic complexity. In this setting, signs of syntactic complexity are a predefined set of conjunctions, relative clauses, and punctuation marks. The tagger classifies them in accordance with the scheme described by Evans and Orasan (2013). There are three broad groups of signs: coordinators, left boundaries of subordinate clauses, and right boundaries of subordinate clauses. The classification scheme is fine-grained, with numerous sub-classes within each of these three broad groups. The classifier was trained on the annotated corpus listed below (see signs of syntactic complexity). Sign tagging is an essential part of the sentence analysis exploited in our approach to sentence rewriting (Evans et al., 2014).
  • MARS

    This demo has been deactivated due to restrictions on our use of essential third-party resources (server obsolescence).

    The demo is an implementation of the knowledge-poor approach to anaphora resolution presented in Mitkov et al. (2002). It includes a module to classify the pronoun "it" as either anaphoric or non-anaphoric, in accordance with the scheme presented in Evans (2001).

Annotated corpora

  • Signs of syntactic complexity

    This resource comprises three collections of text from the genres/domains of news, patient healthcare information, and literature. In each case, a subset of signs of syntactic complexity has been annotated with information about their syntactic linking and bounding functions. The subset includes conjunctions, complementisers, wh-words, and punctuation marks. From this page, you can access the annotation scheme, the annotation guidelines, and the corpus.
