Please enter the text:


Compression rate:   ?      Token:  ?

Weighting method:  ?      Produce:  ?

Type of text:  ?      Use stoplist:  ?      Show terms:  ?


Compression rate: indicates how much of the text you want to keep in the summary. The length is computed in sentences rather than in words or characters, so extracts can sometimes seem longer than expected.
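As a rough illustration of a sentence-based length, the sketch below turns a compression rate into a number of sentences. It assumes the rate is a percentage and that the result is rounded up with at least one sentence kept; neither detail is stated above, and the function name sentences_to_keep is hypothetical.

```python
import math


def sentences_to_keep(total_sentences: int, compression_rate: float) -> int:
    """Return how many sentences an extract would contain.

    Assumption: compression_rate is a percentage in (0, 100].
    """
    if not 0 < compression_rate <= 100:
        raise ValueError("compression_rate must be in (0, 100]")
    if total_sentences <= 0:
        return 0
    # Assumption: round up and always keep at least one sentence.
    return max(1, math.ceil(total_sentences * compression_rate / 100))
```

Because the budget counts sentences, an extract made of a few long sentences can hold a larger share of the words than the rate suggests.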

Tokenisation method: lets you choose the tokenisation method used to compute the statistics. The possible values are word, when the actual surface form is used; lemma from WordNet, when WordNet is used to lemmatise the text; lemma from FDG, when FDG is used as the lemmatiser; stem, when Porter's stemmer is used to reduce the words; and truncation, when only the first 6 characters of each word are used. Each method has advantages and disadvantages. For example, truncation is very simple, but it sometimes fails to identify that two words are derived from the same morphological root. A more detailed discussion of each method can be found in our LREC paper.
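The two simplest options, word and truncation, can be sketched as follows. This is only an illustration: the lower-casing, the letters-only word pattern, and the function name tokenise are assumptions, and the lemma and stem options are left out because they depend on external tools.

```python
import re


def tokenise(text: str, method: str = "word", prefix_len: int = 6) -> list:
    """Split text into tokens using the 'word' or 'truncation' method."""
    # Assumption: words are lower-cased runs of ASCII letters.
    words = re.findall(r"[a-z]+", text.lower())
    if method == "word":
        return words  # surface forms, unchanged
    if method == "truncation":
        # Keep only the first prefix_len characters of each word.
        return [w[:prefix_len] for w in words]
    raise ValueError(f"unsupported method: {method}")
```

With truncation, "summaries" and "summarise" both become "summar" and are counted together, whereas "ran" and "running" become "ran" and "runnin" and stay separate even though they share a root.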

Weighting method: lets you choose the weighting method used to score the words. An explanation of each method can be found in our LREC paper.

Produce: indicates the type of output produced by the system. The possible values are extract, when an extract is produced; in context, when the selected sentences are marked with a different color in the text; and XML output, when an XML text is produced.

Type of text: indicates what kind of text is to be summarised. This parameter influences all the weighting methods except term frequency. The available values are newswire text and scientific text. To get the best results, use the genre closest to that of the text to be summarised.

Use stop list: indicates whether a stop list is used during the summarisation process. Stop lists filter out words that do not have a beneficial influence on the summarisation process. Experiments show that the quality of summaries improves when stop lists are used.
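A minimal sketch of stop-list filtering is given below. The stop list shown is a tiny hypothetical one chosen for illustration, not the list used by the system, and the name content_words is likewise an assumption.

```python
import re

# Hypothetical, deliberately tiny stop list; the real list is not given here.
STOP_LIST = frozenset({"a", "an", "and", "are", "in", "is", "of", "the", "to"})


def content_words(text: str, use_stop_list: bool = True) -> list:
    """Return the lower-cased words of text, optionally without stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    if not use_stop_list:
        return words
    # Drop every word that appears in the stop list.
    return [w for w in words if w not in STOP_LIST]
```

Filtering before scoring keeps very frequent function words such as "the" and "of" from dominating the statistics.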

Show terms: displays the top 20 terms identified by the program using the selected term weighting method.
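For the simplest weighting, plain term frequency, a top-terms list could be computed roughly as below. This is a sketch under assumptions: the system's other weighting methods are not reproduced, tokens are taken as lower-cased words, and the name top_terms is hypothetical.

```python
import re
from collections import Counter


def top_terms(text: str, n: int = 20) -> list:
    """Return the n most frequent terms as (term, count) pairs."""
    # Assumption: terms are lower-cased runs of ASCII letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Counter.most_common sorts by count, highest first.
    return Counter(tokens).most_common(n)
```

Calling top_terms(text) with the default n returns at most 20 pairs, matching the number of terms the option displays.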

Last changed: 23 Mar 2011