
This demo illustrates Celatro's ability to stem the text of various types of documents. Stemming is an important preprocessing step for many NLP applications, such as machine translation and information retrieval.

Instructions

To use this demo, please use the following controls to specify the text whose tokens should be stemmed:

  • Use Demo File: Select a demo file from a drop-down list offering three literary works.
  • Upload File: Specify a URL or browse to a specific folder and file (text, XML, HTML, or RTF files only; must be smaller than 2 MB).
  • Specify a URL: Specify any document you can locate using a URL.
  • Write Some Text: Type or paste the text to be stemmed.
  • Select Encoding: Depending on your previous choices, you might also need to select the encoding to apply from a drop-down list: UTF-8 or UTF-16.
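The upload constraints listed above can be sketched as a small pre-check. This is only an illustration, not Celatro's actual validation: the function name `check_upload`, the mapping of "text, xml, html or rtf" to specific file extensions, and the reading of "2 MB" as 2 × 1024 × 1024 bytes are all assumptions.

```python
import os

# Assumed extension mapping for "text, xml, html or rtf"; the demo page does not list extensions.
ALLOWED_EXTENSIONS = {".txt", ".xml", ".html", ".htm", ".rtf"}
# Assumes "2 MB" means 2 * 1024 * 1024 bytes.
MAX_BYTES = 2 * 1024 * 1024
# The two encodings offered in the drop-down list.
ALLOWED_ENCODINGS = {"utf-8", "utf-16"}


def check_upload(filename, data, encoding):
    """Hypothetical pre-check: validate type, size, and encoding, then return the decoded text."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    if len(data) >= MAX_BYTES:
        raise ValueError("file must be smaller than 2 MB")
    if encoding.lower() not in ALLOWED_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding}")
    # Raises UnicodeDecodeError (a ValueError subclass) if the bytes do not match the encoding.
    return data.decode(encoding)
```

Choosing the wrong encoding typically surfaces at the decode step, which is why the encoding selection matters for uploaded files.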

After configuring your input and clicking Submit, scroll down to view the results page, which will show the outcome of Celatro's stemming for each token: a) its ordinal number, b) the token itself, c) its stem, and d) its suffixes.
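To make the shape of the results table concrete, here is a minimal sketch that produces the same four columns with a toy suffix stripper. It is not Celatro's algorithm: the suffix list, the minimum stem length, and the names `toy_stem` and `stem_table` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical suffix list for illustration only; these are not Celatro's rules.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(token):
    """Return (stem, suffix), stripping the first suffix in SUFFIXES that matches."""
    lower = token.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters in the stem to limit over-stripping.
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)], suffix
    return lower, ""


def stem_table(text):
    """Return rows of (ordinal, token, stem, suffix) for each alphabetic token."""
    tokens = re.findall(r"[A-Za-z]+", text)
    rows = []
    for ordinal, token in enumerate(tokens, start=1):
        stem, suffix = toy_stem(token)
        rows.append((ordinal, token, stem, suffix))
    return rows


for row in stem_table("The dogs barked"):
    print(row)
```

A real stemmer handles irregular forms, language-specific rules, and multiple stacked suffixes, none of which this sketch attempts.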

Please select a language:

