
This demo illustrates Celatro's ability to stem the text of various types of
documents. Stemming is an important preprocessing step for many NLP
applications, such as machine translation and information retrieval.
Instructions
To use this demo, please use the following controls to specify the text whose
tokens should be stemmed:
- Use Demo File: Select a demo file from a drop-down list offering three
  literary works.
- Upload File: Specify a URL or browse to a specific folder and file (text,
  XML, HTML or RTF files only; must be smaller than 2 MB).
- Specify an URL: Specify any document you can locate using a URL.
- Write Some Text: Type or paste the text to be tokenized.
- Select Encoding: Depending on your previous choices, you might also need to
  select the encoding to apply from a drop-down list: UTF-8 or UTF-16.
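The upload constraints listed above (file type, size limit, encoding) can be pictured as a small pre-check. The following is only a sketch, not Celatro's actual validation: the function name `check_upload`, the extension list, and the reading of "2 MB" as 2 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: the accepted file types map to these extensions.
ALLOWED_EXTENSIONS = {".txt", ".xml", ".html", ".htm", ".rtf"}
# Assumption: "2 MB" is interpreted as 2 * 1024 * 1024 bytes.
MAX_BYTES = 2 * 1024 * 1024
# The two encodings offered by the demo's drop-down list.
ALLOWED_ENCODINGS = {"utf-8", "utf-16"}


def check_upload(filename: str, data: bytes, encoding: str) -> str:
    """Return the decoded text, or raise ValueError if a constraint is violated."""
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {filename}")
    if len(data) >= MAX_BYTES:
        raise ValueError("file must be smaller than 2 MB")
    if encoding.lower() not in ALLOWED_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding}")
    # A wrong encoding choice surfaces here as UnicodeDecodeError,
    # which is itself a subclass of ValueError.
    return data.decode(encoding)
```

Decoding with the wrong encoding usually fails or produces garbled tokens, which is presumably why the demo asks for the encoding explicitly.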
After configuring your input and clicking Submit, scroll down to view the
results page, which shows the outcome of Celatro's tokenization: a) the ordinal
number of the token, b) the token itself, c) the stem, and d) the suffixes.
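To give a rough idea of what such a four-column result looks like, here is a toy suffix-stripping stemmer. It is not Celatro's algorithm: the suffix list, the minimum stem length, and the names `tokenize`, `stem`, and `stem_report` are hypothetical, and the sketch strips at most one suffix per token, whereas the demo may report several.

```python
import re

# Illustrative suffix list only; these are not Celatro's rules.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text: str) -> list[str]:
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token: str) -> tuple[str, str]:
    """Strip the first matching suffix, keeping a stem of at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)], suffix
    return token, ""


def stem_report(text: str) -> list[tuple[int, str, str, str]]:
    """Return (ordinal, token, stem, suffix) rows, numbering tokens from 1."""
    return [(i, tok, *stem(tok)) for i, tok in enumerate(tokenize(text), start=1)]


for row in stem_report("The dogs walked"):
    print(row)
# (1, 'the', 'the', '')
# (2, 'dogs', 'dog', 's')
# (3, 'walked', 'walk', 'ed')
```

A rule list this small mishandles many words (for example, irregular forms), so it only mirrors the shape of the results page, not its quality.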