
This demo illustrates Celatro's ability to quickly tokenize various types of documents. To handle different languages and the writing systems they use, Celatro's tokenization is based on language-specific alphabets.

Instructions

To use this demo, first choose a language from the drop-down menu below, then use the following controls to specify the text to be tokenized:

  • Use Demo File: Select a demo file from a drop-down list offering three great works of the selected language’s literature.
  • Upload File: Specify a URL or browse to a specific folder and file (text, XML, HTML or RTF files only; must be smaller than 2 MB).
  • Specify a URL: Specify any document you can locate using a URL.
  • Write Some Text: Type or paste text to be tokenized.
  • Select Encoding: Depending on your previous choices, you might also need to select the encoding to apply from a drop-down list: UTF-8 or UTF-16.
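
As a rough illustration of the file-size limit and the encoding choice described above, the following Python sketch decodes raw document bytes before tokenization. The function name `decode_document`, the constant names, and the reading of "2 MB" as 2 × 1024 × 1024 bytes are assumptions made for this example; they are not taken from Celatro itself.

```python
# Hypothetical helper; names and limits are illustrative, not Celatro's API.
MAX_BYTES = 2 * 1024 * 1024          # assumes "2 MB" means 2 * 1024 * 1024 bytes
ENCODINGS = ("utf-8", "utf-16")      # the two encodings offered by the demo


def decode_document(data: bytes, encoding: str) -> str:
    """Check the size limit and decode the bytes with the chosen encoding."""
    if encoding not in ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding}")
    if len(data) >= MAX_BYTES:
        raise ValueError("document must be smaller than 2 MB")
    return data.decode(encoding)


text = decode_document("Straße und Häuser".encode("utf-16"), "utf-16")
print(text)
```

Choosing the wrong encoding for a file typically produces either a decoding error or unreadable text, which is why the selection matters before tokenization starts.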

After configuring your input and clicking Submit, scroll down to view the results, which list for each token a) its ordinal number, and b) the token itself.
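
To make the idea of alphabet-based tokenization and the numbered output more concrete, here is a minimal Python sketch. It is not Celatro's implementation: the `ALPHABETS` table, the `tokenize` and `numbered` functions, and the rule that a token is a maximal run of characters from the selected alphabet are all assumptions chosen for illustration.

```python
import string

# Hypothetical alphabets for illustration only; not Celatro's actual data.
ALPHABETS = {
    "english": set(string.ascii_letters),
    "german": set(string.ascii_letters) | set("äöüÄÖÜß"),
}


def tokenize(text, language):
    """Split text into maximal runs of characters from the language's alphabet."""
    alphabet = ALPHABETS[language]
    tokens = []
    current = []
    for ch in text:
        if ch in alphabet:
            current.append(ch)          # character belongs to the current token
        elif current:
            tokens.append("".join(current))  # any other character ends the token
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def numbered(tokens):
    """Pair each token with its ordinal number, starting at 1."""
    return list(enumerate(tokens, start=1))


for ordinal, token in numbered(tokenize("Straße und Häuser, 2 Stück!", "german")):
    print(ordinal, token)
```

In this sketch the language choice changes the result: with the German alphabet, "Straße" stays a single token, whereas with the English alphabet the "ß" is treated as a separator and the word splits into "Stra" and "e".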

Please select a language:

