
Celatro provides high-performance indexing tools designed to support full-text
indexing applications. These tools feature customizable alphabets and
tokenizers that can be quickly adapted to any language, natural or
artificial. (Language-specific versions are available for all widely spoken
languages.)
This demo illustrates the flexibility and speed of Celatro's indexing
capabilities.
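
To make the idea of a customizable alphabet concrete, here is a minimal sketch in Python. It is not Celatro's API: the function name `tokenize` and the `ENGLISH` and `HEX` alphabets are hypothetical, and the sketch assumes a token is simply a maximal run of characters that belong to the chosen alphabet, folded to lowercase.

```python
from typing import AbstractSet, Iterator, Tuple


def tokenize(text: str, alphabet: AbstractSet[str]) -> Iterator[Tuple[str, int]]:
    """Yield (token, character offset) for each maximal run of alphabet characters.

    Hypothetical helper for illustration only; not part of Celatro.
    """
    start = None  # offset where the current token began, or None between tokens
    for i, ch in enumerate(text):
        if ch.lower() in alphabet:
            if start is None:
                start = i
        elif start is not None:
            yield text[start:i].lower(), start
            start = None
    if start is not None:  # flush a token that runs to the end of the text
        yield text[start:].lower(), start


# Two example alphabets (assumptions): a natural language and an artificial one.
ENGLISH = frozenset("abcdefghijklmnopqrstuvwxyz'")
HEX = frozenset("0123456789abcdef")

if __name__ == "__main__":
    print(list(tokenize("Don't panic!", ENGLISH)))
    print(list(tokenize("0xFF + 1a", HEX)))
```

Swapping the alphabet changes what counts as a token without touching the tokenizer itself, which is the kind of adaptation the paragraph above describes.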
Instructions
To use this demo, first choose a language from the drop-down menu provided
below, then choose the text you wish to see indexed. This can be any text you
can locate using the following demo controls:
- Use Demo File: Select a demo file from a drop-down list offering three great
  works of the current language's literature.
- Upload File: Specify a URL or browse to a specific folder and file (text,
  XML, HTML, or RTF files only; must be smaller than 2 MB).
- Specify a URL: Specify any document you can locate using a URL.
- Write Some Text: Type or paste text to be indexed.
- Select Encoding: Depending on your previous choices, you might also need to
  select the encoding you wish applied from a drop-down list: UTF-8 or UTF-16.
After configuring your index operation and clicking Submit, scroll down to
view the results,
which are provided in a table that lists: a) each token found; b) the total
count of the given token in the document; c) the frequency with which it occurs
(i.e., the total count of the token divided by the total number of tokens); and
d) the bracket-delimited positions of the first five occurrences of the token
in the document. Note that, within each occurrence bracket, the first number
represents the zero-based index of the token in the document, and the second
number represents the offset of the token in characters (bytes for ASCII data;
words for UTF-8 and UTF-16 data).
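
The columns of the results table can be reproduced with a short sketch, shown here in Python. This is an illustration under stated assumptions, not the demo's implementation: the function `index_text` is hypothetical, tokens are taken to be lowercase runs of word characters, and offsets are Python string indices (code points), which may differ from the units the demo reports for a given encoding.

```python
import re
from typing import Dict, List, Tuple


def index_text(text: str, max_positions: int = 5) -> Dict[str, dict]:
    """Build a token table with count, frequency, and first occurrences.

    Hypothetical helper for illustration only. Each position is a pair
    (zero-based token index, character offset of the token in the text).
    """
    table: Dict[str, dict] = {}
    total = 0
    for token_index, match in enumerate(re.finditer(r"\w+", text)):
        token = match.group().lower()
        entry = table.setdefault(token, {"count": 0, "positions": []})
        entry["count"] += 1
        positions: List[Tuple[int, int]] = entry["positions"]
        if len(positions) < max_positions:  # keep only the first five by default
            positions.append((token_index, match.start()))
        total += 1
    for entry in table.values():
        # Frequency = occurrences of this token / total number of tokens.
        entry["frequency"] = entry["count"] / total
    return table


if __name__ == "__main__":
    for token, entry in index_text("the cat and the hat").items():
        brackets = " ".join(f"[{i}, {off}]" for i, off in entry["positions"])
        print(f"{token}\t{entry['count']}\t{entry['frequency']:.2f}\t{brackets}")
```

In this sample, "the" occurs twice among five tokens, so its frequency is 0.4, and its occurrences are token 0 at offset 0 and token 3 at offset 12.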