TypeTokener
Documentation
The program is based on the algorithms described by Jiří Milička in his article "Type-token & Hapax-token
Relation: A Combinatorial Model" (Glottotheory 2009).
A) Input
- Choose the text(s) you want to explore. Use multiple selection if you want to add more than one text.
- Specify (in the combobox) the type of the texts. A concordance in the following format can also be
used:
WordType1 8
WordType2 19
(Each word type should be separated from its frequency by a tab, with one entry per line.)
- Specify the quantities you are interested in: the number of types, the number of hapax legomena, dis legomena, etc.
(frequencies 1-4 by default).
- Specify whether you want to measure and/or calculate the quantities. Choose the output files.
- Specify the density of the output (by default, every 10th line will be recorded).
- Specify whether you want to save a concordance file (a byproduct of the process).
- Press the "Go!" button.
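The input options above can be illustrated with a minimal Python sketch. It is not the program's own code: the function names `parse_concordance` and `measure`, the parameters `max_frequency` and `density`, and the assumption that the text is already split into tokens are all hypothetical and serve only to show what the concordance format and the "measure" option amount to.

```python
from collections import Counter


def parse_concordance(text):
    """Parse 'word type<TAB>frequency' lines into a dict of type -> frequency."""
    frequencies = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        word, count = line.rsplit("\t", 1)
        frequencies[word] = int(count)
    return frequencies


def measure(tokens, max_frequency=4, density=10):
    """At every `density`-th token, record the position, the number of types,
    and the number of types seen exactly 1..max_frequency times so far."""
    counts = Counter()
    spectrum = [0] * (max_frequency + 1)  # spectrum[k] = types seen exactly k times
    rows = []
    for position, token in enumerate(tokens, start=1):
        old = counts[token]
        counts[token] = old + 1
        if 1 <= old <= max_frequency:
            spectrum[old] -= 1  # the type leaves its old frequency class
        if old + 1 <= max_frequency:
            spectrum[old + 1] += 1  # and enters the next one
        if position % density == 0:
            rows.append((position, len(counts), *spectrum[1:]))
    return rows


if __name__ == "__main__":
    print(parse_concordance("WordType1\t8\nWordType2\t19\n"))
    print(measure("a b a c b a d e f g".split()))
```

Each recorded row holds the token position, the number of types, and then the counts of hapax legomena, dis legomena and so on, up to `max_frequency`.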
B) Processing
The algorithm is computationally demanding, so please be patient when exploring long texts (no progress indicator is shown).
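To give a feeling for the cost, here is a minimal Python sketch of one textbook combinatorial calculation: the expected number of types in a random sample of tokens drawn without replacement from the whole text. Treating this as the "calculated" quantity is an assumption made for illustration only. It is not necessarily the formula used in the article or in TypeTokener, and the name `expected_types` is hypothetical.

```python
from math import comb


def expected_types(frequencies, sample_size):
    """Expected number of distinct types among `sample_size` tokens drawn
    without replacement from a text with the given type frequencies."""
    total = sum(frequencies)
    all_samples = comb(total, sample_size)
    # A type with frequency f is missing from comb(total - f, sample_size) samples.
    absent = sum(comb(total - f, sample_size) for f in frequencies)
    return len(frequencies) - absent / all_samples


if __name__ == "__main__":
    # A toy text with one type occurring twice and two hapax legomena.
    for n in range(5):
        print(n, expected_types([2, 1, 1], n))
```

Every point of such a curve needs one large binomial coefficient per word type, so the work grows with both the vocabulary size and the number of recorded points.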
C) Output
- After the "Done!" message pops up, you can click on:
- "View RFR chart" to explore the rank-frequency relation of the text (click on the "show types" button to see
which point represents which type).
- "View TTR chart" to explore the type-token relation of the text. Use the checkboxes to specify which line(s) you
want to show.
- Zoom: Click the left mouse button and drag in the chart field.
- Scroll: Click the right mouse button and drag in the chart field.
- You can save the charts (as JPG) or print them by clicking on the "Save" or "Print" buttons.
- Switch the axes to a logarithmic scale if you want to.
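The data behind a rank-frequency chart can be sketched in a few lines of Python, starting from a concordance (word type and frequency). This is an illustration only: the name `rank_frequency` is hypothetical, and breaking ties alphabetically is an assumption, not documented behaviour of the program.

```python
def rank_frequency(frequencies):
    """Return (rank, word type, frequency) tuples, most frequent type first."""
    # Sort by descending frequency; ties are broken alphabetically (an assumption).
    ordered = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, freq) for rank, (word, freq) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    for row in rank_frequency({"WordType1": 8, "WordType2": 19}):
        print(*row, sep="\t")
```

Plotting rank against frequency on logarithmic axes gives the usual view of such data.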