TypeTokener

Documentation

The program is based on the algorithms described by Jiří Milička in his article "Type-token & Hapax-token Relation: A Combinatorial Model" (Glottotheory, 2009).

Choose the text(s) you want to explore. Use multiple selection if you want to add more than one text.
Specify (in the combobox) the type of the texts. A concordance in the following format can also be used:
WordType1 8
WordType2 19
(Each word type should be separated from its frequency by a tab, with one entry per line.)
Specify the quantities you are interested in: the number of types, the number of hapax legomena, dis legomena, etc. (frequencies 1-4 by default).
Specify whether you want to measure and/or calculate the quantities. Choose the output files.
Specify the density of the output (by default, every 10th line will be recorded).
Specify whether you want to save a concordance file (a byproduct of the process).
Press the "Go!" button.
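
The "measure" option in the steps above can be pictured with the following minimal Python sketch. It is not the program's own code: the function name measure, its parameters, and the whitespace tokenization in the usage note are assumptions made for illustration. It counts the number of types and the number of words occurring exactly 1 to 4 times as the text grows, and records a row at every 10th token.

```python
from collections import Counter


def measure(tokens, max_freq=4, step=10):
    """Return rows (tokens_read, types, freq_1, ..., freq_max_freq).

    A row is recorded after every `step`-th token (hypothetical
    counterpart of the output density setting).
    """
    freq = Counter()      # word type -> number of occurrences so far
    spectrum = Counter()  # frequency -> number of types with that frequency
    rows = []
    for n, token in enumerate(tokens, start=1):
        old = freq[token]          # 0 if the type has not been seen yet
        if old:
            spectrum[old] -= 1     # the type leaves its old frequency class
        freq[token] = old + 1
        spectrum[old + 1] += 1     # and enters the next one
        if n % step == 0:
            counts = [spectrum[k] for k in range(1, max_freq + 1)]
            rows.append((n, len(freq), *counts))
    return rows
```

For example, measure(text.lower().split()) would treat every whitespace-separated string as a token; the tokenization the program actually applies may differ.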

The algorithm is computationally demanding, so please be patient when exploring long texts (no progress indicator is shown).
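
To give a rough idea of where the cost may come from, here is a Python sketch of one standard combinatorial calculation: the expected number of types (and of words with a given frequency) in a sample of n tokens drawn without replacement from the text. This is the textbook hypergeometric formula, assumed here for illustration; whether the program computes exactly this is not stated above, so consult the cited article for the actual model. The function names are hypothetical. The input is a concordance in the tab-delimited format described earlier.

```python
from math import comb


def parse_concordance(text):
    """Parse 'type<TAB>frequency' lines into a dict of type -> frequency."""
    freqs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        word, count = line.rsplit("\t", 1)
        freqs[word] = int(count)
    return freqs


def expected_with_frequency(freqs, n, k):
    """Expected number of types occurring exactly k times among n tokens."""
    if k > n:
        return 0.0
    total = sum(freqs.values())
    denom = comb(total, n)
    # For a type with frequency f, the chance that exactly k of its
    # tokens fall into the sample is C(f, k) * C(total - f, n - k) / C(total, n).
    return sum(comb(f, k) * comb(total - f, n - k) / denom
               for f in freqs.values())


def expected_types(freqs, n):
    """Expected number of types: every type that is not missing from the sample."""
    total = sum(freqs.values())
    denom = comb(total, n)
    return sum(1 - comb(total - f, n) / denom for f in freqs.values())
```

Each call sums over all types and works with very large binomial coefficients, and the sum has to be repeated for every recorded sample size, which is why long texts take a while under this assumed approach.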

After the "Done!" message pops up, you can click on:
1. "View RFR chart" to explore the rank-frequency relation of the text (click the "show types" button to see which point represents which type).
2. "View TTR chart" to explore the type-token relation of the text. Use the checkboxes to specify which line(s) you want to show.
Zoom: click and drag with the left mouse button in the chart area.
Scroll: click and drag with the right mouse button in the chart area.
You can save the charts (as JPG) or print them by clicking the "Save" or "Print" buttons.
Change the axes scale to logarithmic if you want to.
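
The data behind a rank-frequency chart can be reproduced from a concordance with a few lines of Python. This is a sketch with hypothetical function names, not the program's code: types are sorted by descending frequency and numbered from rank 1, and a second helper converts the points to base-10 logarithms, which corresponds to switching both axes to a logarithmic scale.

```python
from math import log10


def rank_frequency(freqs):
    """Return (rank, frequency, type) triples, most frequent type first."""
    # Ties are broken alphabetically so that the order is reproducible.
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, freq, word)
            for rank, (word, freq) in enumerate(ranked, start=1)]


def log_points(points):
    """Convert (rank, frequency, type) triples to log10 coordinates."""
    return [(log10(rank), log10(freq)) for rank, freq, _ in points]
```

For example, rank_frequency({"WordType1": 8, "WordType2": 19}) puts WordType2 at rank 1 and WordType1 at rank 2.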