ThematicConcentration

Documentation

The program is based on the algorithms described by Radek Čech, Ioan-Iovitz Popescu and Gabriel Altmann in their article "Methods of analysis of a thematic concentration of the text" (CSLR 2011).

A) Input

Choose the file containing the list of synsemantic words. If no such file exists for the language you are exploring, create one and fill it with the most common synsemantic words, one per line.
Choose the text(s) you want to explore. Use multiple selection if you want to add more than one text.
Specify the type of the text(s) in the combobox. A concordance (a list of word types with their frequencies) in the following format can also be used:
WordType1 8
WordType2 19
(Each entry goes on its own line, with the word type separated from its frequency by a tab.)
Specify the case sensitivity.
Press the "Calculate" button.
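
The two input formats described above can be read with a few lines of code. The following Python sketch is only an illustration and is not taken from the program itself: the function names read_frequency_list and read_word_list are hypothetical, and merging the counts of types that differ only in case is an assumption about what case-insensitive processing means.

```python
import io

def read_frequency_list(lines, case_sensitive=True):
    """Parse 'type<TAB>frequency' lines into a dict mapping type -> count."""
    counts = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        word, freq = line.rsplit("\t", 1)  # split on the last tab only
        if not case_sensitive:
            word = word.lower()  # assumption: case variants are merged
        counts[word] = counts.get(word, 0) + int(freq)
    return counts

def read_word_list(lines, case_sensitive=True):
    """Read a synsemantic word list with one word per line into a set."""
    words = set()
    for line in lines:
        word = line.strip()
        if word:
            words.add(word if case_sensitive else word.lower())
    return words

sample = "The\t8\nthe\t19\nhouse\t3\n"
print(read_frequency_list(io.StringIO(sample), case_sensitive=False))
# prints {'the': 27, 'house': 3}
```

Both functions accept any iterable of lines, so an open file object can be passed directly.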

B) Processing

With a good database of synsemantic words, the vast majority of the work is done automatically. If the "Sort out the synsemantic words manually" option is chosen, you are asked to classify each type above the H-point that is not already in the list of synsemantic words as autosemantic or synsemantic. Types classified as synsemantic are added to the list.
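
For orientation, the calculation behind this step can be sketched in Python. This is a minimal sketch under stated assumptions, not the program's own code. It uses the definitions as they are usually given in the literature on thematic concentration: the H-point h is the rank r at which r equals the frequency f(r), found by linear interpolation between the two neighbouring ranks when no exact match exists, and TC = 2 * sum((h - r') * f(r')) / (h * (h - 1) * f(1)), summed over the autosemantic types with rank r' above the H-point. The function names, the fallback used when no crossing is found, and the sequential ranking of tied frequencies are all assumptions.

```python
def h_point(freqs):
    """H-point of a descending list of positive frequencies (ranks start at 1)."""
    prev_r = prev_f = None
    for r, f in enumerate(freqs, start=1):
        if f == r:
            return float(r)  # exact match: rank equals frequency
        if f < r:
            if prev_r is None:
                raise ValueError("frequencies must be positive integers")
            # Interpolate between the last rank with f > r and this one.
            return (prev_f * r - f * prev_r) / (r - prev_r + prev_f - f)
        prev_r, prev_f = r, f
    # Assumption: if every frequency exceeds its rank, use the last rank.
    return float(len(freqs))

def thematic_concentration(ranked, synsemantic):
    """ranked: (type, frequency) pairs sorted by descending frequency.

    Returns (TC, list of autosemantic types above the H-point).
    """
    freqs = [f for _, f in ranked]
    h = h_point(freqs)
    if h <= 1:
        return 0.0, []  # avoids division by zero for very short lists
    thematic = [(r, w, f) for r, (w, f) in enumerate(ranked, start=1)
                if r < h and w not in synsemantic]
    total = sum((h - r) * f for r, _, f in thematic)
    tc = 2 * total / (h * (h - 1) * freqs[0])
    return tc, [w for _, w, _ in thematic]

ranked = [("the", 10), ("of", 6), ("king", 4), ("a", 2), ("castle", 1)]
tc, thematic = thematic_concentration(ranked, {"the", "of", "a"})
print(round(tc, 4), thematic)  # prints 0.0343 ['king']
```

In this toy example the H-point lies between ranks 3 and 4 (h = 10/3), so "king" is the only autosemantic type above it.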

C) Output

The file Output.txt is created in the main directory of the program. It contains the thematic concentration of your texts in a format suitable for further processing.
The file YourFile.txt_DIC is also created; it contains the concordance of the types above the H-point of YourFile.txt.