ThematicConcentration
Documentation
The program is based on the algorithms described by Radek Čech, Ioan-Iovitz Popescu and Gabriel Altmann
in their article "Methods of analysis of a thematic concentration of the text" (CSLR 2011).
A) Input
- Choose the file containing the list of synsemantic words. If no such file exists for the language you
are exploring, create one and fill it with the most common synsemantic words (one word per line).
- Choose the text(s) you want to explore. Use multiple selection if you want to add more than one text.
- Specify the type of the text(s) in the combobox. A concordance (a list of word types with their
frequencies) in the following format can also be used:
WordType1 8
WordType2 19
(Each word type should be separated from its frequency by a tab, with each entry on a new line.)
- Specify the case sensitivity.
- Press the "Calculate" button.
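
The two input formats described above (a newline-delimited list of synsemantic words and a tab-delimited list of word types with frequencies) can be read with a few lines of Python. This is only a minimal sketch and not the program's own code: the function names load_synsemantic_words and load_frequency_list and the case_sensitive parameter are hypothetical, and lowercasing is assumed as the way to ignore case.

```python
import io


def load_synsemantic_words(lines, case_sensitive=False):
    """Collect synsemantic words from an iterable of lines, one word per line."""
    words = set()
    for line in lines:
        word = line.strip()
        if not word:
            continue  # skip blank lines
        # Assumption: ignoring case is done by lowercasing.
        words.add(word if case_sensitive else word.lower())
    return words


def load_frequency_list(lines, case_sensitive=False):
    """Parse 'WordType<TAB>frequency' lines into a dict of counts."""
    counts = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        word, freq = line.rstrip("\n").rsplit("\t", 1)
        key = word if case_sensitive else word.lower()
        # Entries that differ only by case are merged when case is ignored.
        counts[key] = counts.get(key, 0) + int(freq)
    return counts


# An open text file can be passed instead of io.StringIO.
sample = io.StringIO("WordType1\t8\nWordType2\t19\n")
counts = load_frequency_list(sample, case_sensitive=True)
```

Both functions accept any iterable of lines, so an open file object works the same way as the in-memory sample.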
B) Processing
With a good database of synsemantic words, the vast majority of the work will be done automatically.
If the "Sort out the synsemantic words manually" option is chosen, each type above the H-point that is not
contained in the list of synsemantic words will be sorted manually (autosemantic/synsemantic). The types
sorted as synsemantic will be added to the list.
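
The processing step can be sketched as follows. This is an illustration under stated assumptions, not the program's actual implementation. It assumes the h-point is the rank r at which the frequency f(r) equals r (interpolated linearly when no rank matches exactly), and it uses the formula commonly given for thematic concentration, TC = 2 * sum((h - r') * f(r')) / (h * (h - 1) * f(1)), summed over the autosemantic types ranked above the h-point. The function names h_point and thematic_concentration are hypothetical, and tied frequencies are ranked alphabetically here, which may differ from the program.

```python
def h_point(freqs):
    """Return the h-point of frequencies sorted in descending order (ranks start at 1)."""
    prev = None
    for r, f in enumerate(freqs, start=1):
        if f == r:
            return float(r)  # exact crossing: f(r) == r
        if f < r:
            if prev is None:
                return float(r)  # degenerate case: first frequency below 1
            # Interpolate between the last rank with f > r and this one.
            r_i, f_i = prev
            return (f_i * r - f * r_i) / (r - r_i + f_i - f)
        prev = (r, f)
    # Assumption: if f(r) > r for every rank, fall back to the last rank.
    return float(len(freqs))


def thematic_concentration(counts, synsemantic):
    """Return (h, autosemantic types above the h-point, TC)."""
    if not counts:
        return 0.0, [], 0.0
    # Sort by descending frequency; ties are broken alphabetically (assumption).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    freqs = [f for _, f in ranked]
    h = h_point(freqs)
    # "Above the h-point" is taken to mean rank < h.
    thematic = [
        (r, word, f)
        for r, (word, f) in enumerate(ranked, start=1)
        if r < h and word not in synsemantic
    ]
    if h <= 1:
        return h, [w for _, w, _ in thematic], 0.0  # formula undefined for h <= 1
    tc = 2 * sum((h - r) * f for r, _, f in thematic) / (h * (h - 1) * freqs[0])
    return h, [w for _, w, _ in thematic], tc


# Invented sample data for illustration only.
demo_counts = {"the": 10, "of": 8, "king": 6, "and": 5, "castle": 4, "sword": 3}
demo_h, demo_types, demo_tc = thematic_concentration(demo_counts, {"the", "of", "and"})
```

In the manual mode described above, the types returned in the second element would be the ones presented for sorting; any that are marked synsemantic would be appended to the word list before the value is computed.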
C) Output
- Output.txt is created in the main directory of the program. It contains the thematic concentration of
your texts in a format suitable for further processing.
- The file YourFile.txt_DIC is created. It contains the concordance of the types above the H-point of
YourFile.txt.