TEM is a graph-based temporal topic model for very small corpora. Detailed information about TEM can be found in the following publication:
Liebe L, Baum J, Cech T, Scheibel W, and Dollner J (2024). Detecting and Comparing LLM Capabilities to Human Writers through Linguistic Analysis
To use TEM, first build the model executable and term distance server Docker container.
- Build the model:

      mkdir build && cd build
      cmake .. && make

  An executable will be compiled into build/topic_evolution_model.
- Build the term distance server:

      make -C term-distance build

  Note that this may take a while since it downloads word2vec models (around 8 GB in total).
Once this is set up, you can start using TEM.
- Run the term distance server, e.g. with

      make -C term-distance run

  This will expose the server on localhost:8000. To stop the server, run

      make -C term-distance kill
- You can then run the model executable (in build/topic_evolution_model) directly or use the provided Python interface in the script/ directory, which also features parallelization.
  Note that you need to specify the URL to the term distance server via the environment variable TEM_WORD_DISTANCE_ENDPOINT, e.g. if running the executable directly:

      TEM_WORD_DISTANCE_ENDPOINT=http://localhost:8000/similarity \
        build/topic_evolution_model [...]
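
If you want to start the executable from your own Python code instead of a shell, the steps above could be sketched roughly as follows. This is not the interface shipped in script/; it is a minimal illustration using only the standard library. The helper names (`server_reachable`, `tem_environment`, `run_tem`) are hypothetical, and the reachability check only tests whether something accepts TCP connections on the configured host and port, not whether the server answers correctly.

```python
import os
import socket
import subprocess
from urllib.parse import urlparse

# Variable name and default URL are taken from the instructions above.
ENDPOINT_VAR = "TEM_WORD_DISTANCE_ENDPOINT"
DEFAULT_ENDPOINT = "http://localhost:8000/similarity"


def server_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parsed = urlparse(endpoint)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


def tem_environment(endpoint: str = DEFAULT_ENDPOINT) -> dict:
    """Copy the current environment and add the term distance endpoint to the copy."""
    env = dict(os.environ)
    env[ENDPOINT_VAR] = endpoint
    return env


def run_tem(args, executable="build/topic_evolution_model", endpoint=DEFAULT_ENDPOINT):
    """Run the model executable with the endpoint set.

    `args` is passed through unchanged; this sketch makes no assumption
    about which command-line arguments the executable accepts.
    """
    if not server_reachable(endpoint):
        raise RuntimeError(f"term distance server not reachable at {endpoint}")
    return subprocess.run([executable, *args], env=tem_environment(endpoint), check=True)
```

Checking reachability first means a missing `make -C term-distance run` shows up as a clear error before the model starts. Passing a copied environment to `subprocess.run` keeps the variable local to the child process, which mirrors the `VAR=value command` form shown above.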