...

  • the import is faster, because two TermTagger instances do the work instead of one
  • saving of segments in the GUI is not slowed down by running imports

Understanding the setup: how the OpenTMSTermTagger works

There are multiple reasons for the recommended setup.

CPU usage and amount of TermTagger instances

  • While a TermTagger instance is tagging terms, it does not respond to any other requests. Each new request waits until the TermTagger responds again.
    • In the past this led to several TermTagger DOWN messages. The timeouts have since been adjusted, and a TermTagger that timed out is no longer considered DOWN.
  • A TermTagger instance uses one CPU core while tagging, so the number of TermTagger instances running in parallel should be guided by the number of available CPU cores.
  • For the GUI tagger only the one saved segment is processed, so it responds fast and the latency is low. One tagger for the GUI should therefore be enough for most setups.
  • Depending on how many imports are running and how big the imported tasks are, multiple import taggers can improve the throughput. Recommendation: 2-3 import TermTaggers.
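
The sizing advice above can be sketched as a small shell helper. This is only an illustration: the function name suggest_import_taggers and the rule "reserve one core for the GUI tagger, then clamp to 1..3" are assumptions derived from the recommendation, not part of translate5.

```shell
#!/bin/sh
# Hypothetical helper: suggest how many import TermTaggers to run.
# Assumption: one core is reserved for the GUI TermTagger and the result
# is clamped to 1..3 (the text recommends 2-3 import TermTaggers).
suggest_import_taggers() {
    cores=$1
    n=$((cores - 1))            # reserve one core for the GUI tagger
    if [ "$n" -lt 1 ]; then     # always keep at least one import tagger
        n=1
    fi
    if [ "$n" -gt 3 ]; then     # cap at the recommended upper bound
        n=3
    fi
    echo "$n"
}

# Usage on the TermTagger host (nproc is part of GNU coreutils):
#   suggest_import_taggers "$(nproc)"
```

On hosts with many cores the cap keeps the number of instances, and with it the memory footprint, small.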

Memory usage

  • The memory usage of the TermTaggers increases over time, since the TBX files in use are kept in memory.
  • The maximum memory of the import TermTaggers can therefore be reduced (to about 1 GB), while the GUI TermTagger should get slightly more so that it can hold more different TBX files at the same time.
  • If the TermTaggers consume too much memory (for example 3 TermTaggers with a maximum of 2.5 GB each = 7.5 GB RAM just for the TermTaggers), the Linux kernel may kill a TermTagger instance.
  • If an instance was killed by the kernel, check the dmesg output. It may look like this:
    [Mo Mär 9 12:00:46 2020] Out of memory: Kill process 29388 (java) score 209 or sacrifice child
    [Mo Mär 9 12:00:46 2020] Killed process 29388 (java) total-vm:45973548kB, anon-rss:2246996kB, file-rss:0kB, shmem-rss:0kB
    [Mo Mär 9 12:00:46 2020] oom_reaper: reaped process 29388 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
  • When using supervisord with our example config, the TermTagger instance is then restarted automatically.
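
The memory arithmetic and the dmesg check can be sketched in shell. Both function names are hypothetical, the per-instance values are passed in by hand, and the grep patterns are based only on the sample kernel output shown above, so they may need adjusting for other kernel versions.

```shell
#!/bin/sh
# Sum the maximum memory of all TermTagger instances, in MB.
# Arguments: one max-memory value in MB per instance (hypothetical inputs).
termtagger_budget_mb() {
    total=0
    for mb in "$@"; do
        total=$((total + mb))
    done
    echo "$total"
}

# Keep only kernel log lines that report an OOM kill of a java process.
# Reads from stdin, so it works with dmesg or with a saved log file.
filter_java_oom_kills() {
    grep -E 'Out of memory: Kill process|Killed process' | grep -F '(java)'
}

# Example from the text: three instances with 2.5 GB (2560 MB) each.
termtagger_budget_mb 2560 2560 2560   # prints 7680, i.e. 7.5 GB

# Usage on the host (reading the kernel log may require root):
#   dmesg | filter_java_oom_kills
```

Comparing the printed budget with the RAM that is actually free for the TermTaggers shows whether the configured limits leave room for the rest of the system.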

How to configure this setup

Edit your termTagger configuration, which you find in

...

Ensure that the following line exists in this file and is NOT commented out (this is not needed if the TermTaggers are set up with supervisord):

Code Block
SERVERS=("http://localhost:9001" "http://localhost:9002" "http://localhost:9003")
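
To see whether each configured instance answers at all, a probe loop along the following lines can help. It is a minimal sketch: the function names are hypothetical, and it assumes that any completed HTTP exchange on the base URL means "reachable". Since a TermTagger does not respond while it is tagging, a missing response does not necessarily mean the instance is down.

```shell
#!/bin/sh
# Probe one TermTagger base URL with curl.
# Assumption: any completed HTTP exchange counts as "reachable";
# the 5 second limit is an arbitrary choice for this sketch.
probe() {
    curl -s -o /dev/null --max-time 5 "$1"
}

# Print one status line per URL given as argument.
check_servers() {
    for url in "$@"; do
        if probe "$url"; then
            echo "$url reachable"
        else
            echo "$url no response (down, or busy tagging)"
        fi
    done
}

# Usage with the URLs from the SERVERS line:
#   check_servers "http://localhost:9001" "http://localhost:9002" "http://localhost:9003"
```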

...

Restart your termTagger if you run it as a service. With supervisord this would look like:

Code Block
supervisorctl restart termtagger:* 

A nightly cronjob (for example while translate5 is in maintenance for your daily backup) should restart the TermTagger instances to free memory.
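
Such a nightly restart could be wrapped in a small script, sketched here for a supervisord setup. The function name, the script path, the log file and the time 03:30 are assumptions for illustration; only the supervisorctl restart command itself is taken from this page.

```shell
#!/bin/sh
# Restart all TermTagger instances managed by supervisord and log the time.
# The group name termtagger:* comes from the restart command above;
# the quotes keep the shell from expanding the asterisk.
restart_termtaggers() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') restarting TermTagger instances"
    supervisorctl restart 'termtagger:*'
}

# Hypothetical crontab entry (time, script path and log file are assumptions):
#   30 3 * * * /usr/local/bin/restart-termtaggers.sh >> /var/log/termtagger-restart.log 2>&1
# When saved as that script, call the function at the end of the file:
#   restart_termtaggers
```

Pick a time inside the maintenance window so that no import is interrupted by the restart.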