...

The developer can define the order in which background processes run over the segments, e.g. first pre-translation, then termTagging, then segment analysis. In this case the first termTagging worker for a task starts only after the last pre-translation worker has finished.
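The ordering rule can be modeled as follows. This is a minimal, language-neutral sketch in Python, not translate5 code: the names `ORDER` and `may_start`, the state strings and the in-memory worker list are assumptions made purely for illustration.

```python
# Hypothetical model: each worker type lists the types that must finish first.
ORDER = {
    "pre-translation": [],
    "termTagging": ["pre-translation"],
    "segment analysis": ["termTagging"],
}


def may_start(worker_type, workers):
    """Return True if no worker of a prerequisite type is still unfinished.

    `workers` is a list of (type, state) tuples for one task, where state
    is "waiting", "running" or "done" (assumed state names).
    """
    blocking = set(ORDER[worker_type])
    return all(state == "done" for wtype, state in workers if wtype in blocking)


workers = [
    ("pre-translation", "done"),
    ("pre-translation", "running"),  # the last pre-translation worker
    ("termTagging", "waiting"),
]
print(may_start("termTagging", workers))  # False: a pre-translation worker still runs

workers[1] = ("pre-translation", "done")
print(may_start("termTagging", workers))  # True: all pre-translation workers finished
```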

...

The basic idea is that each process runs largely independently of the others (similar to a thread of its own). Such a process is called a "worker" in translate5. A worker can be called directly from a PHP class (it then does not run as a process of its own, but as part of the calling PHP class), or it can be queued in the DB table Zf_worker.

PauseImportWorker

The worker was designed to pause another worker until certain conditions are met. 

If you need to wait until something has happened or finished before a particular worker starts, do the following:

  - Create a new worker class extending MittagQI\Translate5\Import\PauseImportWorker and make it a dependency of the worker that you want to pause

  - Create a class implementing PauseWorkerProcessorInterface

  - Queue your newly created pause worker together with the worker that you want to pause

For more information about the implementation, take a look at PauseMatchAnalysisWorker.
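The general idea of a pause worker can be sketched as a polling loop. The following is a language-neutral Python model and does not reproduce the translate5 API: the class `PauseProcessor`, its method `should_wait`, the function `run_pause_worker` and the timeout and sleep parameters are all hypothetical. Whether the real PauseImportWorker polls or uses a time limit is an assumption here.

```python
import time


class PauseProcessor:
    """Hypothetical stand-in for a class implementing PauseWorkerProcessorInterface."""

    def __init__(self, checks_until_ready):
        self._remaining = checks_until_ready

    def should_wait(self):
        # Pretend the awaited condition is met after a fixed number of checks.
        self._remaining -= 1
        return self._remaining >= 0


def run_pause_worker(processor, max_wait_seconds=1.0, sleep_seconds=0.01):
    """Block until the processor stops asking to wait or the time limit is hit.

    Returns True if the condition was met, False on timeout (assumed behavior).
    """
    deadline = time.monotonic() + max_wait_seconds
    while processor.should_wait():
        if time.monotonic() >= deadline:
            return False
        time.sleep(sleep_seconds)
    return True


# The paused worker depends on the pause worker, so it only runs afterwards.
if run_pause_worker(PauseProcessor(checks_until_ready=3)):
    print("condition met, dependent worker may start")
```

In this model the dependency relation does the actual pausing: the dependent worker is not started while the pause worker is still running.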

The queue and load balancing

A worker is queued by calling the "queue" method of ZfExtended_Worker_Abstract. This method also kicks off the worker queue, i.e. it starts the first worker in the queue. Depending on what is defined for a child class of ZfExtended_Worker_Abstract, the queue calls one or multiple resources for this kind of worker. For example, editor_Plugins_TermTagger_Worker_Abstract distributes the termTagging load to as many different TermTagger REST servers as are defined in Zf_configuration for the termTagger.

Keep in mind that the translate5 worker system can, in some cases, start more workers than it actually needs to process a given number of segments. For example, assume:

  1. Config option runtimeOptions.plugins.SpellCheck.languagetool.url.import is set to ["http://localhost:8081/v2", "http://localhost:8082/v2"], i.e. two slots (LanguageTool REST servers) are available
  2. Config option runtimeOptions.worker.editor_Plugins_SpellCheck_Worker_Import.maxParallelWorkers is set to 2
  3. editor_Plugins_SpellCheck_Configuration::IMPORT_SEGMENTS_PER_CALL = 5
  4. Some task containing 10 segments

The above conditions mean that the system needs 2 workers to process the task, but in fact 3 workers will be queued, because:

  1. the first one is queued in editor_Plugins_SpellCheck_QualityProvider->addWorker() by calling $worker->queue() with no arguments
  2. the other two are queued in editor_Models_Import_Worker_ResourceAbstract->queue($parentId = 0, $state = null, $startNext = true), i.e. one worker per defined slot
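The counts in this example can be reproduced with a few lines of arithmetic. This is a simplified Python model, not translate5 code: the function names are hypothetical, and it assumes that the number of workers needed equals the number of segment chunks.

```python
import math

segments_in_task = 10
segments_per_call = 5  # value of IMPORT_SEGMENTS_PER_CALL in the example
slots = ["http://localhost:8081/v2", "http://localhost:8082/v2"]


def workers_needed(segments, per_call):
    # Assumption: one worker per chunk of `per_call` segments.
    return math.ceil(segments / per_call)


def workers_queued(slot_count):
    # One worker queued via addWorker(), plus one worker per slot.
    return 1 + slot_count


print(workers_needed(segments_in_task, segments_per_call))  # 2
print(workers_queued(len(slots)))                           # 3
```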

So there can be cases where no real parallelization is achieved, because the first worker can be assigned slot 1, the second slot 1 and the third slot 2. However, since the assignment happens randomly, this is not always the case.

The good news, which matters much more, is that this lack of parallelization in fact only affects small tasks: the maximum negative effect is that only the first chunk of segments is processed without real parallelization. The above notes are therefore mainly for developers trying to understand what is going on in the background, since developers may use small tasks for debugging and this behavior may raise questions.


Tip: A worker can queue another worker of the same type (or of another type) before exiting. This way a worker chain can be established that works its way through complex and long tasks.
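A worker chain of this kind can be illustrated with a small simulation. The sketch below is a Python model and not translate5 code: the function `run_chain`, the in-memory queue and the use of segment offsets are assumptions chosen only to show the pattern of a worker queuing its successor.

```python
from collections import deque


def run_chain(total_segments, segments_per_worker):
    """Process segments in chunks; each worker queues its successor before exiting."""
    queue = deque([0])  # each entry is the segment offset a worker starts at
    processed_chunks = []
    while queue:
        offset = queue.popleft()
        chunk_end = min(offset + segments_per_worker, total_segments)
        processed_chunks.append((offset, chunk_end))
        if chunk_end < total_segments:
            queue.append(chunk_end)  # queue the next worker of the same type
    return processed_chunks


print(run_chain(12, 5))  # [(0, 5), (5, 10), (10, 12)]
```

Each worker handles only a bounded chunk, so no single worker has to run for the whole duration of a long task.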

...