...

Multithreading

In 0.6.44, multithreading is implemented as follows:

  1. You can set the number of service threads with the --servicethreads 8 command-line argument. This is the number of threads that handle requests; in addition there are 2-3 proxygen service threads running, and for every import and reorganize a new thread is created.
  2. There are mutexes for shared resources, such as the filesystem and some shared files in RAM; these are not configurable.
  3. There are 3 configurable recursive timed mutexes, which can also be used as non-timed mutexes (no timeout; the thread waits until the mutex becomes free). To use them that way, set their timeout to 0.

    These mutexes are:
  4. requestTMMutex - a mutex around the whole requestTM function, which can either find a TM in the TM list or reserve a place for it there (the TM is opened later). It could probably be disabled to optimize the code; it was implemented as the first high-level mutex.
  5. tmListMutex - every operation on the TM list, such as searching, adding or deleting elements, is guarded by this mutex.
  6. tmMutex - most requests that have tm_name in the URL (except the status request) are blocking: they occupy the TM for the whole execution time (after the request data has been parsed and checked). The reason is the opentm2 code, which still has too many low-level chunks that make multithreading impossible.
  7. By default, import and reorganize threads (not the request handlers, which use regular mutexes, but the threads those handlers create, which keep running after you receive the response to your reorganize or import TMX request) use non-timed mutexes, so these threads wait until the TM is free. You can change that with the command-line argument

    UseTimedMutexesForReorganizeAndImport 1

  8. You can set default values for tmRequestLockDefaultTimeout and tmLockDefaultTimeout using command-line arguments with these names. The value is set in ms; the default is 0, which means the timeout is disabled. The setting applies to all requests without a body, and to requests with a body when no other value is provided. Import and reorganize threads use non-timed mutexes by default, but if that is changed with the command-line argument, the value from the corresponding request is used (if provided, or the default if not).
  9. The saveAllTms request can be used in 2 ways: with non-timed mutexes for TMs, or with timed mutexes, in which case a TM that times out is not saved and the request skips to the next TM. The response contains a message about saved and timed-out TMs.
  10. The shutdown request internally uses saveAllTms with hardcoded non-timed mutexes. It can still fail on tmListMutexTimeout when checking how many TMs are in import or reorganize status.
  11. The resources request uses tmListMutex with a timeout when listing TMs. In case of a timeout, instead of the list of TMs, "Failed to lock tm list: (timeout_error_msg)" is returned, but that is not treated as a request error.
  12. In case of a timeout failure, you get back (and into the logs) error code 506 and one of the following messages (where (msg) is a generated service message with the location of the failed lock and a backtrace):

    (msg); Failed to lock tm list:(msg) / (msg); Failed to lock requestTm:(msg) / (msg); Failed to lock tm:(msg)

    like this:
    {
      "ReturnValue": 506,
      "ErrorMsg": "Failed to acquire the lock(tmMutex:18) within the timeout(1ms)(location: requestTM:339); Failed to lock tm:(location: requestTM:342); "
    }
  13. You can see the default timeout values in initMsg. Every timeout value is in ms.
  14. For requests with a body, you can provide a value for each type of mutex as an integer with these names:

   {
       "tmMutexTimeout": 5000,
       "tmListMutexTimeout": 4000,
       "requestTMMutexTimeout": 15000,
...
}
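The precedence described in points 8 and 14 (a value from the request body overrides the command-line default, and 0 means untimed) can be sketched like this; `effectiveTimeoutMs` is a hypothetical helper, not a function from t5memory:

```cpp
#include <optional>

// Hypothetical sketch: resolve the effective timeout (in ms) for one mutex type.
// 0 means "untimed" - the guard waits until the mutex is free.
int effectiveTimeoutMs(std::optional<int> fromRequestBody, int commandLineDefault) {
    // a value in the request body overrides the command-line default
    return fromRequestBody.value_or(commandLineDefault);
}

// Usage:
//   server started with --tmLockDefaultTimeout 0 (timeouts disabled):
//     effectiveTimeoutMs(std::nullopt, 0)    -> 0    (untimed)
//     effectiveTimeoutMs(5000, 0)            -> 5000 ("tmMutexTimeout": 5000 in body)
//   server started with a 1000 ms default:
//     effectiveTimeoutMs(std::nullopt, 1000) -> 1000
```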


Mutexes and request handling details:


Can you explain more, why the tm list mutex is needed?
Should it not be enough to block a new request from accessing a TM that is already in use?


It is the mutex that blocks access to the TM list.
It's not quite a list; internally it's a map (a hash table or dictionary), so it has a key, which is the TM name, and a value, which is the TM data, and it is auto-sorting.
That also means, as with any non-fixed-size container, that I can't be sure about its memory location: during, for example, a search in one thread, it could be completely reallocated elsewhere in memory if another thread adds a new TM object to the list. So even read operations must block.
So yes, I'll explain the simpler mutexes first.
In 0.5 I made every request handler a class with the same abstract ancestor, which implements the same strategy pattern to execute any request, while each request implements its own methods as it needs.
Here is the code that runs every request; each request type implements its own parseJSON, checkData and execute methods.
TimedMutexGuard is a class I implemented to provide timed recursive mutexes that are flexible and usable as RAII, and tmLockTimeout is a timeout object of a class I implemented to hold timeout fields and a flag that some timeout has expired, in which case execution should be rolled back with an error code. In case of an error, that class also records a string with all function names and line numbers that requested that mutex, even when nested, to trace the location of the failed lock. If the timeout is set to 0, it is used as a non-timed mutex.
// pseudo-code, but close to the real code

int RequestData::run(){
  int res = 0;
  if(!res) res = parseJSON();
  if(!res) res = checkData();
  if(!res) res = requestTM();
  if(!res)
  {
    if(isLockingRequest())
    {
      TimedMutexGuard l(mem->tmMutex, tmLockTimeout, "tmMutex"); // mutex used as RAII, to release the lock automatically
      if(tmLockTimeout.failed()){
        return buildErrorReturn(506, tmLockTimeout.getErrMsg(), _rc_); // would return "Failed to lock tm:..."
      }
      res = execute();
    } // here the mutex is destroyed
    else
    { // request that doesn't require a lock, i.e. in general one without tmName in the URL, plus the status request
      res = execute();
    }
  }
  buildRet(res);
  // reset pointer
  if(mem != nullptr) mem.reset();

  return res;
}
So here you can see how tmLock is used:
it blocks the selected TM for the execution time only.
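A minimal sketch of the TimedMutexGuard idea, assuming std::recursive_timed_mutex underneath (the real class also records function names and line numbers for the backtrace, which is omitted here, and takes a timeout object rather than a plain flag):

```cpp
#include <chrono>
#include <mutex>

// Minimal sketch of a RAII guard over a recursive timed mutex.
// timeoutMs == 0 means "untimed": block until the mutex is free.
class TimedMutexGuard {
public:
    TimedMutexGuard(std::recursive_timed_mutex& m, int timeoutMs, bool& failedFlag)
        : mutex_(m), locked_(false) {
        if (timeoutMs == 0) {
            mutex_.lock();               // untimed mode: wait as long as needed
            locked_ = true;
        } else {
            // timed mode: give up after timeoutMs milliseconds
            locked_ = mutex_.try_lock_for(std::chrono::milliseconds(timeoutMs));
        }
        if (!locked_) failedFlag = true; // caller checks the flag and rolls back
    }
    ~TimedMutexGuard() {
        if (locked_) mutex_.unlock();    // release on scope exit (RAII)
    }
    bool locked() const { return locked_; }
private:
    std::recursive_timed_mutex& mutex_;
    bool locked_;
};
```

Because the underlying mutex is recursive, a function that holds the lock can call another function that takes the same guard without deadlocking, which is what allows the "safe" functions to call each other.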


Then there is the requestTM function, which, depending on the request type, requests a writePointer, readPointer or service pointer (some requests, like the list of TMs, don't have a TM in the URL; the status request is an exception).
Requesting a writePointer or readPointer checks the TM list and returns the TM if it is found;
otherwise, it inits a TM with the given name and adds it to tmList (it used to open the TM as well, but that is now separate).
The service pointer neither adds nor opens a TM that is not in the list, but it can return a TM that is in the list without blocking it (for the status request).
That requestTM function has its own mutex. It was implemented first, so only one TM can be requested at a time. Maybe with the tmList mutexes that I explain below we don't need that requestTM mutex, but requestTM can also try to free some space by closing TMs in the list, so at least that should probably be managed and synced.
So we will see whether it's needed.


And the tmList mutex:
every access to the TM list locks that mutex.
A few versions ago these were simple mutexes, but then they became recursive mutexes to simplify the code and make it safer.
I then had 2 versions of the functions: a safe one, with the mutex, and an unsafe one, used in functions that already hold the mutex at a higher level.
For example, I have this most primitive function:


bool TMManager::IsMemoryInList(strMemName, tmListTimeout)
{
  TimedMutexGuard l{mutex_access_tms, tmListTimeout, "tmListMutex"}; // lock the tm list
  if(tmListTimeout.failed()){
    returnError(".Failed to lock tm list:");
    return false;
  } // if the mutex fails in some nested function, tmListTimeout is marked as spoiled, so every other lock using that timeout fails after the first failure. The boolean function returns false, but the caller also needs to check whether the timeout was spoiled, to distinguish "tm not found in the list" from "timeout spoiled". All those checks are in place in the code for now.

  return TMManager::find(strMemName); // if the lock was successful - try to find the mem
}


So it's boolean, but making it return some custom data type would make it harder to use.
It's used everywhere, for example to check whether a memory is loaded. When we don't yet have a pointer, we have this function:

bool TMManager::IsMemoryFailedToLoad(strMemName, tmListTimeout){
  TimedMutexGuard l{mutex_access_tms, tmListTimeout, "tmListMutex"};
  bool res = false;
  if(tmListTimeout.failed())
  {
    tmListTimeout.addToErrMsg(".Failed to lock tm list:");
    return false;
  }

  if(IsMemoryInList(strMemName, tmListTimeout)
      && tms[strMemName]->isFailedToLoad())
  {
    res = true;
  }

  if(tmListTimeout.failed())
  { // if the timeout was spoiled, the error message is extended, so the output message contains a backtrace with function names and line numbers
    tmListTimeout.addToErrMsg(".Failed to lock tm list:");
    return false;
  }

  return res;
}
This function calls the blocking IsMemoryInList, but it also locks the same mutex itself, because it works with the TM list directly, and theoretically some other thread could change the TM list between these lines:

IsMemoryInList(strMemName, tmListTimeout)  // memory was checked to be present in the list
    && tms[strMemName]->isFailedToLoad()   // slim chance that the list was changed since the line above, in another thread

So boolean functions of this type return false in case of a timeout, and the call sites should check whether they returned false while the timeout has not expired. Once it has expired, every subsequent mutex lock fails, so even if that check is missing, the situation is still handled in the right way. Every addToErrMsg marks the timeout as expired and adds a comment, function name and line number, so the failure can be tracked.

The TM list is used not only when requesting a TM, but also, for example, for the resources request, to free some space, or to flush TMs during shutdown,
so access to the TM list should be managed and synced.
Those 2 classes, TimedMutexGuard and MutexTimeout (tmListTimeout is an object of that class), make the mutexes more time-consuming to implement because of these requirements:
They should provide RAII, be recursive and timed but also support a non-timed mode, collect info when a timeout fails, ensure that once one timeout fails every subsequent lock fails automatically, and collect, provide and log data about where the failure happened.
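The "spoiled timeout" behaviour described above could be sketched roughly like this; the real MutexTimeout class is more involved, and the `markFailed`/`addToErrMsg` signatures here are assumptions for illustration:

```cpp
#include <sstream>
#include <string>

// Sketch: once one lock attempt times out, the object is "spoiled" and every
// later lock using the same object fails immediately, while the error message
// accumulates a backtrace of function names and line numbers.
class MutexTimeout {
public:
    explicit MutexTimeout(int ms) : timeoutMs_(ms), failed_(false) {}
    int  timeoutMs() const { return timeoutMs_; }   // 0 = non-timed mode
    bool failed()    const { return failed_; }      // spoiled flag
    void addToErrMsg(const std::string& what, const char* func, int line) {
        failed_ = true;                             // spoil: later locks fail fast
        std::ostringstream os;
        os << what << "(location: " << func << ":" << line << "); ";
        errMsg_ += os.str();                        // append one backtrace frame
    }
    const std::string& getErrMsg() const { return errMsg_; }
private:
    int timeoutMs_;
    bool failed_;
    std::string errMsg_;
};
```

Each caller up the stack appends its own location before returning, which is how the final 506 error message ends up containing the whole chain of failed lock sites.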

Regarding
"and that requestTM function has its own mutex. It was implemented first, so only one TM can be requested at a time"
Do I understand it right that, as long as one TM is requested for opening (not for reading), another one cannot be opened at the same time? That does not make sense to me. Why? That would mean a request for a small TM has to wait until a large TM has been loaded into RAM.

The load call is outside the mutex_requestTM mutex, so it isn't blocked in the current version. Opening (loading) of the TM files happens outside the mutex_requestTM area.


So an active mutex on the TM list would still block every request then, including requests to other TMs, right? How long would it be blocked, for example, if I import a new TMX into one TM? The whole time the TMX is being imported? And the same question for update?
And what about read requests? I understood they also block the TM list, right? But I don't understand why.

No, it wouldn't be blocked the whole time for other TMs.
It is blocked only for the time needed to check whether the TM is in the list and, if not, to add it there. tmListMutex is used only to prevent rearranging the list while accessing its data.
During that, it calculates how much space is needed and whether enough space is free.
If there is not enough, it deletes TMs from the list, starting from the one that has not been used for the longest time.
Even if such a TM is blocked, that doesn't prevent its deletion from the list, because smart pointers are used there, so the TM is closed when the last pointer is freed.
But it could take longer if some other processes are using the TM list.
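The eviction behaviour described above (a TM can be erased from the list while still in use, because shared pointers keep it alive until the last holder releases it) can be sketched like this; `Tm` and the function name are illustrative, not the real t5memory types:

```cpp
#include <map>
#include <memory>
#include <string>

struct Tm {                 // stand-in for an open translation memory
    std::string name;
};

// Returns the TM's name read through a pointer that was taken
// before the TM was erased from the list.
std::string nameAfterEviction() {
    std::map<std::string, std::shared_ptr<Tm>> tms;  // the "tm list"
    tms["big.tm"] = std::make_shared<Tm>(Tm{"big.tm"});

    // a request thread grabs a pointer to the TM:
    std::shared_ptr<Tm> inUse = tms["big.tm"];

    // eviction (e.g. to free space) erases it from the list...
    tms.erase("big.tm");

    // ...but the object stays alive until the last shared_ptr is released,
    // so the request thread can still use it safely:
    return inUse->name;
}   // inUse goes out of scope here; this is where the TM would be closed
```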

Blocking the TM is necessary because of the big chunk of low-level code in opentm2 that has operated on raw pointers to memory (RAM) for a long time.
I implemented the read pointer and write pointer to be able, in the future, to have multiple read pointers and only a single write pointer at a time, but with the old opentm2 code we need to treat every request as a write request, because it can lead to memory reallocations that cause crashes.
For example, if you have 2 projects translating with the same TM assigned, they both send requests to the same TM, and that leads to crashes; that was actually one of the things I fixed in recent versions.
So the TM should stay blocked at least until some legacy code, like the lookup tables, is removed.


Tag replacement

Design concept of tag handling:

  1. Generating ids on import and update operations: all incoming tags are converted to those recognised by the t5memory processor, and tag ids are regenerated based on their current value.
  2. We preserve the order of tags in source and target.
  3. In fuzzy search we remap the order of tag ids. So <ph/> will not be replaced with <bpt/>, but <ph id="1"/> may be replaced with <ph id="3"/> because of query order.
  4. Content inside <bpt/> and <ept/> tags is collapsed. We consider it irrelevant for matches and don't store it, so all <bpt/> and <ept/> tags become self-closing tags.
  5. Content protection tags are handled additionally: they are handled in their own scope and compared by the r parameter.
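The id regeneration in point 1 could be sketched as follows, assuming ids are treated as integers and reassigned sequentially in order of first appearance (which also preserves the tag order of point 2); the function name is hypothetical:

```cpp
#include <map>
#include <vector>

// Sketch of id regeneration on import: each distinct incoming tag id is
// mapped, in order of first appearance, to a new sequential id, so paired
// tags (e.g. bpt/ept sharing an id) keep matching ids and the order of
// tags in the segment is preserved.
std::vector<int> regenerateIds(const std::vector<int>& incomingIds) {
    std::map<int, int> mapping;   // incoming id -> regenerated id
    std::vector<int> result;
    int next = 1;
    for (int id : incomingIds) {
        if (mapping.find(id) == mapping.end())
            mapping[id] = next++; // first time we see this id: assign next slot
        result.push_back(mapping[id]);
    }
    return result;
}

// Usage: a segment like <ph id="7"/><bpt id="3"/><ept id="3"/>
// yields regenerateIds({7, 3, 3}) -> {1, 2, 2}
```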

...