...

Endpoints overview

| #  | Name | Description | Is async? | Request | Example |
|----|------|-------------|-----------|---------|---------|
| 1  | Get the list of TMs | Returns JSON list of TMs |  | GET /%service%/ | /t5memory/ |
| 2  | Create TM | Creates a TM with the provided name |  | POST /%service%/ | /t5memory/ |
| 3  | Create/Import TM in internal format | Imports and unpacks a base64 encoded archive of .TMD, .TMI, .MEM files and renames it to the provided name |  | POST /%service%/ | /t5memory/ |
| 4  | Clone TM locally | Makes a clone of an existing TM |  | POST /%service%/%tm_name%/clone | /t5memory/my+TM/clone |
| 5  | Reorganize TM | Reorganizes the TM (replaces the TM with a new one and reimports the segments from the tmd file) | + (in 0.5.x and up) | GET /%service%/%tm_name%/reorganize | /t5memory/my+other_tm/reorganize |
| 6  | Delete TM | Deletes the .TMD and .TMI files |  | DELETE /%service%/%tm_name%/ | /t5memory/%tm_name%/ |
| 7  | Import TMX into TM | Imports the provided base64 encoded TMX file into the TM | + | POST /%service%/%tm_name%/import | /t5memory/%tm_name%/import |
| 8  | Export TMX from TM | Creates a TMX from the TM, encoded in base64 |  | GET /%service%/%tm_name%/ | /t5memory/%tm_name%/ |
| 9  | Export in internal format | Creates and exports an archive with the .TMD and .TMI files of the TM |  | GET /%service%/%tm_name%/ | /t5memory/%tm_name%/ |
| 10 | Status of TM | Returns the status/import status of the TM |  | GET /%service%/%tm_name%/status | /t5memory/%tm_name%/status |
| 11 | Fuzzy search | Returns entries/translations with small differences from the requested one |  | POST /%service%/%tm_name%/fuzzysearch | /t5memory/%tm_name%/fuzzysearch |
| 12 | Concordance search | Returns entries/translations that contain the requested segment |  | POST /%service%/%tm_name%/concordancesearch | /t5memory/%tm_name%/concordancesearch |
| 13 | Entry update | Updates an entry/translation |  | POST /%service%/%tm_name%/entry | /t5memory/%tm_name%/entry |
| 14 | Entry delete | Deletes an entry/translation |  | POST /%service%/%tm_name%/entrydelete | /t5memory/%tm_name%/entrydelete |
| 15 | Save all TMs | Flushes all file buffers (TMD, TMI files) to the filesystem |  | GET /%service%_service/savetms | /t5memory_service/savetms |
| 16 | Shutdown service | Flushes all file buffers to the filesystem and shuts down the service |  | GET /%service%_service/shutdown | /t5memory_service/shutdown |
| 17 | Test tag replacement call | For testing tag replacement |  | POST /%service%_service/tagreplacement | /t5memory_service/tagreplacement |
| 18 | Resources | Returns resources and service data |  | GET /%service%_service/resources | /t5memory_service/resources |
| 19 | Import TMX from local file (in the removing-lookuptable git branch) | Like import TMX, but uses a local file path instead of a base64 encoded file | + | POST /%service%/%tm_name%/importlocal | /t5memory/%tm_name%/importlocal |
| 20 | Mass deletion of entries (from v0.6.0) | Like reorganize, but skips the import of segments for which the provided filters, combined with logical AND, return true | + | POST /%service%/%tm_name%/entriesdelete | /t5memory/tm1/entriesdelete |
| 21 | New concordance search (from v0.6.0) | Extended concordance search that can search in different fields of the segment |  | POST /%service%/%tm_name%/search | /t5memory/tm1/search |
| 22 | Get segment by internal key | Extracts a segment by its location in the tmd file |  | POST /%service%/%tm_name%/getentry | /t5memory/tm1/getentry |
| 23 | NEW Import TMX | Imports a TMX in non-base64 format | + | POST /%service%/%tm_name%/importtmx | /t5memory/tm1/importtmx |
| 24 | NEW Import in internal format (tm) | Extracts the tm zip attached to the request (it should contain the tmd and tmi files) into the MEM folder |  | POST /%service%/%tm_name%/ ("multipart/form-data") | /t5memory/tm1/ ("multipart/form-data") |
| 25 | NEW Export TMX | Exports the TMX as a file; can export a selected number of segments starting from a selected position |  | GET (optionally with a body) /%service%/%tm_name%/download.tmx | /t5memory/tm1/download.tmx |
| 26 | NEW Export TM (internal format) | Exports the TM archive |  | GET /%service%/%tm_name%/download.tm | /t5memory/tm1/download.tm |
| 27 | Flush TM | If the TM is open, flushes it to disk (implemented in 0.6.33) |  | GET /%service%/%tm_name%/flush | /t5memory/tm1/flush |
| 28 | Flags | Returns all available commandline flags (implemented in 0.6.47). Do not call it too often, since the gflags documentation says this is slow; useful for collecting t5memory configuration data when debugging |  | GET /%service%_service/flags | /t5memory_service/flags |

Notes:
- In the "Is async?" column, + marks asynchronous endpoints.
- + in an example URL is a placeholder for whitespace in the TM name, so for /t5memory/my+TM/clone there should be 'my TM.TMD' and 'my TM.TMI' (and, in pre-0.5.x, also 'my TM.MEM') files on disk.
- The TM name IS case sensitive in the URL.



Available endpoints

List of TMs

Purpose: Returns JSON list of TMs
Request: GET /%service%/
Params: -

Returns the list of open TMs and then the list of available (excluding open) TMs in the app.

Code Block
languagejs
titleResponse
collapsetrue
Response example:
{
    "Open": [
        {
            "name": "mem2"
        }
    ],
    "Available on disk": [
        {
            "name": "mem_internal_format"
        },
        {
            "name": "mem1"
        },
        {
            "name": "newBtree3"
        },
        {
            "name": "newBtree3_cloned"
        }
    ]
}

"Open" - the TM is in RAM; "Available on disk" - the TM is not yet loaded from disk.


...

Create/Import TM in internal format

Purpose: Import and unpack a base64 encoded archive of .TMD, .TMI, .MEM (in pre-0.5.x versions) files and rename it to the provided name
Request: POST /%service%/
Params:

{ "name": "example_tm", "sourceLang": "bg-BG", "data": "base64EncodedArchive" }

Do not import TMs created in another version of t5memory. Starting from 0.5.x, the tmd and tmi files carry, in the file header, the t5memory version they were created in, and a difference in the middle version (the 5 in 0.5.x) or in the global version (the 0 in 0.5.x) is reported as a version mismatch. Instead, export a TMX from the old version, create a new empty TM in the new version, and import the TMX there.

This creates example_tm.TMD (data file) and example_tm.TMI (index file) in the MEM folder.
If "data" is provided, no "sourceLang" is required, and vice versa; the base64 data should be a base64 encoded .tm file (which is just an archive that contains the .tmd and .tmi files).
If there is no "data", a new TM is created; then "sourceLang" must be provided and must match a language in languages.xml.

Starting from 0.6.52, import in internal format supports multipart/form-data, so you can send both the file and a json_body part. In the json_body only the "name" attribute is required ("sourceLang" is ignored anyway).

Send it the same way as the streaming TMX import. The JSON body should be pretty-printed and placed in a part called json_body to be parsed correctly.
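For illustration, such a multipart request could be assembled with libcurl roughly as in the sketch below. The service URL, the port, and the file part name "file" are assumptions; only the json_body part name is fixed by the description above.

Code Block
languagecpp
titleMultipart import sketch (libcurl)
collapsetrue
#include <curl/curl.h>

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (!curl) return 1;

    // Hypothetical service URL following the endpoint pattern above.
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:4040/t5memory/example_tm/");

    curl_mime *mime = curl_mime_init(curl);

    // Part 1: pretty-printed JSON in a part named "json_body" (only "name" is required).
    curl_mimepart *json = curl_mime_addpart(mime);
    curl_mime_name(json, "json_body");
    curl_mime_data(json, "{\n  \"name\": \"example_tm\"\n}", CURL_ZERO_TERMINATED);

    // Part 2: the .tm archive (tmd + tmi) itself; the part name is an assumption.
    curl_mimepart *file = curl_mime_addpart(mime);
    curl_mime_name(file, "file");
    curl_mime_filedata(file, "example_tm.tm");

    curl_easy_setopt(curl, CURLOPT_MIMEPOST, mime);
    CURLcode rc = curl_easy_perform(curl);

    curl_mime_free(mime);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}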

Code Block
languagejs
titleResponse
collapsetrue
Request example:{ "name": "mem_internal_format", "data":"UEsDBBQACAgIAPmrhVQAAAAAAAAAAAAAAAAWAAQAT1RNXy1JRDE3NS0wXzJfNV9iLk1FTQEAAADtzqEKgDAQgOFTEHwNWZ5swrAO0SBys6wfWxFBDILv6uOI2WZQw33lr38GbvRIsm91baSiigzFEjuEb6XHEK\/myX0PXtXsyxS2OazwhLDWeVTaWgEFMMYYY\/9wAlBLBwhEWTaSXAAAAAAAAAAACAAAAAAAAFBLAwQUAAgICAD5q4VUAAAAAAAAAAAAAAAAFgAEAE9UTV8tSUQxNzUtMF8yXzVfYi5UTUQBAAAA7d3Pa5JxHMDxz+Ns09phDAYdPfaDyQqWRcYjS9nGpoYZhBeZMCISW2v2g5o6VkqQONk\/0KVzh4IoKAovnboUo1PHbuuwU8dSn8c9Pk2yTbc53y+R5\/P9fL7P1wf5Ps9zep5vIOy3iMiSiPLn0yPrQ7In+rStTQARi\/bV9chEyHcxGPIKAGDnPonl21SsHNmUYNgfHZ70nnKNDo9ET0dHozFn2L+Ll9uxZPzazPz1mYQAAAAAAAAAAAAAAAAAAAAAAAAAANDtBkXRoj5Zk7OqSFZ9q35Vn6khNa6W2wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdBKbKHK4Em1omT5DxV6J7FrmkKFypBKt9FczvYaKtr+2DLpiqPTWVayGiq2uYjFUpC7VI6aElN8F8JPn\/QEAAAAAAAAAAAAAAAAAAAAAAAAAAAAA2ANW7U0Ag9Iv60MnT4j8uLBZ\/X5+7dxn1ztX6Uy5AgAAAAAAAAAAAAAAAAAAgA6nL1qFjmc1rAO2IwNN9bL9u4ulVUeEfcQqQAfxSNtltshZaytB7jalZZ2a5KhFGT3Qr\/ztv1pkzAnP1v06+F7UxL22tRzSNf6aFq08MdoiY078\/znmkTZo5Qm2YdoOSLSyDdbaVUop\/Cj3cDm14I6\/uqf++nDUN1u4lS+k9MbKXL4QK72+775U+phOpp8sucdK728X5nK5hVT+weJqbTiHjMiNzWG1yNxWvI8rvxZ9cTfycj71NH1nsZgbf54uJlKryWy6GFlueBT6xHrzJRupDqkPXc9eyyduJmbLkf6\/mlYRDgQDPtO++3\/uYvsazANfYHx68vLEsSvOKedxqa\/hAGowD4Jh\/1X\/dH1X5sEBZpoH6E6\/AVBLBwj3gRyzjAIAAAAAAAAAAAEAAAAAAFBLAwQUAAgICAD5q4VUAAAAAAAAAAAAAAAAFgAEAE9UTV8tSUQxNzUtMF8yXzVfYi5UTUkBAAAA7d3PS9NhHMDxz\/Y1nbp0zfw2Vw6CEjooJkkFPs9DZZaFCiIRHRxKoJUIFXk06iB0kS5Fvw6dhDp28FDgOSqiIKQ\/ICQMhIIuYVnJt2f7eK2M2Ps1xp49b8Y+fP6ArXegJy4iV0RiPx6BNAXyT6ysrKhXlLZ49PwlkKP9hw\/19XcKAOD3PZX42+PDP0+JWN9AT765u3P33vbm1nxbvj0\/3DLQ0y3r5uClsZGhC2eGxgUAAAAAAAAAAAAAAAAAAAAAAAAAgFKXllh0ahQbLHeInDb3Xc6NWrF77Jibcr22zC2YY6bVLNoX5qp97Pa5SbPc8ci8sqHpd1k7a2+ZN+6eFQAAAAAAAAAAAAAAAAAAAAAAAAAAAAD4YxISk8bVUyq6eVa905dtqtxO3fBlqyqnkrW+ZFVZCGp8aVDl9ZeELxlVjhRNsEWVa+UffAlVuf78rC\/1eoK20JfNqnzt3OhLnSp1DZW+bFJl\/467vqRUuVxV5UutKts\/JX2pUWUyXvie9OopE5U7QWEHSfWZXdmPvlSr8i75xJcqVT7fPOdLpSqj5+t9Sahy8UBhOxWqLEph6nJVHhZNvUFPXbS3MlXyYWFvgSon3xf2FldlpGiCmCoPiiYQVbLR3or\/ZT0tS04AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMC6K4t+ZSAtOWkKQpOSeTfnZty0m3CDrsu1uNB9swv2pZ21IlN23J6w1uZsuV0y82bOzJhpM2EGTZdpMaERAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPjrUmteK0RypXifid5n1tyX6j7+9\/vvUEsHCGo104BhAgAAAAAAAAAAAQAAAAAAUEsBAgAAFAAICAgA912FVERZNpJcAAAAAAgAABYABAAAAAAAAAAAALSBAAAAAE9UTV8tSUQxNzUtMF8yXzVfYi5NRU0BAAAAUEsBAgAAFAAICAgA\/F2FVPeBHLOMAgAAAAABABYABAAAAAAAAAAAALSBrAAAAE9UTV8tSUQxNzUtMF8yXzVfYi5UTUQBAAAAUEsBAgAAFAAICAgA\/F2FVGo104BhAgAAAAABABYABAAAAAAAAAAAALSBiAMAAE9UTV8tSUQxNzUtMF8yXzVfYi5UTUkBAAAAUEsGBiwAAAAAAAAAHgMtAAAAAAAAAAAAAwAAAAAAAAADAAAAAAAAANgAAAAAAAAAOQYAAAAAAABQSwYHAAAAABEHAAAAAAAAAQAAAFBLBQYAAAAAAwADANgAAAA5BgAAAAA=" }
Response example:
{
  "name": "example_tm"
}

TM already exists:
{
  "ReturnValue": 65535,
  "ErrorMsg": ""
}



...

Opening and closing TM

In the first concept it was planned to implement routines to open and close a TM. While working on the concept we found some problems with this approach:

  • The first one is the realization: opening and closing a TM via REST would mean updating the TM resource and setting a state to open or closed. This is very awkward.
  • Since in translate5 multiple tasks can be used at the same time, multiple tasks try to access one TM. Closing TMs gets complicated if race conditions in TM usage are to be prevented.
  • Since OpenTM2 loads the whole TM into memory, OpenTM2 must control itself which TMs are loaded or not.

This leads to the following conclusion in implementation of opening and closing of TMs:

OpenTM2 has to load the requested TMs automatically on demand. OpenTM2 also has to close TMs that have not been used for some time. That means that OpenTM2 has to track the timestamp of each TM's last request.


Concept endpoints, not implemented

http://opentm2/translationmemory/[TM_Name]/openHandle

GET – Opens a memory for queries by OpenTM2

Note: This method is not required as memories are automatically opened when they are accessed for the first time.

http://opentm2/translationmemory/[TM_Name]/openHandle

DELETE – Closes a memory for queries by OpenTM2

Note: This method is not required as memories are closed automatically after they have not been used for some time.


For now we open a TM on any call that works with it. A TM stays open until shutdown, but we will not open more TMs than fit into the RAM limit set up in the config file.
If the limit would be exceeded, we close TMs, starting with the one unused for the longest time, until we fit into the limit, including the TM we are trying to open.
A TM's size is calculated basically as the sum of its .TMD and .TMI file sizes.
The RAM limit does not include the service's own RAM and temporary files.
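A minimal sketch of this eviction logic (illustrative names and bookkeeping, not the actual TMManager code):

Code Block
languagecpp
titleLRU eviction sketch
collapsetrue
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Illustrative stand-in for an open TM: its size is the sum of the
// .TMD and .TMI file sizes, as described above.
struct Tm {
    uint64_t tmdSize = 0;
    uint64_t tmiSize = 0;
    int64_t  lastUsed = 0;  // timestamp of the last request
    uint64_t size() const { return tmdSize + tmiSize; }
};

struct TmManager {
    std::map<std::string, std::shared_ptr<Tm>> tms;  // open TMs
    uint64_t ramLimit = 0;   // limit from the config file
    uint64_t ramUsed  = 0;   // sum of sizes of open TMs

    // Close least-recently-used TMs until `needed` extra bytes fit the limit.
    bool makeRoomFor(uint64_t needed) {
        while (ramUsed + needed > ramLimit && !tms.empty()) {
            auto victim = tms.begin();
            for (auto it = tms.begin(); it != tms.end(); ++it)
                if (it->second->lastUsed < victim->second->lastUsed)
                    victim = it;
            ramUsed -= victim->second->size();
            // Erasing drops the manager's reference; the TM is actually
            // destroyed once the last shared_ptr held by a running
            // request goes out of scope.
            tms.erase(victim);
        }
        return ramUsed + needed <= ramLimit;
    }
};

Closing a TM here only drops the manager's reference; as described in the smart pointer notes later in this document, the memory is actually freed once the last running request releases its pointer.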



Multithreading

In 0.6.44 multithreading is implemented the following way:

  1. You can set the number of service threads with the --servicethreads 8 commandline argument. This is the number of threads that handle requests; in addition there are 2-3 proxygen service threads running, and for every import and reorganize a new thread is created.
  2. There are mutexes for shared resources, like the filesystem and some shared files in RAM, which are not configurable.
  3. There are 3 configurable recursive timed mutexes, which can also be used as non-timed mutexes (meaning no timeout: the thread waits until the mutex becomes free). To use them that way, their timeouts need to be set to 0 (see the sketch after this list).

    These mutexes are:
  4. requestTMMutex - the mutex around the whole requestTM function, which either finds a TM in the TM list or reserves a place for the TM in that list (the TM is opened later). It could probably be disabled to optimize the code; it was implemented as the first high-level mutex.
  5. tmListMutex - every operation on the TM list, like searching, adding or deleting elements, is guarded by this mutex.
  6. tmMutex - most requests that have tm_name in the URL, except the status request, are blocking: they occupy the TM for the whole execution time (after the request data has been parsed and checked). The reason is the opentm2 code, which still has too many low-level chunks that make real multithreading impossible.
  7. By default, the import and reorganize threads (not the request handlers, which use regular timed mutexes, but the threads they spawn, which keep running after you receive the response to your reorganize or import TMX request) use non-timed mutexes, so these threads wait until the TM becomes free. You can change that with the commandline argument

    UseTimedMutexesForReorganizeAndImport 1

  8. You can set default values for tmRequestLockDefaultTimeout, tmLockDefaultTimeout and the analogous TM-list lock timeout using commandline arguments with these names. The value is set in ms; the default value is 0, which disables the timeout. The defaults apply to all requests without a body, and to requests with a body when no other value is provided. Import and reorganize threads use non-timed mutexes by default, but if that is changed with the commandline argument above, the value from the corresponding request is used (if provided; the default otherwise).
  9. The saveAllTms request can be used in 2 ways: with non-timed mutexes for the TMs, or with timed mutexes; in the latter case, on timeout the TM is not saved and the request skips to the next TM. The response contains a message about saved and timed-out TMs.
  10. The shutdown request internally uses saveAllTms with hardcoded non-timed mutexes. It can still fail on tmListMutexTimeout when checking how many TMs are in import or reorganize status.
  11. The resources request uses the tmListMutex with a timeout when listing TMs. In case of a timeout, instead of the list of TMs, "Failed to lock tm list: (timeout_error_msg)" is returned, but that is not treated as a request error.
  12. In case of a timeout failure, you get back (and into the logs) error code 506 and one of the following messages (where (msg) is a generated service message with the location of the failed lock and a backtrace):

    (msg); Failed to lock tm list:(msg) / (msg); Failed to lock requestTm:(msg) / (msg); Failed to lock tm:(msg)

    like this: {
    "ReturnValue":506,
    "ErrorMsg":"Failed to acquire the lock(tmMutex:18) within the timeout(1ms)(location: requestTM:339); Failed to lock tm:(location: requestTM:342); "
    }
  13. You can see the default timeout values in the initMsg. Every timeout value is in ms.
  14. For requests with a body, you can provide a value for each type of mutex as an integer with these names:

    {
        "tmMutexTimeout": 5000,
        "tmListMutexTimeout": 4000,
        "requestTMMutexTimeout": 15000,
        ...
    }
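The configurable mutexes above are taken through a guard object. Below is a minimal sketch of such a timed, recursive, RAII guard and its timeout object, built on std::recursive_timed_mutex; the names are illustrative and the real TimedMutexGuard/MutexTimeout classes described later differ in details.

Code Block
languagecpp
titleTimed recursive mutex guard sketch
collapsetrue
#include <chrono>
#include <mutex>
#include <string>

// Carries the timeout setting and remembers whether any lock attempt has
// already failed ("spoiled"), plus an error message with lock locations.
struct MutexTimeout {
    std::chrono::milliseconds timeout{0};  // 0 = non-timed mode: wait forever
    bool spoiled = false;
    std::string errMsg;

    bool failed() const { return spoiled; }
    void addToErrMsg(const std::string& where) {
        spoiled = true;
        errMsg += where;  // accumulate a backtrace-like location string
    }
};

class TimedMutexGuard {
    std::recursive_timed_mutex& m_;
    bool locked_ = false;
public:
    TimedMutexGuard(std::recursive_timed_mutex& m, MutexTimeout& t,
                    const std::string& name) : m_(m) {
        if (t.failed()) return;            // a previous lock already failed
        if (t.timeout.count() == 0) {      // non-timed mode
            m_.lock();
            locked_ = true;
        } else if (m_.try_lock_for(t.timeout)) {
            locked_ = true;
        } else {
            t.addToErrMsg("Failed to lock " + name + "; ");
        }
    }
    ~TimedMutexGuard() { if (locked_) m_.unlock(); }  // RAII unlock
};

With the timeout set to 0 the guard degrades to a plain blocking lock, which matches how the import and reorganize threads use the mutexes by default.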


Mutexes and request handling details:


Can you explain in more detail why the tm list mutex is needed?
Should it not be enough to block a new request from accessing a TM that is already in use?


It is the mutex that guards access to the TM list.
It's not quite a list: internally it's a map (a hash table or dictionary), so it has a key, which is the TM name, and a value, which is the TM data, and it is auto-sorting.
That also means, as with any non-fixed-size container, that I can't be sure about its memory location: during, for example, a search in one thread, it could be completely reallocated to another place in memory if some other thread adds a new TM object to the list. So even read operations have to be blocking.
So let me explain the simpler mutexes first.
In 0.5 I made every request handler a class with the same abstract ancestor, which implements the same strategy pattern to execute any request, while each request implements its own methods the way it needs them.
Here is the code that runs every request; each request type implements its own parseJSON, checkData and execute methods.
TimedMutexGuard is a class I implemented to have timed recursive mutexes that are flexible and usable as RAII, and tmLockTimeout is a timeout object: it carries the timeout fields and a flag that some timeout has run out, in which case execution is rolled back with an error code. In case of an error, that class also writes down a string with all the function names and line numbers that were waiting for that mutex, even if nested, to trace the location of the failed lock. If the timeout is set to 0, it is used as a non-timed mutex.
//pseudo-code, but it's close to the real thing
int RequestData::run(){
  if(!res) res = parseJSON();
  if(!res) res = checkData();
  if(!res) res = requestTM();
  if(!res)
  {
    if(isLockingRequest())
    {
      TimedMutexGuard l(mem->tmMutex, tmLockTimeout, "tmMutex"); // mutex used as RAII, so the lock is released automatically
      if(tmLockTimeout.failed()){
        return buildErrorReturn(506, tmLockTimeout.getErrMsg(), _rc_); // would return "Failed to lock tm:..."
      }
      res = execute();
    } // here the mutex guard is destroyed and the tm is unlocked
    else
    { // request that doesn't require a lock, so in general one without tmName in the URL, plus the status request
      res = execute();
    }
  }
  buildRet(res);
  // reset the smart pointer to the tm
  if(mem != nullptr) mem.reset();
  return res;
}
So here you can see how the tmLock is used:
it blocks the selected TM for the execution time only.


Then there is the requestTM function, which, depending on the request type, requests a writePointer, readPointer or servicePointer (some requests don't need one, like the list of TMs, which has no TM in the URL, or the status request, as an exception).
Requesting a writePointer or readPointer checks the TM list and returns the TM if it is found;
otherwise it inits a TM with the given name and adds it to the tmList (it used to open the TM as well, but that is separated now).
A servicePointer neither adds nor opens a TM that is not in the list, but it can return a TM that is in the list without blocking it (for the status request).
And that requestTM function has its own mutex. It was implemented first, so only one TM can be requested at a time. Maybe with the tmList mutexes explained below we don't need that requestTM mutex, but requestTM can also try to free some space by closing TMs in the list, so at least that should probably stay managed and synced.
So we will see if it's still needed.


And the tmListMutex:
every access to the TM list locks that mutex.
A few versions ago these were simple mutexes, but they became recursive mutexes to simplify the code and make it safer.
I then had 2 versions of some functions: a safe one, taking the mutex, and an unsafe one, used inside functions that already hold the mutex at a higher level.
For example, I had this most primitive function:


bool TMManager::IsMemoryInList(strMemName, tmListTimeout)
{
  TimedMutexGuard l{mutex_access_tms, tmListTimeout, "tmListMutex"}; // lock the tms list
  if(tmListTimeout.failed()){
    returnError(".Failed to lock tm list:");
    return false;
  }
  // If the list mutex fails in some nested function, tmListTimeout is marked as spoiled,
  // so every other lock using that timeout fails after the first failure. The boolean
  // function then returns false, so the caller also has to check whether the timeout was
  // spoiled, to tell apart "tm not found in the list" and "the timeout expired".
  // All those checks are in place in the code for now.
  return TMManager::find(strMemName); // if the lock was successful - try to find the mem
}


So it's boolean, but making it return some custom data type would make it harder to use.
It's used everywhere, for example to check whether a memory is loaded. When we don't have a pointer to it yet, we have this function:

bool TMManager::IsMemoryFailedToLoad(strMemName, tmListTimeout){
  TimedMutexGuard l{mutex_access_tms, tmListTimeout, "tmListMutex"};
  bool res = false;
  if(tmListTimeout.failed())
  {
    tmListTimeout.addToErrMsg(".Failed to lock tm list:");
    return false;
  }
  if(IsMemoryInList(strMemName, tmListTimeout)
      && tms[strMemName]->isFailedToLoad())
  {
    res = true;
  }
  if(tmListTimeout.failed())
  { // if the timeout was spoiled, the error message is extended here, so the final
    // message contains a backtrace with function names and line numbers in the file
    tmListTimeout.addToErrMsg(".Failed to lock tm list:");
    return false;
  }
  return res;
}
This function calls the locking IsMemoryInList, but it also locks the same mutex itself, because it works with the TM list directly too, and theoretically some other thread could change the TM list between these lines:

IsMemoryInList(strMemName, tmListTimeout)  // memory was checked to be present in the list
    && tms[strMemName]->isFailedToLoad()   // slight chance that the list was changed since the line above by another thread

So boolean functions of this type return false in case of a timeout, and the call sites should check whether they returned false while the timeout has not expired. Once a timeout has expired, every following mutex lock fails as well, so even where that check is missing, the situation is still handled in the right way. Every addToErrMsg marks the timeout as expired and adds a comment, the function name and the line number, so failures can be tracked.

And the TM list can be used not only when requesting a TM, but also, for example, for the resources request, to free some space, or to flush TMs during shutdown.
So access to the TM list has to be managed and synced.
Those 2 classes, TimedMutexGuard and MutexTimeout (tmListTimeout is an object of that class), make implementing these mutexes more time consuming because of the requirements:
they have to provide RAII, be recursive and timed but also support a non-timed mode, collect info when a timeout fails, ensure that once one timeout has failed all following locks fail automatically, and collect, provide and log the data about where it failed.

Regarding
"and that requestTM function has its own mutex. It was implemented first, so only one TM can be requested at a time"
Do I understand it right that, as long as one TM is requested for opening (not for reading), another one cannot be opened at the same time? That does not make sense to me. Why? That would mean a request for a small TM has to wait until a large TM has been loaded into RAM.

The load call is outside of the mutex_requestTM mutex, so it is not blocked in the current version. Opening (loading) the TM files happens outside of the
mutex_requestTM area.


So an active mutex on the TM list would still block every request then, also requests to other TMs, right? How long will it be blocked, for example, if I import a new TMX into one TM? The whole time the TMX is being imported? And the same question for update?
And what about read requests? I understood they also block the TM list, right? But why, I do not understand.

No, it wouldn't be blocked the whole time for other TMs.
It is blocked for the time needed to check whether the TM is in the list and, if not, to add it to the list. The tmListMutex is used only to prevent rearranging the list while accessing its data.
During that, it calculates how much space is needed and whether we have enough space free.
If not enough, it deletes TMs from the list, starting with the one that has not been used for the longest time.
If such a TM is blocked, that doesn't prevent its deletion from the list, because smart pointers are used there, so the TM is closed when the last pointer to it is freed.
But it can take longer if some other processes are using the TM list.

Blocking the TM itself is necessary because of the big chunk of low-level code that exists in opentm2 and operates on raw pointers into memory (RAM) for a long time.
I implemented the read pointer and the write pointer to be able, in the future, to have multiple read pointers and only a single write pointer at a time, but with the old opentm2 code we have to treat every request as a write request, because it can lead to memory reallocations that lead to crashes.
For example, if you have 2 projects translating with the same TM assigned, they both send requests to the same TM, and that used to lead to crashes; this was actually one of the things I fixed in recent versions.
So the TM has to stay blocked at least until some of the legacy code, like the lookup tables, is removed.






TM files structure and other related info

The info below applies to version 0_5_x.

Starting from version 0_5_0 the .mem file is excluded from the TM files; a TM now consists only of .tmd and .tmi files. These files have 2 KB headers containing some useful information, like the creation date and the t5memory version in which the file was created. In general, a change of the middle version number means binary incompatible files. During reorganize, a new empty TM is created, the segments are reimported from the previous one, then the old files are deleted and the new ones are renamed to replace them. That means reorganize also updates the files' creation t5memory version to the newest.


A .tm file is just an archive with the tmi and tmd files.

The tmd and tmi files should be flushed in a safe way: saved on disk under a temporary filename that then replaces the old files. (Still to be implemented; a possible approach is sketched below.)
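A minimal sketch of what such a safe flush could look like, assuming POSIX-style atomic rename semantics (the function and path names are illustrative):

Code Block
languagecpp
titleSafe flush sketch
collapsetrue
#include <cstdio>
#include <filesystem>
#include <fstream>
#include <string>

// Write the buffer to "<path>.tmp" first, then atomically replace the old
// file, so a crash mid-write never leaves a half-written .tmd/.tmi behind.
bool safeFlush(const std::string& path, const std::string& buffer) {
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out.write(buffer.data(), buffer.size())) return false;
        out.flush();
        if (!out) return false;
    } // stream closed here, data handed to the OS
    std::error_code ec;
    std::filesystem::rename(tmp, path, ec); // atomic replace on POSIX
    if (ec) { std::remove(tmp.c_str()); return false; }
    return true;
}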

There is a TMManager (a singleton) which holds the list of TMs; a TM instance has two binary trees (for the (tmd) data and the (tmi) index file), and each tree has its own file buffer instance (before, there used to be a pool of file buffers whose file operation functions, like write, read, close and open, handled the requests).

A request handler is an instance of a class from the request handler class hierarchy; for each type of request there is a class to handle it. In general it has the private functions parseJSON (parses the JSON, if provided, and returns an error if the JSON is invalid), checkData (checks that all required fields were provided), requestTM (requests the readOnly, write or service TM handler, loading the TM if it is not in RAM yet) and execute (the original request code). It also has the public function run, which is the strategy template that drives the listed private functions, as sketched below.
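A minimal sketch of that hierarchy, reduced to the shape described above (illustrative names; the real run() also handles locking and response building, as shown earlier):

Code Block
languagecpp
titleRequest handler hierarchy sketch
collapsetrue
// Abstract ancestor: run() is the fixed strategy; each concrete request
// type overrides the four steps as it needs.
class RequestData {
public:
    int run() {
        int res = parseJSON();              // step 1: parse the body, if any
        if (!res) res = checkData();        // step 2: validate required fields
        if (!res) res = requestTM();        // step 3: get a read/write/service handler
        if (!res) res = execute();          // step 4: the request-specific work
        return res;
    }
    virtual ~RequestData() = default;
protected:
    virtual int parseJSON() { return 0; }   // default: nothing to parse
    virtual int checkData() { return 0; }
    virtual int requestTM() { return 0; }
    virtual int execute() = 0;
};

// One concrete handler per endpoint, e.g. the fuzzy search request.
class FuzzySearchRequest : public RequestData {
protected:
    int parseJSON() override { /* parse source, segment, ... */ return 0; }
    int checkData() override { /* require the mandatory fields */ return 0; }
    int requestTM() override { /* ask TMManager for a read pointer */ return 0; }
    int execute()  override { /* run the fuzzy search */ return 0; }
};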

The TMs are stored in the TMManager using smart pointers (pointers that track references to themselves and call the destructor automatically). That means that on request it is possible to remove a TM from the list while it is still active in another thread (like in a fuzzy search); the RAM is then freed at the end of the last request handling that TM.
If in the middle of some request (like a fuzzy search) there is a call to delete the TM, we first clear it from the TM list (but the fuzzy request's thread keeps its smart pointer, so this does not call the destructor yet; that happens after the fuzzy request is done). The destructor then tries to flush the file buffers to the filesystem, but because the files are no longer on disk, the file buffers do not recreate them and just free the RAM (in that case a log line is written about the file buffer flush not finding the file in the folder). This lifetime behavior is sketched below.
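A minimal sketch of that lifetime behavior using std::shared_ptr (illustrative, not the actual TMManager code):

Code Block
languagecpp
titleSmart pointer TM lifetime sketch
collapsetrue
#include <map>
#include <memory>
#include <string>
#include <thread>

struct Tm {
    std::string name;
    ~Tm() { /* flush file buffers; if the files were deleted, just free the RAM */ }
};

std::map<std::string, std::shared_ptr<Tm>> tmList;

int main() {
    auto tm = std::make_shared<Tm>();
    tm->name = "mem1";
    tmList["mem1"] = tm;

    // A long-running request (e.g. fuzzy search) keeps its own reference.
    std::shared_ptr<Tm> inUse = tmList["mem1"];
    std::thread fuzzy([held = inUse] { /* ... search using held ... */ });

    // A delete request removes the TM from the list; the Tm object itself
    // survives because the fuzzy thread still holds a shared_ptr.
    tmList.erase("mem1");

    fuzzy.join();
    inUse.reset();  // last reference gone -> ~Tm() runs only now
    tm.reset();
}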

From the TMManager, a request can ask for one of 3 types of TM handlers: readonly, write or service. The readOnly/write names here take the inside-TM perspective (so operations on TM files in the filesystem are service requests).
A readOnly handler (concordance search, fuzzy search, exportTmx) is provided only if there are no write handlers; for a write handler (deleteEntry, updateEntry, importTmx) there must be no other write handlers and no readOnly handlers. Service handlers can mean different things for different requests. For example, the status request should be able to access something like a readonly handler, but it must not be blocked by write requests, since it is used for checking import/reorganize status and progress. For some filesystem requests (deleteTM, createTM, cloneTM, importTM, exportTM (internal format)) there should be a different blocking mechanism, since most of them do not even require loading the TM into RAM. The granting rules are sketched below.
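A minimal sketch of those granting rules (illustrative counters only; the real code also involves the timeout machinery and service semantics described above):

Code Block
languagecpp
titleHandler granting sketch
collapsetrue
// Reader/writer bookkeeping per TM, as described above: readers are allowed
// only when no writer is active, a writer requires exclusive access, and
// service access (e.g. the status request) is never blocked.
struct TmAccess {
    int readers = 0;
    int writers = 0;

    bool tryAcquireRead() {            // concordance/fuzzy search, exportTmx
        if (writers > 0) return false; // blocked by an active write handler
        ++readers;
        return true;
    }
    bool tryAcquireWrite() {           // deleteEntry, updateEntry, importTmx
        if (writers > 0 || readers > 0) return false;
        ++writers;
        return true;
    }
    // service access: no blocking, no counting
    void releaseRead()  { --readers; }
    void releaseWrite() { --writers; }
};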

If a TM is not in RAM, requesting a handler from the TMManager tries to load the TM into RAM, honoring the RAM limit explained in this document.

...