Flush TM | |
---|---|
Purpose | If the TM is open, flushes it to the disk (implemented in 0.6.33) |
Request | GET /%service%/%tm_name%/flush |
Params | - |
The endpoint is sync (blocking). If the TM is not found on the disk, it returns 404; if the TM is not open, it returns 400 with a message. It would not open the TM if it is not opened yet, but would return an error instead. Otherwise t5memory requests the write pointer to the TM (so it waits until other requests working with the TM have finished) and then flushes it to the disk. It could also return an error if flushing runs into an issue.

Response example (success):
{
    "newBtree3_cloned2": "deleted"
},

Response example (failed):
{
    "newBtree3_cloned2": "not found"
}
Import TMX into TM | |
---|---|
Purpose | Imports a provided base64 encoded TMX file into the TM. Starts another thread for the import. For checking the import status use the status call |
Request | POST /%service%/%tm_name%/import |
Headers | Accept - application/xml |
Params | {"tmxData": "base64EncodedTmxFile"} |
The TM must exist. The call is async, so check the import status using the status endpoint, as with reorganize in 0.5.x and up. Handling when the framing tag situation differs from source to target: for skipAll or skipPaired, if the framing tag situation is the same in source and target, both sides should be treated as described above. If framing tags only exist in the source, they should still be treated as described above. If they only exist in the target, nothing should be removed.
Overview and API introduction
This document describes the REST interface of the translate5 TM service.
The translate5 TM service is built on the OpenTM2 Translation Memory Engine.
It provides the following functionality:
- import new openTM2-TMs
- delete openTM2-TMs
- create new empty openTM2-TM
- import TMX
- open TM and close TM: not possible, see the extra section in this document. Maybe we need a trigger to flush the TM to the disk, but it could also be done in some specific cases...
- query TM for matches: one query per TM, no querying of multiple TMs at once.
- query TM for concordance search
- extract a segment by its location
- save a new entry to the TM
- delete an entry from the TM
- locally clone a TM
- reorganize a TM
- get some statistics about the service
- you can also use the tagreplacement endpoint to test the tag replacement mechanism
This is achieved by the following specification of a RESTful HTTP service. The specification is given in the following form:
- URL of the HTTP Resource, where servername and an optional path prefix is configurable.
- HTTP Method with affected functionality
- Brief Description
- Sent and returned Body.
Request Data Format:
The data transferred in requests is JSON and is sent directly in the request body. It should be pretty-printed JSON and end with a '\n}' sequence, because of a bug in proxygen that caused garbage after valid data.
URL Format:
In this document, the OpenTM2 is always assumed under http://opentm2/.
To rely on full networking features (proxying etc.) the URL is configurable in Translate5 so that the OpenTM2 instance can also reside under http://xyz/foo/bar/.
Errors
For each resource, the possible errors are listed below. In case of an error, the body should contain at least the following JSON; where it is meaningful, attributes of the original representation can be added.
{
errors: [{errorMsg: 'Given tmxData is no TMX.'}]
}
Values | |
---|---|
%service% | Name of the service (default - t5memory; could be changed in the t5memory.conf file) |
%tm_name% | Name of the Translation Memory |
Example | http://localhost:4040/t5memory/example_tm/fuzzysearch/? |
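The placeholders above can be assembled into request URLs programmatically. A minimal Python sketch; the helper name `tm_url` is hypothetical (not part of the service), and it encodes whitespace in TM names as '+' as described for the clone endpoint:

```python
from urllib.parse import quote

def tm_url(base, service, tm_name, endpoint=""):
    # Hypothetical client-side helper: whitespace in the TM name is sent
    # as '+' (see the clone endpoint notes), and the name is case
    # sensitive in the URL.
    name = quote(tm_name).replace("%20", "+")
    return f"{base}/{service}/{name}/{endpoint}"

print(tm_url("http://localhost:4040", "t5memory", "my TM", "clone"))
# http://localhost:4040/t5memory/my+TM/clone
```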
# | Endpoint | Description | Method | URL | Example | Is async? |
---|---|---|---|---|---|---|
1 | Get the list of TMs | Returns JSON list of TMs | GET | /%service%/ | /t5memory/ | |
2 | Create TM | Creates TM with the provided name | POST | /%service%/ | /t5memory/ | |
3 | Create/Import TM in internal format | Imports and unpacks a base64 encoded archive of .TMD, .TMI, .MEM files, then renames it to the provided name | POST | /%service%/ | /t5memory/ | |
4 | Clone TM locally | Makes a clone of an existing TM | POST | /%service%/%tm_name%/clone | /t5memory/my+TM/clone ('+' is the placeholder for whitespace in the TM name, so there should be 'my TM.TMD' and 'my TM.TMI' (and in pre-0.5.x also 'my TM.MEM') files on the disk); the TM name IS case sensitive in the URL | |
5 | Reorganize TM | Reorganizes the TM (replaces the TM with a new one and reimports segments from the TMD) | GET | /%service%/%tm_name%/reorganize | /t5memory/my+other_tm/reorganize | + in 0.5.x and up |
6 | Delete TM | Deletes the .TMD, .TMI files | DELETE | /%service%/%tm_name%/ | /t5memory/%tm_name%/ | |
7 | Import TMX into TM | Imports a provided base64 encoded TMX file into the TM | POST | /%service%/%tm_name%/import | /t5memory/%tm_name%/import | + |
8 | Export TMX from TM | Creates TMX from the TM, encoded in base64 | GET | /%service%/%tm_name%/ | /t5memory/%tm_name%/ | |
9 | Export in internal format | Creates and exports an archive with the .TMD, .TMI files of the TM | GET | /%service%/%tm_name%/ | /t5memory/%tm_name%/ | |
10 | Status of TM | Returns the status\import status of the TM | GET | /%service%/%tm_name%/status | /t5memory/%tm_name%/status | |
11 | Fuzzy search | Returns entries\translations with small differences from the requested one | POST | /%service%/%tm_name%/fuzzysearch | /t5memory/%tm_name%/fuzzysearch | |
12 | Concordance search | Returns entries\translations that contain the requested segment | POST | /%service%/%tm_name%/concordancesearch | /t5memory/%tm_name%/concordancesearch | |
13 | Entry update | Updates entry\translation | POST | /%service%/%tm_name%/entry | /t5memory/%tm_name%/entry | |
14 | Entry delete | Deletes entry\translation | POST | /%service%/%tm_name%/entrydelete | /t5memory/%tm_name%/entrydelete | |
15 | Save all TMs | Flushes all file buffers (TMD, TMI files) to the filesystem | GET | /%service%_service/savetms | /t5memory_service/savetms | |
16 | Shutdown service | Flushes all file buffers to the filesystem and shuts down the service | GET | /%service%_service/shutdown | /t5memory_service/shutdown | |
17 | Test tag replacement call | For testing tag replacement | POST | /%service%_service/tagreplacement | /t5memory_service/tagreplacement | |
18 | Resources | Returns resources and service data | GET | /%service%_service/resources | /t5memory_service/resources | |
19 | Import TMX from local file (in the removing-lookuptable git branch) | Similar to import TMX, but uses a local path to a file instead of a base64 encoded file | POST | /%service%/%tm_name%/importlocal | /t5memory/%tm_name%/importlocal | + |
20 | Mass deletion of entries (from v0.6.0) | Like reorganize, but skips the import of segments for which the provided filters, combined with logical AND, return true | POST | /%service%/%tm_name%/entriesdelete | /t5memory/tm1/entriesdelete | + |
21 | New concordance search (from v0.6.0) | Extended concordance search, where you can search in different fields of the segment | POST | /%service%/%tm_name%/search | /t5memory/tm1/search | |
22 | Get segment by internal key | Extracts a segment by its location in the TMD file | POST | /%service%/%tm_name%/getentry | /t5memory/tm1/getentry | |
23 | NEW Import TMX | Imports TMX in non-base64 format | POST | /%service%/%tm_name%/importtmx | /t5memory/tm1/importtmx | + |
24 | NEW import in internal format (tm) | Extracts the TM zip attached to the request (it should contain .TMD and .TMI files) into the MEM folder | POST | /%service%/%tm_name%/ | /t5memory/tm1/ ("multipart/form-data") | |
25 | NEW export TMX | Exports the TMX file as a file. Could be used to export a selected number of segments starting from a selected position | GET (could be with body) | /%service%/%tm_name%/download.tmx | /t5memory/tm1/download.tmx | |
26 | NEW export TM (internal format) | Exports the TM archive | GET | /%service%/%tm_name%/download.tm | /t5memory/tm1/download.tm | |
27 | Flush TM | If the TM is open, flushes it to the disk (implemented in 0.6.33) | GET | /%service%/%tm_name%/flush | /t5memory/tm1/flush | |
Available end points
List of TMs | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Returns JSON list of TMs | |||||||||
Request | GET /%service%/ | |||||||||
Params | - | |||||||||
Returns the list of open TMs, followed by the list of available (but not open) TMs in the app.
|
Create TM | |
---|---|
Purpose | Creates TM with the provided name (tmd and tmi files in the /MEM/ folder) |
Request | POST /%service%/%tm_name%/ |
Params | Required: name, sourceLang |
Create/Import TM in internal format | |
---|---|
Purpose | Imports and unpacks a base64 encoded archive of .TMD, .TMI, .MEM (in pre-0.5.x versions) files, then renames it to the provided name |
Request | POST /%service%/ |
Params | { "name": "example_tm", "sourceLang": "bg-BG", "data": "base64EncodedArchive" } Alternatively, the data could be provided in non-base64 binary format as a file attached to the request |
This creates example_tm.TMD (data file) and example_tm.TMI (index file) in the MEM folder. Do not import TMs created in another version of t5memory: starting from 0.5.x the tmd and tmi files carry, in the file header, the t5memory version they were created in, and a different middle version or global version is incompatible.
In 0.6.20 and up the data could be sent as an attachment instead of base64. The Content-Type should then be set to "multipart/form-data" and the JSON (with the name of the new TM) should be provided under the json_data key (the lookup is done this way: part.headers.at("Content-Disposition").find("name=\"json_data\"")). curl command example:
curl -X POST \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/12434615271d732fvd7te3.gz;filename=myfile.tg" \
  -F "json_data={\"name\": \"TM name\", \"sourceLang\": \"en-GB\"}" \
  http://t5memory:4045/t5memory
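The base64 variant of the request body can be assembled with the standard library. A minimal Python sketch; `build_import_body` is a hypothetical helper name, and the archive bytes here are a stand-in for real .TMD/.TMI archive contents:

```python
import base64
import json

def build_import_body(name, source_lang, archive_bytes):
    # Hypothetical helper for the base64 JSON variant of POST /%service%/;
    # archive_bytes would be the contents of the .tm archive.
    return json.dumps({
        "name": name,
        "sourceLang": source_lang,
        "data": base64.b64encode(archive_bytes).decode("ascii"),
    })

body = build_import_body("example_tm", "bg-BG", b"archive-bytes-here")
```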
Clone TM locally | |
---|---|
Purpose | Creates a clone of the TM under the provided name |
Request | POST /%service%/%tm_name%/clone |
Params | Required: name, sourceLang |
The endpoint is sync (blocking).
Delete TM | |
---|---|
Purpose | Deletes the .TMD, .TMI, .MEM files |
Request | DELETE /%service%/%tm_name%/ |
Params | - |
Import binary TMX file into TM | |
---|---|
Purpose | Imports the provided TMX file (attached to the request, non-base64) into the TM. Starts another thread for the import. For checking the import status use the status call |
Request | POST /%service%/%tm_name%/importtmx |
Params | The request has a file attached and, optionally, a body. Implemented in 0.6.19.
curl -X POST \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/12434615271d732fvd7te3.tmx;filename=myfile.tmx" \
  -F "json_data={\"framingTags\": \"value\", \"timeout\": 1500}" \
  http://t5memory:4045/t5memory/{memory_name}/importtmx
The body should be provided in the multipart form under the json_data key:
{
  ["framingTags": "saveAll"], // framing tags behaviour
  ["timeout": 100] // timeout in sec after which the import stops, even if it has not yet reached the end of the TMX
} |
The TM must exist. The TMX import could be interrupted by invalid XML, by the TM reaching its limit, or by the timeout. In all cases check the status request for info about the position in the TMX file where the import was interrupted. Handling when the framing tag situation differs from source to target: for skipAll or skipPaired, if the framing tag situation is the same in source and target, both sides should be treated as described above. If framing tags only exist in the source, they should still be treated as described above. If they only exist in the target, nothing should be removed.
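The optional json_data part can be built like this. A minimal Python sketch; `import_options` is a hypothetical client-side helper, and the framingTags values are the ones mentioned in this document:

```python
import json

def import_options(framing_tags=None, timeout=None):
    # Hypothetical builder for the optional json_data part of the
    # multipart importtmx request. Both fields are optional and are
    # omitted when unset. framingTags values seen in this document:
    # "saveAll", "skipAll", "skipPaired"; timeout is in seconds.
    opts = {}
    if framing_tags is not None:
        opts["framingTags"] = framing_tags
    if timeout is not None:
        opts["timeout"] = timeout
    return json.dumps(opts)
```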
Reorganize TM | |
---|---|
Purpose | Reorganizes the TM and fixes issues |
Request | GET /%service%/%tm_name%/reorganize |
Headers | Accept - application/xml |
Up to v0.4.x reorganize is sync. Reorganize checks each segment and, if the check passes, passes the segment to the putProposal function, which is also used by the UpdateRequest and ImportTmx requests.
Export TMX from TM - old | |
---|---|
Purpose | Creates TMX from the TM |
Request | GET /%service%/%tm_name%/ |
Headers | Accept - application/xml |
Export TMX from TM | |
---|---|
Purpose | Exports TMX from the TM |
Request | GET /%service%/%tm_name%/download.tmx |
Headers | Accept - application/xml |
curl | curl --location --request GET 'http://localhost:4040/t5memory/{MEMORY_NAME}/download.tmx' \ --header 'Accept: application/xml' \ --header 'Content-Type: application/json' \ --data '{"startFromInternalKey": "7:1", "limit": 20}' |
The request could have a body with these fields:
startFromInternalKey - sets the starting point for the export, in "recordKey:targetKey" format
loggingThreshold - as in other requests
In the response headers you get NextInternalKey: 19:1 - the next item in the memory if one exists, otherwise the same key you sent. So you can repeat the call with the new starting position. If no body is provided, the export runs from the beginning (key 7:1) to the end. This endpoint should flush the TM before execution.
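The NextInternalKey header drives a simple paging loop. A minimal Python sketch of the termination logic described above; `next_start_key` is a hypothetical helper name:

```python
def next_start_key(sent_key, next_key_header):
    # Hypothetical paging helper: per the description above, the
    # NextInternalKey response header equals the key you sent when the
    # export is exhausted; otherwise it is the next position to request.
    if next_key_header == sent_key:
        return None  # nothing left to export
    return next_key_header

assert next_start_key("7:1", "19:1") == "19:1"  # more data: continue from 19:1
assert next_start_key("19:1", "19:1") is None   # export finished
```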
Export in internal format | |
---|---|
Purpose | Creates and exports an archive with the .TMD, .TMI files of the TM |
Request | GET /%service%/%tm_name%/download.tm |
Headers | application/zip |
Returns an archive (.tm file) consisting of the .tmd and .tmi files.
Export in internal format - OLD | |
---|---|
Purpose | Creates and exports an archive with the .TMD, .TMI, .MEM files of the TM |
Request | GET /%service%/%tm_name%/ |
Headers | application/zip |
Returns an archive (.tm file) consisting of the .tmd and .tmi files.
Get the status of TM | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Request | GET /%service%/%tm_name%/status | |||||||||
Params | - | |||||||||
Returns the status of the TM. It could be 'not found'; 'available' if the TM is on the disk but not loaded into RAM yet; or 'open' with additional info. If there was at least one attempt to import a TMX or reorganize the TM since it was loaded into RAM, additional fields appear and stay in the statistics until the memory is unloaded.
|
Fuzzy search | |
---|---|
Purpose | Returns entries\translations with small differences from the requested one |
Request | POST /%service%/%tm_name%/fuzzysearch |
Params | Required: source, sourceLang, targetLang. iNumOfProposal - limit of found proposals - max is 20; if 0 → use default value '5' |
New Concordance search | |
---|---|
Purpose | Returns entries\translations that fit the selected filters |
Request | POST /%service%/%tm_name%/search |
Params | Required: NONE. numResults - limit of found proposals - max is 200; if 0 → use default value '5' |
The search is made segment by segment, checking whether each segment fits the selected filters. You can search for EXACT or CONCORDANCE matches in the source, target, document, author, addInfo and context fields.
It's possible to apply a filter with just a SearchMode: if you set "authorSearchMode": "exact" but provide no "author" field, the search looks for segments where the author field is empty.
"timestampSpanStart": "20000121T115234Z", "timestampSpanEnd": ... - you should set both parameters to apply the timestamp filter, otherwise you get an error in return. Check the output to see how it was parsed and applied.
"logicalOr": 1 - combines the filters with logical OR instead of AND.
Instead of returning segments, the search can just count them and return the counter as "NumOfFoundSegments": 22741.
"sourceLang": "en-GB", "targetLang": "SV" - language filters could be applied with the major-language feature: a source language with region is applied as an exact filter for the source language, while a major-only target language checks whether the languages are in the same language group. That check is done against the languages.xml file using the isPreferred flag. The applied options are echoed back, e.g. "GlobalSearchOptions": "SEARCH_FILTERS_LOGICAL_OR|SEARCH_EXACT_MATCH_OF_SRC_LANG_OPT, lang = en-GB|SEARCH_GROUP_MATCH_OF_TRG_LANG_OPT, lang = de".
"searchPosition": "8:1" - the position where to start the search internally in the btree. The search is limited by the number of found segments (set by numResults) or by a timeout (set by msSearchAfterNumResults), but the timeout is ignored if no segments in the TM fit the parameters. Max numResults is 200. The next search position is returned in the response.
Here is a search request with all possible parameters:
{
  "source": "the",
  "sourceSearchMode": "CONTAINS, CASEINSENSETIVE, WHITESPACETOLERANT, INVERTED",
  "target": "",
  "targetSearchMode": "EXACT, CASEINSENSETIVE",
  "document": "evo3_p1137_reports_translation_properties_de_fr_20220720_094902",
  "documentSearchMode": "CONTAINS, INVERTED",
  "author": "some author",
  "timestampSpanStart": "20000121T115234Z",
  "timestampSpanEnd": "20240121T115234Z",
  "addInfo": "some add info",
  "addInfoSearchMode": "CONCORDANCE, WHITESPACETOLERANT",
  "context": "context context",
  "contextSearchMode": "EXACT",
  "sourceLang": "en-GB",
  "targetLang": "SV",
  "searchPosition": "8:1",
  "numResults": 2,
  "msSearchAfterNumResults": 25
}
A request with a subset of this body would also work.
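A client can validate a search body before sending it. A minimal Python sketch; the field names are taken from the example request above, while the builder itself (`build_search_request`) is a hypothetical helper:

```python
import json

# Field names taken from the example /search request in this document.
ALLOWED_SEARCH_FIELDS = {
    "source", "sourceSearchMode", "target", "targetSearchMode",
    "document", "documentSearchMode", "author", "authorSearchMode",
    "timestampSpanStart", "timestampSpanEnd",
    "addInfo", "addInfoSearchMode", "context", "contextSearchMode",
    "sourceLang", "targetLang", "logicalOr",
    "searchPosition", "numResults", "msSearchAfterNumResults",
}

def build_search_request(**filters):
    # Hypothetical client-side builder: rejects unknown field names and
    # enforces that the timestamp span is set as a pair, as described above.
    unknown = set(filters) - ALLOWED_SEARCH_FIELDS
    if unknown:
        raise ValueError(f"unknown search fields: {sorted(unknown)}")
    if ("timestampSpanStart" in filters) != ("timestampSpanEnd" in filters):
        raise ValueError("set both timestampSpanStart and timestampSpanEnd")
    return json.dumps(filters)
```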
Concordance search | |
---|---|
Purpose | Returns entries\translations that contain the requested segment |
Request | POST /%service%/%tm_name%/concordancesearch |
Params | Required: searchString - what we are looking for; searchType ["Source"|"Target"|"SourceAndTarget"] - where to look. iNumOfProposal - limit of found proposals - max is 20; if 0 → use default value '5' |
Get entry | |
---|---|
Purpose | Returns the entry located at [recordKey:targetKey], or an error if it's empty |
Request | POST /%service%/%tm_name%/getentry |
Params | Required: recordKey - the position in the tmd file, starting from 7 (the first 6 are service records); targetKey - the position in the record, starting from 1. Implemented in 0.6.24 |
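The internal key format can be parsed on the client side. A minimal Python sketch; `parse_internal_key` is a hypothetical helper that encodes the constraints just described:

```python
def parse_internal_key(key):
    # Internal keys use the "recordKey:targetKey" form described above,
    # e.g. "7:1": data records start at 7 (records 1-6 are service
    # records) and target positions start at 1.
    record_str, target_str = key.split(":")
    record, target = int(record_str), int(target_str)
    if record < 7 or target < 1:
        raise ValueError(f"not a valid segment location: {key}")
    return record, target

assert parse_internal_key("7:1") == (7, 1)  # first segment
```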
Update entry | |||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Purpose | Updates entry\translation | ||||||||||||||||||
Request | POST /%service%/%tm_name%/entry | ||||||||||||||||||
Params | Only sourceLang, targetLang, source and target are required | ||||||||||||||||||
This request makes changes only in the file buffer (files on disk are not changed).
|
Delete entry | |
---|---|
Purpose | Deletes entry\translation |
Request | POST /%service%/%tm_name%/entrydelete |
Params | 2 ways - by id or regular.
1) By id - 3 integers should be provided: recordKey, targetKey and segmentId. After deletion the tmd is rearranged, which is why segmentId is needed - it's a pseudo-unique key. It's generated during TMX import, or when inserting a segment without providing an id; but if an id is provided in the update call, or during reorganize when the segment's id is not 0, that id is used instead of generating a new one. If the id does not match, t5memory does not delete the segment. If the keys are provided, other provided fields are ignored. All 3 keys should be provided in the request to delete a segment. If the segment is deleted, its fields are returned in the response. Note: recordKey and targetKey together form the internal key, in [recordKey:targetKey] format (like 7:1 - first segment).
2) Regular - old. Only sourceLang, targetLang, source, and target are required. Deletion is based on a strict match (including tags and whitespaces) of target and source. |
This request makes changes only in the file buffer (files on disk are not changed).
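The by-id variant of the body can be sketched as follows. The field names (recordKey, targetKey, segmentId) come from this section; the helper name itself is hypothetical:

```python
import json

def delete_by_id_body(record_key, target_key, segment_id):
    # Hypothetical helper for the by-id mode of /entrydelete: all three
    # integers are required; any other fields would be ignored in this mode.
    return json.dumps({
        "recordKey": record_key,
        "targetKey": target_key,
        "segmentId": segment_id,
    })
```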
Get the status of TM
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
Response example:
{//just opened tm, without import\reorganize called
"status": "open",
"lastAccessTime": "",
"creationTime": "20230703T122212Z",
"tmCreatedInT5M_version": "0:5:1"
}
{// after reorganize was called
"status": "open",
"reorganizeStatus": "available",
"reorganizeTime": 100,
"reorganizeTime": "Overall reorganize time is : 0:00:02\n",
"segmentsReorganized": 1112,
"invalidSegments": 10,
"invalidSegmentsRCs": "5005:10; ",
"firstInvalidSegments": "123; 432; 554; 623; 659; 675; 741; 742; 753; 755; ",
"invalidSymbolErrors": -1,
"reorganizeErrorMsg": "",
"lastAccessTime": "",
"creationTime": "20230810T095233Z",
"tmCreatedInT5M_version": "0:5:10"
} {//not opened but available on the disk
"status": "available"
}
{//not found tm {
"status": "not found",
"res": 48 // 48- both tmi and tmd files are no found, 16- only TMD file not found, 32 - only TMI file not found
}
The tmxImportStatus could be "available", "import", or "failed" if the import had errors. If there was at least one import into that TM, new fields appear:
{//tm in process of import
"status": "open",
"tmxImportStatus": "import",
"importProgress" : 56,
"importTime": "00:00:13",
"segmentsImported": 1356,
"invalidSegments": 23,
"invalidSymbolErrors": 2,
"importErrorMsg": "",
"lastAccessTime": "%lastAccessTime",
"creationTime": "20230703T122212Z",
"tmCreatedInT5M_version": "0:5:1"
}// in case an internal error happened, like error 5034 or 5035, which indicates that the TM reached its size limit (so you should create a new one to store new segments, or the part left over from the TMX you tried to import), the status would look like this
{
"status": "open",
"tmxImportStatus": "failed",
"importProgress": 100,
"importTime": "Overall import time is : 0:00:19\n",
"segmentsImported": 445,
"invalidSegments": 1,
"invalidSymbolErrors": 0,
"importErrorMsg": "Warning: encoding 'UTF-16' from XML declaration or manually set contradicts the auto-sensed encoding; ignoring at column 40 in line 1; \n Fatal internal Error at column 6 in line 9605, import stopped at progress = 0%, errorMsg: TM is reached it's size limit, please create another one and import segments there, rc = 5034; aciveSegment = 1834\n\nSegment 1834 not imported\r\n\nReason = \nDocument = none\nSourceLanguage = de-DE\nTargetLanguage = en-GB\nMarkup = OTMXUXLF\nSource = in Verbindung mit Befestigungswinkel MS-...-WPE-B zur Wandmontage eines Einzelgeräts\nTarget = In combination with mounting bracket MS-...-WPE-B for wall mounting an individual component ",
"lastAccessTime": "",
"ErrorMsg": " Fatal internal Error at column 6 in line 9605, import stopped at progress = 0%, errorMsg: TM is reached it's size limit, please create another one and import segments there, rc = 5034; aciveSegment = 1834\n\nSegment 1834 not imported\r\n\nReason = \nDoc"
}
So you would have info about the last segment that interrupted the TM import.
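A client polling the status endpoint only needs to watch the tmxImportStatus field. A minimal Python sketch of that interpretation; `import_finished` is a hypothetical helper:

```python
def import_finished(status):
    # Hypothetical polling helper: per the examples above, the import is
    # still running only while tmxImportStatus is "import"; the field is
    # absent when no import was started since the TM was loaded.
    return status.get("tmxImportStatus") != "import"

assert import_finished({"status": "open"})
assert not import_finished({"status": "open", "tmxImportStatus": "import", "importProgress": 56})
assert import_finished({"status": "open", "tmxImportStatus": "failed"})
```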
Fuzzy search

Request example:
{ // required fields
"sourceLang":"en-GB", // langs would be checked with languages.xml
"targetLang":"de",
"source":"For > 100 setups.",
// optional fields
["documentName":"OBJ_DCL-0000000845-004_pt-br.xml"],
["segmentNumber":15],
["markupTable":"OTMXUXLF"], //if there is no markup, default OTMXUXLF would be used.
//Markup tables should be located inside ~/.t5memory/TABLE/%markup$.TBL
["context":"395_408"],
["numOfProposals":20], // num of expected segments in output. By default it's 5
["loggingThreshold": 0]
}
Response example:
Success:
{
"ReturnValue": 0,
"ErrorMsg": "",
"NumOfFoundProposals": 1,
"results": [
{
"source": "For > 100 setups.",
"target": "Für > 100 Aufstellungen.",
"segmentNumber": 10906825,
"id": "",
"documentName": "none",
"documentShortName": "NONE",
"sourceLang": "en-GB",
"targetLang": "de-DE",
"type": "Manual",
"matchType": "Exact", // could be exact or fuzzy
"author": "",
"timestamp": "20190401T084052Z",
"matchRate": 100,
"fuzzyWords": -1, // for exact match it would be -1 here and in diffs
"fuzzyDiffs": -1, // otherwise here would be amount of parsed words and diffs that was
// used in fuzzy matchrate calculation
"markupTable": "OTMXML",
"context": "",
"additionalInfo": ""
}
]
}
Not found:
{
"ReturnValue": 133,
"ErrorMsg": "OtmMemoryServiceWorker::concordanceSearch::"
}
For exact matches, a function is used that compares strings while ignoring whitespace. It first compares the normalized strings (without tags).
If they are the same, t5memory then checks the strings with tags and can return a 100% or 97% match rate, depending on the result.
Then it checks the context match rate and whether the document name is the same (case-insensitive).
Then it checks and modifies exactMatchRate according to the code in the code block below.
After that, only exact matches with usMatchLevel >= 100 are stored. If there are no exact matches, the fuzzy match calculation begins.
If there is at least one exact match, all fuzzy matches are skipped.
If there is exactly one exact match, its rate is set to 102.
For equal matches with 100% word match but different whitespaces/newlines, each whitespace/newline difference counts as -1%. For punctuation (at least in 0.4.50), each punctuation mark counts as a word token; this will be changed in the future to count punctuation as whitespace.
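The rules above can be sketched as a tiny helper; `exactRate` is a hypothetical name, and the 97/100 base values come from the with-tags comparison described above:

```cpp
#include <algorithm>

// Hypothetical sketch of the exact-match rate rules above:
// start from 100 (strings with tags equal) or 97 (they differ),
// then subtract 1% per whitespace/newline difference.
int exactRate(bool tagsEqual, int wsDiffs) {
    int rate = tagsEqual ? 100 : 97;
    return std::max(0, rate - wsDiffs);
}
```

For example, a match with identical tags but two newline differences would come out at 98%.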
For the fuzzy calculation, tags are removed from the text, except t5:np tags, which are replaced with their "r" attribute so that each counts as one word.
For the fuzzy rate calculation, words and diffs are counted in the normalized string (without tags), using this formula:
if (usDiff < usWords )
{
*pusFuzzy = (usWords != 0) ? ((usWords - usDiff)*100 / usWords) : 100;
}
else
{
*pusFuzzy = 0;
} /* endif */ Regarging Number Protection feature, tags from number protection would be replaced with their regexHashes from their attributes, so they would be count as 1 word each. NP with the same regex would be counted as equal
To count diffs, t5memory goes through both segments looking for matching tokens, to find a "snake": a line of matching tokens.
It then marks the unmatched tokens as INSERTED or DELETED, and calculates the diffs based on that.
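The snake search can be illustrated with a plain longest-common-subsequence computation; the real token matcher may differ, so this is only a sketch with illustrative names:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Length of the "snake": the longest run of matching tokens (LCS).
size_t snakeLen(const std::vector<std::string>& a, const std::vector<std::string>& b) {
    std::vector<std::vector<size_t>> d(a.size() + 1, std::vector<size_t>(b.size() + 1, 0));
    for (size_t i = 1; i <= a.size(); ++i)
        for (size_t j = 1; j <= b.size(); ++j)
            d[i][j] = (a[i - 1] == b[j - 1]) ? d[i - 1][j - 1] + 1
                                             : std::max(d[i - 1][j], d[i][j - 1]);
    return d[a.size()][b.size()];
}

// Tokens outside the snake are the DELETED (left side) and
// INSERTED (right side) ones; their sum is the diff count.
size_t countDiffs(const std::vector<std::string>& a, const std::vector<std::string>& b) {
    size_t snake = snakeLen(a, b);
    return (a.size() - snake) + (b.size() - snake);
}
```

For the segments "For 100 setups" vs "For 200 setups" the snake is {For, setups}, giving one deleted and one inserted token, i.e. 2 diffs.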
If the rate is 100%, the tags are added back and the strings are compared again.
If they are then not equal, the match rate is adjusted as follows. This probably never happens, because the exact match test runs before the fuzzy one,
and the exact test runs even if the triples hashes differ (a pre-fuzzy calculation; if they are equal, that can be a flag triggering the exact test):
if ( !fStringEqual )
{
if ( usFuzzy > 3 )
{
usFuzzy -= 3;
}
else
{
usFuzzy = 0;
} /* endif */
usFuzzy = std::min( (USHORT)99, usFuzzy );
} /* endif */
Then, depending on the type of translation, the rate may be tweaked:
if ( (usModifiedTranslationFlag == TRANSLFLAG_MACHINE) && (usFuzzy < 100) )
{
// ignore machine fuzzy matches
}
else if ( usFuzzy > TM_FUZZINESS_THRESHOLD )
{
/********************************************************/
/* give MT flag a little less fuzziness */
/********************************************************/
if ( usModifiedTranslationFlag == TRANSLFLAG_MACHINE )
{
if ( usFuzzy > 1 )
{
usFuzzy -= 1;
}
else
{
usFuzzy = 0;
} /* endif */
} /* endif */
if (usFuzzy == 100 && (pGetIn->ulParm & GET_RESPECTCRLF) && !fRespectCRLFStringEqual )
{ // P018279!
usFuzzy -= 1;
}
add to resulting set
} /* endif */
} /* endif */
At the end, the fuzzy request replaces the tags in the proposal from the TM with the tags from the request, and if matchRate >= 100, it calculates the whitespace diffs and applies matchRate -= wsDiffs |
ExactMatchRate calculation: before this code runs, usExact is 97 or 100, depending on whether the strings with tags are equal ignoring whitespace; the code then applies some tweaks.
pClb is the struct holding the proposals from the TM; pGetIn holds the fuzzy request data.
// loop over CLBs and look for best matching entry
{
LONG lLeftClbLen; // left CLB entries in CLB list
PTMX_TARGET_CLB pClb; // pointer for CLB list processing
#define SEG_DOC_AND_CONTEXT_MATCH 8
#define DOC_AND_CONTEXT_MATCH 7
#define CONTEXT_MATCH 6
#define SAME_SEG_AND_DOC_MATCH 5
#define SAME_DOC_MATCH 4
#define MULT_DOC_MATCH 3
#define NORMAL_MATCH 2
#define IGNORE_MATCH 1
SHORT sCurMatch = 0;
// loop over all target CLBs
pClb = pTMXTargetClb;
lLeftClbLen = RECLEN(pTMXTargetRecord) -
pTMXTargetRecord->usClb;
while ( ( lLeftClbLen > 0 ) && (sCurMatch < SAME_SEG_AND_DOC_MATCH) )
{
USHORT usTranslationFlag = pClb->bTranslationFlag;
USHORT usCurContextRanking = 0; // context ranking of this match
BOOL fIgnoreProposal = FALSE;
// apply global memory option file on global memory proposals
if ( pClb->bTranslationFlag == TRANSLFLAG_GLOBMEM ) // pClb it's segment in TM
{
if ( (pGetIn->pvGMOptList != NULL) && pClb->usAddDataLen ) // pGetIn it's fuzzy requests segment
{
USHORT usAddDataLen = NtmGetAddData( pClb, ADDDATA_ADDINFO_ID, pContextBuffer, MAX_SEGMENT_SIZE );
if ( usAddDataLen )
{
GMMEMOPT GobMemOpt = GlobMemGetFlagForProposal( pGetIn->pvGMOptList, pContextBuffer );
switch ( GobMemOpt )
{
case GM_SUBSTITUTE_OPT: usTranslationFlag = TRANSLFLAG_NORMAL; break;
case GM_HFLAG_OPT : usTranslationFlag = TRANSLFLAG_GLOBMEM; break;
case GM_HFLAGSTAR_OPT : usTranslationFlag = TRANSLFLAG_GLOBMEMSTAR; break;
case GM_EXCLUDE_OPT : fIgnoreProposal = TRUE; break;
} /* endswitch */
} /* endif */
} /* endif */
if ( pClb == pTMXTargetClb )
{
usTargetTranslationFlag = usTranslationFlag;
        } /* endif */
} /* endif */
// check context strings (if any)
if ((!fIgnoreProposal)
&& pGetIn->szContext[0]
&& pClb->usAddDataLen )
{
USHORT usContextLen = NtmGetAddData( pClb, ADDDATA_CONTEXT_ID, pContextBuffer, MAX_SEGMENT_SIZE );
if ( usContextLen != 0 )
{
usCurContextRanking = NTMCompareContext( pTmClb, pGetIn->szTagTable, pGetIn->szContext, pContextBuffer );
} /* endif */
} /* endif */
// check for matching document names
if ( pGetIn->ulParm & GET_IGNORE_PATH )
{
// we have to compare the real document names rather than comparing the document name IDs
PSZ pszCLBDocName = NTMFindNameForID( pTmClb, &(pClb->usFileId), (USHORT)FILE_KEY );
if ( pszCLBDocName != NULL )
{
PSZ pszName = UtlGetFnameFromPath( pszCLBDocName );
if ( pszName == NULL )
{
pszName = pszCLBDocName;
} /* endif */
fMatchingDocName = stricmp( pszName, pszDocName ) == 0;
}
else
{
// could not access the document name, we have to compare the document name IDs
fMatchingDocName = ((pClb->usFileId == usGetFile) || (pClb->usFileId == usAlternateGetFile));
} /* endif */
}
else
{
// we can compare the document name IDs
fMatchingDocName = ((pClb->usFileId == usGetFile) || (pClb->usFileId == usAlternateGetFile));
} /* endif */
if ( fIgnoreProposal )
{
if ( sCurMatch == 0 )
{
sCurMatch = IGNORE_MATCH;
} /* endif */
}
else if ( usCurContextRanking == 100 )
{
if ( fMatchingDocName && (pClb->ulSegmId >= (pGetIn->ulSegmentId - 1)) && (pClb->ulSegmId <= (pGetIn->ulSegmentId + 1)) )
{
if ( sCurMatch < SEG_DOC_AND_CONTEXT_MATCH )
{
sCurMatch = SEG_DOC_AND_CONTEXT_MATCH;
pTMXTargetClb = pClb; // use this target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
}
}
else if ( fMatchingDocName )
{
if ( sCurMatch < DOC_AND_CONTEXT_MATCH )
{
sCurMatch = DOC_AND_CONTEXT_MATCH;
pTMXTargetClb = pClb; // use this target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
}
else if ( sCurMatch == DOC_AND_CONTEXT_MATCH )
{
// we have already a match of this type so check if context ranking
if ( usCurContextRanking > usContextRanking )
{
pTMXTargetClb = pClb; // use newer target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
}
// use time info to ensure that latest match is used
else if ( usCurContextRanking == usContextRanking )
{
// GQ 2015-04-10 New approach: If we have an exact-exact match use this one, otherwise use timestamp for the comparism
BOOL fExactExactNewCLB = fMatchingDocName && (pClb->ulSegmId >= (pGetIn->ulSegmentId - 1)) && (pClb->ulSegmId <= (pGetIn->ulSegmentId + 1));
BOOL fExactExactExistingCLB = ((pTMXTargetClb->usFileId == usGetFile) || (pTMXTargetClb->usFileId == usAlternateGetFile)) &&
(pTMXTargetClb->ulSegmId >= (pGetIn->ulSegmentId - 1)) && (pTMXTargetClb->ulSegmId <= (pGetIn->ulSegmentId + 1));
if ( fExactExactNewCLB && !fExactExactExistingCLB )
{
// use exact-exact CLB for match
pTMXTargetClb = pClb;
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
}
else if ( (fExactExactNewCLB == fExactExactExistingCLB) && (pClb->lTime > pTMXTargetClb->lTime) )
{
// use newer target CLB for match
pTMXTargetClb = pClb;
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
}
} /* endif */
} /* endif */
}
else
{
if ( sCurMatch < CONTEXT_MATCH )
{
sCurMatch = CONTEXT_MATCH;
pTMXTargetClb = pClb; // use this target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
}
else if ( sCurMatch == CONTEXT_MATCH )
{
// we have already a match of this type so check if context ranking
if ( usCurContextRanking > usContextRanking )
{
pTMXTargetClb = pClb; // use newer target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
}
// use time info to ensure that latest match is used
else if ( (usCurContextRanking == usContextRanking) && (pClb->lTime > pTMXTargetClb->lTime) )
{
pTMXTargetClb = pClb; // use newer target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
} /* endif */
} /* endif */
} /* endif */
}
else if ( fMatchingDocName && (pClb->ulSegmId >= (pGetIn->ulSegmentId - 1)) && (pClb->ulSegmId <= (pGetIn->ulSegmentId + 1)) )
{
// same segment from same document available
sCurMatch = SAME_SEG_AND_DOC_MATCH;
pTMXTargetClb = pClb; // use this target CLB for match
usContextRanking = usCurContextRanking;
usTargetTranslationFlag = usTranslationFlag;
}
else if ( fMatchingDocName )
{
// segment from same document available
if ( sCurMatch < SAME_DOC_MATCH )
{
sCurMatch = SAME_DOC_MATCH;
pTMXTargetClb = pClb; // use this target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
}
else if ( sCurMatch == SAME_DOC_MATCH )
{
// we have already a match of this type so
// use time info to ensure that latest match is used
if ( pClb->lTime > pTMXTargetClb->lTime )
{
pTMXTargetClb = pClb; // use newer target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
} /* endif */
} /* endif */
}
else if ( pClb->bMultiple )
{
// multiple target segment available
if ( sCurMatch < MULT_DOC_MATCH )
{
// no better match yet
sCurMatch = MULT_DOC_MATCH;
pTMXTargetClb = pClb; // use this target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
} /* endif */
}
else if ( usTranslationFlag == TRANSLFLAG_NORMAL )
{
// a 'normal' memory match is available
if ( sCurMatch < NORMAL_MATCH )
{
// no better match yet
sCurMatch = NORMAL_MATCH;
pTMXTargetClb = pClb; // use this target CLB for match
usTargetTranslationFlag = usTranslationFlag;
usContextRanking = usCurContextRanking;
} /* endif */
} /* endif */
// continue with next target CLB
if ( sCurMatch < SAME_SEG_AND_DOC_MATCH )
{
lLeftClbLen -= TARGETCLBLEN(pClb);
if (lLeftClbLen > 0)
{
usTgtNum++;
pClb = NEXTTARGETCLB(pClb);
}
} /* endif */
} /* endwhile */
{
BOOL fNormalMatch = (usTargetTranslationFlag == TRANSLFLAG_NORMAL) ||
(usTargetTranslationFlag == TRANSLFLAG_GLOBMEM) ||
(usTargetTranslationFlag == TRANSLFLAG_GLOBMEMSTAR);
switch ( sCurMatch )
{
case IGNORE_MATCH :
usMatchLevel = 0;
break;
case SAME_SEG_AND_DOC_MATCH :
usMatchLevel = fNormalMatch ? usEqual+2 : usEqual-1;
break;
case SEG_DOC_AND_CONTEXT_MATCH :
usMatchLevel = fNormalMatch ? usEqual+2 : usEqual-1; // exact-exact match with matching context
break;
case DOC_AND_CONTEXT_MATCH :
if ( usContextRanking == 100 )
{
// GQ 2015/05/09: treat 100% context matches as normal exact matches
// usMatchLevel = fNormalMatch ? usEqual+2 : usEqual-1;
usMatchLevel = fNormalMatch ? usEqual+1 : usEqual-1;
}
else
{
usMatchLevel = fNormalMatch ? usEqual+1 : usEqual-1;
} /* endif */
break;
case CONTEXT_MATCH :
if ( usContextRanking == 100 )
{
// GQ 2015/05/09: treat 100% context matches as normal exact context matches
// usMatchLevel = fNormalMatch ? usEqual+2 : usEqual-1;
// GQ 2016/10/24: treat 100% context matches as normal exact matches
usMatchLevel = fNormalMatch ? usEqual : usEqual-1;
}
else
{
usMatchLevel = fNormalMatch ? usEqual : usEqual-1;
} /* endif */
break;
case SAME_DOC_MATCH :
usMatchLevel = fNormalMatch ? usEqual+1 : usEqual-1;
break;
case MULT_DOC_MATCH :
usMatchLevel = fNormalMatch ? usEqual+1 : usEqual-1;
break;
default :
usMatchLevel = fNormalMatch ? usEqual : usEqual-1;
break;
} /* endswitch */
}
} |
Concordance search
Required: searchString - the text to look for; searchType ["Source"|"Target"|"SourceAndTarget"] - where to look.
iNumOfProposal - limit of found proposals. The maximum is 20; if 0, the default value of 5 is used.
Request example:
{
  "searchString": "The",
  "searchType": "source", // could be Source, Target, SourceAndTarget - says where to search
  ["searchPosition": "",]
  ["numResults": 20,]
  ["msSearchAfterNumResults": 250,]
  ["loggingThreshold": 0]
}
Response example:
Success:
{
  "ReturnValue": 0,
  "NewSearchPosition": null,
  "results": [
    {
      "source": "For > 100 setups.",
      "target": "Für > 100 Aufstellungen.",
      "segmentNumber": 10906825,
      "id": "",
      "documentName": "none",
      "documentShortName": "NONE",
      "sourceLang": "en-GB", // rfc5646
      "targetLang": "de-DE", // rfc5646
      "type": "Manual",
      "matchType": "undefined",
      "author": "",
      "timestamp": "20190401T084052Z",
      "matchRate": 0,
      "markupTable": "OTMXML",
      "context": "",
      "additionalInfo": ""
    }
  ],
  "ErrorMsg": ""
}
Success, but with NewSearchPosition - not all of the TM was checked; use this position to repeat the search:
{
  "ReturnValue": 0,
  "NewSearchPosition": "8:1",
  "results": [
    {
      "source": "The tar",
      "target": "The target",
      "segmentNumber": 109068250,
      "id": "",
      "documentName": "Te2.xlf",
      "documentShortName": "NONE",
      "sourceLang": "de-DE",
      "targetLang": "EN-GB",
      "type": "Manual",
      "author": "THOMAS LAURIA",
      "timestamp": "20231229T125701Z",
      "markupTable": "OTMXUXLF",
      "context": "2_3",
      "additionalInfo": ""
    }
  ],
  "ErrorMsg": ""
}
SearchPosition / NewSearchPosition format: an internal key such as "7:1". The first number is the segment/record number, the second is the target number. The NewSearchPosition is an internal key of the memory for the next position on sequential access. Since it is an internal key, maintained and understood by the underlying memory plug-in (for EqfMemoryPlugin it is the record number and the position in one record), no assumptions should be made regarding its content. It is just a string that should be sent back to OpenTM2 on the next request, so that the search continues from there.
This is how it is implemented in Translate5: the first request to OpenTM2 contains SearchPosition as an empty string; OpenTM2 then returns a string in NewSearchPosition, which is simply resent to OpenTM2 in the next request.
Not found:
{
  "ReturnValue": 0,
  "NewSearchPosition": null,
  "ErrorMsg": ""
}
TM not found:
{
  "ReturnValue": 133,
  "ErrorMsg": "OtmMemoryServiceWorker::concordanceSearch::"
} |
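The SearchPosition protocol above can be sketched as a client-side loop. Here `search` is a stub standing in for the HTTP request (a real client would POST the concordance-search body shown above), and the page contents and keys are invented for illustration:

```cpp
#include <string>
#include <vector>

struct SearchPage {
    std::vector<std::string> results;
    std::string newSearchPosition; // opaque key, e.g. "8:1"; empty = done
};

// Stub standing in for the HTTP concordance-search request; a real
// client would send searchString plus the previous position here.
SearchPage search(const std::string& pos) {
    if (pos.empty())  return {{"hit1"}, "8:1"};
    if (pos == "8:1") return {{"hit2"}, "15:1"};
    return {{"hit3"}, ""};
}

// Translate5-style loop: resend NewSearchPosition verbatim until empty.
std::vector<std::string> collectAll() {
    std::vector<std::string> all;
    std::string pos; // first request carries an empty SearchPosition
    for (;;) {
        SearchPage page = search(pos);
        all.insert(all.end(), page.results.begin(), page.results.end());
        if (page.newSearchPosition.empty()) break;
        pos = page.newSearchPosition;
    }
    return all;
}
```

The key point is that the client never interprets the position string; it only echoes it back.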
Delete entries / mass deletion | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Purpose | Deletes entries\translation | ||||||||||
Request | POST /%service%/%tm_name%/entriesdelete | ||||||||||
Params | This starts a reorganize process which, like a normal reorganize, removes bad segments, and additionally removes segments that match all of the provided filters combined with logical AND. So if you provide timestamps and addInfo, only segments within the provided timestamp range and with that addInfo are not imported into the new TM (see the reorganize process). | ||||||||||
| |||||||||||
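A minimal sketch of the logical-AND filter semantics above, assuming hypothetical field and parameter names (`timestamp`, `addInfo`); unset filters simply don't constrain the match:

```cpp
#include <optional>
#include <string>

struct Segment { long timestamp; std::string addInfo; };

// A segment is dropped during the rebuild only if it matches
// every filter the caller actually provided (logical AND).
bool matchesAllFilters(const Segment& s,
                       const std::optional<long>& tsFrom,
                       const std::optional<long>& tsTo,
                       const std::optional<std::string>& addInfo) {
    if (tsFrom && s.timestamp < *tsFrom) return false;
    if (tsTo && s.timestamp > *tsTo) return false;
    if (addInfo && s.addInfo != *addInfo) return false;
    return true; // all provided filters matched -> segment is deleted
}
```

With no filters provided, every segment "matches", which is why this endpoint degenerates into a plain reorganize plus mass deletion.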
Update entry | |||||||||||
Purpose | Updates entry\translation | ||||||||||
Request | POST /%service%/%tm_name%/entry | ||||||||||
Params | This request makes changes only in the file buffer (files on disk are not changed)
| ||||||||||
Delete entry | |||||||||||
Purpose | Deletes entry\translation | ||||||||||
Request | POST /%service%/%tm_name%/entrydelete | ||||||||||
Params | Only sourceLang, targetLang, source, and target are required. Deletion is based on a strict match (including tags and whitespace) of target and source |
Request example:
{
"sourceLang": "bg",
"targetLang": "en",
"source": "The end",
"target": "Eth dne"
["documentName": "my file.sdlxliff",]
["segmentNumber": 1,]
["markupTable": "translate5",]
["author": "Thomas Lauria",]
["type": "",]
["timeStamp": ""],
["context": "",]
["addInfo": ""] , ["loggingThreshold": 0]
}
|
Save all TMs
Flushes all file buffers (TMD, TMI files) to the filesystem and resets the 'Modified' flags for the file buffers.
A file buffer is an instance of a .TMD or .TMI file loaded into RAM; it provides better speed and safety when working with the files.
-
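A minimal sketch of what "flush and reset the Modified flags" means; the struct and function names are illustrative, not the service's real ones:

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for a .TMD/.TMI file loaded into RAM.
struct FileBuffer {
    std::string path;       // e.g. a path under ~/.t5memory/MEM/
    bool modified = false;  // the 'Modified' flag
    void writeToDisk() { /* the in-RAM image is written out in real code */ }
};

// Flush every dirty buffer and report which paths were saved.
std::vector<std::string> saveAllTms(std::vector<FileBuffer>& buffers) {
    std::vector<std::string> saved;
    for (FileBuffer& fb : buffers) {
        if (!fb.modified) continue; // clean buffers are skipped
        fb.writeToDisk();
        fb.modified = false;        // reset the 'Modified' flag
        saved.push_back(fb.path);
    }
    return saved;
}
```

This is why the response below lists only the files that were actually written.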
Response example:
{
'saved 4 files': '/home/or/.t5memory/MEM/mem2.TMD, /home/or/.t5memory/MEM/mem2.TMI, /home/or/.t5memory/MEM/newBtree3.TMD, /home/or/.t5memory/MEM/newBtree3.TMI'
}
List of saved files |
Shutdown service
dontsave=1 (optional, in the URL) - skips saving TMs; for now the value doesn't matter, only its presence.
If the service tries to save TMs before closing, it checks whether an import process is still running.
If there is one, it waits 1 second and checks again.
It repeats the last step for up to 10 minutes, then closes the service anyway.
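The wait-and-retry behaviour above can be sketched as follows; the names are illustrative, and the real service sleeps one second between checks:

```cpp
#include <functional>

// Poll for running imports once per "second" (sleep omitted here),
// giving up after 10 minutes (600 checks) and closing anyway.
bool waitForImports(const std::function<bool()>& importRunning,
                    int maxSeconds = 600) {
    for (int waited = 0; waited < maxSeconds; ++waited) {
        if (!importRunning()) return true; // safe to save TMs and exit
        // std::this_thread::sleep_for(std::chrono::seconds(1)) in real code
    }
    return false; // timed out: the service shuts down anyway
}
```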
Response example:%Empty% |
Test tag replacement call
Required: src, trg
Optional: req
Fuzzy search tag replacement test:
Request example:
{
"src": "Tap <ph x='1'/>View <ph x='2' />o<bpt i='1' x='3'/> get <ph x='4'>strong</ph>displayed<ph x='5'>View</ph> two strong<ept i='1' x='6'/>US patents.",
"trg": "View <ph x='1'/> tap <ph x='2' />to<bpt i='1' x='3'/> got <ph x='4'>strong</ph>dosplayd<ph x='5'>Veiw</ph> two strong<ept i='1' x='6'/>US patents.",
"req": "Tap <x id='123'/>View <x id='222' />o<g> get <x id='44'>strong</x>displayed<x id='51'>View</x> two strong</g>US patents."
}
Response example:
//'1' - request result
//'2' - src result
//'3' - trg result
{
'1' :'Tap <x id="123"/>View <x id="222"/>o<bx/> get <x id="44"/>displayed<x id="51"/> two strong<ex/>US patents.',
'2' :'Tap <x id="123"/>View <x id="222"/>o<g> get <x id="44"/>displayed<x id="51"/> two strong</g>US patents.',
'3' :'View <x id="123"/> tap <x id="222"/>to<g> got <x id="44"/>dosplayd<x id="51"/> two strong</g>US patents.',
};
Import tag replacement test:
Request example:
{
"src": "Tap <ph/>View <ph/>o<bpt/> get <ph>strong</ph>displayed<ph>View</ph> two strong<ept/>US patents.",
"trg": "View <ph/> tap <ph/>to<bpt/> got <ph>strong</ph>dosplayd<ph>Veiw</ph> two strong<ept/>US patents.",
}
Response example:
{
'1' :'Tap <ph x="1"/>View <ph x="2"/>o<bpt x="3" i="1"/> get <ph x="4"/>displayed<ph x="5"/> two strong<ept x="6" i="1"/>US patents.',
'2' :'View <ph x="1"/> tap <ph x="2"/>to<bpt x="3" i="1"/> got <ph x="4"/>dosplayd<ph x="5"/> two strong<ept x="6" i="1"/>US patents.',
};
|
...
|
Save all TMs | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Flushes all file buffers (TMD, TMI files) to the filesystem and resets the 'Modified' flags for the file buffers. A file buffer is an instance of a .TMD or .TMI file loaded into RAM; it provides better speed and safety when working with the files. | |||||||||
Request | GET /%service%_service/savetms | |||||||||
Params | - | |||||||||
|
Shutdown service | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Safely shutting down the service with\without saving all loaded tm files to the disk | |||||||||
Request | GET /%service%_service/shutdown?dontsave=1 | |||||||||
Params | dontsave=1 (optional, in the URL) - skips saving TMs; for now the value doesn't matter, only its presence | |||||||||
If the service tries to save TMs before closing, it checks whether an import process is still running; if so, it waits 1 second and checks again, repeating for up to 10 minutes before closing anyway.
|
Test tag replacement call | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Tests the tag replacement mechanism | |||||||||
Request | POST /%service%_service/tagreplacement | |||||||||
Params | Required: src, trg, Optional: req | |||||||||
|
Configuration of service
You can configure the service in ~/.t5service/t5memory.conf
Logging | ||
---|---|---|
Level | Mnemonic | Description |
0 | DEVELOP | could make code work really slow, should be used only when debugging some specific places in code, like binary search in files, etc. |
1 | DEBUG | logging values of variables. Wouldn't delete temporary files(In MEM and TMP subdirectories), like base64 encoded\decoded tmx files and archives for import\export |
2 | INFO | logging top-level functions entrances, return codes, etc. Default value. |
3 | WARNING | logging if we reached some commented or hardcoded code. Usually commented code here is replaced with new code, and if not, it's marked as ERROR level |
4 | ERROR | errors, why and where something fails during parsing, search, etc |
5 | FATAL | you shouldn't reach this code; something is really wrong. Other values are ignored. The set level stays the same until you change it in a new request or close the app. Logs are written to a file named with the date/time under ~/.OtmMemoryService/Logs, and ERROR/FATAL messages are duplicated in another log file with a FATAL suffix |
6 | TRANSACTION | logs only things like the begin/end of a request, etc. There is no purpose in setting the level this high |
Logging can impact application speed significantly, especially during import or export. t5memory has two systems of logs: one from the glog library, set at launch as a command-line parameter, and an internal one that filters logs by their level, which can be set with every request that has a JSON body via an additional ["loggingThreshold": 0] parameter, or at startup with a flag. POST http://localhost:4040/t5memory/example_tm/ { Or in the t5memory.conf file in a line (the config file is obsolete now) | ||
Working directory | |
---|---|
Path | Description |
~/.t5memory | The main directory of the service. It should always be under the home directory. It consists of nested folders and the t5memory.conf file (see Config file). All directories/files below are nested |
LOG | Includes log files. It should be cleaned up manually. One session (launch of the service) creates two files: Log_Thu May 12 10:15:48 2022 .log and Log_Thu May 12 10:15:48 2022 .log_IMPORTANT |
MEM | Main data directory. All TM files are stored here. One TM should include .TMD (data file), .TMI (index file), and .MEM (properties file) with the same name as the TM name |
TABLE | The service's reserved read-only folder with tag tables, languages, etc. |
TEMP | For temporary files created mainly for import/export. At low debug levels (DEVELOP, DEBUG) it should be cleaned manually |
t5memory.conf | Main config file (see config file) |
Config directory should be located in a specific place |
...