Overview and API introduction
This document describes the REST interface of the translate5 TM service.
The translate5 TM service is built on the OpenTM2 Translation Memory Engine.
It provides the following functionality:
- import new openTM2-TMs
- delete openTM2-TMs
- create new empty openTM2-TM
- import TMX
- open TM and close TM: not possible; see the separate section in this document.
- query TM for matches: one query per TM; querying multiple TMs at once is not supported.
- query TM for concordance search
- save new entry to TM
This is achieved by the RESTful HTTP service specified below. The specification is given in the following form:
- URL of the HTTP Resource, where servername and an optional path prefix is configurable.
- HTTP Method with affected functionality
- Brief Description
- Sent and returned Body.
Request Data Format:
The data transferred in requests is JSON and is sent directly in the request body.
URL Format:
In this document, OpenTM2 is always assumed to be reachable under http://opentm2/.
To allow full use of networking features (proxying etc.), the URL is configurable in translate5, so the OpenTM2 instance can also reside under http://xyz/foo/bar/.
Errors
The possible errors are listed for each resource. In case of an error, the body should contain at least the following JSON; where it makes sense, the attributes of the original representation can be added.
{
"errors": [{"errorMsg": "Given tmxData is no TMX."}]
}
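A client can collect the messages from such an error body roughly as sketched below. The function name `error_messages` is hypothetical, and the sketch assumes the service sends the body as strict JSON with quoted keys.

```python
import json
from typing import List


def error_messages(body: str) -> List[str]:
    """Return every errorMsg found in an error response body."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all: nothing to extract.
        return []
    if not isinstance(payload, dict):
        return []
    errors = payload.get("errors")
    if not isinstance(errors, list):
        return []
    return [e["errorMsg"] for e in errors if isinstance(e, dict) and "errorMsg" in e]
```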
Values | |
---|---|
%service% | Name of the service (default: t5memory; can be changed in the t5memory.conf file) |
%tm_name% | Name of Translation Memory |
Example | http://localhost:4040/t5memory/examle_tm/fuzzysearch/? |
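The placeholders above can be filled in by a small helper such as the following sketch. The function name `endpoint_url` and its parameters are hypothetical; the host and port are taken from the example row and would differ per installation.

```python
from typing import Optional
from urllib.parse import quote


def endpoint_url(base: str, service: str = "t5memory",
                 tm_name: Optional[str] = None, action: str = "") -> str:
    """Build /%service%/[%tm_name%/][action] below the given base URL."""
    parts = [base.rstrip("/"), quote(service, safe="")]
    if tm_name is not None:
        # TM names may contain characters that must be percent-encoded.
        parts.append(quote(tm_name, safe=""))
    if action:
        parts.append(action)
        return "/".join(parts)
    # Resource URLs without an action end with a trailing slash.
    return "/".join(parts) + "/"
```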
Endpoints overview | ||||
---|---|---|---|---|
1 | Get the list of TMs | Returns JSON list of TMs | GET | /%service%/ |
2 | Create TM | Creates TM with the provided name | POST | /%service%/ |
3 | Create/Import TM in internal format | Import and unpack base64 encoded archive of .TMD, .TMI, .MEM files. Rename it to provided name | POST | /%service%/ |
4 | Delete TM | Deletes .TMD, .TMI, .MEM files | DELETE | /%service%/%tm_name%/ |
5 | Import TMX into TM | Import provided base64 encoded TMX file into TM | POST | /%service%/%tm_name%/import |
6 | Export TMX from TM | Creates TMX from tm. Encoded in base64 | GET | /%service%/%tm_name%/ |
7 | Export in Internal format | Creates and exports archive with .TMD, .TMI, .MEM files of TM | GET | /%service%/%tm_name%/ |
8 | Status of TM | Returns status/import status of TM | GET | /%service%/%tm_name%/status |
9 | Fuzzy search | Returns entries/translations with small differences from the requested segment | POST | /%service%/%tm_name%/fuzzysearch |
10 | Concordance search | Returns entries\translations that contain requested segment | POST | /%service%/%tm_name%/concordancesearch |
11 | Entry update | Updates entry\translation | POST | /%service%/%tm_name%/entry |
12 | Entry delete | Deletes entry\translation | POST | /%service%/%tm_name%/entrydelete |
13 | Save all TMs | Flushes all file buffers (TMD, TMI files) to the filesystem | GET | /%service%_service/savetms |
14 | Shutdown service | Flushes all file buffers to the filesystem and shuts down the service | GET | /%service%_service/shutdown |
15 | Test tag replacement call | For testing tag replacement | POST | /%service%_service/tagreplacement |
Available end points
List of TMs | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Returns JSON list of TMs | |||||||||
Request | GET /%service%/ | |||||||||
Params | - | |||||||||
|
Create TM | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Creates TM with the provided name | |||||||||
Request | POST /%service%/%tm_name%/ | |||||||||
Params | Required: name, sourceLang | |||||||||
|
Create/Import TM in internal format | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Import and unpack base64 encoded archive of .TMD, .TMI, .MEM files. Rename it to provided name | |||||||||
Request | POST /%service%/%tm_name%/ | |||||||||
Params | { "name": "examle_tm", "sourceLang": "bg-BG" , "data":"base64EncodedArchive" } | |||||||||
|
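The request body for this call could be assembled as in the sketch below. It assumes the archive is a ZIP file (the export call mentions application/zip, but the import format is not stated here); `internal_format_body` is a hypothetical helper name.

```python
import base64
import io
import json
import zipfile
from typing import Dict


def internal_format_body(name: str, source_lang: str, files: Dict[str, bytes]) -> str:
    """Pack the TM files into an in-memory ZIP and wrap it in the JSON body."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for filename, content in files.items():
            archive.writestr(filename, content)
    return json.dumps({
        "name": name,
        "sourceLang": source_lang,
        # The archive travels as base64 text inside the JSON body.
        "data": base64.b64encode(buffer.getvalue()).decode("ascii"),
    })
```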
Delete TM | |||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Purpose | Deletes .TMD, .TMI, .MEM files | ||||||||||||||||||
Request | DELETE /%service%/%tm_name%/ | ||||||||||||||||||
Params | - | ||||||||||||||||||
|
Import provided base64 encoded TMX file into TM | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Imports the provided base64 encoded TMX file into the TM. Starts another thread for the import. To check the import status, use the status call | |||||||||
Request | POST /%service%/%tm_name%/import | |||||||||
Params | {"tmxData": "base64EncodedTmxFile" } | |||||||||
|
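A minimal sketch of preparing this call with the Python standard library follows. It only builds the request object and does not send it; `tmx_import_request` is a hypothetical name, and the Content-Type header is an assumption that this document does not confirm.

```python
import base64
import json
import urllib.request


def tmx_import_request(base_url: str, service: str, tm_name: str,
                       tmx_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST request for a TMX import."""
    body = json.dumps({"tmxData": base64.b64encode(tmx_bytes).decode("ascii")})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{service}/{tm_name}/import",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Because the import runs in a separate thread, the response does not indicate that the import has finished; the status call reports that.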
Export TMX from TM | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Creates TMX from tm. | |||||||||
Request | GET /%service%/%tm_name%/ | |||||||||
Headers | Accept - application/xml | |||||||||
|
Export in internal format | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Creates and exports archive with .TMD, .TMI, .MEM files of TM | |||||||||
Request | GET /%service%/%tm_name%/ | |||||||||
Headers | Accept - application/zip | |||||||||
|
Get the status of TM | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Request | GET /%service%/%tm_name%/status | |||||||||
Params | - | |||||||||
|
Fuzzy search | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Returns entries/translations with small differences from the requested segment | |||||||||
Request | POST /%service%/%tm_name%/fuzzysearch | |||||||||
Params | Required: source, sourceLang, targetLang. iNumOfProposal - limit of found proposals; max is 20; if 0, the default value 5 is used | |||||||||
|
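The iNumOfProposal rule can be mirrored on the client side as in this sketch. The helper names are hypothetical, and capping values above 20 (rather than rejecting them) is an assumption; the table only states that the maximum is 20.

```python
import json


def effective_proposal_limit(requested: int) -> int:
    """Apply the documented rule: 0 means the default 5, the maximum is 20."""
    if requested < 0:
        raise ValueError("iNumOfProposal must not be negative")
    if requested == 0:
        return 5
    return min(requested, 20)  # assumption: larger values are capped


def fuzzy_search_body(source: str, source_lang: str, target_lang: str,
                      num_of_proposal: int = 0) -> str:
    """Build the JSON body with the fields listed in the Params row."""
    return json.dumps({
        "source": source,
        "sourceLang": source_lang,
        "targetLang": target_lang,
        "iNumOfProposal": num_of_proposal,
    })
```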
Concordance search | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Returns entries\translations that contain requested segment | |||||||||
Request | POST /%service%/%tm_name%/concordancesearch | |||||||||
Params | Required: searchString - what we are looking for; searchType ["Source"|"Target"|"SourceAndTarget"] - where to look. iNumOfProposal - limit of found proposals; max is 20; if 0, the default value 5 is used | |||||||||
|
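Since searchType accepts only three values, a client may want to validate it before sending. The following is a sketch with a hypothetical helper name, `concordance_body`.

```python
import json

SEARCH_TYPES = ("Source", "Target", "SourceAndTarget")


def concordance_body(search_string: str, search_type: str = "Source") -> str:
    """Build the JSON body, rejecting search types the service does not list."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"searchType must be one of {SEARCH_TYPES}")
    return json.dumps({"searchString": search_string, "searchType": search_type})
```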
Update entry | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Updates entry\translation | |||||||||
Request | POST /%service%/%tm_name%/entry | |||||||||
Params | Only sourceLang, targetLang, source and target are required | |||||||||
|
Delete entry | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Deletes entry\translation | |||||||||
Request | POST /%service%/%tm_name%/entrydelete | |||||||||
Params | Only sourceLang, targetLang, source, and target are required. Deletion is based on a strict match (including tags and whitespace) of target and source | |||||||||
|
Save all TMs | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Flushes all file buffers (TMD, TMI files) to the filesystem and resets the 'Modified' flags of the file buffers. A file buffer is an instance of a .TMD or .TMI file loaded into RAM. It provides better speed and safety when working with files. | |||||||||
Request | GET /%service%_service/savetms | |||||||||
Params | - | |||||||||
|
Shutdown service | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | Safely shuts down the service, with or without saving all loaded TM files to disk | |||||||||
Request | GET /%service%_service/shutdown?dontsave=1 | |||||||||
Params | dontsave=1 (optional, in the address) - skips saving TMs; for now the value doesn't matter, only its presence | |||||||||
If TMs are to be saved before closing, the service checks whether an import process is still running
|
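Because only the presence of dontsave matters, a client can simply append or omit it. A small sketch, with the hypothetical helper `shutdown_url`:

```python
def shutdown_url(base_url: str, service: str = "t5memory", save: bool = True) -> str:
    """Return the shutdown URL; add dontsave only when saving should be skipped."""
    url = f"{base_url.rstrip('/')}/{service}_service/shutdown"
    # The value of dontsave is ignored by the service; 1 is used for readability.
    return url if save else url + "?dontsave=1"
```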
Test tag replacement call | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Purpose | For testing tag replacement | |||||||||
Request | POST /%service%_service/tagreplacement | |||||||||
Params | Required: src, trg, Optional: req | |||||||||
|
Configuration of service
You can configure the service in ~/.t5memory/t5memory.conf
Logging | ||
---|---|---|
Level | Mnemonic | Description |
0 | DEVELOP | could make code work really slow, should be used only when debugging some specific places in code, like binary search in files, etc. |
1 | DEBUG | logging values of variables. Wouldn't delete temporary files(In MEM and TMP subdirectories), like base64 encoded\decoded tmx files and archives for import\export |
2 | INFO | logging top-level functions entrances, return codes, etc. Default value. |
3 | WARNING | logging if we reached some commented or hardcoded code. Usually commented code here is replaced with new code, and if not, it's marked as ERROR level |
4 | ERROR | errors, why and where something fails during parsing, search, etc |
5 | FATAL | you shouldn't reach this code, something is really wrong |
6 | TRANSACTION | Logs only things like begin/end of request etc. There is no reason to set the level this high |
Other values are ignored. The set level stays the same until you change it in a new request or close the app. Logs are supposed to be written into a file with a date/time name under ~/.OtmMemoryService/Logs, and errors/fatals are supposed to be duplicated in another log file with the FATAL suffix |
Logging could impact application speed very much, especially during import or export. POST http://localhost:4040/t5memory/example_tm/ { Or in t5memory.conf file in line |
Working directory | |
---|---|
Path | Description |
~/.t5memory | The main directory of service. Should always be under the home directory. Consists of nested folders and t5memory.conf file(see Config file). All directories\files below are nested |
LOG | Includes log files. It should be cleaned up manually. One session (launch of the service) creates two files: Log_Thu May 12 10:15:48 2022 .log and Log_Thu May 12 10:15:48 2022 .log_IMPORTANT |
MEM | Main data directory. All TM files are stored here. One TM should include .TMD (data file), .TMI (index file), and .MEM (properties file) files with the same name as the TM |
TABLE | Service's reserved read-only folder with tag tables, languages etc. |
TEMP | For temporary files created mainly for import/export. On low debug levels (DEVELOP, DEBUG) it should be cleaned manually |
t5memory.conf | Main config file(see config file) |
Config directory should be located in a specific place |
Config file | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
field | default | Description | |||||||||
name | t5memory | name of service that we use under %service% in address | |||||||||
port | 8080 | service port | |||||||||
timeout | 3600 | service timeout | |||||||||
threads | 1 | ||||||||||
logLevel | 2 | logLevel - > see logging | |||||||||
AllowedRAM_MB | 1500 | RAM limit used for opening/closing TMs (see Opening and closing TM). Doesn't include the service's own RAM | |||||||||
TriplesThreshold | 33 | Level of pre-fuzzy search filtering based on combinations of triples of tokens (excluding tags). Could impact fuzzy search performance. With higher values the service is faster, but could skip some segments in the result. Not always correlated with the resulting fuzzyRate | |||||||||
The config file should be located at ~/.t5memory/t5memory.conf. All fields have default values, so the service can start without the conf file. Configs are read and applied only once, at service start. Once the service has started, you should be able to see the applied values in the logs.
|
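The "defaults plus overrides" behaviour can be pictured with the sketch below. The defaults are copied from the table; the syntax of the conf file is not described here, so the sketch takes already parsed key/value pairs, and `ServiceConfig` and `with_overrides` are hypothetical names.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Dict


@dataclass(frozen=True)
class ServiceConfig:
    name: str = "t5memory"
    port: int = 8080
    timeout: int = 3600
    threads: int = 1
    logLevel: int = 2
    AllowedRAM_MB: int = 1500
    TriplesThreshold: int = 33


def with_overrides(overrides: Dict[str, Any]) -> ServiceConfig:
    """Start from the defaults and apply only the known fields."""
    defaults = ServiceConfig()
    known = {f.name for f in fields(ServiceConfig)}
    # Convert each value to the type of its default, e.g. "4040" -> 4040.
    clean = {k: type(getattr(defaults, k))(v) for k, v in overrides.items() if k in known}
    return replace(defaults, **clean)
```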
Conceptional information
Opening and closing TM | |
---|---|
In the first concept it was planned to implement routines to open and close a TM. While working out the concept we found some problems with this approach:
This leads to the following conclusion for the implementation of opening and closing TMs: OpenTM2 has to load the requested TMs automatically when they are requested. OpenTM2 also has to close a TM after it has not been used for some time. That means OpenTM2 has to track the timestamp of the last request for each TM.
http://opentm2/translationmemory/[TM_Name]/openHandle GET – Opens a memory for queries by OpenTM2. Note: This method is not required as memories are automatically opened when they are accessed for the first time.
http://opentm2/translationmemory/[TM_Name]/openHandle DELETE – Closes a memory for queries by OpenTM2. Note: This method is not required as memories are automatically opened when they are accessed for the first time.
|
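The open-on-access and close-when-idle idea described above can be illustrated with a toy registry. This is not the service's implementation; `TmRegistry`, its methods, and the idle threshold are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from typing import Dict, List


class TmRegistry:
    """Tracks when each TM was last requested."""

    def __init__(self, idle_seconds: float) -> None:
        self.idle_seconds = idle_seconds
        self.last_used: Dict[str, float] = {}

    def access(self, tm_name: str, now: float) -> bool:
        """Record a request; return True if the TM had to be opened first."""
        opened = tm_name not in self.last_used
        self.last_used[tm_name] = now
        return opened

    def close_idle(self, now: float) -> List[str]:
        """Close and return every TM not requested for longer than idle_seconds."""
        idle = sorted(n for n, t in self.last_used.items() if now - t > self.idle_seconds)
        for name in idle:
            del self.last_used[name]
        return idle
```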
TAG REPLACEMENT | ||
---|---|---|
Tag replacement
Pseudocode for tag replacement in import call:
TAG_REPLACEMENT PSEUDOCODE
struct TagInfo
{
bool fPairTagClosed = true; // false for bpt tag - waiting for matching ept tag. If we'll find matching tag -> we'll set this to true
bool fTagAlreadyUsedInTarget = false; // would be set to true if we would already use this tag as matching for target
// this we generate to save in TM. this would be saved as <{generated_tagType} [x={generated_x}] [i={generated_i}]/>.
// we would skip x attribute for generated_tagType=EPT_ELEMENT and i for generated_tagType=PH_ELEMENT
int generated_i = -1; // for pair tags - generated identifier to find matching tag. the same as in original_i if it's not binded to other tag in segment
int generated_x = -1; // id of tag. should match original_x, if it's not occupied by other tags
TagType generated_tagType = UNKNOWN_ELEMENT; // replaced tagType, could be only PH_ELEMENT, BPT_ELEMENT, EPT_ELEMENT
// this cant be generated, only saved from provided data
int original_i = -1; // original paired tags i
int original_x = -1; // original id of tag
TagType original_tagType = UNKNOWN_ELEMENT; // original tagType, could be any tag
};
}
TagType could be one of the values in enum:
[
BPT_ELEMENT EPT_ELEMENT G_ELEMENT HI_ELEMENT SUB_ELEMENT BX_ELEMENT EX_ELEMENT
//standalone tags
BEGIN_STANDALONE_TAGS PH_ELEMENT X_ELEMENT IT_ELEMENT UT_ELEMENT
]
we use 3 lists of tags
SOURCE_TAGS
TARGET_TAGS
REQUEST_TAGS
as id we understand one of following attributes(which is present in original tag) : 'x', 'id'
as i we understand one of following attributes(which is present in original tag) : 'i', 'rid'
all single tags we understand as ph_tag
all opening pair tags we understand as bpt_tag
all closing pair tags we understand as ept_tag
-1 means that value is not found/not used/not provided etc.
for ept tags in generated_id we would use generated_id from matching bpt tag
if matching bpt tag is not found -> ???
TagType could be set to one of following values
TAG REPLACEMENT USE CASES {
IMPORT{
SOURCE_SEGMENT{
<single tags> -> would be saved as <ph>{ // for ph and all single tags
if(type == "lb"){
replace with newline
}else{
generate next generated_id incrementally
ignore content and attributes(except id) if provided
set generated_tagType to PH_ELEMENT
save original_tagType for matching
if id provided -> save as original_id for matching
save tag to SOURCE_TAGS
}
}
<opening pair tags> -> would be saved as <bpt>{
original type is <bpt>{
generate generated_i incrementally in source segment
generate generated_id incrementally
set generated_tagType to BPT_ELEMENT
save original_i (should that always be provided??)
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
set original_type as BPT_ELEMENT
save tag to SOURCE_TAGS
}
original type is <bx>{
generate generated_i incrementally in source segment
generate generated_id incrementally
set generated_tagType to BPT_ELEMENT
save original_i (should that always be provided??)
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
set original_type as BX_ELEMENT
save tag to SOURCE_TAGS
}
original type is other opening pair tags(like <g>){
generate generated_i incrementally in source segment
generate generated_id incrementally
set generated_tagType to BPT_ELEMENT
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
save tag type as original_tagType;
save tag to SOURCE_TAGS
}
}
<closing pair tags> -> would be saved as <ept>{
original type is <ept>{
search for matching bpt_tag in saved tags
//should we look in reverse order?
looking in SOURCE_TAGS for matchingTag which have [
matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //all OPENING PAIR TAGs always has BPT_ELEMENT here
AND matchingTag.original_tagType == BPT_ELEMENT
AND matchingTag.original_i == our_ept_tag.original_i
]
if found
set matchingTag.fPairTagClosed to true to eliminate matching one opening tag for different closing tags
set our_ept_tag.i to matchingTag.i
set our_ept_tag.id to matchingTag.id
else
generate next our_ept_tag.generated_i incrementally in source segment // in every segment(target, source, request) i starts from 1
generate next our_ept_tag.generated_id incrementally // should be unique across target, source and request segments
save tag in SOURCE_TAGS
}
original type is <ex>{
search for matching bpt_tag in saved tags
//should we look in reverse order?
looking in SOURCE_TAGS for matchingTag which have [
matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //all OPENING PAIR TAGs has BPT_ELEMENT here
AND matchingTag.original_tagType == BX_ELEMENT
AND matchingTag.original_i == our_ept_tag.original_i
]
if found
set matchingTag.fPairTagClosed to true to eliminate matching one opening tag for different closing tags
set our_ept_tag.i to matchingTag.i
set our_ept_tag.id to matchingTag.id
else
generate next our_ept_tag.generated_i incrementally in source segment // in every segment(target, source, request) i starts from 1
generate next our_ept_tag.generated_id incrementally // should be unique across target, source and request segments
save tag in SOURCE_TAGS
}
original type is other closing pair tags(like </g>){
search for matching bpt_tag in saved tags:
looking in SOURCE_TAGS in REVERSE for matchingTag which have
[ matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //OPENING_PAIR_TAG
AND matchingTag.original_tagType == our_tag.original_tagType
]
if found
set matchingTag.fPairTagClosed to true to eliminate matching one opening tag for different closing tags
set our_tag.generated_i to matchingTag.i
set our_tag.generated_id to matchingTag.id
else
generate next our_tag.generated_i incrementally in source segment // in every segment(target, source, request) i starts from 1
generate next our_tag.generated_id incrementally // should be unique across target, source and request segments
save tag in SOURCE_TAGS
}
}
}
TARGET_SEGMENT{
<single tags> -> would be saved as <ph>{ // for ph and all single tags
if(type == "lb"){
replace with newline
}else{
ignore content and attributes(except id) if provided
save original_tagType for matching
if id provided -> save as original_id for matching
search for matching ph_tag in saved tags
looking in SOURCE_TAGS for matchingTag which have [
matchingTag.fTagAlreadyUsedInTarget == false
AND matchingTag.generated_tagType == PH_ELEMENT //SINGLE TAG
AND matchingTag.original_tagType == our_ph_tag.original_tagType
AND matchingTag.original_id == our_ph_tag.original_id
]
if found
set matchingTag.fTagAlreadyUsedInTarget = true
set our_ph_tag.generated_id = matchingTag.generated_id // use id generated for source segment
else
generate new our_ph_tag.generated_id incrementally(should be unique for SOURCE and TARGET)
save tag in TARGET_TAGS // we should track only opening pair tags in target, so theoretically can skip this step
}
}
<opening tags> -> would be saved as <bpt>{
original type is <bpt>{
set generated_tagType to BPT_ELEMENT
save original_i (should that always be provided??)
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
set original_type as BPT_ELEMENT
try to find matching source tag to get generated id:
looking in SOURCE_TAGS for matchingTag which have [
matchingTag.fTagAlreadyUsedInTarget == false
AND matchingTag.generated_tagType == BPT_ELEMENT //all OPENING PAIR TAGs always has BPT_ELEMENT here
AND matchingTag.original_tagType == BPT_ELEMENT
AND matchingTag.original_id == our_bpt_tag.original_id
]
if found:
set matchingTag.fTagAlreadyUsedInTarget to true
generate our_bpt_tag.generated_i incrementally in target segment
set our_bpt_tag.generated_id to matchingTag.generated_id
else:
generate our_bpt_tag.generated_i incrementally // unique between all segments
generate our_bpt_tag.generated_id incrementally // unique between all segments
save tag in TARGET_TAGS
}
original type is <bx>{
set generated_tagType to BPT_ELEMENT
save original_i (should that always be provided??)
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
set original_type as BX_ELEMENT
try to find matching source tag to get generated id:
looking in SOURCE_TAGS for matchingTag which have [
matchingTag.fTagAlreadyUsedInTarget == false
AND matchingTag.generated_tagType == BPT_ELEMENT //all OPENING PAIR TAGs always has BPT_ELEMENT here
AND matchingTag.original_tagType == BX_ELEMENT
AND matchingTag.original_id == our_bpt_tag.original_id
]
if found:
set matchingTag.fTagAlreadyUsedInTarget to true
generate our_bpt_tag.generated_i incrementally in target segment
set our_bpt_tag.generated_id to matchingTag.generated_id
else:
generate our_bpt_tag.generated_i incrementally // unique between all segments
generate our_bpt_tag.generated_id incrementally // unique between all segments
save tag in TARGET_TAGS
}
original type is other opening pair tags(like <g>){
set generated_tagType to BPT_ELEMENT
we never have here original i attribute
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
save original_type
try to find matching source tag to get generated id:
looking in SOURCE_TAGS for matchingTag which have [
matchingTag.fTagAlreadyUsedInTarget == false
AND matchingTag.generated_tagType == BPT_ELEMENT //all OPENING PAIR TAGs always has BPT_ELEMENT here
AND matchingTag.original_tagType == our_tag.original_tagType
AND matchingTag.original_id == our_tag.original_id
]
if found:
set matchingTag.fTagAlreadyUsedInTarget to true
generate our_tag.generated_i incrementally in target segment
set our_tag.generated_id to matchingTag.generated_id
else:
generate our_tag.generated_i incrementally // unique between all segments
generate our_tag.generated_id incrementally // unique between all segments
save tag in TARGET_TAGS
}
}
<closing tags> -> would be saved as <ept>{
original type is <ept>{
try to find matching bpt tag in TARGET_TAGS
looking in TARGET_TAGS for matchingTag which have [
matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //all OPENING PAIR TAGs always has BPT_ELEMENT here
AND matchingTag.original_tagType == BPT_ELEMENT
AND matchingTag.original_i == our_tag.original_i
]
if found:
set matchingTag.fPairTagClosed to true
set our_tag.generated_id to matchingTag.generated_id
set our_tag.generated_i to matchingTag.generated_i
else:
generate our_tag.generated_i incrementally // unique between all segments
generate our_tag.generated_id incrementally // unique between all segments
save tag in TARGET_TAGS // we should track only opening pair tags in target, so theoretically can skip this step
}
original type is <ex>{
try to find matching bpt tag in TARGET_TAGS
looking in TARGET_TAGS for matchingTag which have [
matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //all OPENING PAIR TAGs always has BPT_ELEMENT here
AND matchingTag.original_tagType == BX_ELEMENT
AND matchingTag.original_i == our_tag.original_i
]
if found:
set matchingTag.fPairTagClosed to true
set our_tag.generated_id to matchingTag.generated_id
set our_tag.generated_i to matchingTag.generated_i
else:
generate our_tag.generated_i incrementally // unique between all segments
generate our_tag.generated_id incrementally // unique between all segments
save tag in TARGET_TAGS // we should track only opening pair tags in target, so theoretically can skip this step
}
original type is other closing pair tags(like </g>){
search for matching bpt_tag in saved tags:
looking in TARGET_TAGS in REVERSE for matchingTag which have
[ matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //OPENING_PAIR_TAG
AND matchingTag.original_tagType == our_tag.original_tagType
]
if found:
set matchingTag.fPairTagClosed to true to eliminate matching one opening tag for different closing tags
set our_tag.generated_i to matchingTag.i
set our_tag.generated_id to matchingTag.id
else :
generate next our_tag.generated_i incrementally in target segment // in every segment(target, source, request) i starts from 1
generate next our_tag.generated_id incrementally // should be unique across target, source and request segments
save tag in TARGET_TAGS // we should track only opening pair tags in target, so theoretically can skip this step
}
}
}
}
}
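The pairing rule for generic opening and closing tags in the source segment (the `<g>` ... `</g>` case above) can be tried out with the following sketch. It is a simplified illustration, not the service code: it ignores the `original_i` and `original_id` attributes and the target segment, and the names `TagInfo`, `SourceTags`, `open_pair` and `close_pair` are chosen for this example only.

```python
from dataclasses import dataclass, field
from typing import List

BPT, EPT = "BPT_ELEMENT", "EPT_ELEMENT"


@dataclass
class TagInfo:
    generated_type: str
    original_type: str
    generated_i: int = -1    # -1 means "not set"
    generated_id: int = -1
    pair_closed: bool = True  # False only for an opening tag awaiting its close


@dataclass
class SourceTags:
    tags: List[TagInfo] = field(default_factory=list)
    next_i: int = 1
    next_id: int = 1

    def _new_ids(self):
        i, ident = self.next_i, self.next_id
        self.next_i += 1
        self.next_id += 1
        return i, ident

    def open_pair(self, original_type: str) -> TagInfo:
        i, ident = self._new_ids()
        tag = TagInfo(BPT, original_type, i, ident, pair_closed=False)
        self.tags.append(tag)
        return tag

    def close_pair(self, original_type: str) -> TagInfo:
        tag = TagInfo(EPT, original_type)
        # Search in reverse so the innermost still-open tag is matched first.
        for cand in reversed(self.tags):
            if (not cand.pair_closed and cand.generated_type == BPT
                    and cand.original_type == original_type):
                cand.pair_closed = True
                tag.generated_i, tag.generated_id = cand.generated_i, cand.generated_id
                break
        else:
            # No open partner found: generate fresh values.
            tag.generated_i, tag.generated_id = self._new_ids()
        self.tags.append(tag)
        return tag
```

For two nested `<g>` tags, the first closing tag receives the values of the inner opening tag and the second those of the outer one, which is why the reverse search order matters.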
Tag replacement for fuzzy request pseudocode:
TAG_REPLACEMENT PSEUDOCODE
struct TagInfo
{
bool fPairTagClosed = true; // false for bpt tag - waiting for matching ept tag. If we'll find matching tag -> we'll set this to true
bool fTagAlreadyUsedInTarget = false; // would be set to true if we would already use this tag as matching for target
// this we generate to save in TM. this would be saved as <{generated_tagType} [x={generated_x}] [i={generated_i}]/>.
// we would skip x attribute for generated_tagType=EPT_ELEMENT and i for generated_tagType=PH_ELEMENT
int generated_i = -1; // for pair tags - generated identifier to find matching tag. the same as in original_i if it's not binded to other tag in segment
int generated_x = -1; // id of tag. should match original_x, if it's not occupied by other tags
TagType generated_tagType = UNKNOWN_ELEMENT; // replaced tagType, could be only PH_ELEMENT, BPT_ELEMENT, EPT_ELEMENT
// this cant be generated, only saved from provided data
int original_i = -1; // original paired tags i
int original_x = -1; // original id of tag
TagType original_tagType = UNKNOWN_ELEMENT; // original tagType, could be any tag
};
}
we use 3 lists of tags
SOURCE_TAGS
TARGET_TAGS
REQUEST_TAGS
as id we understand one of following attributes(which is present in original tag) : 'x', 'id'
as i we understand one of following attributes(which is present in original tag) : 'i', 'rid'
all single tags we understand as ph_tag
all opening pair tags we understand as bpt_tag
all closing pair tags we understand as ept_tag
-1 means that value is not found/not used/not provided etc.
for ept tags in generated_id we would use generated_id from matching bpt tag
if matching bpt tag is not found -> ???
TagType could be set to one of following values
TAG REPLACEMENT USE CASES {
REQUEST{
basically we convert request segment to tmx tags(similar as we generate ph, bpt and ept tags at import), but with saving original data
then we try to find matching tags from the source to generated from the request. In matching source tags we replace data with original from request(tagType, id and i attributes)
then do the same with target segment\tags
REQUEST_SEGMENT{
are we sending only xliff? so ph, bpt and ept tag shouldn't be handled here?
<single tags> { // for ph and all single tags
// here we can have PH, X, IT, UT tags, right?
generate generated_id incrementally
set generated_tagType to PH_ELEMENT
save original_id if provided (should that always be provided??)
save tag type as our_tag.original_tagType
save tag in REQUEST_TAGS
}
<opening tags> {
//this would be never send from translate5, right?
original type is <bpt>{
save tag in REQUEST_TAGS
}
original type is <bx>{
generate generated_i incrementally in source segment
generate generated_id incrementally
set generated_tagType to BPT_ELEMENT
save original_i (should that always be provided??)
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
set fTagAlreadyUsedInTarget to false;
set original_type as BX_ELEMENT
save tag to REQUEST_TAGS
}
original type is <g>{
generate generated_i incrementally in source segment
generate generated_id incrementally
set generated_tagType to BPT_ELEMENT
we don't have original_i provided here, only original_id, right?
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
set fTagAlreadyUsedInTarget to false;
set original_type as G_ELEMENT
save tag in REQUEST_TAGS
}
original type is <hi>{
generate generated_i incrementally in source segment
generate generated_id incrementally
set generated_tagType to BPT_ELEMENT
we don't have original_i provided here, only original_id, right?
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
set fTagAlreadyUsedInTarget to false;
set original_type as HI_ELEMENT
save tag in REQUEST_TAGS
}
original type is <sub>{
generate generated_i incrementally in source segment
generate generated_id incrementally
set generated_tagType to BPT_ELEMENT
we don't have original_i provided here, only original_id, right?
save original_id if provided (should that always be provided??)
set fPairTagClosed to false; // it would be set to true if we would use this tag as matching
set fTagAlreadyUsedInTarget to false;
set original_type as SUB_ELEMENT
save tag in REQUEST_TAGS
}
}
<closing tags> {
//this would be never send from translate5, right?
original type is <ept>{
save tag in REQUEST_TAGS
}
original type is <ex>{
search for matching tag in saved tags:
looking in REQUEST_TAGS in REVERSE for matchingTag which have
[ matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //OPENING_PAIR_TAG
AND matchingTag.original_tagType == BX_ELEMENT // our_tag.original_tagType
AND matchingTag.original_i == our_tag.original_i
]
if found
set matchingTag.fPairTagClosed to true to eliminate matching one opening tag for different closing tags
set our_tag.generated_i to matchingTag.i
set our_tag.generated_id to matchingTag.id
else
generate next our_tag.generated_i incrementally in request segment // in every segment(target, source, request) i starts from 1
generate next our_tag.generated_id incrementally // should be unique across target, source and request segments
save tag in REQUEST_TAGS
}
original type is </g>{
search for matching tag in saved tags:
looking in REQUEST_TAGS in REVERSE for matchingTag which have
[ matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //OPENING_PAIR_TAG
AND matchingTag.original_tagType == G_ELEMENT // our_tag.original_tagType
]
if found
set matchingTag.fPairTagClosed to true to eliminate matching one opening tag for different closing tags
set our_tag.generated_i to matchingTag.i
set our_tag.generated_id to matchingTag.id
else
generate next our_tag.generated_i incrementally in request segment // in every segment(target, source, request) i starts from 1
generate next our_tag.generated_id incrementally // should be unique across target, source and request segments
save tag in REQUEST_TAGS
}
original type is </hi>{
search for matching tag in saved tags:
looking in REQUEST_TAGS in REVERSE for matchingTag which have
[ matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //OPENING_PAIR_TAG
AND matchingTag.original_tagType == HI_ELEMENT // our_tag.original_tagType
]
if found
set matchingTag.fPairTagClosed to true to eliminate matching one opening tag for different closing tags
set our_tag.generated_i to matchingTag.i
set our_tag.generated_id to matchingTag.id
else
generate next our_tag.generated_i incrementally in request segment // in every segment(target, source, request) i starts from 1
generate next our_tag.generated_id incrementally // should be unique across target, source and request segments
save tag in REQUEST_TAGS
}
original type is </sub>{
search for matching tag in saved tags:
looking in REQUEST_TAGS in REVERSE for matchingTag which have
[ matchingTag.fPairTagClosed == false
AND matchingTag.generated_tagType == BPT_ELEMENT //OPENING_PAIR_TAG
AND matchingTag.original_tagType == SUB_ELEMENT // our_tag.original_tagType
]
if found
set matchingTag.fPairTagClosed to true to eliminate matching one opening tag for different closing tags
set our_tag.generated_i to matchingTag.i
set our_tag.generated_id to matchingTag.id
else
generate next our_tag.generated_i incrementally in request segment // in every segment(target, source, request) i starts from 1
generate next our_tag.generated_id incrementally // should be unique across target, source and request segments
save tag in REQUEST_TAGS
}
}
}
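The closing-tag handling above follows one pattern for every element type: search the saved request tags in reverse for an unclosed opening tag of the expected type, reuse its numbers if one is found, and otherwise generate new numbers. A minimal Python sketch of that pattern follows. It is not the t5memory implementation. The names `Tag`, `RequestTags`, `open_pair` and `close_pair` are hypothetical, and the sketch assumes that opening tags simply receive the next free `i` and `id`, with both counters kept per request for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tag:
    original_type: str          # e.g. "g", "hi", "sub", "bx", "ex"
    generated_type: str         # "bpt" (opening pair tag) or "ept" (closing pair tag)
    original_i: int = 0
    generated_i: int = 0
    generated_id: int = 0
    pair_closed: bool = False   # plays the role of fPairTagClosed


class RequestTags:
    """Hypothetical stand-in for the REQUEST_TAGS list of the pseudocode."""

    def __init__(self) -> None:
        self.tags: List[Tag] = []
        self._next_i = 1        # i starts from 1 in every segment
        self._next_id = 1       # assumption: a single id counter is enough here

    def _assign_new_numbers(self, tag: Tag) -> None:
        tag.generated_i, tag.generated_id = self._next_i, self._next_id
        self._next_i += 1
        self._next_id += 1

    def open_pair(self, original_type: str, original_i: int = 0) -> Tag:
        tag = Tag(original_type, "bpt", original_i=original_i)
        self._assign_new_numbers(tag)
        self.tags.append(tag)
        return tag

    def close_pair(self, closing_type: str,
                   opening_type: Optional[str] = None,
                   original_i: Optional[int] = None) -> Tag:
        # </g>, </hi>, </sub> look for an opening tag of their own type;
        # <ex> looks for a <bx> and additionally compares original_i.
        wanted = opening_type or closing_type
        closing = Tag(closing_type, "ept", original_i=original_i or 0)
        match = next(
            (t for t in reversed(self.tags)
             if not t.pair_closed
             and t.generated_type == "bpt"
             and t.original_type == wanted
             and (original_i is None or t.original_i == original_i)),
            None,
        )
        if match is not None:
            match.pair_closed = True            # one opening tag, one closing tag
            closing.generated_i = match.generated_i
            closing.generated_id = match.generated_id
        else:
            self._assign_new_numbers(closing)   # unpaired closing tag
        self.tags.append(closing)
        return closing
```

With nested tags such as `<g><g></g></g>`, the reverse search pairs the first `</g>` with the inner `<g>` and the second with the outer one. A closing tag without an unclosed partner receives fresh numbers instead.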
!!! Note that the source segment should contain only 3 types of tags: PH_ELEMENT, BPT_ELEMENT and EPT_ELEMENT, because all of them were regenerated with their attributes at the import stage.
At this point we read the source and target segments "as is", without any tag replacement in lists, so original_id is the id that was generated_id at the import stage.
SOURCE_SEGMENT{
<ph x="1" />{
search for matching tag in saved tags:
looking in REQUEST_TAGS in REVERSE for matchingTag which have
[ matchingTag.generated_tagType == PH_ELEMENT //or our_tag.original_tagType
AND matchingTag.generated_id == our_tag.original_id
]
if found
set our_tag.generated_tagType = matchingTag.original_tagType
set our_tag.generated_id = matchingTag.original_id
use that data to generate a tag like <our_tag.generated_tagType id="{our_tag.generated_id}" />
else
maybe just return <x/> tag?
save tag in SOURCE_TAGS
}
<bpt i="1" x="2"/> {
search for matching tag in saved tags:
looking in REQUEST_TAGS in REVERSE for matchingTag which have
[ matchingTag.generated_tagType == BPT_ELEMENT //or our_tag.original_tagType
AND matchingTag.generated_id == our_tag.original_id
]
if found
set our_tag.generated_tagType = matchingTag.original_tagType
set our_tag.generated_id = matchingTag.original_id
set our_tag.generated_i = matchingTag.original_i
if matchingTag.original_tagType == BX_ELEMENT // does BX_ELEMENT always have id and rid attributes provided?
use that data to generate a tag like <our_tag.generated_tagType id="{our_tag.generated_id}" rid="{our_tag.generated_id}" />
else
square brackets, as in [rid="{our_tag.generated_id}"], mean that the attribute is optional: add it only if, for example, its value is bigger than 0
use that data to generate a tag like <our_tag.generated_tagType [id="{our_tag.generated_id}"] [rid="{our_tag.generated_id}"] >
else
maybe just return <bx/> tag?
save tag in SOURCE_TAGS
}
<ept i="1" /> {
search for matching tag in saved tags:
looking in REQUEST_TAGS in REVERSE for matchingTag which have
[ matchingTag.generated_tagType == EPT_ELEMENT //or our_tag.original_tagType
AND matchingTag.generated_id == our_tag.original_id // id should hold information about the paired BPT_ELEMENT, or its absence
]
if found
set our_tag.generated_tagType = matchingTag.original_tagType
set our_tag.generated_id = matchingTag.original_id
set our_tag.generated_i = matchingTag.original_i
if matchingTag.original_tagType == EX_ELEMENT // does EX_ELEMENT always have id and rid attributes provided?
use that data to generate a tag like <our_tag.generated_tagType id="{our_tag.generated_id}" rid="{our_tag.generated_id}" />
else
square brackets, as in [rid="{our_tag.generated_id}"], mean that the attribute is optional: add it only if, for example, its value is bigger than 0
use that data to generate a closing tag like </our_tag.generated_tagType>
else
maybe just return <ex/> tag? or add some specific attributes?
save tag in SOURCE_TAGS
}
}
}
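The source-segment handling above can be summarised as a lookup: each stored `ph`, `bpt` or `ept` tag is matched against the request tags by generated type and generated id, and the original tag of the request is emitted in its place. The Python sketch below illustrates this under stated assumptions and is not the t5memory implementation. `RequestTag`, `restore_tag` and `FALLBACK` are hypothetical names. The sketch fills `rid` from the original `i` value, which is an assumption (the pseudocode writes `generated_id` into both attributes), and it uses the bare fallback tags that the pseudocode only proposes as an open question.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTag:
    original_type: str    # tag name used in the request, e.g. "x", "bx", "ex", "g"
    original_id: int
    original_i: int
    generated_type: str   # "ph", "bpt" or "ept"
    generated_id: int


# Fallbacks proposed (with a question mark) in the pseudocode when nothing matches.
FALLBACK = {"ph": "<x/>", "bpt": "<bx/>", "ept": "<ex/>"}


def restore_tag(stored_type: str, stored_id: int,
                request_tags: List[RequestTag]) -> str:
    """Render a stored source tag with the tag name and id used in the request."""
    match = next(
        (t for t in reversed(request_tags)
         if t.generated_type == stored_type and t.generated_id == stored_id),
        None,
    )
    if match is None:
        return FALLBACK[stored_type]
    name = match.original_type
    if stored_type == "ph":
        return f'<{name} id="{match.original_id}"/>'
    if name in ("bx", "ex"):
        # Assumption: rid carries the original i value of the pair.
        return f'<{name} id="{match.original_id}" rid="{match.original_i}"/>'
    if stored_type == "bpt":
        # The id attribute is optional: emit it only when it is bigger than 0.
        attr = f' id="{match.original_id}"' if match.original_id > 0 else ""
        return f"<{name}{attr}>"
    return f"</{name}>"   # closing half of a real pair tag such as </g>
```

For example, if the request contained `<g id="7">` that was stored as a `bpt` with generated id 1, a source tag `bpt` with id 1 is rendered back as `<g id="7">`, and a source tag with no counterpart in the request falls back to the bare placeholder.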
Previous documentation:
|