Request example:
{ // required fields
"sourceLang":"en-GB", // languages are checked against languages.xml
"targetLang":"de",
"source":"For > 100 setups.",
// optional fields
["documentName":"OBJ_DCL-0000000845-004_pt-br.xml"],
["segmentNumber":15],
["markupTable":"OTMXUXLF"], // if no markup table is given, the default OTMXUXLF is used.
// Markup tables should be located at ~/.t5memory/TABLE/%markup$.TBL
["context":"395_408"],
["numOfProposals":20], // number of proposals expected in the output; the default is 5
["loggingThreshold": 0]
}
Response example:
Success:
{
"ReturnValue": 0,
"ErrorMsg": "",
"NumOfFoundProposals": 1,
"results": [
{
"source": "For > 100 setups.",
"target": "Für > 100 Aufstellungen.",
"segmentNumber": 10906825,
"id": "",
"documentName": "none",
"documentShortName": "NONE",
"sourceLang": "en-GB",
"targetLang": "de-DE",
"type": "Manual",
"matchType": "Exact", // either Exact or Fuzzy
"author": "",
"timestamp": "20190401T084052Z",
"matchRate": 100,
"fuzzyWords": -1, // for an exact match, both fuzzyWords and fuzzyDiffs are -1
"fuzzyDiffs": -1, // otherwise they hold the number of parsed words and diffs that were
// used in the fuzzy match rate calculation
"markupTable": "OTMXML",
"context": "",
"additionalInfo": ""
}
]
}
Not found:
{
"ReturnValue": 133,
"ErrorMsg": "OtmMemoryServiceWorker::concordanceSearch::"
}
Exact matching uses a function that compares strings while ignoring whitespace. It first compares the normalized strings (without tags).
If the normalized strings are identical, t5memory then compares the strings with tags and returns a match rate of 100 or 97, depending on the result.
It then checks the context match rate and whether the document name is the same (case-insensitive).
It then checks and modifies exactMatchRate according to the code in the code block below.
After that, only exact matches with usMatchLevel >= 100 are stored. If there are no exact matches, the fuzzy match calculation begins.
If there is at least one exact match, all fuzzy matches are skipped.
If there is exactly one exact match, its rate is set to 102.
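The selection rules above can be sketched roughly as follows. This is a minimal illustration, not t5memory's code: the Proposal struct and the selectExactMatches function are hypothetical names, and only the usMatchLevel >= 100 filter and the single-match rate of 102 are taken from the description.

```cpp
#include <vector>

// Hypothetical proposal record; only the match level matters for this sketch.
struct Proposal {
    unsigned short usMatchLevel;
};

// Keeps only candidates with usMatchLevel >= 100. If exactly one remains,
// its rate is raised to 102. An empty result means the caller would fall
// back to the fuzzy calculation.
std::vector<Proposal> selectExactMatches(const std::vector<Proposal>& candidates) {
    std::vector<Proposal> exact;
    for (const Proposal& p : candidates) {
        if (p.usMatchLevel >= 100) {
            exact.push_back(p);
        }
    }
    if (exact.size() == 1) {
        exact.front().usMatchLevel = 102;
    }
    return exact;
}
```

With candidates rated 97 and 100, only the second one is kept and it is reported as 102; with two candidates rated 100, both are kept unchanged.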
For the fuzzy rate calculation, t5memory counts the words and then the diffs in the normalized string (without tags) and applies this formula:
if (usDiff < usWords )
{
  *pusFuzzy = (usWords != 0) ? ((usWords - usDiff)*100 / usWords) : 100;
}
else
{
  *pusFuzzy = 0;
} /* endif */
Regarding the Number Protection feature: tags from number protection are replaced with the regexHashes from their attributes, so each of them counts as one word. NP tags with the same regex are counted as equal.
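To make the arithmetic of the formula concrete, here is a small sketch that wraps it in a function. The function name fuzzyRate is hypothetical, and the sketch assumes that USHORT is an unsigned 16-bit type; the expression itself is copied from the formula above.

```cpp
// Assumption: USHORT in the original code is an unsigned 16-bit integer.
typedef unsigned short USHORT;

// Percentage of words that are not counted as diffs, using integer division.
USHORT fuzzyRate(USHORT usWords, USHORT usDiff) {
    USHORT usFuzzy = 0;
    if (usDiff < usWords) {
        usFuzzy = (usWords != 0)
                      ? static_cast<USHORT>((usWords - usDiff) * 100 / usWords)
                      : static_cast<USHORT>(100);
    }
    return usFuzzy;
}
```

For example, 10 words with 2 diffs give 80, and 7 words with 1 diff give 85, because 600 / 7 is truncated by integer division. When the number of diffs is equal to or greater than the number of words, the rate is 0.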
To count diffs, t5memory goes through both segments to find matching tokens, building a so-called snake: a line of matching tokens.
It then marks the unmatched tokens as INSERTED or DELETED and calculates the diffs based on that.
If the rate is 100%, the tags are added back and the strings are compared again.
If they are then not equal, the match rate is changed as shown here. This probably never happens, because the exact match test runs before the fuzzy one,
and the exact test is done even if the triplesHashes differ (this is a pre-fuzzy calculation, and if they are equal, it could be a flag that triggers the exact test).
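As a rough illustration of the idea, the sketch below aligns two token sequences and counts the tokens left unmatched on each side. It is not t5memory's algorithm: the snake search is replaced here by a plain dynamic-programming longest common subsequence, the names DiffCounts and countTokenDiffs are hypothetical, and how the INSERTED and DELETED counts are combined into the final diff number is not covered.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct DiffCounts {
    std::size_t deleted;   // tokens that appear only in the first segment
    std::size_t inserted;  // tokens that appear only in the second segment
};

DiffCounts countTokenDiffs(const std::vector<std::string>& a,
                           const std::vector<std::string>& b) {
    // lcs[i][j] is the length of the longest common subsequence
    // of the first i tokens of a and the first j tokens of b.
    std::vector<std::vector<std::size_t>> lcs(
        a.size() + 1, std::vector<std::size_t>(b.size() + 1, 0));
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            lcs[i][j] = (a[i - 1] == b[j - 1])
                            ? lcs[i - 1][j - 1] + 1
                            : std::max(lcs[i - 1][j], lcs[i][j - 1]);
        }
    }
    const std::size_t matched = lcs[a.size()][b.size()];
    // Every token outside the common subsequence is unmatched.
    return {a.size() - matched, b.size() - matched};
}
```

Comparing the tokens of "For > 100 setups." with those of "For > 200 setups." leaves one unmatched token on each side, that is, one DELETED and one INSERTED token.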
if ( !fStringEqual )
{
  if ( usFuzzy > 3 )
  {
    usFuzzy -= 3;
  }
  else
  {
    usFuzzy = 0;
  } /* endif */
  usFuzzy = std::min( (USHORT)99, usFuzzy );
} /* endif */
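The effect of this adjustment is easier to see as a standalone function. The sketch below reuses the logic of the snippet above; the function name applyTagMismatchPenalty is hypothetical, and USHORT is assumed to be an unsigned 16-bit type.

```cpp
#include <algorithm>

// Assumption: USHORT in the original code is an unsigned 16-bit integer.
typedef unsigned short USHORT;

// Lowers the rate by 3 (not below 0) and caps it at 99
// when the strings with tags are not equal.
USHORT applyTagMismatchPenalty(USHORT usFuzzy, bool fStringEqual) {
    if (!fStringEqual) {
        if (usFuzzy > 3) {
            usFuzzy -= 3;
        } else {
            usFuzzy = 0;
        }
        usFuzzy = std::min(static_cast<USHORT>(99), usFuzzy);
    }
    return usFuzzy;
}
```

A rate of 100 with differing tags becomes 97, a rate of 2 becomes 0, and a rate of 100 with equal strings stays at 100.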
Then, depending on the type of translation, the rate may be adjusted further:
if ( (usModifiedTranslationFlag == TRANSLFLAG_MACHINE) && (usFuzzy < 100) )
{
  // ignore machine fuzzy matches
}
else if ( usFuzzy > TM_FUZZINESS_THRESHOLD )
{
  /********************************************************/
  /* give MT flag a little less fuzziness */
  /********************************************************/
  if ( usModifiedTranslationFlag == TRANSLFLAG_MACHINE )
  {
    if ( usFuzzy > 1 )
    {
      usFuzzy -= 1;
    }
    else
    {
      usFuzzy = 0;
    } /* endif */
  } /* endif */
  if (usFuzzy == 100 && (pGetIn->ulParm & GET_RESPECTCRLF) && !fRespectCRLFStringEqual )
  { // P018279!
    usFuzzy -= 1;
  }
  // add to resulting set
} /* endif */
} /* endif */
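The branch logic above can be condensed into a self-contained sketch. The function adjustForTranslationType and its parameters are hypothetical names introduced for illustration; in particular, the threshold is passed in as an argument because the value of TM_FUZZINESS_THRESHOLD is not given here, and USHORT is assumed to be an unsigned 16-bit type.

```cpp
// Assumption: USHORT in the original code is an unsigned 16-bit integer.
typedef unsigned short USHORT;

// Returns true if the proposal would be added to the result set.
// usFuzzy is adjusted in place only in that case.
bool adjustForTranslationType(USHORT& usFuzzy, bool isMachine, USHORT threshold,
                              bool respectCRLF, bool crlfStringEqual) {
    if (isMachine && usFuzzy < 100) {
        return false;  // machine fuzzy matches are ignored
    }
    if (usFuzzy <= threshold) {
        return false;  // below or at the fuzziness threshold
    }
    if (isMachine) {
        // machine translations get a slightly lower rate
        usFuzzy = (usFuzzy > 1) ? static_cast<USHORT>(usFuzzy - 1)
                                : static_cast<USHORT>(0);
    }
    if (usFuzzy == 100 && respectCRLF && !crlfStringEqual) {
        usFuzzy -= 1;  // line-break difference on an otherwise full match
    }
    return true;
}
```

Under these assumptions, a machine translation rated 100 is kept and lowered to 99, a machine translation rated 90 is dropped, and a manual translation rated 100 whose line breaks differ is lowered to 99 when CRLF is respected.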
At the end, the fuzzy request replaces the tags in the proposal from the TM with the tags from the request, and if matchRate >= 100, it calculates the whitespace diffs and applies matchRate -= wsDiffs.
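The final whitespace adjustment amounts to the following small sketch. The function name applyWhitespacePenalty is hypothetical, plain int is used for both values as an assumption, and how wsDiffs is counted is not shown here.

```cpp
// Subtracts the whitespace diffs only from rates of 100 or more;
// lower rates are returned unchanged.
int applyWhitespacePenalty(int matchRate, int wsDiffs) {
    if (matchRate >= 100) {
        matchRate -= wsDiffs;
    }
    return matchRate;
}
```

For example, a rate of 100 with 2 whitespace diffs becomes 98, a rate of 102 with 1 whitespace diff becomes 101, and a rate of 95 is left as it is.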