This documentation is incomplete, especially regarding possible HTTP error codes and error messages from the TermTagger.

Base URL:

http://<host>:9001/termTagger

1. Version & Health Check

Description
Checks if the service is running and returns version information.

Request

curl -X GET "http://<host>:9001/termTagger" \
  -H "Accept: text/html"

Response (200 OK, shortened)

<html>
  <h1>TermTagger Version Information</h1>
  <h2>TermTagger REST Server</h2>
  <b>Version:</b> 0.16
  <h2>TermTagger:</h2>
  <b>Version:</b> 9.01
  <h2>OpenTMS Version: </h2>0.2.1
</html>

2. Upload TBX File

Description
Uploads a TBX file to the TermTagger.
The TBX file is referenced using its MD5 hash as its ID.

The x-tbxid header is optional. It is only evaluated by an optional nginx proxy in front of the TermTagger instances, which uses it to route requests for the same TBX file to the same instance.


Request

curl -X POST "http://<host>:9001/termTagger/tbxFile/" \
  -H "Content-Type: application/json" \
  -H "x-tbxid: 2b412073d5185fa4b8d7831e0ee6472d" \
  -d '{
    "tbxFile": "2b412073d5185fa4b8d7831e0ee6472d",
    "tbxdata": "<?xml version=\"1.0\"?><martif>...</martif>"
  }'

Response (200 OK)

{
  "uuid": "2b412073d5185fa4b8d7831e0ee6472d",
  "added": true
}
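
The upload request above can be assembled programmatically. The following is a minimal sketch, not the official client: the function name build_upload_request is hypothetical, and it assumes the ID is the MD5 hex digest of the UTF-8 encoded TBX content, which this page does not state explicitly.

```python
import hashlib
import json


def build_upload_request(tbx_xml: str) -> tuple[dict, bytes]:
    """Return (headers, body) for POST /termTagger/tbxFile/."""
    # Assumption: the ID is the MD5 hex digest of the UTF-8 encoded TBX content.
    tbx_id = hashlib.md5(tbx_xml.encode("utf-8")).hexdigest()
    headers = {
        "Content-Type": "application/json",
        # Optional header, only evaluated by an nginx proxy in front of the service.
        "x-tbxid": tbx_id,
    }
    # json.dumps takes care of escaping the quotes inside the XML.
    body = json.dumps({"tbxFile": tbx_id, "tbxdata": tbx_xml}).encode("utf-8")
    return headers, body
```

Letting json.dumps serialize the body avoids the manual escaping of quotes that the curl example needs for the XML declaration.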

3. Check TBX File

Description
Checks whether a TBX file with the given ID exists in memory.

Request

curl -I "http://<host>:9001/termTagger/tbxFile/2b412073d5185fa4b8d7831e0ee6472d" \
  -H "x-tbxid: 2b412073d5185fa4b8d7831e0ee6472d"

Response

  • 200 OK – File exists
  • 404 Not Found – File not found
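
A client can turn these two status codes into a boolean. The sketch below uses only the Python standard library. The names exists_from_status and tbx_is_loaded are hypothetical, base_url is expected to be the base URL from the top of this page, and treating every other status as an error is an assumption, because the possible error codes are not documented yet.

```python
import urllib.error
import urllib.request


def exists_from_status(status: int) -> bool:
    """Map the documented status codes to 'TBX file is loaded'."""
    if status == 200:
        return True
    if status == 404:
        return False
    # Assumption: anything else is unexpected and should not be hidden.
    raise RuntimeError(f"unexpected TermTagger status: {status}")


def tbx_is_loaded(base_url: str, tbx_id: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request (like curl -I) for the given TBX ID."""
    request = urllib.request.Request(
        f"{base_url}/tbxFile/{tbx_id}",
        method="HEAD",
        headers={"x-tbxid": tbx_id},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return exists_from_status(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses, so 404 arrives here.
        return exists_from_status(error.code)
```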

4. Tag Text with Terms

Description
Sends segments to the TermTagger to find and mark terms from the loaded TBX file.

Batch size: Multiple segments can be tagged in one call. The larger the batch, the longer the response takes. An optimal batch size is somewhere between 5 and 25 segments, depending also on the segment size.
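
Splitting a larger list of segments into batches could look like the sketch below. The helper name batch_segments and the default of 20 are illustrative choices within the recommended range, not values prescribed by the TermTagger.

```python
from typing import Iterator


def batch_segments(segments: list[dict], batch_size: int = 20) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most batch_size segments."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(segments), batch_size):
        yield segments[start:start + batch_size]
```

Each yielded list can be used as the "segments" array of one tagging request. For long segments a smaller batch_size is probably the safer choice.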

Request

curl -X POST "http://<host>:9001/termTagger/termTag/" \
  -H "Content-Type: application/json" \
  -H "x-tbxid: 2b412073d5185fa4b8d7831e0ee6472d" \
  -d '{
    "tbxFile": "2b412073d5185fa4b8d7831e0ee6472d",
    "sourceLang": "en",
    "targetLang": "en",
    "segments": [
      {
        "id": "387278",
        "field": "targetEdit",
        "source": "Outstanding features",
        "target": "Outstanding features"
      }
    ],
    "debug": 0,
    "fuzzy": 0,
    "stemmed": 1,
    "fuzzyPercent": 70,
    "maxWordLengthSearch": 2,
    "minFuzzyStartLength": 2,
    "minFuzzyStringLength": 5,
    "targetStringMatch": 0,
    "task": "{a4393eb5-46a7-4f5e-ba1a-70873c74a7a6}"
  }'

Response (200 OK, example)

{
  "bCorrectRequest": true,
  "segments": [
    {
      "field": "targetEdit",
      "id": "387278",
      "source": "Outstanding features",
      "target": "Outstanding features"
    }
  ],
  "tbxFile": "2b412073d5185fa4b8d7831e0ee6472d"
}

Note:
Matches are marked in the HTML of the segments with <div class="term ...">:

<div class="term preferredTerm transFound exact" data-tbxid="term_01_1_en_1_00003">VisualTranslation</div>

Internal Tags

The TermTagger only works properly with HTML syntax, so internal tags in the segment content have to be replaced with img tags:

<img class="content-tag" src="1" alt="TaggingError"  />

Whether the class "content-tag" and the alt text "TaggingError" are actually required is currently unclear. However, translate5 has been sending its img tag placeholders to the TermTagger in this form for years.

The number in the src attribute is an ID that lets the client identify the original tag. It has no meaning on the TermTagger side.

The TermTagger returns the same img tags in its response.
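
The replace-and-restore round trip can be sketched as follows. The client-side tag syntax matched by INTERNAL_TAG (for example <1>, </1> or <2/>) is purely hypothetical, as are the function names. The sketch also assumes the TermTagger returns the placeholders with the same attribute order in which they were sent.

```python
import re

# Hypothetical client-side tag syntax, e.g. <1>, </1> or <2/>.
INTERNAL_TAG = re.compile(r"</?\d+/?>")
# Placeholder as documented above; whitespace before "/>" may vary.
PLACEHOLDER = re.compile(r'<img class="content-tag" src="(\d+)" alt="TaggingError"\s*/>')


def protect_tags(text: str) -> tuple[str, dict[str, str]]:
    """Replace internal tags with img placeholders and remember the originals."""
    originals: dict[str, str] = {}

    def to_placeholder(match: re.Match) -> str:
        tag_id = str(len(originals) + 1)  # running number used as src ID
        originals[tag_id] = match.group(0)
        return f'<img class="content-tag" src="{tag_id}" alt="TaggingError" />'

    return INTERNAL_TAG.sub(to_placeholder, text), originals


def restore_tags(text: str, originals: dict[str, str]) -> str:
    """Put the original tags back in place of the img placeholders."""
    return PLACEHOLDER.sub(lambda match: originals[match.group(1)], text)
```

Because the src value has no meaning for the TermTagger, a simple running number per segment is enough to map each placeholder back to its tag.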

Terminology Tags

The TermTagger uses div tags to mark the text containing terminology.

<div class="term preferredTerm transFound exact" data-tbxid="term_01_1_en_1_00003">VisualTranslation</div>

Each div carries several CSS classes that flag properties of the term. The ID of the term from the TBX file is added in the data-tbxid attribute.

CSS Classes: 

  • term: always set.
  • preferredTerm: the normativeAuthorization value of the term (preferredTerm in the example above).
  • transFound|transNotFound: indicates whether the term was found in the target. This flag is unreliable on the TermTagger side and is corrected by translate5 itself.
  • exact: indicates whether the term was matched exactly, as opposed to by stemming or fuzzy matching.
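
To read the term markup back out of a tagged segment, the div tags can be collected with Python's standard html.parser module. This is only a sketch: the names TermDivParser and extract_terms are hypothetical, and it assumes the returned segment HTML is well-formed enough that every div is closed.

```python
from html.parser import HTMLParser


class TermDivParser(HTMLParser):
    """Collect <div class="term ..."> elements from tagged segment HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.terms: list[dict] = []
        self._open = []  # one entry per open <div>: a term dict or None

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return  # img placeholders and other tags are ignored
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "term" in classes:
            term = {"tbxid": attributes.get("data-tbxid"), "classes": classes, "text": ""}
            self.terms.append(term)
            self._open.append(term)
        else:
            self._open.append(None)

    def handle_endtag(self, tag):
        if tag == "div" and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to every term div that is currently open.
        for term in self._open:
            if term is not None:
                term["text"] += data


def extract_terms(segment_html: str) -> list[dict]:
    parser = TermDivParser()
    parser.feed(segment_html)
    parser.close()
    return parser.terms
```

Each returned entry holds the data-tbxid value, the list of CSS classes, and the marked text, so a client can check for flags such as "exact" or "transNotFound" per term.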


5. Remove TBX File

Description
Removes a TBX file from the TermTagger memory.

Request

curl -X DELETE "http://<host>:9001/termTagger/tbxFile/2b412073d5185fa4b8d7831e0ee6472d" \
  -H "x-tbxid: 2b412073d5185fa4b8d7831e0ee6472d"

Response

  • 200 OK
  • Empty body

Example Workflow

# 1. Check service
curl -X GET "http://<host>:9001/termTagger"

# 2. Upload TBX file
curl -X POST "http://<host>:9001/termTagger/tbxFile/" ...

# 3. Process segments
curl -X POST "http://<host>:9001/termTagger/termTag/" ...

# 4. Delete TBX file
curl -X DELETE "http://<host>:9001/termTagger/tbxFile/<id>" ...
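
The same four steps can be sketched in Python. To keep the example independent of any HTTP library, the transport is passed in as a callable send(method, path, payload) that returns (status, body). This callable, the function name tag_with_tbx and the options parameter are all hypothetical. Which request fields are mandatory is not documented, so the sketch simply merges whatever options the caller supplies (for example the values from the tagging request in section 4).

```python
import json
from typing import Callable, Optional

# send(method, path, payload) -> (status, body); supplied by the caller.
Send = Callable[[str, str, Optional[dict]], tuple[int, str]]


def tag_with_tbx(send: Send, tbx_id: str, tbx_data: str, segments: list[dict],
                 source_lang: str, target_lang: str, options: dict) -> list[dict]:
    # 1. Check service
    status, _ = send("GET", "/termTagger", None)
    if status != 200:
        raise RuntimeError(f"TermTagger not available (HTTP {status})")

    # 2. Upload TBX file
    status, _ = send("POST", "/termTagger/tbxFile/",
                     {"tbxFile": tbx_id, "tbxdata": tbx_data})
    if status != 200:
        raise RuntimeError(f"TBX upload failed (HTTP {status})")

    try:
        # 3. Process segments
        payload = {**options, "tbxFile": tbx_id, "sourceLang": source_lang,
                   "targetLang": target_lang, "segments": segments}
        status, body = send("POST", "/termTagger/termTag/", payload)
        if status != 200:
            raise RuntimeError(f"Tagging failed (HTTP {status})")
        return json.loads(body)["segments"]
    finally:
        # 4. Delete TBX file, even if tagging failed
        send("DELETE", f"/termTagger/tbxFile/{tbx_id}", None)
```

Deleting the TBX file in a finally block is a design choice of this sketch. A client that tags many batches against the same TBX file would keep it loaded and remove it only once at the end.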



