
Version and versioning

Current translate5 version: 7.27.0
Changelogs documented up to version: 7.25.1


What is the ChatGPT plug-in used for?

The plug-in enables you to create, train and use language resources based on Large Language Models (LLMs) within translate5. You can select from various GPT models to serve as the basis for a language resource; some of them can be trained. These GPT language resources can then be trained with prompts and translation examples as needed. In addition, they can be fine-tuned via the “Temperature”, “Top‑p”, “Presence” and “Frequency” parameters.


Sections and overviews

In translate5, the following sections are relevant for the management of GPT language resources:

  • The language resource overview:
    Language resources based on GPT models are created here.
  • The “Adjust OpenAI model” window:
    Is called up in the language resource overview and offers the option of fine-tuning models using various parameters.
  • The “Fine-tune OpenAI model” window:
    Is called up in the language resource overview and offers the possibility to test and train models with prompts and translation memories.
  • “Prompt management” in Preferences:
    You can register prompts here in order to use them as instructions when pre-translating with GPT language resources.

Creation, fine-tuning and training of GPT language resources as well as prompt management are all available to project managers.

Available engines/models

OpenAI

The models available are continuously queried at OpenAI’s platform and therefore correspond to what is currently available there.

Azure

The models offered here correspond to those you have deployed in your Azure cloud.


Creating GPT language resources

A language resource based on a GPT model is created like any other language resource in the language resource overview:

  1. Click on the button to open the “Add language resource” window.
  2. From the “Resource” drop-down, select the option “ChatGPT (OpenAI / MS Azure)”.
  3. Select the model you want to use from the “Engine/Model” drop-down.
    The models appearing at the top of the list are trainable, which is also indicated accordingly in their designation.
  4. Enter a meaningful name in the “Name” field.
  5. Use the “Source language” and “Target language” drop-downs to specify the language combination for which the language resource should be created.
  6. Select the clients for whom the language resource should be used from the “Use for selected clients” drop-down.
  7. Under “Read access by default”, select the clients to whose projects the language resource should be added with read rights by default.
  8. If required, select the clients for whose projects the language resource should be used in pivot language projects by default from the “Use as pivot by default” drop-down.
  9. Select the colour with which matches from this language resource should be coloured in the matches and concordance panel from the “Color” drop-down.
  10. Confirm by clicking the “Save” button.

The language resource is then created and will be visible immediately afterwards in the language resource overview.

Start typing in drop-down fields to find options more quickly. For languages, for example, you can type the ISO code: “de-de” will find “German (Germany) (de-DE)”.


Managing GPT language resources

The following options are available for GPT language resources in the language resource overview:

  • Opens the “Edit language resource” window. The basic settings can no longer be edited; however, clients can be added or removed for whom the language resource should be:
    • used;
    • used with read rights by default;
    • used as a pivot language resource by default.
  • Deletes the language resource. The deletion process must be completed by confirming it in a window that appears after clicking the button.
  • Opens the “Adjust OpenAI model” window, in which you can adjust various parameters to fine-tune the GPT language resource.
  • Opens the “Fine-tune OpenAI model” window, via which you can train and test the GPT resource with prompts.

Fine-tuning GPT language resources

In the “Adjust OpenAI model” window, both trainable and untrainable GPT resources can be fine-tuned using the following parameters:

  • Use fine-tuning default system-message when translating with a trained model:
    • If this box is ticked, the standard prompt (the top prompt in the training window) will be used as instruction when pre-translating with the model.
    • If the box remains deselected, the following prompt will be used as instruction when pre-translating with the model: “Translate the following segments encapsulated in JSON Objects with the index and the segment as properties from [source language] as source language to [target language] as target language using all segments as context for each other” (see the JSON sketch after this list).
  • Use all trained system messages when translating with a trained model:
    • If this box is ticked, the user-specific prompts will be used as instructions when pre-translating with the model. These can be selected in the “Fine-tune OpenAI Model” window.
  • Generation Sensitivity / Temperature: Values from 0 to 2, with up to two decimal places.
  • Probability Threshold / Top P: Values from 0 to 1, with up to two decimal places.
  • Presence Penalty: Values from 0 to 2, with up to two decimal places.
  • Frequency Penalty: Values from 0 to 2, with up to two decimal places.
  • Max. target tokens (% of source tokens): Limits how many tokens the returned translation may use relative to the source tokens; explained in detail in the section of the same name below.
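The default prompt quoted above refers to segments “encapsulated in JSON Objects with the index and the segment as properties”. The exact wire format translate5 uses is internal; the following minimal Python sketch only mirrors the wording of that prompt with two illustrative segments:

```python
import json

# Hypothetical batch of two segments, mirroring the default prompt's
# description: one JSON object per segment, with the index and the
# segment text as properties (illustration only, not translate5's
# internal format).
batch = [
    {"index": 1, "segment": "The quick brown fox jumps over the lazy dog."},
    {"index": 2, "segment": "Press the green button to start the import."},
]
print(json.dumps(batch, indent=2))
```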

A token is the smallest linguistic unit that a language model like GPT processes. It can be a single character, a word, or a part of a word – depending on the language and structure of the text. For example, the word “translation” is usually counted as one token, whereas a long or compound term can comprise several tokens.
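To get a feeling for how a text is split into tokens, you can experiment outside of translate5 with OpenAI’s tiktoken library; a small sketch (the encoding name fits many recent GPT models, but check which encoding your model actually uses):

```python
import tiktoken

# "cl100k_base" is the encoding used by many recent OpenAI chat models;
# adjust it to the model you actually work with.
enc = tiktoken.get_encoding("cl100k_base")

for word in ["translation", "Donaudampfschifffahrt"]:
    tokens = enc.encode(word)
    print(word, "->", len(tokens), "token(s):", tokens)
```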

Parameters for fine-tuning

How creative should the translation be?

Generation sensitivity / Temperature

This parameter determines how “random” or “creative” the language model should be when generating text. A low Temperature means that the model translates more conservatively and predictably, while a higher Temperature means that it translates more creatively and therefore less predictably.

Probability threshold / Top P

The “Top P” parameter (also known as “nucleus sampling”) is a nuanced alternative to Temperature-based sampling. It acts like a “spotlight” that restricts generation to the most probable words. With the default value of 1.0, all words are taken into account. This parameter can help to control the distribution of word choice and thus ensure the relevance and coherence of the generated text.
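To make the interplay of the two parameters concrete, here is a toy sketch of temperature scaling followed by top-p (“nucleus”) filtering over a small hand-made distribution. It is illustrative only; real models work on logits over the full vocabulary:

```python
import math
import random

def sample(logits, temperature=1.0, top_p=1.0):
    """Toy sampler: temperature rescales the logits, top_p keeps only the
    smallest set of tokens whose cumulative probability reaches top_p."""
    # Temperature scaling: lower values sharpen, higher values flatten.
    scaled = {t: l / temperature for t, l in logits.items()}
    z = sum(math.exp(l) for l in scaled.values())
    probs = {t: math.exp(l) / z for t, l in scaled.items()}
    # Nucleus filtering: keep the most probable tokens up to mass top_p.
    kept, mass = {}, 0.0
    for t, p in sorted(probs.items(), key=lambda x: -x[1]):
        kept[t] = p
        mass += p
        if mass >= top_p:
            break
    tokens, weights = zip(*kept.items())
    return random.choices(tokens, weights=weights)[0]

# Hand-made "logits" for three candidate translations of "house":
logits = {"Haus": 2.0, "Gebäude": 1.0, "Hütte": 0.2}
print(sample(logits, temperature=0.7, top_p=0.9))
```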

Attention: If Temperature is set to a very high value, it is possible that the model will generate contradictory or meaningless content.

It is advisable to adjust either the Temperature or top P, but not both.

Please have a look at this page for further information on the two parameters Temperature and Top P.
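In the OpenAI API that the plug-in talks to, both values are plain request parameters. A minimal sketch using OpenAI’s public Python SDK (the model name and parameter values are illustrative; translate5 sets these for you based on the “Adjust OpenAI model” window):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system", "content": "You are a machine translation engine."},
        {"role": "user", "content": "Translate to German: The file was imported."},
    ],
    temperature=0.3,  # keep translations predictable
    top_p=1.0,        # leave nucleus sampling at its default
)
print(response.choices[0].message.content)
```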

How varied should the translation be?

Presence Penalty

This parameter is used to encourage the model to include a wide range of tokens in the generated text. It is a value that is deducted from the log probability of a token as soon as that token has appeared in the generated text. A high Presence Penalty value means that the model tends to generate tokens that are not yet contained in the generated text.

Frequency Penalty

This parameter is used to prevent the model from using the same words or phrases too often within the generated text. It is a value that is deducted from the log probability of a token each time it appears in the generated text. A high Frequency Penalty value means that the model is more careful when using recurring tokens.

Please have a look at this page for further information on the two parameters Presence Penalty and Frequency Penalty.
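OpenAI describes both penalties as adjustments to a token’s logit before sampling: the frequency penalty is scaled by how often the token has already appeared, while the presence penalty is applied once as soon as it has appeared at all. A sketch of that formula (variable names are illustrative):

```python
def apply_penalties(logits, counts, presence_penalty=0.0, frequency_penalty=0.0):
    """Adjust raw logits per OpenAI's documented penalty formula:
    mu[j] -> mu[j] - c[j] * frequency_penalty - (c[j] > 0) * presence_penalty,
    where c[j] is how often token j already occurs in the generated text."""
    return {
        token: logit
        - counts.get(token, 0) * frequency_penalty                    # grows per repetition
        - (1 if counts.get(token, 0) > 0 else 0) * presence_penalty   # flat once present
        for token, logit in logits.items()
    }

# A token that already appeared twice is penalized more strongly:
print(apply_penalties({"und": 1.5, "sowie": 1.0}, {"und": 2},
                      presence_penalty=0.5, frequency_penalty=0.3))
```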

Max. target tokens (% of source tokens)

A GPT model can only ever process a limited number of tokens. This maximum number of tokens includes both the sent and the returned tokens. In the case of a (pre-)translation, this includes the prompt(s), the text or batch to be translated and the returned translations. An appropriate ratio must be maintained so that there is enough “space” for the returned tokens. This setting is only relevant for batch translations, such as those used for pre-translation.
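A worked example of this ratio, with purely hypothetical numbers rather than translate5 defaults:

```python
# Hypothetical numbers: a model context window shared by prompt, source
# batch and returned translation, and a "max. target tokens" setting of
# 150% of the source tokens.
CONTEXT_WINDOW = 4096   # assumed total token limit (sent + returned)
PROMPT_TOKENS = 200     # assumed size of the instruction prompt(s)
TARGET_PERCENT = 150    # target may use up to 150% of the source tokens

# prompt + source + source * (TARGET_PERCENT / 100) <= CONTEXT_WINDOW
max_source_tokens = int((CONTEXT_WINDOW - PROMPT_TOKENS) / (1 + TARGET_PERCENT / 100))
print(max_source_tokens)  # -> 1558 source tokens fit into one batch
```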


Prompt management


Creating and managing prompt sets

translate5 has a standard prompt for training GPT language resources. It reads “You are a machine translation engine and translate single texts or multiple segments from [source language] as source language to [target language] as target language”.
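Conceptually, each prompt acts as a system message and each example sentence pair as a user/assistant exchange. For orientation, this is what one training example looks like in OpenAI’s public chat fine-tuning format; translate5 assembles this data internally, and the sentences below are illustrative:

```python
import json

# One training example in OpenAI's chat fine-tuning format (JSONL: one
# such object per line). The system message is the prompt; the
# user/assistant pair is one translation example.
example = {
    "messages": [
        {"role": "system", "content": (
            "You are a machine translation engine and translate single texts "
            "or multiple segments from English as source language to German "
            "as target language"
        )},
        {"role": "user", "content": "The settings are saved automatically."},
        {"role": "assistant", "content": "Die Einstellungen werden automatisch gespeichert."},
    ]
}
print(json.dumps(example, ensure_ascii=False))
```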

For specific translation requirements, you can also enter your own prompts in translate5. To do this, navigate to the Prompt management in the translate5 settings.


Prompt set management

The management of the prompt sets includes:

  1. a search field with which all prompts can be searched;
  2. the “Create new prompt” button;
  3. the “Refresh” button to reload the view; and
  4. the list with all saved prompts.

Prompt list

The prompt list shows all prompt sets saved in translate5. It is made up of the following columns:

  • Id: The ID of the prompt set; it is assigned and incremented automatically.
  • Name: The name of the prompt set, which helps to identify it.
  • Comment: More detailed information on the prompt set can be entered in the comment field.
  • Languages: The source and target language variant(s) for which the respective prompt set was recorded. Prompt sets can be created for several language combinations.
  • System message: The actual contents of the prompt set are displayed in this column.
  • Created: Date and time are recorded here so that the time at which the respective prompt set was saved can be traced.
  • Last change: The date and time at which the last change was made to the respective prompt set.
  • Actions: This column contains the following buttons for each prompt set:
    • One opens the prompt set for editing in the “Prompt details” window.
    • One deletes the prompt set; the deletion must be confirmed in a pop-up window.

Entering meta information

In the upper area of the “Prompt details” window, the following information can be noted for each prompt and example set:

  1. The ID for the respective set is automatically assigned and incremented.
  2. Enter a name for the current set in the “Name” field.
  3. You can describe the set in more detail in the “Comment” field.
  4. The “Save” button saves the current input of meta information as well as the currently entered prompts.

Adding prompts

The prompts added are listed in this window area.

  1. To add a prompt, click on the “Add message” button so that a (further) red line appears.
  2. Click on the “Type new message here” line and insert the prompt you want to add.
  3. If you want to undo the changes made to the selection of prompts since they were last saved, click the “Reset” button.
  4. Click the “Save” button above if you want to save the meta information and the current selection of prompts.

Adding example sentence sets

Sets with one or more translation examples can now be entered in the lower section of the window to indicate expected translation results. The example sentences can be defined for different language combinations.

  1. To create a new set of example sentences, click on “Add example set”.
  2. The “Create example sets for” window opens, in which you can specify the example set’s language combination. Confirm by clicking “OK”.
  3. To add a new example sentence pair, click on “Add example”.
  4. An orange line appears for the source sentence and a yellow line for the target sentence of the translation example.
  5. Individual examples can be removed by clicking on the cross.
  6. The newly added examples are saved by clicking on the “Save” button.
  7. If you want to undo the changes made to the currently displayed examples since they were last saved, click the “Reset” button.
  8. The “Push sources” button can be used to transfer the source sentences of the currently selected example set to all other matching example sets. “Matching” means all example sets whose source language matches that of the current example set. The variant of the source language is ignored, i.e. the source sentences from an example set with source language German (Germany) are also transferred to example sets with source language German (Switzerland) or German without variant. The target sentences specific to the respective example set can then be added to the other example sets to which the source sentences have been transferred.

Start typing in drop-down fields to find options more quickly. For languages, for example, you can type the ISO code: “de-de” will find “German (Germany) (de-DE)”.

With the “Push sources” action, source sentences are only transferred to another language combination’s example sets if no existing source segment has a similarity value of 4 or more. This is intended to prevent the addition of duplicates.


Messages related to example sets

The rectangles that appear in this line function as buttons that can be used to switch between the example sets for the different language combinations.

At the same time, the coloured upper left and/or lower left corners show whether there are unsaved changes (upper left corner coloured) or empty source and/or target language segments (lower left corner coloured) in the respective example set.

If you move the mouse over a language combination button, a text field appears with the corresponding information and an indication of how many examples the respective set contains.

The number of examples is also indicated directly in the labelling of the language combination buttons.

The order in which the example sets are listed for each language combination corresponds to the order in which they were created.

Keyboard shortcuts

  • CTRL + ALT + S: Saves the prompt set with the currently recorded prompts.
  • CTRL + ENTER: Inserts a new field for a prompt or new fields for an example sentence with translation, depending on where the mouse cursor was last positioned.
  • CTRL + ALT + N: Adds a new field for a prompt.
  • ALT + N: Inserts new fields for an example sentence with translation.
  • CTRL + UP (up arrow key): Switches to the prompt or example sentence set above, depending on where the mouse cursor was last positioned.
  • CTRL + DOWN (down arrow key): Switches to the prompt or example sentence set below, depending on where the mouse cursor was last positioned.
  • CTRL + Q: Ends the editing of the current prompt or the current example sentence.
  • ALT + C: Undoes the changes made to the currently displayed example sentences since the last save.
  • ALT + S: Saves the currently entered example sentences.



Training GPT language resources

Once a GPT language resource has been created, it can be trained with prompts, examples, and specific translation memories. In the language resource overview, click on the button in the line of the language resource that you would like to train. The “Fine-tune OpenAI Model” window opens. It includes the following areas:

  1. The prompts and example sentences that can be loaded from a prompt set previously created in Prompt management are displayed here.
  2. Test panel in which a test text can be entered, and its translation tested using the current configuration.
  3. The TM panel, in which one or more translation memories can be added for training the model, as well as a drop-down in which the number of training iterations can be selected.

All entries in the added TMs are used as examples for the training. Normally, this only makes sense for TMs that contain hand-picked training examples; the use of large amounts of TM data usually reduces the translation quality.

Loading and customizing prompt sets, starting a training

  1. Click on “Add some pre-configured prompt” to open the “Add some pre-configured prompt” window.
  2. It lists all the prompt sets created in Prompt management. They can be selected by ticking the box in the “Add” column.
  3. The selection is confirmed by clicking the “OK” button, which also closes the window.
  4. The prompts and examples contained in the selected prompt sets are now displayed in the “Fine-tune OpenAI Model” window and can be added to, adjusted or removed as required. The functions of the buttons correspond to their counterparts in Prompt management.
  5. You can now enter any test text in the top window of the test panel. By clicking on the “Translate” button, the text is translated using the current configuration of the selected language resource in combination with the added prompts and example sentences. The translation of the test text is displayed in the lower window.
  6. One or more task or main translation memories can also be added to the TM panel for (pre-)translation.
  7. If you are satisfied with the way the test text is translated, use the drop-down further down in the TM panel to select the number of epochs (training iterations) with which the language resource should be trained using the existing configuration. Then start the training using the “Submit training” button.
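Under the hood, “Submit training” corresponds to creating a fine-tuning job at the provider. A rough sketch of the equivalent calls in OpenAI’s public Python SDK (the file name, model name and epoch count are illustrative; translate5 uploads the assembled examples and starts the job for you):

```python
from openai import OpenAI

client = OpenAI()

# Upload the assembled JSONL training examples ...
training_file = client.files.create(
    file=open("training_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# ... and start a fine-tuning job with the chosen number of epochs.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",    # illustrative trainable model
    hyperparameters={"n_epochs": 3},   # the "epochs" selected in the TM panel
)
print(job.id, job.status)
```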

You can add one or more of the preconfigured prompt sets.

As the use of terminology during training does not lead to good results, no TermCollections can be added in translate5 for this purpose. Instead, we recommend that you add the desired TermCollections as usual when creating tasks so that GPT can then take the resource into account for pre-translation during import.
