The "top P" parameter, also known as nucleus sampling, is a nuanced alternative to temperature-based sampling. It acts as a "spotlight" that shines on the most probable words: the model samples only from the smallest set of tokens whose cumulative probability reaches top_p. At the default value of 1.0, the model considers all words. This parameter helps control the distribution of word choices, keeping the generated content relevant and coherent. It is advisable to adjust either the temperature or top_p, but not both.
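The "smallest set of most probable words" idea can be sketched in a few lines. This is a simplified illustration over a hypothetical toy vocabulary, not the provider's actual implementation:

```python
def top_p_filter(probs, top_p=1.0):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize so the kept probabilities sum to 1.
    `probs` maps token -> probability (hypothetical toy vocabulary)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}

probs = {"cat": 0.5, "dog": 0.3, "bird": 0.15, "fish": 0.05}
print(top_p_filter(probs, top_p=0.8))  # only "cat" and "dog" survive
```

At top_p=1.0 every token survives the filter, which is why the default behaves like ordinary sampling.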
- This parameter encourages the model to include a diverse range of tokens in the generated text. It is a value that is subtracted from the log-probability of any token that has already appeared in the generated text, regardless of how often. A higher presence_penalty makes the model more likely to generate tokens that have not yet been included in the generated text.
- This parameter discourages the model from repeating the same words or phrases too frequently within the generated text. It is a value that is subtracted from the log-probability of a token in proportion to how many times that token has already occurred in the generated text. A higher frequency_penalty makes the model more conservative in its use of repeated tokens.
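The two penalties above differ only in how they scale with repetition, which a small sketch makes concrete. The function and its inputs are hypothetical; it mirrors the commonly documented form where presence_penalty is a flat one-time subtraction and frequency_penalty scales with the occurrence count:

```python
def apply_penalties(logits, counts, presence_penalty=0.0, frequency_penalty=0.0):
    """Penalize tokens that already appeared in the generated text.
    presence_penalty: flat subtraction for any token seen at least once.
    frequency_penalty: subtraction scaled by how often the token appeared.
    `logits` maps token -> raw log-probability; `counts` maps token -> occurrences."""
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        adjusted[token] = (
            logit
            - presence_penalty * (1 if count > 0 else 0)
            - frequency_penalty * count
        )
    return adjusted

# "the" has already appeared 3 times, so it is penalized; "new" is untouched.
print(apply_penalties({"the": 2.0, "new": 1.0}, {"the": 3},
                      presence_penalty=0.5, frequency_penalty=0.2))
```

Note that a token repeated many times accrues a growing frequency penalty but only ever one presence penalty.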
Max. target tokens (% of source tokens):
- A GPT model always has a maximum number of tokens that can be used within a single request. This amount is the sum of the sent tokens and the returned tokens. For a (pre)translation, that means the system message and the text or batch to translate, plus the returned translations. A ratio is therefore needed to leave "room" in a sent request for the generated translation. This is only relevant for batch translations, as used when pretranslating.
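The token budget described above can be written as a simple formula: the context window must hold the system message, the source batch, and the expected translation, estimated as a ratio of the source size. The helper and its numbers below are illustrative assumptions, not values from the tool itself:

```python
def max_batch_source_tokens(context_window, system_tokens, target_ratio=1.5):
    """Largest number of source tokens that fits in one request, assuming
    the translation needs up to target_ratio * source_tokens in return.
    Budget: context_window >= system_tokens + source + source * target_ratio."""
    return int((context_window - system_tokens) / (1 + target_ratio))

# e.g. a 4096-token window, a 96-token system message, and a target size
# estimated at 150% of the source:
print(max_batch_source_tokens(4096, 96, target_ratio=1.5))  # 1600
```

Choosing the ratio too low risks truncated translations; choosing it too high wastes batch capacity.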