Instruction Manual
User’s Guide SYSTRAN 5.0
Protected Sequences
Protected sequences are words and phrases that are not analyzed. Instead,
they are carried over unchanged into the final translation.
For protected sequences, it is important to keep the original format of the
entry in your glossary. For example, if the sequence is in capital letters in the document
to be translated, it must be entered in capital letters in the external dictionary. To
designate proper nouns, acronyms, and expressions as protected sequences, enclose
them in quotation marks (" "). The figure below shows an example.
Figure 78: Example of Using Intuitive Coding
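The effect of a protected sequence can be illustrated with a minimal Python sketch. This is not SYSTRAN's implementation: the function names load_protected, shield, and restore, and the placeholder format, are hypothetical. The sketch assumes only what the text states, namely that an entry enclosed in quotation marks is matched exactly as typed (capitals included) and reaches the output unchanged.

```python
import re

def load_protected(entries):
    """Return the entries wrapped in double quotes, with the quotes removed."""
    return [e[1:-1] for e in entries
            if len(e) >= 2 and e.startswith('"') and e.endswith('"')]

def shield(text, protected):
    """Swap each protected sequence for a placeholder (case-sensitive match)."""
    slots = {}
    # Longest sequences first, so a long entry wins over a shorter one inside it.
    for i, seq in enumerate(sorted(protected, key=len, reverse=True)):
        token = f"__P{i}__"
        pattern = r"(?<!\w)" + re.escape(seq) + r"(?!\w)"
        text, count = re.subn(pattern, token, text)
        if count:
            slots[token] = seq
    return text, slots

def restore(text, slots):
    """Put the protected sequences back exactly as they were entered."""
    for token, seq in slots.items():
        text = text.replace(token, seq)
    return text

# Hypothetical glossary: two protected entries and one ordinary entry.
entries = ['"NATO"', '"Open Systems"', 'keyboard']
masked, slots = shield("NATO adopted Open Systems, but open systems vary.",
                       load_protected(entries))
# str.upper() stands in for the translation step, only to make changes visible.
print(restore(masked.upper(), slots))
```

In this sketch the lowercase "open systems" is processed like any other text, while the capitalized entry is left alone, which is why the entry must match the capitalization used in the document.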
Bracketing
Use brackets to isolate a compound within a larger compound. This can be helpful
when translating from English, where the relationship between the elements of a
compound is not as obvious as in other languages.
Figure 79: Example of Using Brackets
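As a rough illustration of what bracketing conveys, the Python sketch below turns a bracketed entry into a nested grouping of words. It is a hypothetical model, not SYSTRAN's parser: the function name group_compound, the sample entries, and the use of square brackets as the delimiter are all assumptions.

```python
def group_compound(entry):
    """Parse an entry such as 'stainless [steel pipe] fitting' into nested lists."""
    stack = [[]]
    # Pad the brackets with spaces so they split off as separate tokens.
    for token in entry.replace("[", " [ ").replace("]", " ] ").split():
        if token == "[":
            stack.append([])            # open an inner compound
        elif token == "]":
            if len(stack) == 1:
                raise ValueError("unmatched closing bracket")
            inner = stack.pop()
            stack[-1].append(inner)     # attach it to the enclosing compound
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unmatched opening bracket")
    return stack[0]

# The same four words, grouped two different ways.
print(group_compound("[stainless steel] pipe fitting"))
print(group_compound("stainless [steel pipe] fitting"))
```

The two calls use the same words but yield different groupings, which is the ambiguity that brackets are meant to resolve in an English compound.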
Canonical Form
As in a traditional paper dictionary, the canonical form (the simplest form of a word or
phrase) is the ideal form for entering words and phrases into a UD. The SYSTRAN
Translation System still translates UD entries given in an inflected form, but it
interprets the inflection as a clue to a particular use of the word.
The canonical form depends on the language being encoded. In French, for
example, the canonical form of a nominal or adjectival entry is the masculine singular,
while verbal entries are in the infinitive.
Upper-Case Letters
Capital letters follow the same guideline as canonical form: every entry
should be in its native format. Otherwise, the software interprets the
uppercase letters as an additional linguistic clue. The original format is
detected automatically and respected.
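Before entries are added to a UD, it can help to check which ones carry capital letters, since capitalization is treated as meaningful. The Python sketch below is a hypothetical pre-check, not SYSTRAN's detection logic; the function name case_pattern and its category labels are assumptions made for illustration.

```python
def case_pattern(entry):
    """Classify the capitalization of a dictionary entry."""
    if not any(c.isalpha() for c in entry):
        return "uncased"        # digits or symbols only
    if entry.islower():
        return "lower"          # e.g. keyboard
    if entry.isupper():
        return "upper"          # e.g. NATO
    if entry.istitle():
        return "capitalized"    # e.g. Open Systems
    return "mixed"              # e.g. iPhone

# List the sample entries whose capitalization would act as a linguistic clue.
for entry in ["keyboard", "NATO", "Open Systems", "iPhone"]:
    if case_pattern(entry) != "lower":
        print(entry, "->", case_pattern(entry))
```

Any entry reported here should be capitalized that way only because that is its native form, for example a proper noun or an acronym.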