Lesson outline on the topic: MDK lesson "Machine translation systems for texts and computer dictionaries." Translation of primary documents from a foreign language

Currently, there are three types of machine translation systems:

Systems based on grammatical rules (Rule-Based Machine Translation, RBMT);

Statistical systems (Statistical Machine Translation, SMT);

Hybrid systems;

Rule-based systems analyze the source text as part of the translation process. Translation relies on built-in dictionaries for a given language pair and on grammars that cover the semantic, morphological and syntactic patterns of both languages. Using these data, the system converts the source text into the target language sequentially, sentence by sentence. The basic operating principle of such systems is to map the structures of the source text onto the structures of the target text.

Systems based on grammatical rules are often divided into three further subgroups: word-by-word translation systems, transfer systems and interlingua systems.

The advantages of systems based on grammatical rules are grammatical and syntactic accuracy, stability of the result, and the ability to customize for a specific subject area. The disadvantages of systems based on grammatical rules include the need to create, maintain and update linguistic databases, the complexity of creating such a system, as well as its high cost.

Statistical systems rely on statistical analysis. A bilingual text corpus (a large amount of text in the source language together with its "manual" translation into the target language) is loaded into the system, which then analyzes the statistics of interlingual correspondences, syntactic structures and so on. The system is self-learning: when choosing a translation option, it relies on previously obtained statistics. The larger the dictionary available for a language pair and the more accurately it is compiled, the better the result of statistical machine translation. With each new translated text, the quality of subsequent translations improves.
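As a rough illustration of how correspondence statistics can be gathered from a bilingual corpus, the sketch below counts how often each source word appears in the same sentence pair as each target word and then picks the target word with the highest Dice score. The tiny English-German corpus, the function names and the choice of the Dice coefficient are assumptions made for this example only; real statistical systems use far larger corpora and more elaborate alignment models.

```python
from collections import Counter, defaultdict

# Toy parallel corpus (hypothetical): English sentences with German translations.
CORPUS = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
    ("a book", "ein buch"),
]

def train(corpus):
    """Count word frequencies and source/target co-occurrences per sentence pair."""
    src_count, tgt_count = Counter(), Counter()
    pair_count = defaultdict(Counter)
    for src, tgt in corpus:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        for s in src_words:
            pair_count[s].update(tgt_words)
    return src_count, tgt_count, pair_count

def best_translation(word, model):
    """Return the target word with the highest Dice score for `word`, or None."""
    src_count, tgt_count, pair_count = model
    if word not in pair_count:
        return None

    def dice(t):
        # Dice coefficient: 2 * joint count / (source count + target count).
        return 2 * pair_count[word][t] / (src_count[word] + tgt_count[t])

    return max(pair_count[word], key=dice)

model = train(CORPUS)
print(best_translation("house", model))
print(best_translation("the", model))
```

Adding more sentence pairs to `CORPUS` changes the counts and therefore the choices, which is the sense in which such a system "learns" from each new translated text.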

Statistical systems are quick to set up, and new subject areas are easy to add. Their most significant shortcomings are numerous grammatical errors and unstable translation results.

Hybrid systems combine the approaches described earlier. It is expected that hybrid machine translation systems will combine all the advantages of statistical and rule-based systems.

1.3 Classification of machine translation systems

Machine translation systems are programs that perform fully automated translation. The main criterion of the program is the quality of translation. In addition, important points for the user are the convenience of the interface, ease of integration of the program with other document processing tools, choice of topics, and a dictionary replenishment utility. With the advent of the Internet, major machine translation vendors included Web interfaces in their products, while ensuring their integration with other software and e-mail, allowing the use of MT engines for the translation of Web pages, electronic correspondence, and online conversation sessions.

New members of CompuServe's Foreign Language Forum often ask if anyone can recommend a good machine translation program at a reasonable price.

The answer to this question is invariably “no.” Depending on the person answering, the answer may contain two main arguments: either that machines cannot translate, or that machine translation is too expensive.

Both of these arguments are valid to a certain extent. However, the answer is far from so simple. When studying the problem of machine translation (MT), it is necessary to consider separately the various subsections of this problem. The following division is based on lectures by Larry Childs given at the 1990 International Technical Communication Conference:

Fully automatic translation;

Automated machine translation with human participation;

Translation carried out by a person using a computer.

Fully automated machine translation. This type of machine translation is what most people mean when they talk about machine translation. The meaning here is simple: text in one language is entered into the computer, this text is processed and the computer displays the same text in another language. Unfortunately, the implementation of this type of automatic translation faces certain obstacles that still need to be overcome.

The main problem is the complexity of the language itself. Take, for example, the meanings of the English word "can". In addition to its basic meaning as a modal auxiliary verb, "can" has several standard and slang meanings as a noun: a metal container, a toilet, a prison. There is also an archaic meaning of this word, "to know or understand". Assuming that the output language has a separate word for each of these meanings, how can a computer differentiate between them?

Some progress has been made in developing translation programs that distinguish meanings by context. More recent research relies increasingly on probability theory when analyzing texts. However, fully automated machine translation of texts on a broad range of subjects is still not feasible.
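To make the idea of distinguishing meanings by context more concrete, here is a minimal sketch that scores each sense of "can" by how many of its cue words occur in the same sentence. The sense labels, the cue word lists and the function name are all invented for illustration; an actual system would derive such associations from data rather than from a hand-written table.

```python
# Hypothetical cue words for three senses of "can".
SENSE_CUES = {
    "modal verb": {"i", "you", "we", "he", "she", "they", "not", "swim", "help"},
    "container": {"tin", "open", "of", "beans", "soda", "opener"},
    "prison": {"in", "years", "thrown", "police", "sentence"},
}

def disambiguate(sentence, target="can"):
    """Pick the sense whose cue words overlap most with the rest of the sentence."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    context = {w for w in words if w != target}
    scores = {sense: len(cues & context) for sense, cues in SENSE_CUES.items()}
    # On a tie, max() returns the first sense in dictionary order.
    return max(scores, key=scores.get)

print(disambiguate("She can swim."))
print(disambiguate("Open the can of beans."))
```

A table like this breaks down quickly on unrestricted text, which is one reason the approach works better for narrow subject areas.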

Automated machine translation with human participation. This type of machine translation is now entirely feasible. When we talk about human-assisted machine translation, we usually mean editing texts both before and after they are processed by a computer. Human translators change texts so that they are understandable to machines. After the computer has done the translation, humans again edit the rough machine translation, making the text in the output language correct. In addition to this operating procedure, there are MT systems that during translation require the constant presence of a human translator to help the computer translate particularly complex or ambiguous structures.

Human-assisted machine translation is applicable to a greater extent to texts with a limited vocabulary and narrowly limited subject matter.

The cost-effectiveness of human-assisted machine translation is still a controversial issue. The programs themselves are usually quite expensive, and some of them require special equipment to run. Pre- and post-editing takes training, and it is not pleasant work. Creating and maintaining word databases is labor-intensive and often requires special skills. However, for an organization translating large volumes of text in a well-defined subject area, human-assisted machine translation can be a fairly cost-effective alternative to traditional human translation.

Translation carried out by a person using a computer. In this approach, the human translator is placed at the center of the translation process, while the computer program is regarded as a tool that makes the process more efficient and the translation more accurate. Typical examples are ordinary electronic dictionaries, which provide translations of a requested word and leave the person responsible for choosing the right option and for the meaning of the translated text. Such dictionaries greatly facilitate translation, but they require the user to know the language to some degree and to spend time on the translation itself. Even so, the translation process becomes significantly faster and easier.

Among the systems that help a translator in his work, the most important place is occupied by the so-called Translation Memory (TM) systems. TM systems are an interactive tool for accumulating in a database pairs of equivalent text segments in the original language and translation with the possibility of their subsequent search and editing. These software products do not aim to use highly intelligent information technologies, but, on the contrary, are based on using the creative potential of the translator. In the process of work, the translator himself creates a database (or receives it from other translators or from the customer), and the more units it contains, the greater the return on its use.

Here is a list of the most famous TM systems:

Transit from the Swiss company Star,

Trados (USA),

Translation Manager from IBM,

Eurolang Optimizer from the French company LANT,

DejaVu from ATRIL (USA),

WordFisher (Hungary).

TM systems make it possible to eliminate repeated translation of identical text fragments. The translation of a segment is carried out by the translator only once, and then each subsequent segment is checked for a match (full or fuzzy) with the database, and if an identical or similar segment is found, it is offered as a translation option.
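The lookup described above can be sketched with Python's standard `difflib` module. The translation memory contents, the `lookup` function and the 0.75 default threshold are assumptions for illustration; commercial TM systems use their own similarity measures.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored translation.
TM = {
    "Press the power button.": "Нажмите кнопку питания.",
    "Close the cover.": "Закройте крышку.",
}

def lookup(segment, memory, threshold=0.75):
    """Return (stored source, translation, similarity) for the best match
    at or above `threshold`, or None if nothing is similar enough."""
    best = None
    for source, translation in memory.items():
        ratio = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if ratio >= threshold and (best is None or ratio > best[2]):
            best = (source, translation, ratio)
    return best

print(lookup("Press the power button.", TM))   # full match, similarity 1.0
print(lookup("Press the reset button.", TM))   # fuzzy match, offered for correction
print(lookup("Install the battery.", TM))      # no match, must be translated by hand
```

Lowering `threshold` yields more suggestions that need heavier correction. Raising it yields fewer but closer ones.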

Currently, developments are underway to improve TM systems. For example, the core of Star's Transit system is implemented based on neural network technology.

Despite the wide range of TM systems, they share several common features:

Alignment function. One of the advantages of TM systems is the ability to use already translated materials on a given topic. The TM database can be obtained by segment-by-segment comparison of the original and translation files.

Availability of import and export filters. This property ensures compatibility of TM systems with a variety of word processors and publishing systems and gives the translator relative independence from the customer.

A mechanism for searching fuzzy or complete matches. It is this mechanism that represents the main advantage of TM systems. If, when translating a text, the system encounters a segment that is identical or close to the previously translated one, then the already translated segment is offered to the translator as an option for translating the current segment, which can be corrected. The degree of fuzzy matching is specified by the user.

Support for thematic dictionaries. This feature helps the translator stick to the glossary. As a rule, if a word or phrase from a thematic dictionary occurs in a translated segment, it is highlighted in color and its translation is suggested, which can be inserted into the translated text automatically.

Tools for searching text fragments. This tool is very convenient when editing translations. If during the work process a more successful translation option was found for any fragment of text, then this fragment can be found in all TM segments, after which the necessary changes are sequentially made to the TM segments.
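As one illustration of the features listed above, thematic dictionary support can be approximated by scanning each segment for glossary terms and returning their approved translations. The glossary entries and the `suggest_terms` function are hypothetical. A real TM system would also handle inflected forms, which this sketch ignores.

```python
import re

# Hypothetical glossary: source term -> approved translation.
GLOSSARY = {"power button": "кнопка питания", "cover": "крышка"}

def suggest_terms(segment, glossary):
    """Return (term, translation) pairs for glossary terms found in the segment,
    in order of appearance, matching whole words case-insensitively."""
    hits = []
    for term, translation in glossary.items():
        match = re.search(r"\b" + re.escape(term) + r"\b", segment, re.IGNORECASE)
        if match:
            hits.append((match.start(), term, translation))
    return [(term, translation) for _, term, translation in sorted(hits)]

print(suggest_terms("Close the cover, then press the Power button.", GLOSSARY))
```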

Of course, like any software product, TM systems have their advantages and disadvantages, and their scope of application. However, regarding TM systems, the main disadvantage is their high cost.

It is especially convenient to use TM systems when translating documents such as user manuals, operating instructions, design and business documentation, product catalogs and other similar documentation with a large number of matches.

Documents drawn up in foreign languages should be translated into Russian by the travel agency. Otherwise, the related expenses cannot be accepted for tax purposes. Sometimes, however, translation is not necessary, for example for the details of an electronic air ticket encoded in Latin characters (letter of the Federal Tax Service of Russia dated June 7, 2011 No. ED-4-3/8983).

The need to translate documents
In accordance with paragraph 1 of Article 16 of the Law of the Russian Federation of October 25, 1991 No. 1807-1 "On the languages of the peoples of the Russian Federation," official paperwork in organizations in our country is conducted in Russian.

And as stated in paragraph 9 of the Regulations on maintaining accounting and financial reporting in the Russian Federation, approved by order of the Ministry of Finance of Russia dated July 29, 1998 No. 34n, property, liabilities and business transactions (facts of economic activity) are accounted for in the currency of the Russian Federation, in rubles.

At the same time, the documentation of property, liabilities and other facts of economic activity, as well as the maintenance of accounting and reporting registers, is carried out in Russian. The Regulations go on to say that primary accounting documents compiled in other languages must have a line-by-line translation into Russian.

Based on these norms, regulatory authorities conclude that primary documents drawn up in a foreign language must be translated into Russian. This opinion is expressed, in particular, in letters of the Ministry of Finance of Russia dated November 3, 2009 No. 03-03-06/1/725, dated September 14, 2009 No. 03-03-05/170, and dated February 16, 2009 No. 03-03-05/23.

In court, however, organizations do manage to defend expenses confirmed by documents without translation (resolutions of the Federal Arbitration Court (FAS) of the Moscow District dated April 21, 2011 No. KA-A40/2152-11 and dated October 8, 2008 No. KA-A40/8061-08).

In addition, arbitration courts most often side with taxpayers, pointing out that the lack of a Russian translation cannot serve as grounds for refusing a VAT deduction. Examples are the resolutions of the FAS Moscow District dated April 1, 2009 No. KA-A40/132809 and dated March 16, 2009 No. KA-A40/1450-09, and of the FAS West Siberian District dated March 5, 2007 No. F04-979/2007(31967-A45-14).

However, if the travel agency’s documents are not translated, you will most likely have to defend the possibility of tax accounting for expenses or the right to deduction. At the same time, the outcome of the legal dispute may not be in favor of the taxpayer.

How to translate a document
The Ministry of Finance clarifies that the translation can be made either by a professional translator or by the organization itself, that is, by one of its employees (letters dated September 14, 2009 No. 03-03-05/170 and dated March 20, 2006 No. 03-02-07/1-66).

However, the legislation does not establish how such a translation should be formatted. It can therefore be prepared as a separate document, or the Russian text can be entered on photocopies of the foreign primary document.

It should be noted that the translation can also be performed by the organization that issued the primary document, for example, in the form of a certificate (letter of the Ministry of Finance of Russia No. 03-03-05/170).

When can you do without translation?
In some cases, you will not have to translate documents.

Firstly, if you regularly receive standard documents from your foreign counterparties in which only the numerical details differ (number, document date, price, etc.), it is enough to translate the document form into Russian once. Explanations on this issue are given in the letter of the Ministry of Finance of Russia dated November 3, 2009 No. 03-03-06/1/725.

Secondly, there is no need to translate information that is not essential to confirm the expenses incurred.

Examples include the fare conditions, the air transportation rules and the baggage transportation rules. Ministry of Finance officials drew attention to this in a letter dated September 14, 2009 No. 03-03-05/170.

Thirdly, there is no need to translate formalized (coded) details of an electronic air ticket filled out using Latin characters (letters of the Federal Tax Service of Russia dated June 7, 2011 No. ED-4-3/8983, dated April 26, 2010 No. ShS-37-3/656@).

But when the values in an electronic air ticket are actually indicated in a foreign language and do not coincide with the formalized (coded) values or codes in accordance with the Unified International Codifiers, then these indicators (values) of the air ticket must be translated into Russian.

Accounting for translation costs
In accounting, a travel company's expenses for payment for document translation services are included in other expenses in the month in which they are provided. This is reflected by an entry in the debit of account 91 “Other income and expenses” (sub-account “Other expenses”) and the credit of account 76 “Settlements with various debtors and creditors” (clauses 11, 16, 18 PBU 10/99 “Expenses of the organization”).

Such expenses are also accepted for profit tax purposes as part of other expenses, either as payment for information services (subclause 14, paragraph 1, Article 264 of the Tax Code of the Russian Federation) or as miscellaneous other expenses. This is discussed in the letter of the Federal Tax Service of Russia for Moscow dated May 26, 2008 No. 20-12/050126. Note, however, that this rule concerns the costs of third-party translation.

Let us remind you that in order to comply with the requirements of Article 252 of the Tax Code of the Russian Federation, expenses for translation of documents must be documented.

Under the simplified tax system (the "simplification"), such expenses cannot be taken into account, since they are not included in the closed list of permitted expenses (clause 1 of Article 346.16 of the Tax Code of the Russian Federation).

Important to remember

The costs of third-party translation of documents can be taken into account for profit tax purposes, but they cannot be recognized under the simplified tax system.

Topic: “Computer translators. Text recognition systems."

Lesson objectives:

    help students gain an understanding of computer dictionaries and machine text translation systems, become familiar with the capabilities of these programs, and learn how to use them;

    help students get an idea of OCR (text recognition) programs, get acquainted with the capabilities of these programs, and learn to recognize scanned text and to transfer it to Word and edit it there;

    foster students' information culture, attentiveness, accuracy, discipline and perseverance;

    develop cognitive interests, computer skills, self-control and note-taking skills.

Equipment:
board, computer, computer presentation.

Lesson plan:

1) Organizational moment. (1 min)

2) Updating knowledge. (5 minutes)

3) Theoretical part. (10 min)

4) Practical part. (15 minutes)

5) Homework (2 min)

6) Questions from students. (5 minutes)

7) Lesson summary. (2 minutes)

Lesson procedure:

I. Organizational moment.

Greetings, checking those present. Explanation of the lesson.

II. Updating knowledge.

As you can see, in order to obtain an electronic copy of any printed text, ready for editing, the OCR program needs to perform a “chain” of many individual operations.

First you need to recognize the structure of the text on the page: select columns, tables, images, and so on. Next, the selected text fragments of the graphic image of the page must be converted into text.

If the source document has typographic quality (sufficiently large font, no poorly printed characters or corrections), then the recognition problem is solved by comparison with a raster template. First, the bitmap image of the page is divided into images of individual characters. Then each of them is sequentially superimposed on the symbol templates available in the system memory, and the template with the least number of points different from the input image is selected.
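The template comparison described above can be sketched as follows. The 3x3 bitmaps and the function names are made up for illustration. Real OCR systems work with much larger character images and normalize size and position before comparing.

```python
# Hypothetical 3x3 bitmaps; "#" is an ink pixel, "." is background.
TEMPLATES = {
    "I": ("###",
          ".#.",
          "###"),
    "L": ("#..",
          "#..",
          "###"),
    "T": ("###",
          ".#.",
          ".#."),
}

def pixel_distance(a, b):
    """Number of pixel positions at which two equally sized bitmaps differ."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def recognize(glyph, templates=TEMPLATES):
    """Return the character whose template differs from `glyph` in the fewest pixels."""
    return min(templates, key=lambda ch: pixel_distance(glyph, templates[ch]))

# A "T" with one ink pixel missing in the top-right corner.
noisy_t = ("##.",
           ".#.",
           ".#.")

print(recognize(noisy_t))
```

The input differs from the "T" template in one pixel, from "I" in three and from "L" in five, so "T" is selected even though the image is imperfect.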

When recognizing documents with low print quality (typewritten text, faxes, etc.), characters are recognized by the presence of certain structural elements in them (segments, rings, arcs, etc.).

Any symbol can be described through a set of parameter values that determine the relative position of its elements. For example, the Cyrillic letters "Н" and "И" each consist of three segments, two of which are parallel to each other, while the third connects them. The difference between these letters lies in the angles that the third segment forms with the other two.

When recognizing using the structural method, characteristic details are identified in a distorted symbolic image and compared with structural patterns of symbols. As a result, the symbol is selected for which the totality of all structural elements and their arrangement most closely corresponds to the recognized symbol.
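A minimal sketch of structural matching follows, using the two three-segment letters from the example above (taken here to be the Cyrillic "Н" and "И") plus "О" for contrast. The feature names, the angle values and the way counts and angles are combined into one score are assumptions for illustration, not the method of any particular OCR product.

```python
# Hypothetical structural patterns: element counts plus the angle (in degrees)
# between the connecting segment and the two parallel segments.
PATTERNS = {
    "Н": {"segments": 3, "rings": 0, "arcs": 0, "angle": 90},
    "И": {"segments": 3, "rings": 0, "arcs": 0, "angle": 30},
    "О": {"segments": 0, "rings": 1, "arcs": 0, "angle": 0},
}

def mismatch(features, pattern, angle_scale=90):
    """Smaller is better: differences in element counts plus a scaled angle difference."""
    count_diff = sum(abs(features[k] - pattern[k])
                     for k in ("segments", "rings", "arcs"))
    angle_diff = abs(features["angle"] - pattern["angle"]) / angle_scale
    return count_diff + angle_diff

def recognize_structural(features, patterns=PATTERNS):
    """Return the character whose structural pattern is closest to `features`."""
    return min(patterns, key=lambda ch: mismatch(features, patterns[ch]))

# Features extracted from a distorted image (values invented for the example).
distorted = {"segments": 3, "rings": 0, "arcs": 0, "angle": 80}
print(recognize_structural(distorted))
```

Because the comparison uses the kind and arrangement of elements rather than individual pixels, moderate distortion of the image changes the score only slightly.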

The most common optical character recognition systems, such as ABBYY FineReader and CuneiForm from Cognitive, use both raster and structural recognition methods. In addition, these systems are "self-learning" (for each specific document they create an appropriate set of character templates), so the speed and quality of recognition gradually increase as a multi-page document is processed.

You can purchase text recognition programs separately or receive them free of charge with the scanner you purchase.

Perhaps the most famous text recognition program is FineReader from ABBYY. It is this program that is most often remembered when it comes to recognition systems.

FineReader allows you to recognize texts typed in almost any font without prior training. A special feature of the FineReader program is its high recognition accuracy and low sensitivity to print defects, which is achieved through the use of “holistic targeted adaptive recognition” technology.

FineReader has many additional functions that the average user may not need but that impress certain groups of buyers. For example, one of FineReader's strengths is its support for a very large number of recognition languages, 176, among which you will find exotic and ancient languages, and even popular programming languages.

But not all features are included in the simplest modification of the program, which you can receive for free along with the scanner. Batch scanning, competent processing of tables and images - for all this it is worth purchasing the professional version of the program.

All versions of FineReader, from the simplest to the most powerful, share a user-friendly interface. To start the recognition process, you just need to put the document in the scanner and press a single button (the Scan & Read wizard) on the program toolbar. The program will perform all further operations: scanning, dividing the image into "blocks" and, finally, the recognition itself. The user only has to set the necessary scanning parameters.

The quality of recognition largely depends on the quality of the image obtained during scanning. Image quality is controlled by setting the basic scanning parameters: image type, resolution and brightness.

Scanning in grayscale is the optimal mode for the recognition system. In grayscale mode, brightness is selected automatically. If you want color elements of the document (pictures, colored letters and background) to be carried over to the electronic document in color, select the color image type. Otherwise, use the grayscale image type.

The optimal resolution is 300 dpi for regular texts and 400-600 dpi for texts in small font (9 points or less).

After page recognition is complete, FineReader offers the user a choice: scan and recognize further pages (for a multi-page document) or save the resulting text in one of many popular formats, from Microsoft Office documents to HTML or PDF. You can also transfer the document to Word or Excel immediately and correct the recognition flaws there (some are unavoidable). FineReader preserves the document's formatting and graphic design in the process.

    Why do we need text recognition programs?

    How does text recognition work?

    What text recognition programs do you know? Which ones did you use?

    What resolution is optimal for scanning text and images?

III. Practical part.
1. Working with a text translator (line by line)
2. Now let’s practice working with the ABBYY FineReader program. We will use a simplified version of the program supplied with the scanner.

IV. Homework
Know what automatic text translation programs are and be able to work with these programs. Additional task: connect to the Internet and use any on-line translator to translate the text.
Know what text recognition programs are and be able to work with these programs. Additional task: install the OCR program at home and prepare an essay on any subject. The text is recognized in OCR, editing and formatting is carried out in Word.

V. Questions from students.
Answers to student questions.

VI. Lesson summary.
Summing up the lesson. Grading.

During the lesson we got acquainted with computer translation programs for texts and learned how to translate words and text using a translator program.

During the lesson, we got acquainted with OCR programs and learned to recognize a scanned image using the ABBYY FineReader 5.0 program.

1. Translate the sentences into English:

    The operating system is usually stored in the computer's external memory.

    Dictionaries are necessary for translating texts from one language to another.

    Information must be reliable, relevant and useful.

2. Translate the sentences into Russian:

    The teacher’s computer is placed on the table in the corner of the classroom.

    Instrumental system programs facilitate the process of creation of new programs for a computer.

    Universal arrangement of processing of the information is the computer.

1. Translate the sentences into English:

    A universal information processing device is a computer.

    Dictionaries are necessary for translating texts from one language to another.

    System tool programs facilitate the process of creating new computer programs.

2. Translate the sentences into Russian:

    The operation system is usually stored in external memory of a computer.

    The information should be authentic, actual and useful.

    The teacher’s computer is placed on the table in the corner of the classroom.

Product overview

With the advent of writing, people received a powerful tool for preserving knowledge and for communication. The first writings that have come down to us on the walls of temples and tombs tell about the deeds of kings and generals that took place many centuries ago. In addition, people recorded the results of economic activities in order to successfully trade, collect taxes, etc.

To facilitate written communication between peoples, the first dictionaries were created. One of these dictionaries was written by Sumerian priests on clay tablets. Each tablet was divided into two equal parts. On one side, a Sumerian word was written, and on the other, a word of similar meaning in another language, sometimes with a brief explanation. From those times to the present day, the structure of dictionaries has remained virtually unchanged.

With the advent of the personal computer, electronic dictionaries began to be created, making it easier to find the right word and offering many new useful functions (voicing the word, searching for synonyms, etc.).

Machine translation technology has gradually improved. While the quality and speed of the first systems left much to be desired, a computer can now translate text from one language to another fairly coherently, and more modern systems translate one page of text per second with acceptable quality.

Who needs machine translation and why?

Recently, the possibilities and prospects of machine translation (MT) technologies have been actively discussed. Both professional translators and MT system manufacturers take part in the discussions. Let's try to evaluate the capabilities of MT based on the experience of using real systems.

To be fair, it should be noted that in the foreseeable future machine technology will not be able to completely replace the human translator. In terms of translation quality, MT programs cannot compete with humans. However, such programs can significantly increase the efficiency of a translator's work.

Based on the formal description of languages, the program analyzes text in one language and then synthesizes a phrase in another. Analysis and synthesis algorithms, as a rule, are quite complex and are controlled by dictionary information assigned to lexical units in the system dictionaries for both the language of the source text and the language of its translation.

Where are MT systems used? Firstly, translation programs can be used to quickly translate a text in order to understand its meaning. Of course, the quality of machine translation cannot compare with a translation made by a person, but the user receives an answer “here and now.” With the help of MT systems, you can also read information posted on foreign websites and understand the text of a letter written in French, German, Japanese or another language.

In addition, MT systems can be used to solve professional translation problems and significantly increase the efficiency of work. Let's compare both methods - traditional and machine. Traditional translation usually includes several stages: translation, editing, layout, proofreading. To speed up the work, the text is usually divided among several translators. As a result, the problem of unified terminology and a unified translation style arises, which increases the cost of editing. In addition, considerable effort has to be spent on revising the document.

What benefits does the use of MT systems provide and where is it most appropriate? Because MT systems use a common vocabulary base for translation, they significantly reduce the cost of maintaining uniform terminology and, consequently, of editing. In this case, the technical editor receives from the MT system a translation made in a single style. Thus, machine translation systems are most effective for organizing the translation of large arrays of similar documents in a short time, ensuring uniformity of terminology and style throughout the entire array of documents.

The applicability of an MT system is determined by its ability to adapt to the translation of documents on various topics. The quality of the resulting translation largely depends on the settings. In addition to the general-vocabulary dictionary, specialized dictionaries should be used that reflect both the topic of translation and the specifics of particular documents. The quality of translations also depends on the translator's ability to create custom dictionaries, which should include terminology specific to the documentation, as well as frequently occurring phrases and expressions (micro-segments) that cannot be translated formally. Such tuning ensures the level of quality at which MT becomes effective for solving “industrial” translation problems.
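
The priority of user, specialized and general dictionaries described above can be sketched as a layered lookup. This is only an illustration: the dictionary contents, the `lookup` function and the layer names are hypothetical and do not reflect how any particular MT system stores its data.

```python
# Hypothetical dictionary layers; all entries are made up for illustration.
GENERAL = {"driver": "водитель", "bus": "автобус", "power": "мощность"}
SPECIALIZED = {"driver": "драйвер", "bus": "шина"}   # e.g. a computing dictionary
USER = {"power supply": "блок питания"}              # micro-segment added by the translator

# Layers are consulted in this order: the most specific dictionary wins.
LAYERS = [("user", USER), ("specialized", SPECIALIZED), ("general", GENERAL)]


def lookup(term, layers=LAYERS):
    """Return (translation, layer_name) from the first layer containing the term."""
    key = term.lower()
    for name, entries in layers:
        if key in entries:
            return entries[key], name
    return None, None  # the term is not in any connected dictionary
```

With these sample layers, `lookup("driver")` takes the specialized translation rather than the general one, while `lookup("power")` falls through to the general dictionary.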

To evaluate the effectiveness of using MT systems, PROMT provided its PROMT 2000 Translation Office system to the LONIIS translation center. The experiment showed that the use of MT can reduce the total project completion time by approximately a factor of two.

It should be noted that there are a number of restrictions on the use of MT systems. It makes no sense to translate literary texts, proverbs and sayings using a translator program. Small texts on various topics are also better translated in the traditional way.

PROMT Translation Office 2000

PROMT Translation Office 2000 (hereinafter referred to as PROMT), priced at $300, is a set of professional tools that provides translation from major European languages into Russian and vice versa. With its help, you can not only translate, but also edit the translation and work with dictionaries of all language directions at the same time.

PROMT includes the following collections of dictionaries:

  • “Light Industry” ($180);
  • “Heavy Industry” ($180);
  • “Commerce” ($99);
  • “Science” ($120);
  • “Technique” ($199).

To ensure high quality translation, the PROMT system provides the ability to configure the translation of a specific text - by connecting specialized subject dictionaries, supplied separately, as well as creating your own custom dictionaries. A convenient means of setting up the system is also the ability to select the subject of the document: which dictionaries to connect, which words to leave without translation, and how to process special constructions such as email address, date and time.

The PROMT system includes the following modules:

  • PROMT - professional environment for translation;
  • Dictionary Editor - a tool for replenishing and editing dictionaries of machine translation systems of the PROMT family;
  • PROMT Electronic Dictionary is an electronic dictionary that provides the user with ample access to lexical and grammatical information collected in specialized dictionaries of the PROMT family. Can be used for any work with texts (for example, to quickly obtain information about translation equivalents of a given word or phrase);
  • WebView is a browser that allows you to get simultaneous translation of HTML pages when navigating the Internet. WebView contains two windows for displaying HTML pages: the top one displays the original page received from the Internet, the bottom one displays its translation, saving links, pictures, inserted objects, etc. You can follow links both in the upper window containing the source text and in the lower window containing the translation;
  • SmarTool is a tool that implements translation functions in Microsoft Office 97 (Word, Excel) and Microsoft Office 2000 (Word, Excel, PowerPoint, FrontPage, Outlook) applications. The translation menu and toolbar are built into all major Microsoft Office 2000 and Microsoft Office 97 applications, which allows you to get a translation of an open document directly in these applications;
  • QTrans is a program designed for quickly translating unformatted text. With its help you can easily and quickly translate text, a text file or a clipboard (Clipboard). To improve the quality of translation, you can select a suitable topic, connect specialized dictionaries and reserve words;
  • Clipboard Translator is a program designed to quickly translate text previously copied to the clipboard. The text can be copied from any Windows application (Help, Notepad, Word, Word Perfect, PageMaker, etc.);
  • “Integrator” is a means of accessing all applications of the package.

Translation of a document in the PROMT system

The label marks the current paragraph of the source text and the translation of this paragraph (the current one is the one in which the cursor is currently positioned).

All documents with which the PROMT program works appear in document windows. Several documents can be open at the same time, each in its own window (Fig. 4).

The completed translation can be clarified by using electronic dictionaries developed by other companies (if they are installed on your computer, of course). Electronic dictionaries can be used:

  • Lingvo 6.0 (ABBYY program);
  • “Context 3.0” (program from the Informatik company);
  • "MultiLex 1.0, 2.0, 3.0" (program of the company "MediaLingua");
  • PROMT Electronic Dictionary 1.0 (program from PROMT).

When translating, the PROMT system does not use electronic dictionaries from other manufacturers. Therefore, if a word is not in the PROMT system dictionaries or you are not satisfied with the translation of a word or phrase, you can call up the electronic dictionary and use it as a reference.

The WebView browser is included in the package for translating HTML documents.

Sequence of actions when performing a translation

  1. Open the file with the source text or create a new document (new text can be typed directly in the PROMT window).
  2. Check the text breakdown into paragraphs (after translation, the paragraph formatting will remain the same).
  3. Check spelling and edit source text if necessary.
  4. Select a topic template suitable for translating a given text (a topic template for a given translation direction is a set of dictionaries and a list of reserved words; it is installed to improve the quality of the translation).
  5. Specify the subject of the document by customizing its components:
    • connect the dictionaries that will be used when translating the text. If no dictionary is connected, only the general-vocabulary dictionary will be used for translation;
    • reserve words that should remain in the source language in the translated text;
    • connect the preprocessor if you want to exclude certain structures from translation, such as email addresses and file names, and to choose how dates and times are represented in the translated text;
    • mark paragraphs that do not require translation.
  6. Translate the text (the entire document at once or paragraph by paragraph).
  7. Enter unfamiliar words into your custom dictionary if you want them to be translated later.
  8. Use an electronic dictionary to clarify the meanings of words.
  9. Save the translation results.
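
Steps 5 and 6 can be illustrated with a small sketch of how reserved words and a preprocessor might shield parts of the text from translation. The names below (`EMAIL`, `translate_with_protection`) and the simplified e-mail pattern are assumptions made for the example, not PROMT's actual implementation; `translate_word` stands for any word-level translation function.

```python
import re

# Simplified pattern for e-mail addresses (an assumption for this sketch);
# a real preprocessor would also handle file names, dates, times and so on.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def translate_with_protection(text, reserved, translate_word):
    """Translate token by token, leaving e-mail addresses and reserved words as-is."""
    reserved_lower = {w.lower() for w in reserved}
    result = []
    for token in text.split():
        core = token.strip(".,;:!?")  # ignore surrounding punctuation when matching
        if EMAIL.fullmatch(core) or core.lower() in reserved_lower:
            result.append(token)                  # keep in the source language
        else:
            result.append(translate_word(token))  # hand over to the translator
    return " ".join(result)
```

For example, with `str.upper` as a stand-in translator, `translate_with_protection("Send Windows logs to admin@example.com.", {"Windows"}, str.upper)` returns `SEND Windows LOGS TO admin@example.com.`, so the reserved word and the address pass through unchanged.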

System requirements

Minimum computer configuration:

  • IBM PC-compatible computer with a P166 processor or higher;
  • 32 MB of RAM;
  • approximately 160 MB of hard disk space (for a system with all components);
  • SVGA or better resolution video adapter;
  • CD-ROM reader (for installation);
  • mouse or compatible device;
  • OS: Windows 98 (Russian version or pan-European version with Russian language support and Russian regional settings), or Windows NT 4.0 SP3 (or higher) with Russian language support and Russian regional settings, or Windows 2000 Professional (with Russian language support and Russian regional settings);
  • Microsoft Internet Explorer 5.x (included in delivery).

Recommended computer configuration:

  • IBM PC-compatible computer with a PII-300 processor or higher;
  • 64 MB of RAM.

Translation of a document in the Socrates Personal system

The main window of the program is shown in Fig. 6.

When you launch the program for the first time, the main window opens on the “Translator” tab by default. To translate text, type it in the upper window of the “Translator” tab and click the “Translate” button on the toolbar or in the “Translation” menu; the translation appears in the lower window of the tab.

In order to use the dictionary (Fig. 7), just click on the corresponding tab. In addition, the dictionary window can be called up using hot keys.

Using a dictionary, you can get the translation of the word you are looking for in the following ways:

  • type the word in the input field located in the upper right window of the dictionary. Navigation through the dictionary database is carried out as letters are entered, until the maximum possible match is obtained;
  • paste a word into the input field from the clipboard. In this case, a quick transition will be made to the word that most closely matches the entered one;
  • select a previously translated word from the input field history window, after which a quick transition will be made to the word that most closely matches the entered one;
  • select a word in another application and, while holding down the Shift key, right-click on the selection. The translation of the selected word will appear in a pop-up window;
  • use a hotkey combination after placing the required word on the clipboard.
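
Navigation "as letters are entered, until the maximum possible match is obtained" can be approximated by a binary search over a sorted list of headwords. The sketch below is an assumption about how such a search could work, not a description of the Socrates internals; `best_match` and `common_prefix_len` are hypothetical names.

```python
from bisect import bisect_left


def common_prefix_len(a, b):
    """Number of leading characters shared by two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def best_match(headwords, typed):
    """Return the headword sharing the longest prefix with the typed text.

    `headwords` must be sorted. In a sorted list the best candidate is always
    one of the two neighbours of the insertion point, so only they are compared.
    """
    i = bisect_left(headwords, typed)
    candidates = headwords[max(i - 1, 0):i + 1]
    return max(candidates, key=lambda w: common_prefix_len(w, typed), default=None)
```

Calling `best_match` again after every keystroke moves the selection through the list as the typed text grows.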

Translating words or text from other applications

The Socrates Personal 4.0 system provides the ability to work with a translator and dictionary in other applications without leaving them. The translation is carried out in a pop-up window.

In order to get a translation of text from another application (for example, a text editor), you need to select the text to be translated and, while holding down the Shift key, right-click on the selection. A pop-up window will appear containing the translation of the selected fragment.

In order to get a translation of a word from another application, you need to select the word you are interested in and, while holding down the Shift key, right-click on the selection. The pop-up window that appears will contain the translation of the highlighted word.

If necessary, from this window you can go directly to the “Dictionary” tab using the hyperlink of the pop-up window.

System requirements

Minimum computer configuration:

  • IBM PC-compatible computer with a Pentium 90 processor or higher;
  • Operating system Windows 98/Me or Windows NT/2000;
  • 32 MB of RAM;
  • 16 MB of free hard disk space.

Test results for PROMT Translation Office 2000 and Socrates Personal 4.0

To compare the quality and speed of translation of the two systems, several fragments of texts in Russian and English were selected: individual phrases, news from companies, passages from the Bible, Murphy's Laws, technical, medical, legal texts. Ratings were given on a 10-point scale. After this, a comparison was made of the results of translation from English into Russian and vice versa (Table 1).

It should be noted that PROMT Translation Office 2000 and Socrates Personal 4.0 are products designed to solve different problems. PROMT Translation Office 2000 is a professional translation system that makes it much more efficient to translate large volumes of information. In addition, the PROMT system correctly implements the grammatical rules of a particular language. Therefore, the quality of the translation is very high. The disadvantages of the PROMT system are high requirements for hardware resources and significant translation time when connecting several additional dictionaries.

"Socrates Personal 4.0" is an automatic translation system that helps you quickly and easily get a translation of an unclear phrase or term. Its main purpose is to always be at hand.

Translating a short letter or phrase from a text using Socrates Personal 4.0 is much easier and faster than using the PROMT system. However, to translate a large amount of text, it is advisable to use PROMT Translation Office 2000.

Lingvo 7.0

Lingvo 7.0 is a powerful professional dictionary that is very user-friendly. Press a hotkey in any Windows application - and the most complete translation of the word from all dictionaries connected to the system will appear on the screen. Grammar comments on any word, pronunciation of the most important words, checking spelling, the ability to create your own dictionaries - all this is offered by ABBYY Lingvo 7.0 (Fig. 9). Lingvo 7.0 contains more than 1.2 million words and phrases in 18 general and specialized dictionaries.

When Lingvo starts, the main window appears on the screen (Fig. 10). The user can type the desired word in the input line. As you type, the program will search for the most suitable word. By pressing the enter key or the “Translate text” icon, the user will see a card window containing the dictionary entry of the selected (found during search) word (Fig. 11).

If you are reading the help section of a program, working with a text editor, browser, or any other Windows application, select a word or several words in the text and press Ctrl+Ins+Ins. Or simply drag-and-drop the word into the input line. This will activate the main Lingvo window and open a card with the translation of the selected word. If there are many such cards, the “Translation” window will appear containing words and phrases from the request.

To insert a translation into the edited text, select the translation in the card and press Ctrl+Ins. Switch to the text editor window and perform the “Paste” operation. You can also drag the translation onto your text editor window.

When translating from English into Russian, it is not always obvious whether we are dealing with words that can be translated independently or with a phrase that is translated as a whole. The “Translate text from string” function helps to solve this problem by finding set phrases in the fragment being translated for which there are separate dictionary entries. You can try to find the remaining untranslated fragments in the examples using the full-text search function, setting the necessary options (and/or, whether to take word order into account, etc.).
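
One simple way to find set phrases that have their own dictionary entries is a greedy longest-match scan over the fragment. The following sketch shows the idea under stated assumptions: the function name `split_into_entries`, the `max_len` limit and the sample headwords are hypothetical, and Lingvo's real algorithm is not described in this article.

```python
def split_into_entries(fragment, headwords, max_len=4):
    """Greedily split a fragment into the longest headwords found in the dictionary.

    Returns a list of (text, found) pairs; `found` is False for words
    that have no dictionary entry of their own.
    """
    words = fragment.lower().split()
    parts, i = [], 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in headwords:
                parts.append((candidate, True))
                i += n
                break
        else:
            parts.append((words[i], False))  # no entry: leave for full-text search
            i += 1
    return parts
```

Words reported with `found` set to False are the "remaining untranslated fragments" that would then be looked up with full-text search.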

When translating from Russian into English, highlighting combinations and grammatical structures is not difficult, and if these combinations are not in the dictionary, you can immediately turn to the full-text search function. The search results allow you to evaluate how the expression you are interested in is translated in real examples.

Main features of Lingvo:

  • translation of 1.2 million words and phrases;
  • 18 general and specialized dictionaries (2 medical and 2 legal dictionaries in Lingvo 7.0 - new);
  • modern vocabulary;
  • calling the dictionary from any Windows application;
  • advanced search system;
  • 5 thousand English words voiced by a speaker from Oxford;
  • the ability to create your own custom dictionaries;
  • 23 free user dictionaries at http://www.lingvo.ru/;
  • detailed interpretations and explanations of the use of words;
  • modern linguistic technologies;
  • new updated versions of general and specialized dictionaries.

System requirements

Minimum computer configuration:

  • IBM PC-compatible computer with a Pentium 133 processor or higher;
  • operating system Windows 95/98/Me, Windows 2000/Windows NT 4.0 (SP3 or higher);
  • 16 MB of RAM for Windows 95/98/Me, 32 MB of RAM for Windows 2000/Windows NT 4.0;
  • from 85 to 265 MB of free hard disk space;
  • 3.5” disk drive and CD-ROM drive, mouse;
  • Microsoft Internet Explorer 5.0 and higher (the ABBYY Lingvo 7.0 distribution includes Microsoft Internet Explorer 5.5 - installing it will require an additional 27 to 80 MB);
  • sound card compatible with the operating system; headphones or speakers (recommended).

Context 4.0

"Context 4.0" is a system of electronic dictionaries that includes a developed software shell and an extensive set of dictionaries - both general vocabulary and specialized ones. A unique property of “Context” is that it takes into account the morphology of supported languages. Thanks to this, Context translates words and phrases in any grammatical form. The English-Russian and Russian-English dictionaries are most fully represented in Context. The Context library of the new version has been expanded to include English-French, English-German, English-Spanish, English-Italian, English-Portuguese, English-Serbian and English-Croatian dictionaries.

Context dictionaries are two-way. The program translates from one language to another and back without any special settings. The translation search can be carried out both in all dictionaries included in the kit, and in a specific dictionary. At the same time, the set of active (participating in the search) dictionaries, as well as the search order for them, can be easily changed.
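
Taking morphology into account, as described for Context above, means that an inflected form must be reduced to a headword before the dictionary is searched. The sketch below uses a handful of English suffix rules purely for illustration; the rule list and the `find_headword` function are assumptions, and a real morphological analyzer is considerably more elaborate.

```python
# A few English suffix rules, tried in order: (suffix, replacement).
# This list is illustrative only and far from complete.
SUFFIX_RULES = [
    ("ies", "y"),
    ("ing", ""), ("ing", "e"),
    ("ed", ""), ("ed", "e"),
    ("es", ""), ("s", ""),
]


def find_headword(word, headwords):
    """Return the dictionary headword for an inflected form, or None."""
    w = word.lower()
    if w in headwords:
        return w
    for suffix, replacement in SUFFIX_RULES:
        if w.endswith(suffix):
            candidate = w[:len(w) - len(suffix)] + replacement
            if candidate in headwords:
                return candidate
    return None  # irregular forms would need an exception list
```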

You can work with “Context” by typing the word or phrase of interest into a special input field (Fig. 12).

It is convenient to work with “Context” from Windows applications. Translation is carried out using the drag-and-drop method or via the clipboard. In the settings, you can specify a hotkey or enable the option to start translation when placing text on the clipboard.

For users working in the MS Word editor, “Context” can be called from the editor itself by clicking the “Context” icon on the MS Word toolbar. The user does not need to select a word or phrase in the text: “Context” will translate the word under the cursor and at the same time check several words to the right and left to see whether they are part of a phrase.

“Context” is supplied with the dictionaries that the user chooses. A user who has purchased the shell and some dictionaries can later purchase any other dictionaries he needs.

Version 4 of Context has a number of interesting features that were not present in previous versions. For example, the dictionary supports searching for partial phrases. In this case, all phrases whose relevance coefficient with respect to the search string is greater than a specified threshold value are displayed in the translation window (Fig. 13).
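
The article does not say how Context computes the relevance coefficient. As a minimal sketch, assume it is the share of query words that also occur in a dictionary phrase; the functions `relevance` and `partial_search` and the default threshold are hypothetical.

```python
def relevance(query, phrase):
    """Share of distinct query words that also occur in the phrase (0.0 to 1.0)."""
    q = set(query.lower().split())
    p = set(phrase.lower().split())
    return len(q & p) / len(q) if q else 0.0


def partial_search(query, phrases, threshold=0.5):
    """Return (phrase, score) pairs scoring above the threshold, best first."""
    scored = [(p, relevance(query, p)) for p in phrases]
    hits = [(p, s) for p, s in scored if s > threshold]
    # sorted() is stable, so phrases with equal scores keep their original order.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Raising the threshold narrows the list to near-exact matches, while lowering it also brings in phrases that share only some words with the query.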

There is a new Fast Typing function. When entering a word, the user receives hints of similar words from the current dictionary that take into account the characters already entered (Fig. 15). The user can then select a word from the list or continue typing.

To allow dictionaries to work together in different languages, along with automatic detection, a language selection function has been added (Fig. 16).

The new version has the ability to add and edit dictionary entries, which makes the dictionary system more flexible. The previous version of Context already supported working with a user dictionary; the new version allows you to create several dictionaries and edit them. Standard dictionaries and user dictionaries have equal status in the Context dictionary system. The format of a user dictionary entry is close to the format of a standard dictionary, that is, to the usual book format. An entry may include words and expressions, examples of the use of words in set expressions, and interpretations.

MultiLex 3.5

"MultiLex 3.5" is an electronic dictionary, which includes electronic versions of well-known printed dictionaries. A variety of English-Russian and Russian-English dictionaries are produced in the “MultiLex 3.5 English” shell (New English-Russian Dictionary by V.K. Muller, English-Russian/Russian-English Dictionary by O.S. Akhmanova, Russian-English Dictionary ed. A.I. Smirnitsky). It is planned to release technical, medical-biological, economic-legal and other collections.

"MultiLex 3.5 English" allows the user to gradually select for himself the optimal set of dictionaries that will work together.

Features of the MultiLex dictionary:

  • convenience and ease of use;
  • voicing a large number of dictionary entries;
  • quick access to important articles: using bookmarks, you can mark dictionary entries that are important to you, and then access them directly;
  • quick typing function - when typing a word, a list of similar words appears, from which the user can select a word for translation without typing the whole word;
  • translating a word or phrase and transferring translation results to a Windows application via the clipboard or drag-and-drop;
  • entering notes: when working collaboratively, it is important to maintain uniform terminology. This is where the notes mechanism comes to the rescue - you can write your own notes for any dictionary entry;
  • user dictionary.

The “MultiLex” window contains a window frame and a menu bar, under which there are a dictionary panel, a toolbar and a search bar. Below the search bar is the work area of the “MultiLex” window.

The work area is divided vertically into two parts: the article title panel (left) and the dictionary entry text panel (right). The border between panels can be moved left and right.

The left panel contains the list of entry headings for the dictionary that is marked in the dictionary panel with an open-book icon. The right panel always shows the dictionary entry corresponding to the heading highlighted in the left panel. A dictionary entry begins with a heading, followed by its transcription. Next, the part of speech is indicated, and possible translations, explanations and examples are given.

The dictionary panel allows you to select the desired dictionary. Each dictionary has its own icon, which takes three different states: closed book, half-open book, or open book. The shape of the icons shows which dictionary is currently open and in which dictionaries the last search found something.

If the dictionary icon shows an open book (notebook), the dictionary is currently open. A half-open book (notebook) means that the dictionary is not currently open but contains information relevant to your request. A closed book (notebook) means that the dictionary is not open and does not contain the information you need.
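
The three icon states amount to a simple rule applied to every dictionary after a search. The sketch below expresses that rule; `BookIcon`, `icon_states` and the sample data are hypothetical names chosen for the illustration and are not taken from MultiLex.

```python
from enum import Enum


class BookIcon(Enum):
    OPEN = "open book"            # the dictionary currently shown
    HALF_OPEN = "half-open book"  # not shown, but the last search found something
    CLOSED = "closed book"        # not shown, and nothing was found


def icon_states(current, dictionaries, query):
    """Map each dictionary name to its icon after searching for `query`."""
    states = {}
    for name, entries in dictionaries.items():
        if name == current:
            states[name] = BookIcon.OPEN
        elif query.lower() in entries:
            states[name] = BookIcon.HALF_OPEN
        else:
            states[name] = BookIcon.CLOSED
    return states
```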

In July 2001, a new version of the dictionary “MultiLex 3.5 English Popular” (English-Russian, Russian-English dictionary of general vocabulary edited by O.S. Akhmanova and E.A.M. Wilson) was released. It contains more than 40 thousand dictionary entries.

Version 3.5 has a number of advantages that you will not find in the previous version:

  • possibility of additional installation of dictionaries. By purchasing any English dictionary (version no lower than 3.5), you can easily integrate it into your MultiLex. It is planned to release technical, medical-biological, economic-legal and other collections;
  • pop-up translation. MultiLex 3.5 supports translation via hot keys from any application that supports the clipboard. To use it, select the word and press the corresponding function key (F10 by default); a translation window will appear on the screen. The translation in the window is a hyperlink: if you need more complete information on the word, left-click it to open “MultiLex” with ready-made translation options for the requested word. The pop-up translation window can be kept on top of all windows by selecting the appropriate item in the context menu, which opens when you right-click the “MultiLex” icon (in the lower right corner of the screen). The button on the left side of the pop-up translation window performs a similar function: it lets you “pin” the resulting translation anywhere on your screen.

System requirements

  • sound card compatible with the operating system; headphones or speakers (recommended).

Summary

In conclusion, a few words about personal experience in using machine translation systems and dictionaries.

Three years ago, I used a machine translation system to prepare a report for a Western employer. Several people involved in offshore programming had written a program for a navigation receiver. Unfortunately, few members of the group spoke English well enough to describe the results of their work in the customer’s language, so the reports compiled in Russian had to be translated. It was then that the idea came to me to try out the Stylus machine translation system (as the first versions of the PROMT systems were called). The attempt turned out to be very successful: I translated the 140-page document three times faster than planned. Of course, the translation produced by the program was not perfect, and I had to spend a long time editing it. But the gain was obvious.

Since then, when translating texts of more than 10 pages, I always use machine translation systems.

I told this story to a friend of mine, an entrepreneur. At the time he was starting to sell shoes and establishing connections with German suppliers. He also bought a similar system and still successfully corresponds with the Germans by e-mail (he knows neither English nor German). Having written a letter in Russian, he translates it into German and sends it, and he translates the reply into Russian. And everyone is happy. As a result, my friend is about to open his fifth shoe store in Moscow.

I became familiar with electronic dictionaries even earlier, when I had a need to read foreign books and magazines on technical disciplines with specific vocabulary. Technical electronic dictionaries, dictionaries on telecommunications and computer science allowed me to save a lot of time and effort. Thanks Lingvo!

I hope that my story about new machine translation systems and dictionaries will help you organize your work effectively and ultimately achieve success.

The editors would like to thank the following people for their assistance in preparing the article: Alexandra Andreeva, PROMT company; Andrey Sokolov, Informatics company; Anastasia Savina, ABBYY company; Konstantin Konin and Natalya Talpa, MediaLingua company; Alexey Bukhanov, Arsenal company.

ComputerPress 9'2001
