Method and apparatus for performing full bidirectional translation between a source language and a linked alternative language
Abstract
A system and apparatus for improving international and other communication, and for providing easier access to data, especially digitized data, by means of linked alternative languages generated from a source language. As taught by the present invention, a linked alternative language is a specially designed language form that differs markedly in outward format from its source language: it has been optimized in several ways so that targeted populations can comprehend and use it more efficiently than the source language, yet it has been carefully designed to retain full bidirectional machine translation equivalence to the source language. All use of artificial intelligence and computational linguistics for machine translation as taught in the present invention is constrained by these considerations.
21 Claims
1. An apparatus for providing bidirectional translations of a text of data between a linked alternative language and a source language, wherein the source language is a natural language and the linked alternative language is designed to map fully to the source language in terms of structure and strings of digitized data, comprising:
(a) means for entering the text of data;
(b) a dictionary database of a vocabulary of words in the source language stored as records, with collocated information on the usage pattern of each word and on frequency of use of each word in the source language;
(c) a dictionary database for a vocabulary of words in the linked alternative language stored as records;
(d) means for storing the dictionary database of the vocabulary of words in the source language and the dictionary database of the vocabulary of words in the linked alternative language in a central concordance, the records within the dictionary databases taking the form of strings of digitized data in the linked alternative language and in the source language;
(e) a database of instructions which index relationships between the strings of digitized data in the linked alternative language and the strings of digitized data in the source language;
(f) a database of translation rules, wherein all of the translation rules in the database of translation rules provide a lossless translation between the linked alternative language and the source language wherein the linked alternative language is designed to map fully to the source language in terms of structure and strings of digitized data;
(g) automated means for translating, in both directions, between the linked alternative language and the source language, wherein the means for translating applies to the text the set of translation rules and the two dictionary databases stored within the central concordance; and
(h) means for outputting translated text.
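As an informal illustration of elements (d) through (g) of claim 1, the sketch below models the central concordance as a one-to-one table between source-language strings and linked-alternative-language strings, so that translating in one direction and back returns the original text. The `Concordance` class, the vocabulary entries, and the word-by-word rule are hypothetical and are not taken from the specification.

```python
class Concordance:
    """Toy central concordance: a one-to-one map between source strings and
    linked-alternative-language (LAL) strings. All entries are invented."""

    def __init__(self, pairs):
        self.to_lal = dict(pairs)
        self.to_source = {lal: src for src, lal in pairs}
        # A lossless mapping needs distinct strings on both sides of the table.
        if len(self.to_lal) != len(pairs) or len(self.to_source) != len(pairs):
            raise ValueError("mapping is not one-to-one")

    def translate(self, text, direction):
        table = self.to_lal if direction == "to_lal" else self.to_source
        # Word-by-word substitution. An unknown word raises KeyError instead
        # of being passed through, which could break the round trip.
        return " ".join(table[word] for word in text.split(" "))


concordance = Concordance([("the", "la"), ("cat", "kato"), ("sleeps", "dormas")])
lal_text = concordance.translate("the cat sleeps", "to_lal")
round_trip = concordance.translate(lal_text, "to_source")
```

The constructor rejects any table in which two entries collide, since a many-to-one entry would make the reverse direction ambiguous.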
(a) a database of interaction rules;
(b) means for applying interaction rules to the text of data to create queries;
(c) means for outputting queries to the user;
(d) means for inputting answers to queries;
(e) at least one alternative set of translation rules; and
(f) means for utilizing the answers to queries in order to choose between sets of translation rules for translating the text of data.
3. The apparatus, as recited in claim 2, wherein one of the queries asks if a reduced vocabulary strategy is to be implemented, and further comprising:
(a) at least one reduced vocabulary database, comprising a limited list of words, the reduced vocabulary database further comprising:
(1) paired sets of words where each of the paired set of words has a first word and a second word whereby the first word is a word in the source language and the second word is a word in the reduced vocabulary, wherein the number of different first words is greater than the number of different second words;
(2) paired sets of strings of alphanumeric characters representing multi-word portions of texts of data, where each of the paired set of strings has a first string and a second string whereby the first string is a string of words in the source language and the second string is a string in the reduced vocabulary;
(3) sets of rules relating to sentence structure and syntax as a means for the automatic implementation of the reduced vocabulary strategy by the apparatus;
(4) sets of suggestions and interaction rules for the further implementation of the reduced vocabulary strategy by the user;
(b) a reduced vocabulary database;
(c) means for linking the reduced vocabulary database to the central concordance;
(d) a database of instructions on relationships between the strings of digitized data;
(e) a database of sets of additional translation rules for the specific implementation of the reduced vocabulary strategy, wherein the translation rules allow for translation between the source language, the linked alternative language, a reduced vocabulary version of the source language, and a reduced vocabulary version of the linked alternative language; and
(f) means for applying the reduced vocabulary database, the database of instructions on relationships, and the database of sets of additional translation rules to the text of data.
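The paired word sets of element (a)(1) of claim 3 can be pictured as a many-to-one table in which several source words share one reduced-vocabulary word. The following is a minimal sketch under that reading; the word pairs and the `reduce_vocabulary` helper are invented for illustration.

```python
# Hypothetical paired sets: several source-language "first words" map to a
# single reduced-vocabulary "second word".
REDUCED_PAIRS = {
    "purchase": "buy",
    "acquire": "buy",
    "buy": "buy",
    "automobile": "car",
    "car": "car",
}


def reduce_vocabulary(text, pairs=REDUCED_PAIRS):
    """Replace each word by its reduced-vocabulary counterpart; words with no
    entry in the table are left unchanged."""
    return " ".join(pairs.get(word, word) for word in text.split())


first_words = set(REDUCED_PAIRS)
second_words = set(REDUCED_PAIRS.values())
reduced = reduce_vocabulary("they purchase the automobile")
```

Because this substitution is many-to-one, it cannot be reversed by the table alone.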
4. The apparatus, as recited in claim 2, further comprising a means for translating between a source language and a target language by using a linked alternative language mapped to the source language as a pivot language for translation.
5. The apparatus, as recited in claim 2, further comprising a means for translating between a source language and a target language by using a linked alternative language mapped to a second source language as a pivot language for translation.
6. The apparatus, as recited in claim 2, further comprising a means for translating between any of a plurality of languages by creating a linked alternative language for each of the plurality of languages and then translating between those linked alternative languages.
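Claims 4 through 6 describe using a linked alternative language as a pivot between a source language and a target language. One way to picture this is as the composition of two translation steps, sketched below. The lookup tables, the `make_translator` helper, and the word-by-word rule are assumptions made for the example only.

```python
def make_translator(table):
    """Build a word-by-word translator from a lookup table (toy rule)."""
    def translate(text):
        return " ".join(table[word] for word in text.split())
    return translate


# Invented entries: source language -> linked alternative language (LAL),
# and LAL -> target language.
source_to_lal = make_translator({"good": "bona", "day": "tago"})
lal_to_target = make_translator({"bona": "guten", "tago": "Tag"})


def pivot_translate(text):
    """Translate source -> LAL -> target, with the LAL as the pivot."""
    return lal_to_target(source_to_lal(text))


result = pivot_translate("good day")
```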
7. The apparatus, as recited in claim 1, further comprising:
(a) a database comprising a list of semantic concepts and categories;
(b) a dictionary database in thesaurus form, organized in accordance with the list of semantic concepts and categories;
(c) a database of annotation rules for tagging words and strings in the source language in order to describe their range of grammatical usage in the source language, and to delimit their semantic content in accordance with the list of semantic concepts and categories;
(d) means for storing within the central concordance the words and strings in the linked alternative language which map to the annotated words and strings in the source language;
(e) means for indexing the words and strings comprising the dictionary database to entries in the central concordance; and
(f) means for searching for and locating strings of digitized data listed in the central concordance in terms of the basic concepts and categories in the dictionary database.
8. The apparatus, as recited in claim 1, wherein the means for entering the text of data comprises:
(a) means for inputting voice as audio data;
(b) means for converting the audio data into the form of a digitized audio file; and
(c) means for storing the digitized audio file.
9. The apparatus, as recited in claim 1, wherein the means for entering the text of data comprises a means for downloading the text from an Internet connection.
10. The apparatus, as recited in claim 9, wherein the means for outputting translated text comprises a means for outputting the translations within an Internet application.
11. The apparatus, as recited in claim 1, further comprising:
(a) an interface to a controllable machine;
(b) a database of controller vocabulary in the linked alternative language to provide means for directing the controllable machine in a repertoire of manners in which the controllable machine is controllable, the database of controller vocabulary comprising:
(1) a set of pronounceable words covering the functions of the controllable machine; and
(2) markers for the set of pronounceable words which render the set of pronounceable words fully distinguishable, in both written and audio form, from words in the linked alternative language which do not convey instructions to controllable machines.
12. The apparatus, as recited in claim 1, wherein the means for entering comprises a means for entering the symbols of mathematics and symbolic logic, and strings of the symbols, including mathematical formulae, and further comprising:
(a) a logico-mathematical system database, forming a subsidiary part of the dictionary database for the vocabulary of the linked alternative language and including words in the linked alternative language chosen to equate to the symbols of mathematics and symbolic logic, and including words in the linked alternative language chosen to equate to descriptive strings of words capable of being inserted into sentences in the source language and expressing logical and mathematical relationships;
(b) means for storing the symbols and strings of symbols;
(c) means for indexing the symbols and strings of symbols within the central concordance; and
(d) means for translating the symbols and strings of symbols into pronounceable strings of text in a linked alternative language by means of the logico-mathematical system database.
13. The apparatus, as recited in claim 1, further comprising:
(a) a mnemonic system database, forming a subsidiary part of the dictionary database for the vocabulary of the linked alternative language and including those words that relate to numbers and colors, so devised as to associate each arabic numeral with a specific set of letters of the alphabet used by the linked alternative language, and through the specific letters, to associate each arabic numeral with the words in the linked alternative language which designate the numbers and the basic colors;
(b) means for linking the mnemonic system database to the central concordance; and
(c) means for using the mnemonic system database in the translation process.
14. The apparatus as recited in claim 13, further comprising:
(a) a dictionary database of the most frequently occurring words in the source language and the linked alternative language;
(b) a database of stored abbreviations for a plurality of frequently appearing words in the source language and the linked alternative language, using the mnemonic system database to make abbreviations more readily memorized;
(c) a database of stored abbreviations for a plurality of frequently appearing affixes in the source language and the linked alternative language, using the mnemonic system database to make abbreviations more readily memorized;
(d) means for indexing the abbreviations to the central concordance;
(e) means for querying a user about desired personal adaptations of, and additions to, the abbreviation tables;
(f) means for storing and applying the results of user adaptations;
(g) means for entering abbreviated text into the apparatus; and
(h) means for replacing the abbreviations with strings of digitized data linked to the abbreviations, whereby a standard readable text is outputted.
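Elements (b) and (e) through (h) of claim 14 amount to a lookup table of abbreviations, optionally extended by the user, that is applied to abbreviated input to produce standard readable text. A minimal sketch follows. The abbreviations, the `expand` helper, and the whole-word matching rule are hypothetical.

```python
import re

# Invented stored abbreviations for frequently appearing words and phrases.
ABBREVIATIONS = {"tq": "thank you", "asap": "as soon as possible"}


def expand(text, table):
    """Replace each whole-word abbreviation with the string linked to it."""
    # Longest keys first, so a longer abbreviation wins over a shorter prefix.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda match: table[match.group(1)], text)


# A user adaptation (elements (e) and (f)) is modeled as an extra table entry.
user_table = dict(ABBREVIATIONS, mtg="meeting")
expanded = expand("tq, mtg asap", user_table)
```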
15. The apparatus, as recited in claim 13, wherein the means for entering comprises a color coded digital keyboard using the associations contained in the mnemonic system database to relate the alphanumeric keys to their positions on the keyboard and to fingers used for inputting.
16. The apparatus, as recited in claim 15, wherein the color coded keyboard comprises four horizontal rows of keys with twelve keys per horizontal row, wherein the keys in each row form twelve vertical columns, and wherein:
(a) each of the vertical columns of the keyboard has the following color applied to the keys: white 1, gray 2, black 3, red 4, dark blue 5, yellow 6, purple 7, green 8, orange 9, sky-blue 10, pink 11, and tan 12;
(b) the home keys for the four fingers of the left hand and for the four fingers of the right hand are on the third row from top;
(c) the four fingers of the left hand have, in order, the letters U, T, M, and A as their home keys and these keys are associated, respectively, with the colors white, gray, black, and red;
(d) the four fingers of the right hand have, in order, the letters I, E, N, and O as their home keys, and these keys are associated, respectively, with the colors purple, green, orange, and light blue; and
(e) other letters and numbers are placed on the keyboard with consideration given to their frequency and to mnemonic considerations.
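The column colors and home keys recited in claim 16 can be cross-checked with a small table lookup. The sketch below derives each home key's column from its color. It assumes that "light blue" in element (d) refers to the "sky-blue" of column 10. The claim lists colors rather than column numbers for the home keys, so the column positions are an inference, and the `column_of` helper is hypothetical.

```python
# Column colors 1 through 12, as listed in element (a).
COLUMN_COLORS = ["white", "gray", "black", "red", "dark blue", "yellow",
                 "purple", "green", "orange", "sky-blue", "pink", "tan"]

# Home keys and their colors from elements (c) and (d). Assumption: the
# "light blue" of element (d) is the "sky-blue" of column 10.
HOME_KEYS = {"U": "white", "T": "gray", "M": "black", "A": "red",
             "I": "purple", "E": "green", "N": "orange", "O": "sky-blue"}


def column_of(letter):
    """Return the 1-based column whose color matches the home key's color."""
    return COLUMN_COLORS.index(HOME_KEYS[letter]) + 1


home_columns = {letter: column_of(letter) for letter in HOME_KEYS}
```

Under these assumptions the left-hand home keys fall in columns 1 to 4 and the right-hand home keys in columns 7 to 10, leaving columns 5, 6, 11, and 12 without home keys.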
17. The apparatus, as recited in claim 1, wherein the apparatus further provides means for translating, displaying, and outputting at least one page of communicative text by employing at least one template of such a text in a source language and its linked alternative language, the template delimiting the user's input to only such sentences, parts of sentences, words, and other strings of alphanumeric input which are in a sufficiently delimited context as to permit accurate translation of the input from the source language and the linked alternative language into an outputted text in at least one target language other than the source language and the linked alternative language, further comprising:
(a) a database of delimiting templates in the source language, stored in the form of alphanumeric strings of digitized data in the source language, comprising;
(1) delimiting templates for common communicative texts, including optional page formats and loci for the insertion of graphics;
(2) source language wording for entire sentences that appear within the context of the texts;
(3) source language wording for incomplete sentences within the context of the texts and with at least one indicated space into which at least one word from a delimited vocabulary may be inserted by the user; and
(4) sets of words forming delimited vocabulary for optional insertion into the indicated space;
(b) a database of delimiting templates in the linked alternative language, structured identically to the database of delimiting templates for the source language so that each alphanumeric string listed in the database of delimiting templates for the linked alternative language is linked to an alphanumeric string in the database of delimiting templates for the source language;
(c) a database of delimiting templates in at least one target language other than the source language and the linked alternative language, structured identically to the database of delimiting templates for the linked alternative language so that each alphanumeric string listed in the database of delimiting templates for the target language may be linked to an alphanumeric string in the database of delimiting templates for the linked alternative language;
(d) means for storing all databases of delimiting templates within the apparatus in a central concordance, the records within the databases of delimiting templates taking the form of alphanumeric strings of digitized data indexed to those strings in the source language;
(e) a database of translation rules for delimiting templates, wherein all of the translation rules and the strings to which they apply allow for a fully accurate automated translation of the delimiting templates and their content among the languages to which they are linked, and wherein the translation rules determine which identified strings of digitized data should be substituted for which other strings of digitized data, and wherein the translation rules establish the order in which the strings of digitized data are to appear in the translated text;
(f) means for storing the database of translation rules for delimiting templates in the central concordance;
(g) an automatic means for translating a communicative text which has been generated within a delimiting template, between the linked alternative language and any target language that has been linked for translation within the constraints of the delimiting template, applying the set of template translation rules and the database of delimiting templates in order to produce a translation; and
(h) means for applying the database of translation rules for delimiting templates to a delimiting template of text of data in order to produce a translation of the contents of the delimiting template between any two languages that have been linked within the constraints of the delimiting template.
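The delimiting templates of claim 17 can be pictured as identically structured sentence frames in each language, each with an indicated space that accepts only words from a delimited vocabulary. The sketch below is a rough illustration using two languages. A linked-alternative-language frame would be a third entry of the same shape. The template text, the vocabulary, and the `render` helper are invented.

```python
# Identically structured templates, one per language, with one insertion slot.
TEMPLATES = {
    "source": "Please send {item} by Friday.",
    "target": "Bitte senden Sie {item} bis Freitag.",
}

# Delimited vocabulary: each permitted slot filler is linked across languages.
SLOT_VOCABULARY = {
    "the invoice": {"source": "the invoice", "target": "die Rechnung"},
    "the contract": {"source": "the contract", "target": "den Vertrag"},
}


def render(slot_choice, language):
    """Fill the template for `language`, accepting only delimited vocabulary."""
    if slot_choice not in SLOT_VOCABULARY:
        raise ValueError(f"{slot_choice!r} is outside the delimited vocabulary")
    return TEMPLATES[language].format(item=SLOT_VOCABULARY[slot_choice][language])


source_text = render("the invoice", "source")
target_text = render("the invoice", "target")
```

Restricting the slot to listed fillers is what keeps the translation exact in this toy model: every possible output has a stored counterpart in the other language.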
18. A method for creating and employing a linked alternative language, wherein the linked alternative language is linked to a source language, and is designed to map fully to the source language in terms of structure and strings of digitized data, and wherein the linked alternative language is further designed to provide communicative features and efficiencies on the computer beyond those provided by the source language, the method comprising the steps of:
(a) establishing the parameters of the system, comprising the steps of;
(1) choosing the source language;
(2) targeting the user group which the linked alternative language is to serve; and
(3) choosing the communicative features and the efficiencies on computer systems to be accommodated by the linked alternative language;
(b) entering into a computer a dictionary database of vocabulary in the source language, with collocated information on the usage pattern of each word and on its frequency of use in the source language;
(c) entering a framework for a dictionary database for the vocabulary of the linked alternative language, the framework being structured to map to the dictionary database of vocabulary in the source language;
(d) building the lexical records within the dictionary database for the vocabulary of the linked alternative language, comprising the steps of;
(1) supplying the linked alternative language with a phonetic system generally reflecting the speech habits of its targeted speakers;
(2) creating a graphemic system to provide a method for writing the linked alternative language in a manner which reflects its phonetic system and is compatible with computer capabilities;
(3) using a computer to screen the morphemes tentatively chosen for the linked alternative language to assure that no two morphemes are so close phonetically as to lead to a serious confusion among the targeted speakers; and
(4) supplying the linked alternative language with a system for establishing sentence structure which is capable of retaining full computer-implemented mapping to the source language;
(e) storing the two dictionary databases within the computer in a central concordance, the records within the dictionary databases taking the form of strings of digitized data in the linked alternative language and in the source language;
(f) entering a database of instructions which index the relationships between the strings of digitized data in the linked alternative language and the strings of digitized data in the source language;
(g) implementing, on a computer, a set of translation rules wherein there is fully accurate automated and lossless translation in both directions between the linked alternative language and the source language; and
(h) outputting translated text.
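Step (d)(3) of claim 18 calls for computer screening of tentatively chosen morphemes so that no two are confusingly close. The claim does not say how closeness is measured. The sketch below substitutes plain edit distance over spellings as a crude stand-in for phonetic distance. The threshold, the candidate morphemes, and the function names are assumptions.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def confusable_pairs(morphemes, min_distance=2):
    """Return pairs of candidates closer than `min_distance` edits."""
    return [(a, b) for a, b in combinations(morphemes, 2)
            if edit_distance(a, b) < min_distance]


# Invented candidate morphemes; "kato" and "gato" differ by one letter.
flagged = confusable_pairs(["kato", "gato", "domo", "luna"])
```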
(a) inputting written texts of data to a computer employing a neural network system;
(b) inputting voice data to the computer in the form of a human generated audio stream representing the same texts of data;
(c) tasking the computer to convert the written texts of data phoneme by phoneme and word by word into an audio stream;
(d) repeatedly comparing the computer generated audio stream and the human generated audio stream by means of a neural network which is trained using the techniques of back propagation;
(e) applying, recursively, the above procedures; and
(f) storing the network state when the difference between the computer generated audio stream and the human generated audio stream becomes negligible, and wherein the step of outputting, comprises the step of outputting in the linked alternative language in audio according to the pronunciation standards for the linked alternative language.
20. A method implemented on a computer for translating in both directions between a linked alternative language and a source language, wherein the linked alternative language is designed to map fully to the source language in terms of structure and strings of digitized data, comprising the steps of:
(a) entering a text of data into the computer system;
(b) dividing the text of data into sentences;
(c) consulting a central concordance, wherein the central concordance contains strings of digitized data, including spaces and punctuation marks, in the linked alternative language and in the source language, and instructions on relationships between the strings of digitized data;
(d) identifying within each sentence in the text of data those strings of digitized data, which appear in the central concordance, wherein the concordance contains strings of digitized data in the linked alternative language and the source language, and instructions on relationships between the strings of digitized data entering a database of instructions which index the relationships between the strings of digitized data in the linked alternative language and the strings of digitized data in the source language;
(e) implementing, on the computer, a set of translation rules wherein each of the translation rules allows for a fully accurate automated and lossless translation in both directions between the linked alternative language and the source language to the text, wherein the translation rules determine whether identified strings of digitized data should be substituted with related strings of digitized data in the concordance and wherein the translation rules establish the order in which the strings of digitized data are to appear in the translated text; and
(f) outputting translated text.
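Steps (a) through (f) of claim 20 can be sketched as a small pipeline: split the text into sentences, find the strings that appear in the concordance, substitute their linked counterparts, and output the result. The version below is one possible reading. The concordance entries, the longest-match rule, and the sentence-splitting pattern are hypothetical, and reordering of strings is not modeled.

```python
import re

# Invented concordance: source strings, including a multi-word string, linked
# to strings in the linked alternative language.
CONCORDANCE = {"good morning": "bonmaten", "friend": "amiko", "thank you": "dank"}
REVERSE = {lal: src for src, lal in CONCORDANCE.items()}


def split_sentences(text):
    """Step (b): split after sentence-final punctuation."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def translate_sentence(sentence, table):
    """Steps (c) to (e): find concordance strings and substitute them."""
    # Longest strings first, so multi-word entries win over their parts.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda match: table[match.group(0)], sentence)


def translate(text, table):
    """Steps (a) to (f) for a whole text."""
    return " ".join(translate_sentence(s, table) for s in split_sentences(text))


forward = translate("good morning friend. thank you.", CONCORDANCE)
back = translate(forward, REVERSE)
```

Translating with the inverted table recovers the input here only because every entry in the toy concordance is one-to-one.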
(a) adding a mnemonic system database, as a subsidiary part of the dictionary database for the vocabulary of the linked alternative language and which includes words that relate to numbers, days of the week, months, directions of the compass, and basic colors;
(b) supplying the mnemonic system database with morphemes, words, and longer digital strings in the linked alternative language vocabulary so formulated as to associate each arabic numeral with a specific set of letters of the alphabet used by the linked alternative language, and through the specific set of letters, to associate each arabic numeral with words in the linked alternative language which designate numbers, days of the week, months, directions of the compass, and basic colors;
(c) providing means for the user to input lists of items, numbers, dates, and other data to be entered into human memory; and
(d) outputting to the user suggested mnemonic techniques, based on the mnemonic system database to aid in the retention of such data.
Specification