Method for Automatically Detecting Meaning and Measuring the Univocality of Text

US 20190332670A1
Filed: 05/16/2019
Published: 10/31/2019
Est. Priority Date: 01/28/2014
Status: Active Grant

First Claim

1. A method of machine translation for automatically detecting meaning-patterns in a text that includes a plurality of input words of at least one sentence using a database system that includes, stored a table of words versus meaning-signal categories/sense properties, words of a language, a plurality of pre-defined categories of meaning describing sense properties of the words, and meaning-signals for all the words, wherein each meaning-signal is a univocal numerical characterization between one of the words and a category of meaning associated with said word, wherein the method comprises:

a) reading of the text with input words into a device for data entry, from a means for data input, linked to a device for data processing,

b) comparison, by the device for data processing, of the input words with the words in the table of words versus meaning-signal categories/sense properties stored in the database system that is connected directly and/or via remote data line to the device for data processing,

c) based on the comparison in step b), assignment, by the device for data processing, of at least one meaning-signal from the table to each of the input words, wherein in the case of homonyms two or more meaning-signals are assigned, wherein each meaning-signal is assigned to an input word based on the sense property associated with the input word in the table;

d) in the event that the assignment of the meaning-signals to the input words in step c) is univocal, the meaning-pattern identification is complete, and proceed to step g),

e) in the event that more than one meaning-signal is assigned to an input word in step c), the device for data processing compares the meaning-signals assigned to the input word with one another in an exclusively context-controlled manner, excluding comparisons of meaning-signals to themselves and comparisons of meaning-signals that, based on a numerical pattern of the univocal numerical characterization of each meaning-signal, do not match semantically, logically, morphologically, or syntactically, and assigns a degree of meaning to each comparison based on a degree of matching semantically, logically, morphologically, or syntactically,

f) meaning-signal comparisons that match are automatically numerically evaluated by the device for data processing according to the degree of matching of their meaning-signals and recorded,

g) the device for data processing automatically compiles all input words resulting from steps d) and f) into output words in a target language and outputs said output words as the meaning-pattern of the text based on the degree of matching of the meaning-signals in step f), wherein;

after a word meaning score “SW” is calculated by a meaning modulator of the device for data processing for all of the input words of the text, wherein the word meaning score is the number of entries of each word in the database system, coupled with the relevance of the meaning-pattern of each word in the context of the sentence;

if the meaning score “SW” for a word of the sentence is equal to 0 (zero), then the word is spelled incorrectly and the sentence receives a sentence score “SS” = 0,

if the meaning score “SW” for a word of the sentence is greater than 1, wherein a word with SW > 1 has more than one possible meaning in the sentence and its context, then the analyzed sentence is incorrect and/or is not univocally formulated, and the sentence score is then set to “SS” = “SW”,

if more than one word of the sentence has a meaning score “SW” > 1, then the sentence score “SS” is set to the maximum value “SW” of the meaning scores of the words of said sentence,

if all the words of the sentence have a meaning score “SW” = 1, then the sentence is univocal and receives the sentence score “SS” = 1,

if words of the sentence have a meaning score “SW” = −2, then said words allow both upper and lower case spelling, wherein the sentence score “SS” then receives the value “SS” = −2, until a correct upper or lower case spelling of the words with “SW” = −2, in this sentence, is finally determined,

if the text originates from speech input and if words have a meaning score “SW” not equal to 1 and belong to a homophone group, identified by device for data processing, then the words receive the meaning score “SW” = −3, and the sentence score “SS” receives the value −3 until the correct homophone of the group in this sentence and its context is finally determined, and

if words of the sentence have meaning score “SW” > 1, then with words of an arbitrary number “v” of preceding or of “n” following sentences of the text it is checked whether the words are included in the preceding or following sentences which, due to the modulation of their meaning-signals, lead to “SW” = 1 in the input sentence, wherein for normal speech applications and easily understandable texts, “v” = 1 and “n” = 0, and

h) in response to input of a sentence via a speech recognition system, the device for data processing automatically determines from the sentence a grammatically correct sentence wherein inflectable homonyms are replaced with synonyms.
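
The sentence-score rules in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `sentence_score` is hypothetical, the word scores are assumed to have been computed already by the meaning modulator, and the order in which the rules are checked (0 first, then −3, then −2, then values above 1) is an assumption, since the claim does not state a precedence. The look-back over “v” preceding and “n” following sentences is not modeled.

```python
from typing import Sequence


def sentence_score(word_scores: Sequence[int]) -> int:
    """Derive a sentence score SS from per-word meaning scores SW.

    Assumed precedence (not stated in the claim): 0, then -3, then -2,
    then the maximum of all scores greater than 1, otherwise 1.
    """
    if not word_scores:
        raise ValueError("sentence has no words")
    if any(sw == 0 for sw in word_scores):
        return 0      # at least one word is not in the database (misspelled)
    if any(sw == -3 for sw in word_scores):
        return -3     # unresolved homophone from speech input
    if any(sw == -2 for sw in word_scores):
        return -2     # unresolved upper/lower case spelling
    ambiguous = [sw for sw in word_scores if sw > 1]
    if ambiguous:
        return max(ambiguous)   # SS is the largest remaining ambiguity
    if all(sw == 1 for sw in word_scores):
        return 1      # every word is univocal
    raise ValueError("word score outside the values named in the claim")


# Example: one word still has three possible meanings, so SS is 3.
example = sentence_score([1, 3, 2, 1])
```

Under these assumptions a sentence is ready for translation only when the function returns 1; every other value names the kind of problem that still has to be resolved.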

Abstract

A method for automatically detecting meaning patterns in a text that includes input words of at least one sentence, includes a database system containing words of a language, a number of defined categories of meaning in order to describe properties of the words, and meaning signals for all the words stored in the database, wherein a meaning signal is a clear numerical characterization of the meaning of the word using the categories of meaning.
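
The database described in the abstract can be pictured with a small Python sketch. Everything here is illustrative: the table contents, the three-number encoding of a meaning signal, and the names `TABLE` and `assign_meaning_signals` are assumptions, not data or identifiers from the patent. The sketch only shows the lookup of claim 1, steps b) and c): each input word is compared with the table and receives every meaning signal stored for it, so a homonym receives two or more and an unknown word receives none.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical encoding: one number per category of meaning.
MeaningSignal = Tuple[int, ...]

# Hypothetical table of words versus meaning signals.
TABLE: Dict[str, List[MeaningSignal]] = {
    "bank": [(1, 0, 0), (0, 1, 0)],   # homonym: two entries
    "river": [(0, 1, 0)],
    "the": [(0, 0, 1)],
}


def assign_meaning_signals(
    words: Sequence[str],
    table: Dict[str, List[MeaningSignal]],
) -> List[Tuple[str, List[MeaningSignal]]]:
    """Pair each input word with all meaning signals stored for it."""
    return [(word, list(table.get(word, []))) for word in words]


assigned = assign_meaning_signals(["the", "river", "bank", "rivr"], TABLE)
# Number of table entries per word; the claim combines this count with
# contextual relevance to obtain the word meaning score SW.
entry_counts = [len(signals) for _, signals in assigned]
```

The raw entry count is only the starting point: in the claim, context-controlled comparison of the signals is what reduces a homonym to a single meaning.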

27 Claims

2. The method as claimed in claim 1, further comprising:
- determining, in accordance with a pre-defined matching criterion, whether the meaning-pattern for at least one input word of the text has more than one remaining meaning, whereupon no unique meaning-pattern and/or no unique meaning of the sentence exists in the context of the sentence, and outputting the non-uniqueness and its cause to a User Interaction Manager.
3. The method as claimed in claim 1, wherein the text with the input words is a string of characters that originates from written text, from acoustically recorded text via a speech recognition program, photographed text, or OCR.
4. The method as claimed in claim 1, wherein, following step (e), in response to all of the input words of the text being assigned meaning-signals, generating a signal for a degree of univocality of the text.
5. The method as claimed in claim 1, further comprising:
- generating, for each word where SW=0, an error message indicating a spelling error and determining for said word a possibility for eliminating the error that is stored in a storage that is accessible to a User Interaction Manager.
6. The method as claimed in claim 4, wherein for a word where “SW” = −2, launching an error message which indicates a case error in the spelling of said word, naming said word position in the sentence, the cause of the error, and storing the error, and storing the error message in a storage that is accessible to a User Interaction Manager.
7. The method as claimed in claim 1, wherein in response to no words having SW=0, updating the meaning-signals of a current paragraph on the basis of constraint references associated with words of the current paragraph and storing updated meaning-signals in a storage that is accessible to a User Interaction Manager.
8. The method as claimed in claim 1, wherein for sentences with SS>1, generating an autotranslation message which lists still existing number of SW meaning possibilities of each word and, for each word, retrieve synonyms of said word from the database system on the basis of said word's meaning-signals, and storing the retrieved synonyms in a storage that is accessible to a User Interaction Manager.
9. The method as claimed in claim 1, wherein the sentence is in a natural language which is translated into the target language, wherein a sentence with score SS=1 is automatically acquired, or the text of the sentence is processed until the sentence has a score SS=1.
10. The method as claimed in claim 9, wherein the text of the sentence is translated into the target language, on the basis of univocal meaning-signals of the words of the sentence.
11. The method as claimed in claim 9, further comprising:
- on the basis of language-pair-specific rules stored in the database system, adjusting an order of the words in the sentence in relation to their morphology and inflection, and of the order of the sentence constituents;
  
  determining main clauses, dependent clauses, inserted dependent clauses, subjects, predicates, objects, text parts between hyphens, and/or text parts between two brackets (open/closed); and
  
  storing the words in the target language in a storage in an order that is at least as
12. The method as claimed in claim 1, wherein the output words in the target language are displayed or acoustically reproduced.
13. The method as claimed in claim 1, wherein in the presence of at least one word with homophones in the sentence, reviewing a degree of meaning-signal correspondence of the word and all its other homophonous spellings in relation to context, and replacing the word by the homophone with a greatest meaning modulation in the sentence or outputting an error message where there is insufficient computational differentiation among the meaning-signals of words of a homophone group in the context.
14. The method as claimed in claim 1, wherein in response to the sentence including garbled text when at least one word SW=0, automatically and systematically reformulating the sentence by correctly spelling incorrect words, with priority on words that are similar to homophones of said word, or that correspond to omissions of letters, spaces, upper/lower case error(s), and/or accenting.
15. The method as claimed in claim 14, wherein via the meaning-signals of correctable words, determining whether one or more sentences with a SS=1 is/are produced, and if so outputting the one or more sentences, otherwise if no sentence with a SS=1 is identified after a specified time, terminating the step of determining whether one or more sentences with a SS=1 are produced, wherein the sentence including the input words is then tagged with information of the words that were analyzed for correction, and if at least one sentence with a score unequal to 1 exists, the sentence having the fewest words with SW=0 is tagged and stored in a storage accessible to a User Interaction Manager.
16. The method as claimed in claim 15, wherein a textual content of the tagged sentence is determined by meaning-checking the univocality of the words of the sentence.
17. The method as claimed in claim 16, further comprising:
- updating the database with the meaning-signals of the words of the database before step (a).
18. The method as claimed in claim 1, further comprising:
- including all same-language synonyms and all foreign-language synonyms in all their valid inflections in the search.
19. The method as claimed in claim 1, further comprising:
- combining the meaning-signals of multiple input words.
20. The method as claimed in claim 1, further comprising:
- determining a relevance of statements in text in a natural language to a written topic on the basis of the meaning-signals of the words of the sentence, wherein pre-defined combinations or patterns of meaning-signals are compared with tagged words of the written topic.
21. The method as claimed in claim 20, further comprising ranking an overlap of the meaning-signals of the written topic and the sentence with pre-defined meaning modulation patterns on the basis of at least one of the following within the structure of the sentence:
- meaning-signals of logical operators, and/or meaning-signals disjunctors, and/or sentential connectors.
22. The method as claimed in claim 1, further comprising:
- acquiring, by the device for data processing, spoken input of the user as text and processing the text by meaning-checking the univocality of the words of the text.
23. The method as claimed in claim 22, further comprising:
- breakdown of the text into individual sentences and determining for each sentence if it is a statement sentence, a question sentence, or an exclamation sentence.
24. The method as claimed in claim 23, further comprising:
- comparing meaning-signals of the statement and/or the question sentences based on their matching/correspondence with a database of statement sentences, response sentences, and standard question sentences of a machine-readable text ontology and carrying out at least one of the following steps;
  
  (a) when values of the meaning-signals of the words of the sentence is above a certain level, the response sentence or the statement sentence rated highest in a matching/correspondence value is used;
  
  (b) generating by a speech output system a confirmation of highest ranking individual sentences;
  
  (c) outputting by a speech output system for selection by the user a highest ranking response sentence, wherein the speech output system only allows the user to make controlled answers on request;
  
  (d) receiving from the user, in response to the device for data processing outputting user detectable information, one or more questions on the basis of information obtained by the user in response to the output of detectable information; and
  
  (e) when values of the meaning-signals are below a predetermined level, generating, based on a previous question, a dialog to which the user replies and evaluating;
  
  redundancy of the dialog or of content-based patterns in the reply, meaning-signal patterns in a verbal reply of the user during the dialog, and/or visually perceivable replies of the user via a camera.
25. The method as claimed in claim 1, further comprising, in response to the words of the sentence not being tagged with meaning-signals after the sentence has SS>0, performing spell-checking on the sentence.
26. The method as claimed in claim 1, further comprising during entry of words on a keyboard, recognizing the entered words using meaning checking, and automatic completion of the words with words from the database system on the basis of a best match with syntax and context at the time of entering the words on the keyboard.
27. The method as claimed in claim 1, wherein, for encryption of one or more input sentences of a natural language using meaning-checking the univocality of the sentence, in each input sentence, “m” words are replaced in a grammatically/semantically well-formed manner with words from the database system, and/or “n” words are added in a grammatically/semantically well-formed manner with words from the database system which have meaning-signals related to their immediate, contextual environment, whereupon by insertion, negation, relativization, or omission and/or by use of antonyms of the “m” and/or “n” words from the database system the sentence meaning can be changed, but without the sentence score being changed, whereupon the sentence is no less semantically/factually meaningful than the sentence from which it is produced, with “m” >= 1 or “n” >= 0, and wherein at least one of the following steps is carried out;
  
  a) all alphanumeric chains which are proper names and/or dates and/or pure numbers which have their own meaning-signals, or to which automatically matching meaning-signals can be assigned, and/or selected single words are each replaced by coded, anonymized words, to which shortened meaning-signals, appropriate to a degree of anonymization, are added,
  
  b) each input sentence is stored taking account of the original order, and a log file is stored of all changes that were created as sentence variants or anonymizations, wherein each change and derivable content of the change and the position in the respective sentence are recorded,
  
  c) identifying in a database, sentences that are semantically, but not logically, similar to each input sentence to be encrypted, and that has a sentence score SS=1,
  
  d) the number of sentences of the original text of one or more input sentences is increased to at least 7 if, over said text plus sentence variants, there are less than 7 input sentences to be encrypted,
  
  e) text is created which contains the one or more input sentences, plus “m” appended sentences which are automatically created variants of the one or more input sentences,
  
  f) scrambling a sequence of at least two of the input sentences and appending information regarding modification of the sequence before and after the scrambling to a log file, and unscrambling the scrambled sentence on the basis of the information regarding modification of sequence stored in the log file, and
  
  g) queries of encrypted text are tagged on individual words and/or sentences in such a way that, after reconstruction of the input text translation queries, error messages and/or semantic information of the sentences are automatically cancelled whereupon context-related information which due to the scrambling are initially no longer in context, are reconstructed in the input text.
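
Step f) of claim 27 (scramble the sentence order, record the change, and later restore the order from the record) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented procedure: the function names `scramble` and `unscramble` are hypothetical, the "log file" is reduced to an in-memory list of original positions, and a seeded standard-library shuffle stands in for whatever reordering scheme the patent specification actually uses.

```python
import random
from typing import List, Sequence, Tuple


def scramble(sentences: Sequence[str], seed: int) -> Tuple[List[str], List[int]]:
    """Shuffle sentences and return them with a log of original positions.

    log[k] is the original index of the sentence now at position k.
    """
    log = list(range(len(sentences)))
    random.Random(seed).shuffle(log)       # seeded, so the run is repeatable
    scrambled = [sentences[i] for i in log]
    return scrambled, log


def unscramble(scrambled: Sequence[str], log: Sequence[int]) -> List[str]:
    """Restore the original order using the log written by scramble()."""
    restored = [""] * len(scrambled)
    for new_pos, old_pos in enumerate(log):
        restored[old_pos] = scrambled[new_pos]
    return restored


original = [
    "The contract was signed on Monday.",
    "Payment is due within thirty days.",
    "Delivery follows after payment.",
]
mixed, order_log = scramble(original, seed=7)
round_trip = unscramble(mixed, order_log)
```

Because the log holds the complete permutation, the round trip is lossless; without the log, a reader of the scrambled text has no record of which sentence preceded which.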

Current Assignee: Speech SenSz GmbH
Original Assignee: Somol Zorzin GmbH
Inventors: Zorzin, Luciano
Granted Patent: US 11,068,662 B2

CPC Class Codes

G06F 40/232   Orthographic correction, e....

G06F 40/268   Morphological analysis

G06F 40/30   Semantic analysis

G06F 40/56   Natural language generation
