Statistical natural language understanding using hidden clumpings

US 5,987,404 A
Filed: 01/29/1996
Issued: 11/16/1999
Est. Priority Date: 01/29/1996
Status: Expired due to Fees

Abstract

The invention uses statistical methods to perform natural language understanding. The key notion is that there are "strings" of words in the natural language that correspond to a single semantic concept. One can then define an alignment between an entire semantic meaning (consisting of a set of semantic concepts) and the English sentence. This is modeled using p(E,A|S). One can model p(S) separately. This allows each parameter to be modeled using many different statistical models.
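
Read together with claims 4 and 5, the quantities named in the abstract fit together as shown below. This is only a summary in standard noisy-channel notation; the symbol \hat{S} for the decoded meaning is a label introduced here, not one used in the patent text.

```latex
\begin{align}
p(E \mid S) &= \sum_{A} p(E, A \mid S) \\
\hat{S} &= \arg\max_{S} \; p(E \mid S)\, p(S)
\end{align}
```

For the Viterbi decoder of claim 5, the sum over A in the first line is replaced by a maximum over A.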

440 Citations

12 Claims

1. A method for performing natural language understanding, comprising:
- a training phase, comprising:
  
  providing a first sentence E of a given language;
  
  providing the meaning S of sentence E;
  
  generating a translation model by summing a probability of correctness over one or more possible alignments and clumpings between sentence E and meaning S, wherein the actual alignment and clumping are not known, to produce the probabilities of the translation model's parameters;
  
  producing the probability distributions of the alignment and storing the probability distributions;
  
  an understanding phase, comprising:
  
  inputting a second sentence to be understood;
  
  producing, based upon the probability distributions, the meaning of the second sentence, wherein the probability distribution assumes that the sentence E is generated under predetermined semantic rules in non-overlapping substrings such that each substring of the sentence E is generated by one concept in a semantic library, and wherein a set of substrings is called a clumping.
- Dependent claims (2, 3, 4, 5):
  - 2. The method of claim 1, wherein the training phase assumes that all possible alignments and the clumping is not known, and wherein an expectation maximization ("EM") algorithm is used to train the model parameters to a maximum likelihood value, and wherein the parameters of the translation model are used to predict the probability of the sentence E and a specific alignment of E to S given the semantic meaning S, p(E,A|S).
  - 3. The method of claim 2, wherein the translation model comprises a hierarchy of sub-models.
  - 4. The method of claim 3, further comprising constructing a language model to model the probability of the semantic meaning, p(S), and using this model to determine arg max_s (p(E|S)p(S)) for a new E.
  - 5. The method of claim 4, further comprising searching through the set of semantic meanings S to find the one that maximizes p(E|S)p(S), wherein for the maximum likelihood decoder, p(E|S)=the sum over A of p(E,A|S), and for the Viterbi decoder p(E|S)=max_A p(E,A|S).
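
The two decoders in claim 5 differ only in how the hidden clumping and alignment are collapsed. A minimal Python sketch of that difference follows, assuming a toy model in which each clump is generated independently by one concept chosen uniformly from S. The table `t`, the concept names, and the function names are hypothetical illustrations, not parameters or sub-models from the patent.

```python
from itertools import product


def clumpings(words):
    """Yield every split of `words` into contiguous, non-overlapping clumps."""
    if not words:
        yield []
        return
    for i in range(1, len(words) + 1):
        head = tuple(words[:i])
        for rest in clumpings(words[i:]):
            yield [head] + rest


def joint_scores(words, concepts, t):
    """Yield a toy stand-in for p(E,A|S), one value per clumping and alignment.

    Assumption: each clump is generated by one concept, picked uniformly
    from S, with probability t[(concept, clump)].
    """
    for clumps in clumpings(words):
        for alignment in product(concepts, repeat=len(clumps)):
            score = 1.0
            for clump, concept in zip(clumps, alignment):
                score *= t.get((concept, clump), 0.0) / len(concepts)
            yield score


def p_e_given_s(words, concepts, t, viterbi=False):
    """Sum over the hidden structure (maximum likelihood) or keep the best (Viterbi)."""
    scores = list(joint_scores(words, concepts, t))
    return max(scores) if viterbi else sum(scores)


def decode(words, candidates, t, prior, viterbi=False):
    """Return the candidate meaning S that maximizes p(E|S) * p(S)."""
    return max(candidates,
               key=lambda s: p_e_given_s(words, s, t, viterbi) * prior[s])


# Hypothetical toy parameters, made up for illustration only.
t = {
    ("FLIGHT", ("show", "flights")): 0.5,
    ("FLIGHT", ("show",)): 0.2,
    ("FLIGHT", ("flights",)): 0.3,
    ("CITY", ("to", "boston")): 0.6,
    ("CITY", ("boston",)): 0.3,
}
sentence = ["show", "flights", "to", "boston"]
candidates = [("FLIGHT", "CITY"), ("FARE", "CITY")]
prior = {candidates[0]: 0.5, candidates[1]: 0.5}

print(decode(sentence, candidates, t, prior))
print(decode(sentence, candidates, t, prior, viterbi=True))
```

Exhaustive enumeration is used here only because the example is tiny; a sentence of n words has 2^(n-1) clumpings, so a practical system would need dynamic programming or pruning.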

6. A method for training a natural language understanding system, comprising:
- providing a sentence E of a given language;
  
  providing meanings S of the sentence E;
  
  generating a translation model by summing a probability of correctness over one or more possible alignments and clumpings between E and S, wherein the actual alignment and clumping are not known, to produce the probabilities of the translation model's parameters, and an expectation maximization ("EM") algorithm is used to train the model parameters to a maximum likelihood value, and wherein the parameters of the translation model are used to predict the probability of the sentence E and a specific alignment of E to S given the semantic meaning S, p(E,A|S).
- Dependent claims (7, 8, 9):
  - 7. The method of claim 6, wherein the translation model comprises a hierarchy of sub-models.
  - 8. The method of claim 6, further comprising constructing a language model to model the probability of the semantic meaning, p(S), and using this model to determine arg max_s (p(E|S)p(S)) for a new E.
  - 9. The method of claim 6, further comprising searching through the set of semantic meanings S to find the one that maximizes p(E|S)p(S), wherein for the maximum likelihood decoder, p(E|S)=the sum over A of p(E,A|S), and for the Viterbi decoder p(E|S)=max_A p(E,A|S).
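
The EM training described in claim 6 can be illustrated with a small Python sketch in which the clumping and alignment are treated as hidden variables. It assumes a single toy parameter table `t[(concept, clump)]` in place of the patent's translation model, and it enumerates every hypothesis explicitly; the function names and the training pairs are hypothetical.

```python
from collections import defaultdict
from itertools import product


def clumpings(words):
    """Yield every split of `words` into contiguous, non-overlapping clumps."""
    if not words:
        yield []
        return
    for i in range(1, len(words) + 1):
        head = tuple(words[:i])
        for rest in clumpings(words[i:]):
            yield [head] + rest


def em_step(pairs, t):
    """Run one EM iteration over (sentence, meaning) pairs and return a new t."""
    counts = defaultdict(float)
    for words, concepts in pairs:
        # E-step: weight every (clumping, alignment) hypothesis by the current model.
        hyps = []
        for clumps in clumpings(words):
            for alignment in product(concepts, repeat=len(clumps)):
                weight = 1.0
                for clump, concept in zip(clumps, alignment):
                    weight *= t.get((concept, clump), 0.0)
                hyps.append((clumps, alignment, weight))
        total = sum(weight for _, _, weight in hyps)
        if total == 0.0:
            continue  # the current model cannot explain this pair
        # Accumulate expected counts under the posterior over hypotheses.
        for clumps, alignment, weight in hyps:
            for clump, concept in zip(clumps, alignment):
                counts[(concept, clump)] += weight / total
    # M-step: renormalize the expected counts separately for each concept.
    totals = defaultdict(float)
    for (concept, _), count in counts.items():
        totals[concept] += count
    return {key: count / totals[key[0]]
            for key, count in counts.items() if totals[key[0]] > 0.0}


def train(pairs, iterations=5):
    """Start from equal weights on every candidate clump, then iterate EM."""
    t = {(concept, clump): 1.0
         for words, concepts in pairs
         for clumps in clumpings(words)
         for clump in clumps
         for concept in concepts}
    for _ in range(iterations):
        t = em_step(pairs, t)
    return t


# Hypothetical training pairs: (sentence E, meaning S as a tuple of concepts).
pairs = [
    (("show", "flights"), ("FLIGHT",)),
    (("flights", "to", "boston"), ("FLIGHT", "CITY")),
    (("to", "boston"), ("CITY",)),
]
model = train(pairs)
```

This sketch has no prior on the number or length of clumps, so it tends to favour long clumps; a fuller model would need additional sub-models to counter that.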

10. A method for training a natural language understanding system, comprising:
- providing a sentence E of a given language;
  
  providing the meaning S of the sentence E;
  
  generating a translation model by summing the probability of correctness over one or more possible alignments and clumpings between E and S, wherein the actual alignment and clumping are not known, to produce the probabilities of the translation model's parameters;
  
  producing the probability distributions of the alignment and storing the probability distributions;
  
  wherein the probability distribution assumes that the sentence E is generated under predetermined semantic rules in non-overlapping substrings such that each substring of the sentence E is generated by one concept in a semantic library, and wherein a set of substrings is called a clumping.

11. A method for performing natural language understanding, comprising:
- a training phase, comprising:
  
  providing a first sentence E of a given language;
  
  providing the meaning S of the sentence E;
  
  generating a translation model by summing a probability of correctness over one or more possible alignments and clumpings between E and S, wherein the actual alignment and clumping are not known, to produce the probabilities of the translation model's parameters;
  
  producing the probability distributions of the alignment and storing the probability distributions;
  
  wherein the probability distribution assumes that the sentence E is generated under predetermined semantic rules in non-overlapping substrings such that each substring of sentence E is generated by one concept in a semantic library, and wherein a set of substrings is called a clumping;
  
  an understanding phase, comprising:
  
  inputting an English sentence to be understood;
  
  producing, based upon the probability distributions, the meaning of the English sentence.

12. A method for performing natural language understanding on an input string, comprising:
- (a) inputting a string to be understood;
  
  (b) performing a basic clumping operation on the string to identify clumps in the string;
  
  (c) performing a clumping-with-semantic-language-model operation on the string to determine the probability of clumps generated in step (b), given context;
  
  (d) generating a general fertility model to model a number of clumps the input string is allowed to generate;
  
  (e) generating a distortion model to model distances between clumps of the input string;
  
  (f) using the models generated in steps (c), (d), and (e) and, outputting an understanding result.
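
Claim 12 names a fertility model and a distortion model but this page does not give their functional forms. The Python sketch below shows one way such scores could be combined in log space, assuming a Poisson distribution over the number of clumps and a two-sided geometric distribution over the gaps between consecutive clumps. Both distributional choices, the parameters `lam` and `r`, and the function names are assumptions made here for illustration.

```python
import math


def log_fertility(k, lam=2.0):
    """Log probability of generating k clumps (assumed Poisson with mean lam)."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)


def log_distortion(spans, r=0.5):
    """Log probability of the gaps between consecutive clumps.

    `spans` lists (start, end) word indices, end exclusive, in the order the
    clumps are generated. Assumed form: p(d) = (1 - r) / (1 + r) * r ** abs(d),
    which sums to 1 over all integer gaps d.
    """
    log_norm = math.log((1.0 - r) / (1.0 + r))
    total = 0.0
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        total += log_norm + abs(next_start - prev_end) * math.log(r)
    return total


def log_score(log_clump_prob, spans, lam=2.0, r=0.5):
    """Combine the clump, fertility, and distortion scores by adding logs."""
    return log_clump_prob + log_fertility(len(spans), lam) + log_distortion(spans, r)


# Two hypothetical clumpings with the same clump score: adjacent clumps
# versus clumps separated by a one-word gap.
adjacent = [(0, 2), (2, 4)]
gapped = [(0, 2), (3, 5)]
print(log_score(-1.0, adjacent), log_score(-1.0, gapped))
```

Under these assumed forms, the clumping with adjacent clumps scores higher than the one with a gap, all else being equal.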

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Franz, Martin; Della Pietra, Stephen Andrew; Koppelman, Joshua David Sherer; Roukos, Salim Estephan; Ward, Robert Todd; Epstein, Mark Edward
Primary Examiner(s): Isen, Forester W.
Assistant Examiner(s): Edouard, Patrick Nestor
Application Number: US08/593,032
Time in Patent Office: 1,387 Days
Field of Search: 704/1, 704/8, 704/9, 704/257, 704/277
US Class Current: 704/9
CPC Class Codes:
- G06F 40/30 Semantic analysis
- G06F 40/58 Use of machine translation,...
