Method and apparatus for generating decision tree questions for speech processing
Abstract
The present invention automatically builds question sets for a decision tree. Under the invention, mutual information is used to cluster tokens, which represent either phones or letters. Each cluster is formed so as to limit the loss of mutual information, over a set of training data, that forming the cluster causes. The resulting sets of clusters represent questions that can be used at the nodes of the decision tree.
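The mutual-information measure the abstract refers to can be illustrated with a short sketch. The following is one plausible reading, not the patent's actual implementation: it computes the average mutual information between adjacent tokens from bigram counts over a training sequence (all function and variable names are assumptions):

```python
from collections import Counter
from math import log2

def mutual_information(tokens):
    """Average mutual information of adjacent-token pairs in a sequence.

    tokens: a training sequence of tokens (phones or letters).
    Illustrative only; this is an assumed reading of the abstract.
    """
    pairs = list(zip(tokens, tokens[1:]))
    n = len(pairs)
    pair_counts = Counter(pairs)                     # bigram counts
    left_counts = Counter(p[0] for p in pairs)       # first-position counts
    right_counts = Counter(p[1] for p in pairs)      # second-position counts
    mi = 0.0
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi
```

When tokens are grouped into clusters, the same computation runs over cluster labels instead of raw tokens; a merge that groups tokens with similar neighbor distributions loses little of this quantity.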
18 Citations
19 Claims
1. A computer-readable storage medium encoded with computer-executable instructions for causing a computer to perform steps comprising:

- forming a separate cluster of tokens for each possible token that can appear in training data;
- determining whether to combine a first cluster of tokens and a second cluster of tokens to form a new cluster of tokens using mutual information, wherein the mutual information is based on the number of times tokens from the new cluster of tokens appear next to tokens from another cluster of tokens in the training data;
- building a decision tree by utilizing at least one of the clusters of tokens to form a question for a node in the decision tree, the question asking whether a token in an input is found within the at least one cluster; and
- using the decision tree to identify a leaf node of the tree based on an input.

Dependent claims: 2, 3, 4, 5, 6, 7.
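The clustering steps above can be read as a greedy agglomerative procedure: start with one cluster per distinct token, then repeatedly merge the pair of clusters whose merge preserves the most mutual information in the adjacency (bigram) distribution. The sketch below illustrates that reading; it is not the patented implementation, and every name in it is an assumption:

```python
from collections import Counter
from itertools import combinations
from math import log2

def cluster_mi(sequence, cluster_of):
    """Mutual information of the cluster-label bigram distribution
    induced by mapping each token in the sequence to its cluster."""
    pairs = [(cluster_of[a], cluster_of[b]) for a, b in zip(sequence, sequence[1:])]
    n = len(pairs)
    pc = Counter(pairs)
    lc = Counter(p[0] for p in pairs)
    rc = Counter(p[1] for p in pairs)
    return sum((c / n) * log2((c / n) / ((lc[a] / n) * (rc[b] / n)))
               for (a, b), c in pc.items())

def greedy_merge(sequence, target_clusters):
    """Start with one cluster per distinct token; repeatedly merge the pair
    of clusters whose merge keeps the mutual information highest."""
    clusters = {t: {t} for t in set(sequence)}
    while len(clusters) > target_clusters:
        best = None
        for x, y in combinations(list(clusters), 2):
            # Evaluate MI if clusters x and y were merged.
            merged = {k: v for k, v in clusters.items() if k not in (x, y)}
            merged[x] = clusters[x] | clusters[y]
            label = {t: k for k, s in merged.items() for t in s}
            mi = cluster_mi(sequence, label)
            if best is None or mi > best[0]:
                best = (mi, x, y)
        _, x, y = best
        clusters[x] = clusters[x] | clusters.pop(y)
    return sorted(sorted(c) for c in clusters.values())
```

In the toy sequence `"axbxaxbx"`, the tokens `a` and `b` occur in identical neighbor contexts, so merging them costs no mutual information and the greedy step picks that merge first. The exhaustive pairwise search is O(k²) per merge; it is kept simple here for clarity.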
8. A method of forming a decision tree used in speech processing, the method comprising:
- grouping at least two tokens to form a first possible cluster;
- a processing unit determining a mutual information score based on the first possible cluster through steps comprising determining the number of times tokens from the first possible cluster appear next to tokens from a second cluster, the number of times tokens from the first possible cluster appear individually, and the number of times tokens from the second cluster appear individually;
- grouping at least two tokens to form a third possible cluster;
- the processing unit determining a mutual information score based on the third possible cluster through steps comprising determining the number of times tokens from the third possible cluster appear next to tokens from a fourth cluster, the number of times tokens from the third possible cluster appear individually, and the number of times tokens from the fourth cluster appear individually;
- the processing unit selecting one of the first cluster and the third cluster based on the mutual information scores associated with the first cluster and the third cluster;
- using the selected cluster to form a question in the decision tree used in speech processing; and
- storing the decision tree on a computer-readable storage medium for later use in speech processing.

Dependent claims: 9, 10, 11, 12, 13, 14.
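The scoring step in claim 8 combines exactly three quantities: how often tokens from the two clusters appear next to each other, and how often each cluster's tokens appear individually. A pointwise-MI-style score built from those counts, and a selection step that compares two candidate clusters by it, might look like this (an assumed formula for illustration, not the patent's exact computation):

```python
from math import log2

def pair_mi_score(pair_count, count_a, count_b, total):
    """Pointwise mutual-information-style score for a candidate cluster pair,
    computed from the adjacency count, each cluster's individual count, and
    the total number of adjacent positions. Illustrative formula only."""
    return log2((pair_count / total) / ((count_a / total) * (count_b / total)))

def select_cluster(candidates):
    """Pick the candidate cluster with the highest score.
    candidates: list of (cluster, pair_count, count_a, count_b, total)."""
    return max(candidates, key=lambda c: pair_mi_score(*c[1:]))[0]
```

A positive score means the two clusters co-occur more often than their individual frequencies would predict; comparing scores for the first and third possible clusters selects the grouping that best predicts its neighbors.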
15. A method of forming a decision tree for speech processing, the method comprising:
- identifying at least two possible clusters of tokens in a set of training data;
- a processing unit using co-occurrence frequency counts of clusters to select one of the at least two possible clusters, wherein the co-occurrence frequency counts comprise the number of times tokens from two clusters appear next to each other in the training data; and
- the processing unit storing the selected cluster on a computer-readable storage medium as a question for a node in the decision tree for speech processing, wherein the question asks whether an input token is found in the selected cluster.

Dependent claims: 16, 17, 18, 19.
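The question stored at each node is a set-membership test: is the input token in the selected cluster? A minimal sketch of such a tree and its traversal follows; the tree contents and leaf labels below are invented for illustration and are not from the patent:

```python
class Node:
    """A decision-tree node whose question is cluster membership:
    'is the input token in this node's cluster?'"""
    def __init__(self, cluster=None, yes=None, no=None, leaf=None):
        self.cluster = cluster  # set of tokens forming the question
        self.yes = yes          # child followed when the token is in the cluster
        self.no = no            # child followed otherwise
        self.leaf = leaf        # non-None at leaf nodes

def classify(node, token):
    """Walk the tree to a leaf by answering each node's membership question."""
    while node.leaf is None:
        node = node.yes if token in node.cluster else node.no
    return node.leaf

# Hypothetical toy tree: split vowels from consonants, then isolate stops.
tree = Node(cluster={"a", "e", "i", "o", "u"},
            yes=Node(leaf="vowel-model"),
            no=Node(cluster={"p", "t", "k"},
                    yes=Node(leaf="stop-model"),
                    no=Node(leaf="other-model")))
```

Each leaf would then index whatever the speech-processing system associates with that context class (for example, an acoustic model or a pronunciation).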
Specification