Unified clustering tree

US 7,389,229 B2
Filed: 10/16/2003
Issued: 06/17/2008
Est. Priority Date: 10/17/2002
Status: Expired due to Fees

Abstract

A unified clustering tree (500) generates phoneme clusters based on an input sequence of phonemes. The number of possible clusters is significantly less than the number of possible combinations of input phonemes. Nodes (510, 511) in the unified clustering tree are arranged into levels such that the clustering tree generates clusters for multiple speech recognition models. Models that correspond to higher levels in the unified clustering tree are coarse models relative to more fine-grain models at lower levels of the clustering tree.
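The level structure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the phoneme classes, the three questions, the `DecisionNode` and `classify` names, and the convention of naming a cluster by its yes/no answer path are all assumptions made for illustration. For brevity both branches of each node share the same subtree, so the tree is a simple chain.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

# A context is five phoneme symbols: [left2, left1, center, right1, right2].
Context = Sequence[str]

# Hypothetical phoneme classes, for illustration only.
VOWELS = {"a", "e", "i", "o", "u"}
NASALS = {"m", "n"}


@dataclass
class DecisionNode:
    level: int                            # tree level this node's question belongs to
    question: Callable[[Context], bool]   # yes/no question about the context
    yes: Optional["DecisionNode"] = None  # None marks a terminal branch
    no: Optional["DecisionNode"] = None


def classify(root: DecisionNode, context: Context) -> Dict[int, str]:
    """Walk the tree once and return one cluster id per level.

    The cluster id at level k is the string of answers to every question
    asked at levels <= k, so coarser ids are prefixes of finer ones.
    """
    answers = ""
    clusters: Dict[int, str] = {}
    node: Optional[DecisionNode] = root
    while node is not None:
        answer = node.question(context)
        answers += "Y" if answer else "N"
        clusters[node.level] = answers  # later questions at the same level refine it
        node = node.yes if answer else node.no
    return clusters


# Level 1 looks only at an immediate neighbor (triphone-style context);
# levels 2 and 3 also look at the outer neighbors (quinphone-style context).
level3 = DecisionNode(3, lambda c: c[0] in NASALS)
level2 = DecisionNode(2, lambda c: c[4] in VOWELS, yes=level3, no=level3)
ROOT = DecisionNode(1, lambda c: c[1] in VOWELS, yes=level2, no=level2)

print(classify(ROOT, ("n", "a", "t", "i", "o")))
# prints {1: 'Y', 2: 'YY', 3: 'YYY'}
```

A single pass yields a coarse cluster for the level-1 model and progressively finer clusters for the models at levels 2 and 3, which is one way to read "clusters for multiple speech recognition models" from one tree.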

14 Claims

1. A speech recognition system comprising:
- a clustering tree configured to classify a series of sounds into predefined clusters based on one of the sounds and on a predetermined number of neighboring sounds that surround the one of the sounds, where the clustering tree comprises:
  - a first level with a first hierarchical arrangement of decision nodes in which the decision nodes of the first hierarchical arrangement are associated with a first group of questions relating to the series of sounds,
  - a second level with a second hierarchical arrangement of decision nodes in which the decision nodes of the second hierarchical arrangement are associated with a second group of questions relating to the series of sounds, the second group of questions discriminating at a finer level of granularity within the series of sounds than the first group of questions, and
  - a third level with a third hierarchical arrangement of decision nodes in which the decision nodes of the third hierarchical arrangement are associated with a third group of questions discriminating at a finer level of granularity within the series of sounds than the second group of questions; and
- a plurality of speech recognition models trained to recognize speech based on the predefined clusters, the plurality of speech recognition models comprising:
  - a first model associated with the first level and including a triphone non-crossword speech recognition model,
  - a second model associated with the second level and including a quinphone non-crossword speech recognition model, and
  - a third model associated with the third level and including a quinphone crossword speech recognition model.
- Dependent claims (2 to 13):
  - 2. The system of claim 1, wherein the clustering tree is formed by freezing building of the first level of the clustering tree before building the second level of the clustering tree.
  - 3. The system of claim 2, wherein the clustering tree is further formed by freezing building of the first level of the clustering tree when an entropy level of the first level of the clustering tree is below a predetermined threshold.
  - 4. The system of claim 1 wherein the clustering tree is further formed by freezing building of the second level of the clustering tree before building the third level of the clustering tree.
  - 5. The system of claim 4, wherein the clustering tree is further formed by freezing building of the second level of the clustering tree when an entropy level of the second level of the clustering tree is below a predetermined threshold.
  - 6. The system of claim 1, wherein the clustering tree is further built to include terminal nodes that assign each of the groups of sound into one of the sound clusters.
  - 7. The system of claim 1, wherein the first group of questions includes questions that relate to the series of sounds as a sound being modeled and one context sound before and after the sound being modeled.
  - 8. The system of claim 7, wherein the second group of questions includes questions that relate to the series of sounds as the sound being modeled and two context sounds before and after the sound being modeled.
  - 9. The system of claim 1 wherein higher ones of the hierarchical levels include nodes that correspond to more general questions than questions corresponding to nodes at lower ones of the hierarchical levels.
  - 10. The system of claim 1, wherein the sounds are represented by phonemes.
  - 11. The system of claim 1, wherein the clustering tree comprises:
    - decision nodes associated with questions that relate to the series of sounds, and
    - terminal nodes that define a sound cluster to which the series of sounds belong.
  - 12. The system of claim 11, wherein the decision nodes and the terminal nodes are defined hierarchically relative to one another.
  - 13. The system of claim 12, wherein the decision nodes correspond to lower ones of the levels in the hierarchically defined nodes are associated with more detailed questions than decision nodes corresponding to higher ones of the levels in the hierarchically defined nodes.
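Claims 2 to 5 describe freezing a level once its entropy falls below a predetermined threshold. The sketch below shows one way such a stopping rule could look. It rests on assumptions the claims do not spell out: the "entropy level" is taken to be the size-weighted Shannon entropy of hypothetical class labels across the current leaves, each chosen question splits every leaf at once, and the names `grow_level`, `level_entropy`, and the sample data are illustrative only.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[str], str]        # (phoneme context, hypothetical label)
Question = Callable[[Sequence[str]], bool]

VOWELS = {"a", "e", "i", "o", "u"}        # hypothetical phoneme class


def entropy(samples: List[Sample]) -> float:
    """Shannon entropy (bits) of the labels in one non-empty leaf."""
    counts = Counter(label for _, label in samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def level_entropy(leaves: List[List[Sample]]) -> float:
    """Size-weighted average entropy over all leaves (assumed definition)."""
    total = sum(len(leaf) for leaf in leaves)
    return sum(len(leaf) / total * entropy(leaf) for leaf in leaves)


def grow_level(leaves: List[List[Sample]], questions: List[Question],
               threshold: float) -> List[List[Sample]]:
    """Split leaves with this level's questions, then freeze the level
    once its entropy drops below the threshold or questions run out."""
    leaves = [list(leaf) for leaf in leaves]
    remaining = list(questions)
    while remaining and level_entropy(leaves) >= threshold:
        best = None
        for q in remaining:
            candidate: List[List[Sample]] = []
            for leaf in leaves:
                yes = [s for s in leaf if q(s[0])]
                no = [s for s in leaf if not q(s[0])]
                candidate.extend(part for part in (yes, no) if part)
            score = level_entropy(candidate)
            if best is None or score < best[0]:
                best = (score, q, candidate)
        _, chosen, leaves = best
        remaining.remove(chosen)
    return leaves  # frozen: later levels may only subdivide these leaves


samples: List[Sample] = [
    (("n", "a", "t", "i", "o"), "A"),
    (("s", "a", "t", "i", "o"), "A"),
    (("n", "k", "t", "i", "o"), "B"),
    (("s", "k", "t", "e", "s"), "B"),
]
left1_is_vowel: Question = lambda c: c[1] in VOWELS
right1_is_vowel: Question = lambda c: c[3] in VOWELS

level1 = grow_level([samples], [left1_is_vowel, right1_is_vowel], threshold=0.5)
print(len(level1), level_entropy(level1))
```

Under these assumptions, a second level would be built by calling `grow_level` again on the frozen `level1` leaves with the second, finer group of questions, so the first level's clusters stay intact.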

14. A device comprising:
- means for classifying a series of sounds into predefined clusters using a clustering tree and based on one of the sounds and a predetermined number of neighboring sounds that surround the one of the sounds, where the clustering tree includes:
  - a first level with a first hierarchical arrangement of decision nodes in which the decision nodes of the first hierarchical arrangement are associated with a first group of questions relating to the series of sounds;
  - a second level with a second hierarchical arrangement of decision nodes in which the decision nodes of the second hierarchical arrangement are associated with a second group of questions relating to the series of sounds, the second group of questions discriminating at a finer level of granularity within the series of sounds than the first group of questions; and
  - a third level with a third hierarchical arrangement of decision nodes in which the decision nodes of the third hierarchical arrangement are associated with a third group of questions discriminating at a finer level of granularity within the series of sounds than the second group of questions; and
- means for training a plurality of speech recognition models to recognize speech based on the predefined clusters, the speech recognition models including:
  - a first model associated with the first level and including a triphone non-crossword speech recognition model,
  - a second model associated with the second level and including a quinphone non-crossword speech recognition model, and
  - a third model associated with the third level and including a quinphone crossword speech recognition model.

Current Assignee: Ramp Holdings Incorporated (Clean Harbors Incorporated)
Original Assignee: BBN Technologies Corporation (Rtx Corporation)
Inventors: Billa, Jayadev; Kiecza, Daniel; Kubala, Francis G.
Primary Examiner(s): Azad; Abul K.
Application Number: US10/685,410
Publication Number: US 20050038649A1
Time in Patent Office: 1,706 Days
Field of Search: None
US Class Current: 704/242
CPC Class Codes:
G10L 15/28 Constructional details of s...
G10L 15/32 Multiple recognisers used i...
