Method and apparatus for continuous speech recognition using a layered, self-adjusting decoded network

US 6,442,520 B1
Filed: 11/08/1999
Issued: 08/27/2002
Est. Priority Date: 11/08/1999
Status: Expired due to Term

First Claim

1. A system for recognizing speech, comprising:

means for converting input speech into frames of speech data;

a dynamic network that receives said frames of speech data and establishes nodes that represent likelihood scores of various pre-defined models corresponding to the speech data of the respective frame;

a phone expanding network operating in parallel with said dynamic network, said a phone expanding network providing phone rules that govern which nodes of said dynamic network can be connected by arcs to which other nodes dependent upon said speech data;

a word network operating in parallel with said phone network and said dynamic network to provide word rules that govern which portions of the phone network correspond to recognizable words and which do not correspond to recognizable words;

said dynamic network, said phone network and said word network cooperating to process said speech data frames to recognize said input speech.

7 Assignments

0 Petitions

Abstract

A continuous speech decoder built up of multiple layers. Each layer uses independent knowledge sources and rules, but all the layers cooperate to decode the speech input into words quickly. A first layer is concerned with acoustic data, a second layer with phone data, and a third layer with word data and word sequences. Separating these layers allows the higher layers to be time-independent and asynchronous. The asynchronous layers can therefore process data quickly and give fast support to the first layer, which keeps a dynamic record, called a dynamic network, of the most likely continuous speech results. The speed and separation of this decoder allow better memory efficiency and better decoding results than previously known continuous speech decoders.
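
The layering described in the abstract can be illustrated with a toy decoder. This is a minimal sketch under stated assumptions, not the patented implementation: the lexicon, the function names (`word_starts`, `phone_successors`, `decode`), and the per-frame score format are all hypothetical. The word and phone functions take no frame index, which mirrors the time-independent upper layers, while `decode` plays the role of the frame-synchronous first layer.

```python
import math

# Toy knowledge source (assumed data, not from the patent).
LEXICON = {"hi": ("h", "ay"), "a": ("ey",)}

def word_starts(lexicon):
    """Word layer: states that may begin a word, as (word, phone_index)."""
    return [(word, 0) for word in lexicon]

def phone_successors(lexicon, state):
    """Phone layer: allowed next states, paired with the word completed on the way."""
    word, i = state
    moves = [(state, None)]                    # stay in the same phone
    if i + 1 < len(lexicon[word]):
        moves.append(((word, i + 1), None))    # advance within the word
    else:
        # Last phone of the word: any word may start next.
        moves += [(s, word) for s in word_starts(lexicon)]
    return moves

def decode(frames, lexicon=LEXICON):
    """Acoustic layer: frame-synchronous search over the active nodes.

    frames is a list of dicts mapping phone -> log-likelihood for that frame.
    Returns the best-scoring tuple of words, or () if nothing decodes.
    """
    if not frames:
        return ()

    def acoustic(t, state):
        word, i = state
        return frames[t].get(lexicon[word][i], -math.inf)

    # active maps state -> (best path score, words completed so far)
    active = {s: (acoustic(0, s), ()) for s in word_starts(lexicon)}
    for t in range(1, len(frames)):
        nxt = {}
        for state, (score, words) in active.items():
            for new_state, done in phone_successors(lexicon, state):
                cand = score + acoustic(t, new_state)
                hist = words + (done,) if done else words
                if cand > nxt.get(new_state, (-math.inf, ()))[0]:
                    nxt[new_state] = (cand, hist)
        active = nxt
    # Only states on a word's last phone form a complete decoding.
    finals = [(sc, ws + (w,)) for (w, i), (sc, ws) in active.items()
              if i == len(lexicon[w]) - 1 and sc > -math.inf]
    return max(finals)[1] if finals else ()
```

Calling `decode` with four frames whose best-scoring phones are `h`, `h`, `ay`, `ey` walks through "hi" and then "a". A real decoder would add pruning and language-model scores, which this sketch omits.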

36 Citations

10 Claims

1. A system for recognizing speech, comprising:
- means for converting input speech into frames of speech data;
  
  a dynamic network that receives said frames of speech data and establishes nodes that represent likelihood scores of various pre-defined models corresponding to the speech data of the respective frame;
  
  a phone expanding network operating in parallel with said dynamic network, said a phone expanding network providing phone rules that govern which nodes of said dynamic network can be connected by arcs to which other nodes dependent upon said speech data;
  
  a word network operating in parallel with said phone network and said dynamic network to provide word rules that govern which portions of the phone network correspond to recognizable words and which do not correspond to recognizable words;
  
  said dynamic network, said phone network and said word network cooperating to process said speech data frames to recognize said input speech.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The system of claim 1, wherein said dynamic network has a network growth management process utilizing a two level dynamic hashing process of existing arcs and nodes for each frame of input speech and a layer for fast network expansion and reduction.
  - 3. The system of claim 2, wherein said dynamic decoding network is a layered self adjusting graph where each layer is obtained by slicing the decoding network according to separate knowledge levels.
  - 4. The system of claim 2, wherein said network growth management process can expand the layers in a frame synchronous fashion and expand the network into a previously released region to perform dynamic decoding for speech recognition on a re-entrant network.
  - 5. The system of claim 1, wherein said network growth management process generates and maintains a time varying slice of the decoding network to provide a coverage of a minimum and sufficient sub-decoding graph for each input speech frame.
  - 6. The system of claim 1, wherein said word network rules operate on a plurality of separate dynamic networks recognizing a respective plurality of concomitant speech inputs.
  - 7. The system of claim 6, wherein said phone rule driven process and said word rule driven process are shared among plural speech decoder networks recognizing respective speech inputs concurrently.
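
Claims 2, 4 and 5 describe hashed, per-frame storage of nodes and arcs that grows and shrinks as frames arrive. The sketch below is one possible reading, not the patent's data structure: a dictionary keyed by frame index holding inner dictionaries keyed by node stands in for the "two level" hashing, and released slices are recycled to suggest growth into a previously released region. The class name `FrameSlicedNetwork`, the `window` parameter, and the release policy are assumptions made for illustration.

```python
class FrameSlicedNetwork:
    """Toy node store: frame index -> (node key -> best score).

    The outer/inner dict pairing is an assumed stand-in for the two level
    hashing of claim 2; this record does not give the actual hash layout.
    """

    def __init__(self, window=2):
        self.window = window   # number of most recent frames kept (assumed)
        self.frames = {}       # level 1: frame index -> inner table
        self.free = []         # released inner tables awaiting reuse

    def add(self, frame, key, score):
        """Create or update a node in a frame's slice, keeping the better score."""
        table = self.frames.get(frame)
        if table is None:
            # Reuse a released table when one is available.
            table = self.free.pop() if self.free else {}
            self.frames[frame] = table
        if key not in table or score > table[key]:
            table[key] = score

    def lookup(self, frame, key):
        """Return the stored score, or None if the node or slice is absent."""
        return self.frames.get(frame, {}).get(key)

    def advance(self, current_frame):
        """Release every slice that has fallen out of the window."""
        cutoff = current_frame - self.window
        for old in [f for f in self.frames if f <= cutoff]:
            table = self.frames.pop(old)
            table.clear()
            self.free.append(table)
```

With `window=2`, calling `advance(2)` releases the slice for frame 0 and keeps frames 1 and 2. A working decoder would also have to preserve the word history needed for traceback before releasing a slice, which this sketch does not attempt.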

8. A decoder for continuous speech recognition using a processor and a memory having a plurality of memory locations, the decoder comprising:
- a speech framer for regularly processing input speech into consecutive frames of acoustic data;
  
  a word network process for storing and applying language rules;
  
  a phone network process for storing and applying phone rules; and
  
  a dynamic programming network process for building a network of nodes connected by arcs which provide possible decodings of said input speech, said dynamic programming network process uses information from said word network process and said phone network process to direct the building of the nodes and their connections to previous nodes by arcs.
- Dependent claims (9, 10):
  - 9. The decoder of claim 8, wherein said dynamic programming network process is expanded dynamically and released frame synchronously.
  - 10. The decoder of claim 9, wherein a procedure for a soft beam search which uses arc prediction based on calculating a path score entering a first state of an arc before allocating a respective missing arc, if the path score is within the current soft beam then the arc is created and the arc prediction beam search results updated.
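
The arc prediction step of claim 10 can be sketched as follows: the score entering an arc's first state is computed before any memory is allocated for the arc, and the arc is created only if that score falls inside the beam. This is a hedged illustration. The names `SoftBeam` and `expand`, the log-score convention, and the reading of "soft" as a threshold that tightens whenever a better score appears in the frame are assumptions, not details confirmed by this record.

```python
import math

class SoftBeam:
    """Tracks the best score seen so far; the threshold is best - width."""

    def __init__(self, width):
        self.width = width
        self.best = -math.inf

    def admits(self, score):
        return score >= self.best - self.width

    def update(self, score):
        if score > self.best:
            self.best = score

def expand(arcs, beam, arc_key, source_score, first_state_loglik):
    """Allocate arc_key only if its predicted entry score survives the beam.

    arcs maps arc key -> best entry score. Returns True when the predicted
    score is admitted (and the arc therefore exists), otherwise reports
    whether the arc had already been allocated earlier.
    """
    predicted = source_score + first_state_loglik  # computed before allocation
    if not beam.admits(predicted):
        return arc_key in arcs                     # pruned: nothing allocated
    beam.update(predicted)
    if predicted > arcs.get(arc_key, -math.inf):
        arcs[arc_key] = predicted
    return True
```

Because the threshold moves with the best score, an arc that would have been admitted early in a frame can be rejected once a stronger path has been seen, so memory is never spent on arcs that are already outside the beam.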

Current Assignee: Avago Technologies International Sales Pte Limited (Broadcom, Inc.)
Original Assignee: Agere Systems Guardian Corp. (Broadcom, Inc.)
Inventors: Chou, Wu; Buhrke, Eric Rolse
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Nolan, Daniel A
Application Number: US09/435,192
Time in Patent Office: 1,023 Days
Field of Search: 704/251-255, 704/236-243
US Class Current: 704/255
CPC Class Codes:
- G10L 15/08   Speech classification or se...
- G10L 15/187   Phonemic context, e.g. pron...
- G10L 15/19   Grammatical context, e.g. d...
