Speech Recognition Circuit and Method

US 20080255839A1
Filed: 09/14/2005
Published: 10/16/2008
Est. Priority Date: 09/14/2004
Status: Active Grant

Abstract

A speech recognition circuit comprising a circuit for providing state identifiers which identify states corresponding to nodes or groups of adjacent nodes in a lexical tree, and for providing scores corresponding to said state identifiers, the lexical tree comprising a model of words; a memory structure for receiving and storing state identifiers identified by a node identifier identifying a node or group of adjacent nodes, said memory structure being adapted to allow lookup to identify particular state identifiers, reading of the scores corresponding to the state identifiers, and writing back of the scores to the memory structure after modification of the scores; an accumulator for receiving score updates corresponding to particular state identifiers from a score update generating circuit which generates the score updates using audio input, for receiving scores from the memory structure, and for modifying said scores by adding said score updates to said scores; and a selector circuit for selecting at least one node or group of adjacent nodes of the lexical tree according to said scores.
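The read, modify and write-back cycle described in the abstract can be illustrated with a minimal software sketch. This is not the patented circuit: the dictionary standing in for the memory structure, the function names `accumulate` and `select_nodes`, the convention that lower scores are better, and the beam-width selection rule are all assumptions made for illustration.

```python
def accumulate(scores, updates):
    """Add each score update to the stored score for the same state identifier.

    `scores` plays the role of the memory structure (state id -> score);
    updates for state ids that are not stored are ignored (an assumption).
    """
    for state_id, delta in updates.items():
        if state_id in scores:            # lookup by state identifier
            scores[state_id] += delta     # modify, then write back
    return scores


def select_nodes(scores, beam):
    """Select state ids whose score lies within `beam` of the best score.

    Assumes lower is better, as for accumulated distances.
    """
    best = min(scores.values())
    return sorted(s for s, v in scores.items() if v <= best + beam)


stored = {1: 2.0, 2: 5.0, 3: 1.0}
frame_updates = {1: 0.5, 2: 0.5, 3: 4.0, 9: 1.0}   # state 9 is not stored
accumulate(stored, frame_updates)
selected = select_nodes(stored, beam=2.75)
```

Here `selected` keeps the states whose accumulated score stays close to the best one, which is one common way a selector could act on the scores.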

75 Citations

86 Claims

1. A speech recognition circuit, comprising:
- an audio front end for calculating a feature vector from an audio signal, wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;
  
  a calculating circuit for calculating distances indicating the similarity between a feature vector and a plurality of predetermined acoustic states of an acoustic model; and
  
  a search stage for using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;
  
  wherein said audio front end and said search stage are implemented using a first processor, and said calculating circuit is implemented using a second processor, and wherein data is pipelined from the front end to the calculating circuit to the search stage.
- Dependent Claims (2, 3, 4, 5, 36, 37, 38, 39, 40, 42, 44, 46, 86)
- - 2. A speech recognition circuit as claimed in claim 1, wherein the pipelining comprises alternating of front end and search stage processing on the first processor.
  - 3. A speech recognition circuit as claimed in claim 1, comprising dynamic scheduling whether the first processor should run the front end or search stage code, based on availability or unavailability of distance results and/or availability of space for storing more feature vectors and/or distance results.
  - 4. A speech recognition circuit as claimed in claim 1, wherein the first processor supports multi-threaded operation, and runs the search stage and front ends as separate threads.
  - 5. A speech recognition circuit as claimed in claim 1, wherein the said calculating circuit is configured to autonomously calculate distances for every acoustic state defined by the acoustic model.
  - 36. The speech recognition circuit of claim 1, comprising control means adapted to implement frame dropping, to discard one or more audio time frames.
  - 37. The speech recognition circuit of claim 1, wherein the feature vector comprises a plurality of spectral components of an audio signal for a predetermined time frame.
  - 38. The speech recognition circuit of claim 1, wherein the processor is configured to divert to another task if the data flow stalls.
  - 39. The speech recognition circuit of claim 1, wherein the speech accelerator has an interrupt signal to inform the front end that the accelerator is ready to receive a next feature vector from the front end.
  - 40. The speech recognition circuit of claim 1, wherein the accelerator signals to the search stage when distances for a new frame are available in a result memory.
  - 42. The speech recognition circuit of claim 1, comprising increasing the pipeline depth by computing extra front end frames in advance.
  - 44. A speech recognition circuit as claimed in claim 1, wherein the audio front end is configured to input a digital audio signal.
  - 46. A method as implemented by the apparatus of any previous claim 1.
  - 86. A speech recognition circuit of claim 1, wherein said distance comprises a Mahalanobis distance.
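Claims 5 and 86 together describe computing a Mahalanobis distance between one feature vector and every acoustic state. A minimal sketch follows, assuming each state is a single Gaussian with a diagonal covariance (the claims do not say so); the names `mahalanobis_sq` and `distances_for_all_states` and the example values are hypothetical.

```python
def mahalanobis_sq(feature, mean, variance):
    """Squared Mahalanobis distance, assuming a diagonal covariance matrix."""
    return sum((x - m) ** 2 / v for x, m, v in zip(feature, mean, variance))


def distances_for_all_states(feature, states):
    """Compute one distance per acoustic state for a single feature vector."""
    return [mahalanobis_sq(feature, s["mean"], s["var"]) for s in states]


# Two toy acoustic states with two-dimensional features (illustrative values).
states = [
    {"mean": [0.0, 0.0], "var": [1.0, 1.0]},
    {"mean": [1.0, 2.0], "var": [0.5, 2.0]},
]
feature = [1.0, 2.0]
dists = distances_for_all_states(feature, states)
```

A smaller distance indicates a closer match, so in this example the second state, whose mean equals the feature vector, has distance zero.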

6-9. (canceled)

10. An accelerator for calculating distances for a speech recognition circuit, the accelerator comprising:
- calculating circuit for calculating distances indicating the similarity between a feature vector and a plurality of predetermined acoustic states of an acoustic model, wherein the feature vector comprises a plurality of extracted and/or derived quantities from an audio signal during a defined audio time frame;
  
  first and second storage circuit, each for storing calculated distances for at least one said audio time frame, and for making said stored distances available for use by another part of the speech recognition circuit;
  
  control circuit for controlling read and write access to the first and second storage circuit, said control means being configured to allow writing to one said storage means while the other said storage means is available for reading, to allow first calculated distances for one audio time frame to be written to one said storage means while second calculated distances for an earlier audio time frame are made available for reading from the other said storage means.
- Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18, 85)
- - 11. An accelerator as claimed in claim 10, wherein the control means is configured to alternate which of the first and second storage means is available for reading, and which of the first and second storage means is available for writing, as each new feature vector is processed by the calculating circuit.
  - 12. The accelerator of claim 10, wherein the distance is stored for each of a plurality of states.
  - 13. The accelerator of claim 10, wherein the first and second storage means are interfaced to a processor in the speech recognition circuit as different memory locations within the processor's memory map.
  - 14. The accelerator of claim 10, wherein the first and second storage means are interfaced to a processor in the speech recognition circuit as being mapped to a first fixed memory address when configured for reading and/or to a second fixed memory address when configured for writing.
  - 15. The accelerator of claim 10, comprising releasing a result memory after the search stage has processed the contents, by informing the accelerator that it can overwrite the results in said memory.
  - 16. The accelerator of claim 10, comprising additional result memories, where at any time at most one result memory is configured for writing, and at least one result memory is configured for reading.
  - 17. The accelerator of claim 10, comprising random access of the distance results in the result memories, as they are needed by the search stage.
  - 18. The accelerator of claim 10, wherein the result memories are accessed in a round-robin fashion.
  - 85. The accelerator of claim 10, wherein the said calculating circuit is configured to autonomously calculate distances for every acoustic state defined by the acoustic model
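The two-bank arrangement in claims 10 and 11 resembles what is often called double (or ping-pong) buffering. The sketch below models it in software under that assumption; the class `PingPongResults` and its methods are hypothetical names, and a real accelerator would do this with hardware memories and a control circuit rather than Python lists.

```python
class PingPongResults:
    """Two result banks: one open for writing, the other available for reading."""

    def __init__(self, num_states):
        self._banks = [[0.0] * num_states, [0.0] * num_states]
        self._write = 0  # index of the bank currently open for writing

    def write_frame(self, distances):
        """Store the distances for a new frame, then swap the bank roles."""
        bank = self._banks[self._write]
        if len(distances) != len(bank):
            raise ValueError("expected one distance per acoustic state")
        bank[:] = distances
        self._write ^= 1  # alternate banks as each feature vector is processed

    def read_bank(self):
        """Return the bank holding the most recently completed frame."""
        return self._banks[self._write ^ 1]


results = PingPongResults(num_states=2)
results.write_frame([1.0, 2.0])
first_read = list(results.read_bank())    # frame 1 is now readable
results.write_frame([3.0, 4.0])           # frame 2 went to the other bank
second_read = list(results.read_bank())
```

While frame 2 is being written to one bank, the distances for frame 1 remain untouched in the other, which is the property the claim relies on.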

19. An accelerator for a speech recognition circuit, the accelerator comprising:
- calculating means for calculating distances indicating the similarity between a feature vector and a plurality of predetermined acoustic states of an acoustic model, wherein the feature vector comprises a plurality of extracted and/or derived quantities from an audio signal during a defined audio time frame;
  
  means for receiving or storing compressed data representing said acoustic model;
  
  a decompressor for decompressing said compressed data for all states or selected states of the acoustic model, wherein the decompressed data is sent to the calculating means; and
  
  output means for outputting calculated distances to another part of the speech recognition circuit.
- Dependent Claims (20, 21, 22, 23, 29)
- - 20. The accelerator of claim 19, wherein the decompression scheme comprises one or more of the following:
    - sign or zero extension or otherwise conversion of narrow or variable width data to a wider data format;
      
      sign or zero extension or otherwise conversion of narrow or variable width data to IEEE standard single or double precision floating point format;
      
      codebook decompression of a binary bitstream, where the codebook is stored as part of the acoustic model data;
      
      decompression of a Huffman or Lempel-Ziv compressed stream;
      
      decompression of run length encoded data;
      
      decompression of difference encoded data; and
      
      any well known compression scheme.
  - 21. The accelerator of claim 19, wherein the decompressor is configured to operate with an acoustic model that uses subspace distribution clustering.
  - 22. The accelerator of claim 19, wherein acoustic states are decompressed one or more times for each feature vector processed by the accelerator.
  - 23. The accelerator of claim 19, wherein the said calculating circuit is configured to autonomously calculate distances for every acoustic state defined by the acoustic model.
  - 29. An accelerator according to claim 22, further comprising means for generating a checksum or computed signature for the acoustic model data stored in the memory, and means for comparing checksums or computed signatures that have been calculated at different times, to indicate an error status if the checksums do not match.
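Two of the decompression options listed in claim 20 (sign extension of narrow data and decoding of difference-encoded data), together with the checksum comparison of claim 29, can be sketched as follows. The 8-bit packing, the helper names `sign_extend` and `decode_differences`, and the use of CRC-32 as the checksum are assumptions for illustration; the claims do not specify them.

```python
import zlib


def sign_extend(value, bits):
    """Interpret the low `bits` of `value` as a two's complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_differences(deltas):
    """Rebuild values from difference-encoded data via a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Hypothetical compressed model fragment: signed 8-bit deltas.
packed = bytes([0x03, 0xFF, 0x02, 0xFE])
deltas = [sign_extend(b, 8) for b in packed]
params = decode_differences(deltas)

# Checksums computed at different times; a mismatch would indicate an error.
checksum_at_load = zlib.crc32(packed)
checksum_later = zlib.crc32(packed)
error_status = checksum_at_load != checksum_later
```

In this sketch the stored bytes are unchanged between the two checksum computations, so no error is flagged.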

24-25. (canceled)

26. An accelerator for a speech recognition circuit, the accelerator comprising:
- calculating means for calculating distances indicating the similarity between a feature vector and a plurality of predetermined acoustic states of an acoustic model, wherein the feature vector comprises a plurality of extracted and/or derived quantities from an audio signal during a defined audio time frame; and
  
  a memory for storing said acoustic model, wherein the calculating means and the memory are fabricated as circuits on a single integrated circuit.
- Dependent Claims (28, 30, 32, 33, 34, 35)
- - 28. An accelerator according to claim 26, wherein said memory is configured to be used to hold other data not related to speech recognition during periods when speech recognition is not active.
  - 30. An accelerator according to claim 26, wherein the acoustic model memory is a RAM or flash memory.
  - 32. The accelerator of claim 26, being provided in the form of a separate physical circuit, device or package that is removably connectable to said speech recognition circuit.
  - 33. A speech recognition circuit comprising the accelerator of claim 26.
  - 34. The speech recognition circuit of claim 33, comprising a shared processor for the front end and search stages, where said shared processor does not perform a majority of the distance calculations.
  - 35. The speech recognition circuit of claim 33, comprising a digital signal processor for the front end, and a general purpose microprocessor for the search stage.

27. (canceled)

31. (canceled)

41. (canceled)

43. A speech recognition circuit, comprising:
- an audio front end for calculating a feature vector from an audio signal, wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;
  
  calculating means for calculating a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model; and
  
  a search stage for using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;
  
  wherein said audio front end, said calculating means, and said search stage are connected to each other to enable pipelined data flow from one to another.

45. A speech recognition method, comprising:
- calculating a feature vector from an audio signal, wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;
  
  calculating a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model; and
  
  using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;
  
  wherein said audio front end and said search stage are implemented using a first processor, and said calculating means is implemented using a second processor, and wherein data is pipelined from the front end to the calculating means to the search stage.
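The pipelined data flow from front end to distance calculation to search stage can be modelled loosely in software. In the sketch below a worker thread stands in for the second processor and the main thread runs both the front end and the search stage; the toy features (frame mean and range), the squared Euclidean distance, the unbounded queues, and every function name are assumptions, not details taken from the claims.

```python
import queue
import threading


def front_end(frames, out_q):
    """Turn each audio frame into a toy two-element feature vector."""
    for frame in frames:
        out_q.put([sum(frame) / len(frame), max(frame) - min(frame)])
    out_q.put(None)  # end-of-stream marker


def distance_stage(in_q, out_q, state_means):
    """Compute a squared Euclidean distance to every state for each vector."""
    while True:
        vec = in_q.get()
        if vec is None:
            break
        out_q.put([sum((x - m) ** 2 for x, m in zip(vec, mean))
                   for mean in state_means])
    out_q.put(None)


def search_stage(in_q):
    """Pick the closest state per frame (a stand-in for lexical tree search)."""
    best = []
    while True:
        dists = in_q.get()
        if dists is None:
            return best
        best.append(dists.index(min(dists)))


def run_pipeline(frames, state_means):
    vec_q, dist_q = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=distance_stage,
                              args=(vec_q, dist_q, state_means))
    worker.start()                     # plays the role of the second processor
    front_end(frames, vec_q)           # first processor: front end
    result = search_stage(dist_q)      # first processor: search stage
    worker.join()
    return result


frames = [[0.0, 0.0, 0.0, 0.0], [1.0, 3.0, 1.0, 3.0]]
state_means = [[0.0, 0.0], [2.0, 2.0]]
best_states = run_pipeline(frames, state_means)
```

The queues decouple the stages, so the distance worker can process one frame while the main thread is still producing or consuming others.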

47-84. (canceled)

Current Assignee: Zentian Limited
Original Assignee: Zentian Limited
Inventors: Larri, Guy; Catchpole, Mark; Reynolds, Timothy Brian; Harris-Dowsett, Damian Kelly

Granted Patent: US 7,979,277 B2
Field of Search
US Class Current: 704/238
CPC Class Codes

G10L 15/02   Feature extraction for spee...

G10L 15/06   Creation of reference templ...

G10L 15/08   Speech classification or se...

G10L 15/10   using distance or distortio...

G10L 15/285   Memory allocation or algori...
