Timing of speech recognition over lossy transmission systems

US 7,752,036 B2
Filed: 12/29/2008
Issued: 07/06/2010
Est. Priority Date: 06/30/1998
Status: Expired due to Fees

Abstract

Recognizing a stream of speech received as speech vectors over a lossy communications link includes constructing for a speech recognizer a series of speech vectors from packets received over a lossy packetized transmission link, wherein some of the packets associated with each speech vector are lost or corrupted during transmission. Each constructed speech vector is multi-dimensional and includes associated features. After waiting for a predetermined time, speech vectors are generated and potentially corrupted features within the speech vector are indicated to the speech recognizer when present. Speech recognition is attempted at the speech recognizer on the speech vectors when corrupted features are present. This recognition may be based only on certain or valid features within each speech vector. Retransmission of a missing or corrupted packet is requested when corrupted values are indicated by the indicating step and when the attempted recognition step fails.
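
The construction step in the abstract can be illustrated with a minimal sketch. It assumes each packet carries a contiguous slice of one speech vector's features plus an integrity flag; the `Packet` fields, the `build_speech_vector` name, and the zero placeholder for missing features are hypothetical and do not come from the patent. The function is meant to be called once the predetermined wait has elapsed, using whatever packets have arrived by then.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    # All fields are illustrative assumptions, not the patent's packet format.
    frame: int            # index of the speech vector this packet belongs to
    offset: int           # position of the first feature carried in the payload
    values: List[float]   # feature values carried in the payload
    crc_ok: bool = True   # False when the payload failed an integrity check


def build_speech_vector(packets: List[Packet], frame: int,
                        dim: int) -> Tuple[List[float], List[bool]]:
    """Assemble one speech vector and a per-feature validity mask."""
    vector = [0.0] * dim      # placeholder values for features never received
    valid = [False] * dim     # True only for features from intact packets
    for p in packets:
        if p.frame != frame or not p.crc_ok:
            continue          # lost or corrupted data leaves the mask False
        for i, value in enumerate(p.values):
            idx = p.offset + i
            if 0 <= idx < dim:
                vector[idx] = value
                valid[idx] = True
    return vector, valid
```

The validity mask plays the role of the indication passed to the recognizer: features marked `False` are the potentially corrupted ones.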

18 Claims

1. A method of recognizing speech, the method causing a computing device to perform steps comprising:
- generating via the computing device from received packets associated with input speech a speech vector;
  
  identifying via the computing device features in the speech vector associated with corrupt data;
  
  comparing via the computing device the speech vector to stored recognition models based on non-corrupt features in the speech vector to generate a first result or a second result;
  
  recognizing via the computing device the speech input if the comparison generates the first result; and
  
  requesting via the computing device a retransmission of at least one packet if the comparison generates the second result.
  - 2. The method of claim 1, further causing the computing device to perform steps comprising:
    - determining whether features within the speech vector are missing or corrupt; and
      
      if so, then requesting retransmission of packets associated with the missing or corrupt features; and
      
      if not, then requesting either that a speaker of the input speech make another utterance or to negotiate a higher bandwidth.
  - 3. The method of claim 2, wherein negotiation of higher bandwidth comprises at least one of an alteration of compression techniques, more robust error correction, use of partially redundant parallel data streams, or the use of principal component analysis.
  - 4. The method of claim 2, wherein the negotiation is transparent to the speaker.
  - 5. The method of claim 1, wherein the method further causes the computing device to perform steps comprising:
    - based on the comparison step, determining a plurality of probabilities associated with a likelihood that the speech vector is associated with each stored recognition model.
  - 6. The method of claim 1, further causing the computing device to perform steps comprising:
    - receiving packets associated with input speech transmitted over a link at a buffering and decoding unit and wherein generating the speech vector further generates the speech vector after waiting a predetermined amount of time after receiving a packet, wherein the predetermined time is based on at least one of: assumptions about the link, present network conditions and/or a tradeoff between recognizing speech in real time and receiving error free data.
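
The comparison in claims 1 and 5 can be sketched as follows. The claims do not specify the form of the stored recognition models, so this example assumes diagonal-covariance Gaussian models, for which ignoring the corrupt features is equivalent to marginalizing them out. The names `masked_log_likelihood`, `posteriors`, `compare`, and the `min_confidence` threshold are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed model form: (means, variances) of a diagonal Gaussian per word.
Model = Tuple[Sequence[float], Sequence[float]]


def masked_log_likelihood(vector: Sequence[float], valid: Sequence[bool],
                          model: Model) -> float:
    """Log-likelihood computed over the non-corrupt features only."""
    means, variances = model
    total = 0.0
    for x, ok, mean, var in zip(vector, valid, means, variances):
        if ok:  # corrupt features contribute nothing to the score
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
    return total


def posteriors(vector: Sequence[float], valid: Sequence[bool],
               models: Dict[str, Model]) -> Dict[str, float]:
    """One probability per stored model (equal priors assumed)."""
    scores = {name: masked_log_likelihood(vector, valid, m)
              for name, m in models.items()}
    peak = max(scores.values())  # subtract the max for numerical stability
    weights = {name: math.exp(s - peak) for name, s in scores.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


def compare(vector: Sequence[float], valid: Sequence[bool],
            models: Dict[str, Model],
            min_confidence: float = 0.8) -> Tuple[str, Optional[str]]:
    """First result: a confident match. Second result: ask for retransmission."""
    probs = posteriors(vector, valid, models)
    best = max(probs, key=probs.get)
    if probs[best] >= min_confidence:
        return ("recognized", best)
    return ("request_retransmission", None)
```

When the surviving features are enough to separate the models, recognition proceeds without waiting for the lost data; when they are not, the posteriors stay flat and the second result is produced.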

7. A system for recognizing speech, the system comprising:
- a processor;
  
  a module configured to control the processor to generate from received packets associated with input speech a speech vector;
  
  a module configured to identify features in the speech vector associated with corrupt data;
  
  a module configured to compare the speech vector to stored recognition models based on non-corrupt features in the speech vector to generate a first result or a second result;
  
  a module configured, if the comparison generates the first result, to recognize the speech input; and
  
  a module configured, if the comparison generates the second result, to request a retransmission of at least one packet.
  - 8. The system of claim 7, further comprising:
    - a module configured to determine whether features within the speech vector are missing or corrupt; and
      
      a module configured, if so, to request retransmission of packets associated with the missing or corrupt features; and
      
      a module configured, if not, to request either that a speaker of the input speech make another utterance or to negotiate a higher bandwidth.
  - 9. The system of claim 8, wherein negotiation of higher bandwidth comprises at least one of an alteration of compression techniques, more robust error correction, use of partially redundant parallel data streams, or the use of principal component analysis.
  - 10. The system of claim 8, wherein the negotiation is transparent to the speaker.
  - 11. The system of claim 7, wherein the system further comprises:
    - a module configured, based on the comparison step, to determine a plurality of probabilities associated with a likelihood that the speech vector is associated with each stored recognition model.
  - 12. The system of claim 7, further comprising:
    - a module configured to receive packets associated with input speech transmitted over a link at a buffering and decoding unit and wherein the module configured to generate the speech vector further generates the speech vector after waiting a predetermined amount of time after receiving a packet, wherein the predetermined time is based on at least one of: assumptions about the link, present network conditions and/or a tradeoff between recognizing speech in real time and receiving error free data.
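
Claim 12 names three inputs to the predetermined wait but gives no formula. One possible way to combine them is sketched below, purely as an assumption: the wait covers an assumed link delay plus a multiple of the currently observed jitter, capped by a real-time latency budget. The function name, parameters, and the jitter multiplier are all hypothetical.

```python
def choose_wait_ms(assumed_link_delay_ms: float,
                   observed_jitter_ms: float,
                   realtime_budget_ms: float,
                   jitter_multiplier: float = 2.0) -> float:
    """Pick how long to wait for late packets before building a speech vector.

    assumed_link_delay_ms: static assumption about the link
    observed_jitter_ms:    present network conditions
    realtime_budget_ms:    cap expressing the real-time vs. error-free tradeoff
    """
    target = assumed_link_delay_ms + jitter_multiplier * observed_jitter_ms
    # Never wait longer than the real-time budget, and never a negative time.
    return max(0.0, min(target, realtime_budget_ms))
```

A larger budget favors receiving error-free data; a smaller one favors recognizing speech in real time from whatever features have arrived.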

13. A tangible computer readable medium storing a computer program having instructions for controlling a computing device to recognize speech, the instructions comprising:
- generating via a processor from received packets associated with input speech a speech vector;
  
  identifying features in the speech vector associated with corrupt data;
  
  comparing the speech vector to stored recognition models based on non-corrupt features in the speech vector to generate a first result or a second result;
  
  recognizing the speech input if the comparison generates the first result; and
  
  requesting a retransmission of at least one packet if the comparison generates the second result.
  - 14. The computer readable medium of claim 13, further comprising:
    - determining whether features within the speech vector are missing or corrupt; and
      
      if so, then requesting retransmission of packets associated with the missing or corrupt features; and
      
      if not, then requesting either that a speaker of the input speech make another utterance or to negotiate a higher bandwidth.
  - 15. The computer readable medium of claim 14, wherein negotiation of higher bandwidth comprises at least one of an alteration of compression techniques, more robust error correction, use of partially redundant parallel data streams, or the use of principal component analysis.
  - 16. The computer readable medium of claim 14, wherein the negotiation is transparent to the speaker.
  - 17. The computer readable medium of claim 13, wherein the instructions further comprise:
    - based on the comparison step, determining a plurality of probabilities associated with a likelihood that the speech vector is associated with each stored recognition model.
  - 18. The computer readable medium of claim 13, wherein the instructions further comprise:
    - receiving packets associated with input speech transmitted over a link at a buffering and decoding unit and wherein generating the speech vector occurs after waiting a predetermined amount of time after receiving the packet, wherein the predetermined time is based on at least one of: assumptions about the link, present network conditions and/or a tradeoff between recognizing speech in real time and receiving error free data.
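
The branch described in claim 14 (and mirrored in claims 2 and 8) can be sketched as a small decision function. The claim leaves open how to choose between re-prompting the speaker and negotiating bandwidth, so the `can_negotiate` flag below is an assumption, as are the `Action` and `fallback_action` names.

```python
from enum import Enum
from typing import Sequence


class Action(Enum):
    REQUEST_RETRANSMISSION = "request_retransmission"
    REPROMPT_SPEAKER = "reprompt_speaker"
    NEGOTIATE_BANDWIDTH = "negotiate_bandwidth"


def fallback_action(valid: Sequence[bool], can_negotiate: bool) -> Action:
    """Decide what to request after a failed recognition attempt.

    valid: per-feature mask, False where a feature is missing or corrupt.
    can_negotiate: hypothetical flag saying the link supports renegotiation.
    """
    if not all(valid):
        # Some features are missing or corrupt: ask for the packets again.
        return Action.REQUEST_RETRANSMISSION
    # The data arrived intact, so retransmission cannot help.
    if can_negotiate:
        return Action.NEGOTIATE_BANDWIDTH
    return Action.REPROMPT_SPEAKER
```

Retransmission is requested only when it could change the outcome; with a complete vector, the remaining options are a new utterance or a better channel.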

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Cox, Richard Vandervoort; Marcus, Stephen Michael; Rahim, Mazin G.; Seshadri, Nambirajan; Sharp, Robert Douglas
Primary Examiner(s): Vo, Huyen X.
Application Number: US12/344,815
Publication Number: US 20090112585A1
Time in Patent Office: 554 Days
Field of Search: 704/201, 704/231, 704/235, 704/236, 704/233, 704/240
US Class Current: 704/201
CPC Class Codes:
G10L 15/02 Feature extraction for spee...
G10L 15/20 Speech recognition techniqu...
