Distributed Speech Recognition Using One Way Communication

US 20100057451A1
Filed: 08/30/2009
Published: 03/04/2010
Est. Priority Date: 08/29/2008
Status: Active Grant

Abstract

A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes the speech stream continuously. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.

21 Claims

1. A computer-implemented method comprising:
- (A) at a client, transmitting a speech stream and a control stream to a speech recognition server using a Hypertext Transfer Protocol (HTTP) having a first timeout period;
  
  (B) at the speech recognition server, using an automatic speech recognition engine to initiate recognition of the speech stream;
  
  (C) at the client, transmitting a first request for a speech recognition result to the server using HTTP; and
  
  (D) at the server, transmitting a notification to the client indicating that no speech recognition results have become available within a second timeout period that differs from the first timeout period; and
  
  (E) at the client, in response to receiving the notification, transmitting a second request for the speech recognition result to the server using HTTP.
  - 2. The method of claim 1, further comprising:
    - (F) at the server, recognizing a first portion of the speech stream to produce a first speech recognition result; and
      
      (G) transmitting the first speech recognition result to the client using HTTP in response to the second request.
  - 3. The method of claim 1, wherein (G) comprises:
    - (G) (1) determining whether any speech recognition results are available;
      
      (G) (2) if no speech recognition results are available, returning to (G) (1);
      
      (G) (3) otherwise, transmitting the first speech recognition result to the client.
  - 4. The method of claim 3, wherein the server performs (F) and (G) in parallel.
  - 5. The method of claim 1, wherein (A) comprises transmitting the speech stream and the control stream using a Hypertext Transfer Protocol over Secure Socket Layer (HTTPS), and wherein (C) comprises transmitting the first request using HTTPS.
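
The request and notification cycle in claim 1 can be sketched as a polling loop. The following is a minimal in-process simulation in Python, not the patented implementation: direct method calls stand in for HTTP, no real speech recognition is performed, and the names `RecognitionServer`, `poll`, `fetch_result`, `NO_RESULT`, and both timeout values are assumptions chosen for illustration. The server answers every request within a second timeout that is shorter than the assumed HTTP timeout, and the client sends a new request each time it receives a "no results" notification.

```python
import queue
import threading
import time

HTTP_TIMEOUT_S = 1.0      # hypothetical "first timeout period" of the HTTP layer
SERVER_TIMEOUT_S = 0.05   # hypothetical "second timeout period", deliberately shorter

NO_RESULT = {"status": "no_result"}  # stand-in for the notification in step (D)


class RecognitionServer:
    """Toy stand-in for the speech recognition server (no real ASR or HTTP)."""

    def __init__(self):
        self._results = queue.Queue()

    def start_recognition(self, speech_chunks, delay_s):
        # Pretend recognition takes `delay_s` seconds, then publish one result.
        def work():
            time.sleep(delay_s)
            self._results.put({"status": "ok", "text": " ".join(speech_chunks)})

        threading.Thread(target=work, daemon=True).start()

    def poll(self):
        # Step (D): reply within SERVER_TIMEOUT_S even if nothing is ready yet.
        try:
            return self._results.get(timeout=SERVER_TIMEOUT_S)
        except queue.Empty:
            return NO_RESULT


def fetch_result(server, max_requests=100):
    """Steps (C) and (E): request, then re-request after each notification."""
    for attempt in range(1, max_requests + 1):
        started = time.monotonic()
        reply = server.poll()
        # Simulated transport rule: a reply slower than the HTTP timeout is lost.
        if time.monotonic() - started >= HTTP_TIMEOUT_S:
            raise TimeoutError("simulated HTTP timeout")
        if reply is not NO_RESULT:
            return reply, attempt
    raise TimeoutError("no recognition result after max_requests polls")


server = RecognitionServer()
server.start_recognition(["hello", "world"], delay_s=0.2)
reply, attempts = fetch_result(server)
print(reply["text"], "after", attempts, "requests")
```

In this sketch the server's timeout is kept well below the transport timeout, so each request completes normally and the client simply asks again, rather than having a long-running request dropped by the HTTP layer.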

6. A system comprising a client device and a speech recognition server:
- wherein the client device comprises:
  
  means for transmitting a speech stream and a control stream to a speech recognition server using a Hypertext Transfer Protocol (HTTP) having a first timeout period;
  
  means for transmitting a first request for a speech recognition result to the server using HTTP; and
  
  wherein the speech recognition server comprises:
  
  means for using an automatic speech recognition engine to initiate recognition of the speech stream;
  
  means for transmitting a notification to the client indicating that no speech recognition results have become available within a second timeout period that differs from the first timeout period; and
  
  wherein the client further comprises means, responsive to receipt of the notification, for transmitting a second request for the speech recognition result to the server using HTTP.

7. A computer-implemented method performed by a client device, the method comprising:
- (A) transmitting a speech stream and a control stream to a speech recognition server using a Hypertext Transfer Protocol (HTTP) having a first timeout period;
  
  (B) transmitting a first request for a speech recognition result to a server using HTTP at a first time;
  
  (C) receiving, at a second time that differs from the first time by less than the first timeout period, a notification from the server indicating that no speech recognition results are available; and
  
  (D) in response to receiving the notification, transmitting a second request for the speech recognition result to the server using HTTP.

8. An apparatus comprising:
- means for transmitting a speech stream and a control stream to a speech recognition server using a Hypertext Transfer Protocol (HTTP) having a first timeout period;
  
  means for transmitting a first request for a speech recognition result to a server using HTTP at a first time;
  
  means for receiving, at a second time that differs from the first time by less than the first timeout period, a notification from the server indicating that no speech recognition results are available; and
  
  means for transmitting a second request for the speech recognition result to the server using HTTP in response to receiving the notification.

9. A computer-implemented method performed by a server, the method comprising:
- (A) receiving a speech stream and a control stream from a client using a Hypertext Transfer Protocol (HTTP) having a first timeout period;
  
  (B) using an automatic speech recognition engine to initiate recognition of the speech stream;
  
  (C) receiving a first request for a speech recognition result from the client using HTTP; and
  
  (D) transmitting a notification to the client indicating that no speech recognition results have become available within a second timeout period that differs from the first timeout period.
  - 10. The method of claim 9, further comprising:
    - (E) receiving a second request for the speech recognition result from the client using HTTP;
      
      (F) recognizing a first portion of the speech stream to produce a first speech recognition result; and
      
      (G) transmitting the first speech recognition result to the client using HTTP in response to the second request.

11. An apparatus comprising:
- means for receiving a speech stream and a control stream from a client using a Hypertext Transfer Protocol (HTTP) having a first timeout period;
  
  means for using an automatic speech recognition engine to initiate recognition of the speech stream;
  
  means for receiving a first request for a speech recognition result from the client using HTTP; and
  
  means for transmitting a notification to the client indicating that no speech recognition results have become available within a second timeout period that differs from the first timeout period.

12. A computer-implemented method comprising:
- (A) at a speech recognition server, using an automatic speech recognition engine to recognize a first portion of the speech stream and thereby to produce a first speech recognition result;
  
  (B) at the speech recognition server, if the first speech recognition result satisfies a first predetermined criterion, then waiting until the speech recognition engine has been reconfigured before continuing to (D); and
  
  (C) at the speech recognition server, using the automatic speech recognition engine to recognize a second portion of the speech stream and thereby to produce a second speech recognition result.
  - 13. The method of claim 12, further comprising:
    - (D) at a client, before (A), transmitting the speech stream and the control stream to the speech recognition server.
  - 14. The method of claim 13, further comprising:
    - (E) at the client, transmitting a request for a speech recognition result to the server; and
      
      (F) at the server:
      
      (F) (1) determining whether any speech recognition results are available;
      
      (F) (2) if no speech recognition results are available, returning to (F) (1);
      
      (F) (3) otherwise, transmitting at least one of the speech recognition results to the client.
  - 15. The method of claim 14, wherein the server performs (B) in parallel with (F).
  - 16. The method of claim 14, wherein (A) comprises transmitting the speech stream and the control stream using a Hypertext Transfer Protocol (HTTP), and wherein (E) comprises transmitting the request for the speech recognition result using HTTP.
  - 17. The method of claim 14, wherein (A) comprises transmitting the speech stream and the control stream using a Hypertext Transfer Protocol over Secure Socket Layer (HTTPS), and wherein (E) comprises transmitting the request for the speech recognition result using HTTPS.
  - 18. The method of claim 13, wherein (A) comprises:
    - (A) (1) transmitting a first control message in the control stream to the speech recognition server;
      
      (A) (2) detecting a failure of the transmission of the first portion; and
      
      (A) (3) in response to detection of the failure:
      
      (A) (3) (a) creating a second control message specifying a combination of a first state change represented by the first control message and a second state change; and
      
      (A) (3) (b) transmitting the second control message in the control stream to the speech recognition server.
  - 19. The method of claim 12, wherein (C) comprises waiting until the automatic speech recognition engine is in a predetermined configuration state before continuing to (D).
  - 20. The method of claim 12, wherein (C) further comprises executing one of the control messages to reconfigure the speech recognition engine after (B).
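
Claim 18 describes recovering from a failed control-message transmission by sending a single message that combines the failed state change with a second one. A minimal Python sketch of that idea follows. Modeling a state change as a dictionary of settings, the rule that later settings override earlier ones, and the names `combine`, `send_with_recovery`, and `flaky_send` are all assumptions made for illustration and do not come from the patent.

```python
def combine(first_change, second_change):
    """Merge two state changes; settings in the second override the first."""
    merged = dict(first_change)
    merged.update(second_change)
    return merged


def send_with_recovery(send, first_change, second_change):
    """Send first_change; if that fails, send one combined message instead.

    `send` is a hypothetical transport callable that raises ConnectionError
    when a transmission fails.
    """
    try:
        send(first_change)                                # (A)(1)
    except ConnectionError:                               # (A)(2) failure detected
        combined = combine(first_change, second_change)   # (A)(3)(a)
        send(combined)                                    # (A)(3)(b)
        return combined
    # No failure: sending the second change on its own is an assumption here;
    # the claim only covers the failure path.
    send(second_change)
    return second_change


# Hypothetical transport that drops the first message and delivers the rest.
delivered = []
calls = {"count": 0}


def flaky_send(message):
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated network failure")
    delivered.append(message)


sent = send_with_recovery(
    flaky_send,
    {"grammar": "dictation"},
    {"language": "en-US"},
)
print(delivered)
```

Because the combined message carries both state changes, the server in this sketch never needs to know that the first transmission was lost.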

21. An apparatus comprising:
- first portion recognition means for using an automatic speech recognition engine to recognize a first portion of the speech stream and thereby to produce a first speech recognition result;
  
  waiting means for waiting until the speech recognition engine has been reconfigured before activating the second portion recognition means if the first speech recognition result satisfies a first predetermined criterion; and
  
  second portion recognition means for using the automatic speech recognition engine to recognize a second portion of the speech stream and thereby to produce a second speech recognition result.
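
The waiting behavior in claims 12 and 21 can be sketched with a thread-safe flag that the recognizer checks between portions of the speech stream. This is a minimal Python illustration under stated assumptions, not the patented implementation: `recognize` is a stub rather than a real engine, the "first predetermined criterion" is modeled as the result matching a hypothetical command phrase, and `threading.Event` is one possible way to wait for reconfiguration.

```python
import threading

first_done = threading.Event()     # set once the first result is available
reconfigured = threading.Event()   # set once a control message has been applied
log = []


def recognize(portion):
    # Stub for the automatic speech recognition engine.
    return portion.upper()


def needs_reconfiguration(result):
    # Hypothetical "first predetermined criterion": the result is a command phrase.
    return result == "SWITCH GRAMMAR"


def server_loop(first_portion, second_portion):
    first = recognize(first_portion)
    log.append(("result", first))
    first_done.set()
    if needs_reconfiguration(first):
        reconfigured.wait()        # block until the engine has been reconfigured
        log.append(("resumed", None))
    log.append(("result", recognize(second_portion)))


def apply_control_message():
    # Stand-in for executing a control message sent by the client.
    log.append(("reconfigured", None))
    reconfigured.set()


worker = threading.Thread(target=server_loop, args=("switch grammar", "next field"))
worker.start()
first_done.wait()          # the client reacts only after seeing the first result
apply_control_message()
worker.join()
print(log)
```

The second portion is recognized only after the flag is set, so its result always reflects the reconfigured state of the stub engine.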

Current Assignee: Solventum Intellectual Properties Company (Solventum Corp.)
Original Assignee: Multimodal Technologies Incorporated (3M Company)
Inventors: Carraux, Eric; Koll, Detlef
Granted Patent: US 8,019,608 B2
US Class Current: 704/231
CPC Class Codes:
G10L 15/22   Procedures used during a sp...
G10L 15/30   Distributed recognition, e....
G10L 15/32   Multiple recognisers used i...
