Distributed speech recognition using one way communication
First Claim
1. A method performed by at least one computer processor executing computer program instructions stored on a non-transitory computer-readable medium, the method comprising:
(A) at a speech recognition server:
(A)(1) receiving a speech stream and a control stream from a client;
(A)(2) using an automatic speech recognition engine in a first configuration state to recognize a first portion of the speech stream and thereby to produce a first speech recognition result;
(B) at the speech recognition server, if the first speech recognition result satisfies a first predetermined criterion specified by the control stream, then waiting until the speech recognition engine has been reconfigured before continuing to (C); and
(C) at the speech recognition server, using the automatic speech recognition engine in a second configuration state to recognize a second portion of the speech stream and thereby to produce a second speech recognition result.
Abstract
A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes a first portion of the speech stream and, if a predetermined criterion is satisfied by the speech recognition result, waits until the speech recognizer has been reconfigured before recognizing a second portion of the speech stream. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.
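The wait-for-reconfiguration behavior described above can be illustrated with a minimal sketch. It is not the patented implementation: the `Recognizer` class, the `serve` function, the string-tagging "recognition", the timeout, and the use of a callable to stand in for the criterion carried by the control stream are all assumptions made for illustration.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Recognizer:
    """Hypothetical stand-in for a server-side speech recognition engine."""
    config: str = "first"  # current configuration state
    reconfigured: threading.Event = field(default_factory=threading.Event)

    def recognize(self, portion: str) -> str:
        # A real engine would decode audio; this toy only tags the input
        # with the configuration state that was active when it was processed.
        return f"{self.config}:{portion}"

    def reconfigure(self, new_config: str) -> None:
        # Invoked when a reconfiguration command arrives from the client.
        self.config = new_config
        self.reconfigured.set()


def serve(recognizer: Recognizer,
          portions: List[str],
          criterion: Callable[[str], bool],
          timeout: float = 5.0) -> List[str]:
    """Recognize each portion of the speech stream in order.

    `criterion` is an assumed representation of the predetermined criterion
    specified by the control stream. When a result satisfies it, the server
    blocks until the recognizer has been reconfigured, then continues.
    """
    results = []
    for portion in portions:
        result = recognizer.recognize(portion)
        results.append(result)
        if criterion(result):
            # Wait until reconfiguration before recognizing the next portion.
            if not recognizer.reconfigured.wait(timeout):
                raise TimeoutError("recognizer was not reconfigured in time")
            recognizer.reconfigured.clear()
    return results


if __name__ == "__main__":
    rec = Recognizer()
    # Simulate the client reconfiguring the recognizer shortly after the
    # first portion has been recognized.
    threading.Timer(0.1, rec.reconfigure, args=("second",)).start()
    print(serve(rec, ["patient name", "john smith"],
                lambda r: r.endswith("patient name")))
```

In this sketch the first portion is recognized in the first configuration state, the server then pauses, and the second portion is recognized only after the simulated client has switched the recognizer to the second configuration state.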
18 Claims
1. A method performed by at least one computer processor executing computer program instructions stored on a non-transitory computer-readable medium, the method comprising:
(A) at a speech recognition server:
(A)(1) receiving a speech stream and a control stream from a client;
(A)(2) using an automatic speech recognition engine in a first configuration state to recognize a first portion of the speech stream and thereby to produce a first speech recognition result;
(B) at the speech recognition server, if the first speech recognition result satisfies a first predetermined criterion specified by the control stream, then waiting until the speech recognition engine has been reconfigured before continuing to (C); and
(C) at the speech recognition server, using the automatic speech recognition engine in a second configuration state to recognize a second portion of the speech stream and thereby to produce a second speech recognition result.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A non-transitory computer-readable medium comprising computer program instructions stored on the computer-readable medium, wherein the computer program instructions are executable by at least one computer processor to perform a method comprising:
(A) at a speech recognition server:
(A)(1) receiving a speech stream and a control stream from a client;
(A)(2) using an automatic speech recognition engine in a first configuration state to recognize a first portion of the speech stream and thereby to produce a first speech recognition result;
(B) at the speech recognition server, if the first speech recognition result satisfies a first predetermined criterion specified by the control stream, then waiting until the speech recognition engine has been reconfigured before continuing to (C); and
(C) at the speech recognition server, using the automatic speech recognition engine in a second configuration state to recognize a second portion of the speech stream and thereby to produce a second speech recognition result.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification