Automated tuning of speech recognition parameters
First Claim
1. A method for providing dynamically loaded speech recognition parameters from a server to a speech recognition engine, comprising:
(A) with the server, on a first occasion after a first speech recognition session has been initiated between a first user and the speech recognition engine:
executing at least one rule for selecting speech recognition parameters for use by a speech recognition engine, wherein the at least one rule comprises an if-portion including criteria and a then-portion specifying values of speech recognition parameters that must be used by the speech recognition engine for evaluating natural language options of a grammar when the criteria is met;
selecting first values of a set of speech recognition parameters responsive to executing the at least one rule on the first occasion; and
communicating to the speech recognition engine the selected first values of the set of speech recognition parameters for performing speech recognition of the first user on the first occasion;
(B) with the speech recognition engine, for a first utterance by the first user on the first occasion:
receiving the selected first values of the set of speech recognition parameters from the server;
using the received selected first values of the set of speech recognition parameters to evaluate the acoustic properties of the first utterance to determine first acoustic scores for the natural language options of the grammar;
combining the determined first acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute first computed scores for the natural language options of the grammar by using grammar weights for the natural language options of the grammar to bias the determined first acoustic scores; and
choosing the natural language option of the grammar having the highest first computed score as the natural language interpretation of the first utterance;
(C) with the server, on a second occasion after a second speech recognition session has been initiated between a second user and the speech recognition engine:
executing the at least one rule;
selecting second values of the set of speech recognition parameters responsive to executing the at least one rule on the second occasion; and
communicating to the speech recognition engine the selected second values of the set of speech recognition parameters for performing speech recognition of the second user on the second occasion; and
(D) with the speech recognition engine, for a second utterance by the second user on the second occasion:
receiving the selected second values of the set of speech recognition parameters from the server;
using the received selected second values of the set of speech recognition parameters to evaluate the acoustic properties of the second utterance to determine second acoustic scores for the natural language options of the grammar;
combining the determined second acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute second computed scores for the natural language options of the grammar by using grammar weights for the natural language options of the grammar to bias the determined second acoustic scores; and
choosing the natural language option of the grammar having the highest second computed score as the natural language interpretation of the second utterance;
wherein the set of speech recognition parameters comprises one or both of an accuracy setting and a sensitivity value.
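The recognition steps recited in (B) and (D) can be illustrated with a short sketch. It is not the claimed implementation: the claim does not state how grammar weights bias the acoustic scores, so the sketch assumes the acoustic scores are log-likelihoods and the grammar weights are probabilities added in the log domain. The function name `interpret`, the example phrases, and all numeric values are hypothetical.

```python
import math


def interpret(acoustic_scores, grammar_weights):
    """Choose the grammar option with the highest computed score.

    acoustic_scores: option -> log-likelihood from the acoustic evaluation
        (assumed to have been produced elsewhere, using the parameter
        values received from the server).
    grammar_weights: option -> weight in (0, 1] (assumed to be a prior).
    """
    # Bias each acoustic score with its grammar weight (log-domain sum,
    # which is an assumption of this sketch, not something the claim says).
    computed = {
        option: score + math.log(grammar_weights[option])
        for option, score in acoustic_scores.items()
    }
    # The option with the highest computed score is the interpretation.
    best = max(computed, key=computed.get)
    return best, computed


# Hypothetical grammar with three natural language options.
acoustic = {"check balance": -4.2, "check balances": -4.0, "transfer funds": -9.5}
weights = {"check balance": 0.6, "check balances": 0.1, "transfer funds": 0.3}

best, computed = interpret(acoustic, weights)
print(best)
```

In this example the acoustically best option on its own is "check balances", but its low grammar weight lets "check balance" win once the scores are combined, which is the effect the biasing step is meant to have.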
Abstract
A method for execution on a server for serving presence information, the method for providing dynamically loaded speech recognition parameters to a speech recognition engine, can be provided. The method can include storing at least one rule for selecting speech recognition parameters, wherein a rule comprises an if-portion including criteria and a then-portion specifying speech recognition parameters that must be used when the criteria is met. The method can further include receiving notice that a speech recognition session has been initiated between a user and the speech recognition engine. The method can further include selecting a first set of speech recognition parameters responsive to executing the at least one rule and providing to the speech recognition engine the first set of speech recognition parameters for performing speech recognition of the user.
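As a rough illustration of the rule mechanism the abstract describes, the sketch below models a rule as an if-portion (a predicate over session data) and a then-portion (parameter values to use when the predicate holds). Everything specific here is an assumption made for the example: the session fields `presence` and `prior_failures`, the 0-to-1 value ranges, the default values, and the policy that later rules override earlier ones are not taken from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    criteria: Callable[[dict], bool]  # if-portion: test on session data
    parameters: Dict[str, float]      # then-portion: values to use if met


def select_parameters(rules: List[Rule], session: dict,
                      defaults: Dict[str, float]) -> Dict[str, float]:
    """Execute every rule against a session and return the selected values."""
    selected = dict(defaults)
    for rule in rules:
        if rule.criteria(session):
            # Assumed policy: a later matching rule overrides earlier values.
            selected.update(rule.parameters)
    return selected


# Hypothetical stored rules and defaults.
RULES = [
    Rule(lambda s: s.get("presence") == "mobile", {"sensitivity": 0.7}),
    Rule(lambda s: s.get("prior_failures", 0) >= 2, {"accuracy": 0.9}),
]
DEFAULTS = {"accuracy": 0.5, "sensitivity": 0.5}

# Two sessions with different users yield different parameter values.
first = select_parameters(RULES, {"user": "first", "presence": "mobile"}, DEFAULTS)
second = select_parameters(RULES, {"user": "second", "prior_failures": 3}, DEFAULTS)
print(first, second)
```

The same stored rules are executed for each session, so the values sent to the engine vary with the session data rather than being fixed in the engine's configuration.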
20 Claims
6. A method for providing dynamically loaded speech recognition parameters from a server to speech recognition engines, comprising:
(A) with the server, on a first occasion after a first speech recognition session has been initiated between a first user and a first speech recognition engine, the first speech recognition engine having been selected from among a plurality of speech recognition engines for use during the first speech recognition session based on most recently stored metadata about the plurality of speech recognition engines:
executing at least one rule for selecting speech recognition parameters for use by a speech recognition engine, wherein the at least one rule comprises an if-portion including criteria and a then-portion specifying values of speech recognition parameters that must be used by the speech recognition engine for evaluating natural language options of a grammar when the criteria is met;
selecting first values of a set of speech recognition parameters responsive to executing the at least one rule on the first occasion; and
communicating to the first speech recognition engine the selected first values of the set of speech recognition parameters for performing speech recognition of the first user on the first occasion;
(B) with the first speech recognition engine, for a first utterance by the first user on the first occasion:
receiving the selected first values of the set of speech recognition parameters from the server;
using the received selected first values of the set of speech recognition parameters to evaluate the acoustic properties of the first utterance to determine first acoustic scores for the natural language options of the grammar;
combining the determined first acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute first computed scores for the natural language option of the grammar by using grammar weights for the natural language options of the grammar to bias the determined first acoustic scores; and
choosing the natural language option of the grammar having the highest first computed score as the natural language interpretation of the first utterance;
(C) with the server, on a second occasion after a second speech recognition session has been initiated between a second user and the first speech recognition engine, the first speech recognition engine having been selected from among the plurality of speech recognition engines for use during the second speech recognition session based on most recently stored metadata about the plurality of speech recognition engines:
executing the at least one rule;
selecting second values of the set of speech recognition parameters responsive to executing the at least one rule on the second occasion; and
communicating to the first speech recognition engine the selected second values of the set of speech recognition parameters for performing speech recognition of the second user on the second occasion;
(D) with the first speech recognition engine, for a second utterance by the second user on the second occasion:
receiving the selected second values of the set of speech recognition parameters from the server;
using the received selected second values of the set of speech recognition parameters to evaluate the acoustic properties of the second utterance to determine second acoustic scores for the natural language options of the grammar;
combining the determined second acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute second computed scores for the natural language option of the grammar by using grammar weights for the natural language options of the grammar to bias the determined second acoustic scores; and
choosing the natural language option of the grammar having the highest second computed score as the natural language interpretation of the second utterance;
(E) with the server, on a third occasion after a third speech recognition session has been initiated between a third user and a second speech recognition engine, the second speech recognition engine having been selected from among the plurality of speech recognition engines for use during the third speech recognition session based on most recently stored metadata about the plurality of speech recognition engines:
executing the at least one rule;
selecting third values of the set of speech recognition parameters responsive to executing the at least one rule on the third occasion; and
communicating to the second speech recognition engine the selected third values of the set of speech recognition parameters for performing speech recognition of the third user on the third occasion;
(F) with the second speech recognition engine, for a third utterance by the third user on the third occasion:
receiving the selected third values of the set of speech recognition parameters from the server;
using the received selected third values of the set of speech recognition parameters to evaluate the acoustic properties of the second utterance to determine third acoustic scores for the natural language options of the grammar;
combining the determined third acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute third computed scores for the natural language option of the grammar by using grammar weights for the natural language options of the grammar to bias the determined third acoustic scores; and
choosing the natural language option of the grammar having the highest third computed score as the natural language interpretation of the third utterance;
(G) with the server, on a fourth occasion after a fourth speech recognition session has been initiated between a fourth user and the second speech recognition engine, the second speech recognition engine having been selected from among the plurality of speech recognition engines for use during the fourth speech recognition session based on most recently stored metadata about the plurality of speech recognition engines:
executing the at least one rule;
selecting a fourth set of speech recognition parameters responsive to executing the at least one rule on the fourth occasion; and
communicating to the second speech recognition engine the selected fourth values of the set of speech recognition parameters for performing speech recognition of the user on the fourth occasion; and
(H) with the second speech recognition engine, for a fourth utterance by the fourth user on the fourth occasion:
receiving the selected fourth values of the set of speech recognition parameters from the server;
using the received selected fourth values of the set of speech recognition parameters to evaluate the acoustic properties of the fourth utterance to determine fourth acoustic scores for the natural language options of the grammar;
combining the determined fourth acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute fourth computed scores for the natural language option of the grammar by using grammar weights for the natural language options of the grammar to bias the determined fourth acoustic scores; and
choosing the natural language option of the grammar having the highest fourth computed score as the natural language interpretation of the fourth utterance;
wherein the set of speech recognition parameters comprises one or both of an accuracy setting and a sensitivity value.
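Claim 6 adds that each engine is selected "based on most recently stored metadata" about the plurality of engines. The claim does not say what that metadata contains or how it drives the choice, so the following is only one possible reading: keep the newest record per engine, then pick the engine with the lowest load. The `EngineMetadata` fields, the `active_sessions` load metric, and the engine identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EngineMetadata:
    engine_id: str
    stored_at: float      # time at which this record was stored
    active_sessions: int  # hypothetical load metric


def select_engine(records: List[EngineMetadata]) -> str:
    """Pick an engine using only the most recently stored record for each."""
    latest: Dict[str, EngineMetadata] = {}
    for rec in records:
        current = latest.get(rec.engine_id)
        if current is None or rec.stored_at > current.stored_at:
            latest[rec.engine_id] = rec
    # Assumed criterion: the engine with the fewest active sessions wins.
    return min(latest.values(), key=lambda r: r.active_sessions).engine_id


records = [
    EngineMetadata("engine-1", stored_at=100.0, active_sessions=1),
    EngineMetadata("engine-2", stored_at=100.0, active_sessions=4),
    EngineMetadata("engine-1", stored_at=200.0, active_sessions=9),  # newer record
]
chosen = select_engine(records)
print(chosen)
```

Because the stale record for engine-1 is ignored, the selection reflects the current state of each engine, and different sessions can be routed to different engines as the stored metadata changes.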
12. A computer system, comprising:
a server;
a speech recognition engine; and
a repository for storing at least one rule for selecting speech recognition parameters for use by a speech recognition engine, wherein a rule comprises an if-portion including criteria and a then-portion specifying values of speech recognition parameters that must be used for evaluating natural language options of a grammar when the criteria is met;
wherein the server is configured to, on a first occasion after a first speech recognition session has been initiated between a first user and the speech recognition engine:
execute the at least one rule;
select first values of a set of speech recognition parameters responsive to executing the at least one rule on the first occasion; and
communicate to the speech recognition engine the selected first values of the set of speech recognition parameters for performing speech recognition of the first user on the first occasion;
wherein the speech recognition engine is configured to, for a first utterance by the first user on the first occasion:
receive the selected first values of the set of speech recognition parameters from the server;
use the received selected first values of the set of speech recognition parameters to evaluate the acoustic properties of the first utterance to determine first acoustic scores for the natural language options of the grammar;
combine the determined first acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute first computed scores for the natural language options of the grammar by using grammar weights for the natural language options of the grammar to bias the determined first acoustic scores; and
choose the natural language option of the grammar having the highest first computed score as the natural language interpretation of the first utterance;
wherein the server is further configured to, on a second occasion after a second speech recognition session has been initiated between a second user and the speech recognition engine:
execute the at least one rule;
select second values of the set of speech recognition parameters responsive to executing the at least one rule on the second occasion; and
communicate to the speech recognition engine the selected second values of the set of speech recognition parameters for performing speech recognition of the second user on the second occasion;
wherein the speech recognition engine is further configured to, for a second utterance by the second user on the second occasion:
receive the selected second values of the set of speech recognition parameters from the server;
use the received selected second values of the set of speech recognition parameters to evaluate the acoustic properties of the second utterance to determine second acoustic scores for the natural language options of the grammar;
combine the determined second acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute second computed scores for the natural language options of the grammar by using grammar weights for the natural language options of the grammar to bias the determined second acoustic scores; and
choose the natural language option of the grammar having the highest second computed score as the natural language interpretation of the second utterance; and
wherein the set of speech recognition parameters comprises one or both of an accuracy setting and a sensitivity value.
16. A non-transitory computer-readable medium encoded with a plurality of instructions that, when executed by at least one processor, cause the at least one processor to perform a method for providing dynamically loaded speech recognition parameters from a server to a speech recognition engine, comprising:
(A) executing, on a first occasion after a first speech recognition session has been initiated between a first user and the speech recognition engine, at least one rule for selecting speech recognition parameters for use by the speech recognition engine, wherein the at least one rule comprises an if-portion including criteria and a then-portion specifying values of speech recognition parameters that must be used by the speech recognition engine for evaluating natural language options of a grammar when the criteria is met;
(B) selecting first values of a set of speech recognition parameters responsive to executing the at least one rule on the first occasion;
(C) communicating to the speech recognition engine the selected first values of the set of speech recognition parameters for performing speech recognition of the first user on the first occasion;
wherein communication of the selected first values of the set of speech recognition parameters from the server to the speech recognition engine allows the speech recognition engine, for a first utterance by the first user on the first occasion, to:
receive the selected first values of the set of speech recognition parameters from the server;
use the received selected first values of the set of speech recognition parameters to evaluate the acoustic properties of the first utterance to determine first acoustic scores for the natural language options of the grammar;
combine the determined first acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute first computed scores for the natural language options of the grammar by using grammar weights for the natural language options of the grammar to bias the determined first acoustic scores; and
choose the natural language option of the grammar having the highest first computed score as the natural language interpretation of the first utterance;
(D) executing the at least one rule on a second occasion after a second speech recognition session has been initiated between a second user and the speech recognition engine;
(E) selecting second values of the set of speech recognition parameters responsive to executing the at least one rule on the second occasion; and
(F) communicating to the speech recognition engine the selected second values of the set of speech recognition parameters for performing speech recognition of the second user on the second occasion;
wherein communication of the selected second values of the set of speech recognition parameters from the server to the speech recognition engine allows the speech recognition engine, for a second utterance by the second user on the second occasion, to:
receive the selected second values of the set of speech recognition parameters from the server;
use the received selected second values of the set of speech recognition parameters to evaluate the acoustic properties of the second utterance to determine second acoustic scores for the natural language options of the grammar;
combine the determined second acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute second computed scores for the natural language options of the grammar by using grammar weights for the natural language options of the grammar to bias the determined second acoustic scores; and
choose the natural language option of the grammar having the highest second computed score as the natural language interpretation of the second utterance; and
wherein the set of speech recognition parameters comprises one or both of an accuracy setting and a sensitivity value.
19. A non-transitory computer-readable medium encoded with a plurality of instructions that, when executed by at least one processor, cause the at least one processor to perform a method for providing dynamically loaded speech recognition parameters from a server to speech recognition engines, comprising:
(A) executing the at least one rule on a first occasion after a first speech recognition session has been initiated between a first user and a first speech recognition engine, the first speech recognition engine having been selected from among a plurality of speech recognition engines for use during the first speech recognition session based on most recently stored metadata about the plurality of speech recognition engines, the at least one rule allowing selection of speech recognition parameters for use by the plurality of speech recognition engines, wherein the at least one rule comprises an if-portion including criteria and a then-portion specifying values of speech recognition parameters that must be used by a speech recognition engine for evaluating natural language options of a grammar when the criteria is met;
(B) selecting first values of a set of speech recognition parameters responsive to executing the at least one rule on the first occasion;
(C) communicating to the first speech recognition engine the selected first values of the set of speech recognition parameters for performing speech recognition of the first user on the first occasion;
wherein communication of the selected first values of the set of speech recognition parameters from the server to the first speech recognition engine allows the first speech recognition engine, for a first utterance by the first user on the first occasion, to:
receive the selected first values of the set of speech recognition parameters from the server;
use the received selected first values of the set of speech recognition parameters to evaluate the acoustic properties of the first utterance to determine first acoustic scores for the natural language options of the grammar;
combine the determined first acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute first computed scores for the natural language option of the grammar by using grammar weights for the natural language options of the grammar to bias the determined first acoustic scores; and
choose the natural language option of the grammar having the highest first computed score as the natural language interpretation of the first utterance;
(D) executing the at least one rule on a second occasion after a second speech recognition session has been initiated between a second user and the first speech recognition engine, the first speech recognition engine having been selected from among the plurality of speech recognition engines for use during the second speech recognition session based on most recently stored metadata about the plurality of speech recognition engines;
(E) selecting second values of the set of speech recognition parameters responsive to executing the at least one rule on the second occasion;
(F) communicating to the first speech recognition engine the selected second values of the set of speech recognition parameters for performing speech recognition of the second user on the second occasion;
wherein communication of the selected second values of the set of speech recognition parameters from the server to the first speech recognition engine allows the first speech recognition engine, for a second utterance by the second user on the second occasion, to:
receive the selected second values of the set of speech recognition parameters from the server;
use the received selected second values of the set of speech recognition parameters to evaluate the acoustic properties of the second utterance to determine second acoustic scores for the natural language options of the grammar;
combine the determined second acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute second computed scores for the natural language option of the grammar by using grammar weights for the natural language options of the grammar to bias the determined second acoustic scores; and
choose the natural language option of the grammar having the highest second computed score as the natural language interpretation of the second utterance;
(G) executing the at least one rule on a third occasion after a third speech recognition session has been initiated between a third user and a second speech recognition engine, the second speech recognition engine having been selected from among the plurality of speech recognition engines for use during the third speech recognition session based on most recently stored metadata about the plurality of speech recognition engines;
(H) selecting third values of the set of speech recognition parameters responsive to executing the at least one rule on the third occasion;
(I) communicating to the second speech recognition engine the selected third values of the set of speech recognition parameters for performing speech recognition of the third user on the third occasion;
wherein communication of the selected third values of the set of speech recognition parameters from the server to the second speech recognition engine allows the second speech recognition engine, for a third utterance by the third user on the third occasion, to:
receive the selected third values of the set of speech recognition parameters from the server;
use the received selected third values of the set of speech recognition parameters to evaluate the acoustic properties of the second utterance to determine third acoustic scores for the natural language options of the grammar;
combine the determined third acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute third computed scores for the natural language option of the grammar by using grammar weights for the natural language options of the grammar to bias the determined third acoustic scores; and
choose the natural language option of the grammar having the highest third computed score as the natural language interpretation of the third utterance;
(J) executing the at least one rule on a fourth occasion after a fourth speech recognition session has been initiated between a fourth user and the second speech recognition engine, the second speech recognition engine having been selected from among the plurality of speech recognition engines for use during the fourth speech recognition session based on most recently stored metadata about the plurality of speech recognition engines;
(K) selecting fourth values of the set of speech recognition parameters responsive to executing the at least one rule on the fourth occasion; and
(L) communicating to the second speech recognition engine the selected fourth values of the set of speech recognition parameters for performing speech recognition of the fourth user on the fourth occasion;
wherein communication of the selected fourth values of the set of speech recognition parameters from the server to the second speech recognition engine allows the second speech recognition engine, for a fourth utterance by the fourth user on the fourth occasion, to:
receive the selected fourth values of the set of speech recognition parameters from the server;
use the received selected fourth values of the set of speech recognition parameters to evaluate the acoustic properties of the fourth utterance to determine fourth acoustic scores for the natural language options of the grammar;
combine the determined fourth acoustic scores for the natural language options of the grammar with grammar weights for the natural language options of the grammar to compute fourth computed scores for the natural language option of the grammar by using grammar weights for the natural language options of the grammar to bias the determined fourth acoustic scores; and
choose the natural language option of the grammar having the highest fourth computed score as the natural language interpretation of the fourth utterance; and
wherein the set of speech recognition parameters comprises one or both of an accuracy setting and a sensitivity value.
Specification