Adapting enhanced acoustic models

US 8,185,392 B1
Filed: 09/30/2011
Issued: 05/22/2012
Est. Priority Date: 07/13/2010
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving voice queries, obtaining, for one or more of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query, generating, for the one or more voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query, selecting a subset of the one or more voice queries based on the posterior recognition confidence measures, and adapting an acoustic model using the subset of the voice queries.
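
The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `VoiceQuery` fields, the `FEEDBACK_SCORES` values and their signs, the 0.5 starting point, and the 0.6 threshold are all hypothetical, since the abstract does not say how a feedback action maps to a probability.

```python
from dataclasses import dataclass

# Hypothetical adjustment per feedback action; the source gives no numbers,
# and the direction of each adjustment is an assumption for this sketch.
FEEDBACK_SCORES = {
    "selected_result": 0.3,
    "explicitly_confirmed": 0.4,
    "typed_correction": -0.4,
    "resubmitted_query": -0.3,
}


@dataclass
class VoiceQuery:
    query_id: str
    user_id: str
    feedback_action: str  # action the user took after reviewing the result


def posterior_confidence(query: VoiceQuery, prior: float = 0.5) -> float:
    """Estimate the probability of correct recognition from feedback alone."""
    score = prior + FEEDBACK_SCORES.get(query.feedback_action, 0.0)
    return min(1.0, max(0.0, score))  # keep the result a valid probability


def select_for_adaptation(queries, threshold: float = 0.6):
    """Keep only queries whose posterior confidence reaches the threshold."""
    return [q for q in queries if posterior_confidence(q) >= threshold]


if __name__ == "__main__":
    received = [
        VoiceQuery("q1", "user-a", "selected_result"),
        VoiceQuery("q2", "user-b", "typed_correction"),
        VoiceQuery("q3", "user-c", "explicitly_confirmed"),
    ]
    subset = select_for_adaptation(received)
    # The subset would then be handed to an acoustic-model adaptation step.
    print([q.query_id for q in subset])
```

Only the selected subset is passed on to adaptation; how the acoustic model itself is adapted is outside the scope of this sketch.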

22 Claims

1. A system comprising:
- one or more computers; and
- a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
  - receiving voice queries submitted by different users,
  - obtaining, for each of a plurality of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query,
  - determining, for each of the plurality of the voice queries, a speech recognizer confidence measure for the voice query,
  - generating, for each of the plurality of the voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query and a speech recognizer confidence measure for the voice query, wherein generating the posterior recognition confidence measures comprises:
    - generating the posterior recognition confidence measure for at least one of the plurality of the voice queries based on information that identifies that a user revealed an alternates list on a user interface of the mobile device, in which case the posterior recognition confidence measure is adjusted to indicate an increased probability that the voice query was correctly recognized; and
    - generating the posterior recognition confidence measure for at least one of the plurality of the voice queries based on information that identifies that a user did not reveal an alternates list, in which case the posterior recognition confidence measure is adjusted to indicate a decreased probability that the voice query was correctly recognized;
  - selecting a subset of the plurality of the voice queries for adapting an acoustic model based on the generated posterior recognition confidence measures, wherein the subset includes voice queries submitted by different users, and
  - adapting the acoustic model using the subset of the voice queries.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
  - 2. The system of claim 1, wherein:
    - the posterior recognition confidence measure reflects a probability that a first acoustic model correctly recognized the voice query; and
    - adapting an acoustic model further comprises building a different, second acoustic model.
  - 3. The system of claim 2, wherein:
    - building the second acoustic model further comprises building the second acoustic model using the first acoustic model and the subset of the voice queries.
  - 4. The system of claim 1, wherein selecting the subset of the voice queries further comprises selecting the voice queries that have a posterior recognition confidence measure above a predefined threshold.
  - 5. The system of claim 4, wherein selecting the subset of the voice queries further comprises selecting the voice queries that have a posterior recognition confidence measure above a first predefined threshold and below a second predefined threshold.
  - 6. The system of claim 1, wherein the feedback information identifies whether the user selected the result of the voice query within a predetermined period of time of submitting the voice query.
  - 7. The system of claim 1, wherein the feedback information identifies whether the user submitted another voice query within a predetermined period of time of submitting the voice query.
  - 8. The system of claim 1, wherein the feedback information identifies whether the user typed into a query box after receiving the result.
  - 9. The system of claim 1, wherein the feedback information identifies whether the user explicitly confirmed that the voice query was correctly recognized.
  - 10. The system of claim 1, wherein adapting an acoustic model further comprises adapting an acoustic model using only the subset of the voice queries.
  - 11. The system of claim 1, wherein the operations further comprise:
    - receiving an additional voice query; and
    - performing speech recognition on the additional voice query using the acoustic model.
  - 12. The system of claim 11, wherein the operations further comprise:
    - performing a search query using a result of the speech recognition.
  - 13. The system of claim 1, wherein selecting the subset of the plurality of the voice queries comprises selecting fewer than all of the plurality of the voice queries.
  - 14. The system of claim 1, wherein generating the posterior recognition confidence measure comprises adjusting the speech recognizer confidence measure for a particular voice query based on the feedback information for the voice query.
  - 15. The system of claim 1, wherein selecting a subset of the voice queries comprises selecting voice queries of the plurality of voice queries for which the speech recognizer confidence measure exceeds predetermined threshold except where the user types a correction within a predetermined period of time.
  - 16. The system of claim 1, wherein selecting a subset of the voice queries comprises selecting voice queries of the plurality of voice queries for which the speech recognizer confidence measure does not exceed a predetermined threshold, but for which a user opened or selected an alternates list.
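
The selection rules in claims 4, 5, 15, and 16 can be read as simple predicates over per-query scores. The sketch below is one possible reading, not the claimed implementation: the `ScoredQuery` type, its field names, and every threshold value are hypothetical, and "exceeds" is interpreted as a strict comparison.

```python
from dataclasses import dataclass


@dataclass
class ScoredQuery:
    query_id: str
    recognizer_confidence: float  # speech recognizer confidence measure
    posterior_confidence: float   # posterior recognition confidence measure
    typed_correction_in_window: bool = False  # feedback used in claim 15
    opened_alternates: bool = False           # feedback used in claim 16


def above_threshold(q: ScoredQuery, threshold: float) -> bool:
    """Claim 4: posterior confidence above a predefined threshold."""
    return q.posterior_confidence > threshold


def within_band(q: ScoredQuery, lower: float, upper: float) -> bool:
    """Claim 5: posterior confidence above a first and below a second threshold."""
    return lower < q.posterior_confidence < upper


def confident_unless_corrected(q: ScoredQuery, threshold: float) -> bool:
    """Claim 15: recognizer confidence exceeds the threshold, unless the user
    typed a correction within the predetermined period."""
    return q.recognizer_confidence > threshold and not q.typed_correction_in_window


def unconfident_but_alternates_opened(q: ScoredQuery, threshold: float) -> bool:
    """Claim 16: recognizer confidence does not exceed the threshold, but the
    user opened or selected an alternates list."""
    return q.recognizer_confidence <= threshold and q.opened_alternates


if __name__ == "__main__":
    queries = [
        ScoredQuery("q1", recognizer_confidence=0.9, posterior_confidence=0.85),
        ScoredQuery("q2", 0.9, 0.40, typed_correction_in_window=True),
        ScoredQuery("q3", 0.3, 0.65, opened_alternates=True),
    ]
    print([q.query_id for q in queries if within_band(q, 0.6, 0.95)])
    print([q.query_id for q in queries if confident_unless_corrected(q, 0.8)])
    print([q.query_id for q in queries if unconfident_but_alternates_opened(q, 0.8)])
```

Keeping each rule as a separate predicate makes it easy to combine them or swap thresholds without touching the rest of the selection step.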

17. A non-transitory computer readable storage medium encoded with a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
- receiving voice queries submitted by different users,
- obtaining, for each of a plurality of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query,
- determining, for each of the plurality of the voice queries, a speech recognizer confidence measure for the voice query,
- generating, for each of the plurality of the voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query and a speech recognizer confidence measure for the voice query, wherein generating the posterior recognition confidence measures comprises:
  - generating the posterior recognition confidence measure for at least one of the plurality of the voice queries based on information that identifies that a user revealed an alternates list on a user interface of the mobile device, in which case the posterior recognition confidence measure is adjusted to indicate an increased probability that the voice query was correctly recognized; and
  - generating the posterior recognition confidence measure for at least one of the plurality of the voice queries based on information that identifies that a user did not reveal an alternates list, in which case the posterior recognition confidence measure is adjusted to indicate a decreased probability that the voice query was correctly recognized;
- selecting a subset of the plurality of the voice queries for adapting an acoustic model based on the generated posterior recognition confidence measures, wherein the subset includes voice queries submitted by different users, and
- adapting the acoustic model using the subset of the voice queries.
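
The alternates-list adjustment recited in the independent claims can be illustrated with a small function. This is a sketch only: the function name, the additive form of the adjustment, and the `step` size are assumptions, because the claims state the direction of the adjustment but not its form or magnitude.

```python
def posterior_from_alternates(
    recognizer_confidence: float,
    revealed_alternates: bool,
    step: float = 0.1,  # hypothetical adjustment size; not given in the claims
) -> float:
    """Adjust the recognizer confidence in the direction the claims recite.

    Revealing the alternates list raises the posterior confidence; not
    revealing it lowers the posterior confidence. The result is clamped
    to the range [0, 1] so that it remains a valid probability.
    """
    adjustment = step if revealed_alternates else -step
    return min(1.0, max(0.0, recognizer_confidence + adjustment))


if __name__ == "__main__":
    print(posterior_from_alternates(0.5, revealed_alternates=True))
    print(posterior_from_alternates(0.5, revealed_alternates=False))
```

The direction of each adjustment follows the claim language as worded; a different formula, such as a multiplicative or Bayesian update, would satisfy the same wording.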

18. A computer-implemented method comprising:
- receiving voice queries submitted by different users,
- obtaining, for each of a plurality of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query,
- determining, for each of the plurality of the voice queries, a speech recognizer confidence measure for the voice query,
- generating, for each of the plurality of the voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query and a speech recognizer confidence measure for the voice query, wherein generating the posterior recognition confidence measures comprises:
  - generating the posterior recognition confidence measure for at least one of the plurality of the voice queries based on information that identifies that a user revealed an alternates list on a user interface of the mobile device, in which case the posterior recognition confidence measure is adjusted to indicate an increased probability that the voice query was correctly recognized; and
  - generating the posterior recognition confidence measure for at least one of the plurality of the voice queries based on information that identifies that a user did not reveal an alternates list, in which case the posterior recognition confidence measure is adjusted to indicate a decreased probability that the voice query was correctly recognized;
- selecting a subset of the plurality of the voice queries for adapting an acoustic model based on the generated posterior recognition confidence measures, wherein the subset includes voice queries submitted by different users, and
- adapting the acoustic model using the subset of the voice queries.
- Dependent Claims (19, 20)
  - 19. The method of claim 18, wherein:
    - the posterior recognition confidence measure reflects a probability that a first acoustic model correctly recognized the voice query; and
    - adapting an acoustic model further comprises building a different, second acoustic model.
  - 20. The method of claim 19, wherein:
    - building the second acoustic model further comprises building the second acoustic model using the first acoustic model and the subset of the voice queries.
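
Claims 19 and 20 describe building a second acoustic model from the first model and the selected subset, without naming an adaptation technique. As one illustration only, the sketch below uses a MAP-style interpolation of a single Gaussian mean, a common adaptation approach that is swapped in here as an assumption and is not stated in the claims. The function name, the scalar feature values, and the prior weight `tau` are all hypothetical.

```python
from typing import Sequence


def map_adapt_mean(first_mean: float, samples: Sequence[float], tau: float = 10.0) -> float:
    """Blend the first model's mean with the mean of the adaptation samples.

    tau acts as a prior weight: a larger tau keeps the result closer to the
    first model, while more samples pull it toward the new data.
    """
    n = len(samples)
    return (tau * first_mean + sum(samples)) / (tau + n)


if __name__ == "__main__":
    # Scalar features extracted from the selected voice queries (made up).
    selected_features = [1.0, 1.0]
    second_mean = map_adapt_mean(first_mean=0.0, samples=selected_features, tau=2.0)
    print(second_mean)
```

With no adaptation samples the function returns the first model's mean unchanged, which matches the intuition that the second model starts from the first.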

21. A system comprising:
- one or more computers; and
- a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
  - receiving voice queries submitted by different users,
  - determining, for each of a plurality of the voice queries, a speech recognizer confidence measure for the voice query,
  - obtaining, for each of the plurality of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query,
  - determining, for each of the plurality of the voice queries and based on the feedback information, whether the user took a particular action after receiving the result of the voice query,
  - generating, for each of the plurality of the voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is based at least on a speech recognizer confidence measure for the voice query, wherein generating the posterior recognition confidence measures comprises:
    - generating the posterior recognition confidence measure for at least one of the plurality of the voice queries based on information that identifies that a user revealed an alternates list on a user interface of the mobile device, in which case the posterior recognition confidence measure is adjusted to indicate an increased probability that the voice query was correctly recognized; and
    - generating the posterior recognition confidence measure for at least one of the plurality of the voice queries based on information that identifies that a user did not reveal an alternates list, in which case the posterior recognition confidence measure is adjusted to indicate a decreased probability that the voice query was correctly recognized;
  - selecting an acoustic model adaption subset of the plurality of voice queries for which the user is determined to have taken the particular action and for which the generated posterior recognition confidence measure satisfies a predetermined threshold, the acoustic model adaption subset comprising voice queries from different users, and
  - adapting an acoustic model using the acoustic model adaptation subset of the voice queries.
- Dependent Claims (22)
  - 22. The system of claim 21, further comprising:
    - generating, for the one or more voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query,
    - wherein selecting a subset further comprises selecting a subset of the one or more voice queries in which the posterior recognition confidence measure satisfies a predefined threshold, and in which the user is determined to have taken the particular action.
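
The selection step in claims 21 and 22 combines two conditions: the user took the particular action, and the posterior confidence satisfies a threshold. A minimal sketch follows, assuming a hypothetical `Candidate` record, a made-up threshold of 0.7, and reading "satisfies" as greater than or equal to; none of these are specified in the claims.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    query_id: str
    user_id: str
    took_particular_action: bool  # determined from the feedback information
    posterior_confidence: float


def adaptation_subset(candidates: List[Candidate], threshold: float = 0.7) -> List[Candidate]:
    """Keep queries that meet both conditions recited in claim 21."""
    return [
        c for c in candidates
        if c.took_particular_action and c.posterior_confidence >= threshold
    ]


def spans_different_users(subset: List[Candidate]) -> bool:
    """Check that the subset contains voice queries from more than one user."""
    return len({c.user_id for c in subset}) > 1


if __name__ == "__main__":
    pool = [
        Candidate("q1", "user-a", True, 0.9),
        Candidate("q2", "user-b", True, 0.5),   # fails the threshold
        Candidate("q3", "user-c", False, 0.95),  # did not take the action
        Candidate("q4", "user-d", True, 0.8),
    ]
    subset = adaptation_subset(pool)
    print([c.query_id for c in subset], spans_different_users(subset))
```

The multi-user check is kept separate from the filter so that a caller can decide what to do when the surviving queries all come from one user.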

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Strope, Brian; Beeferman, Douglas H.
Primary Examiner(s): Shah, Paras D
Application Number: US13/249,332
Time in Patent Office: 235 Days
Field of Search: 704/231, 704/236, 704/239, 704/251, 704/252
US Class Current: 704/252
CPC Class Codes:
G10L 15/01   Assessment or evaluation of...
G10L 15/065   Adaptation
G10L 15/07   to the speaker
G10L 15/10   using distance or distortio...
G10L 17/02   Preprocessing operations, e...
