Correcting voice recognition using selective re-speak

US 10,354,647 B2
Filed: 04/28/2016
Issued: 07/16/2019
Est. Priority Date: 04/28/2015
Status: Active Grant

2 Assignments

0 Petitions

Abstract

Implementations of the present disclosure include actions of providing first text for display on a computing device of a user, the first text being provided from a first speech recognition engine based on first speech received from the computing device, and being displayed as a search query, receiving a speech correction indication from the computing device, the speech correction indication indicating a portion of the first text that is to be corrected, receiving second speech from the computing device, receiving second text from a second speech recognition engine based on the second speech, the second speech recognition engine being different from the first speech recognition engine, replacing the portion of the first text with the second text to provide a combined text, and providing the combined text for display on the computing device as a revised search query.
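
The replacement step in the abstract can be illustrated with a minimal sketch. It assumes the portion to be corrected and the second text are already known as plain strings; the function name `replace_portion` and the sample query are hypothetical and do not come from the patent.

```python
def replace_portion(first_text: str, portion: str, second_text: str) -> str:
    """Replace the first occurrence of `portion` in `first_text` with `second_text`.

    Illustrative only: the patent does not prescribe string replacement
    as the mechanism for producing the combined text.
    """
    if portion not in first_text:
        raise ValueError(f"{portion!r} does not occur in {first_text!r}")
    return first_text.replace(portion, second_text, 1)


# Hypothetical example: only the misrecognized word is re-spoken.
revised_query = replace_portion("weather in boston", "boston", "austin")
print(revised_query)  # weather in austin
```

When the portion equals the whole of the first text, the same call replaces the entire query, which corresponds to the case where the portion comprises an entirety of the first text.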

183 Citations

15 Claims

1. A computer-implemented method executed by a server system comprising one or more computers, the method comprising:
- providing, by the server system, first text for display on a computing device of a user, the first text being received from a first speech recognition engine, the first speech recognition engine having converted first speech received from the computing device into the first text by processing the first speech to generate multiple potential texts and associating each of the multiple potential texts with a respective plurality of entities, and the first text being displayed as a search query prior to executing the search query to obtain search results;
  
  receiving, by the server system, a speech correction indication from the computing device, the speech correction indication (i) initiating a correction of the first text, (ii) providing context to select a portion of the first text that is to be corrected without explicitly indicating the portion of the first text to be corrected and without repeating the first text, and (iii) providing context for selecting second text to correct the portion of the first text without explicitly reciting the second text prior to executing the search query to obtain search results;
  
  processing, by the server system, the speech correction indication to determine both (i) the portion of the first text that is to be corrected and (ii) the second text to correct the portion of the first text prior to executing the search query to obtain search results, the second text determined based on associating second speech with a second respective plurality of entities and selecting as the second text one of the multiple potential texts generated from the first speech and associated with the respective plurality of entities that best matches the second respective plurality of entities associated with the second speech;
  
  replacing, by the server system, the portion of the first text with the second text to provide a combined text prior to executing the search query to obtain search results; and
  
  providing, by the server system, the combined text for display on the computing device as a revised search query.
- 2. The method of claim 1, wherein the portion comprises an entirety of the first text.
- 3. The method of claim 1, wherein the portion comprises less than an entirety of the first text.
- 4. The method of claim 1, wherein a second speech recognition engine is configured to select a potential text as the second text based on one or more entities associated with the first text.
- 5. The method of claim 1, further comprising:
  - receiving search results based on the revised search query; and
  - providing the search results for display on the computing device.
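
The selection step recited in claim 1 (choosing, from the potential texts generated from the first speech, the one whose entities best match the entities associated with the second speech) can be sketched as follows. This is an illustration under stated assumptions: the claim does not define how "best matches" is measured, so Jaccard overlap between entity sets is used here as a stand-in, and the `Candidate` class, the function names, and the sample entities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One potential text from the first speech, with its associated entities."""
    text: str
    entities: frozenset


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap between two entity sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_second_text(candidates, second_entities: frozenset) -> str:
    """Return the candidate text whose entities best match `second_entities`."""
    best = max(candidates, key=lambda c: jaccard(c.entities, second_entities))
    return best.text


# Hypothetical potential texts produced from the first speech.
candidates = [
    Candidate("paris", frozenset({"city", "france", "europe"})),
    Candidate("parrots", frozenset({"bird", "animal", "pet"})),
    Candidate("ferris", frozenset({"person", "name"})),
]

# Hypothetical entities associated with second speech such as "I meant the bird".
second_text = select_second_text(candidates, frozenset({"bird", "animal"}))
print(second_text)  # parrots
```

Note that the second speech never recites the word "parrots"; it only supplies context, which matches the claim language about selecting the second text without explicitly reciting it.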

6. A system comprising:
- one or more computers and one or more storage devices on which are stored instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising;
  
  providing first text for display on a computing device of a user, the first text being received from a first speech recognition engine, the first speech recognition engine having converted first speech received from the computing device into the first text by processing the first speech to generate multiple potential texts and associating each of the multiple potential texts with a respective plurality of entities, and the first text being displayed as a search query prior to executing the search query to obtain search results;
  
  receiving a speech correction indication from the computing device, the speech correction indication (i) initiating a correction of the first text, (ii) providing context to select a portion of the first text that is to be corrected without explicitly indicating the portion of the first text to be corrected and without repeating the first text, and (iii) providing context for selecting second text to correct the portion of the first text without explicitly reciting the second text prior to executing the search query to obtain search results;
  
  processing the speech correction indication to determine both (i) the portion of the first text that is to be corrected and (ii) the second text to correct the portion of the first text prior to executing the search query to obtain search results, the second text determined based on associating second speech with a second respective plurality of entities and selecting as the second text one of the multiple potential texts generated from the first speech and associated with the respective plurality of entities that best matches the second respective plurality of entities associated with the second speech;
  
  replacing the portion of the first text with the second text to provide a combined text prior to executing the search query to obtain search results; and
  
  providing the combined text for display on the computing device as a revised search query.
- 7. The system of claim 6, wherein the portion comprises an entirety of the first text.
- 8. The system of claim 6, wherein the portion comprises less than an entirety of the first text.
- 9. The system of claim 6, wherein a second speech recognition engine is configured to select a potential text as the second text based on one or more entities associated with the first text.
- 10. The system of claim 6, wherein operations further comprise:
  - receiving search results based on the revised search query; and
  - providing the search results for display on the computing device.
- 11. The system of claim 6, wherein the operations further comprise:
  - recognizing in the second text an expression that signifies “I meant”.
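
Claim 11 recites recognizing an expression that signifies “I meant” in the second text. One way this might look is a simple pattern match, sketched below. The regular expression, the optional leading "no", and the function name `extract_correction` are assumptions made for illustration; the patent does not specify how the expression is recognized.

```python
import re
from typing import Optional

# Assumed pattern: an optional leading "no", then "I meant", then the correction context.
CORRECTION_RE = re.compile(r"^\s*(?:no[,\s]+)?i meant\s+(?P<rest>.+)$", re.IGNORECASE)


def extract_correction(second_text: str) -> Optional[str]:
    """Return the words following "I meant", or None if the expression is absent."""
    match = CORRECTION_RE.match(second_text)
    return match.group("rest").strip() if match else None


print(extract_correction("No, I meant Austin"))  # Austin
print(extract_correction("weather today"))       # None
```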

12. A non-transitory computer readable medium storing instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising:
- providing first text for display on a computing device of a user, the first text being received from a first speech recognition engine, the first speech recognition engine having converted first speech received from the computing device into the first text by processing the first speech to generate multiple potential texts and associating each of the multiple potential texts with a respective plurality of entities, and the first text being displayed as a search query prior to executing the search query to obtain search results;
  
  receiving a speech correction indication from the computing device, the speech correction indication (i) initiating a correction of the first text, (ii) providing context to select a portion of the first text that is to be corrected without explicitly indicating the portion of the first text to be corrected and without repeating the first text, and (iii) providing context for selecting second text to correct the portion of the first text without explicitly reciting the second text prior to executing the search query to obtain search results;
  
  processing the speech indication to determine both (i) the portion of the first text that is to be corrected and (ii) the second text to correct the portion of the first text prior to executing the search query to obtain search results, the second text determined based on associating second speech with a second respective plurality of entities and selecting as the second text one of the multiple potential texts generated from the first speech and associated with the respective plurality of entities that best matches the second respective plurality of entities associated with the second speech;
  
  replacing the portion of the first text with the second text to provide a combined text prior to executing the search query to obtain search results; and
  
  providing the combined text for display on the computing device as a revised search query.
- 13. The non-transitory computer readable medium of claim 12, wherein the portion comprises an entirety of the first text.
- 14. The non-transitory computer readable medium of claim 12, wherein the portion comprises less than an entirety of the first text.
- 15. The non-transitory computer readable medium of claim 12, wherein a second speech recognition engine is configured to select a potential text as the second text based on one or more entities associated with the first text.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google LLC (Alphabet Inc.)
Inventors: Bakshi, Dhruv; Sabur, Zaheed; Judd, Tilke Mary; Fey, Nicholas G.
Primary Examiner(s): Adesanya, Olujimi A
Application Number: US15/140,891
Publication Number: US 20160322049A1
Time in Patent Office: 1,174 Days
Field of Search: 704231, 704235, 704246, 704270

CPC Class Codes
G06F 16/3329   Natural language query form...
G06F 16/632   Query formulation
G06F 16/638   Presentation of query results
G06F 16/685   using automatically derived...
G06F 40/232   Orthographic correction, e....
G10L 15/22   Procedures used during a sp...
G10L 15/26   Speech to text systems G10L...
G10L 15/32   Multiple recognisers used i...
G10L 2015/223   Execution procedure of a sp...
