Dynamic speech recognition data evaluation

US 10,192,555 B2
Filed: 04/28/2016
Issued: 01/29/2019
Est. Priority Date: 04/28/2016
Status: Active Grant

Abstract

Computing devices and methods for providing speech recognition data from one computing device to another device are disclosed. In one disclosed embodiment, audio input is received at a client device and processed to generate speech recognition data. An estimated confidence level is determined for a portion of the data, where the confidence level exceeds a predetermined confidence threshold corresponding to a valid result. At least one statistically improbable characteristic associated with the portion of data is identified. Based on identifying the statistically improbable characteristic, the portion of data is provided to a server computing device for evaluation.
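
The check summarized in the abstract can be illustrated with a minimal sketch. It assumes a per-feature z-score test as one possible way to flag a "statistically improbable characteristic"; the source does not specify the test. The names `PortionResult`, `is_improbable`, and `should_send_to_server`, the threshold of 0.8, and the cutoff of 3.0 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PortionResult:
    """One identified portion of speech recognition data (hypothetical type)."""
    features: List[float]  # feature values for this portion
    confidence: float      # estimated confidence level, assumed to lie in [0, 1]


CONFIDENCE_THRESHOLD = 0.8  # assumed value for the "valid result" threshold
Z_SCORE_LIMIT = 3.0         # assumed cutoff for "statistically improbable"


def is_improbable(features: Sequence[float],
                  means: Sequence[float],
                  stddevs: Sequence[float],
                  limit: float = Z_SCORE_LIMIT) -> bool:
    """Return True if any feature lies more than `limit` standard
    deviations away from its expected mean."""
    for x, mu, sigma in zip(features, means, stddevs):
        if sigma > 0 and abs(x - mu) / sigma > limit:
            return True
    return False


def should_send_to_server(portion: PortionResult,
                          means: Sequence[float],
                          stddevs: Sequence[float]) -> bool:
    """A portion is forwarded only if it is both a valid result
    (confidence above the threshold) and statistically improbable."""
    valid = portion.confidence > CONFIDENCE_THRESHOLD
    return valid and is_improbable(portion.features, means, stddevs)
```

The point of the sketch is that a high confidence level alone does not end the check: a portion that passes the threshold can still be forwarded when its features look unusual.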

38 Citations

16 Claims

1. A method of dynamically providing speech recognition data from a client computing device to a server computing device, the method comprising:
- receiving audio input at the client computing device;
  
  processing the audio input to generate the speech recognition data;
  
  determining a first estimated confidence level for a first identified portion of the speech recognition data comprising a first feature vector, wherein the first estimated confidence level exceeds a predetermined confidence threshold that corresponds to a valid result;
  
  based on determining that the first estimated confidence level corresponds to the valid result, continuing to process the speech recognition data with the first identified portion;
  
  determining a second estimated confidence level for a second identified portion of the speech recognition data comprising a second feature vector, wherein the second estimated confidence level also exceeds the predetermined confidence threshold that corresponds to the valid result;
  
  identifying at least one statistically improbable characteristic associated with the second feature vector;
  
  determining that the client computing device comprises a first feature extractor;
  
  comparing the first feature extractor of the client computing device with a second feature extractor utilized by the server computing device;
  
  based on comparing the first feature extractor of the client computing device with the second feature extractor utilized by the server computing device, determining that the second feature extractor of the server computing device is different from the first feature extractor;
  
  based on (1) determining the second estimated confidence level corresponds to the valid result, (2) identifying the at least one statistically improbable characteristic, and (3) determining that the server computing device comprises the second feature extractor different from the first feature extractor, providing the second feature vector to the server computing device for an evaluation of the second feature vector by the second, different feature extractor.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The method of claim 1, further comprising using one or more machine learning techniques to identify the at least one statistically improbable characteristic.
  - 3. The method of claim 1, wherein the first identified portion and the second identified portion of the speech recognition data comprise audio data generated from the audio input.
  - 4. The method of claim 1, wherein the client computing device comprises an acoustic representation generator that processes the one or more feature vectors, the method further comprising providing state information of the acoustic representation generator to the server computing device.
  - 5. The method of claim 1, wherein the first identified portion and the second identified portion of the speech recognition data comprise one or more speech components.
  - 6. The method of claim 5, further comprising, based on determining that the client computing device comprises a first acoustic representation generator and that the server computing device comprises a second, different acoustic representation generator, providing the one or more speech components generated by the first acoustic representation generator to the server computing device for processing by the second, different acoustic representation generator.
  - 7. The method of claim 1, wherein the first identified portion and the second identified portion of the speech recognition data comprise recognized text.
  - 8. The method of claim 1, further comprising:
    - receiving from the server computing device weighting information derived from the evaluation of the speech recognition data; and
      
      using the weighting information to bias a speech recognition engine of the client.
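
The three conditions that gate the final step of claim 1 can be sketched as a small routing function. This is an illustrative sketch under stated assumptions, not the patented implementation: `ExtractorInfo`, `Portion`, and `route_portion` are hypothetical names, feature extractors are compared by a simple name and version descriptor, and the `improbable` flag is assumed to have been computed elsewhere. The claim does not say what happens to portions below the threshold, so the sketch only labels them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExtractorInfo:
    """Hypothetical descriptor used to compare feature extractors."""
    name: str
    version: str


@dataclass
class Portion:
    """Hypothetical record for one identified portion of the data."""
    feature_vector: List[float]
    confidence: float
    improbable: bool  # assumed to be set by a separate detection step


def route_portion(portion: Portion,
                  threshold: float,
                  client_extractor: ExtractorInfo,
                  server_extractor: ExtractorInfo) -> str:
    """Decide what to do with a portion, following the claim's three conditions."""
    if portion.confidence <= threshold:
        # Handling of low-confidence portions is outside claim 1.
        return "below_threshold"
    extractors_differ = client_extractor != server_extractor
    if portion.improbable and extractors_differ:
        # (1) valid result, (2) improbable characteristic,
        # (3) server uses a different feature extractor.
        return "send_to_server"
    return "continue_locally"
```

For example, a portion with confidence 0.9 and `improbable=True` is routed to the server only when the two descriptors differ; with identical extractors it stays on the client.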

9. A computing device, comprising:
- a processor;
  
  a mass storage device; and
  
  a speech recognition program stored in the mass storage device, the speech recognition program comprising instructions executable by the processor to:
  
  receive audio input;
  
  process the audio input to generate speech recognition data;
  
  determine a first estimated confidence level for a first identified portion of the speech recognition data comprising a first feature vector, wherein the first estimated confidence level exceeds a predetermined confidence threshold that corresponds to a valid result;
  
  based on determining that the first estimated confidence level corresponds to the valid result, continue to process the speech recognition data with the first identified portion;
  
  determine a second estimated confidence level for a second identified portion of the speech recognition data comprising a second feature vector, wherein the second estimated confidence level also exceeds the predetermined confidence threshold that corresponds to the valid result;
  
  identify at least one statistically improbable characteristic associated with the second feature vector;
  
  determine that the client computing device comprises a first feature extractor;
  
  compare the first feature extractor of the computing device with a second feature extractor utilized by a different computing device;
  
  based on comparing the first feature extractor of the computing device with the second feature extractor utilized by the different computing device, determine that the second feature extractor of the different computing device is different from the first feature extractor;
  
  based on (1) determining the second estimated confidence level corresponds to the valid result, (2) identifying the at least one statistically improbable characteristic, and (3) determining that the different computing device comprises the second feature extractor different from the first feature extractor, provide the second feature vector to the different computing device for an evaluation of the second feature vector by the second, different feature extractor.
- Dependent claims (10, 11, 12, 13, 14):
  - 10. The computing device of claim 9, wherein the speech recognition program is configured to use one or more machine learning techniques to identify the at least one statistically improbable characteristic.
  - 11. The computing device of claim 9, wherein the first identified portion and the second identified portion of the speech recognition data comprise audio data generated from the audio input.
  - 12. The computing device of claim 9, wherein the speech recognition program comprises an acoustic representation generator that processes the first feature vector and the second feature vector, the speech recognition program configured to provide state information of the acoustic representation generator to the different computing device.
  - 13. The computing device of claim 9, wherein the first identified portion and the second identified portion of the speech recognition data comprise one or more speech components.
  - 14. The computing device of claim 9, wherein the speech recognition program is configured to:
    - receive from the different computing device weighting information derived from the evaluation of the speech recognition data; and
      
      use the weighting information to bias a speech recognition engine of the speech recognition program.
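
Claims 8 and 14 recite receiving weighting information from the other device and using it to bias the local speech recognition engine. The claims do not define the form of that information, so the following is only a sketch under an assumed format: a mapping from words to additive score adjustments, applied to a list of scored hypotheses. The function name `bias_hypotheses` and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def bias_hypotheses(hypotheses: List[Tuple[str, float]],
                    weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Add a per-word adjustment to each hypothesis score and
    return the hypotheses sorted from best to worst score."""
    rescored = []
    for text, score in hypotheses:
        # Words without an entry in `weights` contribute no adjustment.
        bonus = sum(weights.get(word, 0.0) for word in text.split())
        rescored.append((text, score + bonus))
    return sorted(rescored, key=lambda item: item[1], reverse=True)
```

With hypotheses `[("recognize speech", -4.0), ("wreck a nice beach", -3.5)]` and weights `{"speech": 1.0}`, the first hypothesis is rescored to -3.0 and moves ahead of the second.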

15. A computing device, comprising:
- a processor;
  
  a mass storage device; and
  
  a speech recognition program stored in the mass storage device, the speech recognition program comprising instructions executable by the processor to:
  
  receive audio input;
  
  process the audio input to generate speech recognition data, wherein the speech recognition data comprises one or more of audio data, feature vectors, speech components, and recognized text;
  
  determine a first estimated confidence level for a first identified portion of the speech recognition data comprising a first feature vector, wherein the first estimated confidence level exceeds a predetermined confidence threshold that corresponds to a valid result;
  
  based on determining that the first estimated confidence level corresponds to the valid result, continue to process the speech recognition data with the first identified portion;
  
  determine a second estimated confidence level for a second identified portion of the speech recognition data comprising a second feature vector, wherein the second estimated confidence level also exceeds the predetermined confidence threshold that corresponds to the valid result;
  
  use one or more machine learning techniques to identify at least one statistically improbable characteristic associated with the second feature vector;
  
  determine that the computing device comprises a first feature extractor;
  
  compare the first feature extractor of the computing device with a second feature extractor utilized by a different computing device;
  
  based on comparing the first feature extractor of the computing device with the second feature extractor utilized by the different computing device, determine that the second feature extractor of the different computing device is different from the first feature extractor;
  
  based on (1) determining the second estimated confidence level corresponds to the valid result, (2) identifying the at least one statistically improbable characteristic, and (3) determining that the different computing device comprises the second feature extractor different from the first feature extractor, provide the second feature vector to the different computing device for an evaluation of the second feature vector by the second, different feature extractor;
  
  receive from the different computing device weighting information derived from the evaluation of the second feature vector; and
  
  use the weighting information to bias a speech recognition engine of the speech recognition program.
- Dependent claims (16):
  - 16. The computing device of claim 15, wherein the speech recognition data comprises feature vectors that are generated by an acoustic representation generator of the speech recognition program, the speech recognition program configured to provide state information of the acoustic representation generator to the different computing device.
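
Claims 4, 12, and 16 recite providing state information of the acoustic representation generator to the other device. One way to picture this is a single message that carries a feature vector together with that state. The sketch below assumes a JSON encoding and the field names `feature_vector` and `generator_state`; neither comes from the source, and the contents of the state are left opaque.

```python
import json
from typing import Any, Dict, List


def build_payload(feature_vector: List[float],
                  generator_state: Dict[str, Any]) -> str:
    """Bundle a feature vector with generator state into one JSON message."""
    message = {
        "feature_vector": feature_vector,    # assumed field name
        "generator_state": generator_state,  # assumed field name, opaque contents
    }
    # Sorted keys keep the encoding stable across runs.
    return json.dumps(message, sort_keys=True)


def parse_payload(payload: str) -> Dict[str, Any]:
    """Decode a message produced by build_payload."""
    return json.loads(payload)
```

Sending the state alongside the vector would let the receiving device interpret the vector in the same context in which it was produced, which is the apparent purpose of these claims.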

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: Lovitt, Andrew William
Primary Examiner(s): Adesanya, Olujimi
Application Number: US15/140,704
Publication Number: US 20170316780A1
Time in Patent Office: 1,006 Days
Field of Search: 704/231, 704/235, 704/246, 704/270
CPC Class Codes:
G10L 15/01   Assessment or evaluation of...
G10L 15/16   using artificial neural net...
G10L 15/30   Distributed recognition, e....
G10L 15/32   Multiple recognisers used i...
G10L 2015/025   Phonemes, fenemes or fenone...
