Systems and methods of interpreting speech data

US 10,186,261 B2
Filed: 08/02/2018
Issued: 01/22/2019
Est. Priority Date: 06/05/2014
Status: Active Grant



Abstract

Methods and systems are provided for interpreting speech data. A method and system for recognizing speech involve a filter module to generate a set of processed audio data based on raw audio data; a translation module to provide a set of translation results for the raw audio data; and a decision module to select the text data that represents the raw audio data. A method for minimizing noise in audio signals received by a microphone array is also described. A method and system for automatic entry of data into one or more data fields involve receiving processed audio data and operating a processing module to: search in a trigger dictionary for a field identifier that corresponds to a trigger identifier; identify a data field associated with a data field identifier corresponding to the field identifier; and provide content data associated with the trigger identifier to the identified data field.
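
As an illustration of the decision module's selection step, the following is a minimal Python sketch of one possible reading: keep the translation results whose confidence clears a threshold, and if more than one remains, prefer the text that occurs most often. The `TranslationResult` type, the `select_result` function, and the `min_confidence` threshold are assumptions made for this example and do not come from the patent.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationResult:
    text: str          # text data produced by the translator
    confidence: float  # confidence level associated with that text


def select_result(results, min_confidence=0.8):
    """Pick one result by confidence, then by occurrence frequency."""
    if not results:
        return None
    # Step 1: keep results whose confidence clears the (assumed) threshold.
    candidates = [r for r in results if r.confidence >= min_confidence]
    if not candidates:
        # Nothing clears the threshold: fall back to the most confident result.
        return max(results, key=lambda r: r.confidence)
    if len(candidates) == 1:
        return candidates[0]
    # Step 2: several candidates remain, so prefer the most frequent text.
    counts = Counter(r.text for r in candidates)
    top_text, _ = counts.most_common(1)[0]
    # Among results carrying that text, return the most confident one.
    return max((r for r in candidates if r.text == top_text),
               key=lambda r: r.confidence)


# Hypothetical results, one per processed audio stream.
results = [
    TranslationResult("blood pressure 120", 0.85),
    TranslationResult("blood pressure 120", 0.82),
    TranslationResult("blood pressure 112", 0.90),
]
best = select_result(results)
print(best.text)  # prints "blood pressure 120"
```

In this sketch, two filters that agree on the same text outvote a single higher-confidence outlier, which is one way occurrence frequency could act as a tie-breaker.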


20 Claims

1. A system for automatically providing text data to an electronic form for an uncontrolled environment, the system comprising:
- one or more filters to each generate a set of processed audio data based on raw audio data received from one or more computing devices, the one or more filters applying filter processes to the raw audio data to generate the set of processed audio data, the one or more filters comprising at least one filter appropriate for the uncontrolled environment;
  
  a translator, operable by a processor, to provide a set of translation results for the raw audio data based on the set of processed audio data, each translation result being associated with at least one processed audio data and each translation result including a text data and a confidence level associated with that text data;
  
  a memory to store a trigger dictionary including a plurality of trigger identifiers, each trigger identifier is associated with one or more field identifiers; and
  
  in response to receiving the set of translation results, a decision controller is automatically triggered by the processor to:
  
  identify at least one translation result from the set of translation results to represent the raw audio data based at least on the confidence level of the text data, wherein the decision controller is further operated to identify a translation result based on an occurrence frequency when the at least one translation result comprises more than one translation result;
  
  determine a trigger identifier associated with the selected translation result;
  
  search in the trigger dictionary for a field identifier that corresponds to the trigger identifier;
  
  identify, from one or more data fields of the electronic form, a data field associated with a data field identifier corresponding to the field identifier; and
  
  provide the text data of the selected translation result to the identified data field.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The system of claim 1, wherein the decision controller is operable to:
    - determine a content source identifier for the identified data field based on the trigger identifier, the content source identifier indicating an origin of the raw audio data corresponding to the text data being provided to the identified data field.
  - 3. The system of claim 2, wherein the processor is operable to:
    - provide a user control for receiving an input control signal to access a portion of the processed audio data corresponding to the text data provided to the identified data field.
  - 4. The system of claim 3, wherein the user control is displayed in proximity to the data field.
  - 5. The system of claim 3, wherein the user control comprises an audio icon.
  - 6. The system of claim 1, wherein the decision controller is operable to, in response to failing to identify a data field, indicate the text data associated with the trigger identifier requires additional analysis prior to being provided to the electronic form.
  - 7. The system of claim 6, wherein the processor is operable to:
    - store, in the memory, the text data associated with that trigger identifier; and
      
      associate the text data with a manual analysis identifier for indicating that content data requires additional analysis.
  - 8. The system of claim 1, wherein:
    - each field identifier associated with the respective trigger identifier in the trigger dictionary is associated with one or more expected content data, the one or more expected content data identifying data that is acceptable by the one or more data fields corresponding to the field identifier; and
      
      the decision controller is operable to determine whether the content data corresponds with any expected content data associated with that field identifier.
  - 9. The system of claim 8, wherein the decision controller is operable to, in response to determining the content data fails to correspond to any expected content, indicate the content data associated with that trigger identifier requires additional analysis in order to be inputted into the respective data field.
  - 10. The system of claim 9, wherein the decision controller is operable to:
    - store, in the memory, the text data associated with that trigger identifier; and
      
      associate the text data with a manual analysis identifier for indicating the text data requires additional analysis.
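
The trigger dictionary lookup in claim 1, together with the fallback behaviour in claims 6 to 10, might be sketched as follows. This is only an illustration under stated assumptions: the dictionary contents, the `FieldRule` type, the use of regular expressions for "expected content data", and the `manual_queue` list standing in for the manual analysis identifier are all hypothetical. The sketch also writes only the content that follows the trigger phrase into the field, which is one of several possible readings.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    field_id: str  # field identifier stored in the trigger dictionary
    expected: str  # regex for acceptable content (an assumed representation)


# Hypothetical trigger dictionary: trigger identifier -> field rules.
TRIGGER_DICTIONARY = {
    "heart rate": [FieldRule("hr_bpm", r"\d{2,3}")],
    "blood pressure": [FieldRule("bp", r"\d{2,3} over \d{2,3}")],
}


def route_text(text, form, manual_queue):
    """Fill a form field from text, or flag the text for manual analysis."""
    lowered = text.lower()
    for trigger, rules in TRIGGER_DICTIONARY.items():
        if not lowered.startswith(trigger):
            continue
        content = lowered[len(trigger):].strip()
        for rule in rules:
            # The form must contain the field and the content must be acceptable.
            if rule.field_id in form and re.fullmatch(rule.expected, content):
                form[rule.field_id] = content
                return rule.field_id
        break  # trigger found, but no field accepted the content
    # No data field identified, or content not as expected: flag for review.
    manual_queue.append({"text": text, "manual_analysis": True})
    return None


form = {"hr_bpm": None, "bp": None}
manual_queue = []
route_text("Heart rate 72", form, manual_queue)    # fills form["hr_bpm"]
route_text("Heart rate fast", form, manual_queue)  # queued for manual analysis
```

Keeping the rejected text in a queue rather than discarding it mirrors the claims' requirement that the text be stored and marked as needing additional analysis.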

11. A computer-implemented method of automatically providing text data to an electronic form using an uncontrolled environment-based system for recognizing speech, the method comprising:
- operating one or more filters to each generate a set of processed audio data based on raw audio data received from one or more computing devices, the one or more filters being operated to apply filter processes to the raw audio data to generate the set of processed audio data, wherein the one or more filters comprise at least one filter appropriate for the uncontrolled environment-based system for recognizing speech;
  
  operating a translator to provide a set of translation results for the raw audio data based on the set of processed audio data, each translation result being associated with at least one processed audio data and each translation result including a text data and a confidence level associated with that text data; and
  
  in response to receiving the set of translation results, automatically operating a decision controller to:
  
  identify at least one translation result from the set of translation results to represent the raw audio data based on the confidence level of the text data, and when the at least one translation result comprises more than one translation result, operating the decision controller to identify a translation result based on an occurrence frequency;
  
  determine a trigger identifier associated with the selected translation result;
  
  search in a trigger dictionary stored in a memory for a field identifier that corresponds to the trigger identifier, the trigger dictionary including a plurality of trigger identifiers and each trigger identifier is associated with one or more field identifiers;
  
  identify, from one or more data fields of the electronic form, a data field associated with a data field identifier corresponding to the field identifier; and
  
  provide the text data of the selected translation result to the identified data field.
- Dependent claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
  - 12. The method of claim 11 comprising:
    - determining a content source identifier for the identified data field based on the trigger identifier, the content source identifier indicating an origin of the raw audio data corresponding to the text data being provided to the identified data field.
  - 13. The method of claim 12 comprising:
    - providing a user control for receiving an input control signal to access a portion of the processed audio data corresponding to the text data provided to the identified data field.
  - 14. The method of claim 13 comprising displaying the user control in proximity to the data field.
  - 15. The method of claim 13, wherein the user control comprises an audio icon.
  - 16. The method of claim 11 comprising in response to failing to identify a data field, indicating the text data associated with the trigger identifier requires additional analysis prior to being provided to the electronic form.
  - 17. The method of claim 14 comprising:
    - storing, in the memory, the text data associated with that trigger identifier; and
      
      associating the text data with a manual analysis identifier for indicating that content data requires additional analysis.
  - 18. The method of claim 11, wherein each field identifier associated with the respective trigger identifier in the trigger dictionary is associated with one or more expected content data, the one or more expected content data identifying data that is acceptable by the one or more data fields corresponding to the field identifier;
    - and the method comprises determining whether the content data corresponds with any expected content data associated with that field identifier.
  - 19. The method of claim 18 comprising:
    - in response to determining the content data fails to correspond to any expected content, indicating the content data associated with that trigger identifier requires additional analysis in order to be inputted into the respective data field.
  - 20. The method of claim 19 comprising:
    - storing, in the memory, the text data associated with that trigger identifier; and
      
      associating the text data with a manual analysis identifier for indicating the text data requires additional analysis.
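
Claims 12 and 13 (and their system counterparts, claims 2 and 3) attach a content source identifier to a filled field and let a user reach the portion of processed audio behind the text. A minimal sketch of a record that could support this is shown below. The `FilledField` type, the use of start and end times in seconds, and the `audio_portion` helper are assumptions for illustration; the claims do not specify how the audio portion is located.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilledField:
    field_id: str       # data field that received the text
    text: str           # text data provided to the field
    source_id: str      # content source identifier: origin of the raw audio
    audio_start: float  # start of the matching audio, in seconds (assumed)
    audio_end: float    # end of the matching audio, in seconds (assumed)


def audio_portion(samples, sample_rate, entry):
    """Return the slice of processed audio that corresponds to entry.text."""
    start = int(entry.audio_start * sample_rate)
    end = int(entry.audio_end * sample_rate)
    return samples[start:end]


# Hypothetical processed audio: 10 seconds at 10 samples per second.
samples = list(range(100))
entry = FilledField("hr_bpm", "72", "device-A", 2.0, 3.5)
clip = audio_portion(samples, 10, entry)
```

A user control such as the audio icon in claims 5 and 15 could call a helper like `audio_portion` to play back the clip next to the field.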


Current Assignee: Interdev Technologies Inc. (Valsef Group)
Original Assignee: Interdev Technologies Inc. (Valsef Group)
Inventors: Rice, Janet M.; Liang, Peng; Kuehn, Terence W.
Primary Examiner(s): Sharma, Neeraj
Application Number: US16/053,337
Publication Number: US 20180342242A1
Time in Patent Office: 173 Days
Field of Search: None
CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/20   Speech recognition techniqu...
G10L 15/26   Speech to text systems G10L...
G10L 19/26   Pre-filtering or post-filte...
G10L 21/02   Speech enhancement, e.g. no...
G10L 25/93   Discriminating between voic...
H04R 29/006   Microphone matching
