SPEECH-TO-TEXT TRAINING DATA BASED ON INTERACTIVE RESPONSE DATA

US 20200135175A1
Filed: 10/29/2018
Published: 04/30/2020
Est. Priority Date: 10/29/2018
Status: Active Grant


Abstract

A device includes a processor configured to, in response to determining that an input phrase includes a first term that is included in a term hierarchy, generate a second phrase by replacing the first term in the input phrase with a second term included in the term hierarchy. The processor is configured to determine that interactive response (IR) training data indicates that the input phrase is associated with a user intent indicator. The processor is configured to determine that user interaction data indicates that a first proportion of user phrases received by an IR system correspond to the user intent indicator. The processor is configured to update speech-to-text training data based on the input phrase and the second phrase so that a second proportion of training phrases of the speech-to-text training data correspond to the user intent indicator. The second proportion is based on the first proportion. A speech-to-text model is trained based on the speech-to-text training data.
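
The proportion-matching step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `build_training_phrases`, the intent labels, the example phrases, and the fixed `total` size are all hypothetical, and "based on the first proportion" is read here in the simplest way, as making the training share equal to the observed share.

```python
def build_training_phrases(phrases_by_intent, intent_proportions, total):
    """Return about `total` training phrases whose per-intent share tracks
    the observed share of user phrases (hypothetical helper).

    phrases_by_intent: intent -> list of phrases (originals plus generated variants)
    intent_proportions: intent -> observed fraction of user phrases, in [0, 1]
    """
    training = []
    for intent, proportion in intent_proportions.items():
        phrases = phrases_by_intent.get(intent, [])
        if not phrases:
            continue  # no phrases available for this intent
        quota = round(proportion * total)
        # Cycle through the intent's phrases, repeating them until the quota is met.
        for i in range(quota):
            training.append(phrases[i % len(phrases)])
    return training


# Hypothetical IR training data and usage statistics.
phrases_by_intent = {
    "check_balance": ["check my account balance", "check my savings account balance"],
    "talk_to_agent": ["talk to a person"],
}
observed = {"check_balance": 0.75, "talk_to_agent": 0.25}
training = build_training_phrases(phrases_by_intent, observed, total=8)
```

With these inputs, six of the eight training phrases belong to the `check_balance` intent and two to `talk_to_agent`, mirroring the 75/25 split in the assumed interaction data.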

20 Claims

1. A device comprising:
- a memory configured to store speech-to-text training data; and
  
  a processor configured to:
  
  access interactive response (IR) training data of an IR system, the IR training data associating input phrases supported by the IR system to user intent indicators;
  
  in response to determining that a first input phrase of the input phrases includes a first term that is included in a term hierarchy, generate a second phrase by replacing the first term in the first input phrase with a second term included in the term hierarchy;
  
  determine that the IR training data indicates that the first input phrase is associated with a first user intent indicator;
  
  determine that user interaction data indicates that a first proportion of user phrases received by the IR system from users corresponds to the first user intent indicator; and
  
  update the speech-to-text training data based on the first input phrase and the second phrase so that a second proportion of training phrases of the speech-to-text training data corresponds to the first user intent indicator, the second proportion based on the first proportion, wherein a speech-to-text model is trained based on the speech-to-text training data.
  - 2. The device of claim 1, wherein the user intent indicators include a financial transaction performance indicator, a human operator contact indicator, an information request indicator, or a combination thereof.
  - 3. The device of claim 1, wherein the term hierarchy indicates that the first term is a parent of the second term.
  - 4. The device of claim 1, further comprising an input interface configured to receive an input audio signal, wherein the processor is further configured to:
    - determine, based on the speech-to-text model, that the input audio signal matches the second phrase; and
      
      in response to determining that the input audio signal matches the second phrase, generate an output indicating that the input audio signal matches the second phrase.
  - 5. The device of claim 4, wherein the processor is further configured to process the first user intent indicator in response to determining that the input audio signal matches the second phrase.
  - 6. The device of claim 1, wherein the processor is further configured to update the speech-to-text training data by adding multiple copies of the second phrase to the training phrases.
  - 7. The device of claim 1, wherein the processor is further configured to update the speech-to-text training data by adding the first term to the training phrases.
  - 8. The device of claim 1, wherein the processor is further configured to update the speech-to-text training data by adding the second term to the training phrases.
  - 9. The device of claim 1, wherein the processor is configured to update the speech-to-text model based on the speech-to-text training data.
  - 10. The device of claim 1, further comprising an interface configured to provide, to a second device, the speech-to-text training data to initiate an update of the speech-to-text model.
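
The term replacement in claims 1 and 3 (a first term that is a parent of the second term) might look like the sketch below. The hierarchy contents, the dictionary representation, and the name `generate_variants` are assumptions made for illustration; the claims do not specify how the term hierarchy is stored or matched.

```python
import re

# Hypothetical term hierarchy: each parent term maps to its child terms.
TERM_HIERARCHY = {
    "account": ["checking account", "savings account"],
}


def generate_variants(phrase, hierarchy):
    """Replace each parent term found in `phrase` with each of its child terms."""
    variants = []
    for parent, children in hierarchy.items():
        # Whole-word match so "account" does not match inside "accounting".
        pattern = re.compile(r"\b" + re.escape(parent) + r"\b")
        if not pattern.search(phrase):
            continue
        for child in children:
            # A function replacement keeps the child term literal.
            variants.append(pattern.sub(lambda _m, c=child: c, phrase, count=1))
    return variants


variants = generate_variants("check my account balance", TERM_HIERARCHY)
no_match = generate_variants("talk to a person", TERM_HIERARCHY)
```

Here `variants` holds one generated phrase per child term, and `no_match` is empty because the phrase contains no term from the assumed hierarchy. Claim 16 covers the opposite direction (replacing a child with its parent), which would need an inverted mapping.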

11. A method comprising:
- accessing, at a device, interactive response (IR) training data of an IR system, the IR training data associating input phrases supported by the IR system to user intent indicators;
  
  determining, at the device, that a first input phrase of the input phrases includes a first term that is included in a term hierarchy;
  
  in response to determining that the first input phrase includes the first term, generating a second phrase by replacing the first term in the first input phrase with a second term included in the term hierarchy;
  
  determining, at the device, that the IR training data indicates that the first input phrase is associated with a first user intent indicator;
  
  determining, at the device, that user interaction data indicates that a first proportion of user phrases received by the IR system from users corresponds to the first user intent indicator; and
  
  updating, at the device, speech-to-text training data based on the first input phrase and the second phrase so that a second proportion of training phrases of the speech-to-text training data corresponds to the first user intent indicator, the second proportion based on the first proportion, wherein a speech-to-text model is trained based on the speech-to-text training data.
  - 12. The method of claim 11, further comprising:
    - receiving an input audio signal at the device;
      
      determining, based on the speech-to-text model, that the input audio signal matches the second phrase; and
      
      in response to determining that the input audio signal matches the second phrase, processing the first user intent indicator.
  - 13. The method of claim 12, further comprising determining that the first user intent indicator includes a financial transaction performance indicator, wherein processing the first user intent indicator includes initiating a financial transaction.
  - 14. The method of claim 12, further comprising determining that the first user intent indicator includes a human operator contact indicator, wherein processing the first user intent indicator includes initiating contact with a human operator.
  - 15. The method of claim 12, further comprising determining that the first user intent indicator includes an information request indicator, wherein processing the first user intent indicator includes providing information.
  - 16. The method of claim 11, wherein the second term is a parent of the first term in the term hierarchy.
  - 17. The method of claim 11, further comprising updating the speech-to-text training data by adding multiple copies of the first input phrase to the training phrases.
  - 18. The method of claim 11, further comprising updating the speech-to-text training data by adding multiple copies of the second phrase to the training phrases.
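
Claims 12 through 15 describe processing the intent once recognized speech matches a phrase. A rough sketch of that dispatch follows; the speech-to-text model itself is out of scope here, so the transcript is taken as given, and the handler names, intent labels, and exact-match lookup are all hypothetical.

```python
def initiate_transaction():
    return "transaction initiated"


def contact_operator():
    return "operator contacted"


def provide_information():
    return "information provided"


# Hypothetical mapping from user intent indicators to actions (claims 13-15).
HANDLERS = {
    "financial_transaction": initiate_transaction,
    "human_operator": contact_operator,
    "information_request": provide_information,
}


def handle_transcript(transcript, phrase_to_intent):
    """Look up the intent for a recognized phrase and run its handler.

    Returns None when the transcript matches no known phrase.
    """
    intent = phrase_to_intent.get(transcript.strip().lower())
    if intent is None:
        return None
    return HANDLERS[intent]()


# The generated second phrase inherits the intent of the first input phrase.
phrase_to_intent = {
    "check my account balance": "information_request",
    "check my savings account balance": "information_request",
    "talk to a person": "human_operator",
}
result = handle_transcript("Check my savings account balance", phrase_to_intent)
unknown = handle_transcript("order a pizza", phrase_to_intent)
```

The point of the sketch is that the generated phrase ("check my savings account balance") is processed under the same intent indicator as the phrase it was derived from.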

19. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising:
- accessing interactive response (IR) training data of an IR system, the IR training data associating input phrases supported by the IR system to user intent indicators;
  
  determining that a first input phrase of the input phrases includes a first term that is included in a term hierarchy;
  
  in response to determining that the first input phrase includes the first term, generating a second phrase by replacing the first term in the first input phrase with a second term included in the term hierarchy;
  
  determining that the IR training data indicates that the first input phrase is associated with a first user intent indicator;
  
  determining that user interaction data indicates that a first proportion of user phrases received by the IR system from users corresponds to the first user intent indicator; and
  
  updating speech-to-text training data based on the first input phrase and the second phrase so that a second proportion of training phrases of the speech-to-text training data corresponds to the first user intent indicator, the second proportion based on the first proportion, wherein a speech-to-text model is based on the speech-to-text training data.
  - 20. The computer program product of claim 19, wherein the operations further comprise updating the speech-to-text training data by adding multiple copies of the second phrase to the training phrases.
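
As a worked illustration of how adding multiple copies could reach a target proportion, assume (this is not stated in the claims) that the training set currently holds N phrases, k of which correspond to the intent, and that the goal is a share p equal to the observed proportion. Adding c copies of phrases for that intent gives:

```latex
\frac{k + c}{N + c} = p
\quad\Longrightarrow\quad
c = \frac{pN - k}{1 - p},
\qquad 0 \le p < 1,\quad pN \ge k
```

For example, with N = 100, k = 10, and p = 0.25, this yields c = (25 - 10) / 0.75 = 20, and (10 + 20) / (100 + 20) = 0.25 as required. In practice c would be rounded to a whole number of copies.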

Current Assignee
International Business Machines Corporation
Original Assignee
International Business Machines Corporation
Inventors
Katz, Edward G.; Tonetti, Alexander C.; Riendeau, John A.; Thatcher, Sean T.

Granted Patent

US 11,062,697 B2
CPC Class Codes

G10L 15/06   Creation of reference templ...

G10L 15/063   Training

G10L 15/18   using natural language mode...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/30   Distributed recognition, e....

G10L 2015/0635   updating or merging of old ...
