SYSTEM AND METHOD FOR GENERATING USER MODELS FROM TRANSCRIBED DIALOGS

US 20110054893A1
Filed: 09/02/2009
Published: 03/03/2011
Est. Priority Date: 09/02/2009
Status: Active Grant

Abstract

Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for generating personalized user models. The method includes receiving automatic speech recognition (ASR) output of speech interactions with a user, receiving an ASR transcription error model characterizing how ASR transcription errors are made, and generating guesses of a true transcription and a user model via an expectation maximization (EM) algorithm based on the error model and the respective ASR output, where the guesses converge to a personalized user model that maximizes the likelihood of the ASR output. The ASR output can be unlabeled. The method can include casting speech interactions as a dynamic Bayesian network with four variables, (s), (u), (r), and (m), and encoding the relationships between them as conditional probability tables. At each dialog turn, (r) and (m) are known and (s) and (u) are hidden.
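The EM procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a deliberately simplified setting in which the user model is a single distribution P(u) over true user utterances and the error model is a fixed table P(r | u) of ASR outputs given true utterances. The function name `em_user_model`, the table layout, and the yes/no data are hypothetical and chosen only for illustration.

```python
from collections import Counter

def em_user_model(asr_outputs, error_model, actions, tol=1e-6, max_iter=200):
    """Estimate P(u) from unlabeled ASR outputs, given a fixed error model P(r | u).

    Assumption: error_model[u][r] holds P(r | u); the user model is a plain
    distribution over `actions` rather than a full dialog model.
    """
    # Start from a uniform guess of the user model.
    user_model = {u: 1.0 / len(actions) for u in actions}
    counts = Counter(asr_outputs)

    for _ in range(max_iter):
        # E-step: soft guess of the true transcription behind each ASR output.
        expected = {u: 0.0 for u in actions}
        for r, n in counts.items():
            joint = {u: error_model[u].get(r, 0.0) * user_model[u] for u in actions}
            z = sum(joint.values())
            if z == 0.0:
                continue  # r cannot occur under the error model; skip it
            for u in actions:
                expected[u] += n * joint[u] / z

        norm = sum(expected.values())
        if norm == 0.0:
            return user_model  # nothing usable was observed

        # M-step: re-estimate the user model from the expected counts.
        new_model = {u: expected[u] / norm for u in actions}

        # Stop once the guesses change by less than the threshold.
        delta = max(abs(new_model[u] - user_model[u]) for u in actions)
        user_model = new_model
        if delta < tol:
            break
    return user_model

# Hypothetical data: the recognizer confuses "yes" and "no" at known rates.
actions = ["yes", "no"]
error_model = {
    "yes": {"yes": 0.8, "no": 0.2},
    "no": {"yes": 0.3, "no": 0.7},
}
model = em_user_model(["yes"] * 60 + ["no"] * 40, error_model, actions)
# For this data the estimate of P(u = "yes") comes out close to 0.6.
```

In this toy case the ASR output is unlabeled: no true transcription is ever supplied, and the estimate is driven entirely by the error model and the observed recognition results.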

20 Claims

1. A computer-implemented method of generating personalized user models, the method comprising:
- receiving automatic speech recognition (ASR) output of a plurality of speech interactions with a user;
  
  receiving an ASR transcription error model characterizing how ASR transcription errors are made;
  
  generating guesses of a true transcription and a user model via an expectation maximization (EM) algorithm based on the error model and the respective ASR output; and
  
  generating a personalized user model based on the guesses.
  - 2. The computer-implemented method of claim 1, wherein generating the guesses of the true transcription and a user model further comprises, iteratively:
    - alternating between generating a guess of the true transcription and of the user model; and
      
      a current guess of one type is used to generate a next guess of the other type.
  - 3. The computer-implemented method of claim 2, wherein iteratively generating the guesses of the true transcription and the user model until a threshold is met.
  - 4. The computer-implemented method of claim 1, wherein the EM algorithm estimates conditional probabilities of hidden variables.
  - 5. The computer-implemented method of claim 1, wherein generating the guesses is further based on a set of manual transcriptions of speech interactions with the user.
  - 6. The computer-implemented method of claim 5, wherein the set of manual transcriptions is less numerous than the ASR output.
  - 7. The method of claim 1, the method further comprising:
    - casting the speech interactions as a dynamical Bayesian network with four variables: (s), (u), (r), and (m); and
      
      encoding relationships between the (s), (u), (r), and (m) as conditional probability tables.
  - 8. The computer-implemented method of claim 7, wherein at each dialog turn (r) and (m) are known and (s) and (u) are hidden.
  - 9. The computer-implemented method of claim 1, the method further comprising generating a personalized ASR model based on the personalized user model.
  - 10. The computer-implemented method of claim 1, the method further comprising recognizing additional speech from the user based on the personalized speech model.
  - 11. The computer-implemented method of claim 10, the method further comprising iteratively improving the personalized speech model based on the additional speech.
  - 12. The computer-implemented method of claim 1, wherein generating the personalized user model is further based on a previously generated personalized user model.
  - 13. The computer-implemented method of claim 12, wherein the previously generated personalized user model is a template.
  - 14. The computer-implemented method of claim 12, wherein the previously generated personalized user model is from a similar user.
  - 15. The computer-implemented method of claim 1, wherein the ASR output is unlabeled.
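
Claims 7 and 8 describe encoding the relationships between (s), (u), (r), and (m) as conditional probability tables, with (r) and (m) observed and (s) and (u) hidden. The sketch below shows what inference over one dialog turn could look like under stated assumptions. This listing does not define the four variables, so the sketch assumes (m) is the system prompt, (s) a hidden user state, (u) the true user utterance, and (r) the ASR result, and it assumes the factorization P(s) P(u | s, m) P(r | u) for a single turn, ignoring any dependence between turns. All table contents and the name `turn_posterior` are hypothetical.

```python
from itertools import product

def turn_posterior(r, m, p_s, p_u_given_sm, p_r_given_u):
    """Posterior over the hidden pair (s, u) for one turn, given observed r and m.

    Assumed table layout (illustrative only):
      p_s[s]                  -> P(s)
      p_u_given_sm[(s, m)][u] -> P(u | s, m)
      p_r_given_u[u][r]       -> P(r | u)
    """
    joint = {}
    for s, u in product(p_s, p_r_given_u):
        joint[(s, u)] = (
            p_s[s]
            * p_u_given_sm[(s, m)].get(u, 0.0)
            * p_r_given_u[u].get(r, 0.0)
        )
    z = sum(joint.values())
    if z == 0.0:
        raise ValueError("observation has zero probability under the tables")
    # Normalize so the hidden-variable probabilities sum to one.
    return {pair: p / z for pair, p in joint.items()}

# Hypothetical tables for a two-state, two-utterance dialog.
p_s = {"wants_A": 0.5, "wants_B": 0.5}
p_u_given_sm = {
    ("wants_A", "ask"): {"say_A": 0.9, "say_B": 0.1},
    ("wants_B", "ask"): {"say_A": 0.2, "say_B": 0.8},
}
p_r_given_u = {
    "say_A": {"say_A": 0.8, "say_B": 0.2},
    "say_B": {"say_A": 0.3, "say_B": 0.7},
}

posterior = turn_posterior("say_A", "ask", p_s, p_u_given_sm, p_r_given_u)
best = max(posterior, key=posterior.get)  # most likely hidden (s, u) pair
```

A posterior of this kind is the quantity an E-step would compute per turn; an M-step would then re-estimate the tables from the resulting expected counts.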

16. A system for recognizing speech using personalized speech models, the system comprising:
- a processor;
  
  a module configured to control the processor to receive automatic speech recognition (ASR) output of a plurality of speech interactions with a user;
  
  a module configured to control the processor to receive an ASR transcription error model characterizing how ASR transcription errors are made;
  
  a module configured to control the processor to generate guesses of a true transcription and a user model via an expectation maximization (EM) algorithm based on the error model and the respective ASR output; and
  
  a module configured to control the processor to generate a personalized user model based on the guesses.
  - 17. The system of claim 16, wherein the ASR output is unlabeled.
  - 18. The system of claim 16, wherein the module configured to control the processor to generate the guesses of the true transcription and the user model further comprises a module configured to iteratively perform the following steps:
    - alternating between generating a guess of the true transcription and of the user model; and
      
      a current guess of one type is used to generate a next guess of the other type.

19. A computer-readable storage medium storing a computer program having instructions for controlling a processor to generate personalized user models, the instructions comprising:
- receiving a user model personalized for a specific user generated by steps comprising;
  
  receiving automatic speech recognition (ASR) output of a plurality of speech interactions with the specific user;
  
  receiving an ASR transcription error model characterizing how ASR transcription errors are made;
  
  generating guesses of a true transcription and a user model via an expectation maximization (EM) algorithm based on the error model and the respective ASR output;
  
  generating a personalized user model for the specific user based on the guesses; and
  
  building a personalized dialog system for the specific user based on the received personalized user model.
  - 20. The computer-readable storage medium of claim 19, wherein the ASR output is unlabeled.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Williams, Jason; Syed, Umar

Granted Patent: US 8,473,292 B2
US Class Current: 704/235
CPC Class Codes:
- G10L 15/07   to the speaker
- G10L 15/26   Speech to text systems G10L...
- G10L 2015/0631   Creating reference template...
