METHOD FOR SUPPORTING DYNAMIC GRAMMARS IN WFST-BASED ASR

US 20150348547A1
Filed: 09/23/2014
Published: 12/03/2015
Est. Priority Date: 05/27/2014
Status: Active Grant


Abstract

Systems and processes are disclosed for recognizing speech using a weighted finite state transducer (WFST) approach. Dynamic grammars can be supported by constructing the final recognition cascade during runtime using difference grammars. In a first grammar, non-terminals can be replaced with a weighted phone loop that produces sequences of mono-phone words. In a second grammar, at runtime, non-terminals can be replaced with sub-grammars derived from user-specific usage data including contact, media, and application lists. Interaction frequencies associated with these entities can be used to weight certain words over others. With all non-terminals replaced, a static recognition cascade with the first grammar can be composed with the personalized second grammar to produce a user-specific WFST. User speech can then be processed to generate candidate words having associated probabilities, and the likeliest result can be output.
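The abstract states that interaction frequencies can weight certain words over others but gives no formula. A minimal sketch of one common choice, assuming negative-log-probability costs (as used in the tropical semiring) and add-one smoothing, is shown here. The function name `usage_costs`, the `smoothing` parameter, and the sample counts are hypothetical and do not come from the patent.

```python
import math

def usage_costs(counts, smoothing=1.0):
    """Map interaction counts to negative-log-probability costs.

    Lower cost means the entity is more likely to be recognized.
    Smoothing (an assumption, not from the patent) keeps entities
    with zero interactions reachable instead of giving them infinite cost.
    """
    total = sum(counts.values()) + smoothing * len(counts)
    return {name: -math.log((n + smoothing) / total) for name, n in counts.items()}

# Hypothetical contact list with interaction counts.
contacts = {"Kat": 9, "Kath": 1, "Karl": 0}
costs = usage_costs(contacts)
```

Under these assumptions a frequently contacted entity receives a lower cost than a rarely contacted one, so it is preferred when the acoustic evidence is close.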


23 Claims

1. A method for recognizing speech, the method comprising:
- at an electronic device:
  
  receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
  
  receiving speech input from a user;
  
  in response to receiving the speech input:
  
  composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
  
  transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
  
  outputting the word based on the associated probability.
- Dependent claims (2-13):
  - 2. The method of claim 1, wherein the one or more entities comprise a list of user contacts.
  - 3. The method of claim 2, wherein the indication of user interaction comprises a frequency of interaction with a contact in the list of user contacts.
  - 4. The method of claim 1, wherein the one or more entities comprise a list of applications on a device associated with the user.
  - 5. The method of claim 4, wherein the indication of user interaction comprises a frequency of interaction with an application in the list of applications.
  - 6. The method of claim 1, wherein the one or more entities comprise a list of media associated with the user.
  - 7. The method of claim 6, wherein the indication of user interaction comprises a play frequency of media in the list of media.
  - 8. The method of claim 1, wherein the weighted finite state transducer comprises a context-dependency transducer and a lexicon transducer.
  - 9. The method of claim 1, wherein the first grammar transducer comprises a weighted phone loop capable of generating a sequence of mono-phone words.
  - 10. The method of claim 1, wherein the associated probability is based on a likelihood that the word corresponds to the speech input, and wherein the likelihood is based on the user-specific usage data.
  - 11. The method of claim 1, wherein outputting the word comprises:
    - transmitting the word to a user device.
  - 12. The method of claim 1, wherein outputting the word comprises:
    - transmitting the word to a virtual assistant knowledge system.
  - 13. The method of claim 1, wherein outputting the word comprises:
    - transmitting the word to a server.
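The composing, transducing, and outputting steps of claim 1 can be illustrated with a toy example. This is a sketch under stated assumptions, not the patented implementation: transducers are plain dictionaries, weights are negative log probabilities, and composition is the simple case in which the first transducer's output labels and the second transducer's input labels contain no epsilons. All names (`compose`, `contact_grammar`, `best_path`), the phone symbols, and the numeric costs are hypothetical.

```python
import heapq
import math
from collections import deque

EPS = "<eps>"  # empty output label

def compose(a, b):
    """Compose two WFSTs by matching a's output labels with b's input labels.
    Costs add along a path (tropical semiring)."""
    start = (a["start"], b["start"])
    arcs, finals = [], {}
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = queue.popleft()
        if qa in a["finals"] and qb in b["finals"]:
            finals[(qa, qb)] = a["finals"][qa] + b["finals"][qb]
        for sa, ia, oa, wa, da in a["arcs"]:
            if sa != qa:
                continue
            for sb, ib, ob, wb, db in b["arcs"]:
                if sb == qb and ib == oa:
                    dst = (da, db)
                    arcs.append(((qa, qb), ia, ob, wa + wb, dst))
                    if dst not in seen:
                        seen.add(dst)
                        queue.append(dst)
    return {"start": start, "finals": finals, "arcs": arcs}

def contact_grammar(prons, counts):
    """Second grammar: phone-word sequences -> contact names.
    Each name's cost is -log of its share of interactions (counts must be > 0)."""
    total = sum(counts.values())
    arcs, finals, next_state = [], {}, 1
    for name, phones in prons.items():
        cost = -math.log(counts[name] / total)
        src = 0
        for i, phone in enumerate(phones):
            dst, next_state = next_state, next_state + 1
            first = i == 0  # emit the name and its cost on the first arc only
            arcs.append((src, phone, name if first else EPS, cost if first else 0.0, dst))
            src = dst
        finals[src] = 0.0
    return {"start": 0, "finals": finals, "arcs": arcs}

def best_path(t):
    """Dijkstra over non-negative costs; returns (output words, total cost)."""
    heap, tie, done = [(0.0, 0, t["start"], [])], 1, set()
    while heap:
        cost, _, state, out = heapq.heappop(heap)
        if state == "FINAL":
            return out, cost
        if state in done:
            continue
        done.add(state)
        if state in t["finals"]:
            heapq.heappush(heap, (cost + t["finals"][state], tie, "FINAL", out))
            tie += 1
        for src, _, olabel, w, dst in t["arcs"]:
            if src == state:
                new_out = out if olabel == EPS else out + [olabel]
                heapq.heappush(heap, (cost + w, tie, dst, new_out))
                tie += 1
    return None, math.inf

# Stand-in for the first-stage output: mono-phone words with acoustic costs.
# The final phone is ambiguous, and "th" is slightly cheaper than "t".
first = {"start": 0, "finals": {3: 0.0}, "arcs": [
    (0, "k", "k", 0.0, 1), (1, "ae", "ae", 0.0, 2),
    (2, "t", "t", 0.7, 3), (2, "th", "th", 0.6, 3)]}

prons = {"Kat": ["k", "ae", "t"], "Kath": ["k", "ae", "th"]}
second = contact_grammar(prons, {"Kat": 9, "Kath": 1})

words, cost = best_path(compose(first, second))
```

In this toy setup the acoustic costs alone slightly favor "Kath", but the usage-derived cost makes "Kat" the cheapest path overall, which mirrors the idea in claim 10 that the likelihood is based on the user-specific usage data. A production system would typically rely on a WFST library with full epsilon handling rather than this simplified composition.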

14. A non-transitory computer-readable storage medium comprising computer-executable instructions for:
- receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
  
  receiving speech input from a user;
  
  in response to receiving the speech input:
  
  composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
  
  transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
  
  outputting the word based on the associated probability.
- Dependent claims (15-18):
  - 15. The non-transitory computer-readable storage medium of claim 14, wherein the one or more entities comprise a list of user contacts.
  - 16. The non-transitory computer-readable storage medium of claim 15, wherein the indication of user interaction comprises a frequency of interaction with a contact in the list of user contacts.
  - 17. The non-transitory computer-readable storage medium of claim 14, wherein the one or more entities comprise a list of applications on a device associated with the user.
  - 18. The non-transitory computer-readable storage medium of claim 17, wherein the indication of user interaction comprises a frequency of interaction with an application in the list of applications.

19. A system for recognizing speech, the system comprising:
- one or more processors;
  
  memory; and
  
  one or more programs;
  
  wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
  
  receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
  
  receiving speech input from a user;
  
  in response to receiving the speech input:
  
  composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
  
  transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
  
  outputting the word based on the associated probability.
- Dependent claims (20-23):
  - 20. The system of claim 19, wherein the one or more entities comprise a list of user contacts.
  - 21. The system of claim 20, wherein the indication of user interaction comprises a frequency of interaction with a contact in the list of user contacts.
  - 22. The system of claim 19, wherein the one or more entities comprise a list of applications on a device associated with the user.
  - 23. The system of claim 22, wherein the indication of user interaction comprises a frequency of interaction with an application in the list of applications.


Current Assignee
Apple Inc.
Original Assignee
Apple Inc.
Inventors
PAULIK, Matthias; HUANG, Rongqing

Granted Patent

US 9,502,031 B2
US Class Current

1/1
CPC Class Codes

G06F 40/211   Syntactic parsing, e.g. bas...

G10L 15/19   Grammatical context, e.g. d...

G10L 15/197   Probabilistic grammars, e.g...

G10L 15/22   Procedures used during a sp...

G10L 2015/223   Execution procedure of a sp...

G10L 2015/226   using non-speech characteri...
