Method for supporting dynamic grammars in WFST-based ASR
First Claim
1. A method for recognizing speech, the method comprising:
at an electronic device:
receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
receiving speech input from a user;
in response to receiving the speech input:
composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
outputting the word based on the associated probability.
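The compose, transduce, and output steps recited above can be illustrated with a toy sketch. It is not the patented implementation: the dictionary-based transducer layout, the `compose` and `best_path` helpers, the phone labels, and all costs are assumptions chosen for illustration. Costs are treated as negative log probabilities (tropical semiring), and the composition shown only handles epsilons on the output side of the second transducer.

```python
import heapq
import math

EPS = "<eps>"  # hypothetical epsilon (empty output) label

def compose(a, b):
    """Compose two transducers by matching a's output labels to b's input labels.
    Costs add along matched arcs. Assumes no epsilons on the matched side."""
    start = (a["start"], b["start"])
    arcs, finals, stack, seen = {}, {}, [start], {start}
    while stack:
        sa, sb = state = stack.pop()
        if sa in a["finals"] and sb in b["finals"]:
            finals[state] = a["finals"][sa] + b["finals"][sb]
        for a_in, a_out, wa, na in a["arcs"].get(sa, []):
            for b_in, b_out, wb, nb in b["arcs"].get(sb, []):
                if a_out == b_in:
                    nxt = (na, nb)
                    arcs.setdefault(state, []).append((a_in, b_out, wa + wb, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}

def best_path(fst):
    """Dijkstra over non-negative costs; returns (output words, exp(-cost))."""
    tie = 0  # unique tie-breaker so heap entries never compare states
    heap = [(0.0, tie, fst["start"], ())]
    visited = set()
    while heap:
        cost, _, state, words = heapq.heappop(heap)
        if state is None:  # sentinel super-final state reached
            return list(words), math.exp(-cost)
        if state in visited:
            continue
        visited.add(state)
        if state in fst["finals"]:
            tie += 1
            heapq.heappush(heap, (cost + fst["finals"][state], tie, None, words))
        for _, out, w, nxt in fst["arcs"].get(state, []):
            tie += 1
            new_words = words if out == EPS else words + (out,)
            heapq.heappush(heap, (cost + w, tie, nxt, new_words))
    return [], 0.0

# Toy phone lattice standing in for the static side: "r aa b" is acoustically
# slightly cheaper than "b aa b".
lattice = {
    "start": 0, "finals": {3: 0.0},
    "arcs": {0: [("b", "b", 1.0, 1), ("r", "r", 0.8, 1)],
             1: [("aa", "aa", 0.0, 2)],
             2: [("b", "b", 0.0, 3)]},
}
# Toy personalized grammar: "Bob" is a frequently used contact, so it is cheap.
user_grammar = {
    "start": 0, "finals": {3: 0.0},
    "arcs": {0: [("b", "Bob", 0.3, 1), ("r", "Rob", 2.0, 4)],
             1: [("aa", EPS, 0.0, 2)], 2: [("b", EPS, 0.0, 3)],
             4: [("aa", EPS, 0.0, 5)], 5: [("b", EPS, 0.0, 3)]},
}

words, score = best_path(compose(lattice, user_grammar))
print(words, score)
```

In this toy setup the personalized costs outweigh the small acoustic preference, so the path for "Bob" (total cost 1.3) beats the path for "Rob" (total cost 2.8). A production system would more likely rely on a WFST toolkit with full epsilon handling than on hand-rolled composition.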
Abstract
Systems and processes are disclosed for recognizing speech using a weighted finite state transducer (WFST) approach. Dynamic grammars can be supported by constructing the final recognition cascade during runtime using different grammars. In a first grammar, non-terminals can be replaced with a weighted phone loop that produces sequences of mono-phone words. In a second grammar, at runtime, non-terminals can be replaced with sub-grammars derived from user-specific usage data including contact, media, and application lists. Interaction frequencies associated with these entities can be used to weight certain words over others. With all non-terminals replaced, a static recognition cascade with the first grammar can be composed with the personalized second grammar to produce a user-specific WFST. User speech can then be processed to generate candidate words having associated probabilities, and the likeliest result can be output.
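As a rough illustration of how interaction frequencies might weight a sub-grammar that replaces a non-terminal, the sketch below turns usage counts into negative log probability costs and expands a command template. The function names, the `$ContactList` non-terminal label, the add-one smoothing, and the sample counts are all assumptions for illustration; the abstract does not specify a weighting formula.

```python
import math

def build_subgrammar(usage_counts, smoothing=1.0):
    """Map each entity to a cost of -log(probability), where the probability
    is the smoothed share of interactions (smoothing is an assumption here)."""
    total = sum(usage_counts.values()) + smoothing * len(usage_counts)
    return {
        entity: -math.log((count + smoothing) / total)
        for entity, count in usage_counts.items()
    }

def expand_nonterminal(template, nonterminal, subgrammar):
    """Replace the non-terminal in a template with every weighted entity,
    returning (phrase, cost) pairs sorted from cheapest to most expensive."""
    expansions = []
    for entity, cost in subgrammar.items():
        phrase = [entity if tok == nonterminal else tok for tok in template]
        expansions.append((" ".join(phrase), cost))
    return sorted(expansions, key=lambda pair: pair[1])

# Hypothetical per-user interaction counts for a contact list.
contacts = {"Alice": 12, "Bob": 3, "Carol": 0}
subgrammar = build_subgrammar(contacts)
expansions = expand_nonterminal(["call", "$ContactList"], "$ContactList", subgrammar)

for phrase, cost in expansions:
    print(f"{phrase}\t{cost:.3f}")
```

Frequently used entities receive lower costs, so a decoder that minimizes total cost would favor them when the acoustic evidence is ambiguous. Smoothing keeps never-used entities such as "Carol" recognizable rather than assigning them infinite cost.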
2847 Citations
23 Claims
1. A method for recognizing speech, the method comprising:
at an electronic device:
receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
receiving speech input from a user;
in response to receiving the speech input:
composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
outputting the word based on the associated probability.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A non-transitory computer-readable storage medium comprising computer-executable instructions for:
receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
receiving speech input from a user;
in response to receiving the speech input:
composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
outputting the word based on the associated probability.
Dependent claims: 15, 16, 17, 18
19. A system for recognizing speech, the system comprising:
one or more processors;
memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
receiving speech input from a user;
in response to receiving the speech input:
composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
outputting the word based on the associated probability.
Dependent claims: 20, 21, 22, 23
Specification