METHOD FOR SUPPORTING DYNAMIC GRAMMARS IN WFST-BASED ASR
Abstract
Systems and processes are disclosed for recognizing speech using a weighted finite state transducer (WFST) approach. Dynamic grammars can be supported by constructing the final recognition cascade during runtime using difference grammars. In a first grammar, non-terminals can be replaced with a weighted phone loop that produces sequences of mono-phone words. In a second grammar, at runtime, non-terminals can be replaced with sub-grammars derived from user-specific usage data including contact, media, and application lists. Interaction frequencies associated with these entities can be used to weight certain words over others. With all non-terminals replaced, a static recognition cascade with the first grammar can be composed with the personalized second grammar to produce a user-specific WFST. User speech can then be processed to generate candidate words having associated probabilities, and the likeliest result can be output.
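The personalization step described in the abstract can be illustrated with a minimal sketch. It assumes that interaction counts are converted to negative log-probability costs with add-one smoothing, and that a non-terminal written `$CONTACT` is expanded in place. The function names, the `$CONTACT` label, the smoothing choice, and the sample data are all hypothetical and are not taken from the patent; a real system would build transducers rather than Python lists.

```python
import math


def entity_weights(usage, smoothing=1.0):
    """Map each entity to a cost of -log P(entity).

    P is estimated from interaction counts with additive smoothing
    (an assumption made for this sketch, not the patent's formula).
    """
    total = sum(usage.values()) + smoothing * len(usage)
    return {
        name: -math.log((count + smoothing) / total)
        for name, count in usage.items()
    }


def expand_non_terminal(template, non_terminal, sub_grammar):
    """Replace a non-terminal in a word template with each sub-grammar entry.

    Returns (word_sequence, cost) pairs sorted so the cheapest,
    i.e. most probable, path comes first.
    """
    paths = []
    for entity, cost in sub_grammar.items():
        words = []
        for token in template:
            if token == non_terminal:
                words.extend(entity.split())
            else:
                words.append(token)
        paths.append((words, cost))
    return sorted(paths, key=lambda path: path[1])


# Hypothetical usage data: contact name -> number of interactions.
usage = {"Ann Lee": 8, "Anne Leigh": 1}
contacts = entity_weights(usage)

# Hypothetical grammar template with one non-terminal.
paths = expand_non_terminal(["call", "$CONTACT"], "$CONTACT", contacts)
best_words, best_cost = paths[0]
print(" ".join(best_words), round(math.exp(-best_cost), 3))
```

In this sketch the more frequently contacted entity receives the lower cost, so two similar-sounding names would be ranked by how often the user interacts with each, which mirrors the weighting idea in the abstract.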
23 Claims
1. A method for recognizing speech, the method comprising:
at an electronic device:
receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
receiving speech input from a user;
in response to receiving the speech input:
composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
outputting the word based on the associated probability.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13.
14. A non-transitory computer-readable storage medium comprising computer-executable instructions for:
receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
receiving speech input from a user;
in response to receiving the speech input:
composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
outputting the word based on the associated probability.
Dependent claims: 15, 16, 17, 18.
19. A system for recognizing speech, the system comprising:
one or more processors; memory; and one or more programs;
wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
receiving user-specific usage data comprising one or more entities and an indication of user interaction with the one or more entities; and
receiving speech input from a user;
in response to receiving the speech input:
composing a weighted finite state transducer having a first grammar transducer with a second grammar transducer, wherein the second grammar transducer comprises the user-specific usage data;
transducing the speech input into a word and an associated probability using the weighted finite state transducer composed with the second grammar transducer; and
outputting the word based on the associated probability.
Dependent claims: 20, 21, 22, 23.
Specification