Adapting language models with a bit mask for a subset of related words
Abstract
Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset; receives input speech; generates a speech recognition lattice based on the received input speech using the masked language model; removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset; and recognizes the received speech based on the lattice. Alternatively, during the generation step, the system can add only words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask.
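The pruning step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the patent's implementation: the `Arc` type, the subset identifiers `SUBSET_NYC` and `SUBSET_LA`, the one-bit-per-subset mask layout, the toy scores, and the simplification that lattice nodes are numbered in topological order.

```python
from dataclasses import dataclass

# Hypothetical subset ids: each adaptation subset owns one bit position.
SUBSET_NYC = 0
SUBSET_LA = 1


@dataclass(frozen=True)
class Arc:
    start: int    # source node (nodes assumed numbered in topological order)
    end: int      # destination node
    word: str
    score: float  # log-probability-like score; higher is better


def is_allowed(bit_masks, word, subset):
    """True if the word's mask has the bit for the given adaptation subset set."""
    return bool((bit_masks.get(word, 0) >> subset) & 1)


def prune_lattice(arcs, bit_masks, subset):
    """Drop arcs whose word is disallowed for the adaptation subset."""
    return [a for a in arcs if is_allowed(bit_masks, a.word, subset)]


def best_path(arcs, start, final):
    """Highest-scoring word sequence from start to final, or None if unreachable."""
    best = {start: (0.0, [])}
    # Visiting arcs by ascending source node is valid because every arc
    # is assumed to go from a lower-numbered node to a higher-numbered one.
    for arc in sorted(arcs, key=lambda a: a.start):
        if arc.start in best:
            score, words = best[arc.start]
            cand = (score + arc.score, words + [arc.word])
            if arc.end not in best or cand[0] > best[arc.end][0]:
                best[arc.end] = cand
    return best[final][1] if final in best else None


# Toy data: the bit mask is kept separate from the lattice here.
bit_masks = {"pizza": 0b11, "in": 0b11, "brooklyn": 0b01, "burbank": 0b10}
arcs = [
    Arc(0, 1, "pizza", -0.25),
    Arc(1, 2, "in", -0.25),
    Arc(2, 3, "burbank", -0.5),
    Arc(2, 3, "brooklyn", -1.0),
]

unpruned = best_path(arcs, 0, 3)
nyc_only = best_path(prune_lattice(arcs, bit_masks, SUBSET_NYC), 0, 3)
```

In this toy lattice the acoustically preferred word "burbank" wins without pruning, while pruning for the hypothetical New York subset leaves only the path ending in "brooklyn". The alternative mentioned in the abstract, adding only allowed words during lattice generation, would apply the same `is_allowed` check before an arc is created rather than after.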
20 Claims
1. A computer-readable storage device having instructions stored which, when executed by a processor, cause the processor to perform operations comprising:
receiving, via the processor, a language model having a vocabulary of words, the language model comprising a speech recognizer output lattice;
receiving a user query;
identifying, via the processor, an adaptation subset of related words in the vocabulary of words in the language model based on the user query;
adding a geographically-specific bit mask to each word in the vocabulary of words based on the adaptation subset, thereby generating a masked language model; and
modifying weights of other words in the speech recognizer output lattice based on the masked language model.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
receiving, via the processor, a language model having a vocabulary of words, the language model comprising a speech recognizer output lattice;
receiving a user query;
identifying, via the processor, an adaptation subset of related words in the vocabulary of words in the language model based on the user query;
adding a geographically-specific bit mask to each word in the vocabulary of words based on the adaptation subset, thereby generating a masked language model; and
modifying weights of other words in the speech recognizer output lattice based on the masked language model.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A method comprising:
receiving, via the processor, a language model having a vocabulary of words, the language model comprising a speech recognizer output lattice;
receiving a user query;
identifying, via the processor, an adaptation subset of related words in the vocabulary of words in the language model based on the user query;
adding a geographically-specific bit mask to each word in the vocabulary of words based on the adaptation subset, thereby generating a masked language model; and
modifying weights of other words in the speech recognizer output lattice based on the masked language model.
Dependent claims: 16, 17, 18, 19, 20
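The operations shared by independent claims 1, 8, and 15 can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the `RELATED_WORDS` and `REGION_BIT` tables, the keyword match against the query, the flat list of `(word, weight)` pairs standing in for the lattice, and the fixed `penalty` are all hypothetical. Reading "other words" as words outside the adaptation subset is likewise an assumption.

```python
# Hypothetical table relating a place name to vocabulary words; the claims do
# not say how relatedness is determined.
RELATED_WORDS = {
    "brooklyn": {"brooklyn", "flatbush", "bridge"},
    "burbank": {"burbank", "studio", "airport"},
}
# Hypothetical layout: one bit per geographic region.
REGION_BIT = {"brooklyn": 0, "burbank": 1}


def identify_adaptation_subset(vocabulary, query):
    """Return (region, related vocabulary words) for the first region named in the query."""
    for token in query.lower().split():
        if token in RELATED_WORDS:
            return token, RELATED_WORDS[token] & set(vocabulary)
    return None, set()


def add_bit_masks(vocabulary, region, subset):
    """Attach a mask to every word; the region's bit is set only for subset words."""
    if region is None:
        return {word: 0 for word in vocabulary}
    bit = 1 << REGION_BIT[region]
    return {word: (bit if word in subset else 0) for word in vocabulary}


def reweight_lattice(lattice, masks, region, penalty=1.0):
    """Lower the weight of lattice words whose mask lacks the region's bit."""
    if region is None:
        return list(lattice)
    bit = 1 << REGION_BIT[region]
    return [
        (word, weight if masks.get(word, 0) & bit else weight - penalty)
        for word, weight in lattice
    ]


# Toy inputs: a vocabulary, a simplified lattice, and a user query.
vocabulary = ["pizza", "near", "flatbush", "studio", "bridge"]
lattice = [("pizza", -0.5), ("flatbush", -1.5), ("studio", -1.0)]

region, subset = identify_adaptation_subset(vocabulary, "pizza places in Brooklyn")
masks = add_bit_masks(vocabulary, region, subset)
reweighted = reweight_lattice(lattice, masks, region)
```

With these toy inputs the query selects the hypothetical "brooklyn" region, "flatbush" and "bridge" receive the region bit, and every other lattice word is penalized. A real system would presumably derive relatedness and weight changes from data rather than from fixed tables and a constant penalty.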
Specification