Automatic language model update
2 Assignments
0 Petitions
Abstract
A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
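The probability revision described in the abstract can be illustrated with a minimal sketch. It assumes a unigram (single-word) model and a simple linear interpolation between baseline probabilities and a recent-usage distribution; the function name `revise_probabilities`, the `weight` parameter, and the example numbers are hypothetical and are not taken from the patent, which does not specify a formula here.

```python
from collections import Counter


def revise_probabilities(baseline, recent_counts, weight=0.3):
    """Blend baseline word probabilities with a recent-usage distribution.

    baseline:      dict mapping word -> baseline probability (assumed to sum to 1)
    recent_counts: dict mapping word -> occurrences in recent search queries
    weight:        hypothetical interpolation weight given to recent usage
    """
    total_recent = sum(recent_counts.values())
    if total_recent == 0:
        # No recent usage information: keep the baseline unchanged.
        return dict(baseline)

    revised = {}
    for word in set(baseline) | set(recent_counts):
        p_base = baseline.get(word, 0.0)
        p_recent = recent_counts.get(word, 0) / total_recent
        revised[word] = (1 - weight) * p_base + weight * p_recent
    return revised


# Illustrative data only: two acoustically similar words, equally likely at baseline.
baseline = {"weather": 0.5, "whether": 0.5}
recent = Counter({"weather": 9, "whether": 1})
revised = revise_probabilities(baseline, recent)
```

With these illustrative inputs, the word that appears more often in recent queries ends up with the higher revised probability, and the revised values still sum to one because both input distributions do.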
90 Citations
17 Claims
1. A computer-implemented method comprising:
obtaining, by a server-side, updater module of a search system, for each of one or more terms, word count data indicating a number of times that a term has occurred within one or more real-time textual information streams within a predetermined period of time;
after obtaining, for each of the one or more terms, word count data indicating a number of times that the term has occurred within the one or more real-time textual information streams within the predetermined period of time, receiving, by a search engine of the search system, a query including one or more particular terms from a mobile device or a digital assistant device;
in response to receiving, by the search engine of the search system, the query including one or more particular terms from the mobile device or the digital assistant device, transmitting (i) one or more search results associated with the query that are identified by the search engine of the search system, and (ii) particular word count data indicating the number of times that the particular term has occurred within the one or more real-time textual information streams within the predetermined period of time, for use by a speech recognition model trainer that is included on the mobile device or the digital assistant device in updating a language model that is used by an automated speech recognizer that is included on the mobile device or the digital assistant device; and
updating, by the speech recognition model trainer that is included on the mobile device or the digital assistant device, statistical information associated with the language model based at least on the particular word count, to favor one or more words that were received by the search engine within a recent time period over one or more words that were not received by the search engine within the recent time period.
Dependent claims: 2, 3, 4, 5, 6
7. A search system comprising:
a processor configured to execute computer program instructions; and
a computer storage medium encoded with the computer program instructions that, when executed by the processor, cause the system to perform operations comprising:
obtaining, by a server-side, updater module of the search system, for each of one or more terms, word count data indicating a number of times that a term has occurred within one or more real-time textual information streams within a predetermined period of time;
after obtaining, for each of the one or more terms, word count data indicating a number of times that the term has occurred within the one or more real-time textual information streams within the predetermined period of time, receiving, by a search engine of the search system, a query including one or more particular terms from a mobile device or a digital assistant device;
in response to receiving, by the search engine of the search system, the query including one or more particular terms from the mobile device or the digital assistant device, transmitting (i) one or more search results associated with the query that are identified by the search engine of the search system, and (ii) particular word count data indicating the number of times that the particular term has occurred within the one or more real-time textual information streams within the predetermined period of time, for use by a speech recognition model trainer that is included on the mobile device or the digital assistant device in updating a language model that is used by an automated speech recognizer that is included on the mobile device or the digital assistant device; and
updating, by the speech recognition model trainer that is included on the mobile device or the digital assistant device, statistical information associated with the language model based at least on the particular word count, to favor one or more words that were received by the search engine within a recent time period over one or more words that were not received by the search engine within the recent time period.
Dependent claims: 8, 9, 10, 11, 12
13. A non-transitory computer-readable storage device encoded with a computer program, the computer program comprising instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising:
obtaining, by a server-side, updater module of a search system, for each of one or more terms, word count data indicating a number of times that a term has occurred within one or more real-time textual information streams within a predetermined period of time;
after obtaining, for each of the one or more terms, word count data indicating a number of times that the term has occurred within the one or more real-time textual information streams within the predetermined period of time, receiving, by a search engine of the search system, a query including one or more particular terms from a mobile device or a digital assistant device;
in response to receiving, by the search engine of the search system, the query including one or more particular terms from the mobile device or the digital assistant device, transmitting (i) one or more search results associated with the query that are identified by the search engine of the search system, and (ii) particular word count data indicating the number of times that the particular term has occurred within the one or more real-time textual information streams within the predetermined period of time for use by a speech recognition model trainer that is included on the mobile device or the digital assistant device in updating a language model that is used by an automated speech recognizer that is included on the mobile device or the digital assistant device; and
updating, by the speech recognition model trainer that is included on the mobile device or the digital assistant device, statistical information associated with the language model based at least on the particular word count, to favor one or more words that were received by the search engine within a recent time period over one or more words that were not received by the search engine within the recent time period.
Dependent claims: 14, 15, 16, 17
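The flow recited in the independent claims (windowed word counts on the server, counts returned alongside search results, and an on-device trainer that updates language model statistics) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the class and function names (`WindowedWordCounter`, `handle_query`, `OnDeviceTrainer`), the whitespace tokenization, the response layout, and the choice to add the received counts to unigram counts are all hypothetical.

```python
from collections import Counter, deque


class WindowedWordCounter:
    """Hypothetical server-side updater: counts term occurrences in text
    streams within a fixed time window (the "predetermined period of time")."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, term), assumed to arrive in time order
        self.counts = Counter()

    def observe(self, text, now):
        # Assumption: simple lowercase whitespace tokenization.
        for term in text.lower().split():
            self.events.append((now, term))
            self.counts[term] += 1

    def _expire(self, now):
        # Drop occurrences that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, term = self.events.popleft()
            self.counts[term] -= 1
            if self.counts[term] == 0:
                del self.counts[term]

    def count(self, term, now):
        self._expire(now)
        return self.counts.get(term, 0)


def handle_query(query, counter, now):
    """Hypothetical search endpoint: returns results plus word count data
    for the query terms, for use by the on-device trainer."""
    terms = query.lower().split()
    return {
        "results": [f"result for {query}"],  # placeholder, not a real search
        "word_counts": {t: counter.count(t, now) for t in terms},
    }


class OnDeviceTrainer:
    """Hypothetical on-device trainer holding unigram counts as the
    statistical information associated with the language model."""

    def __init__(self, unigram_counts):
        self.unigram_counts = Counter(unigram_counts)

    def update(self, word_counts):
        # Assumption: received counts are added to the stored counts,
        # which raises the relative probability of recently seen words.
        self.unigram_counts.update(word_counts)

    def probability(self, word):
        total = sum(self.unigram_counts.values())
        return self.unigram_counts[word] / total if total else 0.0


# Usage with illustrative data: a one-hour window and three stream items.
counter = WindowedWordCounter(window_seconds=3600)
counter.observe("eclipse tonight", now=0)
counter.observe("eclipse viewing tips", now=1800)
counter.observe("eclipse", now=3000)

response = handle_query("eclipse", counter, now=3700)

trainer = OnDeviceTrainer({"eclipse": 1, "ellipse": 1})
before = trainer.probability("eclipse")
trainer.update(response["word_counts"])
after = trainer.probability("eclipse")
```

In this sketch the occurrence observed at time 0 has left the window by the time the query arrives, so only the two later occurrences are reported, and the on-device probability of the recently used word rises relative to the similar-sounding alternative.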
Specification