Automatic language model update
Abstract
A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
28 Claims
1. A computer-implemented method, comprising:
receiving, by an internet search service, multiple search queries that were sent by multiple respective computing devices, each of the multiple search queries including an occurrence of a particular term that is common to each of the multiple search queries, the multiple search queries having been entered textually by users of the multiple respective computing devices;
applying, by a computing system and to the occurrences of the particular term in the multiple search queries, weightings based on chronological receipt of the multiple search queries such that the weightings favor recent occurrences of the particular term in the multiple search queries more heavily than older occurrences of the particular term in the multiple search queries;
revising, by the computing system and based on the weightings that have been applied to the occurrences of the particular term in the multiple search queries, a probability that is assigned to the particular term in a language model;
receiving, by the computing system and as having been sent by a remote computing system, a verbal input; and
using the language model, by the computing system and after the revising of the probability, to identify a text translation of the verbal input, wherein the text translation of the verbal input includes the particular term.
Dependent claims: 2, 3, 4, 5

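The applying and revising steps of claim 1 can be illustrated with a minimal sketch. The claim does not specify a weighting function or an update rule, so everything here is an assumption made for illustration: the exponential decay with a `half_life_days` parameter, the unigram model stored as a plain dictionary, the `mix` interpolation factor, and the function names `recency_weight` and `revise_probabilities`.

```python
from collections import defaultdict


def recency_weight(age_days, half_life_days=30.0):
    """Assumed weighting: an occurrence loses half its weight every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def revise_probabilities(baseline, queries, now, half_life_days=30.0, mix=0.5):
    """Blend baseline term probabilities with recency-weighted query counts.

    baseline: dict mapping term -> probability (assumed unigram language model)
    queries:  list of (receipt_day, query_text) pairs
    now:      current day, on the same scale as receipt_day
    mix:      assumed interpolation factor between baseline and recent usage
    """
    weighted = defaultdict(float)
    for receipt_day, text in queries:
        weight = recency_weight(now - receipt_day, half_life_days)
        for term in text.lower().split():
            # Recent occurrences contribute more than older ones.
            weighted[term] += weight

    total = sum(weighted.values())
    if total == 0:
        # No query evidence: leave the model unchanged.
        return dict(baseline)

    revised = {}
    for term in set(baseline) | set(weighted):
        recent = weighted.get(term, 0.0) / total
        revised[term] = (1.0 - mix) * baseline.get(term, 0.0) + mix * recent
    return revised
```

If the baseline probabilities sum to one, the interpolated result also sums to one, so no separate renormalization step is needed in this sketch. A production language model would more likely use n-gram or neural estimates with smoothing, which the sketch leaves out.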
6. A computer-implemented method, comprising:
accessing, by a computing system, information that identifies weightings that have been applied to occurrences of a particular term that is common to multiple search queries, the weightings having been applied to the occurrences of the particular term in the multiple search queries based on chronological receipt of the multiple search queries such that the weightings favor recent occurrences of the particular term in the multiple search queries more heavily than older occurrences of the particular term in the multiple search queries;
determining, by the computing system and based on the information that identifies the weightings that have been applied to the occurrences of the particular term in the multiple search queries, a probability of usage of the particular term, for use by a speech recognition system; and
building or updating, by the computing system, a language model that is stored by the computing system, to reflect the determined probability of usage of the particular term, or sending, by the computing system and for receipt by a remote computing system, an indication of the determined probability of usage of the particular term so as to cause the remote computing system to build or update a language model that is stored by the remote computing system, to reflect the determined probability of usage of the particular term.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14

15. A computer-implemented method, comprising:
receiving, by a computing system, a verbal input that was provided by a user; and
using, by the computing system, a language model to identify a text translation of the verbal input, the text translation of the verbal input including a particular term, the use of the language model including accessing, in the language model, a probability of usage of the particular term, wherein the probability of usage of the particular term was determined based on weightings that have been applied to occurrences of the particular term in multiple search queries, the weightings having been applied to the occurrences of the particular term in the multiple search queries based on chronological receipt of the multiple search queries such that the weightings favor recent occurrences of the particular term in the multiple search queries more heavily than older occurrences of the particular term in the multiple search queries.
Dependent claims: 16, 17, 18, 19, 20

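One way to picture the "using a language model" step of claim 15 is as rescoring: a recognizer proposes several candidate transcriptions, and the stored term probabilities help choose among them. The sketch below is an assumption-laden illustration, not the claimed method. The candidate list with acoustic log scores, the unigram scoring, the `floor` probability for unknown terms, and the names `lm_log_prob` and `pick_transcription` are all hypothetical.

```python
import math


def lm_log_prob(text, term_probs, floor=1e-6):
    """Sum of log probabilities of the terms in text under an assumed unigram model.

    Terms missing from term_probs receive the small floor probability.
    """
    return sum(math.log(term_probs.get(term, floor)) for term in text.lower().split())


def pick_transcription(candidates, term_probs, lm_weight=1.0):
    """Return the candidate text with the best combined score.

    candidates: list of (text, acoustic_log_score) pairs from a hypothetical recognizer
    term_probs: dict mapping term -> probability of usage
    lm_weight:  assumed scaling of the language model score against the acoustic score
    """
    def combined_score(candidate):
        text, acoustic_log_score = candidate
        return acoustic_log_score + lm_weight * lm_log_prob(text, term_probs)

    return max(candidates, key=combined_score)[0]
```

With this kind of scoring, raising the stored probability of a term that has recently become common in search queries can tip the choice toward a transcription containing that term, even when a competing candidate has a slightly better acoustic score.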
21. A non-transitory computer-readable medium storing instructions that, when executed by a computer processor, cause performance of operations that comprise:
accessing, by a computing system, information that identifies weightings that have been applied to occurrences of a particular term that is common to multiple search queries, the weightings having been applied to the occurrences of the particular term in the multiple search queries based on chronological receipt of the multiple search queries such that the weightings favor recent occurrences of the particular term in the multiple search queries more heavily than older occurrences of the particular term in the multiple search queries;
determining, by the computing system and based on the information that identifies the weightings that have been applied to the occurrences of the particular term in the multiple search queries, a probability of usage of the particular term, for use by a speech recognition system; and
building or updating, by the computing system, a language model that is stored by the computing system, to reflect the determined probability of usage of the particular term, or sending, by the computing system and for receipt by a remote computing system, an indication of the determined probability of usage of the particular term so as to cause the remote computing system to build or update a language model that is stored by the remote computing system, to reflect the determined probability of usage of the particular term.
Dependent claims: 22, 23

24. A non-transitory computer-readable medium storing instructions that, when executed by a computer processor, cause performance of operations that comprise:
receiving, by a computing system, a verbal input that was provided by a user; and
using, by the computing system, a language model to identify a text translation of the verbal input, the text translation of the verbal input including a particular term, the use of the language model including accessing, in the language model, a probability of usage of the particular term, wherein the probability of usage of the particular term was determined based on weightings that have been applied to occurrences of the particular term in multiple search queries, the weightings having been applied to the occurrences of the particular term in the multiple search queries based on chronological receipt of the multiple search queries such that the weightings favor recent occurrences of the particular term in the multiple search queries more heavily than older occurrences of the particular term in the multiple search queries.
Dependent claims: 25, 26, 27, 28

Specification