Speech recognition using a personal vocabulary and language model
Abstract
In one implementation, speech or audio is converted to a searchable format by a speech recognition system. The speech recognition system uses a language model including probabilities of certain words occurring, which may depend on the occurrence of other words or sequences of words. The language model is partially built from personal vocabularies. Personal vocabularies are determined by known text from network traffic, including emails and Internet postings. The speech recognition system may incorporate the personal vocabulary of one user into the language model of another user based on a connection between the two users. The connection may be triggered by an email, a phone call, or an interaction in a social networking service. The speech recognition system may remove or add personal vocabularies to the language model based on a calculated confidence score from the resulting language model.
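The incorporation of one user's personal vocabulary into another user's language model can be illustrated with a minimal sketch. It assumes unigram word probabilities and a fixed linear interpolation weight; the function names (`build_vocabulary`, `to_probabilities`, `blend`) and the `weight` parameter are hypothetical and are not taken from the patent, which does not specify a particular model form.

```python
from collections import Counter


def build_vocabulary(texts):
    """Count lower-cased words in known text (e.g. emails) for one user."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def to_probabilities(counts):
    """Turn raw word counts into a unigram probability table."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def blend(own, other, weight=0.3):
    """Linearly interpolate two unigram tables.

    `weight` is the share given to the connected user's vocabulary.
    It is an assumed tuning knob, not a value from the source.
    """
    words = set(own) | set(other)
    return {
        w: (1 - weight) * own.get(w, 0.0) + weight * other.get(w, 0.0)
        for w in words
    }


# Illustrative data only: two tiny "personal vocabularies".
first_user = to_probabilities(build_vocabulary(["please review the budget"]))
second_user = to_probabilities(build_vocabulary(["the codec latency budget"]))
model = blend(first_user, second_user)
```

In this sketch, a word such as "codec" that appears only in the second user's text receives nonzero probability in the first user's blended model, which is the effect the abstract describes. A confidence-based step that adds or removes vocabularies would sit on top of this and is not shown.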
20 Claims
1. A method comprising:
- monitoring network traffic from a plurality of users including a first user and a second user;
- extracting words from the network traffic;
- building a personal vocabulary for at least the second user from the words;
- identifying a connection between the first user and the second user, wherein the connection is created from a trigger that includes an email including one or more subject matter keywords and the first and the second user as one or more of a recipient of the email, a sender of the email, or a part of text in the email;
- receiving audio of the first user originating from audio content that does not involve the second user; and
- converting the audio of the first user into text using a language model based at least partially on the personal vocabulary of the second user and the connection between the first user and the second user, where the audio of the first user includes at least part of the one or more subject matter keywords.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. An apparatus comprising:
- a collector interface configured to monitor network traffic from a plurality of users including a first user and a second user and extract n-grams from the network traffic;
- a memory configured to store a personal vocabulary for at least the second user from the n-grams; and
- a controller configured to:
  - identify a connection between the first user and the second user, wherein the connection is created from a trigger that includes a message including one or more subject matter keywords and the first and the second user as one or more of a recipient of the message, a sender of the message, or a part of a body or a header in the message;
  - receive audio of the first user originating from audio content that does not involve the second user; and
  - convert the audio of the first user into text using a language model based at least partially on the personal vocabulary of the second user and the connection between the first user and the second user, where the audio of the first user includes at least part of the one or more subject matter keywords.
Dependent claims: 10, 11, 12, 13, 14, 15
16. Logic encoded in one or more non-transitory tangible media, the logic executable by a processor and operable to:
- monitor network traffic from a plurality of users including a first user and a second user;
- extract words from the network traffic;
- build a personal vocabulary from the words for each of the plurality of users;
- identify a connection between the first user and the second user, wherein the connection is created from a trigger that includes an email, call, or social media interaction including one or more subject matter keywords and the first and the second user as one or more of a recipient, a sender, or a part of content of the email, call, or social media interaction;
- receive audio of the first user originating from audio content that does not involve the second user; and
- convert the audio of the first user into text using a language model based on the personal vocabulary of the second user and the connection between the first user and the second user, where the audio of the first user includes at least part of the one or more subject matter keywords.
Dependent claims: 17, 18, 19, 20
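The email trigger recited in the independent claims can be sketched roughly as follows. The `Email` record, its field names, the substring test for a user named in the text, and the idea of checking keywords against first-pass transcript words are all assumptions made for illustration; the claims do not prescribe a data layout or a matching algorithm. Keywords are assumed to be stored in lower case.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    """Hypothetical record of one monitored email."""
    sender: str
    recipients: list
    body: str
    keywords: set = field(default_factory=set)  # subject matter keywords


def involves(email, user):
    """True if the user is the sender, a recipient, or named in the text."""
    return (
        user == email.sender
        or user in email.recipients
        or user.lower() in email.body.lower()
    )


def find_connection(email, first_user, second_user):
    """Return the email's keywords if it links both users, else None."""
    if (
        email.keywords
        and involves(email, first_user)
        and involves(email, second_user)
    ):
        return set(email.keywords)
    return None


def should_use_vocabulary(connection_keywords, transcript_words):
    """Use the second user's vocabulary only when the first user's audio
    touches at least one keyword of the connection."""
    if not connection_keywords:
        return False
    spoken = {w.lower() for w in transcript_words}
    return bool(connection_keywords & spoken)


# Illustrative data only.
mail = Email("alice", ["bob"], "notes on codec tuning", {"codec"})
link = find_connection(mail, "alice", "bob")
```

Here `link` holds the shared keywords, and `should_use_vocabulary(link, ["the", "Codec", "stalls"])` would indicate that the second user's personal vocabulary is relevant to the first user's audio even though the second user is not part of that audio.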
Specification