Automatic language model update

US 20070233487A1
Filed: 04/03/2006
Published: 10/04/2007
Est. Priority Date: 04/03/2006
Status: Active Grant

Abstract

A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.

23 Claims

1. A method for generating a speech recognition model, comprising:
- accessing a baseline speech recognition model;
  
  obtaining information related to recent language usage from search queries; and
  
  modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information.
- Dependent claims (3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13):
  - 3. The method of claim 1, further comprising receiving from a remote device a verbal search term that is associated with a text search term using a recognizer implementing the speech recognition model.
  - 4. The method of claim 3, further comprising transmitting to the remote device search results associated with the text search term.
  - 5. The method of claim 3, further comprising transmitting to the remote device the text search term.
  - 6. The method of claim 1, further comprising receiving a verbal command for an application that is associated with a text command using a recognizer implementing the speech recognition model.
  - 7. The method of claim 6, wherein the speech recognition model assigns the verbal command to a sub-grammar slot.
  - 8. The method of claim 1, wherein the speech recognition model is a rule-based model, a statistical model, or both.
  - 9. The method of claim 1, further comprising transmitting to a remote device at least a portion of the modified speech recognition model.
  - 10. The method of claim 9, wherein the remote device accesses the portion of the modified speech recognition model to perform speech recognition functions.
  - 11. The method of claim 1, wherein the speech recognition model includes weightings for co-concurrence events between two or more words.
  - 12. The method of claim 1, wherein the speech recognition model includes weightings associated with when the search queries are received.
  - 13. The method of claim 1, wherein obtaining the information related to recent language includes generating word counts for each word.

2. The method of claim 1, wherein the portion of the sound comprises a word.
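As an illustration of claims 1, 2, and 13, the following minimal sketch treats the "portion of a sound" as a word, generates word counts from a batch of search queries, and revises baseline word probabilities by linear interpolation. The interpolation scheme, the function names, and the `mix` parameter are assumptions made for this example; the claims do not specify how the probabilities are revised.

```python
from collections import Counter


def word_counts(queries):
    """Count how often each word appears across a batch of search queries."""
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    return counts


def update_unigram_model(baseline, queries, mix=0.2):
    """Blend baseline word probabilities with recent query frequencies.

    `mix` (hypothetical) is the share of probability mass given to
    recent usage; the rest stays with the baseline model.
    """
    counts = word_counts(queries)
    total = sum(counts.values())
    if total == 0:
        return dict(baseline)  # no recent usage, keep the baseline as-is
    vocabulary = set(baseline) | set(counts)
    return {
        word: (1 - mix) * baseline.get(word, 0.0) + mix * counts[word] / total
        for word in vocabulary
    }


# Toy baseline that has never seen the word "podcast".
baseline = {"weather": 0.5, "news": 0.5}
queries = ["podcast news", "podcast"]
updated = update_unigram_model(baseline, queries, mix=0.2)
```

Because both the baseline and the query frequencies sum to one, the blended values still form a probability distribution, and a word absent from the baseline (here "podcast") gains nonzero probability.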

14. A method for generating a speech recognition model, comprising:
- receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording;
  
  synchronizing the transcript with the audio recording;
  
  extracting one or more letters from the transcript and extracting an associated pronunciation of the one or more letters from the audio recording; and
  
  generating a dictionary entry in a pronunciation dictionary.
- Dependent claims (15, 16, 17, 18, 19, 20, 21):
  - 15. The method of claim 14, wherein the audio recording and associated transcript are part of a video.
  - 16. The method of claim 14, wherein the remote device is a television transmitter.
  - 17. The method of claim 14, wherein the remote device is a personal computer.
  - 18. The method of claim 14, wherein the dictionary entry comprises the extracted one or more letters and the associated pronunciation.
  - 19. The method of claim 18, further comprising receiving verbal input that is identified by a recognizer that accesses the pronunciation dictionary.
  - 20. The method of claim 14, further comprising receiving multiple audio recordings and transcripts and separating the recordings and transcripts into training and test sets.
  - 21. The method of claim 20, further comprising applying a set of weightings to the association between one or more letters and the pronunciation in the training set and selecting a weight from the set that produces a greatest recognition accuracy when processing the test set.
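The weight selection described in claims 20 and 21 can be sketched as follows. This is a simplified stand-in, not the patented procedure: "recognition accuracy" is approximated by how often the predicted pronunciation matches the one observed in a held-out sample, the split rule is arbitrary, and every name, the ARPAbet-style pronunciation strings, and the scoring formula are assumptions for illustration.

```python
from collections import Counter, defaultdict


def split_samples(samples, test_every=4):
    """Send every `test_every`-th sample to the test set, the rest to training."""
    train, test = [], []
    for index, sample in enumerate(samples, start=1):
        (test if index % test_every == 0 else train).append(sample)
    return train, test


def count_pronunciations(train):
    """Count observed pronunciations per spelling in the training set."""
    freq = defaultdict(Counter)
    for letters, pron in train:
        freq[letters][pron] += 1
    return freq


def predict(letters, prior, freq, weight):
    """Pick the pronunciation with the best blend of prior and learned scores."""
    prior_scores = prior.get(letters, {})
    learned = freq.get(letters, Counter())
    total = sum(learned.values())
    candidates = sorted(set(prior_scores) | set(learned))
    if not candidates:
        return None

    def score(pron):
        learned_share = learned[pron] / total if total else 0.0
        return (1 - weight) * prior_scores.get(pron, 0.0) + weight * learned_share

    return max(candidates, key=score)


def accuracy(test, prior, freq, weight):
    """Fraction of test samples whose pronunciation is predicted correctly."""
    if not test:
        return 0.0
    hits = sum(predict(letters, prior, freq, weight) == pron for letters, pron in test)
    return hits / len(test)


def select_weight(weights, samples, prior):
    """Return the candidate weight with the highest test-set accuracy."""
    train, test = split_samples(samples)
    freq = count_pronunciations(train)
    return max(weights, key=lambda w: accuracy(test, prior, freq, w))


# Hypothetical data: the existing dictionary prefers one pronunciation,
# while the received recordings consistently use another.
prior = {"data": {"D EY T AH": 0.9, "D AE T AH": 0.1}}
samples = [("data", "D AE T AH")] * 8
best = select_weight([0.0, 0.5, 1.0], samples, prior)
```

With a weight of 0.0 the learned associations are ignored and the test samples are mispredicted; 0.5 is the first candidate that reaches full accuracy, so it is the one selected.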

22. The method of claim 14, wherein the dictionary entry includes weightings associated with when the transcript was received.
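One way to picture the time-based weighting of claim 22 is an exponential decay on the age of the transcript that produced each dictionary entry. The sketch below is an assumption throughout: the claim does not name a decay function, and `DictionaryEntry`, `received_day`, and `half_life_days` are hypothetical names introduced for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DictionaryEntry:
    letters: str          # spelling extracted from the transcript
    pronunciation: str    # pronunciation extracted from the audio
    received_day: int     # day index on which the transcript was received


def recency_weight(entry, today, half_life_days=30.0):
    """Halve an entry's weight for every `half_life_days` of age."""
    age = max(0, today - entry.received_day)
    return 0.5 ** (age / half_life_days)


def preferred_pronunciation(entries, letters, today):
    """Return the pronunciation with the largest total recency weight."""
    totals = defaultdict(float)
    for entry in entries:
        if entry.letters == letters:
            totals[entry.pronunciation] += recency_weight(entry, today)
    return max(totals, key=totals.get) if totals else None


entries = [
    DictionaryEntry("data", "D EY T AH", received_day=0),
    DictionaryEntry("data", "D EY T AH", received_day=0),
    DictionaryEntry("data", "D AE T AH", received_day=90),
]
choice = preferred_pronunciation(entries, "data", today=90)
```

In this toy data the two older entries together weigh less than the single recent one, so the recent pronunciation is preferred.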

22-1. A computer implemented method for transmitting verbal terms, comprising:
- transmitting search terms from a remote device to a server device, wherein the server device generates word occurrence data associated with the search terms and modifies a language model based on the word occurrence data.

23. A system for updating a language model, comprising:
- a request processor to receive search terms;
  
  an extractor for obtaining information related to recent language usage from the search terms; and
  
  means for modifying a language model to revise probabilities of a word occurrence based on the information.
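The three elements of claim 23 can be sketched as three small components wired together. This is only an illustrative arrangement under stated assumptions: the class and method names are hypothetical, and the language model is reduced to relative word frequencies over raw counts, which the claim does not require.

```python
from collections import Counter


class RequestProcessor:
    """Receives search terms and buffers them until they are processed."""

    def __init__(self):
        self.pending = []

    def receive(self, search_terms):
        self.pending.append(search_terms)

    def drain(self):
        """Hand over the buffered search terms and clear the buffer."""
        batch, self.pending = self.pending, []
        return batch


class Extractor:
    """Obtains recent-usage information (word counts) from search terms."""

    def extract(self, batch):
        counts = Counter()
        for search_terms in batch:
            counts.update(search_terms.lower().split())
        return counts


class LanguageModel:
    """Toy language model: word probability is its share of all counts."""

    def __init__(self, counts):
        self.counts = Counter(counts)

    def modify(self, recent_counts):
        """Revise word probabilities by folding in recent counts."""
        self.counts.update(recent_counts)

    def probability(self, word):
        total = sum(self.counts.values())
        return self.counts[word] / total if total else 0.0


processor = RequestProcessor()
extractor = Extractor()
model = LanguageModel({"weather": 3, "news": 1})

before = model.probability("news")
processor.receive("news today")
processor.receive("news")
model.modify(extractor.extract(processor.drain()))
after = model.probability("news")
```

Keeping the three roles separate mirrors the claim language: the request processor only collects terms, the extractor only summarizes them, and the model alone decides how probabilities change.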

23-2. The computer implemented method of claim 24, wherein the remote device is selected from a group consisting of a mobile telephone, a personal digital assistant, a desktop computer, and a mobile email device.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Shumeet Baluja, Pedro Moreno, Michael Cohen
Granted Patent: US 7,756,708 B2
US Class Current: 704/255
CPC Class Codes:
G10L 15/06   Creation of reference templ...
G10L 15/063   Training
G10L 15/065   Adaptation
G10L 15/187   Phonemic context, e.g. pron...
G10L 15/26   Speech to text systems G10L...
G10L 2015/0635   updating or merging of old ...
