Method to train the language model of a speech recognition system to convert and index voicemails on a search engine
First Claim
1. A method for training a language model of a speech recognition engine, the method comprising:
extracting textual content from email documents from an email document source;
extracting contact data from a contact from a contact source comprising contacts, the contact data comprising at least one of a person's name and email address;
forming a training set comprising the email documents, the email documents each having recipient or sender information that comprises the at least one of a person's name and email address, the forming a training set further comprises;
extracting metadata from the email documents;
providing manually created sentence templates;
creating new training sentences by filling the sentence templates using the metadata; and
adding the new training sentences to the training set; and
training the language model using the textual content from the email documents and the new training sentences in the training set to produce a speech recognition profile for the contact.
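The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the two templates, and the `{name}` slot are hypothetical, and a plain bigram count table stands in for whatever language model and "speech recognition profile" the actual system produces. It assumes emails are available as raw RFC 5322 text and uses only the Python standard library.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses

# Hypothetical manually created sentence templates; {name} is filled from metadata.
TEMPLATES = ["hi {name} please call me back", "this is a message for {name}"]


def parse_email(raw_email):
    """Return (body text, display names, addresses) from the From/To headers."""
    msg = message_from_string(raw_email)
    headers = msg.get_all("From", []) + msg.get_all("To", [])
    pairs = getaddresses(headers)
    names = [name for name, _addr in pairs if name]
    addresses = {addr.lower() for _name, addr in pairs if addr}
    return msg.get_payload().strip(), names, addresses


def build_training_set(raw_emails, contact_address):
    """Keep emails sent to or from the contact, then add template sentences."""
    sentences = []
    for raw in raw_emails:
        body, names, addresses = parse_email(raw)
        if contact_address.lower() not in addresses:
            continue  # email does not involve this contact
        sentences.append(body.lower())
        for name in names:
            sentences.extend(t.format(name=name.lower()) for t in TEMPLATES)
    return sentences


def train_bigram_counts(sentences):
    """Stand-in for language model training: count adjacent word pairs."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts


if __name__ == "__main__":
    raw = (
        "From: Alice Smith <alice@example.com>\n"
        "To: Bob Jones <bob@example.com>\n"
        "Subject: Lunch\n"
        "\n"
        "See you at noon\n"
    )
    training_set = build_training_set([raw], "bob@example.com")
    profile = train_bigram_counts(training_set)
    print(len(training_set), "training sentences,", len(profile), "distinct bigrams")
```

The template sentences matter because names that appear only in email headers would otherwise never occur in running text; filling them into templates gives the model contexts in which those names can be recognized.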
Abstract
A method and a related system to index voicemail documents by training a language model for a speaker or group of speakers by using existing emails and contact information on available repositories.
37 Citations
4 Claims
Specification