Methods and system for capturing voice files and rendering them searchable by keyword or phrase
First Claim
1. A system for capturing voice files and rendering them searchable, comprising:
(a) a database system having a plurality of grammars stored therein;
(b) at least one device that electronically captures audio speech for a conversation between two or more participants;
(c) a recorder coupled to said at least one device, the recorder capturing audio speech from the device for storage as audio speech data in said database system; and
(d) a speech recognition engine adapted to transcribe the audio speech data into machine-readable text data in a plurality of transcription passes using grammars selected from said plurality of stored grammars, and store the machine-readable text data as well as data associating the machine-readable text data with the corresponding audio speech data in the database system for subsequent retrieval by a search application;
wherein the speech recognition engine is adapted to select a grammar from said database system prior to performing a first transcription pass, the grammar for a first transcription pass selected on the basis of information pertaining to the subject matter or purpose of the conversation, and information pertaining to one or more of the participants, and further wherein the recognition engine is adapted to revise the machine-readable text data for the conversation by performing a subsequent transcription pass on the audio speech data using a grammar which was not used in the first transcription pass.
Abstract
A system for capturing voice files and rendering them searchable, comprising one or more devices capable of capturing audio speech electronically, a recorder coupled to the devices for retrieving audio speech, a controller coupled to the recorder, a recognition engine adapted to transcribe audio speech into text, and a database system is disclosed. In the system, the controller causes the recorder to capture audio speech from at least one of the devices, the recorder stores the audio speech as data in the database system, and the recognition engine subsequently retrieves the audio speech data, transcribes the audio speech data into text, and stores the text and data associating the text data with at least the audio speech data in the database system for subsequent retrieval by a search application.
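The storage and retrieval flow described in the abstract can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and a hypothetical two-table layout (`audio`, `transcript`); the table names, column names, and the `start_ms` offset used to tie text back to audio are illustrative assumptions, not details taken from the patent.

```python
import sqlite3


def open_store():
    """Create an in-memory database with hypothetical audio and transcript tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE audio (
            id INTEGER PRIMARY KEY,
            conversation TEXT NOT NULL,
            pcm BLOB NOT NULL
        );
        CREATE TABLE transcript (
            id INTEGER PRIMARY KEY,
            audio_id INTEGER NOT NULL REFERENCES audio(id),
            start_ms INTEGER NOT NULL,  -- assumed association: offset into the audio
            text TEXT NOT NULL
        );
        """
    )
    return db


def store_audio(db, conversation, pcm):
    """Store captured audio and return its row id."""
    cur = db.execute(
        "INSERT INTO audio (conversation, pcm) VALUES (?, ?)", (conversation, pcm)
    )
    return cur.lastrowid


def store_text(db, audio_id, segments):
    """Store (start_ms, text) segments linked to the audio they came from."""
    db.executemany(
        "INSERT INTO transcript (audio_id, start_ms, text) VALUES (?, ?, ?)",
        [(audio_id, start_ms, text) for start_ms, text in segments],
    )


def search(db, phrase):
    """Return (audio_id, start_ms, text) for segments containing the phrase."""
    rows = db.execute(
        "SELECT audio_id, start_ms, text FROM transcript "
        "WHERE instr(lower(text), lower(?)) > 0 "
        "ORDER BY audio_id, start_ms",
        (phrase,),
    )
    return rows.fetchall()
```

Each search hit carries the `audio_id` and offset, so a search application could jump from a matched keyword or phrase to the corresponding recording. A production system would more likely use a full-text index than a substring scan.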
17 Claims
1. A system for capturing voice files and rendering them searchable, comprising:
(a) a database system having a plurality of grammars stored therein;
(b) at least one device that electronically captures audio speech for a conversation between two or more participants;
(c) a recorder coupled to said at least one device, the recorder capturing audio speech from the device for storage as audio speech data in said database system; and
(d) a speech recognition engine adapted to transcribe the audio speech data into machine-readable text data in a plurality of transcription passes using grammars selected from said plurality of stored grammars, and store the machine-readable text data as well as data associating the machine-readable text data with the corresponding audio speech data in the database system for subsequent retrieval by a search application;
wherein the speech recognition engine is adapted to select a grammar from said database system prior to performing a first transcription pass, the grammar for a first transcription pass selected on the basis of information pertaining to the subject matter or purpose of the conversation, and information pertaining to one or more of the participants, and further wherein the recognition engine is adapted to revise the machine-readable text data for the conversation by performing a subsequent transcription pass on the audio speech data using a grammar which was not used in the first transcription pass.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-implemented method for capturing voice files and rendering them searchable, comprising the steps of:
(a) recording audio speech data for a conversation between two or more participants, said audio speech data obtained from at least one audio-capable device;
(b) storing the audio speech data in a database system;
(c) selecting and loading into a speech recognition engine a grammar selected from a plurality of stored grammars, wherein said grammar is selected prior to the transcribing step and is selected on the basis of information pertaining to the subject matter or purpose of the conversation, and the identity of one or more of the participants;
(d) transcribing the audio speech data into machine-readable text data using the speech recognition engine employing said grammar;
(e) creating at least one data element associating the machine-readable text data with the corresponding audio speech data;
(f) storing the machine-readable text data and the associated data element in a searchable database; and
(f) revising the machine-readable text data by performing a subsequent transcription pass on the audio speech data using another grammar which is different than the previously selected grammar.
Dependent claims: 13, 14, 15, 16, 17
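The grammar selection and revision steps recited above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the `Grammar` fields, the scoring rule in `select_grammar`, the stand-in `transcribe` function (which treats audio as a list of already-segmented tokens), and the merge policy in which the second pass only fills gaps left by the first. The claims require only that a later pass use a grammar not used earlier; they do not specify how results are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """Hypothetical grammar record: who and what it suits, and what it recognizes."""
    name: str
    topics: frozenset
    speakers: frozenset
    vocabulary: frozenset


def select_grammar(grammars, topic, participants, exclude=()):
    """Pick the best-matching grammar not yet used (assumed scoring rule)."""
    candidates = [g for g in grammars if g.name not in exclude]
    if not candidates:
        raise ValueError("no unused grammar available")

    def score(g):
        # Topic match counts double; each known participant adds one.
        return (topic in g.topics) * 2 + len(g.speakers & set(participants))

    return max(candidates, key=score)


def transcribe(audio_tokens, grammar):
    """Stand-in recognizer: a token is recognized only if the grammar covers it."""
    return [tok if tok in grammar.vocabulary else None for tok in audio_tokens]


def transcribe_with_revision(audio_tokens, grammars, topic, participants):
    """First pass with the selected grammar, then a revising pass with another."""
    first = select_grammar(grammars, topic, participants)
    text = transcribe(audio_tokens, first)

    second = select_grammar(grammars, topic, participants, exclude={first.name})
    second_pass = transcribe(audio_tokens, second)

    # Assumed merge policy: keep first-pass words, fill gaps from the second pass.
    revised = [a if a is not None else b for a, b in zip(text, second_pass)]
    return first.name, second.name, revised
```

In this sketch, a conversation about a mortgage with a known participant would first be transcribed with a finance-oriented grammar and then revised with a general one, so words outside the first grammar's vocabulary can still be recovered.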
Specification