Distributed speech recognition system with multi-user input stations
Abstract
A central dictation system includes a central server computer and a plurality of voice input stations connected to the server computer. Speaker-dependent speech recognition capabilities are provided at each voice input station, and any authorized user may use any station. When a user logs on to one of the stations, acoustic reference files to be used in recognizing the speech of the user are downloaded from the server to the station onto which the user has logged on. The downloaded acoustic reference files are used to perform the speech recognition at the voice input station. A high-bandwidth signal generated at the voice input station and used for speech recognition processing is transcoded at the input station to form a low-bandwidth speech signal. The low-bandwidth speech signal and the text document resulting from the speech recognition process are uploaded to the server computer.
20 Claims
1. A method of operating a document creation system, the system including a plurality of voice input stations and a server computer connected to exchange data signals with the voice input stations, the method comprising the steps of:
storing speech recognition software at one of the voice input stations;
logging on to said one of the voice input stations;
placing said one of the voice input stations in a training mode for training a speech recognition algorithm included in the stored speech recognition software;
dictating into said one of the voice input stations to generate speech signals;
analyzing the speech signals at the voice input station using the stored speech recognition software in the training mode to generate acoustic reference files applicable to a particular user; and
uploading the acoustic reference files from the voice input station to the server computer.
second logging on to a second one of the voice input stations, said second logging-on step including inputting author I.D. data for identifying an author who is performing the second logging-on step, said author being the person who performed the logging-on step referred to in claim 2;
transmitting to the server computer the author I.D. data inputted in said second logging-on step;
in response to said transmitting step, downloading from the server computer to the second of the voice input stations the acoustic reference files uploaded to the server computer in said uploading step;
dictating into said second one of the voice input stations to generate second speech signals; and
applying a speech recognition algorithm to the second speech signals at the second one of the voice input stations by using the downloaded acoustic reference files, to generate text document data from the second speech signals.
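The steps above, read together with claim 1, describe a profile that follows the author: it is trained at one station, uploaded, and downloaded by author I.D. at another. The sketch below pictures that sequence with in-memory objects. The names `Server`, `VoiceInputStation`, `log_on`, and `train`, and the byte string standing in for the acoustic reference files, are hypothetical; the claims do not specify a data format or a protocol.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Server:
    """Hypothetical server: holds acoustic reference files keyed by author I.D."""
    profiles: dict = field(default_factory=dict)

    def upload_profile(self, author_id: str, reference_files: bytes) -> None:
        self.profiles[author_id] = reference_files

    def download_profile(self, author_id: str) -> bytes:
        return self.profiles[author_id]


@dataclass
class VoiceInputStation:
    """Hypothetical station: trains locally or uses a downloaded profile."""
    name: str
    server: Server
    author_id: Optional[str] = None
    reference_files: Optional[bytes] = None

    def log_on(self, author_id: str) -> None:
        # The author I.D. is sent to the server; if a profile exists, it is downloaded.
        self.author_id = author_id
        self.reference_files = None
        if author_id in self.server.profiles:
            self.reference_files = self.server.download_profile(author_id)

    def train(self, speech_signals: bytes) -> None:
        # Placeholder for real training: no acoustic analysis is performed here.
        if self.author_id is None:
            raise RuntimeError("log on before training")
        self.reference_files = f"profile-for-{self.author_id}".encode()
        self.server.upload_profile(self.author_id, self.reference_files)


server = Server()
station_a = VoiceInputStation("A", server)
station_a.log_on("author-1")
station_a.train(b"dictated training speech")

station_b = VoiceInputStation("B", server)
station_b.log_on("author-1")  # same author, different station
```

After the second log-on, `station_b.reference_files` holds the same bytes that `station_a` uploaded, which is the point of keeping the files on the server.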
4. A method according to claim 3, wherein said second speech signals are digital signals generated at a first data rate, the method further comprising the step of transcoding the second speech signals to form transcoded speech signals which have a second data rate which is lower than the first data rate.
5. A method according to claim 4, wherein the second data rate is not more than one-tenth the first data rate.
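The one-tenth limit in claim 5 is simple arithmetic on the two data rates. The sketch below checks it for illustrative figures that are assumptions, not values taken from the claims: 16-bit mono PCM sampled at 11,025 Hz as the first rate, and a 13,000 bit/s low-rate speech codec as the second. The function names are hypothetical.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def is_at_most_one_tenth(first_rate_bps: int, second_rate_bps: int) -> bool:
    """True when the second rate is not more than one-tenth of the first."""
    # Integer form of: second_rate <= first_rate / 10
    return 10 * second_rate_bps <= first_rate_bps


first_rate = pcm_bit_rate(11_025, 16)   # assumed recognition-quality input
second_rate = 13_000                    # assumed low-rate codec output

print(first_rate)                                     # 176400
print(is_at_most_one_tenth(first_rate, second_rate))  # True
print(is_at_most_one_tenth(first_rate, 64_000))       # False
```

With these assumed figures the ratio is roughly 13.6 to 1, so the limit is met; a 64,000 bit/s stream would not meet it.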
6. A method according to claim 4, further comprising the step of uploading the transcoded speech signals, the text document data and the author I.D. data from said second one of the voice input stations to the server computer.
7. A method according to claim 6, wherein the document creation system further includes a plurality of document review stations, and the method further comprises downloading the uploaded transcoded speech signals, text document data and author I.D. data from the server computer to one of the document review stations.
8. A method according to claim 7, further comprising the steps of:
reviewing the downloaded speech signals and text document data at the document review stations;
correcting the downloaded text document data at the document review stations; and
uploading the corrected text document data to the server computer.
9. A method according to claim 8, further comprising the step of updating the acoustic reference files in the server computer on the basis of said step of correcting the text document data at the document review stations.
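Claim 9 ties the update of the acoustic reference files to the reviewer's corrections but does not say how the update is computed. One plausible first step, shown here purely as an assumption, is to work out which words the reviewer changed. The sketch uses Python's standard `difflib.SequenceMatcher` on word lists; `corrected_word_pairs` is a hypothetical helper, and the step that would actually adapt the acoustic reference files is not shown.

```python
import difflib


def corrected_word_pairs(recognized: str, corrected: str) -> list:
    """Return (recognized_words, corrected_words) for each span the reviewer changed."""
    rec_words = recognized.split()
    cor_words = corrected.split()
    matcher = difflib.SequenceMatcher(a=rec_words, b=cor_words, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Covers replacements, deletions (empty right side),
            # and insertions (empty left side).
            pairs.append((" ".join(rec_words[i1:i2]), " ".join(cor_words[j1:j2])))
    return pairs


pairs = corrected_word_pairs("send the male today", "send the mail today")
print(pairs)  # [('male', 'mail')]
```

Each pair marks a stretch of audio whose recognized text was wrong, which is the kind of evidence a training routine could use.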
10. A method according to claim 3, further comprising the steps of uploading the text document data from said second one of the voice input stations to the server computer.
11. A method according to claim 1, wherein each of the voice input stations comprises a personal computer and a microphone interfaced to the personal computer.
12. A method of generating a text document, comprising the steps of:
logging on to a voice input station connected to a server computer;
downloading acoustic reference files from the server computer to the voice input station, the downloaded acoustic reference files having been generated in a speech recognition training mode and being applicable to a particular author;
dictating into the voice input station to generate a voice data file; and
applying a speech recognition algorithm to the voice data file at the voice input station by using the downloaded acoustic reference files to generate a text document.
transcoding the voice data file from the first data rate to a second data rate which is lower than the first data rate; and
uploading the transcoded voice data file to the server computer.
15. A method of generating a text document, comprising the steps of:
dictating into a voice input station to generate a voice data file, the voice input station connected to a server computer, the voice data file being generated at a first data rate;
applying a speech recognition algorithm to the voice data file at the voice input station to generate a text document;
transcoding the voice data file at the voice input station from the first data rate to a second data rate which is lower than the first data rate; and
uploading the text document and the transcoded voice data file from the voice input station to the server computer.
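The last two steps of claim 15 can be illustrated with a deliberately crude stand-in for the transcoder. The sketch keeps every fifth 16-bit sample and reduces it to 8 bits, which cuts the data rate by a factor of ten. This is an assumption made for illustration only: the claim does not name a coding scheme, a practical system would use a proper speech codec, and plain decimation without low-pass filtering introduces aliasing. `naive_transcode` and `build_upload` are hypothetical names.

```python
import struct


def naive_transcode(pcm16: bytes, decimation: int = 5) -> bytes:
    """Keep every `decimation`-th little-endian 16-bit sample, reduced to 8 bits."""
    count = len(pcm16) // 2
    samples = struct.unpack(f"<{count}h", pcm16[: count * 2])
    kept = samples[::decimation]
    # Shift signed 16-bit (-32768..32767) into unsigned 8-bit (0..255).
    return bytes((s + 32768) >> 8 for s in kept)


def build_upload(text_document: str, pcm16: bytes) -> dict:
    """Bundle what the station sends to the server: text plus low-rate audio."""
    return {
        "text_document": text_document,
        "voice_data": naive_transcode(pcm16),
    }


# 1000 synthetic samples stand in for the high-rate voice data file.
voice_file = struct.pack("<1000h", *range(-500, 500))
upload = build_upload("recognized text", voice_file)
print(len(voice_file), len(upload["voice_data"]))  # 2000 200
```

The high-rate file is used for recognition at the station; only the smaller copy and the text travel to the server.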
16. A central dictation system comprising:
a server computer;
a plurality of voice input stations; and
a data communication network connecting the voice input stations to the server computer;
wherein the voice input stations are programmed to generate text documents by performing speech recognition with respect to speech signals inputted into the voice input stations, said speech recognition being performed by using acoustic reference files downloaded to the voice input stations from the server computer, the downloaded acoustic reference files having been generated in a speech recognition training mode and being applicable to a particular author.
wherein text documents created by said speech recognition performed by said voice input stations are uploaded from said voice input stations to said server computer and downloaded from said server computer to said document review stations.
18. A central dictation system according to claim 17, wherein said voice input stations are further programmed to transcode said speech signals inputted into the voice input stations from a first data rate to a second data rate which is lower than the first data rate, the transcoded speech signals being uploaded from the voice input stations to the server computer and being downloaded from the server computer to the document review stations.
19. A central dictation system according to claim 18, wherein said first data rate is at least ten times higher than said second data rate.
20. A central dictation system according to claim 16, wherein each of said voice input stations includes a personal computer and a hand microphone connected to the personal computer.
Specification