Distributed speech recognition system with multi-user input stations

US 6,308,158 B1
Filed: 06/30/1999
Issued: 10/23/2001
Est. Priority Date: 06/30/1999
Status: Expired due to Term

10 Assignments

1 Petition

Abstract

A central dictation system includes a central server computer and a plurality of voice input stations connected to the server computer. Speaker-dependent speech recognition capabilities are provided at each voice input station, and any authorized user may use any station. When a user logs on to one of the stations, acoustic reference files to be used in recognizing the speech of the user are downloaded from the server to the station onto which the user has logged on. The downloaded acoustic reference files are used to perform the speech recognition at the voice input station. A high bandwidth signal generated at the voice input station and used for speech recognition processing is transcoded at the input station to form a low bandwidth speech signal. The low bandwidth speech signal and the text document resulting from the speech recognition process are uploaded to the server computer.
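
The log-on behavior described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the class names `ProfileServer` and `VoiceInputStation`, the user identifier, and the byte string standing in for acoustic reference files are all hypothetical, and a real system would transfer the files over a network rather than through an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProfileServer:
    """Hypothetical central server: one set of acoustic reference files per user."""
    profiles: Dict[str, bytes] = field(default_factory=dict)

    def download(self, user_id: str) -> bytes:
        # Raises KeyError if no files have been stored for this user.
        return self.profiles[user_id]


@dataclass
class VoiceInputStation:
    """Hypothetical station: holds a user's files only while that user is logged on."""
    server: ProfileServer
    user_id: Optional[str] = None
    acoustic_profile: Optional[bytes] = None

    def log_on(self, user_id: str) -> None:
        # Fetch the speaker-dependent files for whoever just logged on.
        self.acoustic_profile = self.server.download(user_id)
        self.user_id = user_id

    def log_off(self) -> None:
        self.user_id = None
        self.acoustic_profile = None


# Assumed example data: the same user moves between two stations.
server = ProfileServer(profiles={"author42": b"acoustic-reference-files"})
station_a = VoiceInputStation(server)
station_b = VoiceInputStation(server)

station_a.log_on("author42")
station_a.log_off()
station_b.log_on("author42")  # the files follow the user to another station
```

Keeping the files on the server and copying them to a station at log-on is what lets any authorized user work at any station while recognition itself still runs locally.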

20 Claims

1. A method of operating a document creation system, the system including a plurality of voice input stations and a server computer connected to exchange data signals with the voice input stations, the method comprising the steps of:
- storing speech recognition software at one of the voice input stations;
- logging on to said one of the voice input stations;
- placing said one of the voice input stations in a training mode for training a speech recognition algorithm included in the stored speech recognition software;
- dictating into said one of the voice input stations to generate speech signals;
- analyzing the speech signals at the voice input station using the stored speech recognition software in the training mode to generate acoustic reference files applicable to a particular user; and
- uploading the acoustic reference files from the voice input station to the server computer.

2. A method according to claim 1, wherein the logging-on step includes inputting I.D. data for identifying a person who is performing the logging-on step; and further comprising the step of uploading the I.D. data to the server together with the acoustic reference files.
3. A method according to claim 2, further comprising the steps of:
4. A method according to claim 3, wherein said second speech signals are digital signals generated at a first data rate, the method further comprising the step of transcoding the second speech signals to form transcoded speech signals which have a second data rate which is lower than the first data rate.
5. A method according to claim 4, wherein the second data rate is not more than one-tenth the first data rate.
6. A method according to claim 4, further comprising the step of uploading the transcoded speech signals, the text document data and the author I.D. data from said second one of the voice input stations to the server computer.
7. A method according to claim 6, wherein the document creation system further includes a plurality of document review stations, and the method further comprises downloading the uploaded transcoded speech signals, text document data and author I.D. data from the server computer to one of the document review stations.
8. A method according to claim 7, further comprising the steps of:
- reviewing the downloaded speech signals and text document data at the document review stations;
- correcting the downloaded text document data at the document review stations; and
- uploading the corrected text document data to the server computer.
9. A method according to claim 8, further comprising the step of updating the acoustic reference files in the server computer on the basis of said step of correcting the text document data at the document review stations.
10. A method according to claim 3, further comprising the steps of uploading the text document data from said second one of the voice input stations to the server computer.
11. A method according to claim 1, wherein each of the voice input stations comprises a personal computer and a microphone interfaced to the personal computer.
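
The training and upload steps of claims 1 and 2 can be sketched as follows. This is only an illustration of the data flow under stated assumptions: `Server`, `train`, and `enroll` are hypothetical names, and `train` is a placeholder that joins the dictated utterances instead of fitting real speaker-specific acoustic models.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Server:
    """Hypothetical server: maps I.D. data to acoustic reference files."""
    acoustic_files: Dict[str, bytes] = field(default_factory=dict)

    def upload_profile(self, user_id: str, files: bytes) -> None:
        self.acoustic_files[user_id] = files


def train(speech_signals: List[bytes]) -> bytes:
    """Placeholder for training-mode analysis at the station.

    A real recognizer would derive speaker-specific models; this only joins
    the utterances so the flow from dictation to upload is visible.
    """
    return b"|".join(speech_signals)


def enroll(server: Server, user_id: str, speech_signals: List[bytes]) -> None:
    files = train(speech_signals)          # analysis happens at the station
    server.upload_profile(user_id, files)  # I.D. data travels with the files


# Assumed example data.
server = Server()
enroll(server, "author42", [b"utterance-1", b"utterance-2"])
```

Uploading the I.D. data together with the files is what allows the server to return the right files when the same person later logs on elsewhere.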

12. A method of generating a text document, comprising the steps of:
- logging on to a voice input station connected to a server computer;
- downloading acoustic reference files from the server computer to the voice input station, the downloaded acoustic reference files having been generated in a speech recognition training mode and being applicable to a particular author;
- dictating into the voice input station to generate a voice data file; and
- applying a speech recognition algorithm to the voice data file at the voice input station by using the downloaded acoustic reference files to generate a text document.

13. A method according to claim 12, further comprising the step of uploading the text document to the server computer.
14. A method according to claim 13, wherein the voice data file is generated at a first data rate at said dictating step, and further comprising the steps of:

15. A method of generating a text document, comprising the steps of:
- dictating into a voice input station to generate a voice data file, the voice input station connected to a server computer, the voice data file being generated at a first data rate;
- applying a speech recognition algorithm to the voice data file at the voice input station to generate a text document;
- transcoding the voice data file at the voice input station from the first data rate to a second data rate which is lower than the first data rate; and
- uploading the text document and the transcoded voice data file from the voice input station to the server computer.
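
The order of operations in claim 15 (recognize at the full data rate, transcode at the station, then upload both results) can be sketched as below. The function names are hypothetical, and `decimate` is a toy stand-in for transcoding: it simply keeps every tenth sample, whereas a practical system would use a proper speech codec with filtering. The recognizer and uploader are passed in as callables because the claim does not specify them.

```python
from typing import Callable, List


def decimate(samples: List[int], factor: int) -> List[int]:
    """Toy stand-in for transcoding: keep every `factor`-th sample.

    This lowers the data rate but is not a real speech codec.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return samples[::factor]


def dictate_and_upload(
    samples: List[int],
    recognize: Callable[[List[int]], str],
    upload: Callable[[str, List[int]], None],
    factor: int = 10,
) -> None:
    text = recognize(samples)             # recognition uses the full-rate audio
    low_rate = decimate(samples, factor)  # transcoding happens at the station
    upload(text, low_rate)                # text and low-rate audio go to the server


# Assumed example data: 100 samples and a stub recognizer.
uploaded = []
full_rate = list(range(100))
dictate_and_upload(
    full_rate,
    recognize=lambda s: "dictated report text",
    upload=lambda text, audio: uploaded.append((text, audio)),
)
```

Running recognition before transcoding means the recognizer sees the higher-quality signal, while only the smaller file has to cross the network and be stored.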

16. A central dictation system comprising:
- a server computer;
- a plurality of voice input stations; and
- a data communication network connecting the voice input stations to the server computer;
- wherein the voice input stations are programmed to generate text documents by performing speech recognition with respect to speech signals inputted into the voice input stations, said speech recognition being performed by using acoustic reference files downloaded to the voice input stations from the server computer, the downloaded acoustic reference files having been generated in a speech recognition training mode and being applicable to a particular author.

17. A central dictation system according to claim 16, further comprising a plurality of document review stations connected to said server computer by said data communication network; and
18. A central dictation system according to claim 17, wherein said voice input stations are further programmed to transcode said speech signals inputted into the voice input stations from a first data rate to a second data rate which is lower than the first data rate, the transcoded speech signals being uploaded from the voice input stations to the server computer and being downloaded from the server computer to the document review stations.
19. A central dictation system according to claim 18, wherein said first data rate is at least ten times higher than said second data rate.
20. A central dictation system according to claim 16, wherein each of said voice input stations includes a personal computer and a hand microphone connected to the personal computer.
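
The data-rate relationship in claims 5 and 19 (the first rate at least ten times the second) reduces to simple arithmetic, sketched here. The sample rate, bit depth, and low-rate figures are illustrative assumptions, not values stated in the claims, and the helper names are hypothetical.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def satisfies_tenfold_reduction(first_rate: int, second_rate: int) -> bool:
    """True when the first rate is at least ten times the second rate."""
    return first_rate >= 10 * second_rate


# Assumed figures: 16 kHz, 16-bit mono capture gives 256,000 bit/s.
first = pcm_bit_rate(16_000, 16)

ok = satisfies_tenfold_reduction(first, 16_000)       # 256,000 >= 160,000
too_high = satisfies_tenfold_reduction(first, 64_000)  # 256,000 < 640,000
```

Under these assumed figures a 16,000 bit/s transcoded stream meets the limit and a 64,000 bit/s stream does not.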

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Dictaphone Corporation (Microsoft Corporation)
Inventors: Larossa-Greene, Channell; Kuhnen, Regina; Howes, Simon L.
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): McFadden, Susan Iris
Application Number: US09/345,588
Time in Patent Office: 846 Days
Field of Search: 704/246, 704/251, 704/270, 704/276, 704/275, 704/235, 345/329, 707/530
US Class Current: 704/275
CPC Class Codes:
- G10L 15/063 Training
- G10L 15/30 Distributed recognition, e....
