Network application software services containing a speech recognition capability

US 6,434,526 B1
Filed: 06/29/1998
Issued: 08/13/2002
Est. Priority Date: 06/29/1998
Status: Expired due to Term

Abstract

Speech recognition software is provided in combination with application specific software on a communications network. Analog voice data is digitized at a user's location, identified as voice data, and transmitted to the application software residing at a central location. The network server receiving data identified as voice data transmits it to a speech server. Speech recognition software resident at the speech server contains a dictionary and modules tailored to the voice of each of the users of the speech recognition software. As the user speaks, a translation of the dictation is transmitted back to the user's location and appears in print on the user's computer screen for examination and if necessary, voice or typed correction of its contents. Multiple users have interleaved access to the speech recognition software so that transmission back to each of the users is contemporaneous.
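The tagging and hand-off described in the abstract can be illustrated with a minimal sketch. The patent does not specify a wire format, so the JSON header, the field names (`kind`, `client_id`, `user_id`), and the helpers `frame`, `unframe`, and `dispatch` are all assumptions made for illustration only.

```python
import json
import struct


def frame(kind: str, client_id: str, user_id: str, payload: bytes) -> bytes:
    """Client side: prefix the payload with linkage information.

    The header layout (4-byte big-endian length + JSON) is hypothetical.
    """
    header = json.dumps(
        {"kind": kind, "client_id": client_id, "user_id": user_id}
    ).encode("utf-8")
    return struct.pack(">I", len(header)) + header + payload


def unframe(message: bytes):
    """Split a framed message back into (linkage header, payload)."""
    (header_len,) = struct.unpack(">I", message[:4])
    header = json.loads(message[4:4 + header_len].decode("utf-8"))
    return header, message[4 + header_len:]


def dispatch(message: bytes) -> str:
    """Network-server side: send speech data to the speech server,
    everything else to the application software."""
    header, _ = unframe(message)
    return "speech_server" if header["kind"] == "speech" else "application_server"
```

In this sketch the network server never inspects the audio itself; it routes purely on the `kind` field, which mirrors the abstract's statement that data "identified as voice data" is passed on to the speech server.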

15 Claims

1. A speech recognition system for network provided services comprising:
- a private network permitting access to only network subscribers, which private network contains an application software service application specific software, located at a central location, that is provided through a network server to multiple independent clients at various client locations, each client location connected to the network through a client computer;
  
  a first transducer in each client location for transduction of voice messages by users of the computers;
  
  a second transducer in the client location for transcription of digital print character message by users of the computers;
  
  analog to digital data conversion software in each client location for converting analog voice messages into digital speech data and combining the digital speech data with linkage information identifying the digital data as speech data and the client and user source of the digital speech data;
  
  a data link connecting the client computer to a network server at the central location to transfer digital data to the network;
  
  a speech server containing speech recognition software at the central location for the simultaneous receipt and handling of the digital speech data from the client locations and converting the digital speech data to digital data representative of print messages, said speech recognition software including a speech engine, a speech library and a speech template for each of the users of the speech recognition software;
  
  speech manager at the central location for providing speech services to multiple users at the same time by dividing digital speech data from the various client locations into packets at the central location on the basis of the traffic pattern of users trying to use the speech server, each packet containing digital speech data and an ID identifying the user said speech manager interleaving the packets containing the digital speech data from multiple simultaneous users and transferring the packets to the speech engine so that the packets from users share the speech engine on a first in first out basis;
  
  decoding circuitry at the central location responsive to the linkage information in the packets to provide the speech data to the speech engine and to select the proper speech template for the user to be used with the speech library in converting digital speech data to digital data representative of print; and
  
  routing circuitry using the linkage information to direct the printed message data to an appropriate client computer location so that the independent clients can be provided with the textual transmission of their voice messages.
- 2. The speech recognition system of claim 1 including separate internal databases in the central location in the private network for each of the clients of the private network.
- 3. The speech recognition system of claim 2 including an internet connection to the private network allowing only network subscribers with an appropriate id and password to enter the private network from the internet.
- 4. The speech recognition system of claim 1, wherein said speech manager includes software registers arranged in stages for transmission of the packets of digital speech data to the speech engine on a first in first out basis.
- 5. The speech recognition system of claim 4, wherein the speech data in the packets is provided as compressed data to the first register stage.
- 6. The speech recognition system of claim 5 including decompression logic for receiving compressed data from the last register stage and providing uncompressed data to the speech engine.
- 7. The speech recognition system of claim 6 including means for using the speech pattern of the user to check the verbal entry of an ID and password to determine if user is a valid user.
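One way to picture the speech manager of claims 1 and 4 through 6 is the sketch below. It is not the patented implementation: a `collections.deque` stands in for the staged software registers, `zlib` stands in for the unspecified compression scheme, and the names `Packet`, `packetize`, `interleave`, and `run_engine`, along with the `recognize` callback, are hypothetical.

```python
import zlib
from collections import deque
from dataclasses import dataclass
from itertools import zip_longest


@dataclass
class Packet:
    user_id: str   # ID identifying the user (the linkage information)
    data: bytes    # compressed digital speech data


def packetize(user_id: str, speech: bytes, size: int) -> list:
    """Divide one user's speech data into compressed, ID-tagged packets."""
    return [
        Packet(user_id, zlib.compress(speech[i:i + size]))
        for i in range(0, len(speech), size)
    ]


def interleave(streams: list) -> deque:
    """Take one packet from each user in turn and append it to a FIFO."""
    queue = deque()
    for round_of_packets in zip_longest(*streams):
        queue.extend(p for p in round_of_packets if p is not None)
    return queue


def run_engine(queue: deque, templates: dict, recognize) -> list:
    """Drain the FIFO: decompress each packet, select the user's speech
    template by ID, and collect (user_id, text) pairs for routing."""
    results = []
    while queue:
        packet = queue.popleft()                  # first in, first out
        speech = zlib.decompress(packet.data)     # decompress at the last stage
        template = templates[packet.user_id]      # template chosen via linkage info
        results.append((packet.user_id, recognize(speech, template)))
    return results
```

Because every result carries the `user_id` it came in with, the same identifier that selected the template can be reused to route the text back to the right client location.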

8. A process for use of speech recognition in services provided on a computer network by a network server to multiple unrelated clients comprising:
- providing analog voice and print character messages to digital data signals at client locations on the network;
  
  compressing the digital data at the client locations and transferring it to a central service location with linkage information that identifies the digital data as either print character or speech data and the client source of the data;
  
  decompressing the digital data identified as speech data at the central service location;
  
  providing at the central location a speech server with speech recognition software for the simultaneous translation of the decompressed digital speech data into print characters which software has a speech engine and personal speech templates of users to be combined with the core library of speech templates in converting the decoded digital speech data to digital data representative of printed messages;
  
  a plurality of other application software programs at the central location to be used with the software of the speech server by user selection on a computer screen at the client location;
  
  dividing the speech data into packets at the central service location on the basis of the traffic pattern of users trying to use the speech server at the central location;
  
  interleaving the packets of digital speech data from multiple simultaneous users and feeding the interleaved speech packets to the speech engine for converting the packets to text based data on a first in first out basis irrespective of client source to provide the simultaneous services to the clients;
  
  using the linkage information to access users personal speech templates; and
  
  combining the converted print character data with destination information at the central location and returning them to the appropriate client location on a first in first out basis.
- 9. The process of claim 8 including providing separate databases in the central location for each of the clients of the network.
- 10. The process of claim 8 including allowing only network subscribers with an appropriate ID and password to enter the network.

11. Apparatus for use in providing speech recognition services on a network at a central location to multiple unrelated clients comprising:
- a speech server containing speech recognition software at the central location for the simultaneous receipt and handling of the digital speech data from different client locations and converting the digital speech data to digital data representative of print messages, said speech recognition software including a speech engine, a speech library and a speech template for each of the users of the speech recognition software;
  
  speech manager at the central location for providing speech services to multiple users at the same time by dividing digital speech data from the various client locations into packets at the central location on the basis of the number of users trying to use the speech server at the central location to increase and decrease the number of simultaneously serviced speech servers based on the traffic pattern each packet containing digital speech data and an ID identifying the user, said speech manager interleaving the packets containing the digital speech data from multiple simultaneous users and providing the packets to the speech engine on a first in first out basis;
  
  decoding circuitry at the central location responsive to the linkage information in the packets to provide the speech data to the speech engine and to select the proper speech template for the user to be used with the speech library in converting digital speech data to digital data representative of print; and
  
  routing circuitry using the linkage information to direct the printed message data to an appropriate client computer location so that the independent clients can be provided with the textual transmission of their voice messages.
- 12. The apparatus of claim 11, wherein said speech manager includes software registers arranged in stages for transmission of the packets of digital speech data to the speech engine on a first in first out basis.
- 13. The apparatus of claim 12, wherein the speech data in the packets is compressed data at the first stage of the registers.
- 14. The apparatus of claim 13 including decompression logic for receiving compressed data from the last stage of the software registers and providing uncompressed data to the speech engine.
- 15. The speech recognition system of claim 14 including means for using the speech pattern of the user to check on verbal entry of an ID and password to determine if user is a valid user.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Miller, Roger Matthew; Cilurzo, Frank
Primary Examiner(s): Šmits, Tālivaldis Ivars
Application Number: US09/107,568
Time in Patent Office: 1,506 Days
Field of Search: 704/235, 704/270, 704/270.1
US Class Current: 704/270.1
CPC Class Codes:
G10L 15/30 Distributed recognition, e....
G10L 2015/228 of application context
