Speech recognition system with network accessible speech processing resources

US 6,785,647 B2
Filed: 04/20/2001
Issued: 08/31/2004
Est. Priority Date: 04/20/2001
Status: Active Grant

Abstract

A method of speech recognition including receiving speech signals into a front-end processor and storing at least some resources used for speech recognition in a network-attached server. The front-end processor is coupled to the network-attached server to perform the speech recognition.

28 Claims

1. A method of speech recognition comprising the acts of:
- receiving speech signals into a front-end processor;
- storing at least some speaker-dependent and/or speaker group-dependent resources used for speech recognition in a network-attached server, including resources that implement a mapping between the speech signals and tokens that have meaning to a voice enabled application;
- coupling the front-end processor to the network-attached server over a network to perform the speech recognition using the resources.
2. The method of claim 1 wherein the front-end processor comprises an untrained and/or minimally trained voice recognition system whereby the resources provided by the network-attached server provide training for the minimally trained voice recognition system.
3. The method of claim 1 further comprising identifying a speaker in the front-end processor.
4. The method of claim 1 wherein the stored resources comprise a speaker signature data structure comprising at least one speaker dependent voice model.
5. The method of claim 1 further comprising the acts of:
6. The method of claim 5 further comprising the acts of:
- returning a response from the network-attached server over the network to the front-end processor comprising phoneme probabilities corresponding to the speech signal.
7. The method of claim 5 further comprising the acts of:
- returning a response from the network-attached server to the front-end processor comprising text corresponding to the speech signal.
8. The method of claim 1 wherein the front-end processor comprises a trainable neural network and the stored resources comprise a neural network training file.
9. The method of claim 1 wherein the stored resources comprise voice signal:value pairs associating a particular value with each voice signal.
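
As an illustration only, the flow recited in claim 1 can be sketched in Python. Everything here is an assumption made for the sketch and does not come from the patent: the class names `ResourceServer` and `FrontEnd`, the toy `quantize` feature, and the in-process method call that stands in for the network. The claim does not prescribe any of these details.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def quantize(signal: List[float]) -> str:
    """Toy feature (hypothetical): the sign pattern of the samples, e.g. '101'."""
    return "".join("1" if sample >= 0 else "0" for sample in signal)


@dataclass
class ResourceServer:
    """Stand-in for the network-attached server; a real one would sit behind a network call."""
    # speaker id -> mapping from a signal key to a token meaningful to the application
    speaker_maps: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def fetch_mapping(self, speaker_id: str) -> Dict[str, str]:
        return self.speaker_maps.get(speaker_id, {})


@dataclass
class FrontEnd:
    """Receives speech signals and recognizes them using resources held by the server."""
    server: ResourceServer

    def recognize(self, speaker_id: str, signal: List[float]) -> Optional[str]:
        mapping = self.server.fetch_mapping(speaker_id)  # "coupling" step, simulated in process
        return mapping.get(quantize(signal))


server = ResourceServer({"alice": {"101": "OPEN_MAIL"}})
front_end = FrontEnd(server)
token = front_end.recognize("alice", [0.3, -0.2, 0.9])
```

The front end holds no speaker-specific data of its own in this sketch; a speaker unknown to the server simply yields no token.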

10. A speech recognition server comprising:
- a network interface configured to receive a request from an external entity;
- an identification of a speaker associated with each request;
- speaker-dependent signature data structures stored in the speech recognition server; and
- means for generating a response including speaker-dependent voice recognition resources in response to the received request, wherein the voice recognition resources are used by the external entity to perform speech recognition.
11. The server of claim 10 further comprising encoded speech signals within each request.
12. The server of claim 11 wherein the means for generating a response comprises a neural network operable to receive the encoded speech signals and generate an output comprising values representing the language-based content of the encoded speech signals.
13. The server of claim 11 wherein the response includes phoneme probabilities corresponding to the speech signal.
14. The server of claim 11 wherein the response includes text corresponding to the speech signal.
15. The server of claim 10 further comprising speaker group signature data structures stored in the speech recognition server, wherein the means for generating a response includes the speaker group resources when speaker-dependent resources are not available.
16. The server of claim 10 wherein the means for generating a response generates a response including a voice model corresponding to the identified speaker.
17. The server of claim 10 further comprising:
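
The lookup behavior described in claims 10 and 15 (speaker-dependent resources first, speaker group resources when none are available) might be sketched as follows. This is a minimal illustration under stated assumptions: the class name `SpeechResourceServer`, the dictionary layout, and the response fields are hypothetical, and the network interface is left out entirely.

```python
from typing import Dict, Optional


class SpeechResourceServer:
    """Toy in-memory stand-in for the claimed server (hypothetical layout)."""

    def __init__(self) -> None:
        self.speaker_signatures: Dict[str, dict] = {}  # speaker id -> signature data
        self.group_signatures: Dict[str, dict] = {}    # group id -> signature data
        self.group_of: Dict[str, str] = {}             # speaker id -> group id

    def handle_request(self, speaker_id: str) -> dict:
        """Build a response for the speaker identified in the request."""
        signature: Optional[dict] = self.speaker_signatures.get(speaker_id)
        source = "speaker"
        if signature is None:
            # No speaker-dependent resources: fall back to the speaker's group, if any.
            group = self.group_of.get(speaker_id)
            signature = self.group_signatures.get(group) if group else None
            source = "group" if signature is not None else "none"
        return {"speaker": speaker_id, "source": source, "resources": signature}


server = SpeechResourceServer()
server.speaker_signatures["alice"] = {"voice_model": "alice-v1"}
server.group_signatures["en-US-adult"] = {"voice_model": "group-v3"}
server.group_of["bob"] = "en-US-adult"
```

Here `handle_request("alice")` returns her own signature, while `handle_request("bob")` falls back to the group entry because no speaker-dependent signature is stored for him.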

18. A speech recognition system comprising:
- a centralized resource of shared, speaker-dependent speech recognition resources;
- two or more applications having processes for receiving a voice signal and communicating with the centralized resource over a network to perform speech recognition on the voice signal using the speech recognition resources from the centralized resource.
19. The speech recognition system of claim 18 wherein the speaker-dependent speech recognition resources comprise voice models of individual speakers.
20. The speech recognition system of claim 18 wherein the speaker-dependent speech recognition resources comprise voice models of groups of speakers.
21. The speech recognition system of claim 18 wherein the speaker-dependent speech recognition resources comprise feature extraction processes.
22. The speech recognition system of claim 21 wherein the speaker-dependent speech recognition resources comprise processes operable to associate linguistic tokens with the extracted features.
23. The speech recognition system of claim 18 wherein the speaker-dependent speech recognition resources comprise speech samples for individual speakers.
24. The speech recognition system of claim 18 further comprising:

25. A speech-enabled software application comprising:
- a first interface for receiving a voice signal from a speaker;
- a second interface for sending the voice signal over a network to a centralized speech recognition server;
- a third interface for receiving phoneme probabilities from the speech recognition server corresponding to the voice signal; and
- processes for using the phoneme probabilities to launch speech-enabled functions for the speaker.
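
Claim 25 does not say how phoneme probabilities are turned into function calls. One possible reading, offered only as a sketch, is a greedy decode followed by a command lookup. The per-frame dictionary format, the function names `decode` and `dispatch`, the phoneme labels, and the command table are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Frame = Dict[str, float]  # phoneme label -> probability for one time slice (assumed format)


def decode(frames: List[Frame]) -> Tuple[str, ...]:
    """Greedy decode: take the most probable phoneme per frame and merge adjacent repeats."""
    phonemes: List[str] = []
    for probs in frames:
        best = max(probs, key=probs.get)
        if not phonemes or phonemes[-1] != best:
            phonemes.append(best)
    return tuple(phonemes)


def dispatch(frames: List[Frame],
             commands: Dict[Tuple[str, ...], Callable[[], str]]) -> Optional[str]:
    """Launch the function registered for the decoded phoneme sequence, if there is one."""
    handler = commands.get(decode(frames))
    return handler() if handler else None


# Hypothetical command table for a speech-enabled application.
commands = {("k", "ao", "l"): lambda: "placing call"}

frames = [
    {"k": 0.9, "ao": 0.1},
    {"k": 0.6, "ao": 0.4},
    {"ao": 0.8, "l": 0.2},
    {"l": 0.7, "ao": 0.3},
]
result = dispatch(frames, commands)
```

Merging adjacent repeats is a simplification: a word containing the same phoneme twice in a row would lose one of them, so a real application would likely need a more careful decoder.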

26. A method of creating a speech sample database comprising:
- accepting a voice recognition task at an application;
- communicating the voice recognition task to a centralized resource;
- performing the voice recognition task at the centralized resource;
- causing the application to evaluate correctness of the voice recognition; and
- storing the speech sample from the task with its recognition result in a speech sample database.
27. The method of claim 26 further comprising storing context information metadata describing the speaker with an association to the sample.
28. The method of claim 26 further comprising providing speech samples meeting specified criteria to external entities.
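
The steps of claims 26 to 28 can be sketched with Python's standard `sqlite3` module. The table schema, the function names, and the callables `recognize` and `is_correct` (standing in for the centralized resource and the application's correctness check) are hypothetical choices for this illustration, not details taken from the patent.

```python
import sqlite3
from typing import Callable, List, Tuple


def open_sample_db() -> sqlite3.Connection:
    """Create an in-memory speech sample database (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE samples ("
        "id INTEGER PRIMARY KEY, audio BLOB, result TEXT, "
        "correct INTEGER, speaker_meta TEXT)"
    )
    return db


def run_task(db: sqlite3.Connection, audio: bytes,
             recognize: Callable[[bytes], str],
             is_correct: Callable[[str], bool],
             speaker_meta: str = "") -> Tuple[str, bool]:
    """Recognize one sample, let the application judge it, and store both."""
    result = recognize(audio)           # performed at the centralized resource
    correct = bool(is_correct(result))  # application evaluates correctness
    db.execute(
        "INSERT INTO samples (audio, result, correct, speaker_meta) VALUES (?, ?, ?, ?)",
        (audio, result, int(correct), speaker_meta),
    )
    db.commit()
    return result, correct


def samples_matching(db: sqlite3.Connection, correct_only: bool = True,
                     meta_contains: str = "") -> List[Tuple[str, str]]:
    """Return (result, speaker_meta) rows meeting the given criteria (claim 28)."""
    return db.execute(
        "SELECT result, speaker_meta FROM samples "
        "WHERE correct >= ? AND speaker_meta LIKE ? ORDER BY id",
        (1 if correct_only else 0, f"%{meta_contains}%"),
    ).fetchall()


db = open_sample_db()
run_task(db, b"\x00\x01", lambda audio: "yes", lambda r: r == "yes", "adult;en-US")
run_task(db, b"\x02", lambda audio: "no", lambda r: False, "child;en-US")
```

Storing the correctness flag alongside each sample lets the criteria query filter on it later, and the free-text `speaker_meta` column plays the role of the context metadata of claim 27.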

Current Assignee: Sensory Incorporated
Original Assignee: William R. Hutchison
Inventors: Hutchison, William R.
Primary Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US09/838,973
Publication Number: US 20020156626A1
Time in Patent Office: 1,229 Days
Field of Search: 704/257, 704/235, 704/273, 704/243, 704/240, 704/246, 704/231, 704/201, 704/270, 704/270.1, 371/88.01
US Class Current: 704/231
CPC Class Codes:
G10L 15/16   using artificial neural net...
G10L 15/30   Distributed recognition, e....
G10L 17/00   Speaker identification or v...
G10L 2015/025   Phonemes, fenemes or fenone...
