Methods and apparatus for performing speech recognition over a network and using speech recognition results

US 20050216273A1
Filed: 05/25/2005
Published: 09/29/2005
Est. Priority Date: 11/30/2000
Status: Active Grant

Abstract

Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability. Voice dialing, telephone control and/or other services are provided by the speech processing facility in response to speech recognition results.

48 Claims

1-30. (canceled)

31. A method comprising:
- receiving speech data transmitted over a data network at a speech processing facility, the speech data associated with a user;
  
  receiving a user identifier associated with the user transmitted over the data network at the speech processing facility;
  
  performing a speech recognition operation at the speech processing facility, the speech recognition operation including: retrieving a speaker dependent speech recognition model associated with a user based on the user identifier from a plurality of speaker dependent speech recognition models stored at the speech processing facility, performing speech recognition using the retrieved speaker dependent speech recognition model and the speech data, determining an outcome of the speech recognition;
  
  transmitting the outcome over the data network.
- Dependent claims (32, 33, 34, 35, 36):
  - 32. The method of claim 31, wherein the outcome is transmitted to the user over the data network.
  - 33. The method of claim 32, wherein the outcome comprises a name recognized from the speech data and a telephone number associated with the recognized name.
  - 34. The method of claim 31, wherein the outcome comprises a name recognized from the speech data and a telephone number associated with the recognized name.
  - 35. The method of claim 33, wherein the outcome is transmitted to a voice dialing peripheral over the data network, and the method further comprises:
    - initiating a telephone call by the voice dialing peripheral using the telephone number associated with the recognized name.
  - 36. The method of claim 34, further comprising:
    - initiating an additional telephone call using a telephone number associated with the user; and
      
      bridging said telephone call and said additional telephone call.
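
The flow recited in claims 31 to 34 can be illustrated with a minimal sketch. It is not the patented implementation: the class names, the nearest-template matcher, and the assumption that the speech data has already been reduced to a single feature vector are all hypothetical and chosen only to show the lookup of a speaker dependent model by user identifier and the return of a name with its telephone number.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SpeakerModel:
    # Hypothetical stand-in for a speaker dependent model: one reference
    # feature vector per name, plus that user's telephone directory.
    templates: dict[str, list[float]]
    directory: dict[str, str]


@dataclass
class SpeechProcessingFacility:
    # Models stored at the facility, keyed by user identifier.
    models: dict[str, SpeakerModel] = field(default_factory=dict)

    def recognize(self, user_id: str, features: list[float]) -> dict[str, str]:
        # Retrieve the model associated with the user identifier.
        model = self.models[user_id]

        def squared_distance(name: str) -> float:
            reference = model.templates[name]
            return sum((a - b) ** 2 for a, b in zip(features, reference))

        # Toy recognizer (an assumption): pick the closest stored template.
        name = min(model.templates, key=squared_distance)
        # Outcome: recognized name and its associated telephone number.
        return {"name": name, "telephone_number": model.directory[name]}


facility = SpeechProcessingFacility()
facility.models["user-1"] = SpeakerModel(
    templates={"alice": [0.0, 1.0], "bob": [1.0, 0.0]},
    directory={"alice": "555-0100", "bob": "555-0101"},
)
outcome = facility.recognize("user-1", [0.9, 0.1])
```

In a deployment along the lines of claim 35, the returned dictionary would be what is transmitted over the data network to a voice dialing peripheral; the transport is omitted here.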

37. A method, comprising:
- receiving over a data network, at a speech processing facility connected to the data network, speech data, a text version of the speech data, and a user identifier associated with a user;
  
  generating a set of feature vectors corresponding to the speech data, the set of feature vectors including speech characteristic information;
  
  training at the speech processing facility a speaker dependent speech recognition model associated with the user using the set of feature vectors and the text version of the speech data, producing a trained speaker dependent speech recognition model;
  
  transmitting to the user over the data network the trained speaker dependent speech recognition model.
- Dependent claims (38, 39, 40, 41, 42):
  - 38. The method of claim 37, further comprising:
    - receiving the speaker dependent speech recognition model associated with the user from the user over the data network.
  - 39. The method of claim 37, further comprising:
    - receiving over the data network a speech recognition model type;
      
      selecting the speaker dependent speech recognition model based on the speech recognition model type;
      
      wherein the model type is one of a Hidden Markov Model or dynamic time warping template.
  - 40. The method of claim 37, further comprising:
    - retrieving the speaker dependent speech recognition model from a plurality of speech recognition models stored at the speech processing facility based on the user identifier.
  - 41. The method of claim 37, wherein generating the set of feature vectors occurs remotely from the speech processing facility, and further comprising:
    - receiving the set of feature vectors corresponding to the speech data over the data network.
  - 42. The method of claim 37, further comprising:
    - storing the trained speaker dependent speech recognition model in a storage area of the speech processing facility corresponding to the user, the storage area configured to store a plurality of speech recognition models.
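
Claim 39 names a dynamic time warping template as one possible model type. The sketch below shows, under stated assumptions, how feature vector sequences paired with their text could be turned into such templates and then matched. Storing one utterance per text as the template is a simplification assumed here, and the function names are hypothetical; the claims do not specify the training procedure.

```python
import math


def frame_distance(a, b):
    # Euclidean distance between two feature vectors (frames).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    # Classic dynamic time warping over two sequences of frames.
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def train_templates(utterances):
    # Assumed training step: keep the feature sequence of each utterance
    # as the template for its text (the last utterance per text wins).
    templates = {}
    for text, features in utterances:
        templates[text] = list(features)
    return templates


def recognize(templates, features):
    # Return the text whose template warps onto the input most cheaply.
    return min(templates, key=lambda text: dtw_distance(templates[text], features))


templates = train_templates([
    ("call home", [(0.0,), (1.0,), (2.0,)]),
    ("call office", [(2.0,), (1.0,), (0.0,)]),
])
# A slower rendition of "call home": the same frames, stretched in time.
result = recognize(templates, [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)])
```

The stretched input still matches "call home" because time warping lets repeated frames align with a single template frame at no extra cost.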

43. A method, comprising:
- receiving an updated speech recognition model over a data network from a remote speech processing facility;
  
  replacing an existing speech recognition model with the updated speech recognition model in a local memory store;
  
  receiving speech data associated with a user;
  
  retrieving the updated speech recognition model from the memory store;
  
  performing speech recognition using the speech data and the updated speech recognition model;
  
  performing an operation based on an outcome of the speech recognition.
- Dependent claims (44, 45, 46, 47, 48):
  - 44. The method of claim 43, further comprising:
    - sending a request to the remote speech processing facility to send the updated speech recognition model.
  - 45. The method of claim 43, wherein the existing speech recognition model and the updated speech recognition model are speaker dependent speech recognition models associated with the user.
  - 46. The method of claim 43, wherein the existing speech recognition model and the updated speech recognition model are different.
  - 47. The method of claim 43, further comprising:
    - sending the speech data and a text representation of the speech data to the remote speech processing facility over the data network;
      
      wherein the remote speech processing facility uses the speech data and text representation to generate the updated speech recognition model.
  - 48. A processor-readable memory storing instructions configured to cause the processor to perform the method of claim 43 when executed by the processor training at the speech processing facility a speaker dependent speech recognition model associated with the user using the set of feature vectors and the text version of the speech data, producing a trained speaker dependent speech recognition model;
    - transmitting to the user over the data network the updated speech recognition model.
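
The device-side steps of claim 43 can be sketched as follows. This is an illustration under assumptions, not the claimed apparatus: the `Device` class, the single-slot local store, and the representation of a model as a plain callable that maps speech data to an outcome string are all hypothetical.

```python
class Device:
    def __init__(self, operations):
        # Local memory store holding the current speech recognition model.
        self._store = {}
        # Hypothetical mapping from recognition outcome to an operation.
        self._operations = operations

    def install_update(self, updated_model):
        # Replace any existing model with the one received from the
        # remote speech processing facility.
        self._store["model"] = updated_model

    def handle_speech(self, speech_data):
        # Retrieve the model from the store, recognize, then act on the outcome.
        model = self._store["model"]
        outcome = model(speech_data)
        operation = self._operations.get(outcome)
        return operation() if operation else None


device = Device(operations={"dial": lambda: "dialing"})

# Existing model (assumed): does not know the command.
device.install_update(lambda speech_data: "unknown")
before = device.handle_speech(b"\x00\x01")

# Updated model received over the network (assumed): recognizes the command.
device.install_update(lambda speech_data: "dial")
after = device.handle_speech(b"\x00\x01")
```

Keeping the model behind a single store key makes the replacement step atomic from the recognizer's point of view: each call to `handle_speech` uses whichever model was installed most recently.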

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Telesector Resources Group Incorporated
Inventors: Reding, Craig L.; Levas, Suzi
Granted Patent: US 7,302,391 B2
US Class Current: 704/275
CPC Class Codes

G10L 13/08   Text analysis or generation...

G10L 15/02   Feature extraction for spee...

G10L 15/063   Training

G10L 15/08   Speech classification or se...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 15/30   Distributed recognition, e....

G10L 17/04   Training, enrolment or mode...

G10L 2015/221   Announcement of recognition...

G10L 2015/223   Execution procedure of a sp...

H04M 2201/40   using speech recognition sp...

H04M 2207/18   wireless networks

H04M 3/42204   Arrangements at the exchang...
