Method and system for network-based speech recognition

US 6,453,290 B1
Filed: 10/04/1999
Issued: 09/17/2002
Est. Priority Date: 10/04/1999
Status: Expired due to Term

Abstract

Methods and systems for handling speech recognition processing in effectively real time, via the internet, so that users do not experience noticeable delays from the start of an exercise until they receive responsive feedback. A user uses a client to access the internet and a server supporting speech recognition processing, e.g., for language learning activities. The user inputs speech to the client, which transmits the user speech to the server in approximately real time. The server evaluates the user speech in the context of the current speech recognition exercise being executed and provides responsive feedback to the client, again in approximately real time, with minimal latency. The client, upon receiving responsive feedback from the server, displays or otherwise provides the feedback to the user.

16 Claims

1. A method for supporting speech recognition on a server, comprising:
- receiving a URL from a client remote from the server, the URL comprising a grammar context number;
- receiving one or more input packets of encoded audio speech data from the client;
- decoding each of the one or more input packets of encoded audio speech data into a portion of raw speech data upon receipt of the respective input packet;
- storing each portion of raw speech data into a buffer of a linked list of buffers;
- indicating a grammar associated with the grammar context number to a speech recognition engine;
- providing each buffer containing a portion of raw speech data to the speech recognition engine as the speech recognition engine is ready to accept it; and
- receiving a response from the speech recognition engine, wherein the response is based on an evaluation of the raw speech data in relation to the grammar.

2. The method of claim 1, further comprising:
3. The method of claim 2, wherein the one or more input packets of encoded audio speech data received from the client are transmitted via an internet connection, and the one or more transmission packets are transmitted to the client via the same internet connection.
4. The method of claim 1, wherein the one or more input packets of encoded audio speech data are received from the client for a language learning exercise.
5. The method of claim 1, wherein a first buffer containing a first portion of raw speech data is provided to the speech recognition engine before all of the input packets of encoded audio speech data are received from the client.
6. The method of claim 1, further comprising:
- identifying a speech file stored on the server based on receiving a second URL from the client;
- encoding the speech file into a smaller file representation;
- packaging the encoded speech file into one or more transmission packets; and
- transmitting each of the one or more transmission packets to the client.
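
The server-side steps recited in claim 1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `grammar` query parameter, the `GRAMMARS` table, the `ToyEngine` stand-in (which matches ASCII text instead of recognizing audio), the use of `zlib` as the packet codec, and a `deque` in place of the linked list of buffers. The toy engine is always ready, so each buffer is handed over as soon as it is queued.

```python
import zlib
from collections import deque
from urllib.parse import urlparse, parse_qs

# Hypothetical grammar table keyed by grammar context number.
GRAMMARS = {7: ["hello", "goodbye"], 12: ["yes", "no"]}


class ToyEngine:
    """Stand-in for a speech recognition engine (not a real recognizer)."""

    def __init__(self):
        self.grammar = []
        self.heard = b""

    def set_grammar(self, grammar):
        self.grammar = grammar

    def accept(self, raw):
        # Consume one buffer of raw data.
        self.heard += raw

    def result(self):
        # "Evaluate" the raw data in relation to the grammar.
        text = self.heard.decode("ascii")
        return text if text in self.grammar else None


def grammar_context_number(url):
    """Read the grammar context number from an assumed 'grammar' query field."""
    query = parse_qs(urlparse(url).query)
    return int(query["grammar"][0])


def recognize(url, packets, engine):
    # Indicate the grammar associated with the context number to the engine.
    engine.set_grammar(GRAMMARS[grammar_context_number(url)])
    buffers = deque()  # stands in for the linked list of buffers
    for packet in packets:
        # Decode each packet on receipt and store the raw portion in a buffer.
        buffers.append(zlib.decompress(packet))
        # Hand queued buffers to the engine; this toy engine is always ready.
        while buffers:
            engine.accept(buffers.popleft())
    return engine.result()
```

Because each packet is decoded and forwarded inside the receive loop, the engine starts consuming data before the last packet arrives, which is the behavior claim 5 describes.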

7. A method for supporting speech recognition on a server, comprising:
- receiving a first URL from a first client remote from the server, the first URL comprising a first grammar context;
- associating a first set of a plurality of buffers with the first client;
- generating a first instance of a speech recognition engine for the first client;
- indicating a first grammar associated with the first grammar context to the first instance of the speech recognition engine;
- receiving a second URL from a second client remote from the server, the second URL comprising a second grammar context;
- associating a second set of a plurality of buffers with the second client;
- generating a second instance of the speech recognition engine for the second client;
- indicating a second grammar associated with the second grammar context to the second instance of the speech recognition engine;
- receiving a packet of encoded audio speech data from the first client;
- decoding the packet of encoded audio speech data from the first client into a first client portion of raw data;
- storing the first client portion of raw data into a buffer of the first set of a plurality of buffers;
- providing the buffer containing the first client portion of raw data to the first instance of a speech recognition engine for processing with the first grammar;
- receiving a packet of encoded audio speech data from the second client;
- decoding the packet of encoded audio speech data from the second client into a second client portion of raw data;
- storing the second client portion of raw data into a buffer of the second set of a plurality of buffers; and
- providing the buffer containing the second client portion of raw data to the second instance of the speech recognition engine for processing with the second grammar.

8. The method of claim 7, further comprising:
9. The method of claim 8, wherein the packet of encoded audio speech data from the first client is received via a first TCP/IP connection and the one or more first client transmission packets are transmitted to the first client via the same first TCP/IP connection, and wherein the packet of encoded audio speech data from the second client is received via a second TCP/IP connection and the one or more second client transmission packets are transmitted to the second client via the same second TCP/IP connection.
10. The method of claim 9, further comprising:
- releasing the first TCP/IP connection following the transmission of the last packet of the one or more first client transmission packets to the first client;
- terminating the first instance of the speech recognition engine some time after receiving the first response from the first instance of the speech recognition engine;
- releasing the second TCP/IP connection following the transmission of the last packet of the one or more second client transmission packets to the second client; and
- terminating the second instance of the speech recognition engine some time after receiving the second response from the second instance of the speech recognition engine.
11. The method of claim 7, wherein the packet of encoded audio speech data from the first client is for a first language learning exercise, and the packet of encoded audio speech data from the second client is for a second language learning exercise.
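
The per-client isolation recited in claim 7 (a separate buffer set and a separate engine instance for each client) might look like the following sketch. The `Session` and `Server` classes, the string client identifiers, the `ToyEngine` stand-in, and `zlib` as the codec are all hypothetical choices for illustration, not details taken from the patent.

```python
import zlib
from collections import deque


class ToyEngine:
    """Stand-in for one speech recognition engine instance."""

    def __init__(self, grammar):
        self.grammar = grammar
        self.heard = b""

    def accept(self, raw):
        self.heard += raw

    def result(self):
        text = self.heard.decode("ascii")
        return text if text in self.grammar else None


class Session:
    """Per-client state: its own set of buffers and its own engine instance."""

    def __init__(self, grammar):
        self.buffers = deque()
        self.engine = ToyEngine(grammar)


class Server:
    def __init__(self, grammars):
        self.grammars = grammars  # grammar context -> grammar
        self.sessions = {}        # client id -> Session

    def open(self, client_id, grammar_context):
        # Associate buffers and a fresh engine instance with this client.
        self.sessions[client_id] = Session(self.grammars[grammar_context])

    def on_packet(self, client_id, packet):
        session = self.sessions[client_id]
        # Decode into this client's buffers only, then feed its own engine.
        session.buffers.append(zlib.decompress(packet))
        while session.buffers:
            session.engine.accept(session.buffers.popleft())

    def close(self, client_id):
        # Drop the client's session (and its engine) and return the response.
        return self.sessions.pop(client_id).engine.result()
```

Keying all state by client means packets from different clients can arrive interleaved without one client's audio reaching the other's engine or grammar.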

12. A method for supporting speech recognition on a user processing device, comprising:
- receiving a stream of audio speech data;
- storing a portion of the stream of audio speech data into a buffer of a linked list of buffers as it is received;
- transmitting a URL comprising a grammar context number which is indicative of a speech recognition exercise that the stream of audio speech data is for;
- at a time t₁, wherein t₁ is prior to the time when the entirety of the stream of audio speech data is received, encoding a buffer of audio speech data into a smaller file representation;
- at a time t₂, wherein t₂ is prior to the time when the entirety of the stream of audio speech data is received, formatting a portion of the smaller file representation into a packet for transmitting over the internet;
- at a time t₃, wherein t₃ is prior to the time when the entirety of the stream of audio speech data is received, transmitting the packet over the internet; and
- establishing an internet connection prior to time t₃.

13. The method of claim 12, wherein the stream of audio speech data is transmitted to a server for processing a language learning exercise.
14. The method of claim 13, wherein the stream of audio speech data is received by a client, and a packet of encoded audio speech data is transmitted by the client to the server remotely located from the client.
15. The method of claim 12, further comprising:
16. The method of claim 12, further comprising, at a time t₄, receiving one or more packets of a text response, decoding a packet of text response after it is received, and displaying the text response.
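
The timing constraint in claim 12 (encoding, packetizing, and transmitting before the whole stream has been received) can be sketched with a generator that emits packets while audio chunks are still arriving. The 4-byte buffer size, the `zlib` codec, and the packet header of a sequence number plus payload length are assumptions for illustration only; sending the URL and opening the connection are left out.

```python
import struct
import zlib
from collections import deque


def packets_from_stream(chunks, buffer_size=4):
    """Yield encoded packets while the audio stream is still being read."""
    buffers = deque()  # stands in for the linked list of buffers
    pending = b""
    seq = 0
    for chunk in chunks:
        # Store incoming audio into fixed-size buffers as it is received.
        pending += chunk
        while len(pending) >= buffer_size:
            buffers.append(pending[:buffer_size])
            pending = pending[buffer_size:]
        # Encode and packetize full buffers now, not at end of stream.
        while buffers:
            encoded = zlib.compress(buffers.popleft())
            yield struct.pack("!HH", seq, len(encoded)) + encoded
            seq += 1
    if pending:
        # Flush the final, partially filled buffer.
        encoded = zlib.compress(pending)
        yield struct.pack("!HH", seq, len(encoded)) + encoded


def unpack(packet):
    """Inverse of the assumed packet format, as a receiver would apply it."""
    seq, length = struct.unpack("!HH", packet[:4])
    return seq, zlib.decompress(packet[4:4 + length])
```

A caller would pass each yielded packet to its network send routine. Because the generator is lazy, the first packet is available after the first buffer fills, regardless of how much audio is still to come.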

Current Assignee: Pearson Education Incorporated (Pearson plc)
Original Assignee: GlobalEnglish Corporation (Pearson plc)
Inventors: Jochumson, Christopher S.
Primary Examiner(s): Banks-Harold, Marsha D.
Assistant Examiner(s): Lerner, Martin
Application Number: US09/412,043
Time in Patent Office: 1,079 Days
Field of Search: 704/270, 704/270.1, 704/271, 704/231, 434/185, 370/229, 370/235, 370/412, 370/413, 370/415
US Class Current: 704/231
CPC Class Codes:
- G09B 19/04   Speaking with audible prese...
- G10L 15/19   Grammatical context, e.g. d...
- G10L 15/30   Distributed recognition, e....
- G10L 2015/228   of application context
- Y10S 707/99933   Query processing, i.e. sear...
