Client-server speech recognition for altering processing time based on a value communicated between client and server

US 7,831,422 B1
Filed: 10/26/2007
Issued: 11/09/2010
Est. Priority Date: 10/04/1999
Status: Expired due to Fees

Abstract

Methods and systems for handling speech recognition processing in effectively real-time, via the Internet, in order that users do not experience noticeable delays from the start of an exercise until they receive responsive feedback. A user uses a client to access the Internet and a server supporting speech recognition processing, e.g., for language learning activities. The user inputs speech to the client, which transmits the user speech to the server in approximate real-time. The server evaluates the user speech in context of the current speech recognition exercise being executed, and provides responsive feedback to the client, again, in approximate real-time, with minimum latency delays. The client upon receiving responsive feedback from the server, displays, or otherwise provides, the feedback to the user.

20 Claims

1. A system supporting speech recognition comprising:
- two or more clients, each client comprising the capability to receive audio speech from a user, store the audio speech in a first set of one or more buffers in a raw uncompressed audio format, each buffer comprising a portion of the received audio speech, write the stored audio speech from a first buffer in the first set of buffers to a second buffer in a second set of one or more buffers, encode the stored audio speech in the second buffer before all of the audio speech is received, package the encoded audio speech from the second buffer into one or more packets to be transmitted over the Internet before all of the audio speech is received, and transmit a packet of encoded audio speech over the Internet before all of the audio speech is received; and
  
  a server, the server comprising the capability to receive packets of encoded audio speech from the two or more clients, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the two or more clients, and evaluate the resultant raw speech received from each of the two or more clients, wherein the server further comprises the capability to transmit a response to a client of the two or more clients, the response a result of the server's evaluation of the resultant raw speech received from the client, and the server alters a processing time used to evaluate the resultant raw speech based on a value communicated between the client and the server, and the client further comprises the capability to receive the response from the server.
  - 2. The system of claim 1 wherein based on a URL sent by the client, the server determines whether to expect speech data for processing from the client.
  - 3. The system of claim 1 wherein the response is in text format, and the client of the two or more clients comprises a text-to-speech engine which converts a text format response to audio data, and an audio output device that the client uses to output the audio data to a user.
  - 4. The system of claim 1 wherein the server further comprises two or more stored text format files, and the server selects a stored text format file to transmit to the client as a result of the server's evaluation of the resultant raw speech received from the client.
  - 5. The system of claim 4 wherein the server further comprises the capability to partition a stored text format file into two or more packets for the transmission over the Internet, and to transmit each packet over the Internet to the client.
  - 6. The system of claim 5 wherein the client further comprises an audio output device, and the capability to receive the packets of text format, convert the packets of text format to audio data and play the audio data to a user.
  - 7. The system of claim 1 wherein the first set of one or more buffers comprises a linked list of buffers.
  - 8. The system of claim 1 wherein the second set of one or more buffers comprises a linked list of buffers.
  - 9. The system of claim 1 wherein the second set of one or more buffers comprises a predefined number of buffers.
  - 10. The system of claim 1 wherein the encoded audio speech is in a compressed format.
  - 11. The system of claim 1 wherein the server evaluates the resultant raw speech using speech recognition processing.
  - 12. The system of claim 1 wherein the response transmitted to the client comprises an audio output at the client.
  - 13. The system of claim 1 wherein the response transmitted to the client comprises a visual output at the client.
  - 14. The system of claim 1 wherein the response transmitted to the client comprises a graphical output at the client.
  - 15. The system of claim 1 wherein the value is communicated from the client to the server over the Internet.
  - 16. The system of claim 1 wherein a browser component is executing on each client, and the browser component controls capturing and sending the audio speech transmitting the speech to the server.
  - 17. The system of claim 16 wherein the browser component executes JavaScript code.
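
Illustrative sketch (not part of the patent record): the client-side flow recited in claim 1 can be pictured as below. This is a minimal sketch under stated assumptions, not the patented implementation. `stream_speech`, `PACKET_BYTES`, and the `send` callback are hypothetical names, `zlib` stands in for a speech codec, and packets are assumed to arrive in order.

```python
import zlib
from collections import deque

PACKET_BYTES = 64  # hypothetical maximum payload per packet


def stream_speech(chunks, send, packet_bytes=PACKET_BYTES):
    """Encode and send audio buffer by buffer while capture is still running.

    chunks: iterable yielding raw audio bytes, one portion of speech at a time.
    send:   callable that transmits one packet (hypothetical transport).
    Returns the number of packets sent.
    """
    raw_buffers = deque()     # first set: raw uncompressed audio
    encode_buffers = deque()  # second set: audio staged for encoding
    encoder = zlib.compressobj()  # stand-in for a real speech codec
    sent = 0

    def packetize(encoded):
        # Split encoded bytes into packets no larger than packet_bytes.
        nonlocal sent
        for start in range(0, len(encoded), packet_bytes):
            send(encoded[start:start + packet_bytes])
            sent += 1

    for chunk in chunks:  # the user may still be speaking at this point
        raw_buffers.append(chunk)
        encode_buffers.append(raw_buffers.popleft())  # first set -> second set
        staged = encode_buffers.popleft()
        # Z_SYNC_FLUSH forces encoded output now rather than at end of speech.
        packetize(encoder.compress(staged) + encoder.flush(zlib.Z_SYNC_FLUSH))

    packetize(encoder.flush())  # close the encoded stream after the last buffer
    return sent
```

A server in this sketch would feed the packets, in order, to `zlib.decompressobj().decompress(...)` to recover the raw audio into its own buffers. The `deque` objects play the role of the buffer sets; claims 7 to 9 mention linked lists and a predefined number of buffers as variants.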

18. A system supporting speech recognition comprising:
- at least one client, the client comprising the capability to receive audio speech from a user, store the audio speech in a first set of one or more buffers in a raw uncompressed audio format, each buffer comprising a portion of the received audio speech, write the stored audio speech from a first buffer in the first set of buffers to a second buffer in a second set of one or more buffers, encode the stored audio speech in the second buffer before all of the audio speech is received, package the encoded audio speech from the second buffer into one or more packets to be transmitted over the Internet before all of the audio speech is received, and transmit a packet of encoded audio speech over the Internet before all of the audio speech is received; and
  
  a server, the server comprising the capability to receive packets of encoded audio speech from the client, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the client, and evaluate the resultant raw speech received from the client, wherein the server further comprises the capability to transmit a response to the client, the response a result of the server's evaluation of the resultant raw speech received from the client, and the server alters a processing time used to evaluate the resultant raw speech based on a value communicated between the client and the server over the Internet, and the client further comprises the capability to receive the response from the server transmitted over the Internet.
  - 19. The system of claim 18 wherein the server evaluates the resultant raw speech based on speech recognition processing performed at the server.
  - 20. The system of claim 18 wherein a browser component executed on the client and controls capturing and sending the audio speech transmitting the speech to the server.
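
Illustrative sketch (not part of the patent record): claims 1 and 18 recite a server that alters the processing time used for evaluation based on a value communicated between client and server. The claims do not say what the value is. The sketch below assumes, purely for illustration, that it is a time budget in seconds; `evaluate_speech`, `budget_seconds`, and the `score` callable are hypothetical, and `score` stands in for whatever recognizer the server runs.

```python
import time


def evaluate_speech(raw_speech, candidates, score, budget_seconds,
                    clock=time.monotonic):
    """Score candidate phrases until the communicated time budget runs out.

    score: callable (raw_speech, phrase) -> number; higher is a better match.
    Returns (best_phrase, number_of_candidates_examined).
    """
    deadline = clock() + budget_seconds
    best, best_score, examined = None, float("-inf"), 0
    for phrase in candidates:
        # Always examine at least one candidate, then respect the deadline.
        if examined and clock() >= deadline:
            break  # budget spent: respond with the best result so far
        value = score(raw_speech, phrase)
        examined += 1
        if value > best_score:
            best, best_score = phrase, value
    return best, examined
```

Under this assumption, a larger budget lets the server examine more candidates before responding, while a smaller one trades accuracy for a faster reply. The `clock` parameter exists only so the behaviour can be checked with a fake clock.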

Current Assignee: Pearson Education Incorporated (Pearson plc)
Original Assignee: GlobalEnglish Corporation (Pearson plc)
Inventors: Jochumson, Christopher S.
Primary Examiner(s): Lerner; Martin
Application Number: US11/925,584
Time in Patent Office: 1,110 Days
Field of Search: 704/270, 704/270.1, 704/275, 704/231, 704/236, 709/202, 709/250, 709/203, 710/52, 370/229
US Class Current: 704/231
CPC Class Codes:
G10L 13/08   Text analysis or generation...
G10L 15/00   Speech recognition G10L17/0...
G10L 15/22   Procedures used during a sp...
G10L 15/30   Distributed recognition, e....
G10L 19/00   Speech or audio signals ana...
G10L 19/0018   Speech coding using phoneti...
