Method and system for network-based speech recognition
First Claim
1. A system supporting speech recognition comprising:
two or more clients, each client comprising the capability to receive audio speech from a user, store the audio speech in one or more buffers, each buffer comprising a portion of the received audio speech, encode a buffer of the received audio speech before all of the audio speech is received, package the encoded buffer to receive audio speech into one or more packets to be transmitted over the internet before all of the audio speech is received, and transmit a packet of encoded audio speech over the internet before all of the audio speech is received; and
a server, the server comprising the capability to receive packets of encoded audio speech from at least two clients, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the respective client, and a processing time used to evaluate the resultant raw speech will vary based on a value communicated to the server from each respective client.
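As a rough illustration of the client behavior recited in claim 1 (buffer, encode, packetize, and transmit before the whole utterance has arrived), the following is a minimal sketch and not the patented implementation. The buffer size, the packet header layout, the payload limit, and the use of `zlib` as a stand-in for a speech codec are all assumptions made for the example; `fill_buffers`, `packetize`, and `stream_speech` are hypothetical names.

```python
import struct
import zlib
from typing import Callable, Iterable, Iterator, List

BUFFER_BYTES = 3200   # assumption: 100 ms of 16 kHz, 16-bit mono audio
MAX_PAYLOAD = 512     # assumption: payload limit per packet
# Assumed header: sequence number, payload length, last-packet-of-buffer flag.
HEADER = struct.Struct("!IHB")


def fill_buffers(chunks: Iterable[bytes], size: int = BUFFER_BYTES) -> Iterator[bytes]:
    """Yield a buffer as soon as it fills, then any remainder at the end."""
    pending = bytearray()
    for chunk in chunks:
        pending.extend(chunk)
        while len(pending) >= size:
            yield bytes(pending[:size])
            del pending[:size]
    if pending:
        yield bytes(pending)


def packetize(encoded: bytes, first_seq: int) -> List[bytes]:
    """Split one encoded buffer into header-prefixed packets."""
    packets: List[bytes] = []
    for offset in range(0, len(encoded), MAX_PAYLOAD):
        payload = encoded[offset:offset + MAX_PAYLOAD]
        last = offset + MAX_PAYLOAD >= len(encoded)
        header = HEADER.pack(first_seq + len(packets), len(payload), int(last))
        packets.append(header + payload)
    return packets


def stream_speech(chunks: Iterable[bytes], send: Callable[[bytes], None]) -> int:
    """Encode and send each buffer while audio is still arriving.

    Returns the number of packets sent. zlib stands in for a real codec.
    """
    seq = 0
    for raw in fill_buffers(chunks):
        for packet in packetize(zlib.compress(raw), seq):
            send(packet)
            seq += 1
    return seq
```

Because `chunks` is consumed lazily, `send` is called for the first buffer while later audio has not yet been read, which is the property the claim language emphasizes. In a real client, `send` would wrap a network socket or HTTP request rather than a list append.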
3 Assignments
0 Petitions
Abstract
Methods and systems for handling speech recognition processing in effectively real time, via the internet, so that users do not experience noticeable delays from the start of an exercise until they receive responsive feedback. A user uses a client to access the internet and a server supporting speech recognition processing, e.g., for language learning activities. The user inputs speech to the client, which transmits the user speech to the server in approximately real time. The server evaluates the user speech in the context of the current speech recognition exercise being executed and provides responsive feedback to the client, again in approximately real time, with minimal latency. The client, upon receiving responsive feedback from the server, displays or otherwise provides the feedback to the user.
66 Citations
25 Claims
1. A system supporting speech recognition comprising:
two or more clients, each client comprising the capability to receive audio speech from a user, store the audio speech in one or more buffers, each buffer comprising a portion of the received audio speech, encode a buffer of the received audio speech before all of the audio speech is received, package the encoded buffer to receive audio speech into one or more packets to be transmitted over the internet before all of the audio speech is received, and transmit a packet of encoded audio speech over the internet before all of the audio speech is received; and a server, the server comprising the capability to receive packets of encoded audio speech from at least two clients, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the respective client, and a processing time used to evaluate the resultant raw speech will vary based on a value communicated to the server from each respective client. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
10. A system supporting speech recognition comprising:
two or more clients, each client comprising the capability to receive audio speech from a user, store the audio speech in one or more buffers in a raw uncompressed audio format, each buffer comprising a portion of the received audio speech, encode a buffer of the received audio speech before all of the audio speech is received, package the encoded buffer to receive audio speech into one or more packets to be transmitted over the Internet before all of the audio speech is received, and transmit a packet of encoded audio speech over the Internet before all of the audio speech is received; and a server, the server comprising the capability to receive packets of encoded audio speech from at least two clients, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the respective client, and evaluate the resultant raw speech received from each of the at least two clients, wherein the server further comprises two or more stored text format files, and the server selects a stored text format file to transmit to a client of the two or more clients as a result of the server's evaluation of the resultant raw speech received from the client, and the server adjusts a processing time used to evaluate the resultant raw speech based on a value in a URL connection between the client and the server. - View Dependent Claims (11)
12. A system supporting speech recognition comprising:
two or more clients, each client comprising the capability to receive audio speech from a user, store the audio speech in one or more buffers organized as a linked list in a raw uncompressed audio format, each buffer comprising a portion of the received audio speech, encode a buffer of the received audio speech before all of the audio speech is received, package the encoded buffer to receive audio speech into one or more packets to be transmitted over a network before all of the audio speech is received, and transmit a packet of encoded audio speech over the network before all of the audio speech is received; and a server, the server comprising the capability to receive packets of encoded audio speech from at least two clients, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the respective client, and evaluate the resultant raw speech received from each of the at least two clients, wherein a level of processing used in the evaluation of the resultant raw speech received from each of the at least two clients is alterable based on a value communicated between the clients and the server. - View Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20, 21)
22. A system supporting speech recognition comprising:
two or more clients, each client comprising the capability to receive audio speech from a user, store the audio speech in one or more buffers in a raw uncompressed audio format, each buffer comprising a portion of the received audio speech, encode a buffer of the received audio speech before all of the audio speech is received, package the encoded buffer to receive audio speech into one or more packets to be transmitted over the internet before all of the audio speech is received, and transmit a packet of encoded audio speech over the internet before all of the audio speech is received; and a server, the server comprising the capability to receive packets of encoded audio speech from at least two clients, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the respective client, and evaluate the resultant raw speech received from each of the at least two clients, wherein a user selects a user objective at a client, the client transmits the user objective to the server, and the server evaluates the resultant raw speech received from the client based on the user objective and a value communicated to the server by URL.
23. A system supporting speech recognition comprising:
two or more clients, each client comprising the capability to receive audio speech from a user, store the audio speech in one or more buffers in a raw uncompressed audio format, each buffer comprising a portion of the received audio speech, encode a buffer of the received audio speech before all of the audio speech is received, package the encoded buffer to receive audio speech into one or more packets to be transmitted over the internet before all of the audio speech is received, and transmit a packet of encoded audio speech over the internet before all of the audio speech is received; and a server, the server comprising the capability to receive packets of encoded audio speech from at least two clients, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the respective client, and evaluate the resultant raw speech received from each of the at least two clients, wherein before the client receives audio speech from a user, the server transmits a file to a client, the client presents the file in at least one of an audio or visual format to the user, and the server evaluates the resultant raw speech received from the client in connection with the file transmitted from the server to the client and a processing time used to evaluate the resultant raw speech will vary based on a value communicated to the server from the client.
24. A system supporting speech recognition comprising:
two or more clients, each client comprising the capability to receive audio speech from a user, store the audio speech in one or more buffers in a raw uncompressed audio format, each buffer comprising a portion of the received audio speech, encode a buffer of the received audio speech before all of the audio speech is received, package the encoded buffer to receive audio speech into one or more packets to be transmitted over the internet before all of the audio speech is received, and transmit a packet of encoded audio speech over the internet before all of the audio speech is received; and a server, the server comprising the capability to receive packets of encoded audio speech from at least two clients, decode each of the packets of audio speech and store the resultant raw speech into one or more buffers for the respective client, and evaluate the resultant raw speech received from each of the at least two clients, wherein the server transmits a first file to a client, the client presents the first file in at least one of an audio or visual format to the user, after presenting the first file to the user, the client receives audio speech from the user, and the server evaluates the resultant raw speech received from the client in connection with the first file transmitted from the server to the client and a processing time used by the server to evaluate the resultant raw speech is alterable based on a value communicated from the client to the server. - View Dependent Claims (25)
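The server-side elements shared by the independent claims above (decoding packets into per-client raw buffers, and varying the processing time according to a value the client communicates, for example in a URL) might be sketched as follows. This is only an illustration under stated assumptions: the query parameter name `level`, the mapping of levels to time budgets, the `SpeechServer` class, and the `recognizer` callable are hypothetical, and `zlib` again stands in for a real speech codec.

```python
import zlib
from collections import defaultdict
from typing import Callable, Dict, List, Tuple
from urllib.parse import parse_qs, urlparse

# Assumption: the client names a processing level, mapped to seconds here.
TIME_BUDGET_S = {"low": 0.25, "medium": 1.0, "high": 4.0}
DEFAULT_LEVEL = "medium"


def budget_from_url(url: str) -> float:
    """Read the hypothetical `level` query parameter and map it to a time budget."""
    query = parse_qs(urlparse(url).query)
    level = query.get("level", [DEFAULT_LEVEL])[0]
    return TIME_BUDGET_S.get(level, TIME_BUDGET_S[DEFAULT_LEVEL])


class SpeechServer:
    def __init__(self) -> None:
        # One list of decoded raw-speech buffers per client.
        self.raw_buffers: Dict[str, List[bytes]] = defaultdict(list)
        self.budgets: Dict[str, float] = {}

    def connect(self, client_id: str, url: str) -> None:
        """Record the processing-time budget carried in the client's URL."""
        self.budgets[client_id] = budget_from_url(url)

    def receive(self, client_id: str, encoded: bytes) -> None:
        """Decode one encoded buffer and store it for the sending client."""
        self.raw_buffers[client_id].append(zlib.decompress(encoded))

    def evaluate(self, client_id: str,
                 recognizer: Callable[[bytes, float], Tuple[bytes, float]]):
        """Hand the client's raw speech and its time budget to a recognizer."""
        raw = b"".join(self.raw_buffers.pop(client_id, []))
        budget = self.budgets.get(client_id, TIME_BUDGET_S[DEFAULT_LEVEL])
        return recognizer(raw, budget)
```

A short usage example with a stub recognizer that simply echoes its inputs:

```python
server = SpeechServer()
server.connect("a", "http://example.com/recognize?level=high")
server.connect("b", "http://example.com/recognize")
server.receive("a", zlib.compress(b"hello "))
server.receive("b", zlib.compress(b"xyz"))
server.receive("a", zlib.compress(b"world"))
print(server.evaluate("a", lambda raw, budget: (raw, budget)))
```

Keeping the budget per client, rather than global, reflects the claim language that the value is communicated by each respective client.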
Specification