Internet based speech recognition system with dynamic grammars
Abstract
A speech-enabled WWW based computing system allows a user to interact with content associated with a web page and select items of interest using speech as a mode of input. Dynamic grammars can assist in the recognition operations to improve speed and comprehension.
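As an illustration of the dynamic-grammar idea, the following is a minimal sketch and not the patented implementation. It assumes a grammar can be modelled as a set of allowed phrases keyed by the page content currently in view, and that the recognizer produces a ranked list of text hypotheses. The names `DynamicGrammarStore`, `grammar_for`, `recognize`, and the sample contexts are all hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional


class DynamicGrammarStore:
    """Loads a grammar only when its context is first needed (hypothetical helper)."""

    def __init__(self, loader: Callable[[str], FrozenSet[str]]) -> None:
        self._loader = loader
        self._cache: Dict[str, FrozenSet[str]] = {}
        self.load_count = 0  # number of actual loads, for illustration

    def grammar_for(self, context: str) -> FrozenSet[str]:
        # Load lazily, then reuse the cached grammar on later utterances.
        if context not in self._cache:
            self._cache[context] = self._loader(context)
            self.load_count += 1
        return self._cache[context]


def recognize(hypotheses: Iterable[str], grammar: FrozenSet[str]) -> Optional[str]:
    """Return the first ranked hypothesis that the active grammar accepts."""
    for text in hypotheses:
        if text in grammar:
            return text
    return None


# Assumed sample data: one small grammar per list of items on a page.
SAMPLE_GRAMMARS = {
    "courses": frozenset({"show history courses", "show math courses"}),
    "books": frozenset({"show history books", "show math books"}),
}

store = DynamicGrammarStore(lambda ctx: SAMPLE_GRAMMARS[ctx])

# The grammar changes between utterances as the user moves between contexts.
first = recognize(["show history horses", "show history courses"],
                  store.grammar_for("courses"))
second = recognize(["show math books"], store.grammar_for("books"))
third = recognize(["show math courses"], store.grammar_for("courses"))
```

Restricting each utterance to a small, context-specific grammar narrows the search space, which is one plausible reading of how dynamic grammars could improve speed and comprehension.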
798 Citations
12 Claims
1. A World-Wide Web (WWW) accessible natural language computing system comprising:
a speech recognition engine configured to distribute speech processing operations between a client device and a server device for processing an utterance for a speech based query;
said server device being configured to generate a recognized speech query using a grammar accessible to said server device;
wherein the server device transfers data representing speech using a hypertext transfer protocol (HTTP) and a format which includes one or more NULL characters for denoting an end of a speech data stream, the one or more NULL characters being inserted into the speech data stream after other NULL characters are removed from the speech data stream;
a natural language routine executing on the server device and configured to process said recognized speech query to generate a natural language result in real-time;
an Internet accessible page coupled to the server device and having a list of items, wherein content for at least some of said list of items identified on said Internet accessible page is selectable by a user based on said natural language result;
a database coupled to the server device for storing answers which correspond to said content for said list of items on said Internet accessible page;
wherein a grammar used to recognize said speech based query can be varied between utterances and loaded dynamically as needed to recognize utterances associated with said content.
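The NULL-character framing recited in claim 1 can be sketched as follows. This is an illustrative assumption, not the encoding the patent uses: the claim says only that existing NULL characters are removed before a terminating NULL is inserted, so this sketch swaps in a simple byte-stuffing scheme to make the removal reversible. The escape byte `ESC` and the functions `frame` and `unframe` are hypothetical, and the HTTP transport itself is omitted.

```python
from typing import Tuple

ESC = 0x01            # assumed escape byte (hypothetical choice)
TERMINATOR = b"\x00"  # single NULL marks the end of the speech data stream


def frame(payload: bytes) -> bytes:
    """Remove every NULL from the payload, then append one NULL terminator."""
    out = bytearray()
    for b in payload:
        if b == 0x00:
            out += bytes([ESC, 0x03])   # NULL becomes ESC 0x03
        elif b == ESC:
            out += bytes([ESC, 0x02])   # literal ESC becomes ESC 0x02
        else:
            out.append(b)
    return bytes(out) + TERMINATOR


def unframe(stream: bytes) -> Tuple[bytes, bytes]:
    """Read up to the first NULL; return (decoded payload, remaining bytes)."""
    end = stream.index(TERMINATOR)      # raises ValueError if no terminator
    body, rest = stream[:end], stream[end + 1:]
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] == ESC:
            # The byte after ESC says which original byte was escaped.
            out.append(0x00 if body[i + 1] == 0x03 else ESC)
            i += 2
        else:
            out.append(body[i])
            i += 1
    return bytes(out), rest


framed = frame(b"a\x00b\x01")
decoded, remainder = unframe(framed + b"next")
```

Because the framed body can never contain a NULL, the receiver can treat the first NULL it sees as the unambiguous end of the stream.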
2. The World-Wide Web (WWW) accessible natural language computing system of claim 1 wherein the server device transfers said speech related data in an amount which is determined automatically to reduce latency.
3. The system of claim 1 wherein said client device includes one of a personal digital assistant (PDA), a cellphone, a notebook computer or a computer peripheral.
4. The system of claim 1 wherein said Internet accessible page is associated with a search engine.
5. The system of claim 1 wherein said data representing speech includes acoustic features of said utterance.
6. The system of claim 1 wherein said grammar is derived from representative examples of words collected from samples from users of different geographical areas.
7. A World-Wide Web (WWW) accessible natural language computing system comprising:
a speech recognition engine configured to distribute speech processing operations between a client device and a server device for processing an utterance for a speech based query;
wherein the speech recognition engine stores calibration configuration data pertaining to calibrating speech and silence components of a speech utterance for at least one of a plurality of portable client devices supported by said server device;
said server device being configured to generate a recognized speech query using a grammar accessible to said server device;
wherein the server device processes data representing speech that is transferred using a hypertext transfer protocol (HTTP) and a transport format which includes one or more NULL characters, the one or more NULL characters being inserted into the speech data stream after other NULL characters are removed from the speech data stream;
a natural language routine executing on the server device and configured to process said recognized speech query to generate a natural language result in real-time;
an Internet accessible page coupled to the server device and having a list of items, wherein content for at least some of said list of items identified on said Internet accessible page is selectable by a user based on said natural language result;
a database coupled to the server device for storing answers which correspond to said content for said list of items on said Internet accessible page;
wherein a grammar used to recognize said speech based query can be varied between utterances and loaded dynamically as needed to recognize utterances associated with said content.
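The per-device speech and silence calibration recited in claim 7 might look like the sketch below. The claim does not say how calibration is performed, so this example assumes a simple energy threshold derived from a sample of background silence; the class `CalibrationStore`, the `margin` parameter, and the device identifier are hypothetical.

```python
import math
from typing import Dict, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one non-empty audio frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class CalibrationStore:
    """Keeps one speech/silence threshold per client device (hypothetical)."""

    def __init__(self, margin: float = 3.0) -> None:
        self.margin = margin                    # assumed multiple of the noise floor
        self._thresholds: Dict[str, float] = {}

    def calibrate(self, device_id: str,
                  silence_frames: Sequence[Sequence[float]]) -> float:
        # Estimate the noise floor as the mean RMS of frames known to be silence.
        noise_floor = sum(rms(f) for f in silence_frames) / len(silence_frames)
        self._thresholds[device_id] = noise_floor * self.margin
        return self._thresholds[device_id]

    def is_speech(self, device_id: str, frame: Sequence[float]) -> bool:
        # A frame counts as speech when its energy exceeds the stored threshold.
        return rms(frame) > self._thresholds[device_id]


store = CalibrationStore()
threshold = store.calibrate("pda-1", [[1, -1, 1, -1], [2, -2, 2, -2]])
loud = store.is_speech("pda-1", [10, -10, 10, -10])
quiet = store.is_speech("pda-1", [1, 1, 1, 1])
```

Storing the threshold under a device identifier lets the server reuse calibration data for each supported portable client rather than recalibrating on every utterance.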
8. The World-Wide Web (WWW) accessible natural language computing system of claim 7, further including a query/answer routine executing on the client device and adapted to transmit said speech based query over a communications channel in response to a button being pressed on the client device.
9. The system of claim 7 wherein said portable client device includes one of a personal digital assistant (PDA), a cellphone, a notebook computer or a computer peripheral.
10. The system of claim 7 wherein said Internet accessible page is associated with a search engine.
11. The system of claim 7 wherein said data representing speech includes acoustic features of said utterance.
12. The system of claim 7 wherein said grammar is derived from representative examples of words collected from samples from users of different geographical areas.
Specification