Speech recognition system trained with regional speech characteristics
Abstract
A speech recognition system uses speech recognition models which are specifically trained and optimized for users residing in a particular geographic area or region. The speech models are trained with samples of word variants expected to be used in a natural language by representative members of a population associated with the geographic region or community of users. The speech recognition system is configured to have a real-time response that imitates a dialogue with a human operator.
499 Citations
28 Claims
1. A method of optimizing recognition of a speech utterance from a user with a distributed speech processing system comprising the steps of:
(a) training one or more speech recognition models for recognizing speech utterances in a first natural language in a first training operation;
wherein said speech recognition models are implemented as part of a speech recognition engine executing on a network server system of the distributed speech processing system;
wherein said first training operation is based on samples of speech from a group of persons employing said first natural language and which are communicated over a network to the distributed speech processing system from geographic regions served by the distributed speech processing system, such that said speech recognition models are derived and constituted at least in part at said network server system;
wherein recognition of speech utterances during a speech recognition process is optimized for a geographic region by using one or more speech models which include variants of words to be uttered by users of the distributed speech processing system;
(b) configuring a set of speech recognition operations to be performed by the network server system based on computing resources available to such system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
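The two steps of claim 1 can be illustrated with a toy sketch. It is not the patented implementation: the "training" here only counts which word variants appear in samples from each region, and the stage names, cost figures, and function names (`train_regional_variants`, `configure_server_operations`, `STAGE_COSTS`) are hypothetical, chosen purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical server-side stages, ordered from cheapest to most expensive.
# Names and costs are illustrative assumptions, not taken from the patent.
STAGE_COSTS = [
    ("decode_phonemes", 1.0),
    ("word_lattice_search", 2.0),
    ("language_model_rescoring", 4.0),
]


def train_regional_variants(samples):
    """Toy stand-in for step (a).

    `samples` is an iterable of (region, canonical_word, observed_variant)
    tuples, as might be collected over a network from users in each region.
    Returns nested counts: region -> canonical word -> Counter of variants.
    """
    counts = defaultdict(lambda: defaultdict(Counter))
    for region, word, variant in samples:
        counts[region][word][variant] += 1
    return counts


def configure_server_operations(available_units, stage_costs=STAGE_COSTS):
    """Toy stand-in for step (b).

    Selects stages in order until the next one no longer fits in the
    server's remaining compute budget.
    """
    selected = []
    remaining = available_units
    for name, cost in stage_costs:
        if cost > remaining:
            break
        selected.append(name)
        remaining -= cost
    return selected
```

A real system would train acoustic and language models rather than count strings; the sketch only shows how regional samples and a resource budget could each shape the server's configuration.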

13. A method of optimizing recognition of a speech utterance from a user with a distributed speech processing system comprising the steps of:
(a) receiving first speech data from a client device in streaming packets through a network interface of a network server system and/or plurality of servers, said first speech data resulting from a first set of speech recognition operations being performed on the speech utterance by the client device;
(b) completing recognition of the speech utterance using software routines executing at the network server system and/or plurality of servers which implement a second set of speech recognition operations;
wherein said software routines at the network server system and/or plurality of servers use one or more speech recognition models that are trained based on speech characteristics of a group of persons residing in geographical regions served by the distributed speech processing system;
further wherein said speech characteristics from such group of persons are obtained over said network interface such that said speech recognition models are derived and constituted at least in part at said network server system;
(c) presenting an electronic agent within a browser of the client device, which electronic agent responds to user queries presented in speech form and assists the user to navigate and select items from an Internet web page;
wherein said electronic agent further provides one or more specific suggested queries to the user;
(d) providing a real-time response to the user with said electronic agent based on the speech utterance as well as subsequent speech utterances from the user so that an interactive dialog is conducted by the distributed speech processing system.
Dependent claims: 14, 15, 16, 17, 18, 19, 20.
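Steps (a) and (b) of claim 13 split the work between client and server. The sketch below assumes a very simple division, which is not specified by the claim: the client reduces audio to per-frame energy values and emits them as sequence-numbered packets, and the server reorders the packets and finishes processing. The server step only labels frames as speech or silence, as a placeholder for real decoding with regionally trained models. All names (`client_extract_features`, `server_label_frames`, `FRAME_SIZE`) are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

FRAME_SIZE = 4  # samples per frame; deliberately tiny for illustration


def client_extract_features(samples: List[float]) -> Iterator[Dict]:
    """Client side: first set of operations.

    Computes mean energy per frame and yields one packet per frame,
    tagged with a sequence number so the server can restore order.
    """
    for seq, start in enumerate(range(0, len(samples), FRAME_SIZE)):
        frame = samples[start:start + FRAME_SIZE]
        energy = sum(x * x for x in frame) / len(frame)
        yield {"seq": seq, "energy": energy}


def server_label_frames(packets: Iterable[Dict], threshold: float = 0.1) -> str:
    """Server side: placeholder for the second set of operations.

    Sorts packets by sequence number (they may arrive out of order) and
    marks each frame "S" (speech-like) or "_" (silence-like).
    """
    ordered = sorted(packets, key=lambda p: p["seq"])
    return "".join("S" if p["energy"] >= threshold else "_" for p in ordered)
```

Sending compact features instead of raw audio is one plausible reason for such a split, since it reduces what has to cross the network before the server can respond.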

21. A speech processing system for recognizing a speech utterance from a user comprising:
(a) a speech recognition engine;
wherein said speech recognition engine executes on a network server system of a distributed speech processing system;
(b) one or more speech recognition models useable by the speech recognition engine for recognizing speech utterances in a first language;
wherein said one or more speech recognition models have been trained to include additional samples of speech from a group of persons employing said first language that have provided such additional samples over a network to the distributed speech processing system from geographic regions served by the distributed speech processing system such that said speech recognition models are derived and constituted at least in part at said network server system;
further wherein recognition of speech utterances by the speech recognition engine is optimized for a geographic region by using one or more speech models which include variants of words to be uttered by users of the distributed client-server system;
(c) configuring a set of speech recognition operations to be performed by the network server system for recognizing said speech utterances based on computing resources available to such system.
Dependent claims: 22, 23, 24, 25, 26, 27.

28. A method of optimizing recognition of a speech utterance from a user with a distributed speech processing system comprising the steps of:
(a) calibrating noise at a client device prior to recognizing speech utterances from such client device;
(b) training one or more speech recognition models for recognizing speech utterances in a first natural language in a first training operation;
wherein said speech recognition models are implemented as part of a speech recognition engine executing on a network server system of the distributed speech processing system;
wherein said first training operation is based on samples of speech from a group of persons employing said first natural language and which are communicated over a network to the distributed speech processing system from geographic regions served by the distributed speech processing system, such that said speech recognition models are derived and constituted at least in part at said network server system;
wherein recognition of speech utterances during a speech recognition process is optimized for a geographic region by using one or more speech models which include variants of words to be uttered by users of the distributed speech processing system;
(c) recognizing a speech utterance from a user by selecting a speech model from said one or more speech recognition models.
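Steps (a) and (c) of claim 28 can be sketched as follows. The claim does not say how noise is calibrated or how a model is selected, so the choices here are assumptions: calibration estimates the RMS level of a short ambient recording, and selection looks a model up by a region key with a general fallback. The function names, the `margin` parameter, and the `"general"` key are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of audio samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def calibrate_noise_floor(ambient_samples):
    """Step (a), assumed form: measure ambient noise before any speech."""
    return rms(ambient_samples)


def is_speech(frame, noise_floor, margin=3.0):
    """Treat a frame as speech only if it is well above the noise floor."""
    return rms(frame) > noise_floor * margin


def select_model(models, region, default="general"):
    """Step (c), assumed form: pick the region's model, else a fallback."""
    return models.get(region, models[default])
```

Calibrating first lets the client avoid forwarding frames that are only background noise, while the region lookup stands in for whatever selection criterion an actual system would apply.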
Specification