Method for processing speech using dynamic grammars

US 20040236580A1
Filed: 03/02/2004
Published: 11/25/2004
Est. Priority Date: 11/12/1999
Status: Active Grant

Abstract

Speech data is processed with one or more dynamic grammars, to reduce latency and improve accuracy. Different speech grammars are used by a speech recognition process depending on a context experienced by a speaker, and sentence grammars are similarly varied during a natural language process. The methods are useful for distributed speech recognition systems such as a client-server system, typically implemented on an intranet or over the Internet based on user queries at his/her computer, a PDA, or a workstation using a speech input interface.
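
As a rough illustration of the idea in the abstract, the sketch below shows how restricting the recognizer to a context-specific grammar can change which word hypothesis wins. It is a minimal sketch, not the patented implementation: the context names, word lists, scores, and the functions `load_grammar` and `best_word` are all hypothetical and appear nowhere in the patent.

```python
from typing import Optional

# Hypothetical context names and word lists, invented for this illustration.
GRAMMARS = {
    "course_catalog": {"which", "courses", "are", "offered"},
    "checkout": {"what", "is", "my", "total"},
}


def load_grammar(context: str) -> Optional[set]:
    """Return the word set for a context, or None (unrestricted) if unknown."""
    return GRAMMARS.get(context)


def best_word(hypotheses, context: str) -> Optional[str]:
    """Pick the highest-scoring hypothesis that the active grammar allows.

    `hypotheses` is a list of (word, acoustic_score) pairs for one word slot.
    """
    grammar = load_grammar(context)
    allowed = [
        (word, score)
        for word, score in hypotheses
        if grammar is None or word in grammar
    ]
    if not allowed:
        return None
    return max(allowed, key=lambda pair: pair[1])[0]


hypotheses = [("curses", 0.51), ("courses", 0.49)]
print(best_word(hypotheses, "course_catalog"))  # courses
print(best_word(hypotheses, "unknown_page"))    # curses
```

With the smaller grammar active, the acoustically weaker but contextually valid word is chosen, and fewer candidates need to be scored, which is the accuracy and latency argument the abstract makes.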

30 Claims

1. A distributed speech query recognition system adapted for responding to speech-based queries comprising:
- a client device including;
  
  i) a speech capture software module for capturing a speech utterance from a user of said client device and partially processing said speech utterance to capture acoustic features;
  
  ii) a communications module for transferring said acoustic features and context information associated with said client device through a network;
  
  wherein said context information is related to items presented within a browser to said user of said client device when said user provides said speech utterance;
  
  a server device including a speech recognition engine software module for generating recognized speech data from said acoustic features;
  
  wherein said speech recognition engine uses a dynamic speech recognition grammar which is loaded based on said context information.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The distributed speech query recognition system of claim 1, wherein said speech recognition engine also uses a dynamic speech recognition dictionary of phonemes which is loaded for a context experienced by a speaker using said client device.
  - 3. The distributed speech query recognition system of claim 1, further including a natural language engine for generating a recognized sentence from said recognized speech data.
  - 4. The distributed speech query recognition system of claim 3, wherein said recognized sentence is also determined by a database switched in response to said context.
  - 5. The distributed speech query recognition system of claim 4, wherein said recognized sentence is determined by identifying a relationship of words present in said recognized speech data, including whether such words are near in each other in said speech utterance.
  - 6. The distributed speech query recognition system of claim 1 further including a query response software module for providing an answer to said speaker in response to a question posed in said speech utterance.
  - 7. The distributed speech query recognition system of claim 1, wherein said recognized speech data is generated in real time.
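
The client/server split in claim 1 can be sketched as a client that extracts features and sends them together with context information, and a server that assembles a grammar from that context. This is a minimal sketch under stated assumptions: the per-frame log-energy is only a stand-in for real acoustic features, and the JSON message layout, the `FRAME` size, and the names `extract_features`, `build_request`, and `handle_request` are hypothetical.

```python
import json
import math

FRAME = 4  # samples per frame; unrealistically small, for illustration only


def extract_features(samples):
    """Client side: one log-energy value per full frame (stand-in feature)."""
    features = []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[start:start + FRAME]
        energy = sum(s * s for s in frame) / FRAME
        features.append(round(math.log(energy + 1e-9), 4))
    return features


def build_request(samples, page_items):
    """Client side: bundle features with the items shown in the browser."""
    return json.dumps({
        "features": extract_features(samples),
        "context": {"items": page_items},
    })


def handle_request(body, grammars_by_item):
    """Server side: build the active grammar from the reported context."""
    message = json.loads(body)
    active = set()
    for item in message["context"]["items"]:
        active |= grammars_by_item.get(item, set())
    # A real engine would now decode message["features"] against `active`.
    return {"n_frames": len(message["features"]), "grammar": sorted(active)}


# Hypothetical word lists keyed by on-screen item.
grammars_by_item = {"menu_courses": {"courses", "offered"}, "menu_fees": {"fees"}}
request = build_request([0.0, 1.0, 0.0, -1.0, 0.5, 0.5, 0.5, 0.5, 0.1],
                        ["menu_courses"])
print(handle_request(request, grammars_by_item))
```

Only the compact feature vectors and a small context record cross the network in this sketch; the raw audio stays on the client.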

8. A speech query recognition system adapted for responding to speech-based queries system comprising:
- a speech recognition engine for generating recognized speech data from a speech signal resulting from a speech-based query provided by a speaker;
  
  wherein said speech recognition engine uses a limited speech recognition grammar of words which is loaded for a context experienced by said speaker when said speech-based query is made, said context being related to an environment presented within a browser to said speaker at a time when said speaker provides said speech-based query;
  
  a natural language engine which generates recognized speech sentence data corresponding to said speech-based query based on said recognized speech data;
  
  one or more query/response databases for storing question/answer pairs corresponding to said speech-based query;
  
  wherein a set of question/answer pairs is selected and used to determine an answer to said speech-based query based on said context experienced by said speaker;
  
  a query formulation engine adapted for retrieving one or more question/answer pairs from said set of question/answer pairs based on said recognized speech sentence data provided by said natural language engine.
- Dependent claims (9, 10, 11, 12, 13, 14, 15):
  - 9. The speech query recognition system of claim 8, wherein said one or more query/response databases are linked to said speech recognition system and include separate independent databases for identifying queries made at different hierarchical levels of an interface presented to said speaker.
  - 10. The speech query recognition system of claim 8, wherein said context is changed in accordance with changes in an options menu on a website page presented to said speaker.
  - 11. The speech query recognition system of claim 8, wherein dictionary files of phonemes are also loaded in accordance with said context.
  - 12. The speech query recognition system of claim 8, wherein said speech recognition engine is distributed between a client device and a server device, and information for said context is communicated from said client device to said server device automatically based on a state of an application program executing on said client device.
  - 13. The speech query recognition system of claim 8, further including a step of identifying said answer to said speaker in real-time.
  - 14. The speech query recognition system of claim 8, wherein said answer is communicated in an audible form to said speaker.
  - 15. The speech query recognition system of claim 8, wherein said limited speech recognition grammar is a reduced set of words which are recognizable by said speech recognition engine.
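
The context-selected question/answer lookup in claim 8 might be sketched as follows. This is an assumption-laden illustration: the claim does not say how pairs are matched, so simple word overlap is swapped in here, and the context keys, placeholder questions and answers, and the function `find_answer` are all hypothetical.

```python
# Hypothetical per-context question/answer sets; all text is placeholder data.
QA_SETS = {
    "chapter_1": [
        ("what is covered in chapter one", "Placeholder answer for chapter one."),
        ("who wrote this chapter", "Placeholder answer about the author."),
    ],
    "chapter_2": [
        ("what is covered in chapter two", "Placeholder answer for chapter two."),
    ],
}


def find_answer(recognized_sentence, context):
    """Return the answer whose stored question shares the most words with
    the recognized sentence, searching only the set chosen by the context."""
    pairs = QA_SETS.get(context, [])
    query_words = set(recognized_sentence.lower().split())
    best_answer, best_overlap = None, 0
    for question, answer in pairs:
        overlap = len(query_words & set(question.split()))
        if overlap > best_overlap:
            best_answer, best_overlap = answer, overlap
    return best_answer


print(find_answer("What is covered in chapter one", "chapter_1"))
```

Because the context picks the set before any matching happens, the search space stays small regardless of how many question/answer pairs exist overall.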

16. A method of responding to speech-based queries across a distributed speech recognition system including the steps of:
- capturing a speech utterance from a user of a client device;
  
  extracting acoustic features from said speech utterance to perform a partial recognition of said speech utterance;
  
  transferring said acoustic features and context information associated with said client device through a network to a server device;
  
  wherein said context information is related to items presented within a browser to said user of said client device when said user provides said speech utterance;
  
  completing said recognition of said speech utterance at said server device to generate recognized speech data from said acoustic features;
  
  wherein said speech recognition engine at said server device uses a dynamic speech recognition grammar which is loaded based on said context information.
- Dependent claims (17, 18, 19, 20, 21, 22):
  - 17. The method of claim 16, wherein said speech recognition engine also uses a dynamic speech recognition dictionary of phonemes which is loaded for a context experienced by a speaker using said client device.
  - 18. The method of claim 16, further including a step:
    - generating a recognized sentence from said recognized speech data using a natural language engine.
  - 19. The method of claim 18, wherein said recognized sentence is also determined by a database switched in response to said context.
  - 20. The method of claim 19, wherein said recognized sentence is determined by identifying a relationship of words present in said recognized speech data, including whether such words are near in each other in said speech utterance.
  - 21. The method of claim 16 further including a step of providing an answer to said speaker in response to a question posed in said speech utterance.
  - 22. The method of claim 16, wherein said recognized speech data is generated in real time.
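
Claims 5 and 20 refer to identifying whether words in the recognized speech data are near each other. One simple way to express such a check is shown below; the window size and the function `words_near` are hypothetical, and the claims do not specify any particular distance measure.

```python
def words_near(tokens, first, second, window=2):
    """Return True if `first` and `second` occur within `window` positions
    of each other anywhere in the token sequence."""
    first_positions = [i for i, tok in enumerate(tokens) if tok == first]
    second_positions = [i for i, tok in enumerate(tokens) if tok == second]
    return any(
        abs(i - j) <= window
        for i in first_positions
        for j in second_positions
    )


tokens = "what courses are offered this term".split()
print(words_near(tokens, "courses", "offered"))  # True
print(words_near(tokens, "what", "term"))        # False
```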

23. A method of responding to speech-based queries system comprising the steps of:
- generating recognized speech data from a speech signal resulting from a speech-based query provided by a speaker;
  
  wherein said speech recognition engine uses a limited speech recognition grammar of words which is loaded for a context experienced by said speaker when said speech-based query is made, said context being related to an environment presented within a browser to said speaker at a time when said speaker provides said speech-based query;
  
  generates recognized speech sentence data corresponding to said speech-based query based on said recognized speech data;
  
  storing question/answer pairs corresponding to said speech-based query in one or more query/response databases;
  
  wherein a set of question/answer pairs is selected and used to determine an answer to said speech-based query based on said context experienced by said speaker;
  
  retrieving one or more question/answer pairs from said set of question/answer pairs based on said recognized speech sentence data provided by said natural language engine.
- Dependent claims (24, 25, 26, 27, 28, 29, 30):
  - 24. The method of claim 23, further including a step of linking said one or more query/response databases to said speech recognition system so that separate independent databases are used for identifying queries made at different hierarchical levels of an interface presented to said speaker.
  - 25. The method of claim 23, further including a step of changing said context in accordance with changes in an options menu on a website page presented to said speaker.
  - 26. The method of claim 23, further including a step of loading dictionary files of phonemes in accordance with said context.
  - 27. The method of claim 23, wherein said speech recognition engine is distributed between a client device and a server device, and information for said context is communicated from said client device to said server device automatically based on a state of an application program executing on said client device.
  - 28. The method of claim 23, further including a step of identifying said answer to said speaker in real-time.
  - 29. The method of claim 28, wherein said answer is communicated in an audible form to said speaker.
  - 30. The method of claim 23, wherein said limited speech recognition grammar is a reduced set of words which are recognizable by said speech recognition engine.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Phoenix Solutions Incorporated (CDC Corporation)
Inventors: Bennett, Ian M.
Granted Patent: US 7,555,431 B2
US Class Current: 704/270.1
CPC Class Codes

G06F 16/243   Natural language query form...

G06F 16/24522   Translation of natural lang...

G06F 16/3344   using natural language anal...

G06F 40/216   using statistical methods

G06F 40/237   Lexical tools

G06F 40/30   Semantic analysis

G06F 40/42   Data-driven translation

G06F 40/44   Statistical methods, e.g. p...

G09B 5/04   with audible presentation o...

G09B 7/00   Electrically-operated teach...

G10L 15/142   Hidden Markov Models [HMMs]

G10L 15/18   using natural language mode...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/183   using context dependencies,...

G10L 15/22   Procedures used during a sp...

G10L 15/285   Memory allocation or algori...

G10L 15/30   Distributed recognition, e....

H04M 2250/74   with voice recognition means

Y10S 707/99933   Query processing, i.e. sear...

Y10S 707/99935   Query augmenting and refini...
