Speech based learning/training system using semantic decoding
Abstract
An intelligent query system for processing voice-based queries is disclosed, which uses semantic-based processing to identify the question posed by the user by understanding the meaning of the user's utterance. Based on identifying the meaning of the utterance, the system selects a single answer that best matches the user's query. The answer that is paired to this single question is then retrieved and presented to the user. The system, as implemented, accepts environmental variables selected by the user and is scalable to provide answers to a variety and quantity of user-initiated queries.
16 Claims
1. A machine executable program for assisting a computing system to effectuate distributed voice query recognition comprising:
a first audio signal receiving routine for receiving user speech utterances, said speech utterances including sentences comprised of one or more words, and converting said speech utterances into analog speech utterance signals;
a first signal processing routine adapted to convert said analog speech utterance signals into digital representative speech data values, which by themselves are insufficient for completing recognition of said one or more words contained in said speech utterance signals;
a formatting routine for converting said representative speech data values into a transmission format suitable for transmission over a communications channel to a second processing routine executing on a separate computing system, wherein said representative speech data values are transmitted continuously during said speech utterances within streaming packets and without waiting for silence to be detected and/or said speech utterances to be completed;
wherein said representative speech data values are used by said second processing routine to compute additional speech data values derived from said representative speech data values to generate a signal sufficient for a speech recognition routine to complete recognition of words articulated in said speech utterance, and wherein the computing system for converting speech utterances into recognized words is distributed between a client and a server; and
a natural language processing routine for determining a meaning associated with said words recognized by said speech recognition routine.
(Dependent claim 2 not shown.)
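Claim 1's client-side split — computing only partial feature values and streaming them in packets without waiting for end-of-utterance silence — can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation; the frame size, the four-band log-energy "representative" features, and all function names are invented for the example.

```python
# Hypothetical sketch of the client-side routines of claim 1: the client
# computes only coarse partial features ("representative speech data
# values") per frame and streams them in fixed-size packets continuously,
# without any silence detection. All parameters are illustrative.
import math
import struct
from typing import Iterator, List

FRAME_SIZE = 160     # samples per frame (e.g. 10 ms at 16 kHz) -- assumed
PACKET_FRAMES = 4    # frames carried per streaming packet -- assumed

def partial_features(frame: List[float]) -> List[float]:
    """Coarse 4-band log-energy profile: enough to transmit onward,
    not enough on its own to complete word recognition."""
    band_width = len(frame) // 4
    bands = []
    for b in range(4):
        chunk = frame[b * band_width:(b + 1) * band_width]
        energy = sum(s * s for s in chunk) / max(len(chunk), 1)
        bands.append(math.log(energy + 1e-9))
    return bands

def stream_packets(samples: List[float]) -> Iterator[bytes]:
    """Yield packets as soon as enough frames accumulate -- the stream
    never waits for silence or for the utterance to finish."""
    packet: List[float] = []
    for start in range(0, len(samples) - FRAME_SIZE + 1, FRAME_SIZE):
        packet.extend(partial_features(samples[start:start + FRAME_SIZE]))
        if len(packet) >= PACKET_FRAMES * 4:
            yield struct.pack("%df" % len(packet), *packet)
            packet = []
    if packet:  # flush the remainder without waiting for utterance end
        yield struct.pack("%df" % len(packet), *packet)
```

For a 1600-sample input (10 frames), the generator emits two full packets of 16 floats and a final flushed packet of 8 floats, illustrating the continuous, mid-utterance transmission the claim recites.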
3. A method of performing speech recognition comprising the steps of:
(a) receiving user speech utterance signals representing speech utterances to be recognized, said speech utterances including sentences comprised of one or more words;
(b) generating representative speech data values with a first computing device which by themselves are insufficient for completing recognition of said one or more words contained in said speech utterance signals;
(c) formatting said representative speech data values into a transmission format suitable for transmission over a communications channel from said first computing device to a second computing device, wherein said representative speech data values are transmitted continuously during said speech utterances within streaming packets and without waiting for silence to be detected and/or said speech utterances to be completed;
(d) performing a recognition of said one or more words at said second computing device using said representative speech data values and additional speech data values derived from said representative speech data values to generate recognized text; and
(e) performing a natural language processing operation on said recognized text to determine a meaning associated with said sentences in real-time.
(Dependent claims 4 and 5 not shown.)
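The server side of this method derives additional speech data values from the received representative values before completing recognition. The sketch below assumes frame-to-frame deltas as the "additional" values and a toy nearest-template matcher as the recognizer; both choices are illustrative, not taken from the patent.

```python
# Hypothetical server-side counterpart to step (d): derive additional
# values (here, frame deltas) from the received representative values,
# then complete recognition. The template matcher is a toy stand-in.
from typing import Dict, List

def derive_deltas(features: List[List[float]]) -> List[List[float]]:
    """Additional speech data values derived from the received ones."""
    deltas = []
    for i, frame in enumerate(features):
        prev = features[i - 1] if i > 0 else frame
        deltas.append([a - b for a, b in zip(frame, prev)])
    return deltas

def recognize(features: List[List[float]],
              templates: Dict[str, List[List[float]]]) -> str:
    """Pick the word whose template is nearest in feature+delta space."""
    augmented = [f + d for f, d in zip(features, derive_deltas(features))]

    def distance(tmpl: List[List[float]]) -> float:
        aug_t = [f + d for f, d in zip(tmpl, derive_deltas(tmpl))]
        n = min(len(augmented), len(aug_t))
        return sum(sum((a - b) ** 2 for a, b in zip(x, y))
                   for x, y in zip(augmented[:n], aug_t[:n]))

    return min(templates, key=lambda w: distance(templates[w]))
```

The key point mirrored from the claim is that the client's values alone are insufficient: recognition only completes once the server augments them with the derived values.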
6. A method of interacting with a user of a natural language query system comprising the steps of:
a) providing a client side speech recognition engine adapted to recognize a first set of words and/or phrases from a user during an interactive speech session; wherein said first set of words and/or phrases can include a natural language query presented as continuous natural language spoken data; and wherein said first set of words and/or phrases for said speech recognition engine is derived from a limited speech recognition grammar of words that are preloaded into the speech recognition engine prior to the time the speech recognition engine receives speech from a user;
b) providing a plurality of databases having query/answer pairs, wherein each database concerns one or more topics which can be responded to by the natural language query system during said interactive speech based session with a user;
c) providing a server side natural language routine adapted to process said first set of words and/or phrases and identify a response to said natural language query based on said query/answer pairs; wherein said natural language routine is adapted to consider only a subset of said first set of words and/or phrases, and further can consider words and/or phrases in said natural language query which are not present in said query/answer pairs to determine said response;
d) providing an interactive electronic agent coupled to said natural language routine and configured to:
i. provide a prompt to the user during said interactive speech based session with suggestions on queries which can be made to the natural language query system;
ii. provide a confirmation of a substance of said natural language query;
iii. provide said response to the user from the natural language query routine.
(Dependent claims 7, 8, and 9 not shown.)
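The server-side routine of step c) tolerates query words that appear in no query/answer pair: it scores only the overlap with known question words. A minimal sketch of that behavior, with invented QA data and a simple word-overlap score standing in for the patent's semantic matching:

```python
# Sketch of the natural language routine in step c): score a query
# against query/answer pairs using only words that occur in the pairs,
# so out-of-vocabulary filler words are ignored rather than fatal.
# The scoring rule and data shapes are illustrative assumptions.
from typing import List, Tuple

def best_answer(query: str, qa_pairs: List[Tuple[str, str]]) -> str:
    """Return the answer whose paired question shares the most words
    with the query; words absent from every pair are simply skipped."""
    known = {w for q, _ in qa_pairs for w in q.lower().split()}
    query_words = [w for w in query.lower().split() if w in known]

    def overlap(pair: Tuple[str, str]) -> int:
        q_words = set(pair[0].lower().split())
        return sum(1 for w in query_words if w in q_words)

    _question, answer = max(qa_pairs, key=overlap)
    return answer
```

Here a query such as "um what exactly is photosynthesis please" still matches the pair for "what is photosynthesis", because "um", "exactly", and "please" never enter the score.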
10. A speech query recognition system adapted for responding to speech-based queries, comprising:
(a) a continuous speech recognition engine for generating recognized speech data from a speech signal resulting from a speech-based query provided by a speaker;
(b) wherein said continuous speech recognition engine uses a limited speech recognition grammar of words which is pre-loaded for a context experienced by said speaker before said speech-based query is made, said context being determined automatically by an application program executing for said speaker such that the grammar is available at a time when said speaker provides said speech-based query;
(c) wherein the continuous speech recognition engine is distributed between a client and a server;
(d) a natural language engine which generates recognized speech sentence data corresponding to a meaning of said speech-based query based on said recognized speech data;
(e) a plurality of query/response databases for storing question/answer pairs corresponding to said speech-based query such that queries made at different hierarchical levels of an interface presented to said speaker are associated with separate independent databases;
(f) wherein a first limited set of question/answer pairs is determined based on said context from a complete set of question/answer pairs supported by such speech query recognition system;
(g) a query formulation engine adapted for retrieving one or more question/answer pairs from said first limited set of question/answer pairs based on said recognized speech sentence data provided by said natural language engine, which natural language engine is further adapted by said context to not consider every possible word or phrase present in said complete set of question/answer pairs and only considers words and phrases present in said first limited set of question/answer pairs to determine said meaning of said recognized speech sentence data;
(h) wherein the speech query recognition system is configured so that said context controls both a limited speech recognition grammar used for speech recognition of the speech-based query and a set of one or more answers to be provided in response thereto.
(Dependent claims 11 and 12 not shown.)
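Element (h)'s central idea — one context record governing both the pre-loaded recognition grammar and the limited answer set — can be sketched with a single lookup table. The context names, grammar words, and QA data below are all invented for illustration.

```python
# Illustrative sketch of element (h): one "context" entry selects both
# the limited grammar pre-loaded on the client and the limited
# question/answer subset searched on the server. All data is invented.
from typing import Dict, List, Tuple

CONTEXTS: Dict[str, Dict] = {
    "history-lesson": {
        "grammar": ["who", "when", "was", "the", "revolution", "war"],
        "qa": [("when was the revolution", "It began in 1775.")],
    },
    "math-lesson": {
        "grammar": ["what", "is", "a", "prime", "number"],
        "qa": [("what is a prime number",
                "A number divisible only by one and itself.")],
    },
}

def load_for_context(context: str) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return the pre-loadable grammar and the limited QA subset for a
    context, mirroring how a single context governs both stages."""
    entry = CONTEXTS[context]
    return entry["grammar"], entry["qa"]
```

Because both values come from the same entry, switching the application's context (element (b)'s automatic determination) atomically retargets recognition and answering, as the claim requires.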
13. A method of responding to speech-based queries, comprising the steps of:
(a) generating recognized speech data with a continuous speech recognition engine from a speech signal resulting from a speech-based query provided by a speaker;
(b) wherein said continuous speech recognition engine uses a limited speech recognition grammar of words which is pre-loaded for a context experienced by said speaker before said speech-based query is made, said context being determined automatically by an application program executing for said speaker such that the grammar is available at a time when said speaker provides said speech-based query;
(c) wherein the continuous speech recognition engine is distributed between a client and a server;
(d) generating recognized speech sentence data corresponding to said speech-based query based on said recognized speech data;
(e) storing question/answer pairs corresponding to said speech-based query in one or more of a plurality of query/response databases such that separate independent databases identify queries made at different hierarchical levels of an interface presented to said speaker;
(f) wherein a first limited set of question/answer pairs is selected based on said context from a complete set of question/answer pairs supported by such speech query recognition system;
(g) retrieving one or more question/answer pairs from said first limited set of question/answer pairs based on said recognized speech sentence data provided by a natural language engine, which natural language engine is further adapted by said context to not consider every possible word or phrase present in said complete set of question/answer pairs and only considers words and phrases present in said first limited set of question/answer pairs to determine a meaning of said recognized speech sentence data;
(h) wherein the system is configured so that said context controls both a limited speech recognition grammar used for speech recognition of the speech-based query and a set of one or more answers to be provided in response thereto.
(Dependent claims 14 and 15 not shown.)
16. A method of operating a speech-based lesson tutorial system having question and answer capability, the method comprising the steps of:
(a) configuring a list of retrievable predefined questions and a corresponding list of predefined answers for said lesson;
(b) generating recognized speech data from a user query pertaining to said lesson, wherein said recognized speech data is generated by a combination of both speech utterance recognition and natural language processing, and wherein processing of said recognized speech data is distributed between client side routines and server side routines;
(c) locating a corresponding predefined question for said user query using said recognized speech data across multiple computing systems so that more than one lesson file is consulted for said user query; and
(d) converting a corresponding predefined answer for said corresponding predefined question into a form perceptible to the user;
wherein at least step (b) is performed at least in part while a user is articulating said query so as to emulate a human response time in response to a user query, so that the user perceives interaction with such system in essentially the same way that would be experienced from interacting with a real human, and wherein a limited grammar of words is preloaded into the speech recognition engine prior to the time the speech recognition engine receives speech from a user.
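Steps (a) and (c) of the tutorial method — per-lesson predefined question/answer lists, consulted across more than one lesson file per query — can be sketched as below. The lesson data, the word-overlap score, and the fallback answer are all invented for the example.

```python
# Hypothetical sketch of steps (a) and (c) of claim 16: each lesson
# file holds its own predefined question/answer list, and a recognized
# user query is looked up across every lesson file. Data is invented.
from typing import Dict, List, Tuple

LESSON_FILES: Dict[str, List[Tuple[str, str]]] = {
    "lesson1": [("what is an atom", "The smallest unit of an element.")],
    "lesson2": [("what is a molecule", "Two or more atoms bonded together.")],
}

def locate_answer(recognized_query: str) -> str:
    """Consult every lesson file and return the answer paired with the
    best-matching predefined question; fall back when nothing matches."""
    q_words = set(recognized_query.lower().split())
    best, best_score = "Sorry, I don't know.", 0
    for pairs in LESSON_FILES.values():
        for question, answer in pairs:
            score = len(q_words & set(question.lower().split()))
            if score > best_score:
                best, best_score = answer, score
    return best
```

In a real deployment the lesson files would live on separate computing systems, as the claim recites; a flat dictionary stands in for that distribution here.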