
Systems and methods for responding to natural language speech utterance

  • US 7,640,160 B2
  • Filed: 08/05/2005
  • Issued: 12/29/2009
  • Est. Priority Date: 08/05/2005
  • Status: Active Grant
First Claim

1. A system for processing multi-modal natural language inputs, comprising:

  • a multi-modal voice user interface configured to receive a multi-modal input, the multi-modal input including a natural language utterance and a non-speech input, wherein a transcription module coupled to the multi-modal voice user interface is configured to transcribe the non-speech input to create a non-speech-based transcription;

    a multi-pass speech recognition module configured to transcribe the natural language utterance into text;

    a merging module configured to merge the text of the transcribed utterance and the non-speech-based transcription to create a merged transcription;

    a plurality of domain agents, wherein a context description grammar includes one or more grammar expression entries that one or more of the plurality of domain agents are configured to use to process requests in respective contexts;

    a knowledge-enhanced speech recognition engine configured to determine a most likely context for the multi-modal input, the knowledge-enhanced speech recognition engine further configured to:

    identify one or more contexts that completely or partially match one or more text combinations contained in the merged transcription, wherein identifying the matching contexts includes comparing the text combinations against the grammar expression entries in the context description grammar and against one or more expected contexts stored in a context stack;

    score each of the identified matching contexts; and

    select the matching context having a highest score as the most likely context for the multi-modal input; and

    a response generating module configured to identify one or more of the plurality of domain agents that are configured to process requests in the most likely context for the multi-modal input, the response generating module configured to:

    communicate a request to the identified domain agents, the request formulated using at least one grammar expression entry in the context description grammar; and

    generate a response to the multi-modal input using content gathered as a result of the identified domain agents processing the request.
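
The context-determination steps recited above (merge the two transcriptions, identify matching contexts, score them, select the highest score) can be illustrated with a minimal sketch. This is not the patented implementation: the claim does not specify a scoring formula, so the token-overlap score, the `stack_bonus` parameter, and every name below (`GrammarEntry`, `merge_transcriptions`, `score_contexts`, `most_likely_context`) are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class GrammarEntry:
    """Hypothetical stand-in for one grammar expression entry."""
    context: str
    phrases: Tuple[str, ...]


def merge_transcriptions(utterance_text: str, non_speech_text: str) -> str:
    """Join the speech and non-speech transcriptions into one merged string."""
    parts = (utterance_text.strip(), non_speech_text.strip())
    return " ".join(p for p in parts if p)


def score_contexts(
    merged: str,
    grammar: Sequence[GrammarEntry],
    context_stack: Sequence[str],
    stack_bonus: float = 0.5,  # assumed weighting, not taken from the claim
) -> Dict[str, float]:
    """Score every context that completely or partially matches the merged text."""
    tokens = set(merged.lower().split())
    scores: Dict[str, float] = {}
    for entry in grammar:
        phrase_tokens = {w for p in entry.phrases for w in p.lower().split()}
        overlap = tokens & phrase_tokens
        if not overlap:
            continue  # no match at all, so this context is not a candidate
        # Assumed score: fraction of the entry's vocabulary found in the input.
        score = len(overlap) / len(phrase_tokens)
        # Contexts already expected on the context stack get a fixed boost.
        if entry.context in context_stack:
            score += stack_bonus
        scores[entry.context] = max(score, scores.get(entry.context, 0.0))
    return scores


def most_likely_context(scores: Dict[str, float]) -> Optional[str]:
    """Select the matching context with the highest score, if any matched."""
    return max(scores, key=scores.get) if scores else None


grammar: List[GrammarEntry] = [
    GrammarEntry("weather", ("weather forecast", "temperature")),
    GrammarEntry("traffic", ("traffic report", "road conditions")),
]
merged = merge_transcriptions("what is the weather", "seattle")
best = most_likely_context(score_contexts(merged, grammar, context_stack=[]))
```

In this sketch a partial match still produces a candidate context, and an entry on the context stack can outrank a weaker grammar-only match, which mirrors the claim's comparison against both the grammar entries and the expected contexts.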
