System and method for adaptive language understanding by computers
Abstract
A system and method are described for adaptive language understanding using multimodal language acquisition in human-computer interaction. Words, phrases, sentences, production rules (syntactic information) as well as their corresponding meanings (semantic information) are stored. New words, phrases, sentences, production rules and their corresponding meanings can be acquired through interaction with users, using different input modalities, such as speech, typing, pointing, drawing and image capturing. This system therefore acquires language through a natural language and multimodal interaction with users. New language knowledge is acquired in two ways. First, by acquiring new linguistic units, i.e. words or phrases and their corresponding semantics, and second by acquiring new sentences or language rules and their corresponding computer actions. The system represents an adaptive spoken interface capable of interpreting the user's spoken commands and sensory inputs and of learning new linguistic concepts and production rules. Such a system and the underlying method can not only be used to build adaptive conversational or dialog systems, but also to build adaptive interactive computer interfaces and operating systems, expert systems and computer games.
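To make the two kinds of acquired knowledge in the abstract concrete, the following is a minimal Python sketch, not the patented implementation. It keeps one table for linguistic units and their semantics and a second table for sentences and their computer actions. The class name `LanguageKnowledge`, its methods, and the example entries are all hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class LanguageKnowledge:
    """Hypothetical store for the two kinds of acquired language knowledge."""

    # Linguistic unit (word or phrase) -> semantic description.
    semantics: Dict[str, dict] = field(default_factory=dict)
    # Sentence -> computer action to run when that sentence is understood.
    actions: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def learn_unit(self, unit: str, meaning: dict) -> None:
        """First way: acquire a new word or phrase and its semantics."""
        self.semantics[unit.lower()] = meaning

    def learn_sentence(self, sentence: str, action: Callable[[], str]) -> None:
        """Second way: acquire a new sentence and its computer action."""
        self.actions[sentence.lower()] = action

    def interpret(self, sentence: str) -> str:
        """Run the action for a known sentence, or report that it is unknown."""
        action = self.actions.get(sentence.lower())
        return action() if action else "unknown sentence"


# Example entries are invented for illustration only.
kb = LanguageKnowledge()
kb.learn_unit("crimson", {"type": "color", "rgb": (220, 20, 60)})
kb.learn_sentence("open the editor", lambda: "editor opened")
result = kb.interpret("Open the editor")
```

Keeping the two tables separate mirrors the abstract's distinction between units with semantics and sentences with actions; a real system would presumably link them through its grammar rather than through exact string matches.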
31 Claims
1. A method for adaptive language understanding using multimodal language acquisition, comprising the steps of:
receiving from a user one or more spoken utterances comprising at least one word;
identifying whether said utterance comprises unknown words not included in a database;
requesting the user to provide semantic information for said identified unknown words;
storing the identified unknown word and creating and storing a new semantic object corresponding to the identified unknown word based on the semantic information received from the user through one or more input modalities.
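The four steps of claim 1 can be sketched in a few lines of Python. This is only an illustrative reading of the claim under stated assumptions: the utterance is assumed to have already been converted to text, the database is modeled as a plain dictionary, and the function `acquire_unknown_words`, the callback `ask_user`, and the fields of the stored semantic object are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List


def acquire_unknown_words(
    utterance: str,
    database: Dict[str, dict],
    ask_user: Callable[[str], dict],
) -> List[str]:
    """Return the words learned from one (already transcribed) utterance."""
    learned = []
    # Step 1: the received utterance, split into words.
    for word in utterance.lower().split():
        # Step 2: a word is unknown if it is not in the database.
        if word in database:
            continue
        # Step 3: request semantic information from the user. The callback
        # stands in for any input modality (speech, typing, pointing, ...).
        meaning = ask_user(word)
        # Step 4: store the word together with a new semantic object.
        database[word] = {"word": word, **meaning}
        learned.append(word)
    return learned


def demo_ask_user(word: str) -> dict:
    """Hypothetical stand-in for a real dialog with the user."""
    return {"modality": "typing", "description": f"user-supplied meaning of {word}"}


database = {"draw": {"word": "draw"}, "a": {"word": "a"}}
learned = acquire_unknown_words("draw a hexagon", database, demo_ask_user)
```

Passing the user interaction in as a callback keeps the sketch independent of any particular modality, which is the property the claim's final step emphasizes.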
22. An adaptive language understanding computer system comprising:
a) an automatic speech recognition engine for converting spoken utterances into text strings;
b) a language understanding module for at least processing spoken utterances having;
i) a rule grammar for storing allowed vocabulary of words, sentences and production rules recognized and understood by the system;
ii) a semantic database for storing semantic objects describing semantic representations of the words; and
iii) a first parser for identifying the semantic interpretation of the recognized and understood spoken utterances;
iv) a command processor for executing appropriate commands or computer actions.
c) a new-word detector module for at least processing spoken utterances not allowed by the rule grammar, having;
i) a dictation grammar for storing a vocabulary of words and allowing the speech recognizer to recognize the spoken utterances if the spoken utterances are not allowed in the rule grammar; and
ii) a second parser for identifying words in the spoken utterances not found in the rule grammar as unknown words;
d) a multimodal semantic acquisition module responsive to an input of semantics for the identified unknown words by creating and storing in the semantic database new semantic objects corresponding to the identified unknown words;
e) a dialog processor module for communicating by synthetic voice with the user;
f) one or more input devices selected from a group consisting of microphone, keyboard, mouse, pen tablet and computer video camera, and combinations thereof.
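The division of labor between the language understanding module (b) and the new-word detector module (c) in claim 22 can be illustrated with a small routing sketch in Python. It is a simplification under stated assumptions: the rule grammar is reduced to a set of allowed sentences plus a vocabulary, the dictation grammar is assumed to have already produced the text, and the function `route_utterance` and the action strings are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def route_utterance(
    text: str,
    rule_sentences: Dict[str, str],
    rule_vocabulary: Set[str],
) -> Tuple[str, List[str]]:
    """Route recognized text to command execution or new-word detection."""
    sentence = text.lower().strip()
    # Module (b): the sentence is allowed by the rule grammar, so its
    # interpretation is handed to the command processor.
    if sentence in rule_sentences:
        return ("command", [rule_sentences[sentence]])
    # Module (c): the sentence is not allowed by the rule grammar. Words
    # missing from the rule grammar vocabulary are flagged as unknown.
    unknown = [w for w in sentence.split() if w not in rule_vocabulary]
    return ("new_words", unknown)


# Invented example grammar; the action string is a placeholder.
rule_sentences = {"select the circle": "SELECT(circle)"}
rule_vocabulary = {"select", "the", "circle"}

known = route_utterance("Select the circle", rule_sentences, rule_vocabulary)
novel = route_utterance("select the hexagon", rule_sentences, rule_vocabulary)
```

In this sketch the unknown words returned on the second path would then be passed to the multimodal semantic acquisition module (d), which is outside the scope of the example.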
Specification