Statistical language model trained with semantic variants
Abstract
An intelligent query system for processing voice-based queries is disclosed, which uses a combination of statistical and semantic-based processing to identify the question posed by the user by understanding the meaning of the user's utterance. Based on identifying the meaning of the utterance, the system selects a single answer that best matches the user's query. The answer that is paired with this single question is then retrieved and presented to the user. The system, as implemented, accepts environmental variables selected by the user and is scalable to provide answers to a variety and quantity of user-initiated queries.
18 Claims
1. A method of generating a statistical language model (SLM) grammar for a task domain which includes semantically variant words and phrases, the method comprising the steps of:
(a) providing a set of content words which can be associated with user questions in the task domain; and
using a computer system;
(b) determining semantic variants for each word in said set of content words;
wherein said semantic variants include at least synonyms;
(c) forming a semantic set of questions related to said user questions based on said synonyms;
(d) performing semantic decoding on said semantic set of questions, to identify a disambiguated set of questions; and
(e) configuring n-gram probabilities for words and phrases in said SLM grammar based on said set of disambiguated questions;
wherein said SLM grammar is configured to recognize semantic variants of questions posed to a natural language speech recognition engine.
Dependent claims: 2-13
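Steps (a) through (d) of claim 1 can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the synonym table, the sense tags, the `DOMAIN_SENSES` set, and the functions `expand_question` and `disambiguate` are hypothetical, and the sense-tag filter is only a toy stand-in for the "semantic decoding" the claim recites, which the claim itself does not define.

```python
from itertools import product

# Hypothetical synonym table standing in for a lexical dictionary.
# Each synonym carries a made-up sense tag so that step (d) has something to filter on.
SYNONYMS = {
    "price": [("cost", "money")],
    "course": [("class", "education"), ("path", "direction")],
}

# Hypothetical set of senses considered valid in this task domain.
DOMAIN_SENSES = {"money", "education"}


def expand_question(question):
    """Step (c): build every variant by swapping content words for synonyms."""
    options = []
    for word in question.split():
        # The original word is always allowed and carries no sense tag.
        choices = [(word, None)] + SYNONYMS.get(word, [])
        options.append(choices)
    # Cartesian product over the per-word choices yields all variants.
    return [list(combo) for combo in product(*options)]


def disambiguate(variants):
    """Toy stand-in for step (d): keep variants whose synonyms all fit the domain."""
    kept = []
    for variant in variants:
        if all(sense is None or sense in DOMAIN_SENSES for _, sense in variant):
            kept.append(" ".join(word for word, _ in variant))
    return kept


seed = "what is the price of the course"
semantic_set = expand_question(seed)        # semantic set of questions
disambiguated = disambiguate(semantic_set)  # disambiguated set of questions
```

In this sketch the variant "what is the price of the path" is generated in step (c) but dropped in step (d), because "path" belongs to a sense outside the assumed task domain.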
14. A speech processing system that implements a statistical language model (SLM) grammar for a task domain which includes semantically variant words and phrases, comprising:
a computing system; and
one or more data repositories associated with the computing system, the one or more data repositories storing the SLM grammar, wherein the SLM grammar includes
(a) a set of content words which can be associated with user questions in the task domain;
(b) a set of semantic variants for each word in said set of content words;
wherein said semantic variants include at least synonyms; and
(c) a disambiguated set of questions which are based on a semantic set of questions related to said user questions based on said synonyms;
wherein the SLM grammar includes n-gram probabilities for words and phrases which are configured based on said set of disambiguated questions; and
further wherein said SLM grammar is configured to recognize semantic variants of questions posed to a natural language speech recognition engine.
15. A method of generating a statistical language model (SLM) grammar for a task domain which includes semantically variant words and phrases, comprising:
using a computing system;
determining semantic variants for each word in a set of content words associated with user questions in a task domain using a lexical dictionary, the semantic variants including at least synonyms;
forming a semantic set of questions related to the user questions based on the semantic variants;
performing semantic decoding on the semantic set of questions to identify a disambiguated set of questions; and
configuring n-gram probabilities for words and phrases in the SLM grammar based on the set of disambiguated questions.
Dependent claims: 16, 17, 18
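The final step, configuring n-gram probabilities from the disambiguated questions, might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the patented procedure: the claims do not say which n-gram order or estimator is used, so the sketch picks bigrams with add-one (Laplace) smoothing, and the function `train_bigram_model` and the sample questions are hypothetical.

```python
from collections import Counter


def train_bigram_model(questions):
    """Estimate add-one smoothed bigram probabilities from a list of questions."""
    context_counts = Counter()  # how often each word appears as a left context
    bigram_counts = Counter()   # how often each (previous, word) pair appears
    vocab = set()
    for question in questions:
        # Sentence boundary markers let the model score question starts and ends.
        tokens = ["<s>"] + question.lower().split() + ["</s>"]
        vocab.update(tokens)
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))

    def prob(prev, word):
        # Add-one smoothing: unseen pairs get a small non-zero probability.
        return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))

    return prob, vocab


# Hypothetical disambiguated set of questions (assumed output of the earlier steps).
disambiguated = [
    "what is the price of the course",
    "what is the cost of the course",
    "what is the price of the class",
    "what is the cost of the class",
]

prob, vocab = train_bigram_model(disambiguated)
seen = prob("the", "cost")    # pair observed in the training questions
unseen = prob("the", "what")  # pair never observed
```

Because the synonym variants "cost" and "class" appear in the training questions, a grammar configured this way assigns them the same probability after "the" as the original words "price" and "course", which is the effect the claims describe as recognizing semantic variants.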
Specification