Real time parsing and suggestions from pre-generated corpus with hypernyms
Abstract
Systems and methods of natural language processing in an environment with no existing corpus are disclosed. The method includes defining an input grammar specific to a chosen domain, the input grammar having domain-specific knowledge and general grammatical knowledge. Groups of tokens having syntactic and semantic equivalence are identified within the input grammar. The identified groups are assembled into hypernyms, wherein the hypernyms include a semantic output for each token in the hypernyms. A list of fields is then provided for combination with the hypernyms. A corpus of possible combinations of hypernyms and fields is created. A data structure mapping each possible combination to a partial semantic output is generated, and the data structure is saved for use in later processing.
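The pre-processing stage described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the `Hypernym` class, the example tokens, the `<NAME>` and `<DATE>` placeholders, the `RULES` list, and the `build_corpus` function are all hypothetical names chosen for the example. The sketch also assumes that a combination is "partial" because its field slots are left unfilled, and it omits hypernyms whose semantic outputs link to other hypernyms.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Hypernym:
    """A group of syntactically and semantically equivalent tokens (hypothetical layout)."""
    semantic_id: str
    syntactic_id: str
    tokens: dict  # token -> semantic output for that token

    @property
    def identifier(self) -> str:
        # The hypernym identifier combines the semantic and syntactic identifiers.
        return f"{self.semantic_id}:{self.syntactic_id}"


# Illustrative domain data; none of these values come from the source.
HYPERNYMS = [
    Hypernym("ACTION", "VERB", {"find": {"intent": "search"}, "show": {"intent": "search"}}),
    Hypernym("ITEM", "NOUN", {"emails": {"type": "email"}, "files": {"type": "file"}}),
    Hypernym("SENDER", "PREP", {"from": {"relation": "sender"}}),
]
FIELDS = {"<NAME>": "name", "<DATE>": "date"}  # placeholder -> output key
RULES = [  # assumed grammatical rules: allowed sequences of hypernyms and fields
    ("ACTION:VERB", "ITEM:NOUN"),
    ("ACTION:VERB", "ITEM:NOUN", "SENDER:PREP", "<NAME>"),
]


def build_corpus(hypernyms, fields, rules):
    """Enumerate every valid combination and map it to a partial semantic output."""
    by_id = {h.identifier: h for h in hypernyms}
    mapping = {}
    for rule in rules:
        choices = []
        for slot in rule:
            if slot in by_id:
                # A hypernym slot expands to each of its tokens.
                choices.append(list(by_id[slot].tokens.items()))
            else:
                # A field slot stays a placeholder with an unfilled value.
                choices.append([(slot, {fields[slot]: None})])
        for combo in product(*choices):
            phrase = tuple(token for token, _ in combo)
            partial = {}
            for _, output in combo:
                partial.update(output)
            mapping[phrase] = partial
    return mapping


MAPPING = build_corpus(HYPERNYMS, FIELDS, RULES)
```

Because the mapping is generated once and stored, query-time processing in this sketch reduces to a dictionary lookup keyed by the token pattern.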
19 Claims
1. A method for determining semantics of domain queries comprising:
receiving an input query string from a remote device via a network connection;
pre-defining an input grammar data structure specific to a chosen domain, the input grammar data structure comprising tokens identified by semantic identifiers and syntactic identifiers;
identifying semantic groups of tokens corresponding to the semantic identifiers within the input grammar data structure having syntactic equivalence as signified by the syntactic identifiers;
assembling each one of the identified semantic groups into a hypernym to obtain a plurality of hypernyms, wherein the hypernym comprises a hypernym data structure including a hypernym identifier combining a corresponding semantic identifier and syntactic identifier of the each one of the identified semantic groups, and tokens of the each one of the identified semantic groups mapped to semantic outputs, wherein at least one of the semantic outputs corresponding to one of the tokens includes a semantic identifier linking to another hypernym;
providing a list of fields for combination with the plurality of hypernyms, wherein the list of fields comprises text fields for input of names and keywords and custom tokens for input of dates and locations;
generating a corpus of valid combinations of hypernyms and fields, based on at least some of the plurality of hypernyms and at least some of the list of fields, according to a set of grammatical rules;
generating a first mapping data structure mapping each valid combination to a partial semantic output by combining the semantic outputs of the hypernym data structure;
determining semantics of the input query string based on a tokenization of the input query string, the corpus, and the first mapping data structure; and
transmitting, to the remote device, a communication associated with a meaning of the input query string based on the semantics.
Dependent claims: 2, 3, 4, 5, 6, 7
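As a rough illustration of the determining step recited in claim 1, the sketch below tokenizes a query, collapses unrecognized tokens into a field placeholder, and looks the resulting pattern up in a pre-generated mapping. It is a simplified example under stated assumptions, not the claimed method: the `MAPPING` contents, the single `<NAME>` text field, and the `determine_semantics` function are hypothetical, and whitespace splitting stands in for whatever tokenization the specification actually uses.

```python
# Hypothetical pre-generated mapping from valid combinations to partial semantic outputs.
MAPPING = {
    ("find", "emails"): {"intent": "search", "type": "email"},
    ("find", "emails", "from", "<NAME>"): {
        "intent": "search", "type": "email", "relation": "sender", "name": None,
    },
}
VOCABULARY = {"find", "emails", "from"}  # tokens that belong to some hypernym


def determine_semantics(query, mapping, vocabulary, field="<NAME>", key="name"):
    """Return the semantic output for a query, or None if no combination matches."""
    pattern, value = [], []
    for token in query.split():  # assumed tokenization: split on whitespace
        if token.lower() in vocabulary:
            pattern.append(token.lower())
        else:
            # Consecutive unknown tokens collapse into one field placeholder.
            if not pattern or pattern[-1] != field:
                pattern.append(field)
            value.append(token)
    partial = mapping.get(tuple(pattern))
    if partial is None:
        return None
    semantics = dict(partial)  # copy so the stored mapping is not mutated
    if value:
        semantics[key] = " ".join(value)  # fill the field slot from the query text
    return semantics


result = determine_semantics("find emails from Alice Smith", MAPPING, VOCABULARY)
```

A query whose token pattern is absent from the mapping yields `None` here; how the claimed system handles such queries is not stated in this section.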
8. A system, comprising:
a processor; and
memory storing computer executable instructions that cause the processor to perform acts comprising:
receiving an input query string from a remote device via a network connection;
pre-defining an input grammar data structure specific to a chosen domain, the input grammar data structure comprising tokens identified by semantic identifiers and syntactic identifiers;
identifying semantic groups of tokens corresponding to the semantic identifiers within the input grammar data structure having syntactic equivalence as signified by the syntactic identifiers;
assembling each one of the identified semantic groups into a hypernym to obtain a plurality of hypernyms, wherein each hypernym comprises a hypernym data structure including a hypernym identifier combining a corresponding semantic identifier and syntactic identifier of at least one of the identified semantic groups, and tokens of at least one of the identified semantic groups mapped to semantic outputs, wherein at least one of the semantic outputs corresponding to at least one of the tokens includes a semantic identifier linking to another hypernym;
providing a list of fields for combination with the plurality of hypernyms, wherein the list of fields comprises text fields for input of names and keywords and custom tokens for input of dates and locations;
generating a corpus of valid combinations of hypernyms and fields, based on at least some of the plurality of hypernyms and at least some of the list of fields, according to a set of grammatical rules;
generating a first mapping data structure mapping each valid combination to a partial semantic output by combining the semantic outputs of the hypernym data structure;
determining semantics of the input query string based on a tokenization of the input query string, the corpus, and the first mapping data structure; and
transmitting, to the remote device, a communication associated with a meaning of the input query string based on the semantics.
Dependent claims: 9, 10, 11, 12, 13
14. A non-transitory computer readable medium storing computer executable instructions that when executed by a processor cause the processor to perform acts comprising:
receiving an input query string from a remote device via a network connection;
pre-defining an input grammar data structure specific to a chosen domain, the input grammar data structure comprising tokens identified by semantic identifiers and syntactic identifiers;
identifying semantic groups of tokens corresponding to the semantic identifiers within the input grammar data structure having syntactic equivalence as signified by the syntactic identifiers;
assembling each one of the identified semantic groups into a hypernym to obtain a plurality of hypernyms, wherein each hypernym comprises a hypernym data structure including a hypernym identifier combining a corresponding semantic identifier and syntactic identifier of at least one of the identified semantic groups, and tokens of at least one of the identified semantic groups mapped to semantic outputs, wherein at least one of the semantic outputs corresponding to at least one of the tokens includes a semantic identifier linking to another hypernym;
providing a list of fields for combination with the plurality of hypernyms, wherein the list of fields comprises text fields for input of names and keywords and custom tokens for input of dates and locations;
generating a corpus of valid combinations of hypernyms and fields, based on at least some of the plurality of hypernyms and at least some of the list of fields, according to a set of grammatical rules;
generating a first mapping data structure mapping each valid combination to a partial semantic output by combining the semantic outputs of the hypernym data structure;
determining semantics of the input query string based on a tokenization of the input query string, the corpus, and the first mapping data structure; and
transmitting, to the remote device, a communication associated with a meaning of the input query string based on the semantics.
Dependent claims: 15, 16, 17, 18, 19
Specification