Methods, apparatus, and data structures for facilitating a natural language interface to stored information
First Claim
1. A method for preprocessing a natural language database query, the method comprising:
- a) accepting an alphanumeric string related to the query;
b) parsing the alphanumeric string to generate query words;
c) determining whether any of the query words, or any phrases formed by at least two adjacent query words, match any of a plurality of indexed annotations;
d) if a query word or phrase matches one or more of the plurality of indexed annotations, for each of the plurality of indexed annotations, adding a pattern associated with the indexed annotation to a group associated with the query word or phrase;
e) selecting a pattern from each group of patterns to generate a selection of patterns, wherein the pattern may include an object which may be bound to a set of rows or unbound, and wherein two patterns may be unified if they have objects having the same defined type; and
f) combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, wherein, in the act of combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, includes;
i) determining start states by determining all possible selections, ii) applying a cost heuristic, based on a database schema, to each of the states, iii) selecting a lowest cost state, and iv) until a single, connected, pattern is found, A) generating successor states of the selected lowest cost state, wherein the act of generating successor states of the selected lowest cost state, includes;
for each pair of patterns in the selected state,
determining possible actions and their costs, wherein the possible actions include a unify entity action, a unify relationship action, and a link patterns via a path action; and
pruning actions based on their costs B) applying a cost heuristic, based on the database schema, to each of the successor states, C) estimating a cost for each of the successor states, D) adding the successor states to a queue of all states, and E) selecting a new lowest cost state from the queue of all states.
3 Assignments
0 Petitions
Accused Products
Abstract
An authoring tool (or process) to facilitate the performance of an annotation function and an indexing function. The annotation function may generate informational annotations and word annotations to a database design schema (e.g., an entity-relationship diagram or “ERD”). The indexing function may analyze the words of the annotations by classifying the words in accordance with a concordance and dictionary, and assign a normalized weight to each word of each of the annotations based on the classification(s) of the word(s) of the annotation.
A query translator (or query translation process) to (i) accept a natural language query from a user interface process, (ii) convert the natural language query to a formal command query (e.g., an SQL query) using the indexed annotations generated by the authoring tool and the database design schema, and (iii) present the formal command query to a database management process for interrogating the relational database.
66 Citations
13 Claims
-
1. A method for preprocessing a natural language database query, the method comprising:
-
a) accepting an alphanumeric string related to the query;
b) parsing the alphanumeric string to generate query words;
c) determining whether any of the query words, or any phrases formed by at least two adjacent query words, match any of a plurality of indexed annotations;
d) if a query word or phrase matches one or more of the plurality of indexed annotations, for each of the plurality of indexed annotations, adding a pattern associated with the indexed annotation to a group associated with the query word or phrase;
e) selecting a pattern from each group of patterns to generate a selection of patterns, wherein the pattern may include an object which may be bound to a set of rows or unbound, and wherein two patterns may be unified if they have objects having the same defined type; and
f) combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, wherein, in the act of combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, includes;
i) determining start states by determining all possible selections, ii) applying a cost heuristic, based on a database schema, to each of the states, iii) selecting a lowest cost state, and iv) until a single, connected, pattern is found, A) generating successor states of the selected lowest cost state, wherein the act of generating successor states of the selected lowest cost state, includes;
for each pair of patterns in the selected state,
determining possible actions and their costs, wherein the possible actions include a unify entity action, a unify relationship action, and a link patterns via a path action; and
pruning actions based on their costsB) applying a cost heuristic, based on the database schema, to each of the successor states, C) estimating a cost for each of the successor states, D) adding the successor states to a queue of all states, and E) selecting a new lowest cost state from the queue of all states. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A method for preprocessing a natural language database query the method comprising:
-
a) accepting an alphanumeric string related to the query;
b) parsing the alphanumeric string to generate query words;
c) determining whether any of the query words, or any phrases formed by at least two adjacent query words, match any of a plurality of indexed annotations;
d) if a query word or phrase matches one or more of the plurality of indexed annotations, for each of the plurality of indexed annotations, adding a pattern associated with the indexed annotation to a group associated with the query word or phrase;
e) selecting a pattern from each group of patterns to generate a selection of patterns; and
f) combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, wherein, in the act of combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, includes;
i) determining start states by determining all possible selections, ii) applying a cost heuristic, based on a database schema, to each of the states, iii) selecting a lowest cost state, and iv) until a single, connected, pattern is found, A) generating successor states of the selected lowest cost state by;
1) determining whether any patterns of the lowest cost state include a nexus and if so, selecting a minimum cost pattern with a nexus, otherwise, selecting a minimum cost pattern, and 2) for each pair of patterns in the selected state including the selected pattern, determining possible actions and their costs, wherein the possible actions include a unify entity action, a unify relationship action, and a link patterns via a path action, wherein the cost of a link patterns via a path action is based on at least one of (i) a number of entities in the path and (ii) a number of relationships in the path, wherein if one of the patterns is an entity table and another of the patterns is a property table, the unified pattern will have the property table patterns, B) applying a cost heuristic, based on the database schema, to each of the successor states, C) estimating a cost for each of the successor states, D) adding the successor states to a queue of all states, and E) selecting a new lowest cost state from the queue of all states.
-
-
7. A method for preprocessing a natural language database query, the method comprising:
-
a) accepting an alphanumeric string related to the query;
b) parsing the alphanumeric string to generate query words;
c) determining whether any of the query words, or any phrases formed by at least two adjacent query words, match any of a plurality of indexed annotations;
d) if a query word or phrase matches one or more of the plurality of indexed annotations, for each of the plurality of indexed annotations, adding a pattern associated with the indexed annotation to a group associated with the query word or phrase;
e) selecting a pattern from each group of patterns to generate a selection of patterns, wherein the pattern may include an object which may be bound to a set of rows or unbound, and wherein two patterns may be unified if they have objects having the same defined type; and
f) combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, wherein, in the act of combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, includes;
i) determining start states by determining all possible selections, ii) applying a cost heuristic, based on a database schema, to each of the states, iii) selecting a lowest cost state, and iv) until a single, connected, pattern is found, A) generating successor states of the selected lowest cost state by;
1) determining whether any patterns of the lowest cost state include a nexus and if so, selecting a minimum cost pattern with a nexus, otherwise, selecting a minimum cost pattern, and 2) for each pair of patterns in the selected state including the selected pattern, determining possible actions and their costs, wherein the possible actions include a unify entity action, a unify relationship action, and a link patterns via a path action, B) applying a cost heuristic, based on the database schema, to each of the successor states, C) estimating a cost for each of the successor states, D) adding the successor states to a queue of all states, and E) selecting a new lowest cost state from the queue of all states. - View Dependent Claims (8, 9, 10, 11)
-
-
12. A method for preprocessing a natural language database query, the method comprising:
-
a) accepting an alphanumeric string related to the query;
b) parsing the alphanumeric string to generate query words;
c) determining whether any of the query words, or any phrases formed by at least two adjacent query words, match any of a plurality of indexed annotations;
d) if a query word or phrase matches one or more of the plurality of indexed annotations, for each of the plurality of indexed annotations, adding a pattern associated with the indexed annotation to a group associated with the query word or phrase;
e) selecting a pattern from each group of patterns to generate a selection of patterns; and
f) combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, wherein, in the act of combining the patterns of the selection of patterns to generate a single, connected, lowest cost pattern, includes;
i) determining start states by determining all possible selections, ii) applying a cost heuristic, based on a database schema, to each of the states, iii) selecting a lowest cost state, and iv) until a single, connected, pattern is found, A) generating successor states of the selected lowest cost state, B) applying a cost heuristic, based on the database schema, to each of the successor states, C) estimating a cost for each of the successor states, wherein the act of estimating a cost of each successor state includes;
determining a known cost, determining an unknown cost, and determining an estimated cost based on the known cost and the unknown cost, the known cost is the cost of the parent state plus the cost of the action of generating the successor state,D) adding the successor states to a queue of all states, and E) selecting a new lowest cost state from the queue of states. - View Dependent Claims (13)
-
Specification