Natural language querying with cascaded conditional random fields
Abstract
A natural language query tool comprising cascaded conditional random fields (CRFs) (e.g., a linear-chain CRF and a skip-chain CRF applied sequentially) processes natural language input to produce output that can be used in database searches. For example, cascaded CRFs extract entities from natural language input that correspond to column names or column values in a database, and identify relationships between the extracted entities. A search engine can execute queries based on output from the cascaded CRFs over an inverted index of a database, which can be based on one or more materialized views of the database. Results can be sorted (e.g., according to relevance scores) and presented in a user interface.
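As an illustration of the search step described in the abstract, the following minimal sketch builds an inverted index over rows of a materialized view and sorts matching rows by a relevance score. The rows, the column names, and the scoring rule (a count of matched terms) are all assumptions made for this example; the abstract does not specify how relevance is computed.

```python
from collections import defaultdict

# Hypothetical rows of a materialized view; names and values are invented.
ROWS = [
    {"name": "Alice", "department": "sales", "city": "Pune"},
    {"name": "Bob", "department": "sales", "city": "Mysore"},
    {"name": "Carol", "department": "finance", "city": "Pune"},
]


def build_inverted_index(rows):
    """Map each (column, value) pair to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for column, value in row.items():
            index[(column, value.lower())].add(row_id)
    return index


def search(index, terms):
    """Score rows by the number of matched terms (an assumed relevance rule)."""
    scores = defaultdict(int)
    for column, value in terms:
        for row_id in index.get((column, value.lower()), ()):
            scores[row_id] += 1
    # Highest score first; ties are broken by row id to keep the order stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


INDEX = build_inverted_index(ROWS)
# Terms as (column name, column value) pairs, such as the cascaded CRFs might yield.
RESULTS = search(INDEX, [("department", "sales"), ("city", "Pune")])
```

Here `RESULTS` holds `(row id, score)` pairs, with the row that matches both terms ranked first.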
26 Claims
1. A computer-implemented method, comprising:
receiving natural language input at a computing device;
processing the natural language input to generate an output, by:
processing the natural language input in a first conditional random field to obtain a first stage output from the first conditional random field, the processing in the first conditional random field comprises:
extracting at least a first entity and a second entity from the natural language input, wherein the first entity and the second entity appear separately in a natural language query expressed with the received natural language input, and at least one of the first entity and the second entity identifies a column name of a database table; and
labeling the first entity as a database column value for the database table, and labeling the second entity as a database column name for the database table, the first stage output comprising a database column value label associated with the first entity and a database column name label associated with the second entity; and
after processing the natural language input in the first conditional random field, processing the natural language input and the first stage output in a second conditional random field to obtain a second stage output from the second conditional random field, the processing in the second conditional random field comprising identifying at least one relationship between the first entity and the second entity in the first stage output, the second stage output comprising information that represents the at least one relationship, wherein the output comprises the first stage output and the second stage output; and
forming a search query based at least in part on the output, wherein forming the query comprises determining a query type, wherein the query type is at least one of a range query, a logical query, a join query, or an aggregate query, and wherein the query is based at least in part on the query type.
Dependent claims: 2 to 19
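The query-forming step of claim 1 can be pictured with a small sketch. It assumes the two CRF stages have already run, so the first stage output is a list of labeled entities and the second stage output is a list of (value, name) index pairs. The label strings, the cue-word sets, and the dictionary layout of the result are hypothetical choices for illustration and are not taken from the patent. Join queries are left out because detecting them would need schema information that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    text: str
    label: str  # "COLUMN_VALUE" or "COLUMN_NAME" (label names are assumptions)


# Illustrative cue words only; a real system would learn or configure these.
AGGREGATE_CUES = {"average", "total", "count", "sum", "maximum", "minimum"}
RANGE_CUES = {"between", "above", "below", "greater", "less", "over", "under"}
LOGICAL_CUES = {"and", "or", "not"}


def determine_query_type(tokens):
    """Pick a query type from cue words; returns None if no cue is found."""
    words = {token.lower() for token in tokens}
    if words & AGGREGATE_CUES:
        return "aggregate"
    if words & RANGE_CUES:
        return "range"
    if words & LOGICAL_CUES:
        return "logical"
    return None


def form_query(tokens, entities, relationships):
    """Combine the stage outputs into a query description.

    relationships holds (value_index, name_index) pairs into entities,
    standing in for the second stage output.
    """
    terms = [(entities[n].text, entities[v].text) for v, n in relationships]
    return {"type": determine_query_type(tokens), "terms": terms}


TOKENS = "list employees whose salary is above 50000".split()
# First stage output: the first entity is a column value, the second a column name.
ENTITIES = [Entity("50000", "COLUMN_VALUE"), Entity("salary", "COLUMN_NAME")]
# Second stage output: entity 0 (value) relates to entity 1 (name).
RELATIONSHIPS = [(0, 1)]
QUERY = form_query(TOKENS, ENTITIES, RELATIONSHIPS)
```

The resulting `QUERY` pairs the column name with its value and carries the detected type, which a later step could translate into the syntax of a particular search engine.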
20. One or more non-transitory computer-readable storage media having stored thereon computer-executable instructions which when executed by a computer cause the computer to perform a method, the method comprising:
receiving natural language input at a computing device;
processing the natural language input in cascaded conditional random fields comprising a linear-chain conditional random field and a skip-chain conditional random field to obtain an output from the cascaded conditional random fields, wherein:
the linear-chain conditional random field is used to extract entity information;
the skip-chain conditional random field is used to extract relationship information; and
the skip-chain conditional random field is sequential to the linear-chain conditional random field so that an output of the linear-chain conditional random field is an input to the skip-chain conditional random field, wherein the linear-chain conditional random field and the skip-chain conditional random field are trained using a training dataset comprising entity labels and relationship labels; and
based on the output from the cascaded conditional random fields, forming a database query, wherein forming the query comprises determining a query type, wherein the query type is at least one of a range query, a logical query, a join query, or an aggregate query, and wherein the query is based at least in part on the query type.
Dependent claims: 21 to 25
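To make the cascade in claim 20 more concrete, the sketch below runs Viterbi decoding over a linear-chain scoring model and then lists the long-range token pairs that a skip-chain stage could examine. It is a toy under stated assumptions: the emission and transition scores are set by hand rather than trained, the label set and the `KNOWN_COLUMNS` vocabulary are invented, and the second function only enumerates candidate skip edges without training or running a skip-chain conditional random field.

```python
LABELS = ["O", "NAME", "VALUE"]  # assumed label set: other, column name, column value
KNOWN_COLUMNS = {"salary", "department"}  # hypothetical column vocabulary
# Hand-set transition scores; unlisted label pairs score 0.0.
TRANSITION = {("NAME", "NAME"): -1.0, ("VALUE", "VALUE"): -1.0}


def emission(token, label):
    """Hand-set score for assigning a label to a token (not learned)."""
    if token.isdigit():
        return 2.0 if label == "VALUE" else 0.0
    if token in KNOWN_COLUMNS:
        return 2.0 if label == "NAME" else 0.0
    return 1.0 if label == "O" else 0.0


def viterbi(tokens):
    """Return the highest-scoring label sequence under the linear-chain scores."""
    # best[i][label] = (score of the best path ending in label at i, previous label)
    best = [{lab: (emission(tokens[0], lab), None) for lab in LABELS}]
    for token in tokens[1:]:
        column = {}
        for lab in LABELS:
            prev, score = max(
                ((p, best[-1][p][0] + TRANSITION.get((p, lab), 0.0)) for p in LABELS),
                key=lambda pair: pair[1],
            )
            column[lab] = (score + emission(token, lab), prev)
        best.append(column)
    # Backtrack from the best final label.
    label = max(LABELS, key=lambda lab: best[-1][lab][0])
    path = [label]
    for column in reversed(best[1:]):
        label = column[label][1]
        path.append(label)
    return path[::-1]


def candidate_skip_edges(labels):
    """Pair every VALUE position with every NAME position (first stage output
    used as second stage input); a skip-chain model would score these pairs."""
    return [
        (i, j)
        for i, lab_i in enumerate(labels)
        if lab_i == "VALUE"
        for j, lab_j in enumerate(labels)
        if lab_j == "NAME"
    ]


TOKENS = ["salary", "above", "50000"]
STAGE_ONE = viterbi(TOKENS)
SKIP_EDGES = candidate_skip_edges(STAGE_ONE)
```

The point of the design is the ordering: the second function consumes only the labels produced by the first, which mirrors the claim language that the output of the linear-chain stage is an input to the skip-chain stage.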
26. A computing device comprising one or more processors, one or more output devices, and one or more computer-readable storage media having stored therein computer-executable instructions for performing a method, the method comprising:
receiving natural language input at the computing device via a user interface, the natural language input comprising plural terms;
assigning a part-of-speech tag to each of the plural terms in the natural language input;
processing the natural language input and the part-of-speech tags in a linear-chain conditional random field to obtain a first output from the linear-chain conditional random field, the first output comprising a database column value label and a database column name label, the database column value label indicating that a first term in the natural language input corresponds to a column value in a database, and the database column name label indicating that a second term in the natural language input corresponds to a column name in the database, wherein the first term and the second term appear separately within the natural language input;
processing the natural language input and the first output in a skip-chain conditional random field to obtain a second output from the skip-chain conditional random field, the second output comprising relationship information that associates the first term corresponding to the column value in the database with the second term corresponding to the column name in the database;
forming a search string based at least in part on the first output, wherein forming the search string comprises determining a query type, wherein the query type is at least one of a range query, a logical query, a join query, or an aggregate query, and wherein the search string is based at least in part on the query type;
sending the search string to a search engine;
receiving search results identified in the database by the search engine based at least in part on the search string;
ordering the search results based on relevance scores for the respective search results, the relevance scores based at least in part on the first output and the second output; and
presenting the ordered search results in the user interface.
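Claim 26 feeds part-of-speech tags into the linear-chain stage alongside the terms themselves. One common way to do that is to turn each term into a feature dictionary that includes its own tag and the tags of its neighbors. The sketch below shows that idea only; the lookup-table tagger, the Penn Treebank style tag names, and the feature names are assumptions for illustration, and the patent does not say which tagger or features are used.

```python
# Hypothetical hand-written tag table standing in for a real part-of-speech tagger.
POS_LOOKUP = {
    "show": "VB",
    "employees": "NNS",
    "in": "IN",
    "sales": "NNS",
    "department": "NN",
}


def pos_tag(tokens):
    """Assign a tag to every term; digits become CD, unknown words default to NN."""
    return [
        POS_LOOKUP.get(token.lower(), "CD" if token.isdigit() else "NN")
        for token in tokens
    ]


def token_features(tokens, tags, i):
    """Build the feature dictionary for the term at position i."""
    features = {
        "word": tokens[i].lower(),
        "pos": tags[i],
        "is_digit": tokens[i].isdigit(),
    }
    if i > 0:
        features["prev_pos"] = tags[i - 1]
    else:
        features["BOS"] = True  # beginning of the input
    if i < len(tokens) - 1:
        features["next_pos"] = tags[i + 1]
    else:
        features["EOS"] = True  # end of the input
    return features


TOKENS = "show employees in sales department".split()
TAGS = pos_tag(TOKENS)
FEATURES = [token_features(TOKENS, TAGS, i) for i in range(len(TOKENS))]
```

Each entry of `FEATURES` describes one term, so a sequence labeler could consume the list position by position when assigning column name and column value labels.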
Specification