Natural language querying of a data lake using contextualized knowledge bases
Abstract
A method of querying a data lake using natural language includes: receiving a natural language query directed to an electronic data lake; parsing the natural language query to determine a plurality of entities within the natural language query; identifying the plurality of entities using at least one contextual knowledge base, wherein the plurality of entities are compared against at least one entry in the at least one contextual knowledge base; mapping a dependency of the plurality of identified entities based on the parsed natural language query; constructing a structured data query based on the plurality of identified entities and the mapped dependency; and automatically generating a visual output of a result of the structured data query based on at least one characteristic from the set of: a data type, a data format, and a data size of the result of the structured data query.
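The constructing step described above can be illustrated with a minimal sketch. It assumes the identified entities have already been resolved to table and column names and that the mapped dependency has assigned each one a role. The `Entity` class, the role names (`measure`, `group`, `filter`), the `RELATIONSHIPS` lookup, the use of `SUM` as the aggregate, and the sample table names are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    table: str                   # data table the entity was identified in
    column: str                  # column within that table
    role: str                    # assumed role: "measure", "group", or "filter"
    value: Optional[str] = None  # literal value, used only by filter entities


# Hypothetical relationship entries: (table_a, table_b) -> join condition.
RELATIONSHIPS = {("sales", "customers"): "sales.customer_id = customers.id"}


def build_query(entities):
    """Build a parameterized SQL string and its parameter list from entities."""
    measures = [f"SUM({e.table}.{e.column})" for e in entities if e.role == "measure"]
    groups = [f"{e.table}.{e.column}" for e in entities if e.role == "group"]
    filters = [e for e in entities if e.role == "filter"]
    if not (measures or groups):
        raise ValueError("nothing to select")

    # Ordered, de-duplicated list of tables; the first one anchors the FROM clause.
    tables = list(dict.fromkeys(e.table for e in entities))
    from_clause = tables[0]
    for other in tables[1:]:
        cond = RELATIONSHIPS.get((tables[0], other)) or RELATIONSHIPS.get((other, tables[0]))
        if cond is None:
            raise ValueError(f"no known relationship between {tables[0]} and {other}")
        from_clause += f" JOIN {other} ON {cond}"

    sql = f"SELECT {', '.join(groups + measures)} FROM {from_clause}"
    if filters:
        # Identifiers come from the trusted lookup; user values stay as parameters.
        sql += " WHERE " + " AND ".join(f"{e.table}.{e.column} = ?" for e in filters)
    if groups:
        sql += " GROUP BY " + ", ".join(groups)
    return sql, [e.value for e in filters]


sql, params = build_query([
    Entity("sales", "amount", "measure"),
    Entity("customers", "region", "group"),
    Entity("sales", "year", "filter", "2023"),
])
```

Keeping filter values as bound parameters rather than interpolating them into the string is a design choice of this sketch, not something the abstract specifies.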
60 Citations
18 Claims
1. A method of querying a data lake using natural language, comprising the following steps:
- receiving a natural language query directed to an electronic data lake;
- parsing the natural language query to determine a plurality of entities within the natural language query;
- identifying the plurality of entities using at least one contextual knowledge base, wherein the plurality of entities are tabulated in at least one of a plurality of data tables by entity type and compared against at least one entry in the at least one contextual knowledge base, wherein at least one phrase of the natural language query is combined, the plurality of entities are soft-matched, and at least one entity above a threshold confidence level is identified, and wherein a relationship table structure knowledge base provides a relationship between at least two of the plurality of entities by determining relational links between at least two of the plurality of data tables;
- mapping a dependency relationship between the plurality of identified entities to determine relational parts of speech of the plurality of identified entities based on the parsed natural language query;
- constructing a structured data query based on the plurality of identified entities and the mapped dependency; and
- automatically generating a visual output of a result of the structured data query, wherein a format of the visual output is recommended by a visual recommender knowledge base based on at least a number of columns or a number of rows of the result of the structured data query.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
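The identifying step of claim 1 (combining phrases, soft-matching, and keeping entities above a threshold confidence level) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the knowledge base contents, the 0.8 threshold, the three-word phrase limit, and the choice of `difflib.SequenceMatcher` as the similarity measure are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical contextual knowledge base: entity type -> known entries.
KNOWLEDGE_BASE = {
    "metric": ["total sales", "net revenue"],
    "dimension": ["customer region", "product category"],
}


def combine_phrases(tokens, max_len=3):
    """Combine adjacent tokens into candidate phrases of 1..max_len words."""
    phrases = []
    for size in range(1, max_len + 1):
        for start in range(len(tokens) - size + 1):
            phrases.append(" ".join(tokens[start:start + size]))
    return phrases


def soft_match(query, threshold=0.8):
    """Return {entry: (entity_type, score)} for entries scoring at or above threshold."""
    tokens = query.lower().split()
    best = {}
    for phrase in combine_phrases(tokens):
        for entity_type, entries in KNOWLEDGE_BASE.items():
            for entry in entries:
                # Similarity ratio in [0, 1]; tolerates typos and plural forms.
                score = SequenceMatcher(None, phrase, entry).ratio()
                if score >= threshold and score > best.get(entry, (None, 0.0))[1]:
                    best[entry] = (entity_type, score)
    return best


# The misspelled "totl sales" and the plural "customer regions" both clear the threshold.
matches = soft_match("show totl sales by customer regions")
```

Here the similarity ratio stands in for the confidence level; a production system would likely use a more robust matcher and a tuned threshold.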
9. A computerized system having a memory and a processor for querying a data lake using natural language, comprising:
- an electronic data lake;
- at least one contextual knowledge base; and
- a user computer device in communication with the electronic data lake and the at least one contextual knowledge base over at least one network, wherein the user computer device has a processor and a non-transitory memory, wherein the processor of the user computer device is configured to:
  - receive a natural language query directed to the electronic data lake;
  - parse the natural language query to determine a plurality of entities within the natural language query;
  - identify the plurality of entities using the at least one contextual knowledge base, wherein the plurality of entities are tabulated in at least one of a plurality of data tables by entity type and compared against at least one entry in the at least one contextual knowledge base, wherein at least one phrase of the natural language query is combined, the plurality of entities are soft-matched, and at least one entity above a threshold confidence level is identified, and wherein a relationship table structure knowledge base provides a relationship between at least two of the plurality of entities by determining relational links between at least two of the plurality of data tables;
  - map a dependency relationship between the plurality of identified entities to determine relational parts of speech of the plurality of identified entities based on the parsed natural language query;
  - construct a structured data query based on the plurality of identified entities and the mapped dependency; and
  - automatically generate a visual output of a result of the structured data query on a display of the user computer device, wherein the visual output is recommended by a visual recommender knowledge base based on at least a number of columns or a number of rows of the result of the structured data query.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18
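The final element of claim 9, recommending a visual output from the number of columns or rows of the result, can be illustrated with a small rule-based sketch. The rules, cut-off values, and format names below are hypothetical placeholders; the claims do not state which formats correspond to which result shapes.

```python
def recommend_visual(n_rows, n_cols):
    """Pick a display format from the shape of a query result (assumed rules)."""
    if n_rows == 0 or n_cols == 0:
        return "no data message"
    if n_rows == 1 and n_cols == 1:
        return "single value"   # one cell: show the number itself
    if n_cols == 2 and n_rows <= 6:
        return "pie chart"      # few label/value pairs
    if n_cols == 2:
        return "bar chart"      # many label/value pairs
    return "table"              # wider results fall back to a grid


# Shape of a hypothetical result: 5 rows of (region, total sales).
choice = recommend_visual(n_rows=5, n_cols=2)
```

In a fuller system these rules would presumably live in the visual recommender knowledge base rather than in code, so they could be changed without redeploying.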
Specification