System and method for data flow analysis of complex data filters
Abstract
A system and method for analyzing the data flow of a database query. The database query contains a plurality of query conditions that are used to filter data records of a database. At least one query condition is identified from the plurality of query conditions in the database query. The database is queried based upon the identified query condition. At least one results characteristic is determined that is associated with the query of the database with the identified query condition. The results characteristic is used to analyze the identified query condition.
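The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the query conditions are assumed to be already split into a list (no SQL parsing is shown), the row count is used as the results characteristic, and the `customers` table, its columns, and the function names are hypothetical.

```python
import sqlite3

def analyze_conditions(conn, table, conditions):
    """Run each condition on its own and return {condition: matching row count}."""
    # Baseline: how many records exist before any condition is applied.
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    report = {"<no conditions>": total}
    for cond in conditions:
        # Query the database with only this one condition (hypothetical, trusted input).
        count = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {cond}"
        ).fetchone()[0]
        report[cond] = count
    return report

def build_demo_db():
    """Create a small in-memory table; all names and values are made up."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, age INTEGER, state TEXT)")
    rows = [(1, 25, "NC"), (2, 40, "NC"), (3, 67, "VA"), (4, 33, "SC"), (5, 51, "NC")]
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    return conn

if __name__ == "__main__":
    conn = build_demo_db()
    conditions = ["age > 30", "state = 'NC'", "age > 100"]
    for cond, n in analyze_conditions(conn, "customers", conditions).items():
        print(f"{cond}: {n} record(s)")
```

A condition whose count is zero (here `age > 100`) is the kind of filter such an analysis would flag for review.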
56 Claims
-
1. A computer-implemented method for analyzing a structured query language (SQL) type database query, said SQL type database query containing a plurality of query conditions that are used to filter data records of a database, comprising the steps of:
-
identifying at least one SQL type query condition from the plurality of query conditions in the SQL type database query;
querying the database based upon the identified query condition;
determining at least one results characteristic associated with the query of the database with the identified query condition, wherein the results characteristic is used to analyze the identified query condition.
visually depicting the result characteristic proximate to the node associated with the identified query condition.
-
-
8. The method of claim 6 further comprising the steps of:
-
determining results characteristics for each of the query conditions in the database query; and
visually depicting the result characteristic proximate to the node associated with the identified query condition.
-
-
9. The method of claim 6 further comprising the steps of:
-
determining first results characteristics for the database query without the query conditions and visually depicting the first result characteristic proximate to a query originating input node on the graphical query depiction; and
determining second results characteristics for the database query with all of the query conditions and visually depicting the second results characteristic.
-
-
10. The method of claim 6 further comprising the step of:
using the results characteristic to determine whether a node blocks substantially all the records the node receives.
-
11. The method of claim 6 further comprising the step of:
using the results characteristic to determine whether a node blocks all the records the node receives.
-
12. The method of claim 6 further comprising the step of:
using the results characteristic to determine whether a node blocks substantially none of the records the node receives.
-
13. The method of claim 6 further comprising the step of:
using the results characteristic to determine whether a node blocks none of the records the node receives.
-
14. The method of claim 6 further comprising the step of:
using the results characteristic to determine whether the query conditions do not block any record.
-
15. The method of claim 6 further comprising the step of:
-
determining results characteristics for two subsequent query condition nodes;
using the determined results characteristics to determine whether the same number of records are retrieved by the two subsequent query condition nodes.
-
-
16. The method of claim 6 further comprising the steps of:
-
using the results characteristic to determine whether a node blocks all the records the node receives;
using the results characteristic to determine whether a node blocks none of the records the node receives;
using the results characteristic to determine whether the database query does not block any record;
determining results characteristics for two query condition nodes connected in series; and
using the determined results characteristics to determine that the same number of records are retrieved by the two query condition nodes.
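The node checks recited in claims 10 through 16 can be illustrated with record counts alone. The sketch below assumes the results characteristic is a row count and that the nodes are connected in series, so each node receives what the previous node passed; the function names and labels are hypothetical, and the "substantially all" and "substantially none" variants, which would need a threshold the claims do not specify, are left out.

```python
def classify_chain(counts):
    """Label each node in a series of filter nodes.

    counts[0] is the number of records entering the chain and
    counts[i] is the number of records leaving node i.
    """
    findings = []
    for i in range(1, len(counts)):
        received, passed = counts[i - 1], counts[i]
        if received == 0:
            label = "no input"      # nothing reached this node
        elif passed == 0:
            label = "blocks all"    # every received record was filtered out
        elif passed == received:
            label = "blocks none"   # same count as the preceding node
        else:
            label = "filters"
        findings.append((i, received, passed, label))
    return findings

def query_blocks_nothing(counts):
    """True when the full chain returns as many records as entered it."""
    return counts[-1] == counts[0]

if __name__ == "__main__":
    for finding in classify_chain([100, 100, 40, 0, 0]):
        print(finding)
```

A "blocks none" label is also the case where two nodes connected in series retrieve the same number of records.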
-
-
17. The method of claim 1 further comprising the steps of:
-
generating a subset database from the database;
querying the subset database based upon the identified query condition; and
determining at least one results characteristic associated with the query of the subset database with the identified query condition, wherein the results characteristic is used to analyze the identified query condition.
-
-
18. The method of claim 17 further comprising the step of:
generating the subset database by applying statistical designs to select a subset of the data from the database.
-
19. The method of claim 17 further comprising the step of:
generating the subset database by using a subset of observations from the database.
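One way to picture the subset database of claims 17 through 19 is a simple random sample of rows copied into a second table. Simple random sampling is an assumption made here for illustration; the claims only say "statistical designs" and "a subset of observations". The table names, the `fraction` parameter, and the fixed seed are likewise hypothetical.

```python
import random
import sqlite3

def make_subset(conn, table, subset_table, fraction, seed=0):
    """Copy a random sample of rows from `table` into `subset_table`; return its size."""
    rowids = [r[0] for r in conn.execute(f"SELECT rowid FROM {table} ORDER BY rowid")]
    # Sample at least one row when the table is non-empty.
    k = min(len(rowids), max(1, round(len(rowids) * fraction)))
    chosen = random.Random(seed).sample(rowids, k)
    # Create an empty table with the same columns, then copy the chosen rows.
    conn.execute(f"CREATE TABLE {subset_table} AS SELECT * FROM {table} WHERE 0")
    conn.executemany(
        f"INSERT INTO {subset_table} SELECT * FROM {table} WHERE rowid = ?",
        [(rid,) for rid in chosen],
    )
    return k

def count_where(conn, table, condition):
    """Row count for one condition, used here as the results characteristic."""
    return conn.execute(f"SELECT COUNT(*) FROM {table} WHERE {condition}").fetchone()[0]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (value INTEGER)")
    conn.executemany("INSERT INTO readings VALUES (?)", [(i,) for i in range(1000)])
    k = make_subset(conn, "readings", "readings_subset", fraction=0.1)
    hits = count_where(conn, "readings_subset", "value < 500")
    print(f"{hits} of {k} sampled rows pass the condition")
```

Querying the smaller table gives an approximate pass rate for a condition without scanning the full database.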
-
20. The method of claim 1 wherein the database query includes structured query language (SQL) query clauses having sections for filtering records from the database, said sections being selected from the group consisting of Where sections, From sections, Having sections and combinations thereof.
-
21. The method of claim 20 further comprising the steps of:
-
identifying whether Where sections are contained within the database query;
separately querying the database based upon each of the identified Where sections;
determining individual results characteristics for each of the identified Where sections resulting from the individual querying of the database, wherein the results characteristics are used to analyze the identified Where sections of the database query.
-
-
22. The method of claim 20 further comprising the steps of:
-
identifying whether From sections are contained within the database query;
separately querying the database based upon each of the identified From sections;
determining individual results characteristics for each of the identified From sections resulting from the individual querying of the database, wherein the results characteristics are used to analyze the identified From sections of the database query.
-
-
23. The method of claim 22 further comprising the steps of:
-
identifying whether Having sections are contained within the database query;
separately querying the database based upon each of the identified Having sections;
determining individual results characteristics for each of the identified Having sections resulting from the individual querying of the database, wherein the results characteristics are used to analyze the identified Having sections of the database query.
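The section-by-section analysis of claims 21 through 23 might look like the following sketch. It rests on several assumptions: the From, Where, and Having sections are supplied already separated (no parser is shown), the sections are applied cumulatively in SQL evaluation order, which is only one possible reading of "separately querying", and the row or group count serves as the results characteristic. The `orders` table and all function names are hypothetical.

```python
import sqlite3

def section_counts(conn, from_clause, where=None, group_by=None, having=None):
    """Return a count of rows (or groups) surviving each filtering section."""
    def count(sql):
        # Wrap the query so that grouped results are counted as groups.
        return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]

    base = f"SELECT 1 FROM {from_clause}"
    counts = {"FROM": count(base)}
    if where:
        base += f" WHERE {where}"
        counts["WHERE"] = count(base)
    if group_by and having:
        counts["HAVING"] = count(f"{base} GROUP BY {group_by} HAVING {having}")
    return counts

def build_orders_db():
    """Small in-memory table with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    rows = [("a", 5), ("a", 20), ("a", 30), ("b", 15), ("c", 8), ("b", 2)]
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn

if __name__ == "__main__":
    conn = build_orders_db()
    print(section_counts(conn, "orders", where="amount > 10",
                         group_by="customer", having="COUNT(*) > 1"))
```

Comparing the counts from one section to the next shows which section removes the most records.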
-
-
24. The method of claim 20 further comprising the steps of:
optimizing a query of a section based upon the results of a query from a preceding section.
-
25. The method of claim 1 wherein the database is used within a data mining application.
-
26. A computer-implemented system for analyzing a structured query language (SQL) type database query, said SQL type database query containing a plurality of query conditions that are used to filter data records of a database, comprising:
-
a query parser module that identifies at least one SQL type query condition from the plurality of query conditions in the SQL type database query;
a query condition executor module connected to the query parser module that performs a query of the database based upon the identified query condition;
a results analyzer module connected to the query condition executor module that determines at least one results characteristic associated with the query of the database by the identified query condition; and
a results data structure connected to the results analyzer module that stores an association between the identified query condition and the results characteristic, wherein the results characteristic is used to analyze the identified query condition.
a graphical interface module connected to the query results analyzer module that displays the results characteristic with a visual graphical depiction of the query.
-
-
30. The system of claim 29 wherein the visual graphical depiction of the query includes the Boolean logic of the database query.
-
31. The system of claim 29 wherein the visual graphical depiction includes nodes associated with query conditions.
-
32. The system of claim 31 wherein the result characteristic is visually depicted proximate to the node associated with the identified query condition.
-
33. The system of claim 31 wherein the results analyzer module determines results characteristics for each of the query conditions in the database query, wherein the results data structure includes storage for associations between the determined results characteristics and their respective query conditions.
-
34. The system of claim 31 wherein the results analyzer module determines first results characteristics for the database query without the query conditions, wherein the results data structure includes storage for an association between the determined first results characteristics and the database query without the query conditions, wherein the results analyzer module determines second results characteristics for the database query with all of the query conditions, wherein the results data structure includes storage for an association between the determined second results characteristics and the database query with all of the query conditions.
-
35. The system of claim 31 wherein the query results analyzer module uses the results characteristic to determine whether a node blocks substantially all the records the node receives.
-
36. The system of claim 31 wherein the query results analyzer module uses the results characteristic to determine whether a node blocks all the records the node receives.
-
37. The system of claim 31 wherein the query results analyzer module uses the results characteristic to determine whether a node blocks substantially none of the records the node receives.
-
38. The system of claim 31 wherein the query results analyzer module uses the results characteristic to determine whether a node blocks none of the records the node receives.
-
39. The system of claim 31 wherein the query results analyzer module uses the results characteristic to determine whether the database query does not block any record.
-
40. The system of claim 31 wherein the query results analyzer module determines results characteristics for two subsequent query condition nodes, wherein the results data structure includes storage for both the two subsequent query condition nodes and their respective determined results characteristics, wherein the query results analyzer module uses the determined results characteristics to determine whether the same number of records are retrieved by the two subsequent query condition nodes.
-
41. The system of claim 31 wherein the query results analyzer module comprises:
-
means for using the results characteristic to determine whether a node blocks all the records the node receives;
means for using the results characteristic to determine whether a node blocks none of the records the node receives;
means for using the results characteristic to determine whether the database query does not block any record;
means for determining results characteristics for two query condition nodes connected in series; and
means for using the determined results characteristics to determine that the same number of records are retrieved by the two query condition nodes.
-
-
42. The system of claim 26 further comprising:
-
a subset database having a subset of records in the database;
wherein the query executor module includes a connection to the subset database in order to query the subset database based upon the identified query condition; and
wherein the query results analyzer module determines at least one results characteristic associated with the query of the subset database with the identified query condition, wherein the results characteristic is used to analyze the identified query condition.
-
-
43. The system of claim 26 wherein the database query includes structured query language (SQL) query clauses having sections for filtering records from the database, said sections being selected from the group consisting of Where sections, From sections, Having sections and combinations thereof.
-
44. The system of claim 26 further comprising:
-
parsing rules that the query parser module uses to identify query conditions and query sections, wherein the rules include identification rules that identify whether Where sections are contained within the database query, wherein the query executor module separately queries the database based upon each of the identified Where sections, wherein individual results characteristics for each of the identified Where sections are generated from the individual querying of the database, wherein the results characteristics are used to analyze the identified Where sections of the database query.
-
-
45. The system of claim 26 further comprising:
-
parsing rules that the query parser module uses to identify query conditions and query sections, wherein the rules include identification rules that identify whether From sections are contained within the database query;
wherein the query executor module separately queries the database based upon each of the identified From sections, wherein individual results characteristics for each of the identified From sections are generated from the individual querying of the database, wherein the results characteristics are used to analyze the identified From sections of the database query.
-
-
46. The system of claim 45 further comprising:
-
parsing rules that the query parser module uses to identify query conditions and query sections, wherein the rules include identification rules that identify whether Having sections are contained within the database query;
wherein the query executor module separately queries the database based upon each of the identified Having sections, wherein individual results characteristics for each of the identified Having sections are generated from the individual querying of the database, wherein the results characteristics are used to analyze the identified Having sections of the database query.
-
-
47. The system of claim 45 further comprising:
an execution optimizer module connected to the query executor module, wherein the execution optimizer module optimizes a query of a section based upon the results of a query from a preceding section.
-
48. The system of claim 26 wherein the database is used within a data mining application.
-
49. The system of claim 26 further comprising:
-
a graphical interface module connected to the query results analyzer module that displays the results characteristic with a visual graphical depiction of the query, wherein the visual graphical depiction includes nodes associated with the query conditions; and
a query language optimization module connected to the graphical interface module, wherein the query language optimization module identifies redundant filter nodes in the network and removes the redundant nodes.
-
-
50. The system of claim 49 further comprising:
a visual optimization module connected to the graphical interface module, wherein the visual optimization module merges nodes that contain similar query conditions.
-
51. The system of claim 26 further comprising:
-
a graphical interface module connected to the query results analyzer module that displays the results characteristic with a visual graphical depiction of the query, wherein the visual graphical depiction includes nodes associated with the query conditions; and
a visual optimization module connected to the graphical interface module, wherein the visual optimization module merges nodes that contain similar query conditions.
-
-
52. The system of claim 26 wherein the database query includes structured query language (SQL) query clauses having sections for filtering records from the database, wherein a section includes a join condition;
-
wherein the join condition is used in filtering records from the database;
wherein the results characteristic is determined based upon the filtering associated with the join condition.
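A join condition can be treated as a filter in the same way as a Where condition. The sketch below is an illustrative assumption rather than the claimed system: it counts how many rows of a hypothetical `customers` table have at least one match in a hypothetical `orders` table, and uses that pair of counts as the results characteristic for the join.

```python
import sqlite3

def join_filter_counts(conn, left, right, on):
    """Return (rows in `left`, rows in `left` that survive the join condition)."""
    received = conn.execute(f"SELECT COUNT(*) FROM {left}").fetchone()[0]
    # A left row passes the join filter if at least one right row matches it.
    passed = conn.execute(
        f"SELECT COUNT(*) FROM {left} "
        f"WHERE EXISTS (SELECT 1 FROM {right} WHERE {on})"
    ).fetchone()[0]
    return received, passed

def build_join_db():
    """Two small in-memory tables with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER)")
    conn.execute("CREATE TABLE orders (customer_id INTEGER)")
    conn.executemany("INSERT INTO customers VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO orders VALUES (?)", [(1,), (1,), (3,), (9,)])
    return conn

if __name__ == "__main__":
    conn = build_join_db()
    received, passed = join_filter_counts(
        conn, "customers", "orders", "orders.customer_id = customers.id"
    )
    print(f"join passes {passed} of {received} customer rows")
```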
-
-
53. A computer-implemented method for analyzing an SQL database query statement, said SQL database query statement containing a plurality of query conditions that are used to filter data records of a database, comprising the steps of:
-
identifying whether Where sections are contained within the SQL database query statement;
separately querying the database based upon each of the identified Where sections;
determining individual results characteristics for each of the identified Where sections resulting from the individual querying of the database, wherein the results characteristics are used to analyze the identified Where sections of the database query.
identifying whether From sections are contained within the SQL database query statement;
separately querying the database based upon each of the identified From sections;
determining individual results characteristics for each of the identified From sections resulting from the individual querying of the database, wherein the results characteristics are used to analyze the identified From sections of the SQL database query statement.
-
-
55. The method of claim 53 further comprising the steps of:
-
identifying whether Having sections are contained within the SQL database query statement;
separately querying the database based upon each of the identified Having sections;
determining individual results characteristics for each of the identified Having sections resulting from the individual querying of the database, wherein the results characteristics are used to analyze the identified Having sections of the SQL database query statement.
-
-
56. The method of claim 53, wherein the SQL database query statement includes structured query language (SQL) query clauses having sections for filtering records from the database, said sections being selected from the group consisting of Where sections, From sections, Having sections and combinations thereof, said method further comprising the step of:
optimizing a query of a section based upon the results of a query from a preceding section.
Specification