Extraction device, data processing system, and extraction method
First Claim
1. An extraction device for extracting, as a conversion object, a sub query to be converted to a program for processing stream data continuously inputted to a database, from a query including one or more instructions as sub queries to be issued to a database management system for managing the database, the extraction device comprising:
an input computing unit to receive an input query having one or more sub queries, a maximum memory increase value to indicate an amount of memory by which memory usage may increase when processing the stream data, and a lower limit value of efficiency to indicate a ratio of reduced processing time to increased memory usage, the reduced processing time indicating a difference between a first time to process the stream data using the program and a second time to process the stream data using the database management system, the increased memory usage indicating an amount of memory by which memory usage increases due to processing the stream data using the program compared to processing the stream data using the database management system;
an operation computing unit to calculate, for each sub query in the input query, at least one memory increase amount corresponding to a change in memory usage when the sub query is converted to the program and the program is used to process the stream data, a processing time reduction corresponding to a difference in a time to process the stream data using the program and a time to process the stream data using the database management system, and an efficiency by using the at least one calculated memory increase amount and the processing time reduction;
an extraction computing unit to select at least one sub query having a calculated efficiency equal to or higher than the lower limit value, integrate a memory increase amount calculated for the selected sub query, and extract the selected sub query as a conversion object on condition that the integrated memory increase amount is equal to or smaller than the maximum memory increase amount;
a graph generation computing unit to parse the input query and generate a graph expressed by a tree structure having a set of one or more nodes, wherein each sub query is a node in the set of one or more nodes; and
a classification computing unit to determine whether each node in the set of one or more nodes is a first type of node which executes pipeline processing by processing input data and outputting a processing result or a second type of node which does not execute pipeline processing based on a result of parsing the input query, classify nodes in the set of one or more nodes into one or more first node groups and one or more second node groups, wherein nodes in the first node groups are hierarchically connected from a root node and include only nodes from the set of one or more nodes which are the first type of node and nodes in the one or more second node groups include remaining nodes from the set of one or more nodes, and classify the one or more second node groups into one or more third node groups and one or more fourth node groups, wherein nodes in the one or more third node groups are hierarchically connected from a leaf node and include only nodes which are the first type of node and nodes in the one or more fourth node groups include remaining nodes of the one or more second node groups,
wherein the extraction computing unit selects first nodes corresponding to sub queries whose efficiencies are equal to or higher than the lower limit value from nodes classified into the first node groups and extracts the first nodes as conversion objects,
wherein the extraction computing unit, subsequent to selecting the first nodes and extracting the conversion objects, if the integrated memory increase amount has not reached the maximum memory increase amount, selects second nodes whose efficiencies are equal to or higher than the lower limit value from nodes classified into the third node groups and extracts the second nodes as conversion objects.
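The node classification recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the `Node` class, the `classify` function, and the operator names in the usage example are hypothetical, and the reading of "hierarchically connected from a leaf node" as "a pipeline node whose entire subtree consists of pipeline nodes" is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One sub query in the parsed query tree (hypothetical structure)."""
    name: str
    pipeline: bool  # True: first type (pipeline processing); False: second type
    children: List["Node"] = field(default_factory=list)


def classify(root: Node) -> Tuple[List[str], List[str], List[str]]:
    """Return (first, third, fourth) group members as lists of node names."""
    first: List[str] = []
    third: List[str] = []
    fourth: List[str] = []

    def walk_first(node: Node) -> List[Node]:
        # Descend from the root while nodes are pipeline nodes.
        # Returns the tops of the remaining (second-group) subtrees.
        if not node.pipeline:
            return [node]
        first.append(node.name)
        rest: List[Node] = []
        for child in node.children:
            rest.extend(walk_first(child))
        return rest

    def walk_rest(node: Node) -> bool:
        # Assumption: a node joins the third group only if it is a pipeline
        # node and every node below it is also a pipeline node.
        results = [walk_rest(child) for child in node.children]
        if node.pipeline and all(results):
            third.append(node.name)
            return True
        fourth.append(node.name)
        return False

    for top in walk_first(root):
        walk_rest(top)
    return first, third, fourth


# Usage with hypothetical operators: a blocking node splits each branch.
tree = Node("project", True, [
    Node("filter", True, [Node("sort", False, [Node("scan", True)])]),
    Node("aggregate", False, [Node("scan2", True)]),
])
first_group, third_group, fourth_group = classify(tree)
```

Under the claim, the extraction step would consider `first_group` candidates first and turn to `third_group` only while the integrated memory increase amount is still below the maximum; `fourth_group` nodes are not converted.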
Abstract
An extraction device extracts, from a query including instructions to be issued as sub queries to a database management system, a sub query to be converted to a program for processing stream data continuously inputted to a database. The extraction device includes: an input unit; an operation unit for calculating, for each sub query, the memory increase amount incurred when processing the stream data and the processing time to be reduced, and calculating the efficiency from these two values; and an extraction unit for selecting at least one sub query whose efficiency is equal to or higher than the lower limit value, integrating the memory increase amount calculated for the selected sub query, and, on condition that the integrated memory increase amount is equal to or smaller than the maximum memory increase amount, extracting the selected sub query as a conversion object.
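The selection rule in the abstract can be sketched as follows. This is an illustrative sketch only: the `SubQuery` class, the `extract` function, the units, and the numbers are hypothetical. Efficiency is taken as the ratio of reduced processing time to increased memory usage, as the first claim defines it. Visiting candidates in descending order of efficiency and skipping a candidate that would exceed the budget are assumptions; the source does not state an ordering.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SubQuery:
    name: str
    memory_increase: float  # extra memory when run as a stream program
    time_reduction: float   # DBMS processing time minus program processing time

    @property
    def efficiency(self) -> float:
        # Ratio of reduced processing time to increased memory usage.
        if self.memory_increase <= 0:
            return float("inf")  # assumption: no memory cost is always efficient
        return self.time_reduction / self.memory_increase


def extract(sub_queries: List[SubQuery],
            max_memory_increase: float,
            min_efficiency: float) -> Tuple[List[SubQuery], float]:
    """Pick conversion objects without exceeding the memory budget."""
    candidates = sorted(
        (q for q in sub_queries if q.efficiency >= min_efficiency),
        key=lambda q: q.efficiency,
        reverse=True,  # assumption: most efficient candidates first
    )
    selected: List[SubQuery] = []
    total = 0.0  # integrated memory increase amount
    for q in candidates:
        if total + q.memory_increase <= max_memory_increase:
            selected.append(q)
            total += q.memory_increase
    return selected, total


# Usage with made-up numbers.
queries = [
    SubQuery("a", memory_increase=10, time_reduction=50),   # efficiency 5.0
    SubQuery("b", memory_increase=20, time_reduction=20),   # efficiency 1.0
    SubQuery("c", memory_increase=40, time_reduction=120),  # efficiency 3.0
    SubQuery("d", memory_increase=5, time_reduction=1),     # efficiency 0.2
]
chosen, used = extract(queries, max_memory_increase=35, min_efficiency=1.0)
```

Here `d` is rejected by the lower limit value, and `c` is rejected because adding it would push the integrated memory increase past the maximum.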
24 Citations
4 Claims
1. An extraction device for extracting, as a conversion object, a sub query to be converted to a program for processing stream data continuously inputted to a database, from a query including one or more instructions as sub queries to be issued to a database management system for managing the database, the extraction device comprising:
an input computing unit to receive an input query having one or more sub queries, a maximum memory increase value to indicate an amount of memory by which memory usage may increase when processing the stream data, and a lower limit value of efficiency to indicate a ratio of reduced processing time to increased memory usage, the reduced processing time indicating a difference between a first time to process the stream data using the program and a second time to process the stream data using the database management system, the increased memory usage indicating an amount of memory by which memory usage increases due to processing the stream data using the program compared to processing the stream data using the database management system;
an operation computing unit to calculate, for each sub query in the input query, at least one memory increase amount corresponding to a change in memory usage when the sub query is converted to the program and the program is used to process the stream data, a processing time reduction corresponding to a difference in a time to process the stream data using the program and a time to process the stream data using the database management system, and an efficiency by using the at least one calculated memory increase amount and the processing time reduction;
an extraction computing unit to select at least one sub query having a calculated efficiency equal to or higher than the lower limit value, integrate a memory increase amount calculated for the selected sub query, and extract the selected sub query as a conversion object on condition that the integrated memory increase amount is equal to or smaller than the maximum memory increase amount;
a graph generation computing unit to parse the input query and generate a graph expressed by a tree structure having a set of one or more nodes, wherein each sub query is a node in the set of one or more nodes; and
a classification computing unit to determine whether each node in the set of one or more nodes is a first type of node which executes pipeline processing by processing input data and outputting a processing result or a second type of node which does not execute pipeline processing based on a result of parsing the input query, classify nodes in the set of one or more nodes into one or more first node groups and one or more second node groups, wherein nodes in the first node groups are hierarchically connected from a root node and include only nodes from the set of one or more nodes which are the first type of node and nodes in the one or more second node groups include remaining nodes from the set of one or more nodes, and classify the one or more second node groups into one or more third node groups and one or more fourth node groups, wherein nodes in the one or more third node groups are hierarchically connected from a leaf node and include only nodes which are the first type of node and nodes in the one or more fourth node groups include remaining nodes of the one or more second node groups,
wherein the extraction computing unit selects first nodes corresponding to sub queries whose efficiencies are equal to or higher than the lower limit value from nodes classified into the first node groups and extracts the first nodes as conversion objects,
wherein the extraction computing unit, subsequent to selecting the first nodes and extracting the conversion objects, if the integrated memory increase amount has not reached the maximum memory increase amount, selects second nodes whose efficiencies are equal to or higher than the lower limit value from nodes classified into the third node groups and extracts the second nodes as conversion objects.
Dependent claims: 2, 3
4. A data processing system to extract, as a conversion object, a sub query to be converted to a program for processing stream data continuously inputted to a database, from a query including one or more instructions as sub queries to be issued to a database management system for managing the database, the extraction device comprising:
an input computing unit to receive an input query having one or more sub queries, a maximum memory increase value to indicate an amount of memory by which memory usage may increase when processing the stream data, and a lower limit value of efficiency to indicate a ratio of reduced processing time to increased memory usage, the reduced processing time indicating a difference between a first time to process the stream data using the program and a second time to process the stream data using the database management system, the increased memory usage indicating an amount of memory by which memory usage increases due to processing the stream data using the program compared to processing the stream data using the database management system;
an operation computing unit to calculate, for each sub query in the input query, at least one memory increase amount corresponding to a change in memory usage when the sub query is converted to the program and the program is used to process the stream data, a processing time reduction corresponding to a difference in a time to process the stream data using the program and a time to process the stream data using the database management system, and an efficiency by using the at least one calculated memory increase amount and the processing time reduction;
an extraction computing unit to select at least one sub query having a calculated efficiency equal to or higher than the lower limit value, integrate a memory increase amount calculated for the selected sub query, and extract the selected sub query as a conversion object on condition that the integrated memory increase amount is equal to or smaller than the maximum memory increase amount;
a graph generation computing unit to parse the input query and generate a graph expressed by a tree structure having a set of one or more nodes, wherein each sub query is a node in the set of one or more nodes;
a classification computing unit to determine whether each node in the set of one or more nodes is a first type of node which executes pipeline processing by processing input data and outputting a processing result or a second type of node which does not execute pipeline processing based on a result of parsing the input query, classify nodes in the set of one or more nodes into one or more first node groups and one or more second node groups, wherein nodes in the first node groups are hierarchically connected from a root node and include only nodes from the set of one or more nodes which are the first type of node and nodes in the one or more second node groups include remaining nodes from the set of one or more nodes, and classify the one or more second node groups into one or more third node groups and one or more fourth node groups, wherein nodes in the one or more third node groups are hierarchically connected from a leaf node and include only nodes which are the first type of node and nodes in the one or more fourth node groups include remaining nodes of the one or more second node groups,
wherein the extraction computing unit selects first nodes corresponding to sub queries whose efficiencies are equal to or higher than the lower limit value from nodes classified into the first node groups and extracts the first nodes as conversion objects,
wherein the extraction computing unit, subsequent to selecting the first nodes and extracting the conversion objects, if the integrated memory increase amount has not reached the maximum memory increase amount, selects second nodes whose efficiencies are equal to or higher than the lower limit value from nodes classified into the third node groups and extracts the second nodes as conversion objects;
a conversion device to convert a sub query of the conversion objects extracted by the data processing system to a program for processing the stream data and generating remaining sub queries not to be converted;
a first processor for executing the converted program, processing the stream data, and outputting a processing result; and
a second processor for executing the remaining queries and processing the processing result and the stream data stored in the database, the second processor including a database management system for managing a database.
Specification