Scalable tuning engine
Abstract
A computer implemented method for processing data is provided. At least one dataflow comprising transformational and numerical steps is defined. The flow is decomposed into distinct executable segments along process domains. The flow is decomposed into distinct executable segments along data domains. Parallel execution paths are identified across the executable segments. The executable segments are executed across a plurality of execution units.
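The two decompositions named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layout, the two step functions (`total_units`, `revenue`), and the use of a thread pool as the "plurality of execution units" are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical point-of-sale records: (demand_group, price, units).
RECORDS = [
    ("cola", 1.00, 10),
    ("cola", 1.20, 6),
    ("chips", 2.00, 4),
    ("chips", 1.50, 8),
]

def split_by_demand_group(records):
    """Data-domain decomposition: one partition per demand group."""
    parts = defaultdict(list)
    for group, price, units in records:
        parts[group].append((price, units))
    return dict(parts)

def total_units(rows):
    return sum(units for _, units in rows)

def revenue(rows):
    return sum(price * units for price, units in rows)

# Process-domain decomposition: steps that accept the same input and do
# not depend on each other, so they may run side by side.
STEPS = {"total_units": total_units, "revenue": revenue}

def run(records, max_workers=4):
    """Run every (demand group, step) pair as its own executable segment."""
    parts = split_by_demand_group(records)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            (group, name): pool.submit(step, rows)
            for group, rows in parts.items()
            for name, step in STEPS.items()
        }
        return {key: future.result() for key, future in futures.items()}

if __name__ == "__main__":
    for key, value in sorted(run(RECORDS).items()):
        print(key, value)
```

Each (demand group, step) pair is an independent unit of work here, so the same structure could in principle be handed to separate machines instead of threads.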
23 Claims
1. A computer implemented method for processing sales data on a network of a plurality of computers, comprising:
receiving data including point of sale data, cost data and product data;
partitioning the received data into data portions each residing in a different section of a data source;
defining a dataflow comprising transformational and numerical steps on a first computer of the plurality of computers;
decomposing, using the first computer, the dataflow along process domains on the first computer by decomposing the dataflow into one or more from a group of distinct executable segments for parallel execution accepting inputs that are the same and distinct executable segments for parallel execution accepting inputs lacking dependencies between each other, wherein the distinct executable segments along process domains include more than one econometric operation and further wherein one process domain includes a modeling segment for generation of a demand model;
decomposing, using the first computer, the dataflow along data domains on the first computer by decomposing the dataflow into distinct executable segments for parallel execution based on dependencies between records within the data indicated by identifying operators and dividing the distinct executable segments along the data domains by demand groups, and wherein demand groups are groupings of highly substitutable products;
executing the distinct executable segments in parallel on a second computer of the plurality of computers and a third computer of the plurality of computers, wherein executing the distinct executable segments comprises:
interpreting a script with programming language statements and generating from an interpretation of the script an executable graph of the dataflow comprising the transformational and numerical steps to process the received data;
executing the executable graph via a graph execution engine that distributes the distinct executable segments among the second computer and the third computer;
reading, in a non-sequential manner during a single reading operation, the data portions from the different sections of the data source to corresponding ones of a plurality of data buffers in parallel to increase a speed of data access, wherein each data buffer corresponds to a distinct executable segment;
monitoring an amount of data in each of the data buffers to determine when a data buffer becomes filled for processing by the corresponding distinct executable segment, wherein at least two of the data buffers become filled and are accessed by the corresponding distinct executable segments at different times;
executing the distinct executable segments on the second computer and the third computer in parallel, wherein each distinct executable segment is responsive to the monitoring and retrieves and processes data from a corresponding data buffer independent of other distinct executable segments when that corresponding data buffer is filled; and
receiving the processed data from each of the distinct executable segments in parallel.
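The buffer-related limitations of claim 1 (one buffer per segment, monitoring of fill level, and each segment starting independently once its own buffer is filled) can be sketched as follows. This is an illustrative simplification, not the claimed method: the `Buffer` class, the `segment` function that merely sums its data, and the single interleaving reader thread standing in for parallel non-sequential reads are all assumptions.

```python
import threading

class Buffer:
    """Buffer of known capacity that signals when it has been filled."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.filled = threading.Event()
        if capacity == 0:          # nothing to wait for
            self.filled.set()

    def put(self, item):
        self.items.append(item)
        # Monitoring step: compare the amount of data with the capacity.
        if len(self.items) >= self.capacity:
            self.filled.set()

def segment(buffer, results, key):
    """Hypothetical executable segment: waits for its own buffer only."""
    buffer.filled.wait()
    results[key] = sum(buffer.items)

def run(sections):
    """sections maps a segment key to the rows stored in its data section."""
    buffers = {key: Buffer(len(rows)) for key, rows in sections.items()}
    results = {}
    workers = [
        threading.Thread(target=segment, args=(buffers[key], results, key))
        for key in sections
    ]
    for worker in workers:
        worker.start()

    # Interleaved (non-sequential) read: take one row from each section in
    # turn, so shorter sections fill their buffers earlier than longer ones.
    iterators = {key: iter(rows) for key, rows in sections.items()}
    while iterators:
        for key in list(iterators):
            try:
                buffers[key].put(next(iterators[key]))
            except StopIteration:
                del iterators[key]

    for worker in workers:
        worker.join()
    return results

if __name__ == "__main__":
    print(run({"group_a": [1, 2, 3], "group_b": [10]}))
```

Because `group_b` holds a single row, its buffer is marked filled on the first pass of the reader and its segment can finish while `group_a` is still being read, which mirrors the "filled ... at different times" language of the claim.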
2. A computer implemented method for processing data, useful in association with a profit optimization system, the data processing method comprising:
receiving data including point of sale data, cost data and product data;
partitioning the received data into data portions each residing in a different section of a data source;
defining a dataflow comprising transformational and numerical steps;
decomposing, using a computer, the dataflow along process domains by decomposing the dataflow into one or more from a group of distinct executable segments for parallel execution accepting inputs that are the same and distinct executable segments for parallel execution accepting inputs lacking dependencies between each other, wherein the distinct executable segments along process domains include more than one econometric operation, and further wherein one process domain includes a modeling segment for generation of a demand model;
decomposing, using the computer, the dataflow along data domains by decomposing the dataflow into distinct executable segments for parallel execution based on dependencies between records within the data indicated by identifying operators and dividing the distinct executable segments along the data domains by demand groups, and wherein demand groups are groupings of highly substitutable products;
executing the distinct executable segments across a plurality of execution units in parallel, wherein executing the distinct executable segments comprises:
interpreting a script with programming language statements and generating from an interpretation of the script an executable graph of the dataflow comprising the transformational and numerical steps to process the received data;
executing the executable graph via a graph execution engine that distributes the distinct executable segments among the plurality of execution units;
reading, in a non-sequential manner during a single reading operation, the data portions from the different sections of the data source to corresponding ones of a plurality of data buffers in parallel to increase a speed of data access, wherein each data buffer corresponds to a distinct executable segment;
monitoring an amount of data in each of the data buffers to determine when a data buffer becomes filled for processing by the corresponding distinct executable segment, wherein at least two of the data buffers become filled and are accessed by the corresponding distinct executable segments at different times;
executing the distinct executable segments on the plurality of execution units in parallel, wherein each distinct executable segment is responsive to the monitoring and retrieves and processes data from a corresponding data buffer independent of other distinct executable segments when that corresponding data buffer is filled; and
receiving the processed data from each of the distinct executable segments in parallel.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
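The "interpreting a script ... generating ... an executable graph" and "graph execution engine" limitations can be pictured with a small Python sketch. The one-line-per-step script syntax, the `OPS` table, and the helper names `parse` and `execute` are hypothetical and chosen only for illustration; the claim does not specify a script language. The scheduling uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+), and a thread pool stands in for the plurality of execution units.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical operations a script line may name.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

LINE = re.compile(r"^\s*(\w+)\s*=\s*(\w+)\((.*)\)\s*$")

def parse(script):
    """Interpret the script into a graph: {node: (op name, [argument names])}."""
    graph = {}
    for line in script.strip().splitlines():
        match = LINE.match(line)
        if not match:
            raise ValueError(f"cannot parse: {line!r}")
        target, op, args = match.groups()
        graph[target] = (op, [a.strip() for a in args.split(",") if a.strip()])
    return graph

def execute(graph, inputs, max_workers=4):
    """Run the graph; nodes whose dependencies are done run in parallel."""
    values = dict(inputs)
    deps = {
        node: {arg for arg in args if arg in graph}
        for node, (_, args) in graph.items()
    }
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorter.get_ready()
            futures = {
                node: pool.submit(
                    OPS[graph[node][0]], *(values[a] for a in graph[node][1])
                )
                for node in ready
            }
            for node, future in futures.items():
                values[node] = future.result()
                sorter.done(node)
    return values

SCRIPT = """
revenue = mul(units, price)
cost = mul(units, unit_cost)
profit = sub(revenue, cost)
"""

if __name__ == "__main__":
    result = execute(parse(SCRIPT), {"units": 10, "price": 3, "unit_cost": 2})
    print(result["profit"])
```

In this example `revenue` and `cost` have no dependency on each other and are submitted together, while `profit` is scheduled only after both have completed.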
17. An apparatus for processing data, useful in association with a profit optimization system, the data processing apparatus comprising:
a first computer of a plurality of computers, the first computer configured to:
receive data including point of sale data, cost data and product data;
partition the received data into data portions each residing in a different section of a data source;
define a dataflow comprising transformational and numerical steps;
decompose the dataflow along process domains by decomposing the dataflow into one or more from a group of distinct executable segments for parallel execution accepting inputs that are the same and distinct executable segments for parallel execution accepting inputs lacking dependencies between each other, wherein the distinct executable segments along process domains include more than one econometric operation, and further wherein one process domain includes a modeling segment for generation of a demand model;
decompose the dataflow along data domains by decomposing the dataflow into distinct executable segments for parallel execution based on dependencies between records within the data indicated by identifying operators and dividing the distinct executable segments along the data domains by demand groups, and wherein demand groups are groupings of highly substitutable products;
a second computer of the plurality of computers and a third computer of the plurality of computers, wherein the second computer and third computer are configured to execute the distinct executable segments in parallel, wherein the plurality of computers are configured to:
interpret a script with programming language statements and generate from an interpretation of the script an executable graph of the dataflow comprising the transformational and numerical steps to process the received data;
execute the executable graph via a graph execution engine that distributes the distinct executable segments among the second computer and the third computer;
read, in a non-sequential manner during a single reading operation, the data portions from the different sections of the data source to corresponding ones of a plurality of data buffers in parallel to increase a speed of data access, wherein each data buffer corresponds to a distinct executable segment;
monitor an amount of data in each of the data buffers to determine when a data buffer becomes filled for processing by the corresponding distinct executable segment, wherein at least two of the data buffers become filled and are accessed by the corresponding distinct executable segments at different times;
execute the distinct executable segments on the second computer and the third computer in parallel, wherein each distinct executable segment is responsive to the monitoring and retrieves and processes data from a corresponding data buffer independent of other distinct executable segments when that corresponding data buffer is filled; and
receive the processed data from each of the distinct executable segments in parallel.
Dependent claims: 18, 19, 20, 21, 22, 23
Specification