Optimized storage solution for real-time queries and data modeling
Abstract
Embodiments presented herein provide techniques for managing data in manufacturing systems. One embodiment includes receiving a set of data from a plurality of devices operating in a manufacturing environment. A portion of the set of data is written by a data management application to both a relational database and a distributed storage cluster that includes a plurality of storage nodes in a distributed computing environment. Upon receiving a query to extract a subset of data from the set of data, the query is analyzed to determine attributes of the query. Based, in part, on the analysis, one of the relational database and the distributed storage cluster is selected for processing the query.
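By way of illustration only, the dual write and the selection of a storage system may be sketched as follows. The sketch is not the disclosed implementation: the `Store` and `DataManager` classes, the `MAPPING_RULE` table and the application type names `real_time` and `modeling` are hypothetical, and both storage systems are replaced by in-memory lists.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


@dataclass
class Store:
    """Hypothetical in-memory stand-in for one storage system."""
    name: str
    rows: List[Row] = field(default_factory=list)

    def write(self, row: Row) -> None:
        # Store a copy so each system holds its own record.
        self.rows.append(dict(row))

    def run(self, predicate: Callable[[Row], bool]) -> List[Row]:
        return [r for r in self.rows if predicate(r)]


# Assumed mapping rule: application type -> storage system.
# Real-time applications map to the relational database.
MAPPING_RULE = {"real_time": "relational", "modeling": "cluster"}


class DataManager:
    """Hypothetical data management application."""

    def __init__(self) -> None:
        self.stores = {
            "relational": Store("relational"),
            "cluster": Store("cluster"),
        }

    def ingest(self, rows: List[Row]) -> None:
        # Separately write each record to both storage systems.
        for row in rows:
            for store in self.stores.values():
                store.write(row)

    def select_store(self, app_type: str) -> Store:
        # Select the storage system from the application type alone.
        return self.stores[MAPPING_RULE[app_type]]

    def query(
        self, app_type: str, predicate: Callable[[Row], bool]
    ) -> Tuple[str, List[Row]]:
        store = self.select_store(app_type)
        return store.name, store.run(predicate)
```

In this sketch a query from an application of type `real_time` is always submitted to the relational stand-in, and any other mapped type is submitted to the cluster stand-in; an unmapped type raises `KeyError`, since the handling of unmapped applications is not addressed here.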
9 Citations
18 Claims
1. A method, comprising:
- receiving a set of data from a plurality of devices operating in a manufacturing environment;
- separately writing a first portion of the set of data to both a relational database and a distributed storage cluster, the distributed storage cluster comprising a plurality of storage nodes in a distributed computing environment;
- upon receiving a query to be processed from the set of data:
  - analyzing the query to determine an application from which the query was received;
  - selecting one of the relational database and the distributed storage cluster for processing the query, based on a mapping rule that defines a predefined relationship between a type of the application from which the query was received and the selected one of the relational database and the distributed storage cluster, wherein the mapping rule specifies that queries from applications related to real-time operations are to be processed by the relational database; and
  - submitting the query to the selected one of the relational database and the distributed storage cluster for execution;
- purging the first portion of the set of data from the relational database upon the stored first portion of the set of data in the relational database reaching a first age; and
- purging the first portion of the set of data from the distributed storage cluster upon the stored first portion of the set of data in the distributed storage cluster reaching a second age, wherein the first age and the second age are different, and wherein the first age is lower than the second age.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
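The two purging steps recited in claim 1 may be illustrated, under stated assumptions, by the following sketch. The retention values of 30 days and 365 days, the `written_at` field and the `purge` function are hypothetical; the claim requires only that the first age be lower than the second age.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

# Assumed retention ages (not taken from the claims).
RELATIONAL_MAX_AGE = timedelta(days=30)   # first age
CLUSTER_MAX_AGE = timedelta(days=365)     # second age

# The only relationship the claim imposes on the two ages.
assert RELATIONAL_MAX_AGE < CLUSTER_MAX_AGE


def purge(
    rows: List[Dict[str, Any]], max_age: timedelta, now: datetime
) -> List[Dict[str, Any]]:
    """Return the rows that remain after purging.

    A row is purged once its age (now - written_at) reaches max_age,
    so only rows strictly younger than max_age are kept.
    """
    return [r for r in rows if now - r["written_at"] < max_age]
```

Applying `purge` with `RELATIONAL_MAX_AGE` to the relational copy and with `CLUSTER_MAX_AGE` to the cluster copy leaves recent data in both systems and older data only in the cluster, until the second age is reached.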
11. A non-transitory computer-readable medium containing computer program code that, when executed, performs an operation comprising:
- receiving a set of data from a plurality of devices operating in a manufacturing environment;
- separately writing a first portion of the set of data to both a relational database and a distributed storage cluster, the distributed storage cluster comprising a plurality of storage nodes in a distributed computing environment;
- upon receiving a query to be processed from the set of data:
  - analyzing the query to determine an application from which the query was received;
  - selecting one of the relational database and the distributed storage cluster for processing the query, based on a mapping rule that defines a predefined relationship between a type of the application from which the query was received and the selected one of the relational database and the distributed storage cluster, wherein the mapping rule specifies that queries from applications related to real-time operations are to be processed by the relational database; and
  - submitting the query to the selected one of the relational database and the distributed storage cluster for execution;
- purging the first portion of the set of data from the relational database upon the stored first portion of the set of data in the relational database reaching a first age; and
- purging the first portion of the set of data from the distributed storage cluster upon the stored first portion of the set of data in the distributed storage cluster reaching a second age, wherein the first age and the second age are different, and wherein the first age is lower than the second age.
Dependent claims: 12, 13, 14, 15.
16. A manufacturing system comprising:
- a plurality of tools for manufacturing one or more semi-conductor devices;
- a first storage system comprising a relational database;
- a second storage system comprising a distributed storage cluster, the distributed storage cluster comprising a plurality of storage nodes in a distributed computing environment;
- at least one processor; and
- a memory containing a program that, when executed by the at least one processor, performs an operation comprising:
  - receiving a set of data from the plurality of tools;
  - separately writing a first portion of the set of data to both the relational database and the distributed storage cluster;
  - upon receiving a query to be processed from the set of data:
    - analyzing the query to determine an application from which the query was received;
    - selecting one of the relational database and the distributed storage cluster for processing the query, based on a mapping rule that defines a predefined relationship between a type of the application from which the query was received and the selected one of the relational database and the distributed storage cluster, wherein the mapping rule specifies that queries from applications related to real-time operations are to be processed by the relational database; and
    - submitting the query to the selected one of the relational database and the distributed storage cluster for execution;
  - purging the first portion of the set of data from the relational database upon the stored first portion of the set of data in the relational database reaching a first age; and
  - purging the first portion of the set of data from the distributed storage cluster upon the stored first portion of the set of data in the distributed storage cluster reaching a second age, wherein the first age and the second age are different, and wherein the first age is lower than the second age.
Dependent claims: 17, 18.
Specification