
Casual modeling of multi-dimensional hierarchical metric cubes

  • US 10,360,527 B2
  • Filed: 11/10/2010
  • Issued: 07/23/2019
  • Est. Priority Date: 11/10/2010
  • Status: Active Grant
First Claim

1. A computer-implemented method for constructing and using a causal graphical analysis tool, the method comprising:

  • storing, in a database accessible by a computer system, multidimensional data representing metrics related to an organization, the metrics being used by the computer system to perform prediction analysis on events associated with the organization;

    constructing, by a processor of the computer system, from said multidimensional data, a causal graphical model representing a multidimensional hierarchical data structure, the multidimensional hierarchical data structure including one or more dimensions, each dimension having a plurality of nodes hierarchically arranged into a number of levels, each node having associated metric data, and including edges between nodes, each edge representing a directed relationship between metrics of connected nodes;

    said causal graphical model constructing comprising:

    acquiring, by the processor, first data corresponding to a first frontier representing a cut at a dimension level of the multi-dimensional hierarchical data structure, said first data being aggregated and segmented along each of the multiple dimensions;

    performing, by the processor, modeling on the first data to obtain a first model and a corresponding first statistic, the performing the modeling including:

    receiving, by the processor, input historical time series data {X_t}_{t=1, . . . , M}, where each X_t is a p-dimensional vector, M and p are integer numbers;

    receiving, via a user interface, specification of one or more metrics constraints, one or more dimensional constraints or both metrics constraints and dimensional constraints to control construction of said causal graphical network;

    setting a graph G to (V, E), where the V is a set of p features, and the E is a set of edges between the features;

    for each feature y, which belongs to the set V, running a regression on every y in terms of past lagged variables, X_{t−d}, . . . , X_{t−1}, for all features x, which belongs to the set V;

    for each feature x, which belongs to the set V, placing an edge directed from the x to the y into set E if x is selected as a group by the regression; and

    iterating through said multi-dimensional hierarchical data structure to further expand a frontier dimension of the first frontier to obtain a new frontier level, wherein at each iteration:

    gathering, by the processor, further data corresponding to the expanded new frontier level, said further data being aggregated and segmented along each of the multiple dimensions;

    applying, by the processor, the data modeling on the further data to obtain a further model and a corresponding further statistic;

    comparing, by the processor, the first statistic of the first model and the further statistic of the further model;

    setting, by the processor, the further model to be the causal graphical model and setting the new frontier level to be the first frontier in response to determining that the further statistic improves the first model statistic;

    repeating, by the processor, the iterating and new frontier level expanding until there are no new frontier dimensions to further expand;

    outputting, by the processor, the causal graphical model as structured data providing an expanded frontier level of statistically significant metric relationships and learned impacts between metric measures for conducting a causal analysis;

    predicting, by the processor, future values of metrics of the causal graphical model based on an inference using the first data and the metrics of the causal graphical model;

    identifying, by the processor, a main metric that includes predicted future values that deviates away from a mean of the main metric with respect to time;

    determining, by the processor, a causal relationship between each metric of the causal graphical model and the main metric based on a comparison of each metric of the causal graphical model with the main metric, wherein each causal relationship indicates an effect of the main metric on a deviation distance between the metric of the causal graphical model with a respective desired value;

    identifying, by the processor, a set of candidate metrics from the metrics of the causal graphical model, wherein each candidate metrics corresponds to a deviation distance below a threshold;

    determining, by the processor, causal relations and associated measures of strengths at the expanded frontier level in the outputted structured data by calculating a causal strength, per each candidate relation between each candidate metric and the main metric, or a pair of metrics in the cut in the frontier, as a weighted sum of causal strengths of causal relations whose dimension nodes are equal or descendents of the each candidate relation, wherein each weighted sum is a result of an application of a weight on a candidate metric, and each weight is determined by a ratio between a value of a target metric at the first frontier and an aggregated value of target metrics at the first frontier;

    receiving, by the processor and via the user interface, a request for a prediction analysis on a first candidate relation associated with a first candidate metric and a second candidate relation associated with a second candidate metric;

    aggregating, by the processor, a first causal strength of the first relation and a second causal strength of the second relation, wherein the first causal strength is a first weighted sum of the first candidate metric, the second causal strength is a second weighted sum of the second candidate metric, and aggregating the first causal strength and the second causal strength instead of the first candidate metric and the second candidate metric provides an indication of a predicted effect on an event caused by activities associated with the first candidate relation and the second candidate relation instead of effects on the event caused by the first candidate metric and the second candidate metric; and

    outputting, by the processor, the aggregated causal strengths on the user interface.
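
The graph-construction steps of claim 1 (regressing each feature y on the lagged values X_{t−d}, . . . , X_{t−1} of all features, then placing an edge from x to y when x is selected as a group) can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not name a particular regression, so ordinary least squares followed by a threshold on the norm of each feature's lag coefficients stands in here for a group-selecting regression such as group lasso. The function names `lagged_design` and `granger_edges`, the `threshold` parameter, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def lagged_design(X, d):
    """Return (Z, Y): Z stacks lags 1..d of every feature, Y holds the targets X[t]."""
    M, _ = X.shape
    # Lag l contributes rows X[d-l : M-l]; feature j at lag l lands in column (l-1)*p + j.
    Z = np.hstack([X[d - l : M - l] for l in range(1, d + 1)])
    return Z, X[d:]


def granger_edges(X, d=2, threshold=0.3):
    """Hypothetical helper: return the edge set E of the graph G = (V, E).

    An edge (x, y) is added when the d lag coefficients of feature x in the
    regression for feature y are jointly large. Thresholding the group norm
    of least-squares coefficients is a simple stand-in for group lasso.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)  # center so that no intercept term is needed
    p = X.shape[1]
    Z, Y = lagged_design(X, d)
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # shape (d * p, p)
    edges = set()
    for y in range(p):
        for x in range(p):
            group = coef[x::p, y]  # coefficients of x at lags 1..d
            if np.linalg.norm(group) > threshold:
                edges.add((x, y))
    return edges


if __name__ == "__main__":
    # Synthetic series in which feature 0 drives feature 1 with a one-step lag.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 3))
    for t in range(1, len(X)):
        X[t, 1] = 0.8 * X[t - 1, 0] + 0.5 * X[t, 1]
    print(granger_edges(X))
```

Treating the d lag coefficients of a feature as one group mirrors the claim's wording that x is "selected as a group": a feature either contributes an edge with all of its lags or contributes none.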

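The weighting rule recited near the end of claim 1 computes the causal strength of a candidate relation as a weighted sum over relations whose dimension nodes are equal to or descendants of that relation, with each weight given by a ratio of target-metric values. One possible reading, sketched below in Python, takes the weight of a node to be its target-metric value divided by the sum of target-metric values over the nodes being combined. The function name `aggregate_causal_strength`, the dictionary layout, and the example node names and numbers are hypothetical.

```python
def aggregate_causal_strength(strengths, target_values):
    """Weighted sum of per-node causal strengths (one reading of the claim).

    strengths: causal strength of the relation at each node (the node itself
        or one of its descendants), keyed by node name.
    target_values: value of the target metric at each of those nodes.
    """
    total = sum(target_values[node] for node in strengths)
    if total == 0:
        raise ValueError("aggregated target metric is zero; weights are undefined")
    # Each weight is the node's share of the aggregated target metric.
    return sum(strengths[node] * target_values[node] / total for node in strengths)


if __name__ == "__main__":
    # Hypothetical example: two child regions of one geography node.
    strengths = {"US": 0.5, "EU": 0.25}
    target_values = {"US": 300.0, "EU": 100.0}
    print(aggregate_causal_strength(strengths, target_values))
```

Under this reading the weights sum to one, so the aggregated strength always lies between the smallest and largest per-node strengths.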