Visualization suggestion application programming interface

US 9,830,370 B2
Filed: 09/18/2014
Issued: 11/28/2017
Est. Priority Date: 09/18/2014
Status: Active Grant

Abstract

A dataset and some user-selected columns of the dataset are received by a statistical analysis module for analysis. The statistical analysis module generates a score for each unselected column of the dataset based on statistical analysis of the unselected columns and all or a subset of the selected columns. A ranking of the unselected columns is presented to the user for selection of one additional column of the dataset, after which the remaining unselected columns are re-ranked according to their associated scores and once again displayed to the user. The user may continue selecting from among the ranked columns until a threshold number of columns has been selected, at which point the user may deselect a selected column in order to continue selecting additional columns. A visualization suggestion application program interface then matches the selected columns with compatible visualization configurations and presents some of these visualizations to the user.
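
The select-and-re-rank loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `rank_unselected`, `select_columns`, the `score_fn` callback (any dependency score), and the `choose` callback (standing in for the user's pick from the displayed ranking) are hypothetical names introduced here for illustration only.

```python
from typing import Callable, List, Sequence

# Hypothetical signature: score one unselected column against the current selection.
ScoreFn = Callable[[str, Sequence[str]], float]


def rank_unselected(columns: Sequence[str],
                    selected: Sequence[str],
                    score_fn: ScoreFn) -> List[str]:
    """Return the unselected columns ordered from highest to lowest score."""
    unselected = [c for c in columns if c not in selected]
    scores = {c: score_fn(c, selected) for c in unselected}
    return sorted(unselected, key=lambda c: scores[c], reverse=True)


def select_columns(columns: Sequence[str],
                   initial: Sequence[str],
                   score_fn: ScoreFn,
                   choose: Callable[[List[str]], str],
                   threshold: int) -> List[str]:
    """Re-rank and ask for one more column until `threshold` columns are selected."""
    selected = list(initial)
    while len(selected) < threshold:
        ranking = rank_unselected(columns, selected, score_fn)
        if not ranking:  # nothing left to offer
            break
        selected.append(choose(ranking))  # assumed: the user picks one ranked column
    return selected
```

Because the ranking is recomputed on every pass, a score function that depends on the current selection changes the order shown after each pick, which is the re-ranking behaviour the abstract describes.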

17 Claims

1. A method comprising:
- accessing a dataset and a user selection of at least one column of the dataset by at least one hardware processor, the dataset including at least one online analytical processing (OLAP) cube with each column of the cube classified as a measure or classified as a dimension;
  
  determining, by the at least one hardware processor, which type of analysis to perform on each of the unselected columns of the dataset based on:
  
  the classification of the at least one column selected by a user; and
  
  the unselected column being classified as a dimension and a cardinality of the dimension satisfying specified criteria;
  
  analyzing the dataset, by the at least one hardware processor, to generate a score for each unselected column of the dataset based on a degree of dependency between each of the unselected columns and the at least one selected column;
  
  iteratively displaying a ranking of the unselected columns according to the scores, and accessing a user selection of one more column by the at least one hardware processor until a threshold number of columns has been selected;
  
  accessing the selected columns of the dataset by the at least one hardware processor; and
  
  selecting, by the at least one hardware processor, a specified number of visualization configurations compatible with the selected columns from a set of visualization configurations and providing the compatible visualization configurations to a user, wherein the selecting is based at least in part on a quantity of the selected columns and based at least in part on at least one constraint of a visualization configuration of the specified number of visualization configurations.
  - 2. The method of claim 1, further comprising:
    - accessing user input by the at least one hardware processor, the input including a deselection of one of the selected columns based on the threshold number of columns being selected.
  - 3. The method of claim 1, further comprising:
    - aggregating over unselected dimensions of the cube based on the unselected dimensions having cardinality less than 10;
      
      analyzing, by the at least one hardware processor, the dataset by performing an analysis of variance (ANOVA) test on the unselected columns of the dataset and on aggregated data; and
      
      generating a score for each unselected column of the dataset based on an effect size of the ANOVA test.
  - 4. The method of claim 1, further comprising:
    - aggregating over unselected dimensions of the cube based on the unselected dimensions having cardinality of at least 20;
      
      analyzing, by the at least one hardware processor, the dataset by performing a correlation coefficient test on the unselected columns of the dataset and on aggregated data; and
      
      generating a score for each unselected column of the dataset based on a p-value of the correlation coefficient test.
  - 5. The method of claim 1, further comprising:
    - determining, by the at least one hardware processor, that multiple types of analysis be performed on an unselected column of the dataset;
      
      generating a score for said unselected column based on an average of multiple scores generated for said unselected column by the multiple types of analysis; and
      
      generating a null score for an unselected column based on the unselected column being classified as a dimension and the cardinality of the dimension failing to satisfy the specified criteria.
  - 6. The method of claim 1, wherein a visualization configuration specifies how a set of columns are arranged and represented in a chart and includes constraints regarding the columns, the method further comprising:
    - determining, by the at least one hardware processor, that a visualization configuration is compatible with the selected columns based on:
      
      a number of selected columns being equal to a number of columns in the visualization;
      
      a number of selected columns classified as dimensions being equal to a number of columns classified as dimensions in the visualization;
      
      a number of selected columns classified as measures being equal to a number of columns classified as measures in the visualization; and
      
      the selected columns satisfying constraints of the visualization regarding columns.
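
The scoring step of claim 3 can be illustrated with a short sketch. The claim only says the score is based on "an effect size of the ANOVA test"; using eta squared (between-group sum of squares divided by total sum of squares) as that effect size is an assumption made here, as are the function names and the choice to return `None` as the null score when the cardinality criterion is not met.

```python
from collections import defaultdict
from typing import Hashable, Optional, Sequence


def eta_squared(groups: Sequence[Hashable], values: Sequence[float]) -> float:
    """One-way ANOVA effect size: SS_between / SS_total (assumed effect-size measure)."""
    if len(groups) != len(values) or not values:
        raise ValueError("groups and values must be non-empty and of equal length")
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:  # measure is constant, so no dependency can be detected
        return 0.0
    ss_between = sum(len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
                     for vs in by_group.values())
    return ss_between / ss_total


def score_low_cardinality_dimension(groups: Sequence[Hashable],
                                    values: Sequence[float],
                                    max_cardinality: int = 10) -> Optional[float]:
    """Score a dimension against a measure only if it has fewer than `max_cardinality` members."""
    if len(set(groups)) >= max_cardinality:
        return None  # assumed representation of a null score
    return eta_squared(groups, values)
```

A score near 1 means the group means explain almost all of the variation in the measure; a score near 0 means the dimension tells the user little about it.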

7. A system comprising:
- at least one processor;
  
  a memory coupled to the at least one processor, the memory including instructions which, when executed by the at least one processor, cause the system to perform operations comprising:
  
  accessing a dataset and a user selection of at least one column of the dataset, the dataset including at least one online analytical processing (OLAP) cube with each column of the cube classified as a measure or classified as a dimension;
  
  determining, by the at least one processor, which type of analysis to perform on each of the unselected columns of the dataset based on:
  
  the classification of the at least one column selected by a user; and
  
  the unselected column being classified as a dimension and a cardinality of the dimension satisfying specified criteria;
  
  analyzing the dataset, by the at least one processor, to generate a score for each unselected column of the dataset based on a degree of dependency between each of the unselected columns and the at least one selected column;
  
  iteratively displaying a ranking of the unselected columns according to the scores, and accessing a user selection of one more column by the at least one processor until a threshold number of columns has been selected;
  
  accessing the selected columns of the dataset;
  
  selecting a specified number of visualization configurations compatible with the selected columns from a set of visualizations, wherein the selecting is based at least in part on a quantity of the selected columns and based at least in part on at least one constraint of a visualization configuration of the specified number of visualization configurations; and
  
  providing the compatible visualization configurations to a user.
  - 8. The system of claim 7, the operations further comprising accessing, by the at least one processor, user input including a deselection of one of the selected columns based on the threshold number of columns being selected.
  - 9. The system of claim 7, the operations further comprising:
    - aggregating over unselected dimensions of the cube based on the unselected dimension having cardinality less than 10;
      
      analyzing the dataset, by the at least one processor, by performing an analysis of variance (ANOVA) test on the unselected columns of the dataset and on aggregated data; and
      
      generating a score for each unselected column of the dataset based on an effect size of the ANOVA test.
  - 10. The system of claim 7, the operations further comprising:
    - aggregating over unselected dimensions of the cube based on the unselected dimension having cardinality of at least 20;
      
      analyzing the dataset, by the at least one processor, by performing a correlation coefficient test on the unselected columns of the dataset and on aggregated data; and
      
      generating a score for each unselected column of the dataset based on a p-value of the correlation coefficient test.
  - 11. The system of claim 7, the operations further comprising:
    - determining that multiple types of analysis be performed on an unselected column of the dataset;
      
      generating a score for said unselected column based on an average of multiple scores generated for said unselected column by the multiple types of analysis; and
      
      generating a null score for an unselected column based on the at least one column selected by the user being classified as a dimension and the cardinality of the dimension failing to satisfy the specified criteria.
  - 12. The system of claim 7, wherein:
    - a visualization configuration specifies how a set of columns are arranged and represented in a chart and includes constraints regarding the columns; and
      
      the at least one processor is used to determine that a visualization configuration is compatible with the selected columns based on:
      
      a number of selected columns being equal to a number of columns in the visualization;
      
      a number of selected columns classified as dimensions being equal to a number of columns classified as dimensions in the visualization;
      
      a number of selected columns classified as measures being equal to a number of columns classified as measures in the visualization; and
      
      the selected columns satisfying constraints of the visualization regarding columns.
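
Claim 10 scores a column from the p-value of a correlation coefficient test. The sketch below illustrates one way to obtain such a p-value using only the Python standard library. It swaps in a permutation test on the Pearson coefficient rather than the usual t-distribution test, and the mapping `score = 1 - p` is an assumption; the claim does not specify either. All function names are hypothetical.

```python
import math
import random
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x: Sequence[float], y: Sequence[float],
                        n_permutations: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation p-value for the Pearson correlation of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rng = random.Random(seed)  # seeded so repeated calls give the same p-value
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Count shuffles at least as extreme as the observed correlation.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


def correlation_score(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed mapping from p-value to score: smaller p-value, higher score."""
    return 1.0 - permutation_p_value(x, y)
```

With this mapping, a strongly correlated pair of columns scores close to 1, while a pair with no detectable linear relationship scores close to 0.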

13. A non-transitory machine-readable storage medium including instructions that, when executed on at least one processor of a machine, cause the machine to perform the operations comprising:
- accessing a dataset and a user selection of at least one column of the dataset by at least one hardware processor, the dataset including at least one online analytical processing (OLAP) cube with each column of the cube classified as a measure or classified as a dimension;
  
  determining, by the at least one hardware processor, which type of analysis to perform on each of the unselected columns of the dataset based on:
  
  the classification of the at least one column selected by a user; and
  
  the unselected column being classified as a dimension and a cardinality of the dimension satisfying specified criteria;
  
  analyzing the dataset, by the at least one hardware processor, to generate a score for each unselected column of the dataset based on a degree of dependency between each of the unselected columns and the at least one selected column;
  
  iteratively displaying a ranking of the unselected columns according to the scores, and accessing a user selection of one more column by the at least one hardware processor until a threshold number of columns has been selected;
  
  accessing the selected columns of the dataset by the at least one hardware processor; and
  
  selecting, by the at least one hardware processor, a specified number of visualization configurations compatible with the selected columns from a set of visualization configurations and providing the compatible visualization configurations to a user, wherein the selecting is based at least in part on a quantity of the selected columns and based at least in part on at least one constraint of a visualization configuration of the specified number of visualization configurations.
  - 14. The non-transitory machine-readable storage medium of claim 13, wherein the operations further comprise:
    - accessing user input by the at least one hardware processor, the input including a deselection of one of the selected columns based on the threshold number of columns being selected.
  - 15. The non-transitory machine-readable storage medium of claim 13, wherein the operations further comprise:
    - aggregating over unselected dimensions of the cube based on an unselected dimension having cardinality less than 10;
      
      analyzing the dataset, by the at least one hardware processor, by performing an analysis of variance (ANOVA) test on the unselected columns of the dataset and on aggregated data; and
      
      generating a score for each unselected column of the dataset based on an effect size of the ANOVA test.
  - 16. The non-transitory machine-readable storage medium of claim 13, wherein the operations further comprise:
    - determining, by the at least one hardware processor, that multiple types of analysis be performed on an unselected column of the dataset;
      
      generating a score for said unselected column based on an average of multiple scores generated for said unselected column by the multiple types of analysis; and
      
      generating a null score for an unselected column based on the at least one column selected by the user being classified as a dimension and the cardinality of the dimension failing to satisfy the specified criteria.
  - 17. The non-transitory machine-readable storage medium of claim 13, wherein:
    - a visualization configuration specifies how a set of columns should be arranged and represented in a chart and includes constraints regarding the columns; and
      
      the operations further comprise determining, by the at least one hardware processor, that a visualization configuration is compatible with the selected columns based on:
      
      a number of selected columns being equal to a number of columns in the visualization;
      
      a number of selected columns classified as dimensions being equal to a number of columns classified as dimensions in the visualization;
      
      a number of selected columns classified as measures being equal to a number of columns classified as measures in the visualization; and
      
      the selected columns satisfying constraints of the visualization regarding columns.
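
The compatibility test recited in claims 6, 12, and 17 can be sketched as a simple predicate. This is an illustrative reading of the claim language, not the actual API: the `Column` and `VisualizationConfig` classes, their fields, and the `suggest` helper are hypothetical, and constraints are modelled here as plain callables over the selected columns.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Column:
    name: str
    kind: str             # "dimension" or "measure"
    cardinality: int = 0  # number of distinct members; meaningful for dimensions


@dataclass
class VisualizationConfig:
    name: str
    n_dimensions: int
    n_measures: int
    # Hypothetical constraint representation: each rule inspects the selected columns.
    constraints: List[Callable[[Sequence[Column]], bool]] = field(default_factory=list)


def is_compatible(config: VisualizationConfig, selected: Sequence[Column]) -> bool:
    """Check column count, dimension count, measure count, and every constraint."""
    dims = sum(1 for c in selected if c.kind == "dimension")
    measures = sum(1 for c in selected if c.kind == "measure")
    return (len(selected) == config.n_dimensions + config.n_measures
            and dims == config.n_dimensions
            and measures == config.n_measures
            and all(rule(selected) for rule in config.constraints))


def suggest(configs: Sequence[VisualizationConfig],
            selected: Sequence[Column],
            limit: int) -> List[VisualizationConfig]:
    """Return at most `limit` configurations compatible with the selection."""
    return [c for c in configs if is_compatible(c, selected)][:limit]
```

For example, a pie chart configuration might carry a constraint that its dimension has at most eight members, so that a twelve-member dimension paired with one measure would match a bar chart but not the pie chart.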

Current Assignee
Business Objects Incorporated (SAP SE)
Original Assignee
Business Objects Incorporated (SAP SE)
Inventors
Wong, Johnson; Moser, Flavia; Kumar, Viren
Primary Examiner(s)
Corrielus, Jean M

Application Number

US14/490,172
Publication Number

US 20160085835A1
Time in Patent Office

1,167 Days
Field of Search

707/752
US Class Current
CPC Class Codes

G06F 16/221   Column-oriented storage; Ma...

G06F 16/248   Presentation of query results

G06F 16/254   Extract, transform and load...

G06F 16/26   Visual data mining; Browsin...

G06F 16/283   Multi-dimensional databases...

G06F 16/287   Visualization; Browsing