Techniques for facilitating the joining of datasets
First Claim
1. A method comprising, at a computer system:
generating first profile metadata for each column of a first plurality of columns in a first dataset stored in a first data source;
generating second profile metadata for each column of a second plurality of columns in a second dataset stored in a second data source;
identifying, based on the first profile metadata and the second profile metadata, a plurality of column pairs between the first dataset and the second dataset, wherein columns in each of the plurality of column pairs have a relationship;
determining one or more recommendations for blending each of one or more column pairs of the plurality of column pairs that have the relationship;
determining one or more types of join functions that can be applied to each of the one or more column pairs based on the one or more recommendations for blending each of the one or more column pairs;
generating a first graphical interface to display each of the one or more types of join functions that can be applied to each of the one or more column pairs that have the relationship, wherein each of the one or more types of join functions are displayed showing a diagram of a type of join function for joining the first dataset with the second dataset by columns in a different column pair of the plurality of column pairs;
receiving input corresponding to selection of a first type of join function of the one or more types of join functions; and
generating a second graphical interface to display a third dataset based on joining, according to the first type of join function, the first dataset at a first column within a first column pair with the second dataset at a second column in the first column pair.
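The profiling and column-pair identification steps recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation: the function names `profile_columns` and `find_column_pairs`, the metadata fields, the sample rows, and the use of Jaccard overlap with a 0.5 threshold as the "relationship" test are all assumptions made for illustration.

```python
from itertools import product


def profile_columns(dataset):
    """Build simple profile metadata for each column of a dataset.

    `dataset` is a list of dict rows with identical keys. The metadata
    fields (inferred type, set of distinct values) are illustrative
    assumptions, not fields named by the claim.
    """
    columns = dataset[0].keys() if dataset else []
    profile = {}
    for col in columns:
        values = [row[col] for row in dataset if row[col] is not None]
        types = {type(v).__name__ for v in values}
        profile[col] = {
            "type": types.pop() if len(types) == 1 else "mixed",
            "distinct": set(values),
        }
    return profile


def find_column_pairs(first_profile, second_profile, min_overlap=0.5):
    """Return (first_col, second_col, overlap) for columns that look related.

    Overlap is the Jaccard similarity of the distinct values, an assumed
    stand-in for whatever relationship test a real system would apply.
    """
    pairs = []
    for (c1, p1), (c2, p2) in product(first_profile.items(),
                                      second_profile.items()):
        if p1["type"] != p2["type"]:
            continue  # columns of different types are not paired
        union = p1["distinct"] | p2["distinct"]
        if not union:
            continue
        overlap = len(p1["distinct"] & p2["distinct"]) / len(union)
        if overlap >= min_overlap:
            pairs.append((c1, c2, overlap))
    # Strongest candidate relationships first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical sample data for the two datasets.
orders = [
    {"order_id": 1, "cust": "a1"},
    {"order_id": 2, "cust": "b2"},
    {"order_id": 3, "cust": "c3"},
]
customers = [
    {"customer_id": "a1", "name": "Ada"},
    {"customer_id": "b2", "name": "Bo"},
    {"customer_id": "d4", "name": "Dee"},
]

pairs = find_column_pairs(profile_columns(orders), profile_columns(customers))
print(pairs)
```

In this sketch only `cust` and `customer_id` share enough values to be reported as a candidate pair; a real system would likely weigh further signals such as column names or value patterns.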
Abstract
Techniques are disclosed for a system that provides an intuitive way for merging or joining data from different datasets. The system may provide graphical interfaces to enable a user to combine or join datasets identified as having a relationship. In at least one embodiment, the system can determine options for joining datasets, such as by a left join, right join, or outer join. A graphical interface may display a visual representation (e.g., a “Venn diagram”) to illustrate options for joining datasets based on identifying a relationship between the datasets. The representation may further illustrate one or more types of joins and information about the data, such as rows where data may be joined based on the type of join function for the relationship by columns. The visual representation may indicate where the datasets can be joined, such that they are not overlapping.
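To make the join options in the abstract concrete, the following sketch joins two small datasets on one column pair and counts the rows each join type would produce, which is the kind of information a Venn-style diagram could summarize. The `join` function, the sample rows, and the column names are hypothetical, and the sketch assumes the two datasets have no column names in common.

```python
def join(first, second, first_col, second_col, how="inner"):
    """Join two lists of dict rows where first_col equals second_col.

    `how` is one of "inner", "left", "right", or "outer". Column names are
    assumed to be distinct across the two datasets.
    """
    if how not in {"inner", "left", "right", "outer"}:
        raise ValueError(f"unsupported join type: {how}")
    first_cols = list(first[0].keys()) if first else []
    second_cols = list(second[0].keys()) if second else []
    result = []
    matched_second = set()
    for row in first:
        hits = [i for i, other in enumerate(second)
                if other[second_col] == row[first_col]]
        if hits:
            # Matching rows appear in every join type.
            for i in hits:
                matched_second.add(i)
                result.append({**row, **second[i]})
        elif how in ("left", "outer"):
            # Keep unmatched rows of the first dataset, padded with None.
            result.append({**row, **{c: None for c in second_cols}})
    if how in ("right", "outer"):
        # Keep unmatched rows of the second dataset, padded with None.
        for i, other in enumerate(second):
            if i not in matched_second:
                result.append({**{c: None for c in first_cols}, **other})
    return result


# Hypothetical sample data.
orders = [
    {"order_id": 1, "cust": "a1"},
    {"order_id": 2, "cust": "b2"},
    {"order_id": 3, "cust": "c3"},
]
customers = [
    {"customer_id": "a1", "name": "Ada"},
    {"customer_id": "b2", "name": "Bo"},
    {"customer_id": "d4", "name": "Dee"},
]

# Row counts per join type, e.g. for labelling the regions of a diagram.
counts = {
    how: len(join(orders, customers, "cust", "customer_id", how))
    for how in ("inner", "left", "right", "outer")
}
print(counts)
```

The nested scan keeps the sketch short; for large datasets an index on the second column would be the more usual design.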
20 Claims
1. A method comprising, at a computer system:
generating first profile metadata for each column of a first plurality of columns in a first dataset stored in a first data source;
generating second profile metadata for each column of a second plurality of columns in a second dataset stored in a second data source;
identifying, based on the first profile metadata and the second profile metadata, a plurality of column pairs between the first dataset and the second dataset, wherein columns in each of the plurality of column pairs have a relationship;
determining one or more recommendations for blending each of one or more column pairs of the plurality of column pairs that have the relationship;
determining one or more types of join functions that can be applied to each of the one or more column pairs based on the one or more recommendations for blending each of the one or more column pairs;
generating a first graphical interface to display each of the one or more types of join functions that can be applied to each of the one or more column pairs that have the relationship, wherein each of the one or more types of join functions are displayed showing a diagram of a type of join function for joining the first dataset with the second dataset by columns in a different column pair of the plurality of column pairs;
receiving input corresponding to selection of a first type of join function of the one or more types of join functions; and
generating a second graphical interface to display a third dataset based on joining, according to the first type of join function, the first dataset at a first column within a first column pair with the second dataset at a second column in the first column pair.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
one or more processors; and
a memory accessible to the one or more processors, the memory comprising instructions that, when executed by the one or more processors, cause the one or more processors to:
generate first profile metadata for each column of a first plurality of columns in a first dataset stored in a first data source;
generate second profile metadata for each column of a second plurality of columns in a second dataset stored in a second data source;
identify, based on the first profile metadata and the second profile metadata, a plurality of column pairs between the first dataset and the second dataset, wherein columns in each of the plurality of column pairs have a relationship;
determine one or more recommendations for blending each of one or more column pairs of the plurality of column pairs that have the relationship;
determine one or more types of join functions that can be applied to each of the one or more column pairs based on the one or more recommendations for blending each of the one or more column pairs;
generate a first graphical interface to display each of the one or more types of join functions that can be applied to each of the one or more column pairs that have the relationship, wherein each of the one or more types of join functions are displayed showing a diagram of a type of join function for joining the first dataset with the second dataset by columns in a different column pair of the plurality of column pairs;
receive input corresponding to selection of a first type of join function of the one or more types of join functions; and
generate a second graphical interface to display a third dataset based on joining, according to the first type of join function, the first dataset at a first column within a first column pair with the second dataset at a second column in the first column pair.
Dependent claims: 9, 10, 11, 12, 13
14. A non-transitory computer readable medium storing one or more instructions that are executable by one or more processors to cause the one or more processors to:
generate first profile metadata for each column of a first plurality of columns in a first dataset stored in a first data source;
generate second profile metadata for each column of a second plurality of columns in a second dataset stored in a second data source;
identify, based on the first profile metadata and the second profile metadata, a plurality of column pairs between the first dataset and the second dataset, wherein columns in each of the plurality of column pairs have a relationship;
determine one or more recommendations for blending each of one or more column pairs of the plurality of column pairs that have the relationship;
determine one or more types of join functions that can be applied to each of the one or more column pairs based on the one or more recommendations for blending each of the one or more column pairs;
generate a first graphical interface to display each of the one or more types of join functions that can be applied to each of the one or more column pairs that have the relationship, wherein each of the one or more types of join functions are displayed showing a diagram of a type of join function for joining the first dataset with the second dataset by columns in a different column pair of the plurality of column pairs;
receive input corresponding to selection of a first type of join function of the one or more types of join functions; and
generate a second graphical interface to display a third dataset based on joining, according to the first type of join function, the first dataset at a first column within a first column pair with the second dataset at a second column in the first column pair.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification