Similarity and ranking of databases based on database metadata
Abstract
A processor selects a first database and a second database from a plurality of databases. The processor determines one or more terms found in the first and second databases, wherein each term of the one or more terms includes metadata of a database of the plurality of databases. The processor identifies one or more common terms between the first database and the second database and determines the one or more common terms found in each of a plurality of groups of databases of the plurality of databases, wherein each group of databases corresponds to a number of databases which constitute the group of databases. The processor determines a similarity score between the first database and the second database of the plurality of databases based on the one or more common terms found in each group of databases of the plurality of databases.
20 Claims
1. A computer program product for determining a similarity of databases, the computer program product comprising:
- a computer-readable storage device having computer readable program code embodied therewith, the computer readable program code comprising:
(a) computer readable program code configured to select a first database and a second database from a plurality of databases that includes at least three databases;
(b) computer readable program code configured to identify one or more terms found in the first database and found in the second database of the plurality of databases, wherein each term of the one or more terms is comprised of metadata of a structure of a database of the plurality of databases, and wherein terms of the one or more terms found in both the first database and the second database are one or more common terms;
(c) computer readable program code configured to determine for a common term of the one or more common terms, a quantity of databases of the plurality of databases, in which a common term of the one or more common terms is found, wherein the quantity of databases of the plurality of databases in which the common term of the one or more common terms is found, constitutes a group, and wherein a range of groups includes each quantity of databases from a group of two databases to a group of a quantity of the plurality of databases; and
(d) computer readable program code configured to determine a similarity score between the first database and the second database of the plurality of databases based on a tuple formed from the quantity of the one or more common terms found in each group of databases of the plurality of databases.
Dependent claims: 2, 3, 4, 5, 6
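Steps (a) through (d) of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `group_tuple`, the toy `catalog`, and the representation of each database's structural metadata as a set of strings are all hypothetical. The claim does not say how the tuple becomes a score, so the lexicographic comparison used here, which puts the smallest group first so that terms shared by fewer databases weigh more, is an assumption.

```python
from collections import Counter


def group_tuple(first, second, all_dbs):
    """Return (n_2, ..., n_N), where n_k is the number of terms common to
    `first` and `second` that are found in exactly k of the N databases."""
    n = len(all_dbs)
    if n < 3:
        raise ValueError("the claim requires at least three databases")
    # (b) terms found in both selected databases are the common terms
    common = all_dbs[first] & all_dbs[second]
    # (c) for each common term, count the databases in which it is found;
    #     that count (between 2 and N) identifies the term's group
    found_in = {t: sum(t in terms for terms in all_dbs.values()) for t in common}
    per_group = Counter(found_in.values())
    # (d) tuple of common-term counts, one slot per group size 2..N
    return tuple(per_group.get(k, 0) for k in range(2, n + 1))


# Hypothetical metadata terms (for example table or column names).
catalog = {
    "sales": {"customer", "order", "invoice", "id"},
    "crm": {"customer", "contact", "invoice", "id"},
    "inventory": {"product", "order", "warehouse", "id"},
}

pairs = [("sales", "crm"), ("sales", "inventory"), ("crm", "inventory")]
scores = {pair: group_tuple(pair[0], pair[1], catalog) for pair in pairs}

# Assumption: compare tuples lexicographically, most similar pair first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

In this toy catalog, "sales" and "crm" share two terms that appear in only two databases, so their tuple sorts ahead of the pairs whose overlap is mostly the ubiquitous "id".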
7. A computer system for determining a similarity of databases, the computer system comprising:
- one or more computer processors;
- one or more computer readable storage devices; and
- program instructions stored on the computer readable storage media for execution by at least one of the one or more processors, the program instructions comprising:
(a) program instructions to select a first database and a second database from a plurality of databases that includes at least three databases;
(b) program instructions to identify one or more terms found in the first database and found in the second database, wherein each term of the one or more terms is comprised of metadata of a structure of the first database of the plurality of databases, and wherein terms of the one or more terms found in both the first database and the second database are one or more common terms;
(c) program instructions to determine for a common term of the one or more common terms, a quantity of databases of the plurality of databases, in which a common term of the one or more common terms is found, wherein the quantity of databases of the plurality of databases in which the common term of the one or more common terms is found, constitutes a group, and wherein a range of groups includes each quantity of databases from a group of two databases to a group of a quantity of the plurality of databases; and
(d) program instructions to determine a similarity score between the first database and the second database of the plurality of databases based on a tuple formed from the quantity of the one or more common terms found in each group of databases of the plurality of databases.
Dependent claims: 8, 9, 10, 11, 12
13. A computer program product for determining a similarity of databases to search criteria, the computer program product comprising:
- a computer-readable storage device having computer readable program code embodied therewith, the computer readable program code comprising:
(a) computer readable program code configured to receive search criteria, wherein the search criteria includes one or more terms;
(b) computer readable program code configured to determine the one or more terms found in both the search criteria and a first database of a plurality of databases, wherein the one or more terms found in both the search criteria and a first database are one or more common terms;
(c) computer readable program code configured to determine for a common term of the one or more common terms, a quantity of the one or more common terms found in each of a plurality of groups of databases of the plurality of databases, wherein a group of databases of the plurality of groups of databases includes a quantity of databases in which a common term of the one or more common terms is found, and wherein a range of groups includes each quantity of databases from a group of two databases to a group of a quantity of the plurality of databases; and
(d) computer readable program code configured to determine a similarity score of the first database of the plurality of databases based on a tuple formed from the quantity of the one or more common terms found in each group of databases of the plurality of databases, wherein the similarity of the first database to the search criteria is based on the similarity score.
Dependent claims: 14, 15, 16
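The search-criteria variant in claim 13 can be sketched the same way. Everything named here (`criteria_tuple`, `rank_databases`, the toy `catalog`) is hypothetical. Two assumptions go beyond the claim text: the sketch keeps a slot for a group of one database, so that a criteria term matching a single database is not dropped (the claim lists groups from two upward), and it ranks databases by comparing the tuples lexicographically with the smallest group first.

```python
from collections import Counter


def criteria_tuple(criteria, db_name, all_dbs):
    """Return (n_1, ..., n_N), where n_k is the number of terms shared by the
    search criteria and `db_name` that are found in exactly k databases."""
    # (b) terms found in both the search criteria and the database
    common = set(criteria) & all_dbs[db_name]
    # (c) number of databases in which each common term is found
    found_in = {t: sum(t in terms for terms in all_dbs.values()) for t in common}
    per_group = Counter(found_in.values())
    # Assumption: include a size-one group; the claim's range starts at two.
    return tuple(per_group.get(k, 0) for k in range(1, len(all_dbs) + 1))


def rank_databases(criteria, all_dbs):
    """(d) Score every database against the criteria, best match first.
    Assumption: tuples are compared lexicographically."""
    scored = [(name, criteria_tuple(criteria, name, all_dbs)) for name in all_dbs]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical metadata terms per database.
catalog = {
    "sales": {"customer", "order", "invoice", "id"},
    "crm": {"customer", "contact", "invoice", "id"},
    "inventory": {"product", "order", "warehouse", "id"},
}

ranked = rank_databases({"invoice", "id", "warehouse"}, catalog)
for name, score in ranked:
    print(name, score)
```

Under these assumptions "inventory" ranks first, because it is the only database containing "warehouse", while "sales" and "crm" tie on the more widely shared terms.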
17. A computer system for determining a similarity of databases to search criteria, the computer system comprising:
- one or more computer processors;
- one or more computer readable storage devices; and
- program instructions stored on the computer readable storage media for execution by at least one of the one or more processors, the program instructions comprising:
(a) computer readable program code configured to receive search criteria, wherein the search criteria includes one or more terms;
(b) computer readable program code configured to determine the one or more terms found in both the search criteria and a first database of a plurality of databases, wherein the one or more terms found in both the search criteria and a first database are one or more common terms;
(c) computer readable program code configured to determine for a common term of the one or more common terms, a quantity of the one or more common terms found in each of a plurality of groups of databases of the plurality of databases, wherein a group of databases of the plurality of groups of databases includes a quantity of databases in which a common term of the one or more common terms is found, and wherein a range of groups includes each quantity of databases from a group of two databases to a group of a quantity of the plurality of databases; and
(d) computer readable program code configured to determine a similarity score of the first database of the plurality of databases based on a tuple formed from the quantity of the one or more common terms found in each group of databases of the plurality of databases, wherein the similarity of the first database to the search criteria is based on the similarity score.
Dependent claims: 18, 19, 20
Specification