Similarity calculation system, method of calculating similarity, and program
Abstract
Provided is a similarity calculation system for equalizing the time needed to calculate a similarity between target vectors and a query vector. The similarity calculation system includes a target vector acquisition part for acquiring a plurality of target vectors, and a clustering part for clustering the plurality of target vectors based on a calculation amount estimated for each of the plurality of target vectors. The calculation amount is the amount estimated for calculating a similarity between each of the plurality of target vectors and a given reference query vector. The clustering is performed so that the difference, among the plurality of clusters, in the total calculation amount for calculating the similarity between all of the target vectors belonging to each cluster and the given reference query vector decreases.
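The balancing goal stated in the abstract can be illustrated with a small sketch. It assumes each target vector is stored as a Python dict mapping an element identifier to a value, and it uses the number of non-zero elements as the estimated calculation amount, as the claims describe. The greedy assignment below is a simple stand-in heuristic chosen for illustration; it is not the graph-division method recited in the claims, and the names `estimate_cost` and `balance_clusters` are hypothetical.

```python
import heapq


def estimate_cost(vector):
    """Estimated calculation amount: the number of non-zero elements."""
    return sum(1 for value in vector.values() if value != 0)


def balance_clusters(vectors, num_clusters):
    """Greedy heuristic (an assumption, not the claimed graph division):
    place each vector, heaviest first, into the currently lightest cluster."""
    # Heap entries are (total estimated cost so far, cluster index).
    heap = [(0, c) for c in range(num_clusters)]
    heapq.heapify(heap)
    clusters = [[] for _ in range(num_clusters)]
    order = sorted(range(len(vectors)),
                   key=lambda i: estimate_cost(vectors[i]),
                   reverse=True)
    for i in order:
        total, c = heapq.heappop(heap)
        clusters[c].append(i)
        heapq.heappush(heap, (total + estimate_cost(vectors[i]), c))
    return clusters


if __name__ == "__main__":
    vectors = [
        {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0},  # cost 4
        {0: 1.0, 1: 1.0, 2: 1.0},          # cost 3
        {4: 1.0, 5: 1.0},                  # cost 2
        {6: 1.0},                          # cost 1
    ]
    print(balance_clusters(vectors, 2))
```

With per-cluster totals kept close together, each cluster takes a comparable time to scan, which is the equalization the abstract aims for.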
6 Claims
1. A similarity calculation system for increasing the efficiency of a computer when performing searching, comprising:
- at least one processor; and
- at least one memory device that stores a plurality of instructions, which when executed by the at least one processor, causes the at least one processor to operate to:
  - acquire a query vector;
  - acquire a plurality of target vectors;
  - calculate a similarity between each of the plurality of target vectors belonging to any one of the plurality of clusters and the query vector,
  - calculate, for each of the plurality of target vectors, a calculation amount to be estimated when calculating the similarity between the each of the plurality of target vectors and the query vector,
  - cluster the plurality of target vectors based on the calculation amount to be estimated for each of the plurality of target vectors,
- wherein, in the calculation, the processor calculates a number of non-zero elements of each of the plurality of target vectors as the estimated calculation amount,
- wherein, in the clustering, the processor clusters the plurality of target vectors so that a difference in a total sum of the calculated calculation amounts for all of the plurality of target vectors belonging to each of the plurality of clusters among the plurality of clusters decreases,
- wherein, in the clustering, the processor clusters the plurality of target vectors by generating a graph comprising:
  - a plurality of first nodes that correspond to each of the plurality of target vectors and that has the calculation amount estimated for a corresponding one of the plurality of target vectors as a weight,
  - a plurality of second nodes corresponding to an element type of the plurality of target vectors, and
  - a plurality of edges connecting each of the plurality of first nodes to any one of the plurality of second nodes,
  and by dividing the generated graph based on the weight of each of the plurality of first nodes.

Dependent claims: 2, 3, 4.
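The graph recited in claim 1 can be sketched as a plain data structure. The sketch assumes dict-based sparse vectors and assumes that an edge joins a vector's first node to the second node of every element type in which that vector has a non-zero value; the claim text itself only says the edges connect first nodes to second nodes. The function names are hypothetical, and the division step itself (typically delegated to a graph partitioner) is not shown. Only a helper that scores how evenly a given division spreads the first-node weights is included.

```python
def build_bipartite_graph(vectors):
    """Build first nodes (one per vector, weighted by its non-zero count),
    second nodes (one per element type), and the edges between them."""
    first_nodes = {}      # vector index -> weight (estimated calculation amount)
    second_nodes = set()  # element types that appear with a non-zero value
    edges = []            # (vector index, element type) pairs
    for idx, vector in enumerate(vectors):
        nonzero = [element for element, value in vector.items() if value != 0]
        first_nodes[idx] = len(nonzero)
        for element in nonzero:
            second_nodes.add(element)
            edges.append((idx, element))
    return first_nodes, second_nodes, edges


def weight_spread(first_nodes, assignment, num_clusters):
    """Difference between the heaviest and lightest cluster for a given
    division, where assignment maps vector index -> cluster index."""
    totals = [0] * num_clusters
    for idx, cluster in assignment.items():
        totals[cluster] += first_nodes[idx]
    return max(totals) - min(totals)


if __name__ == "__main__":
    vectors = [{"a": 1.0, "b": 2.0}, {"b": 0.5}, {"c": 0.0, "d": 3.0}]
    first_nodes, second_nodes, edges = build_bipartite_graph(vectors)
    print(first_nodes, sorted(second_nodes), edges)
    print(weight_spread(first_nodes, {0: 0, 1: 1, 2: 1}, 2))
```

A division that drives `weight_spread` toward zero corresponds to the claimed condition that the difference in total calculation amount among clusters decreases.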
5. A method of calculating a similarity among target vectors for increasing the efficiency of a computer when performing searching, comprising:
- acquiring a query vector;
- acquiring, with at least one processor operating with a memory device in a server, a plurality of target vectors;
- calculating a similarity between each of the plurality of target vectors belonging to any one of the plurality of clusters and the query vector,
- calculating, for each of the plurality of target vectors, a calculation amount to be estimated when calculating the similarity between the each of the plurality of target vectors and the query vector, by calculating a number of non-zero elements of each of the plurality of target vectors as the estimated calculation amount;
- clustering, with the at least one processor operating with the memory device in the server, the plurality of target vectors based on the calculation amount to be estimated for each of the plurality of target vectors such that the processor clusters the plurality of target vectors so that a difference in a total sum of the calculated calculation amounts for all of the plurality of target vectors belonging to each of the plurality of clusters among the plurality of clusters decreases,
- clustering the plurality of target vectors by generating a graph, the graph comprising:
  - a plurality of first nodes that correspond to each of the plurality of target vectors and that has the calculation amount estimated for a corresponding one of the plurality of target vectors as a weight,
  - a plurality of second nodes corresponding to an element type of the plurality of target vectors, and
  - a plurality of edges connecting each of the plurality of first nodes to any one of the plurality of second nodes,
  and by dividing the generated graph based on the weight of each of the plurality of first nodes.
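To see why the number of non-zero elements is a reasonable estimate of the calculation amount, consider the similarity step of the method above. The sketch below assumes the similarity is an inner product over dict-based sparse vectors; the claims do not fix a particular similarity measure, so this choice and the names `sparse_similarity` and `search_cluster` are illustrative assumptions. The operation counter shows that the work per target vector equals its non-zero count.

```python
def sparse_similarity(target, query):
    """Inner product of two sparse vectors (an assumed similarity measure).
    Returns (score, ops), where ops counts the multiplications performed."""
    score = 0.0
    ops = 0
    for element, value in target.items():
        if value != 0:
            score += value * query.get(element, 0.0)
            ops += 1
    return score, ops


def search_cluster(cluster, vectors, query):
    """Scan one cluster and return (best index, best score, total ops)."""
    best_idx, best_score, total_ops = None, float("-inf"), 0
    for idx in cluster:
        score, ops = sparse_similarity(vectors[idx], query)
        total_ops += ops
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx, best_score, total_ops


if __name__ == "__main__":
    vectors = [{"a": 1.0, "b": 2.0}, {"b": 0.5}, {"d": 3.0}]
    query = {"b": 2.0, "d": 1.0}
    print(search_cluster([0, 1], vectors, query))
    print(search_cluster([2], vectors, query))
```

Here the first cluster costs three multiplications and the second only one, which is the kind of imbalance that clustering by estimated calculation amount is meant to reduce.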
6. A computer-readable non-transitory storage medium storing a plurality of instructions for calculating a similarity among target vectors for increasing the efficiency of a computer when performing searching, wherein when executed by at least one processor, the plurality of instructions cause the at least one processor to:
- acquire a query vector;
- acquire a plurality of target vectors;
- calculate a similarity between each of the plurality of target vectors belonging to any one of the plurality of clusters and the query vector,
- calculate, for each of the plurality of target vectors, a calculation amount to be estimated when calculating the similarity between the each of the plurality of target vectors and the query vector,
- cluster the plurality of target vectors based on the calculation amount to be estimated for each of the plurality of target vectors,
- wherein, in the calculation, the processor calculates a number of non-zero elements of each of the plurality of target vectors as the estimated calculation amount,
- wherein, in the clustering, the processor clusters the plurality of target vectors so that a difference in a total sum of the calculated calculation amounts for all of the plurality of target vectors belonging to each of the plurality of clusters among the plurality of clusters decreases,
- wherein, in the clustering, the processor clusters the plurality of target vectors by generating a graph comprising:
  - a plurality of first nodes that correspond to each of the plurality of target vectors and that has the calculation amount estimated for a corresponding one of the plurality of target vectors as a weight,
  - a plurality of second nodes corresponding to an element type of the plurality of target vectors, and
  - a plurality of edges connecting each of the plurality of first nodes to any one of the plurality of second nodes,
  and by dividing the generated graph based on the weight of each of the plurality of first nodes.
Specification