Method and system for comparing information contents
Abstract
The method and system disclosed herein provide for detecting duplicate information contents, such as emails, before storing them in the system, in a fast and reliable way. A parameter that uniquely represents each information content may be determined, and the comparison of the information contents may then be carried out efficiently on the parameters rather than on the actual information contents.
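The idea in the abstract, checking a compact parameter instead of the full content before storing, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented algorithm: the names `stand_in_parameter` and `ContentStore` are hypothetical, and a SHA-256 digest scaled into the range zero to one is swapped in for the claimed formula, which is not reproduced on this page. Unlike the parameter described in the claims, a hash does not strictly guarantee uniqueness.

```python
import hashlib
from fractions import Fraction


def stand_in_parameter(content: str) -> Fraction:
    """Map content to a value in [0, 1).

    ASSUMPTION: SHA-256 is a stand-in, not the patent's formula.
    """
    digest = hashlib.sha256(content.encode("utf-8")).digest()
    return Fraction(int.from_bytes(digest, "big"), 2 ** 256)


class ContentStore:
    """Hypothetical store that rejects content whose parameter is already known."""

    def __init__(self) -> None:
        self._by_parameter = {}  # parameter value -> stored content

    def add_if_new(self, content: str) -> bool:
        """Store content unless an equal parameter exists; return True if stored."""
        r = stand_in_parameter(content)
        if r in self._by_parameter:
            return False  # treated as a duplicate, not stored again
        self._by_parameter[r] = content
        return True


store = ContentStore()
first = store.add_if_new("Quarterly report attached.")
second = store.add_if_new("Quarterly report attached.")
```

Here `first` is `True` and `second` is `False`: the second message is rejected by comparing parameters only, without comparing the message bodies.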
14 Claims
1. A computer implemented method for detecting whether a received information content is identical to any of a plurality of stored information contents, comprising the steps of:
- calculating a plurality of unique parameter values by applying an algorithm that calculates, for each of a plurality of stored information contents, a unique parametric value to a predetermined precision set by an organization, wherein each unique parametric value represents one of the plurality of stored information contents;
- storing the plurality of parameter values;
- receiving a new information content;
- responsive to receiving said new information content, applying said algorithm to said received information content to calculate a parametric value representing the received information content;
- comparing the parameter value representing the received information content with each of said unique plurality of stored parameter values; and
- indicating that the received information content is identical to a stored information content if the corresponding parameter values are equal;
wherein said algorithm is: [equation image not reproduced]
where “R” stands for the parameter that uniquely represents the received information content, the numerical value of “R” may be within zero and one, the factor “n” represents the position order of the constituent characters of the received information content, and the factor “a” represents a unique value for the constituent characters in the received information content.
Dependent claims: 2, 3, 4, 5
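The claim describes R as a value between zero and one built from a per-character value “a” and a position “n”, calculated to a predetermined precision. The equation itself is not reproduced here, so the sketch below uses a hypothetical stand-in with those properties: a positional expansion R = Σ a_n / BASE^n. The choice of `BASE`, the mapping `a = ord(ch) + 1`, and the function names are assumptions for illustration only and should not be read as the claimed formula.

```python
from fractions import Fraction

# ASSUMPTION: one more than the number of Unicode code points, so that
# every "a" below satisfies 1 <= a <= BASE - 1.
BASE = 0x110000 + 1


def parameter(content: str) -> Fraction:
    """Hypothetical stand-in: R = sum of a_n / BASE**n for n = 1..len(content)."""
    r = Fraction(0)
    for n, ch in enumerate(content, start=1):  # n: position of the character
        a = ord(ch) + 1                        # a: unique nonzero value per character
        r += Fraction(a, BASE ** n)
    return r


def truncated(r: Fraction, digits: int) -> Fraction:
    """Truncate R to a predetermined number of decimal digits."""
    scale = 10 ** digits
    return Fraction(int(r * scale), scale)


exact_differs = parameter("aX") != parameter("aY")
coarse_collides = truncated(parameter("aX"), 6) == truncated(parameter("aY"), 6)
```

With exact arithmetic, distinct strings give distinct values under this stand-in, so `exact_differs` is `True`. Truncating to six decimal digits makes the two values equal (`coarse_collides` is `True`), which suggests that in any scheme of this kind the chosen precision must be high enough for the longest content to be compared.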
6. A computer implemented method for comparing a plurality of information contents, comprising the steps of:
- calculating a plurality of parameter values by applying an algorithm that calculates each of a plurality of stored information contents to a predetermined precision, each parametric value representing one of the plurality of information contents;
- comparing the plurality of parameter values, such that equality between a pair of the plurality of parameter values indicates that corresponding pair of the plurality of information contents is identical;
wherein said algorithm is: [equation image not reproduced]
where “R” stands for the parameter that uniquely represents the received information content, the numerical value of “R” may be within zero and one, the factor “n” represents the position order of the constituent characters of the received information content, and the factor “a” represents a unique value for the constituent characters in the received information content.
Dependent claims: 7, 8, 9, 10
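The pairwise comparison in this claim can be illustrated by grouping contents on their parameter values and reporting every pair that shares a value. This is a sketch under assumptions: `parameter` is a hypothetical stand-in based on SHA-256 rather than the claimed formula (which is not reproduced here), and `identical_pairs` is an illustrative name.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def parameter(content: str) -> str:
    """ASSUMPTION: SHA-256 hex digest stands in for the claimed parameter."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def identical_pairs(contents: list[str]) -> list[tuple[int, int]]:
    """Return index pairs (i, j), i < j, whose parameter values are equal."""
    groups = defaultdict(list)  # parameter value -> indices with that value
    for index, content in enumerate(contents):
        groups[parameter(content)].append(index)
    pairs = []
    for indices in groups.values():
        pairs.extend(combinations(indices, 2))  # all pairs within one group
    return sorted(pairs)


pairs = identical_pairs(["a", "b", "a", "c", "b", "a"])
# pairs == [(0, 2), (0, 5), (1, 4), (2, 5)]
```

Grouping by parameter value avoids comparing every content against every other content; only contents that share a value are paired.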
11. A computer readable medium embodying a computer implemented method for comparing a plurality of information contents, the computer implemented method comprising the steps of:
- calculating a plurality of parameter values by applying an algorithm that calculates each of a plurality of stored information contents to a predetermined precision, each parametric value representing one of the plurality of information contents;
- comparing the plurality of parameter values, such that equality between a pair of the plurality of parameter values indicates that corresponding pair of the plurality of information contents is identical;
wherein said algorithm is: [equation image not reproduced]
where “R” stands for the parameter that uniquely represents the received information content, the numerical value of “R” may be within zero and one, the factor “n” represents the position order of the constituent characters of the received information content, and the factor “a” represents a unique value for the constituent characters in the received information content.
12. A system for comparing a plurality of information contents, comprising:
- at least one user terminal;
- means for calculating a plurality of parameter values by applying an algorithm that calculates each of a plurality of stored information contents to a predetermined precision, each parametric value representing one of the plurality of information contents;
- means for comparing the plurality of parameter values, such that equality between a pair of the plurality of parameter values indicates that corresponding pair of the plurality of information contents is identical; and
- at least one database containing the plurality of information contents and the plurality of parameters;
wherein said algorithm is: [equation image not reproduced]
where “R” stands for the parameter that uniquely represents the received information content, the numerical value of “R” may be within zero and one, the factor “n” represents the position order of the constituent characters of the received information content, and the factor “a” represents a unique value for the constituent characters in the received information content.
Dependent claims: 13, 14
Specification