Methods and systems for analyzing data related to possible online fraud
First Claim
1. A method, comprising:
periodically collecting, with a computer, from a plurality of different sources, a set of data related to a web site, wherein the set of data comprises a web page on the web site;
dividing, with the computer, the set of data into a plurality of components, the plurality of components including at least an Internet Protocol (“IP”) address associated with the web site and a body field comprising text;
analyzing at least two of the components, wherein analyzing the at least two of the plurality of components comprises:
analyzing the text of the body field to identify at least one of a pre-defined blacklisted term and a brand name;
identifying a domain of the web site;
identifying an Internet Protocol (“IP”) block assigned to the domain; and
comparing the IP address of the web site with the IP block assigned to the domain;
assigning at least one score to one or more of the analyzed components; and
categorizing the web site as a possibly fraudulent web site, based at least in part on the at least one score.
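The analysis, scoring, and categorizing steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the term lists, the domain-to-IP-block table, the one-point-per-finding scoring, and the threshold are all hypothetical values chosen for illustration, and the addresses come from ranges reserved for documentation.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical reference data (assumptions, not taken from the patent).
DOMAIN_IP_BLOCKS = {"examplebank.com": "198.51.100.0/24"}
BLACKLISTED_TERMS = ("verify your account", "password")
BRAND_NAMES = ("examplebank",)


def score_body(body_text: str) -> int:
    """Score the body field: one point per blacklisted term or brand name found."""
    lowered = body_text.lower()
    return sum(1 for term in BLACKLISTED_TERMS + BRAND_NAMES if term in lowered)


def score_ip(url: str, ip: str) -> int:
    """Score the IP component: one point if the site's IP address lies
    outside the IP block assigned to its domain."""
    domain = urlparse(url).hostname          # identify the domain of the web site
    block = DOMAIN_IP_BLOCKS.get(domain)     # identify the block assigned to it
    if block is None:
        return 0                             # no block on record, nothing to compare
    inside = ipaddress.ip_address(ip) in ipaddress.ip_network(block)
    return 0 if inside else 1


def categorize(url: str, ip: str, body_text: str, threshold: int = 2) -> str:
    """Combine the component scores and categorize the web site."""
    total = score_body(body_text) + score_ip(url, ip)
    return "possibly fraudulent" if total >= threshold else "not flagged"
```

For example, `categorize("http://examplebank.com/login", "203.0.113.5", "Please verify your account at ExampleBank")` finds two matching terms plus an address outside the assigned block, so the sketch flags the site.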
Abstract
Various embodiments of the invention provide methods, systems and software for analyzing data. In particular embodiments, for example, a set of data about a web site may be analyzed to determine whether the web site is likely to be illegitimate (e.g., to be involved in a fraudulent scheme, such as a phishing scheme, the sale of gray market goods, etc.). In an exemplary embodiment, a set of data may be divided into a plurality of components (each of which, in some cases, may be considered a separate data set). Merely by way of example, a set of data may comprise data gathered from a plurality of data sources, and/or each component may comprise data gathered from one of the plurality of data sources. As another example, a set of data may comprise a document with a plurality of sections, and each component may comprise one of the plurality of sections. Those skilled in the art will appreciate that the analysis of one component may comprise certain tests and/or evaluations, and that the analysis of another component may comprise different tests and/or evaluations. In other cases, the analysis of each component may comprise similar tests and/or evaluations. The variety of tests and/or evaluations generally will be implementation-specific.
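As a rough illustration of the component-wise analysis the abstract describes, the following Python sketch splits a document into sections and applies a different test to each. The header/body split on the first blank line and the two tests themselves are hypothetical choices, not taken from the specification.

```python
from typing import Callable, Dict


def split_components(document: str) -> Dict[str, str]:
    """Divide a document into two components at the first blank line
    (an assumed layout: header section, then body section)."""
    header, _, body = document.partition("\n\n")
    return {"header": header, "body": body}


def header_test(text: str) -> int:
    """Hypothetical header test: score 1 if a Reply-To field is present."""
    return 1 if "reply-to:" in text.lower() else 0


def body_test(text: str) -> int:
    """Hypothetical body test: score 1 if the text contains a plain-HTTP link."""
    return 1 if "http://" in text.lower() else 0


# Each component gets its own analysis; the tests may differ per component.
ANALYZERS: Dict[str, Callable[[str], int]] = {
    "header": header_test,
    "body": body_test,
}


def analyze(document: str) -> Dict[str, int]:
    """Return one score per component."""
    components = split_components(document)
    return {name: ANALYZERS[name](text) for name, text in components.items()}
```

Keeping the analyzers in a table keyed by component name makes it straightforward to give each component different tests, or to reuse the same test for several components.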
19 Claims
1. A method, comprising:
periodically collecting, with a computer, from a plurality of different sources, a set of data related to a web site, wherein the set of data comprises a web page on the web site;
dividing, with the computer, the set of data into a plurality of components, the plurality of components including at least an Internet Protocol (“IP”) address associated with the web site and a body field comprising text;
analyzing at least two of the components, wherein analyzing the at least two of the plurality of components comprises:
analyzing the text of the body field to identify at least one of a pre-defined blacklisted term and a brand name;
identifying a domain of the web site;
identifying an Internet Protocol (“IP”) block assigned to the domain; and
comparing the IP address of the web site with the IP block assigned to the domain;
assigning at least one score to one or more of the analyzed components; and
categorizing the web site as a possibly fraudulent web site, based at least in part on the at least one score.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer system, comprising a hardware processor and a set of instructions executable by the hardware processor, the set of instructions comprising:
instructions for periodically collecting, from a plurality of different sources, a set of data related to a web site, wherein the set of data comprises a web page on the web site;
instructions for dividing the set of data into a plurality of components, the plurality of components comprising an Internet Protocol (“IP”) address associated with the web site and a body field comprising text;
instructions for analyzing at least two of the plurality of components, comprising:
instructions for analyzing the text of the body field to identify at least one of a pre-defined blacklisted term and a brand name;
instructions for identifying a domain of the web site;
instructions for identifying an Internet Protocol (“IP”) block assigned to the domain; and
instructions for comparing the IP address of the web site with the IP block assigned to the domain;
instructions for assigning at least one score to one or more of the analyzed components; and
instructions for categorizing the web site as a possibly fraudulent web site, based at least in part on the at least one score.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A software program embodied on a non-transitory computer readable medium, the software program comprising a set of instructions executable by one or more computers, the set of instructions comprising:
instructions for periodically collecting, from a plurality of different sources, a set of data related to a web site, wherein the set of data comprises a web page on the web site;
instructions for dividing the set of data into a plurality of components, wherein the plurality of components comprises an Internet Protocol (“IP”) address associated with the web site and a body field comprising text;
instructions for analyzing at least two of the plurality of components, comprising:
instructions for analyzing the text of the body field to identify at least one of a pre-defined blacklisted term and a brand name;
instructions for identifying a domain of the web site;
instructions for identifying an Internet Protocol (“IP”) block assigned to the domain; and
instructions for comparing the IP address of the web site with the IP block assigned to the domain;
instructions for assigning at least one score to at least some of the analyzed components; and
instructions for categorizing the web site as a possibly fraudulent web site, based at least in part on the at least one score.
Dependent claims: 16, 17, 18, 19
Specification