Method and system for identifying business listing characteristics
First Claim
1. A computer implemented method for identifying business listing characteristics, the method comprising:
determining, using one or more processors, a first frequency value of a business listing characteristic within a first plurality of business listings over a first time period, wherein the first plurality of business listings are associated with a particular geographical region;
comparing, by the one or more processors, the first frequency value with a normal frequency value of the business listing characteristic previously determined from a second plurality of business listings over a second time period;
identifying, by the one or more processors, an anomaly in the first frequency value when a difference between the first frequency value and the normal frequency value is greater than a predetermined threshold; and
identifying, by the one or more processors, the business listing characteristic as a suspicious characteristic in response to identifying the anomaly, wherein the suspicious characteristic includes at least one term selected from a plurality of terms used in the identified business listing;
determining, by the one or more processors, that a ratio of the at least one term to other terms from the plurality of terms is greater than a predetermined ratio; and
identifying, by the one or more processors, the business listing as a spam listing in response to the ratio being greater than the predetermined ratio.
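The last two steps of the claim, the term-ratio test, can be illustrated with a minimal sketch. The claim does not say how terms are tokenized or counted, so the whitespace tokenization, the function names `suspicious_term_ratio` and `is_spam_listing`, and the default `predetermined_ratio` of 1.0 are all assumptions made for illustration only.

```python
def suspicious_term_ratio(title, suspicious_terms):
    """Ratio of suspicious terms to other terms in a listing title.

    Assumption: terms are lower-cased, whitespace-separated tokens.
    """
    terms = title.lower().split()
    suspicious = sum(1 for term in terms if term in suspicious_terms)
    other = len(terms) - suspicious
    if other == 0:
        # Only suspicious terms (or no terms at all) in the title.
        return float("inf") if suspicious else 0.0
    return suspicious / other


def is_spam_listing(title, suspicious_terms, predetermined_ratio=1.0):
    """Flag the listing when the ratio exceeds the predetermined ratio.

    The default of 1.0 is a hypothetical value, not one taken from the claim.
    """
    return suspicious_term_ratio(title, suspicious_terms) > predetermined_ratio


# Hypothetical example data.
suspicious_terms = {"locksmith"}
stuffed = "cheap locksmith locksmith 24h locksmith"
ordinary = "Joe's Locksmith and Hardware Store"

stuffed_flag = is_spam_listing(stuffed, suspicious_terms)
ordinary_flag = is_spam_listing(ordinary, suspicious_terms)
```

In this example the first title has three suspicious terms against two other terms (ratio 1.5) and is flagged, while the second has one against four (ratio 0.25) and is not.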
Abstract
Aspects of the disclosure provide for detection of spam attacks. In order to filter spam listings among business listings associated with a particular geographical region, a method and system operate to analyze the frequency of particular characteristics of the business listings, such as words within a listing title, phone numbers, or websites, to identify a normal frequency of each characteristic. Business listings may be periodically analyzed to identify anomalous increases in the frequency of particular characteristics. Characteristics that exhibit these anomalies may be identified as suspicious characteristics, and listings that contain suspicious characteristics may be identified as possible spam listings.
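The flow described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the disclosed implementation: listings are modeled as plain dictionaries with hypothetical `title`, `phone`, and `website` keys, a characteristic's frequency is taken to be the fraction of listings that contain it, and the threshold value is arbitrary.

```python
from collections import Counter


def listing_characteristics(listing):
    """Characteristics of one listing: title words, phone number, website."""
    chars = {("title_word", w) for w in listing.get("title", "").lower().split()}
    if listing.get("phone"):
        chars.add(("phone", listing["phone"]))
    if listing.get("website"):
        chars.add(("website", listing["website"]))
    return chars


def characteristic_frequencies(listings):
    """Fraction of listings containing each characteristic (an assumed definition)."""
    counts = Counter()
    for listing in listings:
        counts.update(listing_characteristics(listing))
    total = len(listings)
    return {c: n / total for c, n in counts.items()} if total else {}


def suspicious_characteristics(current, normal, threshold):
    """Characteristics whose frequency rose above normal by more than threshold."""
    return {c for c, f in current.items() if f - normal.get(c, 0.0) > threshold}


def possible_spam(listings, suspicious):
    """Listings that contain at least one suspicious characteristic."""
    return [l for l in listings if listing_characteristics(l) & suspicious]


# Hypothetical data for one geographical region over two time periods.
baseline = [
    {"title": "Main Street Bakery", "phone": "555-0101"},
    {"title": "Ace Locksmith", "phone": "555-0102"},
    {"title": "City Dental", "phone": "555-0103"},
    {"title": "Corner Cafe", "phone": "555-0104"},
]
current = [
    {"title": "Fast Locksmith", "phone": "555-0199"},
    {"title": "Cheap Locksmith", "phone": "555-0199"},
    {"title": "Best Locksmith", "phone": "555-0199"},
    {"title": "Corner Cafe", "phone": "555-0104"},
]

normal = characteristic_frequencies(baseline)
suspicious = suspicious_characteristics(
    characteristic_frequencies(current), normal, threshold=0.3
)
flagged = [l["title"] for l in possible_spam(current, suspicious)]
```

With this data, the word "locksmith" rises from a frequency of 0.25 to 0.75 and the phone number 555-0199 rises from 0 to 0.75, so both are marked suspicious and the three listings containing them are flagged as possible spam.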
11 Claims
6. A processing system for identifying business listing characteristics, the processing system comprising:
one or more processors; and
one or more memories coupled to the one or more processors for storing instructions and a first plurality of business listings, wherein the first plurality of business listings are associated with a particular geographical region;
wherein the one or more processors are configured to execute the instructions stored in the one or more memories in order to:
determine a frequency value for a characteristic associated with the first plurality of business listings over a particular time period;
identify an anomalous frequency value where a difference between the frequency value and a normal frequency value previously determined for the characteristic from a second plurality of business listings is greater than a predetermined threshold; and
in response to identifying the anomalous frequency value, identify the characteristic as a suspicious characteristic, wherein the suspicious characteristic includes at least one term selected from a plurality of terms used in the identified business listing;
determine that a ratio of the at least one term to other terms from the plurality of terms is greater than a predetermined ratio; and
identify the business listing as a spam listing in response to the ratio being greater than the predetermined ratio.
10. A non-transitory computer readable storage medium containing instructions that, when executed by one or more processors, cause the one or more processors to perform a method, the method comprising:
determining a first frequency value of a business listing characteristic within a first plurality of business listings over a first time period, wherein the first plurality of business listings are associated with a particular geographical region;
comparing the first frequency value with a normal frequency value of the business listing characteristic previously determined from a second plurality of business listings over a second time period;
identifying an anomaly in the first frequency value when a difference between the first frequency value and the normal frequency value is greater than a predetermined threshold; and
identifying the business listing characteristic as a suspicious characteristic in response to identifying the anomaly, wherein the suspicious characteristic includes at least one term selected from a plurality of terms used in the identified business listing;
determining that a ratio of the at least one term to other terms from the plurality of terms is greater than a predetermined ratio; and
identifying the business listing as a spam listing in response to the ratio being greater than the predetermined ratio.
Specification