
Method and system for discovering suspicious account groups

  • US 9,684,649 B2
  • Filed: 12/28/2012
  • Issued: 06/20/2017
  • Est. Priority Date: 08/21/2012
  • Status: Active Grant
First Claim

1. A method for discovering suspicious account groups, comprising:

  • under a control of at least one hardware processor, receiving a monitoring website table and at least one monitored vocabulary set containing a plurality of elements;

    downloading a first group of accounts and one or more post contents corresponding to each account of the first group of accounts from the monitoring website during a first time interval;

    establishing a language model, for each account of the first group of accounts, according to the one or more post contents from each account of the first group of accounts during the first time interval, to describe a linguistic fashion for each account, the language model being expressed at least partly as a probability of an occurrence of at least one element of the at least one monitored vocabulary set in an account;

    comparing a similarity among a first group of language models of the first group of accounts to cluster the first group of accounts;

    downloading newly added data including a second group of accounts and one or more post contents corresponding to each account of the second group of accounts from the monitoring website during a second time interval;

    obtaining one or more homonyms synonyms in the newly added data of at least one element of the at least one monitored vocabulary set corresponding to the first group of accounts, comprising the sub-steps of fetching one or more features through a previous feature window and a next feature window of each monitored vocabulary in the at least one monitored vocabulary set; and

    converting a weight of an original word of the at least one monitored vocabulary set into a corresponding weight of a homonym synonym;

    updating the first group of language models with the one or more homonyms synonyms;

    integrating the first and the second groups of accounts to create an integrated group of accounts;

    rebuilding a language model for each of the integrated group of accounts to create a second group of language models based on the step of updating the first group of language models with the one or more homonyms synonyms;

    clustering the integrated group of accounts according to the determined similarity among the integrated group of accounts based on the second group of language models;

    determining at least one suspicious account group after the step of clustering according to a level of homogeneity among at least account groups of the integrated group of accounts; and

    determining interaction connection among accounts of the integrated group of accounts based on a result of the step of identifying at least one suspicious account group.
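
The language-model and clustering steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes whitespace tokenization, a per-account model given by the relative frequency of each monitored vocabulary element, cosine similarity as the comparison measure, and a greedy threshold grouping in place of whatever clustering the patent uses. The function names `language_model`, `cosine`, and `cluster` and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from math import sqrt


def language_model(posts, vocabulary):
    """Estimate, for one account, the probability of each monitored term.

    Assumption: the probability is the term's count divided by the total
    number of whitespace-separated tokens in the account's posts.
    """
    tokens = [tok for post in posts for tok in post.lower().split()]
    counts = Counter(tok for tok in tokens if tok in vocabulary)
    total = len(tokens)
    return {term: (counts[term] / total if total else 0.0)
            for term in sorted(vocabulary)}


def cosine(a, b):
    """Cosine similarity between two models built over the same vocabulary."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(models, threshold=0.8):
    """Greedy grouping (an assumed stand-in for the claimed clustering).

    Each account joins the first existing group that contains a member
    whose model is at least `threshold` similar; otherwise it starts a
    new group.
    """
    groups = []
    for account, model in models.items():
        for group in groups:
            if any(cosine(model, models[other]) >= threshold for other in group):
                group.append(account)
                break
        else:
            groups.append([account])
    return groups
```

Under these assumptions, building one model per account with `language_model(posts, vocabulary)` and passing the resulting dictionary to `cluster` yields lists of account names whose use of the monitored vocabulary is similar. The homonym/synonym update and the homogeneity test recited in the claim are not covered by this sketch.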
