
SYSTEMS AND METHODS OF DE-DUPLICATING SIMILAR NEWS FEED ITEMS

  • US 20160103916A1
  • Filed: 10/10/2014
  • Published: 04/14/2016
  • Est. Priority Date: 10/10/2014
  • Status: Active Grant
First Claim

1. A method of de-duplicating similar news feed items, the method including:

    assembling a set of news feed items from a plurality of electronic sources;

    preprocessing the set to qualify some of the news feed items to return based on common company-name mentions and common token occurrences;

    pairwise determining a resemblance measure for the qualified news feed items based on sequence alignment between news feed item pairs;

    constructing a graph of news feed item pairs with the resemblance measure above a threshold and representing the resemblance measure as edges between nodes representing the news feed item pairs, thereby forming connected node pairs; and

    determining similar news feed items by clustering the connected node pairs into strongly connected components.
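
The steps recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation; it is one possible reading under stated assumptions. The names `qualifies`, `resemblance`, and `cluster_similar`, the `companies` set, the `min_shared_tokens` cutoff, and the `0.6` threshold are all hypothetical. Python's `difflib.SequenceMatcher` stands in for the sequence alignment step, and because the resemblance edges here are undirected, the strongly connected components reduce to ordinary connected components.

```python
from difflib import SequenceMatcher
from itertools import combinations


def tokens(text):
    """Lower-cased whitespace tokens of a news feed item (assumed tokenization)."""
    return set(text.lower().split())


def qualifies(a, b, companies, min_shared_tokens=3):
    """Hypothetical prefilter: a shared company mention plus enough shared tokens."""
    ta, tb = tokens(a), tokens(b)
    shared_company = any(c in ta and c in tb for c in companies)
    return shared_company and len(ta & tb) >= min_shared_tokens


def resemblance(a, b):
    """Token-level similarity in [0, 1]; a stand-in for sequence alignment."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def cluster_similar(items, companies, threshold=0.6):
    """Return sorted index lists of items linked by above-threshold resemblance."""
    # Build an undirected graph: nodes are item indices, edges are similar pairs.
    adj = {i: set() for i in range(len(items))}
    for i, j in combinations(range(len(items)), 2):
        if not qualifies(items[i], items[j], companies):
            continue
        if resemblance(items[i], items[j]) > threshold:
            adj[i].add(j)
            adj[j].add(i)

    # Depth-first search collects each connected component with at least one edge.
    seen, clusters = set(), []
    for start in adj:
        if start in seen or not adj[start]:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


items = [
    "Acme acquires Widget Corp for two billion dollars",
    "Acme acquires Widget Corp for two billion dollars in cash",
    "Globex opens new office in Berlin next month",
    "Acme reports record quarterly earnings this week",
]
companies = {"acme", "globex"}
print(cluster_similar(items, companies))
```

In this toy input only the first two items share both a company mention and enough tokens to qualify, so they form the single cluster of near-duplicates. The prefilter keeps the expensive pairwise comparison away from pairs that cannot plausibly match.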
