System and method for detecting personal experience event reports from user generated internet content

  • US 8,612,455 B2
  • Filed: 10/05/2011
  • Issued: 12/17/2013
  • Est. Priority Date: 10/06/2010
  • Status: Active Grant
First Claim

1. A method, implementable on a computing device, for detecting personal experience event reports from user generated content on the Internet, the method comprising:

  • filtering a collection of Internet posts to include only said Internet posts containing personal experience terms;

    further filtering said filtered Internet posts by removing said Internet posts with non-personal experience terms;

    analyzing said Internet posts to define segments in said Internet posts, wherein said segments at least contain terms consistent with user generation of a personal experience event report associated with a pre-defined search subject;

    scoring each of said segments, wherein said score indicates a likelihood that said Internet post associated with said segment represents a user generated said personal experience report associated with said pre-defined search subject; and

    storing at least indications of said Internet posts with associated said scores above a pre-defined threshold in a searchable personal experience database;

    and wherein said analyzing also comprises:

    filtering said Internet posts to remove said Internet posts that do not at least contain said terms from each of a minimum number of term categories associated with said pre-defined subject;

    detecting a pair of anchors from two anchor categories, wherein said anchor categories are also term categories and represent two essential components of said user generated personal experience reports;

    defining a basic said segment as a shortest section of text between said pair of anchors;

    when said shortest section of text does not include at least one said term from each of said minimum number of term categories, expanding said basic segment to extend beyond said shortest section of text to include at least one said term from each of said minimum number of term categories;

    calculating a density value for said terms in said basic segment;

    expanding said basic segment to include a nearest said term not included in said basic segment;

    recalculating said density value for said expanded basic segment;

    iteratively repeating said expanding and recalculating until said recalculated density value is less than a previously calculated said density value; and

    defining said expanded basic segment as a final segment.
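
The segment-definition steps of claim 1 (anchor pair, category-driven expansion, density-driven expansion) can be illustrated with a minimal Python sketch. It is an interpretation, not the patented implementation, and it rests on several assumptions the claim does not state: a post is already tokenized and its recognized terms are given as a hypothetical `terms` mapping from token index to term category; density is taken to be matched terms per token in the segment; "nearest" is measured in tokens from the segment boundary; and the segment kept as final is the last one reached before the density dropped. All function names are illustrative.

```python
from typing import Dict, Optional, Set, Tuple

Segment = Tuple[int, int]  # inclusive token indices (start, end)


def density(seg: Segment, terms: Dict[int, str]) -> float:
    """Assumed density measure: matched terms per token in the segment."""
    start, end = seg
    hits = sum(1 for pos in terms if start <= pos <= end)
    return hits / (end - start + 1)


def categories_in(seg: Segment, terms: Dict[int, str]) -> Set[str]:
    """Distinct term categories that occur inside the segment."""
    start, end = seg
    return {cat for pos, cat in terms.items() if start <= pos <= end}


def expand_to_nearest(seg: Segment, terms: Dict[int, str]) -> Optional[Segment]:
    """Grow the segment to cover the closest term outside it, if one exists."""
    start, end = seg
    outside = [p for p in terms if p < start or p > end]
    if not outside:
        return None
    nearest = min(outside, key=lambda p: start - p if p < start else p - end)
    return (min(start, nearest), max(end, nearest))


def find_segment(terms: Dict[int, str], anchor_a: str, anchor_b: str,
                 min_categories: int) -> Optional[Segment]:
    """Return a final segment for one post, or None if none can be built."""
    a_pos = [p for p, c in terms.items() if c == anchor_a]
    b_pos = [p for p, c in terms.items() if c == anchor_b]
    if not a_pos or not b_pos:
        return None  # both anchor categories are required

    # Basic segment: shortest span of text between a pair of anchors.
    a, b = min(((x, y) for x in a_pos for y in b_pos),
               key=lambda pair: abs(pair[0] - pair[1]))
    seg = (min(a, b), max(a, b))

    # Expand until the minimum number of term categories is covered.
    while len(categories_in(seg, terms)) < min_categories:
        grown = expand_to_nearest(seg, terms)
        if grown is None:
            return None
        seg = grown

    # Keep adding the nearest term until the density value drops.
    best = density(seg, terms)
    while True:
        grown = expand_to_nearest(seg, terms)
        if grown is None:
            break
        d = density(grown, terms)
        if d < best:
            break  # assumption: keep the segment from before the drop
        seg, best = grown, d
    return seg
```

For example, with terms at token positions 2 ("drug"), 5 ("symptom"), 6 ("experience") and 20 ("drug"), anchor categories "drug" and "symptom", and a minimum of three categories, the basic segment spans tokens 2 to 5, is widened to token 6 to pick up the third category, and is not widened to token 20 because that would lower the density.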
