IDENTIFYING CONTENT OF INTEREST

  • US 20070294784A1
  • Filed: 06/08/2007
  • Published: 12/20/2007
  • Est. Priority Date: 06/08/2006
  • Status: Active Grant
First Claim

1. A method of generating a marker set comprising markers that identify a desired type of text, the method comprising:

  • selecting a seed marker set comprising at least one seed marker;

  • generating a seed corpus from a first reference corpus, wherein the seed corpus comprises a plurality of textual units, and wherein each of the plurality of textual units included in the seed corpus comprises at least one instance of a seed marker included in the seed marker set;

  • generating a statistical value describing the seed marker set and the seed corpus; and

  • generating a revised seed marker set.
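
The four claimed steps can be illustrated with a minimal Python sketch. The claim does not specify the statistical value or the rule for revising the marker set, so both are assumptions here: the statistic is taken to be the share of seed-corpus units containing each marker, and the revision adds words that occur mostly inside the seed corpus. The function names, the thresholds `min_ratio` and `min_units`, and the sample sentences are all hypothetical and do not come from the patent.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a textual unit and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_seed_corpus(reference_corpus, seed_markers):
    """Keep only the textual units that contain at least one seed marker."""
    return [u for u in reference_corpus if seed_markers & set(tokenize(u))]


def marker_coverage(seed_corpus, seed_markers):
    """Assumed statistic: share of seed-corpus units containing each marker."""
    total = len(seed_corpus)
    counts = Counter()
    for unit in seed_corpus:
        counts.update(seed_markers & set(tokenize(unit)))
    return {m: (counts[m] / total if total else 0.0) for m in seed_markers}


def revise_markers(reference_corpus, seed_corpus, seed_markers,
                   min_ratio=0.75, min_units=2):
    """Assumed revision rule: add words concentrated in the seed corpus.

    A word is added when it appears in at least `min_units` seed-corpus
    units and at least `min_ratio` of all units containing it are in the
    seed corpus.
    """
    ref_df = Counter()
    for unit in reference_corpus:
        ref_df.update(set(tokenize(unit)))
    seed_df = Counter()
    for unit in seed_corpus:
        seed_df.update(set(tokenize(unit)))
    additions = {w for w, n in seed_df.items()
                 if n >= min_units and n / ref_df[w] >= min_ratio}
    return set(seed_markers) | additions


# Hypothetical data: the desired type of text is opinion statements.
reference_corpus = [
    "I think the camera is great",
    "I think the battery is awful",
    "The camera has a 10 megapixel sensor",
    "The battery lasts 8 hours",
    "I believe the screen is great",
]
seed_markers = {"think", "believe"}

seed_corpus = build_seed_corpus(reference_corpus, seed_markers)
coverage = marker_coverage(seed_corpus, seed_markers)
revised = revise_markers(reference_corpus, seed_corpus, seed_markers)
print(sorted(revised))  # ['believe', 'great', 'i', 'is', 'think']
```

On this toy data the seed corpus holds the three sentences containing "think" or "believe", and the revised set picks up "i", "is", and "great" because they appear only in those sentences. In practice a stop-word filter or a stricter statistic would likely be needed; the sketch only shows how the steps connect.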
