Systems and methods regarding keyword extraction

  • US 8,874,568 B2
  • Filed: 11/02/2011
  • Issued: 10/28/2014
  • Est. Priority Date: 11/05/2010
  • Status: Active Grant
First Claim

1. A computer system comprising one or more processors that function as:

  • (a) a preprocessing unit that extracts text from a webpage to produce at least a first set of candidate keywords, applies language processing to produce at least a second set of candidate keywords, and combines said first and second sets of candidate keywords into a first candidate pool;

  • (b) a candidate extraction unit that receives data from said preprocessing unit describing at least said first candidate pool and produces a second candidate pool;

  • (c) a feature extraction unit that receives data describing at least said second candidate pool and analyzes said second candidate pool for general features and linguistic features, wherein said general features include number of times a term appears in the text extracted from the webpage; and

  • (d) a classification unit that receives said data describing at least said second candidate pool and related data from said feature extraction unit, and determines a likelihood of each candidate in said second candidate pool being a primary or secondary keyword.
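
The four units recited in elements (a) through (d) can be read as a staged pipeline. The following is a minimal Python sketch of that shape only, not the patented implementation. Every name in it is hypothetical, and several pieces are assumptions made for illustration: the capitalized-run heuristic standing in for "language processing", the stopword filter standing in for candidate extraction, the single "is_phrase" linguistic feature, and the hand-set logistic weights standing in for a trained classifier. The one feature taken directly from the claim is the number of times a term appears in the extracted text.

```python
import math
import re
from collections.abc import Mapping
from html.parser import HTMLParser

# Assumed values for illustration; the claim does not specify any of these.
STOPWORDS = frozenset({"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"})
WEIGHTS = {"frequency": 0.8, "is_phrase": 1.5}
BIAS = -2.0


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def preprocess(html):
    """(a) Extract text, build two candidate sets, merge them into a first pool."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    # First set: every alphabetic token in the extracted text.
    first_set = {t.lower() for t in re.findall(r"[A-Za-z]+", text)}
    # Second set: runs of capitalized words, a crude stand-in for language processing.
    second_set = {m.lower() for m in re.findall(r"(?:[A-Z][a-z]+ )+[A-Z][a-z]+", text)}
    return text, first_set | second_set


def extract_candidates(first_pool):
    """(b) Produce a second pool; here, by dropping stopwords and very short terms."""
    return {c for c in first_pool if c not in STOPWORDS and len(c) > 2}


def extract_features(text, pool):
    """(c) Compute one general feature (term count) and one linguistic feature."""
    lowered = text.lower()
    features = {}
    for cand in pool:
        pattern = r"\b" + re.escape(cand) + r"\b"
        features[cand] = {
            "frequency": len(re.findall(pattern, lowered)),
            "is_phrase": int(" " in cand),
        }
    return features


def classify(features: Mapping) -> dict:
    """(d) Map each candidate's features to a keyword likelihood in (0, 1)."""
    scores = {}
    for cand, feats in features.items():
        z = BIAS + sum(WEIGHTS[name] * value for name, value in feats.items())
        scores[cand] = 1.0 / (1.0 + math.exp(-z))
    return scores


def keyword_likelihoods(html):
    """Run stages (a) through (d) in order on one page of HTML."""
    text, first_pool = preprocess(html)
    second_pool = extract_candidates(first_pool)
    return classify(extract_features(text, second_pool))
```

Calling `keyword_likelihoods(page_html)` returns a dictionary from candidate term to score. The claim speaks of a likelihood of being a primary or secondary keyword; this sketch collapses that into a single score, so a real system would need either two outputs or thresholds chosen from labeled data.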
