
Header-token driven automatic text segmentation

  • US 8,631,005 B2
  • Filed: 12/28/2006
  • Issued: 01/14/2014
  • Est. Priority Date: 12/28/2006
  • Status: Active Grant
First Claim

1. A method of automatic text segmentation, the method comprising:

    estimating, for each token in a set of tokens in a description, through use of a machine having one or more processors, a probability that the token is irrelevant;

    associating, with each token in the set of tokens in the description, one of a first value, a second value, or a third value, based on whether, respectively, the token occurs in a header of the description, a lexical association exists between the token and a token in the header, or the lexical association is absent and the token does not occur in the header;

    iterating, through use of the machine, over a plurality of groups of sequential tokens in the description, in each iteration, selecting a group, computing a relevance value of the selected group based, at least in part, on at least one estimated probability of one or more tokens outside the selected group and on values associated with one or more tokens in the selected group; and

    indicating, through use of the machine, one of the plurality of groups as having a greatest relevance value.
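
The steps of claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation: the claim does not say how the three values are chosen, how a lexical association is detected, or how the relevance value combines its inputs. The function name `best_segment`, the default values `(1.0, 0.5, 0.1)`, the precomputed `associated` set, and the log-sum scoring rule are all assumptions made for illustration.

```python
import math

def best_segment(tokens, header, p_irrelevant, associated,
                 values=(1.0, 0.5, 0.1)):
    """Return ((start, end), score) for the highest-scoring run of tokens.

    tokens       -- tokens of the description, in order
    header       -- tokens of the header
    p_irrelevant -- token -> estimated probability of irrelevance, in (0, 1]
    associated   -- tokens assumed to be lexically associated with the header
    values       -- hypothetical first, second and third values, each > 0
    """
    header_set = set(header)
    first, second, third = values

    def token_value(tok):
        if tok in header_set:      # token occurs in the header
            return first
        if tok in associated:      # lexical association with a header token
            return second
        return third               # neither condition holds

    vals = [token_value(t) for t in tokens]
    n = len(tokens)
    best_span, best_score = None, -math.inf

    # Iterate over every group of sequential tokens [start, end).
    for start in range(n):
        for end in range(start + 1, n + 1):
            # Assumed scoring rule: log-values of the tokens inside the group
            # plus log irrelevance probabilities of the tokens outside it.
            inside = sum(math.log(vals[i]) for i in range(start, end))
            outside = sum(math.log(p_irrelevant[tokens[i]])
                          for i in range(n) if not start <= i < end)
            score = inside + outside
            if score > best_score:
                best_span, best_score = (start, end), score

    return best_span, best_score


# Made-up example data, for illustration only.
tokens = ["free", "shipping", "apple", "ipod", "nano", "buy", "now"]
header = ["ipod", "nano"]
p_irrelevant = {"free": 0.9, "shipping": 0.9, "apple": 0.2,
                "ipod": 0.05, "nano": 0.05, "buy": 0.9, "now": 0.9}
span, score = best_segment(tokens, header, p_irrelevant, associated={"apple"})
print(tokens[span[0]:span[1]])
```

Under this assumed scoring rule, a token is worth keeping in the group when its associated value exceeds its probability of irrelevance. The exhaustive search over all groups is cubic in the number of tokens as written, which is acceptable for short descriptions but would need prefix sums or a dynamic program for long ones.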
