
Behavior-driven multilingual stemming

  • US 8,793,120 B1
  • Filed: 10/28/2010
  • Issued: 07/29/2014
  • Est. Priority Date: 10/28/2010
  • Status: Active Grant
First Claim

1. A computer-implemented method of stemming terms using behavioral data, comprising:

  • under control of one or more computer systems configured with executable instructions,

    capturing behavioral data for a plurality of users with respect to a plurality of terms;

    obtaining a rule set for stemming in a language, the language including the plurality of terms;

    obtaining a word to be stemmed;

    in response to determining that only one rule of the rule set is to be used to stem the obtained word, stemming the obtained word using only one rule;

    or in response to determining that more than one rule of the rule set is to be used to stem the obtained word;

    determining a set of forms of the obtained word;

    determining an output set of forms corresponding to the set of forms, wherein each rule of the more than one rule corresponds to one of the forms in the output set of forms,

    determining, based at least in part upon the captured behavioral data, a relative measurement value of each form in the output set of forms, wherein each of the relative measurement values corresponds to an indication of a frequency of use of a corresponding one of the forms in the output set of forms, and

    selecting, based at least in part upon the relative measurement values, at least one form in the output set of forms to be used as a stem for the obtained word.
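
The control flow recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes, purely for illustration, that a rule is a (suffix, replacement) pair, that a rule "is to be used" when the word ends with its suffix, and that the captured behavioral data has been reduced to a per-term usage count. The names `RULES`, `COUNTS`, `applicable_rules`, `apply_rule`, and `stem` are hypothetical, and the claim's separate "set of forms" and "output set of forms" are collapsed into one candidate list.

```python
from collections import Counter
from typing import List, Mapping, Tuple

# Hypothetical rule format: (suffix to strip, replacement text).
Rule = Tuple[str, str]

# Illustrative rule set and behavioral counts; both are assumptions,
# not data taken from the patent.
RULES: List[Rule] = [("ies", "y"), ("es", ""), ("s", "")]
COUNTS: Counter = Counter({"pony": 120, "ponie": 2, "cat": 300})


def applicable_rules(word: str, rules: List[Rule]) -> List[Rule]:
    """Return the rules whose suffix matches the end of the word."""
    return [rule for rule in rules if word.endswith(rule[0])]


def apply_rule(word: str, rule: Rule) -> str:
    """Strip the rule's suffix and append its replacement."""
    suffix, replacement = rule
    return word[: len(word) - len(suffix)] + replacement


def stem(word: str, rules: List[Rule], usage: Mapping[str, int]) -> str:
    matches = applicable_rules(word, rules)
    if not matches:
        # Case not covered by the claim; assumed here to leave the word as-is.
        return word
    if len(matches) == 1:
        # Only one rule applies: stem using that rule alone.
        return apply_rule(word, matches[0])
    # More than one rule applies: build one candidate form per rule.
    candidates = [apply_rule(word, rule) for rule in matches]
    total = sum(usage.get(form, 0) for form in candidates)
    if total == 0:
        # No behavioral evidence; assumed fallback to the first rule's form.
        return candidates[0]
    # Relative measurement: each form's share of the observed usage.
    relative = {form: usage.get(form, 0) / total for form in candidates}
    # Select the form with the highest relative frequency of use.
    return max(candidates, key=lambda form: relative[form])
```

With these assumed inputs, `stem("cats", RULES, COUNTS)` matches a single rule and yields "cat", while `stem("ponies", RULES, COUNTS)` matches all three rules, produces the candidates "pony", "poni", and "ponie", and selects "pony" because it accounts for the largest share of the usage counts.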
