Language independent stemming

  • US 8,015,175 B2
  • Filed: 03/16/2007
  • Issued: 09/06/2011
  • Est. Priority Date: 03/16/2007
  • Status: Active Grant
First Claim

1. A method for stemming a word for use in a text search system running in a computing system, the method comprising the steps of:

    (a) calling a stemming algorithm to process a word;

    (b) parsing the word through a main routine of said stemming algorithm;

    wherein said main routine determines all possible prefixes and suffixes for the word;

    (c) parsing a remaining portion of the word through a recursive subroutine called from within said main routine, wherein said recursive subroutine determines all possible roots and infixes of the remaining portion of the word;

    (d) assigning through a cost calculator function of said stemming algorithm a cost for each of said prefixes, suffixes, roots, and infixes found;

    (e) sequencing by said stemming algorithm said prefixes, suffixes, roots, and infixes found into one or more unique paths that traverse the word;

    (f) adding up by said stemming algorithm a total cost for each of said one or more unique paths to determine a least cost path;

    (g) outputting by said stemming algorithm one or more roots found in said least cost path as a stem for the word;

    (h) performing a search with the text search system using said one or more roots for the word instead of the word itself in both a querying and an indexing phases of the search.
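
Steps (a) through (h) can be illustrated with a minimal Python sketch. It is not the patented implementation: the fragment tables (`PREFIXES`, `SUFFIXES`, `ROOTS`, `INFIXES`), their cost values, the function names, and the fallback of returning the unparsed word are all assumptions made for illustration, and the hyphen stands in for an infix only as a toy joining fragment.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy lexicon (fragment -> cost). The costs are arbitrary
# illustration values; lower totals are preferred.
PREFIXES: Dict[str, int] = {"": 0, "un": 1, "re": 1}
SUFFIXES: Dict[str, int] = {"": 0, "s": 1, "ing": 1, "ed": 1, "ness": 1}
ROOTS: Dict[str, int] = {"happi": 2, "walk": 2, "book": 2, "case": 2, "bookcase": 5}
INFIXES: Dict[str, int] = {"-": 1}

Segment = Tuple[str, str, int]  # (kind, text, cost)
Path = List[Segment]


def parse_remainder(rest: str) -> List[Path]:
    """Recursive subroutine (step c): every way to cover `rest`
    with a sequence of known roots and infixes."""
    if not rest:
        return [[]]
    paths: List[Path] = []
    for kind, table in (("root", ROOTS), ("infix", INFIXES)):
        for frag, cost in table.items():  # cost lookup stands in for step d
            if rest.startswith(frag):
                for tail in parse_remainder(rest[len(frag):]):
                    paths.append([(kind, frag, cost)] + tail)
    return paths


def stem(word: str) -> List[str]:
    """Main routine (steps a, b, e, f, g): try every prefix/suffix pair,
    parse the remainder recursively, and keep the least-cost path."""
    best_cost: Optional[int] = None
    best_path: Optional[Path] = None
    for prefix, p_cost in PREFIXES.items():
        if not word.startswith(prefix):
            continue
        for suffix, s_cost in SUFFIXES.items():
            end = len(word) - len(suffix)
            if end < len(prefix) or not word.endswith(suffix):
                continue
            for middle in parse_remainder(word[len(prefix):end]):
                # A usable path must contain at least one root.
                if not any(kind == "root" for kind, _, _ in middle):
                    continue
                path = [("prefix", prefix, p_cost)] + middle + [("suffix", suffix, s_cost)]
                total = sum(cost for _, _, cost in path)
                if best_cost is None or total < best_cost:
                    best_cost, best_path = total, path
    if best_path is None:
        return [word]  # assumption: fall back to the word itself if nothing parses
    return [text for kind, text, _ in best_path if kind == "root"]


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Indexing phase (step h): index documents by root, not by surface word."""
    index: Dict[str, Set[int]] = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            for root in stem(token):
                index.setdefault(root, set()).add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Querying phase (step h): stem the query the same way before lookup."""
    hits: Set[int] = set()
    for token in query.lower().split():
        for root in stem(token):
            hits |= index.get(root, set())
    return hits
```

With this toy lexicon, `stem("unhappiness")` yields the single root `happi`, while `stem("bookcases")` yields the two roots `book` and `case`, because that path has a lower total cost than the one through `bookcase`. Since both `build_index` and `search` go through `stem`, a query for "walked" matches a document containing "walking".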
