Segmenting text for searching

  • US 8,423,350 B1
  • Filed: 05/21/2009
  • Issued: 04/16/2013
  • Est. Priority Date: 05/21/2009
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method comprising:

    receiving, at a computer system comprising a processor, text;

    segmenting, at the computer system, the text into one or more unigrams;

    filtering, at the computer system, the one or more unigrams to identify one or more core unigrams; and

    generating, at the computer system, a searchable data structure, wherein the generating includes, for each specific unigram of the one or more core unigrams:

        identifying a stem of the specific unigram,

        indexing the stem,

        obtaining (i) grammar information for the specific unigram, (ii) language information for the specific unigram, and (iii) description information for the specific unigram, and

        associating one or more n-grams with the indexed stem,

        wherein each of the one or more n-grams is derived from the text, the grammar information for the specific unigram, the language information for the specific unigram, and the description information for the specific unigram, and

        wherein each of the one or more n-grams includes a core unigram that is related to the indexed stem.
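The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The stop-word filter, the toy suffix-stripping stemmer, the `lookup_info` placeholder, and the sliding-window n-gram rule are all assumptions made for illustration, since the claim does not specify how any of these are carried out. In particular, the sketch derives n-grams from the text alone and only stores the grammar, language, and description information alongside them.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; the claim does not say how core unigrams are chosen.
STOP_WORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "to", "is"}


def segment(text):
    """Split text into lowercase word unigrams (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_stem(word):
    """Toy suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lookup_info(unigram):
    """Placeholder for the grammar, language, and description lookups."""
    return {"grammar": "unknown", "language": "en", "description": ""}


def build_index(text, n=2):
    """Map each stem of a core unigram to its metadata and related n-grams."""
    unigrams = segment(text)
    index = defaultdict(lambda: {"info": {}, "ngrams": set()})
    for i, unigram in enumerate(unigrams):
        if unigram in STOP_WORDS:  # filtering step: keep only core unigrams
            continue
        entry = index[naive_stem(unigram)]  # identify and index the stem
        entry["info"][unigram] = lookup_info(unigram)
        # Associate every n-word window of the text that contains this unigram.
        first = max(0, i - n + 1)
        last = min(i, len(unigrams) - n)
        for start in range(first, last + 1):
            entry["ngrams"].add(" ".join(unigrams[start:start + n]))
    return dict(index)


index = build_index("Segmenting text for searching")
```

With the patent's own title as input, the resulting structure is keyed by the stems `segment`, `text`, and `search`, and each key holds the bigrams from the text that contain the corresponding core unigram. A query could then be stemmed the same way and looked up directly by key.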
