Method and system for fast indexing and searching of text in compound-word languages

  • US 5,778,361 A
  • Filed: 09/29/1995
  • Issued: 07/07/1998
  • Est. Priority Date: 09/29/1995
  • Status: Expired due to Term
First Claim

1. A method in a computer system for generating a search result that identifies objects that satisfy a search criteria, the computer system having a collection of objects and a plurality of terms, each object containing one or more of the terms, the objects being represented in different types of symbols in a compound word language such as Japanese or Chinese, the method comprising the computer-implemented steps of:

  • creating a content-index that contains, for each of the plurality of terms, a reference to each object that contains the term, by;

    creating a preliminary index term of a first or second type of symbol for each plurality of terms delimited by a word separator or a character type transition;

    for each preliminary index term of the first type, utilizing the preliminary index term as an index term;

    for each preliminary index term of the second type, step indexing the symbols in the preliminary index term to create a plurality of index terms of a length equal to or less than a predetermined step size, the plurality of index terms comprising a collection of substrings of symbols selected from the preliminary index term that begins with one of the symbols in the preliminary index term and extends to a length of either the end of the preliminary index term or to the number of symbols of the predetermined step size, whichever is smaller;

    creating the content-index by associating the object with each of its index terms; and

    after creating the content-index, using the content-index to generate the search result.
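
The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The names (`symbol_type`, `preliminary_terms`, `step_index`, `build_content_index`, `search`) and the step size of 2 are hypothetical. The sketch also assumes that CJK ideographs are the "second type" of symbol (step indexed) and that every other non-whitespace symbol is the "first type" (indexed whole); the claim itself does not say which script belongs to which type.

```python
from collections import defaultdict
from itertools import groupby

STEP_SIZE = 2  # hypothetical "predetermined step size"


def symbol_type(ch):
    """Classify a symbol. Assumption: CJK ideographs are the second type."""
    if ch.isspace():
        return "separator"
    if "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs block
        return "second"
    return "first"


def preliminary_terms(text):
    """Split text at word separators and character-type transitions."""
    for kind, group in groupby(text, key=symbol_type):
        if kind != "separator":
            yield kind, "".join(group)


def step_index(term, step=STEP_SIZE):
    """One substring per start symbol, running to `step` symbols
    or to the end of the term, whichever is smaller."""
    return [term[i:i + step] for i in range(len(term))]


def index_terms(text, step=STEP_SIZE):
    """First-type terms are used as-is; second-type terms are step indexed."""
    terms = []
    for kind, term in preliminary_terms(text):
        if kind == "first":
            terms.append(term)
        else:
            terms.extend(step_index(term, step))
    return terms


def build_content_index(objects, step=STEP_SIZE):
    """Map each index term to the set of object ids that contain it."""
    index = defaultdict(set)
    for object_id, text in objects.items():
        for term in index_terms(text, step):
            index[term].add(object_id)
    return index


def search(index, term):
    """Return the ids of the objects associated with an index term."""
    return sorted(index.get(term, ()))
```

Under these assumptions, `step_index("東京都庁")` yields `["東京", "京都", "都庁", "庁"]`, so a query for any two-symbol run inside a longer ideograph sequence can be answered by a direct lookup in the content-index.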
