
Selection of a set of optimal n-grams for indexing string data in a DBMS system under space constraints introduced by the system

  • US 7,478,081 B2
  • Filed: 11/05/2004
  • Issued: 01/13/2009
  • Est. Priority Date: 11/05/2004
  • Status: Expired due to Fees
First Claim

1. A method for selecting a set of n-grams for indexing string data in a database management system (DBMS) in relation to resources available to the DBMS, comprising:

  • providing a set of candidate n-grams, each n-gram comprising a sequence of characters;

    receiving an n-gram space constraint to define an amount “k” of the set of candidate n-grams eligible for a minimal set of n-grams, the n-gram space constraint based on resources available to the DBMS;

    comparing each of the candidate n-grams from the provided set of candidate n-grams with sample queries and database records to determine a benefit associated with the candidate n-grams in reducing false hits;

    selecting the minimal set of n-grams, the minimal set of n-grams having a highest total benefit and being subject to the n-gram space constraint;

    selecting an updated minimal set of n-grams responsive to receiving an updated n-gram space constraint, the updated minimal set of n-grams having a highest total benefit and being subject to the updated n-gram space constraint, wherein the updated minimal set of n-grams consists of no more than “k” n-grams; and

    generating an index, based on the minimal set of selected n-grams or the updated minimal set of n-grams, that indexes string data contained in the database records.
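
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented procedure. The benefit measure (records ruled out by an n-gram, weighted by how many sample queries contain it), the top-k selection by individual benefit, and the names `benefit`, `select_ngrams`, and `build_index` are all assumptions made for illustration. Summing individual benefits also ignores overlap between n-grams, which a real selection would have to account for.

```python
from typing import Dict, List, Set


def benefit(gram: str, queries: List[str], records: List[str]) -> int:
    """Assumed benefit: records the gram rules out, times sample queries that use it."""
    pruned = sum(1 for r in records if gram not in r)  # potential false hits avoided
    uses = sum(1 for q in queries if gram in q)        # queries that can use the gram
    return pruned * uses


def select_ngrams(candidates: List[str], queries: List[str],
                  records: List[str], k: int) -> List[str]:
    """Pick at most k candidates with the highest individual benefit (ties: alphabetical)."""
    ranked = sorted(candidates, key=lambda g: (-benefit(g, queries, records), g))
    return ranked[:k]


def build_index(grams: List[str], records: List[str]) -> Dict[str, Set[int]]:
    """Map each selected n-gram to the ids of the records that contain it."""
    return {g: {i for i, r in enumerate(records) if g in r} for g in grams}


# Hypothetical sample data, not taken from the patent.
records = ["database", "databank", "dataset", "baseline"]
queries = ["base", "bank"]
candidates = ["da", "ba", "as", "an", "se"]

selected = select_ngrams(candidates, queries, records, k=2)
index = build_index(selected, records)

# An updated space constraint simply triggers a new selection and a new index.
updated = select_ngrams(candidates, queries, records, k=1)
```

With this sample data, `"an"` scores highest because it appears in one sample query and rules out three of the four records, so it survives even when the constraint is tightened to k = 1.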
