
SELECTION OF A SET OF OPTIMAL N-GRAMS FOR INDEXING STRING DATA IN A DBMS SYSTEM UNDER SPACE CONSTRAINTS INTRODUCED BY THE SYSTEM

  • US 20090063404A1
  • Filed: 11/04/2008
  • Published: 03/05/2009
  • Est. Priority Date: 11/05/2004
  • Status: Active Grant
First Claim

1. A system for selecting a set of n-grams for indexing string data in a database management system (DBMS) in relation to resources available to the DBMS, comprising:

  • means for providing a set of candidate n-grams, each n-gram comprising a sequence of characters;

    means for receiving an n-gram space constraint to define an amount “k” of the set of candidate n-grams eligible for a minimal set of n-grams, the n-gram space constraint based on resources available to the DBMS;

    means for comparing each of the candidate n-grams from the provided set of candidate n-grams with sample queries and database records to determine a benefit associated with the candidate n-grams in reducing false hits;

    means for selecting the minimal set of n-grams, the minimal set of n-grams having a highest total benefit and being subject to the n-gram space constraint;

    means for selecting an updated minimal set of n-grams responsive to receiving an updated n-gram space constraint, the updated minimal set of n-grams having a highest total benefit and being subject to the updated n-gram space constraint, wherein the updated minimal set of n-grams consists of no more than “k” n-grams; and

    means for generating an index, based on the minimal set of selected n-grams or the updated minimal set of n-grams, that indexes string data contained in the database records.
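
The elements of the claim can be illustrated with a small sketch. The claim does not specify an algorithm, so everything below is an assumption made for illustration: "benefit" is taken to be the reduction in false hits over the sample queries, the selection is a simple greedy heuristic (which is not guaranteed to find the set with the highest total benefit), and the names ngrams, build_index, false_hits, and select_ngrams are hypothetical.

```python
from typing import Dict, List, Set


def ngrams(text: str, n: int) -> Set[str]:
    """Return the set of all length-n substrings of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(records: List[str], grams: Set[str]) -> Dict[str, Set[int]]:
    """Map each selected n-gram to the ids of the records that contain it."""
    return {g: {i for i, r in enumerate(records) if g in r} for g in grams}


def false_hits(records: List[str], queries: List[str], grams: Set[str]) -> int:
    """Count records the index returns for a query that do not contain it.

    Assumption: a query is a substring search, and the index narrows the
    candidate records to those containing every selected n-gram that
    occurs in the query. With no usable n-gram, all records are candidates.
    """
    index = build_index(records, grams)
    total = 0
    for q in queries:
        candidates = set(range(len(records)))
        for g in grams:
            if g in q:
                candidates &= index[g]
        total += sum(1 for i in candidates if q not in records[i])
    return total


def select_ngrams(records: List[str], queries: List[str],
                  candidates: Set[str], k: int) -> Set[str]:
    """Greedily pick at most k n-grams, each time taking the one that
    removes the most remaining false hits (a heuristic, not an exact
    optimizer)."""
    selected: Set[str] = set()
    current = false_hits(records, queries, selected)
    while len(selected) < k:
        best_gram, best_cost = None, current
        # sorted() makes tie-breaking deterministic
        for g in sorted(candidates - selected):
            cost = false_hits(records, queries, selected | {g})
            if cost < best_cost:
                best_gram, best_cost = g, cost
        if best_gram is None:  # no remaining n-gram reduces false hits
            break
        selected.add(best_gram)
        current = best_cost
    return selected


# Made-up sample data, for illustration only.
records = ["database", "dataset", "metadata", "baseline"]
queries = ["base", "set"]
candidates = set().union(*(ngrams(q, 2) for q in queries))

chosen = select_ngrams(records, queries, candidates, k=2)  # {"ba", "et"}
index = build_index(records, chosen)

# An updated space constraint simply reruns the selection with the new k.
updated = select_ngrams(records, queries, candidates, k=1)  # {"ba"}
```

On this sample data, the unindexed scan produces 5 false hits across the two queries, and the two selected 2-grams reduce that to 1. Recomputing the false-hit count from scratch for every candidate is quadratic and only suitable for a toy example; a practical system would need an incremental benefit estimate.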
