Dynamic optimization of multi-feature queries

US 20030208484A1
Filed: 05/01/2002
Published: 11/06/2003
Est. Priority Date: 05/01/2002
Status: Active Grant



Abstract

The present invention provides an elegant solution for processing multi-feature queries that considers the differing access costs associated with each feature. Access cost is a critical factor in determining whether an individual feature should be retrieved through sorted or random access and, hence, in minimizing the overall query response time. The present invention operates dynamically during query processing and seeks to minimize the total query cost in terms of the number of features retrieved and the cost of access. It works by evaluating different combinations of feature access plans (sorted and random access) according to the number of retrieved features and forward access costs, and it selects the lowest-cost plan. Experimental results on practical data show a significant speed-up in multi-feature queries using the proposed solution.
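The plan selection described in the abstract can be illustrated with a minimal sketch. It is not the patented cost model: the names `FeatureCost`, `plan_cost`, and `cheapest_plan`, the per-access unit costs, and the assumption that a sorted feature is scanned `n_sorted` entries deep while a random-access feature is probed once per candidate are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class FeatureCost:
    """Hypothetical per-feature unit costs (not taken from the patent)."""
    name: str
    sorted_cost: float  # assumed cost of one sorted (sequential) access
    random_cost: float  # assumed cost of one random access (probe)


def plan_cost(features, modes, n_sorted, n_candidates):
    """Estimate the total cost of one plan.

    modes[i] is "sorted" or "random". Assumption: a sorted feature is
    scanned n_sorted entries deep; a random feature is probed once per
    candidate object.
    """
    total = 0.0
    for feature, mode in zip(features, modes):
        if mode == "sorted":
            total += n_sorted * feature.sorted_cost
        else:
            total += n_candidates * feature.random_cost
    return total


def cheapest_plan(features, n_sorted, n_candidates):
    """Enumerate every sorted/random combination and keep the cheapest."""
    best = None
    for modes in product(("sorted", "random"), repeat=len(features)):
        if "sorted" not in modes:
            continue  # some sorted stream must supply the candidates
        cost = plan_cost(features, modes, n_sorted, n_candidates)
        if best is None or cost < best[1]:
            best = (modes, cost)
    return best


features = [
    FeatureCost("color", sorted_cost=1.0, random_cost=5.0),
    FeatureCost("texture", sorted_cost=10.0, random_cost=2.0),
]
print(cheapest_plan(features, n_sorted=100, n_candidates=100))
```

Exhaustive enumeration grows as 2^n in the number of features, so this form is only practical for a handful of features; it is meant to show the comparison of plans, not an efficient search.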


14 Claims

1. A method for optimally performing a similarity search for a query object using at least one data stream, each of the at least one data stream for a feature attribute and being a list in distance order, comprising the steps of:
- determining a query plan using a cost-aware model; and
- executing the query plan to obtain at least one object using at least one of the at least one data stream;

  wherein information related to the similarity search is returned once the distance of the first of the at least one obtained object is at most equal to a threshold value based on an aggregate distance of highest distances of objects obtained from each of the at least one data stream.

Dependent claims (2, 3, 4, 5, 6, 7):
- 2. The method of claim 1, wherein determining the query plan comprises identifying an access sequence.
- 3. The method of claim 2, wherein identifying the access sequence comprises:
  - determining a location on an equi-threshold line with the lowest estimated cost; and
  - generating the query plan to reach the location on the equi-threshold line.
- 4. The method of claim 1, wherein each of the at least one data stream has a cost estimator used by the cost-aware model.
- 5. The method of claim 4, wherein each cost estimator used by the cost-aware model includes a sequential cost function and a random cost function.
- 6. The method of claim 4, wherein at least one cost estimator used by the cost-aware model is based on a correlation sample.
- 7. The method of claim 4, wherein at least one cost estimator used by the cost-aware model is based on extrapolation from past costs.
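The stopping condition in claim 1 can be sketched as follows. This is a simplified illustration, not the claimed method itself: it reads all streams in lockstep rather than following a cost-aware plan, it assumes the aggregate is a plain sum, and it assumes every object appears in every stream. The function name `nearest_by_threshold` and the dictionary lookup that stands in for random access are hypothetical.

```python
def nearest_by_threshold(streams, aggregate=sum):
    """Return (object_id, aggregate_distance, depth_read).

    streams: one list per feature of (object_id, distance) pairs in
    ascending distance order. Assumption: every object is listed in
    every stream, and `aggregate` is monotone (here: sum).
    """
    # Dictionaries stand in for random access to an object's distance.
    lookup = [dict(stream) for stream in streams]
    frontier = [0.0] * len(streams)  # highest distance read per stream
    best_id, best_dist = None, float("inf")
    seen = set()
    depth = 0
    max_depth = max(len(stream) for stream in streams)

    while depth < max_depth:
        for i, stream in enumerate(streams):
            if depth >= len(stream):
                continue
            obj, dist = stream[depth]
            frontier[i] = dist
            if obj not in seen:
                seen.add(obj)
                total = aggregate(table[obj] for table in lookup)
                if total < best_dist:
                    best_id, best_dist = obj, total
        depth += 1
        # Any unseen object is at least as far as the frontier in every
        # stream, so it cannot beat the aggregate of the frontier values.
        if best_dist <= aggregate(frontier):
            break
    return best_id, best_dist, depth


color = [("a", 0.1), ("b", 0.2), ("c", 0.9)]
texture = [("b", 0.1), ("c", 0.3), ("a", 0.8)]
print(nearest_by_threshold([color, texture]))
```

In this example the search stops after reading two entries from each stream, because the best object found so far is already no farther than the threshold formed from the last distances read.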

8. A program storage device readable by a machine, tangibly embodying a program of instructions executable on the machine to perform method steps for optimally performing a similarity search for a query object using at least one data stream, each of the at least one data stream for a feature attribute and being a list in distance order, the method steps comprising:
- determining a query plan using a cost-aware model; and
- executing the query plan to obtain at least one object using at least one of the at least one data stream;

  wherein information related to the similarity search is returned once the distance of the first of the at least one obtained object is at most equal to a threshold value based on an aggregate distance of highest distances of objects obtained from each of the at least one data stream.

Dependent claims (9, 10, 11, 12, 13, 14):
- 9. The program storage device of claim 8, wherein determining the query plan comprises identifying an access sequence.
- 10. The program storage device of claim 9, wherein identifying the access sequence comprises:
  - determining a location on an equi-threshold line with the lowest estimated cost; and
  - generating the query plan to reach the location on the equi-threshold line.
- 11. The program storage device of claim 8, wherein each of the at least one data stream has a cost estimator used by the cost-aware model.
- 12. The program storage device of claim 11, wherein each cost estimator used by the cost-aware model includes a sequential cost function and a random cost function.
- 13. The program storage device of claim 11, wherein at least one cost estimator used by the cost-aware model is based on a correlation sample.
- 14. The program storage device of claim 11, wherein at least one cost estimator used by the cost-aware model is based on extrapolation from past costs.
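A per-stream cost estimator with a sequential cost function, a random cost function, and extrapolation from past costs might look like the sketch below. The claims do not specify how the extrapolation is done, so the linear extension through the last two observations, the class name `CostEstimator`, and its parameters are all assumptions made for illustration.

```python
class CostEstimator:
    """Hypothetical per-stream estimator (names and model are assumed)."""

    def __init__(self, default_sequential, default_random):
        self.default_sequential = default_sequential  # assumed unit cost
        self.default_random = default_random          # assumed unit cost
        self.samples = []  # (depth, cumulative sequential cost) pairs

    def observe(self, depth, cumulative_cost):
        """Record the measured cumulative cost of reading to `depth`."""
        self.samples.append((depth, cumulative_cost))

    def sequential_cost(self, target_depth):
        """Estimate the cumulative cost of reading to `target_depth`.

        With fewer than two usable samples, fall back to the default
        unit cost; otherwise extend the line through the last two.
        """
        if len(self.samples) >= 2:
            (d0, c0), (d1, c1) = self.samples[-2:]
            if d1 != d0:
                slope = (c1 - c0) / (d1 - d0)
                return c1 + slope * (target_depth - d1)
        return target_depth * self.default_sequential

    def random_cost(self, n_probes):
        """Estimate the cost of `n_probes` random accesses."""
        return n_probes * self.default_random


estimator = CostEstimator(default_sequential=1.0, default_random=5.0)
estimator.observe(10, 20.0)
estimator.observe(20, 50.0)
print(estimator.sequential_cost(30), estimator.random_cost(4))
```

A cost-aware planner could query one such estimator per stream when comparing candidate access plans; an estimator based on a correlation sample (claims 6 and 13) would need a different model and is not shown here.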


Current Assignee: SINOEAST CONCEPT LIMITED (Tencent Holdings Limited)
Original Assignee: International Business Machines Corporation
Inventors: Chang, Yuan-Chi; Lang, Christian Alexander; Smith, John Richard

Granted Patent: US 6,917,932 B2
US Class Current: 707/5
CPC Class Codes:
- G06F 16/24542   Plan optimisation
- G06F 16/24549   Run-time optimisation
- G06F 16/24568   Data stream processing; Con...
- Y10S 707/99931   Database or file accessing
- Y10S 707/99932   Access augmentation or opti...
- Y10S 707/99945   Object-oriented database st...
