Method and apparatus for facilitating query reformulation

US 6,175,829 B1
Filed: 04/22/1998
Issued: 01/16/2001
Est. Priority Date: 04/22/1998
Status: Expired due to Term

First Claim

1. A method of querying a database of images, comprising the steps of:

(a) verifying a query by determining feedback information regarding:

(i) a maximum number of query matches, a minimum number of query matches and an estimated number of query matches;

(ii) alternative semantic-based query elements and alternative cognition-based query elements for elements of the query; and

(b) providing the feedback information to a user in a user window prior to processing the query.

Abstract

A method and apparatus for verifying a query to provide feedback to users for query reformulation. By utilizing selectivity statistics for semantic and visual characteristics of image objects, query verification “examines” user queries and allows users to reformulate queries through system feedback. Feedback information provided to the user includes (1) the maximum and minimum number of matches for the query; (2) alternatives for both semantic and visual-based query elements; and (3) estimated numbers of matching images. Additional types of feedback information may also be provided. With this feedback, users know whether the query criteria are too tight (i.e., too few matches will be retrieved) or too loose (i.e., too many matches will be retrieved), so that they can relax, refine, or reformulate queries, or leave them unchanged, accordingly. Queries are processed only after they are verified to have a high possibility of meaningful results. Through this type of systematic feedback, users can query and explore multimedia databases, for example, by homing in on target images without expensive query processing. This reduces expensive query processing and system load.
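
The "too tight" versus "too loose" decision described in the abstract can be illustrated with a minimal Python sketch. The `Feedback` class, the `advise` function, and the `too_few` / `too_many` thresholds are hypothetical names and values chosen for illustration; the patent does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Match-count feedback shown to the user before the query is processed."""
    min_matches: int
    max_matches: int
    estimated_matches: float


def advise(fb: Feedback, too_few: int = 1, too_many: int = 1000) -> str:
    """Suggest an action from the feedback; both thresholds are assumed values."""
    if fb.max_matches < too_few:
        return "relax"   # even the upper bound retrieves too few matches
    if fb.min_matches > too_many:
        return "refine"  # even the lower bound retrieves too many matches
    if fb.estimated_matches < too_few:
        return "relax"
    if fb.estimated_matches > too_many:
        return "refine"
    return "keep"
```

For example, `advise(Feedback(3, 200, 40.0))` suggests keeping the query unchanged, while a query whose maximum is zero is flagged for relaxation without running the query itself.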

53 Claims

1. A method of querying a database of images, comprising the steps of:
- (a) verifying a query by determining feedback information regarding:
  
  (i) a maximum number of query matches, a minimum number of query matches and an estimated number of query matches;
  
  (ii) alternative semantic-based query elements and alternative cognition-based query elements for elements of the query; and
  
  (b) providing the feedback information to a user in a user window prior to processing the query.
2. The method according to claim 1, further comprising the steps of:
3. The method according to claim 1, wherein the step of verifying the query further comprises determining the feedback information regarding co-occurrences for the elements of the query, co-occurrences being objects appearing in a same image as the elements of the query.
4. The method according to claim 1, wherein the step of verifying the query further comprises determining the feedback information regarding selectivity for each of the elements of the query, the selectivity being a number of occurrences of the elements of the query in the database.
5. The method according to claim 1, further comprising the steps of:
- specifying the query prior to the step of verifying the query; and
  
  generating a query statement in Cognition and Semantics Query Language (CSQL), subsequent to specifying the query.
6. The method according to claim 1, wherein the step of verifying the query further comprises determining the feedback information regarding a similarity value between each of the alternative semantic-based query elements and the elements of the query and between each of the alternative cognition-based query elements and the elements of the query.
7. The method according to claim 1, wherein the step of verifying the query further comprises determining the feedback information regarding selectivity for each of the alternative semantic-based query elements and for each of the alternative cognition-based query elements, the selectivity being a number of occurrences of each of the alternative query elements in the database.
8. The method according to claim 3, wherein the step of verifying the query further comprises determining the feedback information regarding selectivity for each of the co-occurrences, the selectivity being a number of occurrences of each of the co-occurrences in the database.
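
The selectivity and co-occurrence feedback of claims 3, 4 and 8 can be sketched as simple counting over image annotations. This is only an illustration: the `IMAGES` table and both function names are hypothetical, and the sketch assumes each object label appears at most once per image, so that "number of occurrences" reduces to a count of images.

```python
from collections import Counter

# Hypothetical toy database: image id -> set of object labels in that image.
IMAGES = {
    "img1": {"man", "car"},
    "img2": {"man", "bicycle"},
    "img3": {"car", "tree"},
    "img4": {"man", "car", "tree"},
}


def selectivity(term):
    """Number of occurrences of `term` in the database (one per image here)."""
    return sum(1 for objects in IMAGES.values() if term in objects)


def co_occurrences(term):
    """Objects appearing in the same image as `term`, with how often they do."""
    counts = Counter()
    for objects in IMAGES.values():
        if term in objects:
            counts.update(objects - {term})
    return counts
```

Calling `selectivity` on each key returned by `co_occurrences("man")` would give the per-co-occurrence selectivity described in claim 8.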
9. The method according to claim 1, wherein the alternative cognition-based query elements comprise color and shape query elements.
10. The method according to claim 7, wherein the selectivity for the alternative semantic-based query elements is determined from a hierarchical structure of statistics regarding image object semantics, the statistics being stored in an index.
11. The method according to claim 10, wherein the hierarchical structure comprises indices for basic predicates and indices, derived from the indices for basic predicates, for derived predicates.
12. The method according to claim 7, wherein the selectivity for the alternative cognition-based query elements is based on image classification statistics, wherein the statistics account for color and shape query elements of the alternative cognition-based query elements.
13. The method according to claim 6, wherein the similarity value is calculated between predicates requiring identical semantics (“is”) in accordance with the formula:
14. The method according to claim 6, wherein the similarity value is calculated between a predicate requiring identical semantics (“is”) and a predicate requiring similar semantics (“s_like”), where α is a term in the “is” predicate and in the “s_like” predicate, in accordance with the formula: $\frac{\sum_{i=1}^{n} \left(1 - \mathrm{Distance}(\alpha, s_{i})\right)}{n}$, where n is a number of synonyms of α and where each s_i, 1 ≤ i ≤ n, is one of the synonyms.
15. The method according to claim 6, wherein the similarity value is calculated between a predicate requiring identical semantics (“is”) and a predicate requiring semantic generalization (“is_a”), in accordance with the formula: $\frac{\sum_{i=1}^{m} \left(1 - \mathrm{Distance}(\alpha, s_{i})\right)}{m}$, where α is a term of the “is” predicate and β is a term of the “is_a” predicate, β is a hypernym of α and m is the number of hypernyms of β, excluding α.
16. The method according to claim 6, wherein the similarity value is calculated between a predicate requiring similar semantics (“s_like”) and a predicate requiring semantic generalization (“is_a”) in accordance with the formula: $\frac{\sum_{i=1}^{n} \sum_{j=1}^{m} \left(1 - \mathrm{Distance}(h_{j}, s_{i})\right)}{m \times n}$, where α is a term of the “s_like” predicate and β is a term of the “is_a” predicate, where each h_j, 1 ≤ j ≤ m, is a hypernym of β, m being the number of hypernyms, and each s_i, 1 ≤ i ≤ n, is a synonym of α, n being the number of synonyms.
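
The averaged similarity values of claims 14 and 16 can be sketched as follows. The claims do not define `Distance`, so the sketch takes it as a caller-supplied function; `toy_distance`, its table of values, and the function names are hypothetical and exist only to make the example runnable. Both functions assume non-empty synonym and hypernym lists.

```python
def similarity_is_vs_s_like(alpha, synonyms, distance):
    """Claim 14: mean of (1 - Distance(alpha, s_i)) over the n synonyms of alpha."""
    return sum(1 - distance(alpha, s) for s in synonyms) / len(synonyms)


def similarity_s_like_vs_is_a(synonyms, hypernyms, distance):
    """Claim 16: mean of (1 - Distance(h_j, s_i)) over all m x n pairs."""
    total = sum(1 - distance(h, s) for s in synonyms for h in hypernyms)
    return total / (len(hypernyms) * len(synonyms))


# Hypothetical symmetric distances in [0, 1]; not taken from the patent.
TOY_DISTANCES = {
    frozenset({"car", "auto"}): 0.1,
    frozenset({"car", "automobile"}): 0.2,
    frozenset({"vehicle", "auto"}): 0.4,
    frozenset({"vehicle", "automobile"}): 0.4,
}


def toy_distance(a, b):
    """Look up the assumed distance between two terms (0 for identical terms)."""
    return 0.0 if a == b else TOY_DISTANCES[frozenset({a, b})]
```

With these assumed distances, the "is" versus "s_like" similarity for "car" with synonyms "auto" and "automobile" is the mean of 0.9 and 0.8.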
17. The method according to claim 1, further comprising the steps of:
- reformulating the query using the alternative query elements; and
  
  ranking the alternative query elements based on similarity values between the elements of the query and each of the alternative query elements.
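
The ranking step of claim 17 amounts to sorting alternatives by their similarity to the original query element. A minimal sketch, assuming the similarity values have already been computed and are passed in as hypothetical `(element, similarity)` pairs:

```python
def rank_alternatives(alternatives):
    """Order (element, similarity) pairs from most to least similar."""
    return sorted(alternatives, key=lambda pair: pair[1], reverse=True)
```

Because `sorted` is stable, alternatives with equal similarity keep their input order.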
18. The method according to claim 1, wherein the maximum number of query matches is calculated in accordance with the following steps when the elements of the query are cognition-based query elements:
- (i) identifying all major shapes and colors in the query; and
  
  (ii) summing selectivity for each shape, color pair in the query to produce the maximum number of query matches, the selectivity being a number of occurrences of each shape, color pair in the database.
19. The method according to claim 1, wherein the minimum number of query matches is calculated in accordance with the following steps when the elements of the query are cognition-based query elements:
- (i) identifying a primary shape and color in the query; and
  
  (ii) selecting a selectivity for a shape, color pair, comprising the primary shape and color in the query, as the minimum number of query matches, the selectivity being a number of occurrences of the shape, color pair in the database.
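
The cognition-based bounds of claims 18 and 19 can be sketched with a lookup table of shape-color selectivities. The `PAIR_SELECTIVITY` table and its counts are hypothetical; how the major and primary shapes and colors are identified is outside this sketch.

```python
# Hypothetical statistics: (shape, color) -> number of occurrences in the database.
PAIR_SELECTIVITY = {
    ("circle", "red"): 120,
    ("circle", "blue"): 45,
    ("square", "red"): 30,
}


def cognition_max_matches(pairs):
    """Claim 18: sum the selectivity of every shape-color pair in the query."""
    return sum(PAIR_SELECTIVITY.get(pair, 0) for pair in pairs)


def cognition_min_matches(primary_pair):
    """Claim 19: selectivity of the primary shape-color pair alone."""
    return PAIR_SELECTIVITY.get(primary_pair, 0)
```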
20. The method according to claim 1, wherein the step of verifying a query by determining the feedback information regarding the estimated number of matches takes into account weighing shape versus color, the estimated number of matches being determined in accordance with the following steps when the elements of the query are cognition-based query elements:
- (i) for each one of the images I in the database, extracting major shapes and colors;
  
  (ii) for each one of the images I in the database and for each shape-color pair, <s_i, c_j>, identifying a significance of the shape, ranking(s_i), with respect to a shape similarity between s_i and I, and a number of pixels, num(c_j), in each I having c_j;
  
  (iii) for each of the shape-color pairs, <s_i, c_j>, in each I, computing a weight value in accordance with the formula $w(s_{i}, c_{j}) = \alpha \times \left(1 - \frac{\mathrm{ranking}(s_{i})}{\sum_{\forall s_{k} \text{ in the image}} \mathrm{ranking}(s_{k})}\right) + (1 - \alpha) \times \frac{\mathrm{num}(c_{j})}{\text{\# of pixels in the image}}$, where α is a user specified parameter denoting a ratio of importance of shape-based comparisons to color-based comparisons; and
  
  (iv) calculating the estimated number of matches in accordance with the formula: $\sum_{\forall \langle s_{i}, c_{j} \rangle \in I} w(s_{i}, c_{j}) \times \mathrm{Matches}(\langle s_{i}, c_{j} \rangle)$, where Matches(<s_i, c_j>) is a selectivity for the shape-color pair <s_i, c_j>, the selectivity being a number of occurrences of each shape, color pair in the database.
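
The weighting and estimate of claim 20 can be sketched for a single image I as follows. The dictionaries `rankings` (shape to ranking), `pixel_counts` (color to pixel count) and `matches` (shape-color pair to selectivity) are hypothetical inputs; extracting shapes and colors from an image is not shown.

```python
def weight(shape, color, rankings, pixel_counts, total_pixels, alpha):
    """Step (iii): blend the shape term and the color term using alpha."""
    shape_term = 1 - rankings[shape] / sum(rankings.values())
    color_term = pixel_counts[color] / total_pixels
    return alpha * shape_term + (1 - alpha) * color_term


def estimated_matches(rankings, pixel_counts, total_pixels, matches, alpha):
    """Step (iv): sum weight x selectivity over every shape-color pair in I."""
    return sum(
        weight(s, c, rankings, pixel_counts, total_pixels, alpha)
        * matches.get((s, c), 0)  # pairs with no statistics contribute 0
        for s in rankings
        for c in pixel_counts
    )
```

Setting `alpha` closer to 1 makes the shape ranking dominate the weight, while values closer to 0 favor the color pixel share.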
21. The method according to claim 1, wherein the estimated number of query matches is determined in accordance with the following formula when the elements of the query are semantics-based query elements:
- $\sum_{i \geq 2} \left( P(\text{image has } i \text{ objects}) \times P(\text{one object is object1 and other looks like object2} \mid \text{image has } i \text{ objects}) \right) = \sum_{i \geq 2} \left( \frac{\text{\# of images with } i \text{ objects}}{\text{total \# of images}} \times \mathrm{permutation}(i, 2) \times P(\text{object1}) \times P(\text{object2}) \right) = \sum_{i \geq 2} \left( \frac{\text{\# of images with } i \text{ objects}}{\text{total \# of images}} \times i! \times (i - 1)! \times \frac{\text{\# of object1}}{\text{total \# of objects}} \times \frac{\text{\# of object2}}{\text{total \# of objects}} \right)$, where object1 and object2 are ones of the elements of the query and i is the number of objects in the images in the database.
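
The middle expression of the claim 21 formula can be sketched as below. The sketch assumes that permutation(i, 2) denotes the number of ordered pairs of objects, i × (i − 1), which is what Python's `math.perm(i, 2)` returns; it does not use the factorial form of the last expression. The function name and the `images_by_object_count` mapping (number of objects to number of images) are hypothetical.

```python
from math import perm


def semantic_match_estimate(images_by_object_count, n_object1, n_object2, total_objects):
    """Sum over i >= 2 of P(image has i objects) * perm(i, 2) * P(obj1) * P(obj2)."""
    total_images = sum(images_by_object_count.values())
    p_object1 = n_object1 / total_objects
    p_object2 = n_object2 / total_objects
    return sum(
        (count / total_images) * perm(i, 2) * p_object1 * p_object2
        for i, count in images_by_object_count.items()
        if i >= 2  # images with fewer than two objects cannot hold both elements
    )
```

The value computed here is the quantity given by the formula itself; converting it to an image count, for instance by scaling by the total number of images, would be a further assumption.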
22. The method according to claim 1, wherein the maximum number of query matches is equal to a minimum of selectivity for each of the elements of the query when the elements of the query are semantics-based query elements, the selectivity being a number of occurrences of the elements of the query in the database.
23. The method according to claim 1, wherein the minimum number of query matches is determined in accordance with the following formula when the elements of the query are semantics-based query elements:
- (a) $\left(\sum \mathrm{ObjectSemanticsSelectivity} + \sum \mathrm{ObjectCognitionSelectivity}\right) - \sum_{N=1}^{M} \mathrm{ObjectContainmentImageSelectivity}(N)$ when $\sum_{N=1}^{M} \mathrm{ObjectContainmentImageSelectivity}(N) < \left(\sum \mathrm{ObjectSemanticsSelectivity} + \sum \mathrm{ObjectCognitionSelectivity}\right)$; and (b) 0 when $\sum_{N=1}^{M} \mathrm{ObjectContainmentImageSelectivity}(N) \geq \left(\sum \mathrm{ObjectSemanticsSelectivity} + \sum \mathrm{ObjectCognitionSelectivity}\right)$, where M is the number of objects associated with the query, the ObjectSemanticsSelectivity is a selectivity for a semantics-based condition, ObjectCognitionSelectivity is a selectivity for a cognition-based condition and ObjectContainmentImageSelectivity is a selectivity for a condition containing M objects, where the selectivity is a number of occurrences of the condition in the database.
24. The method according to claim 1, wherein the database comprises one of a multimedia database, a hypermedia database and a database on the World Wide Web.
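
The semantics-based bounds of claims 22 and 23 can be sketched as two small functions. The function and parameter names are hypothetical; each argument is assumed to be a list of selectivity counts already looked up from the database statistics.

```python
def semantics_max_matches(element_selectivities):
    """Claim 22: the maximum is the smallest selectivity among the query elements."""
    return min(element_selectivities)


def semantics_min_matches(semantics, cognition, containment):
    """Claim 23: combined selectivity minus containment selectivity, floored at 0."""
    combined = sum(semantics) + sum(cognition)
    contained = sum(containment)
    # Case (a) applies when containment is smaller; otherwise case (b) gives 0.
    return combined - contained if contained < combined else 0
```

The two cases of claim 23 together are equivalent to `max(0, combined - contained)`; they are written out separately here to mirror the claim.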

25. An apparatus for querying a database of images, the apparatus comprising:
- a computer system for verifying a query by determining feedback information regarding:
  
  (i) a maximum number of query matches, a minimum number of query matches and an estimated number of query matches;
  
  (ii) alternative semantic-based query elements and alternative cognition-based query elements for elements of the query; and
  
  a display, wherein the feedback information is displayed on the display prior to processing the query.
26. The apparatus according to claim 25, wherein the computer system further comprises:
- a query verifier for providing the feedback information regarding the maximum number of query matches, the minimum number of query matches and the estimated number of query matches.
27. The apparatus according to claim 26, wherein the computer system further comprises:
28. The apparatus according to claim 26, wherein a user of the user workstation uses the feedback information to choose one of reformulating the query to form a reformulated query, narrowing the query to form a narrowed query, broadening the query to form a broadened query and keeping the query without changes, and wherein the computer system processes the chosen one of the reformulated query, the narrowed query, the broadened query and the query.
29. The apparatus according to claim 26, wherein the computer system further comprises:
- an image semantics editor for assigning image semantics to the images in the database to provide the alternative semantics-based query elements, said image semantics being stored in an image metadata database along with visual characteristics of the images in the database.
30. The apparatus according to claim 29, wherein the computer system further comprises:
- a textual data query processor for processing the query when the query concerns image or object semantics stored as textual data.
31. The apparatus according to claim 26, wherein the query verifier of the computer system verifies the query for further providing the feedback information regarding co-occurrences for the elements of the query, co-occurrences being objects appearing in a same image as the elements of the query.
32. The apparatus according to claim 26, wherein the query verifier of the computer system verifies the query by further providing the feedback information regarding selectivity for each of the elements of the query, the selectivity being a number of occurrences of the elements of the query in the database.
33. The apparatus according to claim 25, wherein the user of the user workstation specifies the query prior to the computer system verifying the query and wherein the computer system generates a query statement in Cognition and Semantics Query Language (CSQL), subsequent to the user specifying the query.
34. The apparatus according to claim 26, wherein the query verifier of the computer system verifies the query by further providing the feedback information regarding a similarity value between each of the alternative semantic-based query elements and the elements of the query and between each of the alternative cognition-based query elements and the elements of the query.
35. The apparatus according to claim 29, wherein the query verifier of the computer system verifies the query by further providing the feedback information regarding selectivity for each of the alternative semantic-based query elements and for each of the alternative cognition-based query elements, the selectivity being a number of occurrences of each of the alternative query elements in the database.
36. The apparatus according to claim 31, wherein the query verifier of the computer system verifies the query by further providing the feedback information regarding selectivity for each of the co-occurrences, the selectivity being a number of occurrences of each of the co-occurrences in the database.
37. The apparatus according to claim 25, wherein the alternative cognition-based query elements comprise color and shape query elements.
38. The apparatus according to claim 35, wherein the selectivity for the alternative semantic-based query elements is based on a hierarchical structure of statistics regarding image object semantics.
39. The apparatus according to claim 38, wherein the hierarchical structure comprises indices for basic predicates and indices, derived from the indices for basic predicates, for derived predicates.
40. The apparatus according to claim 38, wherein the computer system further comprises a semantics index for building the hierarchical structure of statistics, the structure being stored in an image semantics database, wherein the query verifier uses the structure stored in the image semantics database and the image metadata database to provide the selectivity.
41. The apparatus according to claim 35, wherein the selectivity for the alternative cognition-based query elements is based on image classification statistics, stored in the image metadata database, wherein the statistics account for color and shape query elements of the alternative cognition-based query elements.
42. The apparatus according to claim 34, wherein the similarity value is calculated between predicates requiring identical semantics (“is”) in accordance with the formula:
43. The apparatus according to claim 34, wherein the similarity value is calculated between a predicate requiring identical semantics (“is”) and a predicate requiring similar semantics (“s_like”), where α is a term in the “is” predicate and in the “s_like” predicate, in accordance with the formula: $\frac{\sum_{i=1}^{n} \left(1 - \mathrm{Distance}(\alpha, s_{i})\right)}{n}$, where n is a number of synonyms of α and where each s_i, 1 ≤ i ≤ n, is one of the synonyms.
44. The apparatus according to claim 34, wherein the similarity value is calculated between a predicate requiring identical semantics (“is”) and a predicate requiring semantic generalization (“is_a”), in accordance with the formula: $\frac{\sum_{i=1}^{m} \left(1 - \mathrm{Distance}(\alpha, s_{i})\right)}{m}$, where α is a term of the “is” predicate and β is a term of the “is_a” predicate, β is a hypernym of α and m is the number of hypernyms of β, excluding α.
45. The apparatus according to claim 34, wherein the similarity value is calculated between a predicate requiring similar semantics (“s_like”) and a predicate requiring semantic generalization (“is_a”) in accordance with the formula: $\frac{\sum_{i=1}^{n} \sum_{j=1}^{m} \left(1 - \mathrm{Distance}(h_{j}, s_{i})\right)}{m \times n}$, where α is a term of the “s_like” predicate and β is a term of the “is_a” predicate, where each h_j, 1 ≤ j ≤ m, is a hypernym of β, m being the number of hypernyms, and each s_i, 1 ≤ i ≤ n, is a synonym of α, n being the number of synonyms.
46. The apparatus according to claim 25, wherein the computer system reformulates the query using the alternative query elements and ranks the alternative query elements based on similarity values between the elements of the query and each of the alternative query elements.
47. The apparatus according to claim 25, wherein the computer system calculates the maximum number of query matches, when the elements of the query are cognition-based query elements, by:
- (i) identifying all major shapes and colors in the query; and
  
  (ii) summing selectivity for each shape, color pair in the query to produce the maximum number of query matches, the selectivity being a number of occurrences of each shape, color pair in the database.
48. The apparatus according to claim 25, wherein the computer system calculates the minimum number of query matches, when the elements of the query are cognition-based query elements, by:
- (i) identifying a primary shape and color in the query; and
  
  (ii) selecting a selectivity for a shape, color pair, comprising the primary shape and color in the query, as the minimum number of query matches, the selectivity being a number of occurrences of the shape, color pair in the database.
49. The apparatus according to claim 25, wherein the computer system calculates the estimated number of matches when the elements of the query are cognition-based query elements, taking into account weighing shape versus color, by:
- (i) for each one of the images I in the database, extracting major shapes and colors;
  
  (ii) for each one of the images I in the database and for each shape-color pair, <s_i, c_j>, identifying a significance of the shape, ranking(s_i), with respect to a shape similarity between s_i and I, and a number of pixels, num(c_j), in each I having c_j;
  
  (iii) for each of the shape-color pairs, <s_i, c_j>, in each I, computing a weight value in accordance with the formula $w(s_{i}, c_{j}) = \alpha \times \left(1 - \frac{\mathrm{ranking}(s_{i})}{\sum_{\forall s_{k} \text{ in the image}} \mathrm{ranking}(s_{k})}\right) + (1 - \alpha) \times \frac{\mathrm{num}(c_{j})}{\text{\# of pixels in the image}}$, where α is a user specified parameter denoting a ratio of importance of shape-based comparisons to color-based comparisons; and
  
  (iv) calculating the estimated number of matches in accordance with the formula: $\sum_{\forall \langle s_{i}, c_{j} \rangle \in I} w(s_{i}, c_{j}) \times \mathrm{Matches}(\langle s_{i}, c_{j} \rangle)$, where Matches(<s_i, c_j>) is a selectivity value for the shape-color pair <s_i, c_j>, the selectivity being a number of occurrences of each shape, color pair in the database.
50. The apparatus according to claim 25, wherein the computer system determines the estimated number of query matches in accordance with the following formula when the elements of the query are semantics-based query elements:
- $\sum_{i \geq 2} \left( P(\text{image has } i \text{ objects}) \times P(\text{one object is object1 and other looks like object2} \mid \text{image has } i \text{ objects}) \right) = \sum_{i \geq 2} \left( \frac{\text{\# of images with } i \text{ objects}}{\text{total \# of images}} \times \mathrm{permutation}(i, 2) \times P(\text{object1}) \times P(\text{object2}) \right) = \sum_{i \geq 2} \left( \frac{\text{\# of images with } i \text{ objects}}{\text{total \# of images}} \times i! \times (i - 1)! \times \frac{\text{\# of object1}}{\text{total \# of objects}} \times \frac{\text{\# of object2}}{\text{total \# of objects}} \right)$, where object1 and object2 are ones of the elements of the query and i is the number of objects in the images in the database.
51. The apparatus according to claim 25, wherein the computer system determines the maximum number of query matches to be equal to a minimum of selectivity for each of the elements of the query when the elements of the query are semantics-based query elements, the selectivity being a number of occurrences of the elements of the query in the database.
52. The apparatus according to claim 25, wherein the computer system determines the minimum number of query matches in accordance with the following formula when the elements of the query are semantics-based query elements:
- (a) $\left(\sum \mathrm{ObjectSemanticsSelectivity} + \sum \mathrm{ObjectCognitionSelectivity}\right) - \sum_{N=1}^{M} \mathrm{ObjectContainmentImageSelectivity}(N)$ when $\sum_{N=1}^{M} \mathrm{ObjectContainmentImageSelectivity}(N) < \left(\sum \mathrm{ObjectSemanticsSelectivity} + \sum \mathrm{ObjectCognitionSelectivity}\right)$; and (b) 0 when $\sum_{N=1}^{M} \mathrm{ObjectContainmentImageSelectivity}(N) \geq \left(\sum \mathrm{ObjectSemanticsSelectivity} + \sum \mathrm{ObjectCognitionSelectivity}\right)$, where M is the number of objects associated with the query, the ObjectSemanticsSelectivity is a selectivity for a semantics-based condition, ObjectCognitionSelectivity is a selectivity for a cognition-based condition and ObjectContainmentImageSelectivity is a selectivity for a condition containing M objects, where the selectivity is a number of occurrences of the condition in the database.
53. The apparatus according to claim 25, wherein the database of images comprises one of a multimedia database, a hypermedia database and a database on the World Wide Web.

Current Assignee: NEC Corporation
Original Assignee: NEC USA Inc (NEC Corporation)
Inventors: Candan, K. Selcuk; Li, Wen-Syan
Primary Examiner(s): Alam, Hosain T.
Assistant Examiner(s): CORRIELUS, JEAN M
Application Number: US09/064,069
Time in Patent Office: 1,000 Days
Field of Search: 707/3, 707/1, 707/104, 382/230
US Class Current: 1/1
CPC Class Codes:

G06F 16/532   Query formulation, e.g. gra...

G06F 16/58   Retrieval characterised by ...

G06F 16/583   using metadata automaticall...

Y10S 707/99931   Database or file accessing

Y10S 707/99933   Query processing, i.e. sear...

Y10S 707/99945   Object-oriented database st...