Process of dynamic taxonomy for browsing and retrieving information in large heterogeneous data bases
Abstract
A process is disclosed for retrieving information in large heterogeneous data bases, wherein information retrieval through visual querying/browsing is supported by dynamic taxonomies; the process providing the steps of: initially showing (F1) a complete taxonomy for the retrieval; refining (F2) the retrieval through a selection of subsets of interest, where the refining is performed by selecting concepts in the taxonomy and combining them through boolean operations; showing (F3) a reduced taxonomy for the selected set; and further refining (F4) the retrieval through an iterative execution of the refining and showing steps.
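The four steps of the abstract (F1 to F4) can be illustrated with a minimal sketch. It is not the patented implementation: the concept names, the items, and the helpers `ancestors_or_self`, `deep_extension` and `reduced_taxonomy` are hypothetical, and the taxonomy is assumed to fit in memory as a parent map.

```python
from collections import defaultdict

# Hypothetical toy intension: concept -> parent concepts (a DAG rooted at "root").
PARENTS = {
    "root": [],
    "Topic": ["root"],
    "Databases": ["Topic"],
    "Retrieval": ["Topic"],
    "Year": ["root"],
    "1999": ["Year"],
    "2000": ["Year"],
}

# Hypothetical extension: item -> concepts it is directly classified under.
CLASSIFIED = {
    "doc1": {"Databases", "1999"},
    "doc2": {"Retrieval", "1999"},
    "doc3": {"Databases", "Retrieval", "2000"},
}


def ancestors_or_self(concept):
    """Return the concept together with all of its ancestors in the DAG."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def deep_extension():
    """Map each concept to the items classified under it or any descendant."""
    ext = defaultdict(set)
    for item, concepts in CLASSIFIED.items():
        for concept in concepts:
            for c in ancestors_or_self(concept):
                ext[c].add(item)
    return dict(ext)


def reduced_taxonomy(ext, focus_items):
    """Prune concepts with no item in the focus; report a count for the rest."""
    return {c: len(items & focus_items)
            for c, items in ext.items() if items & focus_items}


EXT = deep_extension()                      # F1: the complete taxonomy
focus = EXT["Databases"] & EXT["1999"]      # F2: boolean AND of two concepts
view = reduced_taxonomy(EXT, focus)         # F3: reduced taxonomy with counts
# F4: pick a concept from `view`, intersect it with `focus`, and reduce again.
```

In this toy data the focus contains only `doc1`, so "Retrieval" and "2000" are pruned from the view while "Databases", "1999" and their ancestors remain.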
49 Claims
1. A method for retrieving information from databases, said databases being structured or unstructured, said databases being homogeneous or heterogeneous, wherein retrieval is performed through visual queries on dynamic taxonomies, said dynamic taxonomies being an organization of concepts that ranges from a most general concept to a most specific concept, said concepts and their generalization or specialization relationships being called an intension, items in said databases being classified under one or more concepts, said items and their classification being called an extension, said method comprising:
using a computer for providing a taxonomy for said retrieval;
using the computer for operating on a selected subset of interest of said taxonomy in order to refine said retrieval, said selected subset of interest being specified by using the computer for combining selected taxonomy concepts through boolean operations or being specified through querying methods, said querying methods retrieving classified items according to different selection criteria;
providing a reduced taxonomy for said selected subset of interest, said reduced taxonomy being derived from said taxonomy by using the computer for eliminating from the extension of said taxonomy all items not in said selected subset of interest and by pruning concepts under which no item in said selected subset of interest is classified; and
using the computer for iteratively repeating said steps of operating on a selected subset of interest and of providing a reduced taxonomy to further refine said retrieval, wherein:
said step of pruning concepts includes eliminating from the taxonomy all the concepts under which no item in the selected subset of interest is classified, or preventing said concepts from being selected in order to specify interest sets;
said step of providing a reduced taxonomy either reports only the concepts belonging to the reduced taxonomy or, for each such concept also reports how many items in the interest set are classified under the concept;
said intension is organized as a hierarchy of concepts or as a directed acyclic graph of concepts, thereby allowing a concept to have multiple fathers;
items in said classification are classified programmatically or automatically;
in said extension, there exists at least one item such that said item is classified under at least two different concepts such that each of said two concepts is neither an ancestor nor a descendant of the other concept in the intension;
zero or more concepts represent a tag cloud, said tag cloud having as descendants all or parts of the terms or phrases that are derived from the items, each tag cloud and each of descendants is used as a dynamic taxonomy concept to define a subset of interest possibly in combination with other clouds or concepts or querying methods, each tag cloud and each of descendants is used as a dynamic taxonomy concept to summarize a subset of interest; and
said method to construct and provide through a reduced taxonomy the relationships between any two concepts based on the classification by using the computer, a relationship between any two concepts existing if at least one item is classified (1) under a first concept or any descendants of the first concept, and (2) under a second concept, or any descendants of the second concept.
Dependent claims: 2, 3, 4, 5, 6, 8
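The last element of claim 1 defines a relationship between two concepts by shared items. As an illustration only, the test can be sketched as a set intersection; `DEEP_EXTENSION` and `related_pairs` are hypothetical names, and the map is assumed to already hold, for each concept, the items classified under it or under any of its descendants.

```python
from itertools import combinations

# Hypothetical deep extension: concept -> items classified under the concept
# or under any of its descendants.
DEEP_EXTENSION = {
    "Databases": {"doc1", "doc3"},
    "Retrieval": {"doc2", "doc3"},
    "1999": {"doc1", "doc2"},
    "2000": {"doc3"},
}


def related_pairs(deep_extension, focus_items):
    """Return concept pairs that share at least one item inside the focus."""
    pairs = set()
    for a, b in combinations(sorted(deep_extension), 2):
        if deep_extension[a] & deep_extension[b] & focus_items:
            pairs.add((a, b))
    return pairs


ALL_ITEMS = {"doc1", "doc2", "doc3"}
all_pairs = related_pairs(DEEP_EXTENSION, ALL_ITEMS)
narrow_pairs = related_pairs(DEEP_EXTENSION, {"doc1"})
```

With the whole extension as focus, every pair is related except "1999" and "2000", which share no item. Narrowing the focus to `doc1` leaves only the pair ("1999", "Databases").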
7. A method for retrieving items from electronic catalogs, for applications such as electronic commerce or electronic auctions, wherein retrieval is performed through visual queries on dynamic taxonomies, said dynamic taxonomies being an organization of concepts that ranges from a most general concept to a most specific concept, said concepts and their generalization or specialization relationships being called an intension, said concepts also comprising features such as price, items in said electronic catalogs to be classified under one or more concepts, said items and their classification being called an extension, said method comprising:
using a computer for providing a taxonomy for said retrieval;
using the computer for operating on a selected subset of interest of said taxonomy in order to refine said retrieval, said selected subset of interest being specified by using the computer for combining selected taxonomy concepts through boolean operations, or being specified through querying methods, said querying methods retrieving classified items according to different selection criteria;
providing a reduced taxonomy for said selected subset of interest, said reduced taxonomy being derived from said taxonomy by using the computer for eliminating from the extension of said taxonomy all items not in said selected subset of interest and pruning concepts under which no item in said selected subset of interest is classified; and
using the computer for iteratively repeating said steps of operating on a selected subset of interest and of providing a reduced taxonomy to further refine said retrieval, wherein:
said hierarchical organization of concepts for said electronic catalogs comprises a set of features, each of said features being a descendant concept of the root concept of said organization, each of said features having as descendants in the taxonomy a set of concepts, each concept in said set of concepts representing either a single value or a set of values for said feature;
said items in said electronic catalogs are classified, for each said feature, under zero or more concepts representing either a single value or a set of values for that feature;
said step of providing a reduced taxonomy either reports only the concepts belonging to the reduced taxonomy or, for each such concept also reports how many items in the interest set are classified under the concept;
in said extension, there exists at least one item such that said item is classified under at least two different concepts such that each of said two concepts is neither an ancestor nor a descendant of the other concept in the intension;
zero or more concepts represent a tag cloud, said tag cloud having as descendants all or parts of the terms or phrases that are derived from the items, each tag cloud and each of descendants is used as a dynamic taxonomy concept to define a subset of interest possibly in combination with other clouds or concepts or querying methods, each tag cloud and each of descendants is used as a dynamic taxonomy concept to summarize a subset of interest; and
said step of pruning of concepts includes eliminating from the taxonomy the concepts under which no item in the selected subset of interest is classified, or preventing such concepts from being selected in order to specify interest sets.
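Claim 7 treats a feature such as price as a concept whose descendants each stand for a single value or a set of values. One way to picture this, purely as a sketch, is a set of price-range concepts under a "Price" feature. The ranges, their labels and the helper `classify_price` are assumptions made for illustration; the claim does not prescribe any particular ranges.

```python
# Hypothetical descendants of a "Price" feature: label -> half-open range [low, high).
PRICE_RANGES = {
    "Price: under 50": (0, 50),
    "Price: 50 to 200": (50, 200),
    "Price: 200 and over": (200, float("inf")),
}


def classify_price(price):
    """Return the price-range concepts under which an item with this price falls."""
    return {label for label, (low, high) in PRICE_RANGES.items()
            if low <= price < high}


# Hypothetical catalog items and their prices.
catalog = {"lamp": 49.99, "desk": 50, "sofa": 1000}
classification = {item: classify_price(price) for item, price in catalog.items()}
```

Once items are classified this way, a price range behaves like any other concept: it can be combined with other concepts through boolean operations and is pruned from the reduced taxonomy when no item in the focus falls in it.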
9. A method for retrieving information from databases, said databases being structured or unstructured, said databases being homogeneous or heterogeneous, wherein retrieval is performed through visual queries on dynamic taxonomies, said dynamic taxonomies being an organization of concepts that ranges from a most general concept to a most specific concept, said concepts and their organization being called an intension, items in said databases being classified under one or more concepts, said items and their classification being called an extension, said method comprising, given an initial current subset of interest:
using a computer for providing a reduced taxonomy for the current subset of interest;
using the computer for refining the current subset of interest of said reduced taxonomy with the combination of one or more taxonomy concepts through Boolean operations; and
using the computer for iteratively repeating said steps of providing a reduced taxonomy for the current subset of interest to further refine said retrieval and of refining the current subset of interest, wherein:
said initial subset of interest includes all the items in the extension of the dynamic taxonomy or a subset of them;
said reduced taxonomy being derived from said taxonomy by using the computer for pruning concepts under which no item in said current subset of interest is classified;
said step of pruning concepts includes eliminating from the taxonomy all the concepts under which no item in the current subset of interest is classified, or preventing said concepts from being displayed, or preventing said concepts from being selected in order to refine interest sets;
said step of providing a reduced taxonomy either reports only the concepts belonging to the reduced taxonomy or, for each such concept also reports how many items in the current interest set are classified under the concept;
said intension is organized as a hierarchy of concepts or as a directed acyclic graph of concepts, thereby allowing a concept to have multiple fathers;
in said extension, there exists at least one item such that said item is classified under at least two different concepts such that each of said two concepts is neither an ancestor nor a descendant of the other concept in the intension.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 30, 32, 33, 34, 39, 40, 41, 44, 47
28. A method for retrieving items from electronic catalogs, for applications including electronic commerce or electronic auctions, wherein retrieval is performed through visual queries on dynamic taxonomies, said dynamic taxonomies being an organization of concepts that ranges from a most general concept to a most specific concept, said concepts and their generalization or specialization relationships being called an intension, said concepts also comprising features including price, items in said electronic catalogs are classified under one or more concepts, said items and their classification being called an extension, said method comprising:
using a computer for providing a taxonomy for said retrieval;
using the computer for operating on a selected subset of interest of said taxonomy in order to refine said retrieval, said selected subset of interest being specified by using the computer for combining selected taxonomy concepts through boolean operations, or being specified through querying methods, said querying methods retrieving classified items according to different selection criteria;
providing a reduced taxonomy for said selected subset of interest, said reduced taxonomy being derived from said taxonomy by using the computer for pruning concepts under which no item in said selected subset of interest is classified; and
using the computer for iteratively repeating said steps of operating on a selected subset of interest and of providing a reduced taxonomy to further refine said retrieval, wherein:
said hierarchical organization of concepts for said electronic catalogs comprises a set of features, each of said features being a descendant concept of the root concept of said organization, each of said features having as descendants in the taxonomy a set of concepts, each concept in said set of concepts representing either a single value or a set of values for said feature;
said items in said electronic catalogs are classified, for each said feature, under zero or more concepts representing either a single value or a set of values for that feature;
said step of providing a reduced taxonomy either reports only the concepts belonging to the reduced taxonomy or, for each such concept also reports how many items in the interest set are classified under the concept;
in said extension, there exists at least one item such that said item is classified under at least two different concepts such that each of said two concepts is neither an ancestor nor a descendant of the other concept in the intension; and
said step of pruning of concepts includes eliminating from the taxonomy the concepts under which no item in the selected subset of interest is classified, or preventing said concepts from being displayed, or preventing such concepts from being selected in order to specify interest sets.
29. A method for retrieving items from electronic catalogs, for applications including electronic commerce or electronic auctions, wherein retrieval is performed through visual queries on dynamic taxonomies, said dynamic taxonomies being an organization of concepts that ranges from a most general concept to a most specific concept, said concepts and their organization being called an intension, said concepts also comprising features including price, items in said electronic catalogs are classified under one or more concepts, said items and their classification being called an extension, said method comprising, given an initial current subset of interest:
using a computer for providing a reduced taxonomy for the subset of interest;
using the computer for refining the current subset of interest of said reduced taxonomy with the combination of one or more taxonomy concepts through Boolean operations; and
using the computer for iteratively repeating said steps of providing a reduced taxonomy for the current subset of interest to further refine said retrieval and of refining the current subset of interest, wherein:
said initial subset of interest includes all the items in the extension of the dynamic taxonomy, or a subset of them;
said reduced taxonomy being derived from said taxonomy by using the computer for pruning concepts under which no item in said current subset of interest is classified;
said step of pruning concepts includes eliminating from the taxonomy all the concepts under which no item in the current subset of interest is classified, or preventing said concepts from being displayed, or preventing said concepts from being selected in order to refine interest sets;
said step of providing a reduced taxonomy either reports only the concepts belonging to the reduced taxonomy or, for each such concept, also reports how many items in the current interest set are classified under the concept;
said hierarchical organization of concepts for said electronic catalogs comprises a set of features, each of said features being a descendant concept of the root concept of said organization, each of said features having as descendants in the taxonomy a set of concepts, each concept in said set of concepts representing either a single value or a set of values for said feature;
said items in said electronic catalogs are classified, for each said feature, under zero or more concepts representing either a single value or a set of values for that feature;
in said extension, there exists at least one item such that said item is classified under at least two different concepts such that each of said two concepts is neither an ancestor nor a descendant of the other concept in the intension.
Dependent claims: 42, 45, 48
31. A method for using a computer for retrieving association rules from databases, said databases being structured or unstructured, said databases being homogeneous or heterogeneous, wherein retrieval is performed through visual queries on dynamic taxonomies, said dynamic taxonomies being an organization of concepts that ranges from a most general concept to a most specific concept, said concepts and their organization being called an intension, items in said databases being classified under one or more concepts, said items and their classification being called an extension, an association rule defining a probabilistic correlation relationship between the antecedent, said antecedent being defined by one concept in the taxonomy or by a boolean combination of concepts in the taxonomy, and the consequent, said consequent being defined by one concept in the taxonomy or by a boolean combination of concepts in the taxonomy, said method comprising, given an initial current subset of interest:
using a computer for providing a reduced taxonomy for the subset of interest;
using the computer for refining the current subset of interest of said reduced taxonomy with the combination of one or more taxonomy concepts through Boolean operations, the entire Boolean combination of concepts that defines the refined subset of interest being called the conceptual focus; and
using the computer for iteratively repeating said steps of providing a reduced taxonomy for the current subset of interest to further refine said retrieval and of refining the current subset of interest, wherein:
said initial subset of interest includes all the items in the extension of the dynamic taxonomy, or a subset of them;
said reduced taxonomy being derived from said taxonomy by using the computer for pruning concepts under which no item in said current subset of interest is classified;
said step of pruning of concepts includes eliminating from the taxonomy the concepts under which no item in the selected subset of interest is classified, or preventing said concepts from being selected in order to specify interest sets;
said intension is organized as a hierarchy of concepts or as a directed acyclic graph of concepts, thereby allowing a concept to have multiple fathers;
for each concept in said reduced taxonomy, two association rules exist, the first association rule having the conceptual focus as the antecedent of said first association rule and having said concept in the reduced taxonomy as the consequent of said first association rule, the second association rule having said concept in the reduced taxonomy as the antecedent of said second association rule and having the conceptual focus as the consequent of said second association rule;
in said step of providing a reduced taxonomy, for an association rule in the reduced taxonomy, a measure of confidence is provided, said measure of confidence being computed as the ratio between the number of items in the intersection of the antecedent and consequent of said association rule over the number of items in the antecedent of said association rule, or said measure is not provided;
in said step of providing a reduced taxonomy, for an association rule in the reduced taxonomy, a measure of support is provided, said support being expressed as the number of items in the intersection of the antecedent and consequent of said association rule over the total number of items, or said measure is not provided;
in said extension, there exists at least one item such that said item is classified under at least two different concepts such that each of said two concepts is neither an ancestor nor a descendant of the other concept in the intension; and
in said step of providing a reduced taxonomy, for an association rule in the reduced taxonomy, a measure of the statistical significance of how the subordinate probability of the consequent of said association rule with respect to the antecedent of said association rule deviates from independence of said consequent and antecedent of said association rule, is provided or said measure is not provided.
Dependent claims: 36, 37, 38, 43, 46, 49
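The confidence and support measures recited in claim 31 reduce to two ratios over item sets. The sketch below computes both for the two rules the claim associates with each concept (focus to concept, and concept to focus). The function name `rule_measures` and the sample sets are hypothetical, and exact fractions are used only to keep the arithmetic easy to check.

```python
from fractions import Fraction


def rule_measures(antecedent, consequent, total_items):
    """Return (confidence, support) for the rule antecedent -> consequent.

    confidence = |antecedent & consequent| / |antecedent|
    support    = |antecedent & consequent| / total number of items
    Confidence is None when the antecedent is empty.
    """
    both = antecedent & consequent
    confidence = Fraction(len(both), len(antecedent)) if antecedent else None
    support = Fraction(len(both), total_items)
    return confidence, support


# Hypothetical item sets: the conceptual focus and one concept of the reduced taxonomy.
focus_items = {1, 2, 3, 4}
concept_items = {3, 4, 5}
TOTAL = 10  # assumed total number of items in the extension

focus_to_concept = rule_measures(focus_items, concept_items, TOTAL)
concept_to_focus = rule_measures(concept_items, focus_items, TOTAL)
```

Here two items lie in both sets, so the first rule has confidence 1/2 and the second 2/3, while both rules share the same support of 1/5.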
35. A method for the statistical comparison of different subsets of a database, said database being structured or unstructured, said database being homogeneous or heterogeneous, said database being described by a dynamic taxonomy, said dynamic taxonomy being an organization of concepts that ranges from a most general concept to a most specific concept, said concepts and their organization being called an intension, items in said databases being classified under one or more concepts, said items and their classification being called an extension, said method comprising:
using a computer for initially providing a view for each of said subsets, said view being a reduced taxonomy derived from the initial taxonomy by setting a specific focus, or selected subset of interest;
using a computer for providing for each concept in each view, a measure of statistical deviation from uniformity for the subset represented by said concept with respect to the same concept in the first view only or in each of the other views including the first view, said measure of deviation only being a raw measure of deviation or including additional measures including the statistical significance of such deviation, said first view being used as a reference view; and
using a computer for combining one or more concepts by Boolean operations in an expression in any of said views, said expression used to refine the selected subset of interest of each view; and
repeating said steps of selecting subsets of interest and showing views, wherein:
said reduced taxonomy being derived from said taxonomy by using the computer for pruning concepts under which no item in said current subset of interest is classified;
said step of pruning of concepts includes eliminating from the taxonomy the concepts under which no item in the selected subset of interest is classified, or preventing said concepts from being selected in order to specify interest sets;
said intension is organized as a hierarchy of concepts or as a directed acyclic graph of concepts, thereby allowing a concept to have multiple fathers; and
in said extension, there exists at least one item such that said item is classified under at least two different concepts such that each of said two concepts is neither an ancestor nor a descendant of the other concept in the intension.
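Claim 35 does not fix a formula for the "raw measure of deviation" between a view and the reference view. One plausible reading, offered strictly as an assumption, is the difference between the share of items under a concept in a view and the share under the same concept in the reference view. The names `share` and `raw_deviation` and the sample counts are hypothetical.

```python
from fractions import Fraction


def share(count, view_size):
    """Fraction of a view's items that are classified under a concept."""
    return Fraction(count, view_size)


def raw_deviation(count_in_view, view_size, count_in_reference, reference_size):
    """Assumed raw deviation: share in the view minus share in the reference view.

    Zero means the concept is exactly as frequent in the view as in the reference.
    """
    return (share(count_in_view, view_size)
            - share(count_in_reference, reference_size))


# Hypothetical counts: 20 of 100 reference items fall under the concept,
# versus 5 of the 10 items in the compared view.
over_represented = raw_deviation(5, 10, 20, 100)
uniform = raw_deviation(2, 10, 20, 100)
```

A positive value indicates that the concept is over-represented in the view relative to the reference. A significance test on top of this raw value would be one of the "additional measures" the claim mentions, and is not sketched here.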
Specification