Method and system for information retrieval with clustering
First Claim
1. A computer program product, residing on a computer-readable medium, for use in accessing information associated with a collection of items, the computer program product comprising instructions for causing a computer to:
access a plurality of items and a first plurality of properties, each item being associated with one or more properties from the first plurality of properties;
obtain an original result set of at least one item from the plurality of items in response to a search query;
identify a second plurality of properties from the first plurality of properties, wherein each of the second plurality of properties is associated with at least one item in the original result set;
group a third plurality of properties selected from the second plurality of properties into one or more clusters of properties by applying a similarity measure to assign more similar properties to the same cluster and less similar properties to distinct clusters, wherein at least one cluster includes two or more properties; and
provide a response to the search query including a representation of at least one cluster of the one or more clusters of properties.
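The steps recited in the claim can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method disclosed in the specification: the toy collection `ITEMS`, the exact-match `search` stand-in, the choice of Jaccard overlap as the similarity measure, the single-link grouping, the 0.5 threshold, and every function name are hypothetical. The "third plurality" is taken here to be the result-set properties minus the query term itself.

```python
from itertools import combinations

# Hypothetical toy collection: item id -> set of associated properties.
ITEMS = {
    "doc1": {"merlot", "red", "california"},
    "doc2": {"merlot", "red", "france"},
    "doc3": {"chardonnay", "white", "california"},
    "doc4": {"chardonnay", "white", "france"},
    "doc5": {"stout", "beer"},
}


def search(items, query_term):
    """Stand-in search: an item matches if it carries the query term as a property."""
    return {item_id for item_id, props in items.items() if query_term in props}


def properties_of(items, result_set):
    """Second plurality: properties associated with at least one result item."""
    return set().union(*(items[i] for i in result_set))


def jaccard(a, b):
    """Assumed similarity measure: overlap of the result items carrying each property."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_properties(items, result_set, props, threshold=0.5):
    """Single-link grouping: sufficiently similar properties end up in one cluster."""
    postings = {p: {i for i in result_set if p in items[i]} for p in props}
    parent = {p: p for p in props}

    def find(p):
        # Union-find root lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p, q in combinations(sorted(props), 2):
        if jaccard(postings[p], postings[q]) >= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for p in props:
        clusters.setdefault(find(p), set()).add(p)
    return sorted(sorted(c) for c in clusters.values())


def respond(items, query_term, threshold=0.5):
    """Run the claimed steps end to end and return results plus property clusters."""
    result_set = search(items, query_term)
    second = properties_of(items, result_set)
    third = second - {query_term}  # assumed selection rule for the third plurality
    clusters = cluster_properties(items, result_set, third, threshold)
    return {"results": sorted(result_set), "clusters": clusters}


print(respond(ITEMS, "california"))
```

In this toy data, a query for "california" groups properties that co-occur on the same result items, so grape and color properties of the same wine fall into one cluster. Any other similarity measure or clustering procedure could be substituted without changing the overall flow.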
Abstract
Methods and systems that enable searching with clustering in information access systems are described. The methods of clustering operate on a collection of materials wherein each item in the collection may be associated with one or more properties. An original subset of materials is selected from the collection and relevant properties associated with the subset of materials are clustered into property clusters. Each property cluster generally contains properties that are more similar to each other than to properties in a different property cluster. The property clusters can be used to respond to the query. A mapping function can be used to identify a set of materials that correspond to each property cluster based on the associations between individual items and properties. The property clusters can also be used for iterative query refinement.
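The mapping function and the iterative refinement mentioned in the abstract can be illustrated with a short Python sketch. It is a hedged example only: the abstract does not say whether an item must match any or all properties of a cluster, so the `mode` parameter, the toy `ITEMS` data, and the names `items_for_cluster` and `refine` are assumptions introduced for illustration.

```python
# Hypothetical toy collection: item id -> set of associated properties.
ITEMS = {
    "doc1": {"merlot", "red", "california"},
    "doc2": {"merlot", "red", "france"},
    "doc3": {"chardonnay", "white", "california"},
    "doc4": {"chardonnay", "white", "france"},
    "doc5": {"stout", "beer"},
}


def items_for_cluster(items, cluster, candidates=None, mode="any"):
    """Mapping function: items associated with any (or all) properties in a cluster.

    `candidates` optionally restricts the lookup to a subset of item ids.
    """
    pool = items.keys() if candidates is None else candidates
    test = any if mode == "any" else all
    return {i for i in pool if test(p in items[i] for p in cluster)}


def refine(items, result_set, cluster):
    """One refinement step: narrow the current result set to the chosen cluster."""
    return items_for_cluster(items, cluster, candidates=result_set)


# Start from a broad result set, then narrow it one selected cluster at a time.
wine_results = {"doc1", "doc2", "doc3", "doc4"}
step1 = refine(ITEMS, wine_results, ["merlot", "red"])
step2 = refine(ITEMS, step1, ["california"])
print(sorted(step1), sorted(step2))
```

Each refinement step feeds the narrowed result set back in as the new starting point, which is one simple way to read "iterative query refinement"; a real system would typically recompute the property clusters after every step.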
66 Claims
1. A computer program product, residing on a computer-readable medium, for use in accessing information associated with a collection of items, the computer program product comprising instructions for causing a computer to:
access a plurality of items and a first plurality of properties, each item being associated with one or more properties from the first plurality of properties;
obtain an original result set of at least one item from the plurality of items in response to a search query;
identify a second plurality of properties from the first plurality of properties, wherein each of the second plurality of properties is associated with at least one item in the original result set;
group a third plurality of properties selected from the second plurality of properties into one or more clusters of properties by applying a similarity measure to assign more similar properties to the same cluster and less similar properties to distinct clusters, wherein at least one cluster includes two or more properties; and
provide a response to the search query including a representation of at least one cluster of the one or more clusters of properties.
Dependent claims: 2 to 32.
33. A computer-implemented method for accessing information associated with a database including a plurality of items and a first plurality of properties, each item being associated with one or more properties from the first plurality of properties, comprising:
obtaining a result set of items from the plurality of items for the search query;
identifying a second plurality of properties from the plurality of properties, wherein each property in the second plurality of properties is associated with at least one item in the result set;
grouping a third plurality of properties selected from the second plurality of properties into one or more clusters of properties by applying a similarity measure to assign more similar properties to the same cluster and less similar properties to distinct clusters, wherein at least one cluster includes two or more properties; and
providing a response to the search query including at least one cluster of properties.
Dependent claims: 34 to 64.
65. A computer program product, residing on a computer-readable medium, for use in accessing information associated with a collection of items, the computer program product comprising instructions for causing a computer to:
access a plurality of items and a first plurality of properties, wherein the properties are information-bearing values, each item being associated with one or more properties from the first plurality of properties, wherein the one or more properties associated with an item are contained in or describe the item;
obtain an original result set of at least one item from the plurality of items in response to a search query;
identify a second plurality of properties from the first plurality of properties, wherein each of the second plurality of properties is associated with at least one item in the original result set;
group a third plurality of properties selected from the second plurality of properties into one or more clusters of properties by applying a similarity measure to assign more similar properties to the same cluster and less similar properties to distinct clusters, wherein at least one cluster includes two or more properties; and
provide a response to the search query including a representation of at least one cluster of the one or more clusters of properties.
66. An information access system, comprising:
a first stored collection including a plurality of materials;
a second stored collection including a first plurality of properties, wherein each item in the plurality of materials is associated with at least one property from the first plurality of properties;
logic to perform a search against the first stored collection to obtain a result set of materials from the collection that match the search query;
logic to derive a second plurality of properties from the second stored collection wherein each property in the second plurality of properties is associated with at least one item in the result set;
logic to group a third plurality of properties into one or more clusters of properties by applying a similarity measure to assign more similar properties to the same cluster and less similar properties to distinct clusters, wherein at least one cluster includes two or more properties; and
logic to provide a response to the search query including at least one cluster of the one or more clusters of properties.
Specification