Clustered Information Processing and Searching with Structured-Unstructured Database Bridge
Abstract
Systems and methods for indexing information and for performing searches are disclosed. In these systems and methods, information is “ingested” by clustering it with a clustering algorithm such as k-means or k-medoids. During clustering, a hybrid distance measurement is used that allows the systems and methods to determine similarity across a number of different types of information. Once clustered, the information is stored and “mirrored” in both a structured (e.g., relational) data repository and an unstructured data repository. Methods according to the invention allow retrieval of both direct search results and search results that include related concepts. After the clustered information is stored, later searches are performed against whichever data repository is most appropriate for the context.
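As a rough illustration of the ingestion step described in the abstract, the sketch below clusters a few resources with a basic k-medoids loop driven by a hybrid distance. The abstract does not define the hybrid distance, so the form used here (a weighted mix of a scaled numeric difference and a Jaccard distance over tag sets) is purely an assumption, as are the `Resource` type, the `weight` and `scale` parameters, and the sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical resource with one numeric and one set-valued attribute."""
    name: str
    value: float
    tags: frozenset


def hybrid_distance(a, b, weight=0.5, scale=100.0):
    """Assumed hybrid form: weighted mix of numeric and Jaccard distances."""
    numeric = min(abs(a.value - b.value) / scale, 1.0)
    union = a.tags | b.tags
    jaccard = 1.0 - len(a.tags & b.tags) / len(union) if union else 0.0
    return weight * numeric + (1.0 - weight) * jaccard


def k_medoids(items, k, distance, max_iter=20):
    """Basic alternating k-medoids: assign to nearest medoid, then re-pick."""
    medoids = list(items[:k])  # deterministic seeding, for the sketch only
    clusters = {}
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for item in items:
            nearest = min(medoids, key=lambda m: distance(item, m))
            clusters[nearest].append(item)
        # The new medoid of each cluster minimizes total distance to members.
        new_medoids = [
            min(members, key=lambda c: sum(distance(c, o) for o in members))
            for members in clusters.values()
            if members
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters


items = [
    Resource("a", 10, frozenset({"red", "fruit"})),
    Resource("b", 90, frozenset({"blue", "car"})),
    Resource("c", 12, frozenset({"red", "fruit", "sweet"})),
    Resource("d", 88, frozenset({"blue", "car", "fast"})),
]
clusters = k_medoids(items, k=2, distance=hybrid_distance)
for medoid, members in clusters.items():
    print(medoid.name, [r.name for r in members])
```

Because k-medoids only needs pairwise distances, it accepts a mixed-type measure like this one directly; k-means would additionally require a way to average resources.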
22 Claims
1. A method for indexing and classifying related information, comprising:
- using a computing system, measuring the similarity of or distance between a plurality of individual resources using a hybrid distance measurement;
- clustering the plurality of individual resources into a plurality of clusters using the hybrid distance measurement; and
- storing the plurality of clusters in both a structured and an unstructured data repository on the computing system or another computing system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
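The storing step of claim 1 can be pictured with a minimal sketch that writes each cluster to two places at once. Here an in-memory SQLite table stands in for the structured repository and a plain dictionary of JSON strings stands in for the unstructured repository; both stand-ins, the table and key names, and the sample clusters are assumptions made for illustration, not details taken from the claims.

```python
import json
import sqlite3


def store_clusters(clusters, conn, doc_store):
    """Mirror each cluster into a relational table and a JSON document store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cluster_members ("
        "cluster_id INTEGER, resource TEXT)"
    )
    for cluster_id, members in enumerate(clusters):
        # Structured copy: one row per (cluster, member) pair.
        conn.executemany(
            "INSERT INTO cluster_members (cluster_id, resource) VALUES (?, ?)",
            [(cluster_id, member) for member in members],
        )
        # Unstructured copy: one JSON document per cluster.
        doc_store[f"cluster:{cluster_id}"] = json.dumps(
            {"cluster_id": cluster_id, "members": list(members)}
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
doc_store = {}
store_clusters([["apple", "pear"], ["sedan", "truck"]], conn, doc_store)

rows = conn.execute(
    "SELECT resource FROM cluster_members WHERE cluster_id = 0 ORDER BY resource"
).fetchall()
print(rows)
print(doc_store["cluster:1"])
```

Writing both copies in the same function keeps the two repositories holding essentially the same clustered information, which is what lets a later query be sent to either one.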
9. A method for retrieving data, comprising:
- receiving a query at a first computing system;
- automatically directing the query to either a structured results repository or an unstructured results repository, the structured and unstructured results repositories being associated with the first computing system or other computing systems and containing essentially the same information clustered into one or more clusters to reflect relationships between data elements in the one or more clusters;
- searching the structured or the unstructured results repositories in accordance with said automatically directing; and
- returning a result set to the query including at least a portion of the contents of at least one of the clusters.
Dependent claims: 10, 11, 12, 13
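The retrieval steps of claim 9 can be sketched as a small router. The claim does not say how the query is directed, so the rule used here (a `field = value` query goes to the structured repository, anything else is treated as free text for the unstructured one) is only an assumed heuristic. The sample rows, documents, and function names are likewise hypothetical.

```python
import re

# Both stand-in repositories hold the same two clusters (hypothetical data).
STRUCTURED_ROWS = [
    {"cluster_id": 0, "resource": "apple", "category": "fruit"},
    {"cluster_id": 0, "resource": "pear", "category": "fruit"},
    {"cluster_id": 1, "resource": "sedan", "category": "vehicle"},
]
UNSTRUCTURED_DOCS = [
    {"cluster_id": 0, "members": ["apple", "pear"],
     "text": "apple and pear are sweet fruit"},
    {"cluster_id": 1, "members": ["sedan"],
     "text": "a sedan is a passenger vehicle"},
]

FIELD_QUERY = re.compile(r"^(\w+)\s*=\s*(\w+)$")


def route(query):
    """Assumed rule: 'field = value' is structured, free text is unstructured."""
    return "structured" if FIELD_QUERY.match(query.strip()) else "unstructured"


def search(query):
    """Search the chosen repository and return members of every matching cluster."""
    target = route(query)
    if target == "structured":
        field, value = FIELD_QUERY.match(query.strip()).groups()
        hit_ids = {
            row["cluster_id"]
            for row in STRUCTURED_ROWS
            if str(row.get(field)) == value
        }
        members = sorted(
            row["resource"]
            for row in STRUCTURED_ROWS
            if row["cluster_id"] in hit_ids
        )
    else:
        terms = set(query.lower().split())
        members = sorted(
            member
            for doc in UNSTRUCTURED_DOCS
            if terms & set(doc["text"].lower().split())
            for member in doc["members"]
        )
    return target, members


print(search("category = fruit"))
print(search("sweet pear"))
```

In the free-text case the result includes every member of the matching cluster, not only the resource named in the query, which is one way to read "at least a portion of the contents of at least one of the clusters".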
14. A system for information storage and retrieval, comprising:
- a search and information retrieval computing system, the search and information retrieval computing system having associated with it an unstructured results repository and a structured results repository, the unstructured and structured results repositories containing essentially the same information clustered into one or more clusters to reflect relationships between data elements in the one or more clusters;
- an information clustering and storage routine running on the search and information retrieval computing system that processes new data received at the search and information retrieval computing system, the information clustering and storage routine determining a hybrid distance measurement for the new data, clustering the new data into one or more clusters based on the hybrid distance measurement, and storing the one or more clusters in the structured and unstructured results repositories; and
- an information retrieval routine running on the search and information retrieval computing system that processes a query received at the search and information retrieval computing system, determines whether the query is best searched using the structured or the unstructured results repository, executes a search of either the structured or the unstructured results repository based on the determination, and returns results including at least a portion of the contents of at least one of the clusters.
Dependent claims: 15, 16, 17
18. A method for indexing, searching, and retrieving information, comprising:
- using a first computing system, measuring the similarity of or distance between a plurality of individual resources using a hybrid distance measurement;
- clustering the plurality of individual resources into a plurality of clusters using the hybrid distance measurement;
- storing the plurality of clusters in both a structured and an unstructured data repository on the computing system or another computing system;
- receiving a query at the first computing system or a second computing system in communication with a first computing system;
- automatically directing the query to either the structured repository or an unstructured repository;
- searching the structured or the unstructured repositories in accordance with said automatically directing; and
- returning a result set to the query including at least a portion of the contents of at least one of the clusters.
Dependent claims: 19, 20, 21, 22
Specification