Identifying Information Related to a Particular Entity from Electronic Sources
First Claim
1. A method for identifying information about a particular entity comprising:
receiving electronic documents selected based on one or more search terms from a plurality of terms related to the particular entity;
determining one or more feature vectors for each received electronic document, wherein each feature vector is determined based on the associated electronic document;
clustering the received electronic documents into a first set of clusters of documents based on the similarity among the determined feature vectors; and
determining a rank for each cluster of documents in the first set of clusters of documents based on one or more ranking terms from the plurality of terms related to the particular entity, wherein the one or more ranking terms contain at least one term from the plurality of terms for the particular entity that is not in the one or more search terms.
Abstract
Presented are systems, apparatuses, articles of manufacture, and methods for identifying information about a particular entity, including: receiving electronic documents selected based on one or more search terms from a plurality of terms related to the particular entity; determining one or more feature vectors for each received electronic document, where each feature vector is determined based on the associated electronic document; clustering the received electronic documents into a first set of clusters of documents based on the similarity among the determined feature vectors; and determining a rank for each cluster of documents in the first set of clusters of documents based on one or more ranking terms from the plurality of terms related to the particular entity, where the one or more ranking terms contain at least one term from the plurality of terms for the particular entity that is not in the one or more search terms.
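The steps summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: bag-of-words counts stand in for the feature vectors, cosine similarity with a greedy single-pass grouping stands in for the clustering, and a simple count of ranking-term occurrences stands in for the rank. The function names, the `threshold` value, and the sample documents are hypothetical.

```python
import math
from collections import Counter


def feature_vector(text):
    """Bag-of-words term counts (an assumed feature choice, not the patent's)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_documents(docs, threshold=0.5):
    """Greedy single-pass clustering: a document joins the first cluster
    whose seed (first) document is at least `threshold` similar to it."""
    clusters = []  # each entry: list of (document, vector) pairs
    for doc in docs:
        vec = feature_vector(doc)
        for members in clusters:
            if cosine(vec, members[0][1]) >= threshold:
                members.append((doc, vec))
                break
        else:
            clusters.append([(doc, vec)])
    return [[doc for doc, _ in members] for members in clusters]


def rank_clusters(clusters, ranking_terms, search_terms):
    """Order clusters by how often the ranking terms occur in them."""
    # The claim requires at least one ranking term that is not a search term.
    if not set(ranking_terms) - set(search_terms):
        raise ValueError("ranking terms must include a term not used for search")

    def score(cluster):
        return sum(
            feature_vector(doc)[term.lower()]
            for doc in cluster
            for term in ranking_terms
        )

    return sorted(clusters, key=score, reverse=True)


# Hypothetical documents, as if already retrieved with the search terms.
search_terms = ["jordan", "smith"]
ranking_terms = ["jordan", "seattle"]  # "seattle" is not a search term
docs = [
    "jordan smith wins chess tournament in boston",
    "jordan smith seattle architect designs seattle library",
    "architect jordan smith seattle office opens",
    "boston chess club honors jordan smith",
]
clusters = cluster_documents(docs)
ranked = rank_clusters(clusters, ranking_terms, search_terms)
```

In this sketch the extra ranking term is what separates two people who share a name: both clusters match the search terms equally well, but only one mentions "seattle", so it is ranked first.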
45 Claims
1. A method for identifying information about a particular entity comprising:
receiving electronic documents selected based on one or more search terms from a plurality of terms related to the particular entity;
determining one or more feature vectors for each received electronic document, wherein each feature vector is determined based on the associated electronic document;
clustering the received electronic documents into a first set of clusters of documents based on the similarity among the determined feature vectors; and
determining a rank for each cluster of documents in the first set of clusters of documents based on one or more ranking terms from the plurality of terms related to the particular entity, wherein the one or more ranking terms contain at least one term from the plurality of terms for the particular entity that is not in the one or more search terms.
Dependent claims: 2-14.
15. A system for identifying information about a particular entity comprising:
a harvesting module configured to receive electronic documents selected based on one or more search terms from a plurality of terms related to the particular entity;
a feature extracting module configured to determine one or more feature vectors associated with each received electronic document, wherein each feature vector is determined based on the associated electronic document;
a clustering module configured to cluster the received electronic documents into a first set of clusters of documents based on the similarity among the determined feature vectors; and
a ranking module configured to determine a rank for each cluster of documents in the first set of clusters of documents based on one or more ranking terms from the plurality of terms related to the particular entity, wherein the one or more ranking terms contain at least one term from the plurality of terms for the particular entity that is not in the one or more search terms.
Dependent claims: 16-30.
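The four modules recited in claim 15 could be composed roughly as follows. This is only a sketch under stated assumptions: the class `EntitySearchSystem`, its field names, the in-memory `CORPUS`, and the three stand-in functions are all hypothetical, and the patent does not prescribe any of these interfaces. The point is only to show the modules as separately replaceable parts.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Vector = Dict[str, int]


@dataclass
class EntitySearchSystem:
    """Hypothetical wiring of the four claimed modules as pluggable callables."""
    harvest: Callable[[Sequence[str]], List[str]]
    extract: Callable[[str], Vector]
    cluster: Callable[[List[str], List[Vector]], List[List[str]]]
    rank: Callable[[List[List[str]], Sequence[str]], List[List[str]]]

    def run(self, search_terms, ranking_terms):
        # At least one ranking term must not be among the search terms.
        if not set(ranking_terms) - set(search_terms):
            raise ValueError("need a ranking term that is not a search term")
        docs = self.harvest(search_terms)
        vectors = [self.extract(doc) for doc in docs]
        clusters = self.cluster(docs, vectors)
        return self.rank(clusters, ranking_terms)


# Stand-in modules over a tiny in-memory corpus (illustrative data only).
CORPUS = [
    "jordan smith chess champion boston",
    "jordan smith architect seattle",
    "jordan smith seattle architect award",
    "weather report boston",
]


def harvest(search_terms):
    """Return documents containing every search term."""
    return [d for d in CORPUS if all(t in d.split() for t in search_terms)]


def extract(doc):
    """Word counts as the feature vector."""
    return Counter(doc.split())


def overlap_cluster(docs, vectors, min_shared=3):
    """Group a document with the first cluster whose seed shares enough terms."""
    clusters = []  # each entry: (seed term set, member documents)
    for doc, vec in zip(docs, vectors):
        for seed, members in clusters:
            if len(seed & set(vec)) >= min_shared:
                members.append(doc)
                break
        else:
            clusters.append((set(vec), [doc]))
    return [members for _, members in clusters]


def count_rank(clusters, ranking_terms):
    """Sort clusters by total occurrences of the ranking terms."""
    def score(cluster):
        return sum(d.split().count(t) for d in cluster for t in ranking_terms)
    return sorted(clusters, key=score, reverse=True)


system = EntitySearchSystem(harvest, extract, overlap_cluster, count_rank)
result = system.run(["jordan", "smith"], ["seattle"])
```

Keeping each module behind a plain callable means any one of them, for example the clustering step, can be swapped for a different technique without touching the others.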
31. A computer readable medium including instructions that, when executed, cause a computer to perform a method for identifying information about a particular entity, the method comprising:
receiving electronic documents selected based on one or more search terms from a plurality of terms related to the particular entity;
determining one or more feature vectors for each received electronic document, wherein each feature vector is determined based on the associated electronic document;
clustering the received electronic documents into a first set of clusters of documents based on the similarity among the determined feature vectors; and
determining a rank for each cluster of documents in the first set of clusters of documents based on one or more ranking terms from the plurality of terms related to the particular entity, wherein the one or more ranking terms contain at least one term from the plurality of terms for the particular entity that is not in the one or more search terms.
Dependent claims: 32-44.
45. An apparatus for identifying information about a particular entity comprising:
means for receiving electronic documents selected based on one or more search terms from a plurality of terms related to the particular entity;
means for determining one or more feature vectors for each received electronic document, wherein each feature vector is determined based on the associated electronic document;
means for clustering the received electronic documents into a first set of clusters of documents based on the similarity among the determined feature vectors; and
means for determining a rank for each cluster of documents in the first set of clusters of documents based on one or more ranking terms from the plurality of terms related to the particular entity, wherein the one or more ranking terms contain at least one term from the plurality of terms for the particular entity that is not in the one or more search terms.
Specification