Entity Assessment and Ranking
First Claim
1. In a processing device in communication with a document repository, a method for assessing entities, the method comprising:
retrieving a first set of documents from the document repository based on a query, the first set of documents having first metadata values corresponding to a plurality of metadata attributes;
characterizing the first set of documents based on the first metadata values to provide a first document set characterization;
determining at least one candidate entity based on the first set of documents;
for each of the at least one candidate entity, retrieving a second set of documents from the document repository based on the query and the candidate entity, the second set of documents having second metadata values corresponding to the plurality of metadata attributes;
for each of the at least one candidate entity, characterizing the second set of documents based on the second metadata values to provide a second document set characterization; and
for each of the at least one candidate entity, comparing the second document set characterization with the first document set characterization to determine a corresponding degree of similarity between the first document set characterization and the second document set characterization.
Abstract
General entity retrieval and ranking is described. A first set of documents is retrieved from one or more document repositories based on a query formed according to a topic. The first set of documents is characterized based on its first set of metadata values. One or more candidate entities are identified based on the first set of documents, and the original query is thereafter augmented according to a candidate entity. The second set of documents resulting from the augmented query is then characterized in a similar manner. For each candidate entity, the first and second document set characterizations are compared to determine their degree of similarity. Increasingly similar document set characterizations indicate that the candidate entity is increasingly relevant to the original query. Repeating this process for each of the one or more candidate entities can give rise to rankings according to the respective degrees of similarity.
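The process summarized in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent text: the in-memory `DOCS` repository, the substring-based `retrieve` function, the use of relative frequencies of (attribute, value) pairs as the document set characterization, and cosine similarity as the degree of similarity. The abstract does not state which characterization or similarity measure is used.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy repository: text, metadata values, and entity mentions.
DOCS = [
    {"text": "solar panel efficiency", "meta": {"year": 2020, "type": "paper"},
     "entities": ["Acme"]},
    {"text": "solar panel cost", "meta": {"year": 2020, "type": "paper"},
     "entities": ["Acme", "Borealis"]},
    {"text": "solar panel recycling", "meta": {"year": 2021, "type": "news"},
     "entities": ["Acme"]},
    {"text": "solar panel subsidy", "meta": {"year": 2021, "type": "news"},
     "entities": ["Acme"]},
    {"text": "wind turbine noise", "meta": {"year": 2019, "type": "patent"},
     "entities": ["Borealis"]},
]

def retrieve(docs, query, entity=None):
    """Substring retrieval; passing an entity augments the query with it."""
    hits = [d for d in docs if query in d["text"]]
    if entity is not None:
        hits = [d for d in hits if entity in d["entities"]]
    return hits

def characterize(docs):
    """Assumed characterization: relative frequency of each (attribute, value)."""
    counts = Counter((a, v) for d in docs for a, v in d["meta"].items())
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()} if total else {}

def cosine(p, q):
    """Assumed similarity measure between two characterizations."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def rank_entities(docs, query):
    """Rank candidate entities by similarity of their document sets to the first set."""
    first = retrieve(docs, query)
    first_char = characterize(first)
    candidates = {e for d in first for e in d["entities"]}
    scores = {
        e: cosine(characterize(retrieve(docs, query, e)), first_char)
        for e in candidates
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_entities(DOCS, "solar")
```

With this toy data, `ranking` places "Acme" above "Borealis": the documents retrieved for "solar" plus "Acme" have the same metadata profile as the full "solar" result set, while the single "Borealis" document covers only part of it.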
18 Claims
1. In a processing device in communication with a document repository, a method for assessing entities, the method comprising:
retrieving a first set of documents from the document repository based on a query, the first set of documents having first metadata values corresponding to a plurality of metadata attributes;
characterizing the first set of documents based on the first metadata values to provide a first document set characterization;
determining at least one candidate entity based on the first set of documents;
for each of the at least one candidate entity, retrieving a second set of documents from the document repository based on the query and the candidate entity, the second set of documents having second metadata values corresponding to the plurality of metadata attributes;
for each of the at least one candidate entity, characterizing the second set of documents based on the second metadata values to provide a second document set characterization; and
for each of the at least one candidate entity, comparing the second document set characterization with the first document set characterization to determine a corresponding degree of similarity between the first document set characterization and the second document set characterization.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. An apparatus comprising:
at least one processor; and
at least one storage device comprising instructions that, when executed, cause the at least one processor to:
retrieve a first set of documents from a document repository based on a query, the first set of documents having first metadata values corresponding to a plurality of metadata attributes;
characterize the first set of documents based on the first metadata values to provide a first document set characterization;
determine at least one candidate entity based on the first set of documents;
for each of the at least one candidate entity, retrieve a second set of documents from the document repository based on the query and the candidate entity, the second set of documents having second metadata values corresponding to the plurality of metadata attributes;
for each of the at least one candidate entity, characterize the second set of documents based on the second metadata values to provide a second document set characterization; and
for each of the at least one candidate entity, compare the second document set characterization with the first document set characterization to determine a corresponding degree of similarity between the first document set characterization and the second document set characterization.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification