System and method for cross-document coreference
Abstract
A method for coreferencing a plurality of documents includes the steps of providing a name list for names extracted from documents to be coreferenced upon entry of a query by a user, sorting the names of the list of names into mergable names and exclusive sets, comparing contexts of the mergable names against the exclusive sets to merge the mergable names to the exclusive sets exceeding a predetermined threshold to form an aggregated cross-document name list, and referencing the aggregated cross-document name list to provide the user with coreferenced names across the plurality of documents which refer to a same entity in accordance with the query.
25 Claims
1. A method for coreferencing a plurality of documents comprising the steps of:
providing a name list for names extracted from documents to be coreferenced prior to or upon entry of a query by a user;
placing each name of the name list into canonical form;
sorting the names of the name list into mergable names and exclusive sets;
comparing contexts of the mergable names against the exclusive sets to merge the mergable names to the exclusive sets which exceed a predetermined threshold to form an aggregated cross-document name list; and
referencing the aggregated cross-document name list to provide the user with coreferenced names across the plurality of documents which refer to a same entity in accordance with the query.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
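The canonical-form and sorting steps of claim 1 can be illustrated with a minimal Python sketch. The claim text does not define what the canonical form is or what makes a name "mergable" or "exclusive", so the rules here are assumptions made only for illustration: canonical form lowercases the name, strips punctuation, and drops a small hypothetical list of honorifics (`TITLES`); a multi-word canonical name seeds an exclusive set, and a single-word name is treated as mergable. The function names `canonical_form` and `sort_names` are likewise hypothetical.

```python
import re
from collections import defaultdict

# Assumption: honorifics to strip. The claim does not list any.
TITLES = {"mr", "mrs", "ms", "dr", "president", "sen"}


def canonical_form(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop assumed honorifics."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in TITLES)


def sort_names(names):
    """Split a name list into exclusive sets and mergable names.

    Assumption: a canonical name with more than one token is specific
    enough to seed its own exclusive set; a single-token name is
    ambiguous and is kept aside as mergable.
    """
    exclusive = defaultdict(set)  # canonical name -> surface variants
    mergable = defaultdict(set)   # canonical name -> surface variants
    for name in names:
        canon = canonical_form(name)
        if not canon:
            continue  # nothing left after stripping honorifics
        target = exclusive if len(canon.split()) > 1 else mergable
        target[canon].add(name)
    return dict(exclusive), dict(mergable)


names = [
    "President Bill Clinton",
    "Bill Clinton",
    "Hillary Clinton",
    "Mr. Clinton",
    "Clinton",
]
exclusive, mergable = sort_names(names)
print(exclusive)
print(mergable)
```

Under these assumed rules, "President Bill Clinton" and "Bill Clinton" share one exclusive set, "Hillary Clinton" forms another, and the bare surname is left as a mergable name to be resolved by the context comparison step.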
9. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for coreferencing a plurality of documents, the method steps comprising:
providing a name list for names extracted from documents to be coreferenced;
placing each name of the name list into canonical form;
sorting the names of the name list into mergable names and exclusive sets; and
comparing contexts of the mergable names against the exclusive sets to merge the mergable names to the exclusive sets which exceed a predetermined threshold to form an aggregated cross-document name list for the names in the name list.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
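The context comparison and threshold merge recited in claim 9 might look like the following sketch. The claim does not say how contexts are represented or scored, so several choices here are assumptions: each name's context is a bag of words (`Counter`), similarity is cosine similarity, a mergable name joins only the single best-scoring exclusive set, and the threshold value 0.3 is arbitrary. `cosine` and `merge_names` are hypothetical helper names.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words contexts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def merge_names(mergable, exclusive, threshold):
    """Merge each mergable name into the best exclusive set above threshold.

    mergable, exclusive: dicts mapping a canonical name to its context.
    Returns the aggregated list as a dict: set name -> member names.
    Assumption: a name that clears the threshold for no set stays unmerged.
    """
    aggregated = {name: [name] for name in exclusive}
    for name, ctx in mergable.items():
        best, best_score = None, threshold
        for ex_name, ex_ctx in exclusive.items():
            score = cosine(ctx, ex_ctx)
            if score > best_score:  # must exceed the threshold
                best, best_score = ex_name, score
        if best is not None:
            aggregated[best].append(name)
        else:
            aggregated[name] = [name]
    return aggregated


exclusive = {
    "bill clinton": Counter("president white house veto".split()),
    "hillary clinton": Counter("senator new york campaign".split()),
}
mergable = {
    "clinton": Counter("president veto congress".split()),
    "smith": Counter("quarterback touchdown".split()),
}
aggregated = merge_names(mergable, exclusive, threshold=0.3)
print(aggregated)
```

Here the context of "clinton" shares two terms with the "bill clinton" context and none with "hillary clinton", so it merges into the former; "smith" overlaps with neither set and remains on its own.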
18. A method for searching a plurality of documents for an entity having a plurality of variant names comprising the steps of:
providing a name list for names extracted from documents to be coreferenced prior to or upon entry of a search query by a user including a name of the entity;
placing each name of the name list into canonical form;
sorting the names of the name list into mergable names and exclusive sets;
comparing contexts of the mergable names against the exclusive sets to merge the mergable names to the exclusive sets which exceed a predetermined threshold to form an aggregated cross-document name list, the aggregated cross-document name list including a list of variant names for the entity; and
providing a list of documents to the user referencing the variant names and the name of the entity used for the search query.
Dependent claims: 19, 20, 21, 22, 23, 24, 25
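The final step of claim 18, returning documents that reference any variant of the queried entity, can be sketched as a lookup over an already-built aggregated list. The data layout is an assumption: each entity maps to a set of `(variant name, document id)` mentions, and the query is matched against variants by case-insensitive equality. The function name `search` and the sample document ids are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Mention = Tuple[str, str]  # (variant name, document id)


def search(query: str, aggregated: Dict[str, Set[Mention]]) -> Dict[str, List[str]]:
    """Return, for each entity with a variant equal to the query,
    the sorted ids of all documents that mention that entity."""
    q = query.strip().lower()
    hits: Dict[str, List[str]] = {}
    for entity, mentions in aggregated.items():
        if any(variant.lower() == q for variant, _ in mentions):
            hits[entity] = sorted({doc_id for _, doc_id in mentions})
    return hits


# Assumed output of the earlier merge step, with per-document mentions.
aggregated = {
    "bill clinton": {
        ("Bill Clinton", "doc1"),
        ("President Clinton", "doc2"),
        ("Clinton", "doc3"),
    },
    "hillary clinton": {
        ("Hillary Clinton", "doc4"),
        ("Clinton", "doc5"),
    },
}

print(search("President Clinton", aggregated))
print(search("Clinton", aggregated))
```

Because the lookup goes through the aggregated list rather than raw string matching, a query for "President Clinton" also returns the documents that say only "Bill Clinton" or "Clinton" for the same entity, while an ambiguous query such as "Clinton" returns a separate document list per entity.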
Specification