EXTRACTING INFORMATION ABOUT REFERENCES TO ENTITIES FROM A PLURALITY OF ELECTRONIC DOCUMENTS
Abstract
The present invention provides a method and system of extracting information about references to entities from a plurality of electronic documents. In an exemplary embodiment, the method and system include (1) applying at least one document quality measure to each of the plurality of electronic documents, (2) recognizing the references to entities in the plurality of electronic documents, (3) using at least one reference quality measure for each of the references to entities, (4) computing at least one topical category associated with each of the references to entities, (5) finding at least one co-occurring term associated with each of the references to entities, and (6) characterizing each of the references to entities by at least one characteristic category.
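The six steps listed in the abstract can be read as a small pipeline. The following is a minimal sketch under stated assumptions, not the patented implementation: the entity list, topic keywords, sentiment word lists, quality formulas, and all function and field names are hypothetical stand-ins chosen only to make each step concrete.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical lookup tables; the source does not specify any of these.
ENTITIES = {"Acme Corp", "Globex"}
TOPIC_KEYWORDS = {"finance": {"shares", "profit"}, "legal": {"lawsuit", "court"}}
POSITIVE = {"rose", "strong", "growth"}
NEGATIVE = {"fell", "lawsuit", "loss"}
STOPWORDS = {"the", "a", "of", "and", "in", "its", "after", "as", "by"}


@dataclass
class Reference:
    entity: str
    doc_id: int
    quality: float        # step 3: reference quality measure
    topics: list          # step 4: topical categories
    cooccurring: list     # step 5: co-occurring terms
    characteristic: str   # step 6: characteristic category


def tokens(text):
    """Lowercased alphabetic tokens of a document."""
    return re.findall(r"[a-z]+", text.lower())


def document_quality(text):
    """Step 1 (assumed measure): share of distinct tokens, which penalizes repetition."""
    toks = tokens(text)
    return len(set(toks)) / len(toks) if toks else 0.0


def extract(documents, min_doc_quality=0.5):
    refs = []
    for doc_id, text in enumerate(documents):
        # Step 1: apply the document quality measure and skip weak documents.
        if document_quality(text) < min_doc_quality:
            continue
        toks = tokens(text)
        tok_set = set(toks)
        for entity in sorted(ENTITIES):
            # Step 2: recognize references by exact string match (an assumption).
            if entity not in text:
                continue
            # Step 3 (assumed measure): mentions in the first sentence score higher.
            quality = 1.0 if entity in text.split(".")[0] else 0.5
            # Step 4: topics whose keywords appear in the document.
            topics = sorted(t for t, kw in TOPIC_KEYWORDS.items() if kw & tok_set)
            # Step 5: most frequent remaining terms, ignoring stopwords and the entity itself.
            skip = STOPWORDS | set(tokens(entity))
            counts = Counter(t for t in toks if t not in skip)
            cooccurring = [term for term, _ in counts.most_common(3)]
            # Step 6 (assumed category): a coarse tone label from word lists.
            pos, neg = len(POSITIVE & tok_set), len(NEGATIVE & tok_set)
            tone = "positive" if pos > neg else "negative" if neg > pos else "neutral"
            refs.append(Reference(entity, doc_id, quality, topics, cooccurring, tone))
    return refs


docs = [
    "Acme Corp shares rose after strong profit growth.",
    "spam spam spam spam Globex spam spam",
]
refs = extract(docs)
```

In this sketch the second document is dropped at step 1 because it is highly repetitive, so only the reference to "Acme Corp" is characterized. A real system would likely replace each toy measure with a trained or rule-based component, but the order of operations would stay the same.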
35 Claims
1. A method of extracting information about references to entities from a plurality of electronic documents, the method comprising:
applying at least one document quality measure to each of the plurality of electronic documents;
recognizing the references to entities in the plurality of electronic documents;
using at least one reference quality measure for each of the references to entities;
computing at least one topical category associated with each of the references to entities;
finding at least one co-occurring term associated with each of the references to entities; and
characterizing each of the references to entities by at least one characteristic category.
Dependent claims: 2-33
34. A system of extracting information about references to entities from a plurality of electronic documents, the system comprising:
an applying module configured to apply at least one document quality measure to each of the plurality of electronic documents;
a recognizing module configured to recognize the references to entities in the plurality of electronic documents;
a using module configured to use at least one reference quality measure for each of the references to entities;
a computing module configured to compute at least one topical category associated with each of the references to entities;
a finding module configured to find at least one co-occurring term associated with each of the references to entities; and
a characterizing module configured to characterize each of the references to entities by at least one characteristic category.
35. A computer program product usable with a programmable computer having readable program code embodied therein of extracting information about references to entities from a plurality of electronic documents, the computer program product comprising:
computer readable code for applying at least one document quality measure to each of the plurality of electronic documents;
computer readable code for recognizing the references to entities in the plurality of electronic documents;
computer readable code for using at least one reference quality measure for each of the references to entities;
computer readable code for computing at least one topical category associated with each of the references to entities;
computer readable code for finding at least one co-occurring term associated with each of the references to entities; and
computer readable code for characterizing each of the references to entities by at least one characteristic category.
Specification