System and method for global identification in a collection of documents
Abstract
Techniques are disclosed for machine-based identification of objects extracted from natural-language text documents. Text documents with extracted objects are represented as Resource Description Framework (RDF) graphs, in which nodes correspond to the objects and arcs correspond to the relations between them. Objects are identified using specific combinations of patterns that define features of the objects.
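As a rough illustration of the graph representation described in the abstract, the sketch below stores extracted objects as plain subject-predicate-object triples. The identifiers (`doc1#person1`), the `ex:` property names, and the helper functions are all hypothetical; the source does not specify a vocabulary or an API.

```python
from collections import defaultdict

# Hypothetical triples extracted from one text document:
# (subject, predicate, object). All names are illustrative assumptions.
TRIPLES = [
    ("doc1#person1", "rdf:type", "ex:Person"),
    ("doc1#person1", "ex:name", "John Smith"),
    ("doc1#person1", "ex:birthDate", "1970-01-01"),
    ("doc1#org1", "rdf:type", "ex:Organization"),
    ("doc1#org1", "ex:name", "Acme"),
    ("doc1#person1", "ex:worksFor", "doc1#org1"),
]


def build_graph(triples):
    """Group arcs by their source node: node -> list of (predicate, target)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


def relations_between_objects(triples):
    """Return only the arcs whose target is itself a node (object-to-object relations)."""
    nodes = {subject for subject, _, _ in triples}
    return [(s, p, o) for s, p, o in triples if o in nodes]
```

Here nodes are the extracted objects, and `relations_between_objects` separates the arcs that link two objects from the arcs that merely attach literal features to one object.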
21 Claims
1. A method of machine-based identification of one or more information objects in a document and in a document storage, the information objects in the document and information objects in the document storage corresponding to the same real world object, the method comprising:
(a) searching for global identification patterns and for a combination of the global identification patterns in the document;
(b) searching for the same global identification patterns and their combinations in the document storage;
(c) finding matching pairs of the information objects, one information object from the document and at least one information object from the document storage for the same combination of the patterns;
(d) ascertaining consistency of the matching pairs and determining which said one or more information objects in the document are suitable for merging into the document storage; and
(e) adding the one or more information objects from the document to the document storage.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
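Steps (a) through (e) of claim 1 can be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes that a "global identification pattern" is a combination of property names whose values together identify a real-world object, and that "consistency" means no shared property carries conflicting values. The names `PATTERNS`, `find_keys`, `match_pairs`, `is_consistent`, and `identify_and_merge` are hypothetical.

```python
# Hypothetical patterns: each is a combination of properties that, taken
# together, is assumed to identify a real-world object.
PATTERNS = [("type", "name", "birth_date"), ("type", "tax_id")]


def find_keys(objects, patterns):
    """Steps (a)/(b): index objects by (pattern, values) for every pattern they satisfy."""
    index = {}
    for obj_id, props in objects.items():
        for pattern in patterns:
            if all(p in props for p in pattern):
                key = (pattern, tuple(props[p] for p in pattern))
                index.setdefault(key, []).append(obj_id)
    return index


def match_pairs(doc_index, store_index):
    """Step (c): pair document and storage objects that share a pattern key."""
    pairs = set()
    for key, doc_ids in doc_index.items():
        for store_id in store_index.get(key, []):
            for doc_id in doc_ids:
                pairs.add((doc_id, store_id))
    return sorted(pairs)


def is_consistent(doc_props, store_props):
    """Step (d), simplified: no shared property may carry conflicting values."""
    return all(store_props[k] == v for k, v in doc_props.items() if k in store_props)


def identify_and_merge(document, storage, patterns=PATTERNS):
    """Run steps (a)-(e); return the (document id, storage id) pairs that were merged."""
    doc_index = find_keys(document, patterns)
    store_index = find_keys(storage, patterns)
    merged = []
    for doc_id, store_id in match_pairs(doc_index, store_index):
        if is_consistent(document[doc_id], storage[store_id]):
            # Step (e): add the document object's properties to the storage.
            storage[store_id].update(document[doc_id])
            merged.append((doc_id, store_id))
    return merged
```

Both `document` and `storage` are assumed to be dictionaries mapping an object identifier to a dictionary of properties. A document object that matches a stored object on one pattern but disagrees with it on another shared property is left unmerged, which is one possible reading of step (d).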
13. A platform for machine-based identification of one or more information objects in a document and in a document storage, the information objects in the document and in the document storage corresponding to the same real world object, the platform comprising:
at least one local, remote, distributed or web-based computing device; and
a memory locally or remotely coupled to the computing device and storing instructions which, responsive to execution on the computing device, cause the computing device to perform:
(a) searching for global identification patterns and for a combination of the global identification patterns in the document;
(b) searching for the same global identification patterns and their combinations in the document storage;
(c) finding matching pairs of the information objects from the document and the document storage for the same combination of the patterns;
(d) ascertaining consistency of the matching pairs and determining which said one or more information objects in the document are suitable for merging into the document storage; and
(e) adding the one or more information objects from the document to the document storage.
Dependent claims: 14, 15, 16, 17
18. A computer-readable medium storing processor-readable instructions for machine-based identification of one or more information objects in a document and in a document storage, the information objects in the document and in the document storage corresponding to the same real world object, the instructions which, responsive to execution in a computing device, cause the computing device to perform:
(a) searching for global identification patterns and for a combination of the global identification patterns in the document;
(b) searching for the same global identification patterns and their combinations in the document storage;
(c) finding matching pairs of the information objects from the document and the document storage for the same combination of the patterns;
(d) ascertaining consistency of the matching pairs and determining which said one or more information objects in the document are suitable for merging into the document storage; and
(e) adding the one or more information objects in the document to the document storage.
Dependent claims: 19, 20, 21
Specification