SYSTEMS, METHODS, AND APPARATUS FOR PROCESSING DOCUMENTS TO IDENTIFY STRUCTURES
4 Assignments
0 Petitions
Abstract
In various embodiments, multiple heterogeneous documents are processed to identify structures, such as chemical structures, contained therein, including non-embedded structures. Also described is a graphical user interface that permits a user to search for a structure or substructure within a set of electronic documents, then displays the matching structures as well as the actual pages of the documents on which the matching structures are found. Display of the actual pages allows the user to verify the matches and provides helpful context for the user.
36 Citations
15 Claims
1. An apparatus for electronically identifying and compiling chemical structures found in a storage facility comprising one or more electronic files, the apparatus comprising:
(a) a memory for storing a code defining a set of instructions; and
(b) a processor for executing the set of instructions, wherein the code comprises an optical structure recognition module configured to:
(i) identify a plurality of candidate chemical structures in the one or more electronic files of the storage facility, wherein at least one of the electronic files comprises non-embedded images of chemical structures identifiable by the optical structure recognition module;
(ii) for each identified candidate, derive a chemical structure object with an associated set of properties including number of carbons;
(iii) for each derived chemical structure object, apply one or more filters, including a filter to eliminate objects identified as having less than a selected number of carbons; and
(iv) store objects not eliminated by the one or more filters in a searchable electronic compendium of identified objects.
Dependent claims: 2, 3, 4, 5, 6, 15
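Steps (ii) through (iv) of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the optical structure recognition module has already produced candidate records, and every name here (`StructureObject`, `carbon_filter`, `compile_compendium`), the use of SMILES strings as the structure representation, and the sample data are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class StructureObject:
    """Hypothetical structure object derived from one recognized candidate."""
    source_file: str   # electronic file the candidate was found in
    page: int          # page on which it appears
    smiles: str        # assumed representation of the structure
    num_carbons: int   # property named in step (ii)


Filter = Callable[[StructureObject], bool]


def carbon_filter(min_carbons: int) -> Filter:
    """Return a filter that keeps objects with at least min_carbons carbons,
    i.e. eliminates those with fewer than the selected number (step iii)."""
    def keep(obj: StructureObject) -> bool:
        return obj.num_carbons >= min_carbons
    return keep


def compile_compendium(
    candidates: Iterable[StructureObject],
    filters: List[Filter],
) -> Dict[str, List[StructureObject]]:
    """Store every object that passes all filters, keyed by structure (step iv)."""
    compendium: Dict[str, List[StructureObject]] = {}
    for obj in candidates:
        if all(f(obj) for f in filters):
            compendium.setdefault(obj.smiles, []).append(obj)
    return compendium


# Illustrative candidates only; not data from the patent.
candidates = [
    StructureObject("report.pdf", 3, "c1ccccc1", 6),  # benzene
    StructureObject("report.pdf", 4, "C", 1),         # likely recognition noise
    StructureObject("notes.pdf", 1, "CCO", 2),        # ethanol
]
compendium = compile_compendium(candidates, [carbon_filter(2)])
```

Keying the compendium by structure is one simple way to make it searchable by exact match; a substructure search would need a cheminformatics toolkit, which this sketch deliberately leaves out.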
7. An apparatus for displaying one or more chemical structures found in an electronic search of a storage facility comprising one or more electronic files, the apparatus comprising:
(a) a memory for storing a code defining a set of instructions; and
(b) a processor for executing the set of instructions, wherein the code comprises a graphical user interface module configured to:
(i) in a first designated location of a graphical user interface (GUI) display, display one or more chemical structures or substructures derived from an electronic search of a storage facility comprising one or more electronic files, wherein each of the displayed structures or substructures matches or contains a user-identified chemical structure or substructure;
(ii) in a second designated location of the GUI display, display a list of, and/or icon(s) representing, one or more electronic files from the storage facility, each file containing one or more of the structures or substructures displayed in the first designated location of the GUI display; and
(iii) in a third designated location of the GUI display, display a page, or portion thereof, of a selected one of the electronic files listed and/or represented in the second designated location of the GUI display, wherein the displayed page contains a selected one of the chemical structures or substructures displayed in the first designated location of the GUI display.
Dependent claims: 8, 9, 10
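The three display locations recited in claim 7 can be pictured as a small view model computed from search hits. The Python sketch below is an assumption-laden illustration, not the claimed GUI module: `Hit`, `build_panes`, the dictionary keys, and the sample hits are all hypothetical, and actual rendering is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit: a matching structure and where it was found."""
    structure: str
    file_name: str
    page: int


def build_panes(
    hits: List[Hit],
    selected_structure: str,
    selected_file: str,
) -> Dict[str, Union[List[str], Optional[int]]]:
    """Compute what each of the three designated locations would show."""
    # First location: the matching structures or substructures.
    structures = sorted({h.structure for h in hits})
    # Second location: files containing at least one displayed structure.
    files = sorted({h.file_name for h in hits})
    # Third location: a page of the selected file that contains the
    # selected structure (the lowest page number, if there are several).
    pages = sorted(
        h.page
        for h in hits
        if h.structure == selected_structure and h.file_name == selected_file
    )
    return {
        "structures": structures,
        "files": files,
        "page": pages[0] if pages else None,
    }


# Illustrative hits only; not data from the patent.
hits = [
    Hit("c1ccccc1", "a.pdf", 3),
    Hit("c1ccccc1", "b.pdf", 7),
    Hit("Cc1ccccc1", "a.pdf", 5),
]
panes = build_panes(hits, selected_structure="Cc1ccccc1", selected_file="a.pdf")
```

Returning `None` for the page when the selected file does not contain the selected structure keeps the third location empty rather than showing an unrelated page.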
11. A method for electronically identifying and compiling chemical structures found in a storage facility comprising one or more electronic files, the method comprising:
creating and/or downloading electronic files for storage in a storage facility, wherein creating electronic files optionally comprises electronically scanning paper documents;
identifying a plurality of candidate chemical structures in the electronic files of the storage facility using an optical structure recognition module, wherein at least one of the electronic files comprises non-embedded images of chemical structures identifiable by the optical structure recognition module;
for each identified candidate, deriving a chemical structure object with an associated set of properties including number of carbons;
for each derived chemical structure object, applying one or more filters, including a filter to eliminate objects identified as having less than a selected number of carbons;
storing objects not eliminated by the one or more filters in a searchable electronic compendium of identified objects; and
displaying results of a user-initiated search of the electronic compendium of identified objects on an electronic display.
Dependent claims: 12, 13, 14
Specification