METHODS AND SYSTEMS FOR POPULATING AND SEARCHING A DRUG INFORMATICS DATABASE
Abstract
The present invention provides a method for populating and searching a drug informatics database that includes receiving unprocessed data associated with a chemical compound from one or more data sources. The unprocessed data is parsed into a plurality of data objects based on a categorization associated with each of the data objects. Additional information, such as explanatory notes, is identified and associated with at least one of the data objects. The data objects are stored in entries within a data structure, where the data structure is searchable based on one or more of the data objects. A query for data associated with a chemical compound is received at a drug informatics database. The drug informatics database is then searched for data associated with the chemical compound and the search results are provided to a user.
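The populate-then-search flow summarized in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `key=value; key=value` input layout, the `DataObject` and `DrugDatabase` names, the `CATEGORIES` tuple, and all sample values are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Hypothetical categorization scheme used to sort raw fields into data objects.
CATEGORIES = ("company_name", "company_drug_id", "molecular_weight")


@dataclass
class DataObject:
    category: str
    value: str
    notes: list[str] = field(default_factory=list)  # explanatory notes


def parse_record(raw: str) -> list[DataObject]:
    """Parse 'key=value; key=value' text into categorized data objects."""
    objects = []
    for part in raw.split(";"):
        if "=" not in part:
            continue  # skip fragments that carry no category
        key, value = (s.strip() for s in part.split("=", 1))
        if key in CATEGORIES:
            objects.append(DataObject(key, value))
    return objects


def annotate(objects: list[DataObject], notes_by_category: dict[str, str]) -> None:
    """Associate additional information with matching data objects."""
    for obj in objects:
        note = notes_by_category.get(obj.category)
        if note:
            obj.notes.append(note)


class DrugDatabase:
    """Toy in-memory store: one entry per compound, searchable by data object."""

    def __init__(self) -> None:
        self.entries: list[list[DataObject]] = []

    def store(self, objects: list[DataObject]) -> None:
        self.entries.append(objects)

    def search(self, category: str, value: str) -> list[list[DataObject]]:
        return [
            entry for entry in self.entries
            if any(o.category == category and o.value == value for o in entry)
        ]


# Usage with made-up sample data.
db = DrugDatabase()
objs = parse_record(
    "company_name=Example Co.; company_drug_id=EX-001; molecular_weight=180.16"
)
annotate(objs, {"molecular_weight": "units: g/mol"})
db.store(objs)
hits = db.search("company_drug_id", "EX-001")
```

A linear scan is used here only for brevity; a real store would presumably index the searchable data objects.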
21 Claims
1. A method for populating a drug informatics database, comprising:
    receiving, into a computing device, unprocessed data associated with a chemical compound from one or more data sources;
    parsing the unprocessed data in the computing device into a plurality of data objects based on a categorization associated with each of the data objects;
    identifying and associating, in the computing device, additional information including explanatory notes with at least one of the data objects; and
    storing the data objects in entries within a data structure, where the data structure is searchable based on one or more of the data objects.
2. The method of claim 1 wherein receiving unprocessed data includes receiving data from one of chemical companies, public databases, and public literature.
3. The method of claim 1 wherein parsing the unprocessed data includes identifying one of a company name, a company drug id, a molecular weight, and bibliographic information associated with a chemical compound.
4. The method of claim 1 wherein storing the data objects includes standardizing the data objects.
5. The method of claim 4 wherein standardizing the data includes associating a single unique representation with each of the chemical compounds.
6. The method of claim 4 wherein standardizing the data includes replacing aromatic systems with aromatic bonds and replacing explicit atoms with implicit atoms.
7. A method for searching a drug informatics database, comprising:
    receiving, at a drug informatics database, a query for data associated with a chemical compound;
    searching the drug informatics database for data associated with the chemical compound; and
    providing the search results to a user.
8. The method of claim 7 wherein receiving a query includes receiving a query that includes a visual representation of the chemical compound.
9. The method of claim 7 wherein searching the drug informatics database includes converting the visual representation of the chemical compound into a search string.
10. The method of claim 7 wherein searching the drug informatics database includes performing a search on a subset of the drug informatics database.
11. The method of claim 7 wherein searching the drug informatics database includes incrementally caching the search results in real time or near real time.
12. The method of claim 7 wherein searching the drug informatics database includes using one of structure-based searching, property based searching, similarity-based searching, or matching similarity over existing experimentally validated compounds.
13. The method of claim 7 wherein searching the drug informatics database includes performing a substructure search by identifying chemical compounds that contain the queried chemical structure as a substructure.
14. The method of claim 7 wherein providing the search results includes providing an initial set of search results within a first time period and providing an updated set of search results within a second time period, where the first time period is less than the second time period.
15. The method of claim 14 wherein the search results are periodically updated and displayed without interrupting interactability with the drug informatics database.
16. The method of claim 7 wherein providing the search results includes presenting the search results in an .sdf format.
17. The method of claim 7 wherein providing the search results includes presenting the search results in two or more sortable columns, where the number and nature of the columns is user-selectable.
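The incremental delivery recited in claims 11, 14, and 15 (an initial result set first, updated sets later, with hits cached as the search proceeds) can be illustrated with a generator. This is a minimal sketch under assumptions: the `incremental_search` function, the batch size, and the sample records are all hypothetical, and a property-based filter on molecular weight stands in for whatever search the database actually performs.

```python
from typing import Callable, Iterator


def incremental_search(
    records: list[dict],
    predicate: Callable[[dict], bool],
    batch_size: int = 2,
) -> Iterator[list[dict]]:
    """Scan records in batches and yield a growing snapshot of the hits."""
    cache: list[dict] = []  # hits found so far, cached incrementally
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        cache.extend(r for r in batch if predicate(r))
        # Yield a copy so snapshots already handed out are not mutated later.
        yield list(cache)


# Made-up records; identifiers and weights are illustrative only.
records = [
    {"id": "C1", "molecular_weight": 180.2},
    {"id": "C2", "molecular_weight": 46.1},
    {"id": "C3", "molecular_weight": 151.2},
    {"id": "C4", "molecular_weight": 194.2},
    {"id": "C5", "molecular_weight": 206.3},
]

snapshots = list(
    incremental_search(records, lambda r: 150 <= r["molecular_weight"] <= 200)
)
first_ids = [r["id"] for r in snapshots[0]]   # available after the first batch
final_ids = [r["id"] for r in snapshots[-1]]  # complete result set
```

Because the generator hands control back after every batch, a caller could redraw the result list between batches while the interface stays responsive.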
18. A drug informatics database, comprising:
    a non-transitory computer-readable medium;
    a primary data structure on said medium for storing primary data objects in entries, where the data structure is searchable based on one or more of the data objects associated with one or more chemical compounds; and
    an auxiliary data structure on said medium for storing auxiliary data objects in entries associated with the one or more chemical compounds, where the auxiliary data objects are linked to the primary data objects.
19. The database of claim 18 wherein the primary data structure includes a dataset tag used to identify related groups of data.
20. The database of claim 18 further comprising a web server including:
    a computer-readable medium for storing non-transitory computer readable instructions;
    a processor for executing the non-transitory computer readable instructions stored in the computer-readable medium, where the computer-readable medium includes:
        an importation module for:
            receiving unprocessed data associated with a chemical compound from one or more data sources;
            parsing the unprocessed data into a plurality of data objects based on a categorization associated with each of the data objects;
            identifying and associating additional information including explanatory notes with at least one of the data objects; and
            storing the data objects in entries within a data structure in the drug informatics database, where the data structure is searchable based on one or more of the data objects;
        a search module for receiving a query for data associated with a chemical compound and searching the drug informatics database for data associated with the chemical compound; and
        a presentation module for providing the search results to a user.
21. The database of claim 18 wherein the one or more data sources include one of private chemical company databases, public databases, and public literature.
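One way to picture the linked primary and auxiliary data structures of claims 18 and 19 is as two relational tables joined on a compound key. The sketch below uses Python's standard `sqlite3` module purely for illustration. The table and column names, the use of a SMILES string as the structure field, and the sample rows are assumptions, not details taken from the claims.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(
    """
    -- Primary data structure: one searchable entry per compound.
    CREATE TABLE primary_data (
        compound_id INTEGER PRIMARY KEY,
        structure   TEXT NOT NULL,   -- assumed here to be a SMILES string
        dataset_tag TEXT             -- identifies related groups of data
    );
    -- Auxiliary data structure: extra objects linked to a primary entry.
    CREATE TABLE auxiliary_data (
        aux_id      INTEGER PRIMARY KEY,
        compound_id INTEGER NOT NULL REFERENCES primary_data(compound_id),
        category    TEXT NOT NULL,
        value       TEXT NOT NULL
    );
    """
)

# Made-up sample rows.
conn.execute("INSERT INTO primary_data VALUES (1, 'CCO', 'demo-set')")
conn.executemany(
    "INSERT INTO auxiliary_data (compound_id, category, value) VALUES (?, ?, ?)",
    [(1, "company_name", "Example Co."), (1, "note", "illustrative entry")],
)

# Search the primary table by dataset tag and pull the linked auxiliary objects.
rows = conn.execute(
    """
    SELECT p.structure, a.category, a.value
    FROM primary_data AS p
    JOIN auxiliary_data AS a ON a.compound_id = p.compound_id
    WHERE p.dataset_tag = ?
    ORDER BY a.aux_id
    """,
    ("demo-set",),
).fetchall()
```

Keeping auxiliary objects in their own table lets an entry gain notes or source metadata without altering the searchable primary record.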
Specification