System and method for a precompiled database for biomolecular sequence information
Abstract
A computer system stores biomolecular data in a database in a memory. The biomolecular database has a set of entities. Each entity stores attributes for a plurality of entries. At least one attribute is stored in an array. Data associated with an entry is stored at a location in the array. An entity offset designates the location of the data in the array. The same entity offset value is used to access data associated with a particular entry for all attributes within the entity.
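The layout described in the abstract can be illustrated with a minimal Python sketch. It assumes each attribute is a plain list and that an entry's entity offset is simply its index in every list; the class name `Entity`, its methods, and the sample clone values are hypothetical and do not come from the patent.

```python
class Entity:
    """One entity: each attribute is an array, all indexed by the same entity offset."""

    def __init__(self, name, attribute_names):
        self.name = name
        self.arrays = {attr: [] for attr in attribute_names}
        self.entry_count = 0

    def add_entry(self, **values):
        """Append one entry to every attribute array and return its entity offset."""
        if set(values) != set(self.arrays):
            raise ValueError("a value is required for every attribute of the entity")
        offset = self.entry_count
        for attr, value in values.items():
            self.arrays[attr].append(value)
        self.entry_count += 1
        return offset

    def get(self, attr, offset):
        """Retrieve the data for one attribute of the entry at the given offset."""
        return self.arrays[attr][offset]


# Hypothetical sample data.
clone = Entity("Clone", ["library", "length"])
first = clone.add_entry(library="LIB_A", length=412)
second = clone.add_entry(library="LIB_B", length=388)
print(clone.get("library", second), clone.get("length", second))
```

Because every attribute array grows in step, one offset is enough to read any attribute of an entry without a per-attribute lookup.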
62 Citations
34 Claims
1. A method of accessing biomolecular data stored in a database, comprising the steps of:
generating a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity;
storing the generated set of entities in a memory; and
retrieving data associated with the at least one attribute of a particular entry of one of the entities in the set of entities using an associated entity offset.
2. [...] generating the clone name attribute as a function of the entity offset of the clone entity.
3. The method of claim 1 wherein at least one entity has a first attribute, further comprising the step of:
generating the first attribute value based on the associated entity offset.
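An attribute generated from the entity offset needs no stored array at all. The sketch below assumes a clone name is a fixed prefix followed by the zero-padded offset; the prefix `CLONE`, the width of 7, and both function names are hypothetical, since the claims do not specify the function used.

```python
def clone_name_from_offset(offset: int, prefix: str = "CLONE", width: int = 7) -> str:
    """Generate a name purely from the entity offset (hypothetical naming scheme)."""
    if offset < 0:
        raise ValueError("entity offsets are non-negative")
    return f"{prefix}{offset:0{width}d}"


def offset_from_clone_name(name: str, prefix: str = "CLONE") -> int:
    """Invert the scheme above: recover the entity offset from a generated name."""
    if not name.startswith(prefix):
        raise ValueError(f"name does not start with {prefix!r}")
    return int(name[len(prefix):])


print(clone_name_from_offset(42))
```

Under this assumption the mapping is reversible, so a name supplied in a query can be turned back into an offset without an index.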
4. The method of claim 1, wherein the set of entities includes a first entity and a second entity, and further comprising the steps of:
storing a particular offset value associated with a particular entry of the second entity in a first array for an entry of the first entity; and
accessing the particular entry of the second entity using the particular offset value stored in the first array for the entry of the first entity.
5. The method of claim 1 wherein the set of entities includes a clone entity and a library entity, and further comprising the steps of:
storing a particular library offset value associated with a particular library entry of the library entity in a clone.library array for a clone entry of the clone entity; and
accessing the particular library entry of the second entity using the particular library offset value stored in the clone.library array for the clone entry of the clone entity.
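The cross-entity reference recited in claims 4 and 5 can be sketched as an array of offsets. The example assumes a `clone_library` list that holds, for each clone offset, the offset of its library entry; the variable names and the library names are hypothetical.

```python
# Library entity: one attribute array, indexed by library offset.
library_name = ["brain_01", "liver_02"]

# Clone entity: the clone.library array stores a library offset per clone offset.
clone_library = [1, 0, 1]


def library_of_clone(clone_offset: int) -> str:
    """Follow the stored library offset from a clone entry to its library entry."""
    library_offset = clone_library[clone_offset]
    return library_name[library_offset]


print(library_of_clone(0))
```

The stored offset plays the role of a foreign key, but following it is a direct array access rather than a search.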
6. The method of claim 1 wherein the set of entities includes:
a first entity with a plurality of first entries in a first array, and a second entity with a plurality of second entries in a second array, one of the plurality of second entries being a particular second entry associated with a particular second offset value, and further comprising the step of:
associating a set of first entries having a set of first offset values with the particular second entry by:
storing the set of first offset values in a subsidiary array at a location associated with a particular subsidiary array offset value; and
storing the particular subsidiary array offset value in the second array at a location designated by the particular second offset value.
7. The method of claim 6 further comprising the steps of:
storing, in the second array, a count representing the number of entries of the set of first entries; and
wherein said step of storing the set of first offset values sequentially stores the offset values of the set of first offset values in the subsidiary array.
-
8. The method of claim 7 further comprising the steps of:
retrieving the count value;
retrieving the set of first offset values in the subsidiary array; and
accessing the entries of the set of first entries based on the set of first offset values.
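The one-to-many association of claims 6 to 8 can be sketched as follows. The example assumes the second entity keeps two parallel arrays, a starting offset into the subsidiary array and a count, and that the first-entity offsets are stored sequentially in the subsidiary array; the class name `OneToMany` and its fields are hypothetical.

```python
class OneToMany:
    """Associate each entry of a second entity with a set of first-entity offsets."""

    def __init__(self):
        self.start = []       # per second-entity offset: offset into the subsidiary array
        self.count = []       # per second-entity offset: number of associated first entries
        self.subsidiary = []  # first-entity offsets, stored sequentially

    def add(self, first_offsets):
        """Store one set of first offsets and return the new second-entity offset."""
        second_offset = len(self.start)
        self.start.append(len(self.subsidiary))
        self.count.append(len(first_offsets))
        self.subsidiary.extend(first_offsets)
        return second_offset

    def members(self, second_offset):
        """Retrieve the count, then read that many offsets from the subsidiary array."""
        begin = self.start[second_offset]
        return self.subsidiary[begin:begin + self.count[second_offset]]


# Hypothetical usage: two clusters referring to clone offsets.
cluster_clone = OneToMany()
cluster_a = cluster_clone.add([3, 7, 8])
cluster_b = cluster_clone.add([1])
print(cluster_clone.members(cluster_a))
```

Storing a start offset plus a count lets sets of different sizes share one flat array, which matches the sequential storage recited in claim 7.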
9. The method of claim 1 wherein the set of entities includes:
a clone entity with a plurality of clone entries, a portion of the plurality of clone entries being a first set of clone entries and having a first set of clone offset values, and a cluster entity with a plurality of cluster entries in a Cluster.Clone attribute array, one of the plurality of cluster entries being a particular cluster entry with an associated particular cluster offset value, and further comprising the step of:
associating the first set of clone entries with the particular cluster entry by:
storing the first set of clone offset values in a subsidiary array at a location associated with a particular subsidiary array offset value; and
storing the particular subsidiary array offset value in the Cluster.Clone attribute array at a location designated by the particular cluster offset value for the particular cluster entry.
10. The method of claim 9 wherein the cluster entries include high scoring pairs of associated clone entries having a region of sequence homology such that the region of sequence homology comprises a first predetermined number of matches that exceeds a threshold number of matches, wherein the threshold number of matches was increased when the region of sequence homology was not 100% similar.
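The threshold rule in claim 10 can be illustrated with a short sketch. The claim states only that the threshold number of matches was increased when the region was not 100% similar; the size of the increase (`extra_matches=10`), the function names, and the sample numbers below are assumptions made for illustration.

```python
def required_matches(base_threshold: int, percent_identity: float,
                     extra_matches: int = 10) -> int:
    """Return the match threshold, raised when the region is not 100% similar."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    if percent_identity < 100.0:
        return base_threshold + extra_matches  # hypothetical size of the increase
    return base_threshold


def is_high_scoring_pair(matches: int, base_threshold: int,
                         percent_identity: float, extra_matches: int = 10) -> bool:
    """A pair qualifies when its match count exceeds the applicable threshold."""
    return matches > required_matches(base_threshold, percent_identity, extra_matches)


print(is_high_scoring_pair(55, 50, 100.0), is_high_scoring_pair(55, 50, 98.5))
```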
11. The method of claim 9 further comprising:
a MapPos entity wherein a portion of the entries of the clone entity and a portion of the entries of the cluster entity are associated with the MapPos entity.
12. The method of claim 1 wherein the set of entities includes:
a tissueclass entity for establishing a predefined hierarchy of tissue classes; and
a library entity including a library.tissueclass attribute array for associating an entry of the library entity with at least one entry of the tissueclass entity.
13. The method of claim 12 wherein the tissueclass entity has a TissueClass.Name attribute array and a tissueclass.sub-class attribute array, further comprising the steps of:
storing a particular tissueclass name for a particular entry in the TissueClass.Name attribute array; and
storing a tissueclass offset value in the tissueclass.sub-class attribute array for the particular entry such that the tissueclass offset value associates another tissueclass entry having a different tissueclass name with the particular tissueclass entry.
14. The method of claim 13 further comprising the step of:
populating a portion of the tissueclass entity with tissueclass entries having predetermined relationships among the tissueclass entries as defined by the TissueClass.Name attribute array, a tissueclass.parent attribute array and the tissueclass.sub-class attribute array.
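The tissue class hierarchy of claims 12 to 14 can be sketched with three parallel arrays. The example assumes the sub-class array holds a list of child offsets per entry and the parent array holds a single parent offset, with `-1` marking a top-level class; the tissue names, the `-1` sentinel, and the function names are hypothetical.

```python
NO_PARENT = -1  # hypothetical sentinel for a top-level tissue class

# Parallel attribute arrays of the tissueclass entity, indexed by tissueclass offset.
names = ["Nervous system", "Brain", "Spinal cord", "Digestive system", "Liver"]
parent = [NO_PARENT, 0, 0, NO_PARENT, 3]
sub_class = [[1, 2], [], [], [4], []]


def descendants(offset):
    """Collect the offsets of every class below the given class."""
    found = []
    stack = list(sub_class[offset])
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(sub_class[current])
    return sorted(found)


def path_to_root(offset):
    """List class names from the given class up to its top-level ancestor."""
    path = []
    while offset != NO_PARENT:
        path.append(names[offset])
        offset = parent[offset]
    return path


print(descendants(0), path_to_root(4))
```

Keeping both a parent array and a sub-class array allows the hierarchy to be walked in either direction using only offsets.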
15. The method of claim 1 wherein the set of entities includes:
a cluster entity; a HitID entity for associating entries of the cluster entity with entries of the HitID entity, the entries of the HitID entity being identifiers of known biomolecules; and
a proteinfunction entity for establishing a predefined hierarchy of biomolecules by protein function, having a proteinclass.hitid attribute array for associating entries of the proteinfunction entity with entries of the HitID entity, wherein the proteinclass.hitid attribute array associates a biomolecule entry with at least one identifier of a known biomolecule in the HitID entity, allowing a user to search for biomolecular information by protein function and to identify clusters performing that protein function.
16. The method of claim 15 wherein the proteinfunction entity has a proteinfunction.name attribute array and a proteinfunction.subclass attribute array;
further comprising the steps of:
storing a particular biomolecule name for a particular entry in the proteinfunction.name attribute array; and
storing a proteinfunction offset value in the proteinfunction.subclass attribute array for the particular entry such that the proteinfunction offset value associates another proteinfunction entry having a different biomolecule name with the particular proteinfunction entry for establishing a hierarchy of protein functions.
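The search by protein function described in claims 15 and 16 can be sketched as a walk over offset arrays. The example assumes each proteinfunction entry lists its subclass offsets and its associated HitID offsets, and each cluster entry lists its HitID offsets; every array name, identifier, and value below is hypothetical.

```python
# proteinfunction entity, indexed by proteinfunction offset.
pf_name = ["Enzyme", "Kinase", "Protease"]
pf_subclass = [[1, 2], [], []]   # offsets of subclass entries
pf_hitid = [[], [0, 1], [2]]     # offsets into the HitID entity

# HitID entity: identifiers of known biomolecules (hypothetical values).
hitid_identifier = ["HIT_A", "HIT_B", "HIT_C"]

# cluster entity: HitID offsets associated with each cluster offset.
cluster_hitid = [[0], [2], [1], []]


def clusters_for_function(pf_offset):
    """Find cluster offsets whose hits fall under a protein function or its subclasses."""
    functions = [pf_offset]
    for current in functions:  # the list grows as subclasses are discovered
        functions.extend(pf_subclass[current])
    hit_offsets = set()
    for current in functions:
        hit_offsets.update(pf_hitid[current])
    return [cluster for cluster, hits in enumerate(cluster_hitid)
            if hit_offsets.intersection(hits)]


print(pf_name[0], clusters_for_function(0))
```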
17. A computer system for a biomolecular database, comprising:
a biomolecular database stored on memory media associated with the computer system, the biomolecular database having a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity; and
a data retrieval process that includes instructions for:
receiving a request for data associated with a particular attribute of the particular entry of a particular entity and returning the retrieved data to the requester, determining a particular entity offset value associated with the particular entry of the particular entity, and retrieving data using the particular entity offset value.
18. [...] generating the clone name attribute as a function of the entity offset of the clone entity.
19. The computer system of claim 18 wherein the set of entities includes:
a library entity having library entries including a particular library entry associated with a particular library offset value; and
a clone entity having a clone.library array storing associated library entity offset values, the particular library offset value being stored in the clone.library array to associate a particular clone entry with the particular library entry;
the data retrieval process further including instructions for:
accessing the particular library entry of the library entity using the particular library offset value stored in the clone.library array.
20. The computer system of claim 17 wherein at least one entity has a first attribute, the data retrieval process further including instructions for:
generating the first attribute value based on the associated entity offset.
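The data retrieval process of claims 17 and 20 can be sketched as a function that resolves a request to an entity offset and then reads, or computes, the requested attribute. The example assumes each entity has a `key` attribute used to look up the offset and that generated attributes are registered as functions of the offset; the `key` attribute, the `derived` table, and all sample values are hypothetical.

```python
# Database: entity name -> attribute name -> array indexed by entity offset.
database = {
    "Clone": {"key": ["c-100", "c-101"], "length": [412, 388]},
}

# Attributes generated from the entity offset instead of being stored.
derived = {
    ("Clone", "name"): lambda offset: f"CLONE{offset:07d}",
}

# Index built once: entity name -> entry key -> entity offset.
index = {
    entity: {key: offset for offset, key in enumerate(arrays["key"])}
    for entity, arrays in database.items()
}


def retrieve(entity: str, key: str, attribute: str):
    """Handle one request: determine the entity offset, then retrieve the data."""
    offset = index[entity][key]
    compute = derived.get((entity, attribute))
    if compute is not None:
        return compute(offset)  # value generated from the offset
    return database[entity][attribute][offset]


print(retrieve("Clone", "c-101", "length"), retrieve("Clone", "c-101", "name"))
```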
21. The computer system of claim 17, wherein the set of entities includes:
a clone entity with a plurality of clone entries, a portion of the plurality of clone entries being a first set of clone entries and having a first set of clone offset values, and a cluster entity with a plurality of cluster entries in a Cluster.Clone attribute array, one of the plurality of cluster entries being a particular cluster entry with an associated particular cluster offset value, wherein the first set of clone entries is associated with the particular cluster such that the first set of clone offset values is stored in a subsidiary array at a location associated with a particular subsidiary array offset value; and
the particular subsidiary array offset value is stored in the Cluster.Clone attribute array at a location designated by the particular cluster offset value for the particular cluster entry; and
the data retrieval process further includes instructions for retrieving the first set of clone offset values from the subsidiary array; and
accessing the entries in the clone entity using the first set of clone offset values.
22. The computer system of claim 21 wherein clusters have high scoring pairs having a region of sequence homology such that the region of sequence homology comprises a first predetermined number of matches that exceeds a threshold number of matches, wherein the threshold number of matches was increased when the region of sequence homology was not 100% similar.
23. The computer system of claim 21 wherein the set of entities further includes:
a MapPos entity wherein a portion of the entries of the clone entity and a portion of the entries of the cluster entity are associated with the MapPos entity.
24. The computer system of claim 17 wherein the set of entities includes:
a tissueclass entity for establishing a predefined hierarchy of tissue classes; and
a library entity including a library.tissueclass attribute array for associating an entry of the library entity with at least one entry of the tissueclass entity.
25. The computer system of claim 17 wherein the set of entities includes:
a cluster entity; a HitID entity for associating entries of the cluster entity with entries of the HitID entity, the entries of the HitID entity being identifiers of known biomolecules, and a proteinfunction entity for establishing a predefined hierarchy of biomolecules by protein function, having a proteinclass.hitid attribute array for associating entries of the proteinfunction entity with entries of the HitID entity, wherein the proteinclass.hitid attribute array associates a biomolecule entry with at least one identifier of a known biomolecule in the HitID entity, allowing a user to search for biomolecular information by protein function and to identify clusters performing that protein function.
26. A computer program product for a computer system that stores a biomolecular database comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program product comprising:
a biomolecular database stored on memory media associated with the computer system, the biomolecular database having a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity; and
a data retrieval module that includes instructions for:
receiving a request for data associated with a particular attribute of the particular entry of a particular entity and returning the retrieved data to the requester, determining a particular entity offset value associated with the particular entry of the particular entity, and retrieving data using the particular entity offset value.
27. [...] the data retrieval module includes instructions for: generating the clone name attribute as a function of the entity offset of the clone entity.
28. The computer program product of claim 27 wherein the set of entities includes:
a cluster entity; a HitID entity for associating entries of the cluster entity with entries of the HitID entity, the entries of the HitID entity being identifiers of known biomolecules; and
a proteinfunction entity for establishing a predefined hierarchy of biomolecules by protein function, having a proteinclass.hitid attribute array for associating entries of the proteinfunction entity with entries of the HitID entity, wherein the proteinclass.hitid attribute array associates a biomolecule entry with at least one identifier of a known biomolecule in the HitID entity, allowing a user to search for biomolecular information by protein function and to identify clusters performing that protein function.
29. The computer program product of claim 26 wherein at least one entity of the set of entities has a first attribute, and
the data retrieval module includes instructions for: generating the first attribute value based on the associated entity offset.
30. The computer program product of claim 26 wherein the set of entities includes:
a library entity having library entries including a particular library entry associated with a particular library offset value, and a clone entity having a clone.library array storing associated library entity offset values, the particular library offset value being stored in the clone.library array to associate a particular clone entry with the particular library entry; and
the data retrieval module includes instructions for:
accessing the particular library entry of the library entity using the particular library offset value stored in the clone.library array.
31. The computer program product of claim 26, wherein the set of entities includes:
a clone entity with a plurality of clone entries, a portion of the plurality of clone entries being a first set of clone entries and having a first set of clone offset values, and a cluster entity with a plurality of cluster entries in a Cluster.Clone attribute array, one of the plurality of cluster entries being a particular cluster entry with an associated particular cluster offset value, wherein the first set of clone entries is associated with the particular cluster such that the first set of clone offset values is stored in a subsidiary array at a location associated with a particular subsidiary array offset value; and
the particular subsidiary array offset value is stored in the Cluster.Clone attribute array at a location designated by the particular cluster offset value for the particular cluster entry; and
the data retrieval module further includes instructions for retrieving the first set of clone offset values from the subsidiary array; and
accessing the entries in the clone entity using the first set of clone offset values.
32. The computer program product of claim 31 wherein clone entries associated with cluster entries of the cluster entity include high scoring pairs of associated clone entries having a region of sequence homology such that the region of sequence homology comprises a first predetermined number of matches that exceeds a threshold number of matches wherein the threshold number of matches was increased when the region of sequence homology was not 100% similar.
33. The computer program product of claim 30 wherein the set of entities includes:
a MapPos entity, wherein a portion of the entries of the clone entity and a portion of the entries of the cluster entity are associated with the MapPos entity.
34. The computer program product of claim 26 wherein the set of entities includes:
a tissueclass entity for establishing a predefined hierarchy of tissue classes; and
a library entity including a library.tissueclass attribute array wherein the library.tissueclass attribute array associates an entry of the library entity with at least one entry of the tissueclass entity.
Specification