System and method for a precompiled database for biomolecular sequence information
Abstract
A computer system stores biomolecular data in a database in a memory. The biomolecular database has a set of entities. Each entity stores attributes for a plurality of entries. At least one attribute is stored in an array. Data associated with an entry is stored at a location in the array. An entity offset designates the location of the data in the array. The same entity offset value is used to access data associated with a particular entry for all attributes within the entity.
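The layout described in the abstract resembles a column-oriented table: one array per attribute, with a single offset shared across all arrays of the same entity. A minimal sketch in Python follows; the attribute names (`name`, `length`), the sample values, and the helper `get_entry` are illustrative assumptions and do not come from the patent.

```python
# Hypothetical "clone" entity: one array per attribute.
# Attribute names and values are illustrative only.
clone = {
    "name":   ["c0001", "c0002", "c0003"],
    "length": [412, 389, 530],
}

def get_entry(entity, offset):
    """Read every attribute of one entry using the same entity offset."""
    return {attr: values[offset] for attr, values in entity.items()}

# Offset 1 selects the second entry in every attribute array.
entry = get_entry(clone, 1)
```

Because the offset is the only key, reading an entry requires no index lookup, only direct array access.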
190 Citations
31 Claims
-
1. A method of storing biomolecular data in a database, comprising:
-
generating a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity; and
storing the generated set of entities in a memory.
-
2. The method of claim 1, [...]
generating the clone name attribute as a function of the entity offset of the clone entity.
-
-
3. The method of claim 1, wherein at least one entity has a first attribute, further comprising:
generating the first attribute value based on the associated entity offset.
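Generating an attribute value from the entity offset means the value need not be stored at all. The sketch below assumes a hypothetical naming rule (a fixed prefix followed by a zero-padded offset); the claims do not specify the actual function.

```python
def clone_name(offset, prefix="CLONE", width=7):
    """Derive a name from the entity offset instead of storing it.

    The prefix and padding width are assumptions made for illustration.
    """
    return f"{prefix}{offset:0{width}d}"

def clone_offset(name, prefix="CLONE"):
    """Invert the hypothetical naming rule to recover the entity offset."""
    return int(name[len(prefix):])
```

With an invertible rule such as this one, a name supplied by a user can be converted straight back into the offset used to read the other attribute arrays.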
-
4. The method of claim 1, wherein the set of attributes includes a first entity and a second entity, and further comprising:
storing a particular offset value associated with a particular entry of the second entity in a first array for an entry of the first entity.
-
5. The method of claim 1, wherein the set of entities includes a clone entity and a library entity, and further comprising:
storing a particular library offset value associated with a particular library entity entry of the library entity in a clone library array for a clone entry of the clone entity.
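Storing one entity's offset inside another entity's attribute array acts like a foreign key that can be followed by direct indexing. A minimal sketch, assuming hypothetical library and clone names:

```python
# Hypothetical data; names are illustrative only.
library = {"name": ["brain_lib", "liver_lib"]}
clone = {
    "name":    ["c0", "c1", "c2"],
    "library": [1, 0, 1],  # each value is an offset into the library entity
}

def library_name_of(clone_offset):
    """Follow the stored library offset from a clone entry to its library."""
    lib_offset = clone["library"][clone_offset]
    return library["name"][lib_offset]
```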
-
6. The method of claim 1, wherein the set of entities includes:
-
a first entity with a plurality of first entries in a first array, and a second entity with a plurality of second entries in a second array, one of the plurality of second entries being a particular second entry associated with a particular second offset value, and further comprising:
associating a set of first entries having a set of first offset values with the particular second entry by:
storing a set of first offset values in a subsidiary array at a location associated with a particular subsidiary array offset value, and storing the particular subsidiary array offset value in the second array at a location designated by the particular second offset value.
-
-
7. The method of claim 6, further comprising:
-
storing, in the second array, a count representing the number of entries of the set of first entries; and
wherein the storing the set of first offset values sequentially stores the offset values of the set of first offset values in the subsidiary array.
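The one-to-many association of claims 6 and 7 can be pictured as a flat subsidiary array holding runs of offsets, with the second entity storing where each run starts and how long it is. The sketch below is one possible reading: keeping the start and the count together as a tuple in the second array is an assumption, as are all names.

```python
subsidiary = []    # flat array of first-entity offsets, stored sequentially
second_array = []  # per second-entity entry: (subsidiary offset, count)

def add_group(first_offsets):
    """Append a run of first-entity offsets and record its start and count.

    Returns the offset of the new second-entity entry.
    """
    start = len(subsidiary)
    subsidiary.extend(first_offsets)
    second_array.append((start, len(first_offsets)))
    return len(second_array) - 1

def members(second_offset):
    """Return the first-entity offsets associated with one second-entity entry."""
    start, count = second_array[second_offset]
    return subsidiary[start:start + count]

a = add_group([4, 7, 9])
b = add_group([2])
```

Storing the offsets of a set contiguously is what lets a start position plus a count identify the whole set.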
-
-
8. The method of claim 1, wherein the set of entities includes:
-
a clone entity with a plurality of clone entries, a portion of the plurality of clone entries being a first set of clone entries and having a first set of clone offset values, and a cluster entity with a plurality of cluster entries in a Cluster.Clone attribute array, one of the plurality of cluster entries being a particular cluster entry with an associated particular cluster offset value, and further comprising:
associating the first set of clone entries with the particular cluster entry by storing the first set of clone entity offset values in a subsidiary array at a location associated with a particular subsidiary array offset value, and storing the particular subsidiary array offset value in the Cluster.Clone attribute array at a location designated by the particular cluster offset value for the particular cluster entry.
-
-
9. The method of claim 8, wherein the cluster entries include high scoring pairs of associated clone entries having a region of sequence homology such that the region of sequence homology comprises a first predetermined number of matches that exceeds a threshold number of matches, wherein the threshold number of matches was increased when the region of sequence homology was not 100% similar.
-
10. The method of claim 8, wherein the set of entities further comprises a MapPos entity wherein a portion of the entries of the clone entity and a portion of the entries of the cluster entity are associated with the MapPos entity.
-
11. The method of claim 1, wherein the set of entities includes:
-
a tissueclass entity for establishing a predefined hierarchy of tissue classes; and
a library entity including a library.tissueclass attribute array for associating an entry of the library entity with at least one entry of the tissueclass entity.
-
-
12. The method of claim 11, wherein the tissueclass entity has a TissueClass.Name attribute array and a tissueclass.sub-class attribute array, and further comprising:
storing a particular tissueclass name for a particular entry in the TissueClass.Name attribute array, and storing a tissueclass offset value in the tissueclass.sub-class attribute array for the particular entry such that the tissueclass offset value associates another tissueclass entry having a different tissueclass name with the particular tissueclass entry.
-
13. The method of claim 12, further comprising:
populating a portion of the tissueclass entity with tissueclass entries having predetermined relationships among the tissueclass entries as defined by the TissueClass.Name attribute array, a tissueclass.parent attribute array and the tissueclass.sub-class attribute array.
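A hierarchy built from name, parent, and sub-class arrays can be walked in either direction by following stored offsets. The sketch below is a simplified assumption: each entry holds at most one sub-class offset, `-1` marks "no entry", and the tissue names are invented for illustration.

```python
NONE = -1  # assumed sentinel meaning "no associated entry"

# Hypothetical three-level hierarchy; names are illustrative only.
tissueclass = {
    "name":      ["Tissue", "Nervous", "Brain"],
    "parent":    [NONE, 0, 1],   # offset of the parent entry
    "sub_class": [1, 2, NONE],   # offset of a sub-class entry
}

def path_to_root(offset):
    """Follow parent offsets upward, collecting names along the way."""
    path = []
    while offset != NONE:
        path.append(tissueclass["name"][offset])
        offset = tissueclass["parent"][offset]
    return path

def descend(offset):
    """Follow sub-class offsets downward, collecting names along the way."""
    chain = []
    while offset != NONE:
        chain.append(tissueclass["name"][offset])
        offset = tissueclass["sub_class"][offset]
    return chain
```

An entry with several sub-classes would need a list of offsets rather than a single value, for example by way of a subsidiary array.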
-
14. The method of claim 1, wherein the set of entities includes:
-
a cluster entity;
a HitID entity associating entries of the cluster entity with entries of the HitID entity, the entries of the HitID entity being identifiers of known biomolecules; and
a proteinfunction entity for establishing a predefined hierarchy of biomolecules by protein function, having a proteinclass.hitid attribute array for associating entries of the proteinfunction entity with entries of the HitID entity, wherein the proteinclass.hitid attribute array associates a biomolecule entry with at least one identifier of a known biomolecule in the HitID entity, allowing a user to search for biomolecular information by protein function and to identify clusters performing that protein function.
-
-
15. The method of claim 14, wherein the proteinfunction entity has a proteinfunction.name attribute array and a proteinfunction.subclass attribute array, further comprising:
storing a particular biomolecular name for a particular entry in the proteinfunction.name attribute array; and
storing a proteinfunction offset value in the proteinfunction.subclass attribute array for the particular entry such that the proteinfunction offset value associates another proteinfunction entry having a different biomolecule name with the particular proteinfunction entry for establishing a hierarchy of protein functions.
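The search path described in claims 14 and 15 runs from a protein function to identifiers of known biomolecules, and from those identifiers to clusters. The sketch below assumes placeholder identifiers (`HIT_A` and so on), one HitID offset per cluster, and plain Python lists in place of subsidiary arrays; none of these details come from the patent.

```python
# Hypothetical data; identifiers and function names are placeholders.
hitid = {"id": ["HIT_A", "HIT_B", "HIT_C"]}
cluster = {"hitid": [2, 0, 0, 1]}      # cluster offset -> HitID offset
proteinfunction = {
    "name":  ["kinase", "protease"],
    "hitid": [[0, 1], [2]],            # function offset -> HitID offsets
}

def clusters_for_function(name):
    """Return offsets of clusters whose hit belongs to the named function."""
    f = proteinfunction["name"].index(name)
    hits = set(proteinfunction["hitid"][f])
    return [c for c, h in enumerate(cluster["hitid"]) if h in hits]
```

For example, `clusters_for_function("kinase")` collects every cluster whose stored HitID offset is one of the offsets listed for the kinase entry.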
-
16. An article of manufacture comprising a machine-readable medium having stored thereon instructions to:
-
generate a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity; and
store the generated set of entities in a memory.
-
17. The article of manufacture of claim 16, [...]
generate the clone name attribute as a function of the entity offset of the clone entity.
-
-
18. The article of manufacture of claim 16, wherein at least one entity has a first attribute, and wherein the machine-readable medium further comprises an instruction to:
generate the first attribute value based on the associated entity offset.
-
19. The article of manufacture of claim 16, wherein the set of attributes includes a first entity and a second entity, and wherein the machine-readable medium further comprises an instruction to:
store a particular offset value associated with a particular entry of the second entity in a first array for an entry of the first entity.
-
20. The article of manufacture of claim 16, wherein the set of entities includes a clone entity and a library entity, and wherein the machine-readable medium further comprises an instruction to:
store a particular library offset value associated with a particular library entity entry of the library entity in a clone library array for a clone entry of the clone entity.
-
21. The article of manufacture of claim 16, wherein the set of entities includes:
-
a first entity with a plurality of first entries in a first array, and a second entity with a plurality of second entries in a second array, one of the plurality of second entries being a particular second entry associated with a particular second offset value, and wherein the machine-readable medium further comprises an instruction to:
associate a set of first entries having a set of first offset values with the particular second entry by:
storing a set of first offset values in a subsidiary array at a location associated with a particular subsidiary array offset value, and storing the particular subsidiary array offset value in the second array at a location designated by the particular second offset value.
-
-
22. The article of manufacture of claim 21, wherein the machine-readable medium further comprises an instruction to:
-
store, in the second array, a count representing the number of entries of the set of first entries; and
wherein the instruction to store the set of first offset values sequentially stores the offset values of the set of first offset values in the subsidiary array.
-
-
23. The article of manufacture of claim 16, wherein the set of entities includes:
-
a clone entity with a plurality of clone entries, a portion of the plurality of clone entries being a first set of clone entries and having a first set of clone offset values, and a cluster entity with a plurality of cluster entries in a Cluster.Clone attribute array, one of the plurality of cluster entries being a particular cluster entry with an associated particular cluster offset value, and wherein the machine-readable medium further comprises an instruction to:
associate the first set of clone entries with the particular cluster entry by storing the first set of clone entity offset values in a subsidiary array at a location associated with a particular subsidiary array offset value, and storing the particular subsidiary array offset value in the Cluster.Clone attribute array at a location designated by the particular cluster offset value for the particular cluster entry.
-
-
24. The article of manufacture of claim 23, wherein the cluster entries include high scoring pairs of associated clone entries having a region of sequence homology such that the region of sequence homology comprises a first predetermined number of matches that exceeds a threshold number of matches, wherein the threshold number of matches was increased when the region of sequence homology was not 100% similar.
-
25. The article of manufacture of claim 23, wherein the set of entities further comprises a MapPos entity wherein a portion of the entries of the clone entity and a portion of the entries of the cluster entity are associated with the MapPos entity.
-
26. The article of manufacture of claim 16, wherein the set of entities includes:
-
a tissueclass entity for establishing a predefined hierarchy of tissue classes; and
a library entity including a library.tissueclass attribute array for associating an entry of the library entity with at least one entry of the tissueclass entity.
-
-
27. The article of manufacture of claim 26, wherein the tissueclass entity has a TissueClass.Name attribute array and a tissueclass.sub-class attribute array, and wherein the machine-readable medium further comprises instructions to:
store a particular tissueclass name for a particular entry in the TissueClass.Name attribute array, and store a tissueclass offset value in the tissueclass.sub-class attribute array for the particular entry such that the tissueclass offset value associates another tissueclass entry having a different tissueclass name with the particular tissueclass entry.
-
28. The article of manufacture of claim 27, wherein the machine-readable medium further comprises an instruction to:
populate a portion of the tissueclass entity with tissueclass entries having predetermined relationships among the tissueclass entries as defined by the TissueClass.Name attribute array, a tissueclass.parent attribute array and the tissueclass.sub-class attribute array.
-
29. The article of manufacture of claim 16, wherein the set of entities includes:
-
a cluster entity;
a HitID entity associating entries of the cluster entity with entries of the HitID entity, the entries of the HitID entity being identifiers of known biomolecules; and
a proteinfunction entity for establishing a predefined hierarchy of biomolecules by protein function, having a proteinclass.hitid attribute array for associating entries of the proteinfunction entity with entries of the HitID entity, wherein the proteinclass.hitid attribute array associates a biomolecule entry with at least one identifier of a known biomolecule in the HitID entity, allowing a user to search for biomolecular information by protein function and to identify clusters performing that protein function.
-
-
30. The article of manufacture of claim 29, wherein the proteinfunction entity has a proteinfunction.name attribute array and a proteinfunction.subclass attribute array, and wherein the machine-readable medium further comprises instructions to:
store a particular biomolecular name for a particular entry in the proteinfunction.name attribute array; and
store a proteinfunction offset value in the proteinfunction.subclass attribute array for the particular entry such that the proteinfunction offset value associates another proteinfunction entry having a different biomolecule name with the particular proteinfunction entry for establishing a hierarchy of protein functions.
-
31. A computer system for a biomolecular database, comprising:
-
means for generating a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity; and
means for storing the generated set of entities in a memory.
-
Specification