System and method for a precompiled database for biomolecular sequence information

US 6,223,186 B1
Filed: 10/20/1998
Issued: 04/24/2001
Est. Priority Date: 05/04/1998
Status: Expired due to Fees

Abstract

A computer system stores biomolecular data in a database in a memory. The biomolecular database has a set of entities. Each entity stores attributes for a plurality of entries. At least one attribute is stored in an array. Data associated with an entry is stored at a location in the array. An entity offset designates the location of the data in the array. The same entity offset value is used to access data associated with a particular entry for all attributes within the entity.
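
The layout described in the abstract can be pictured as a set of parallel arrays, one per attribute, all indexed by the same entity offset. The following is a minimal sketch under that reading, not the patented implementation; the `Entity` class, its method names, and the sample clone data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One entity: each attribute is an array indexed by the entity offset."""
    name: str
    attributes: dict = field(default_factory=dict)  # attribute name -> list

    def add_entry(self, **values) -> int:
        """Append one entry to every attribute array; return its entity offset."""
        if not values:
            raise ValueError("an entry needs at least one attribute value")
        if not self.attributes:
            # The first entry fixes the attribute set (an assumption of this sketch).
            self.attributes = {attr: [] for attr in values}
        if set(values) != set(self.attributes):
            raise ValueError("an entry must supply a value for every attribute")
        for attr, value in values.items():
            self.attributes[attr].append(value)
        # All arrays have equal length, so the new offset is length - 1.
        return len(next(iter(self.attributes.values()))) - 1

    def get(self, attr: str, offset: int):
        """Retrieve one attribute of one entry using its entity offset."""
        return self.attributes[attr][offset]


clone = Entity("Clone")
first = clone.add_entry(name="clone-A", length=512)
second = clone.add_entry(name="clone-B", length=1024)

# The same offset addresses the same entry in every attribute array.
print(clone.get("name", second), clone.get("length", second))
```

Because every attribute array shares one index, reading a second attribute of an entry needs no further lookup once its offset is known.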

34 Claims

1. A method of accessing biomolecular data stored in a database, comprising the steps of:
- generating a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity;
  
  storing the generated set of entities in a memory; and
  
  retrieving data associated with the at least one attribute of a particular entry of one of the entities in the set of entities using an associated entity offset.
2. The method of claim 1 wherein at least one entity is a clone entity having a clone name attribute associated with a clone name, further comprising the step of:
3. The method of claim 1 wherein at least one entity has a first attribute, further comprising the step of:
- generating the first attribute value based on the associated entity offset.
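
One way to read claim 3 is that an attribute need not be stored at all if it can be computed from the entity offset. The sketch below assumes a hypothetical identifier format (a fixed prefix plus a zero-padded offset); the claim does not specify any format, and the names `ID_PREFIX`, `clone_id` and `offset_of` are illustrative only.

```python
ID_PREFIX = "CLN"  # hypothetical prefix, not taken from the patent
ID_WIDTH = 7       # hypothetical zero-padded width


def clone_id(offset: int) -> str:
    """Generate an identifier attribute value from an entity offset."""
    if offset < 0:
        raise ValueError("entity offsets are non-negative")
    return f"{ID_PREFIX}{offset:0{ID_WIDTH}d}"


def offset_of(identifier: str) -> int:
    """Recover the entity offset from an identifier produced by clone_id."""
    if not identifier.startswith(ID_PREFIX):
        raise ValueError("unrecognized identifier")
    return int(identifier[len(ID_PREFIX):])


print(clone_id(42))
```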
4. The method of claim 1, wherein the set of entities includes a first entity and a second entity, and further comprising the steps of:
- storing a particular offset value associated with a particular entry of the second entity in a first array for an entry of the first entity; and
  
  accessing the particular entry of the second entity using the particular offset value stored in the first array for the entry of the first entity.
5. The method of claim 1 wherein the set of entities includes a clone entity and a library entity, and further comprising the steps of:
- storing a particular library offset value associated with a particular library entry of the library entity in a clone.library array for a clone entry of the clone entity; and
  
  accessing the particular library entry of the second entity using the particular library offset value stored in the clone.library array for the clone entry of the clone entity.
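
Claims 4 and 5 describe linking entities by storing one entity's offset inside an attribute array of another. A minimal sketch, assuming plain Python lists stand in for the attribute arrays; the library and clone names are invented sample data.

```python
# Library entity: one attribute array, indexed by library offset.
library_name = ["brain-lib", "liver-lib"]

# Clone entity: parallel attribute arrays, indexed by clone offset.
clone_name = ["clone-A", "clone-B", "clone-C"]
clone_library = [1, 0, 1]  # clone.library: the library offset for each clone


def library_of(clone_offset: int) -> str:
    """Follow the offset stored in clone.library into the library entity."""
    library_offset = clone_library[clone_offset]
    return library_name[library_offset]


for offset, name in enumerate(clone_name):
    print(name, "->", library_of(offset))
```

The stored value acts like a foreign key, but it is resolved by direct array indexing rather than by a search.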
6. The method of claim 1 wherein the set of entities includes:
- a first entity with a plurality of first entries in a first array, and a second entity with a plurality of second entries in a second array, one of the plurality of second entries being a particular second entry associated with a particular second offset value, and further comprising the step of:
  
  associating a set of first entries having a set of first offset values with the particular second entry by:
  
  storing the set of first offset values in a subsidiary array at a location associated with a particular subsidiary array offset value; and
  
  storing the particular subsidiary array offset value in the second array at a location designated by the particular second offset value.
7. The method of claim 6 further comprising the steps of:
- storing, in the second array, a count representing the number of entries of the set of first entries; and
  
  wherein said step of storing the set of first offset values sequentially stores the offset values of the set of first offset values in the subsidiary array.
8. The method of claim 7 further comprising the steps of:
- retrieving the count value;
  
  retrieving the set of first offset values in the subsidiary array; and
  
  accessing the entries of the set of first entries based on the set of first offset values.
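
The one-to-many association in claims 6 to 8 can be sketched as follows, using clone and cluster entities as the illustrative first and second entities. The array names (`cluster_clone_start`, `cluster_clone_count`, `subsidiary`) and helper functions are assumptions made for this example.

```python
# First entity (clones), indexed by clone offset.
clone_name = ["clone-A", "clone-B", "clone-C", "clone-D"]

# Second entity (clusters), indexed by cluster offset.
cluster_clone_start = []  # subsidiary array offset stored per cluster
cluster_clone_count = []  # count of associated clones per cluster

# Subsidiary array: clone offsets stored sequentially, cluster by cluster.
subsidiary = []


def add_cluster(clone_offsets) -> int:
    """Store a cluster's clone offsets and return the new cluster offset."""
    cluster_offset = len(cluster_clone_start)
    cluster_clone_start.append(len(subsidiary))
    cluster_clone_count.append(len(clone_offsets))
    subsidiary.extend(clone_offsets)
    return cluster_offset


def clones_in(cluster_offset: int) -> list:
    """Retrieve the count, read the offsets, then access the clone entries."""
    start = cluster_clone_start[cluster_offset]
    count = cluster_clone_count[cluster_offset]
    return [clone_name[i] for i in subsidiary[start:start + count]]


c0 = add_cluster([0, 2])
c1 = add_cluster([1, 2, 3])
print(clones_in(c0), clones_in(c1))
```

Storing a start offset and a count keeps each cluster's members contiguous, so a lookup is a single slice of the subsidiary array.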
9. The method of claim 1 wherein the set of entities includes:
- a clone entity with a plurality of clone entries, a portion of the plurality of clone entries being a first set of clone entries and having a first set of clone offset values, and a cluster entity with a plurality of cluster entries in a Cluster.Clone attribute array, one of the plurality of cluster entries being a particular cluster entry with an associated particular cluster offset value, and further comprising the step of:
  
  associating the first set of clone entries with the particular cluster entry by:
  
  storing the first set of clone offset values in a subsidiary array at a location associated with a particular subsidiary array offset value; and
  
  storing the particular subsidiary array offset value in the Cluster.Clone attribute array at a location designated by the particular cluster offset value for the particular cluster entry.
10. The method of claim 9 wherein the cluster entries include high scoring pairs of associated clone entries having a region of sequence homology such that the region of sequence homology comprises a first predetermined number of matches that exceeds a threshold number of matches, wherein the threshold number of matches was increased when the region of sequence homology was not 100% similar.
11. The method of claim 9 further comprising:
- a MapPos entity wherein a portion of the entries of the clone entity and a portion of the entries of the cluster entity are associated with the MapPos entity.
12. The method of claim 1 wherein the set of entities includes:
- a tissueclass entity for establishing a predefined hierarchy of tissue classes; and
  
  a library entity including a library.tissueclass attribute array for associating an entry of the library entity with at least one entry of the tissueclass entity.
13. The method of claim 12 wherein the tissueclass entity has a TissueClass.Name attribute array and a tissueclass.sub-class attribute array;
- further comprising the steps of:
  
  storing a particular tissueclass name for a particular entry in the TissueClass.Name attribute array; and
  
  storing a tissueclass offset value in the tissueclass.sub-class attribute array for the particular entry such that the tissueclass offset value associates another tissueclass entry having a different tissueclass name with the particular tissueclass entry.
14. The method of claim 13 further comprising the step of:
- populating a portion of the tissueclass entity with tissueclass entries having predetermined relationships among the tissueclass entries as defined by the TissueClass.Name attribute array, a tissueclass.parent attribute array and the tissueclass.sub-class attribute array.
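
The tissue class hierarchy of claims 12 to 14 can be sketched with three parallel arrays (name, parent, sub-class) plus a library.tissueclass array. The tissue names, library names and helper functions below are invented for illustration, and each library is linked to exactly one tissue class for simplicity, although claim 12 allows more than one.

```python
# tissueclass entity, indexed by tissueclass offset.
tissue_name = ["Nervous System", "Brain", "Hippocampus", "Digestive System", "Liver"]
tissue_parent = [None, 0, 1, None, 3]  # parent offset, None for a root class

# Sub-class offsets derived from the parent array.
tissue_subclass = [[] for _ in tissue_name]
for offset, parent in enumerate(tissue_parent):
    if parent is not None:
        tissue_subclass[parent].append(offset)

# library entity, indexed by library offset.
library_name = ["lib-hippo", "lib-liver", "lib-brain"]
library_tissueclass = [2, 4, 1]  # library.tissueclass: one tissueclass offset each


def descendants(tissue_offset: int) -> list:
    """Return the class itself plus all of its sub-classes, depth-first."""
    found = [tissue_offset]
    for child in tissue_subclass[tissue_offset]:
        found.extend(descendants(child))
    return found


def libraries_under(tissue_offset: int) -> list:
    """List libraries attached to a tissue class or any of its sub-classes."""
    wanted = set(descendants(tissue_offset))
    return [library_name[i] for i, t in enumerate(library_tissueclass) if t in wanted]


print(libraries_under(0))
```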
15. The method of claim 1 wherein the set of entities includes:
- a cluster entity;
  
  a HitID entity for associating entries of the cluster entity with entries of the HitID entity, the entries of the HitID entity being identifiers of known biomolecules; and
  
  a proteinfunction entity for establishing a predefined hierarchy of biomolecules by protein function, having a proteinclass.hitid attribute array for associating entries of the proteinfunction entity with entries of the HitID entity, wherein the proteinclass.hitid attribute array associates a biomolecule entry with at least one identifier of a known biomolecule in the HitID entity, allowing a user to search for biomolecular information by protein function and to identify clusters performing that protein function.
16. The method of claim 15 wherein the proteinfunction entity has a proteinfunction.name attribute array and a proteinfunction.subclass attribute array;
- further comprising the steps of:
  
  storing a particular biomolecule name for a particular entry in the proteinfunction.name attribute array; and
  
  storing a proteinfunction offset value in the proteinfunction.subclass attribute array for the particular entry such that the proteinfunction offset value associates another proteinfunction entry having a different biomolecule name with the particular proteinfunction entry for establishing a hierarchy of protein functions.
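
The search by protein function described in claims 15 and 16 can be sketched as a walk from a proteinfunction entry, through its sub-classes and the proteinclass.hitid array, to the clusters associated with those HitID entries. All identifiers and sample values below are placeholders, and the cluster-to-HitID array (`cluster_hitid`) and the inclusion of sub-classes in the search are assumptions of this sketch.

```python
# HitID entity: identifiers of known biomolecules (placeholder values).
hitid = ["HIT-001", "HIT-002", "HIT-003"]

# proteinfunction entity, indexed by proteinfunction offset.
pf_name = ["Enzyme", "Kinase", "Receptor"]
pf_subclass = [[1], [], []]      # proteinfunction.subclass offsets
pf_hitid = [[], [0, 1], [2]]     # proteinclass.hitid: HitID offsets per entry

# cluster entity: the HitID offset each cluster matched, or None.
cluster_hitid = [1, 2, 0, None]


def clusters_for_function(pf_offset: int) -> list:
    """Return offsets of clusters whose HitID falls under a protein function."""
    hits, stack = set(), [pf_offset]
    while stack:
        current = stack.pop()
        hits.update(pf_hitid[current])
        stack.extend(pf_subclass[current])  # include sub-classes in the search
    return [c for c, h in enumerate(cluster_hitid) if h in hits]


print(pf_name[0], clusters_for_function(0))
```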

17. A computer system for a biomolecular database, comprising:
- a biomolecular database stored on memory media associated with the computer system, the biomolecular database having a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity; and
  
  a data retrieval process that includes instructions for:
  
  receiving a request for data associated with a particular attribute of the particular entry of a particular entity and returning the retrieved data to the requester, determining a particular entity offset value associated with the particular entry of the particular entity, and retrieving data using the particular entity offset value.
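
The data retrieval process of claim 17 can be sketched as three steps: receive a request, determine the entity offset, and index the attribute array. In this sketch the offset is determined through a name-to-offset index built once at load time; that index, the `retrieve` function, and the sample data are assumptions, since the claim does not say how the offset is determined.

```python
# Database: entity name -> attribute name -> array indexed by entity offset.
database = {
    "Clone": {"name": ["clone-A", "clone-B"], "length": [512, 1024]},
    "Library": {"name": ["brain-lib"], "tissue": ["brain"]},
}

# Built once up front: entry name -> entity offset, per entity.
offset_index = {
    entity: {entry: offset for offset, entry in enumerate(attrs["name"])}
    for entity, attrs in database.items()
}


def retrieve(entity: str, entry_name: str, attribute: str):
    """Handle one request: determine the offset, then read the attribute."""
    offset = offset_index[entity][entry_name]   # determine the entity offset
    return database[entity][attribute][offset]  # retrieve using that offset


print(retrieve("Clone", "clone-B", "length"))
```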
18. The computer system of claim 17 wherein at least one entity is a clone entity having a clone name attribute associated with a clone name, the data retrieval process further including instructions for:
19. The computer system of claim 18 wherein the set of entities includes:
- a library entity having library entries including a particular library entry associated with a particular library offset value; and
  
  a clone entity having a clone.library array storing associated library entity offset values, the particular library offset value being stored in the clone.library array to associate a particular clone entry with the particular library entry;
  
  the data retrieval process further including instructions for:
  
  accessing the particular library entry of the library entity using the particular library offset value stored in the clone.library array.
20. The computer system of claim 17 wherein at least one entity has a first attribute, the data retrieval process further including instructions for:
- generating the first attribute value based on the associated entity offset.
21. The computer system of claim 17, wherein the set of entities includes:
- a clone entity with a plurality of clone entries, a portion of the plurality of clone entries being a first set of clone entries and having a first set of clone offset values, and a cluster entity with a plurality of cluster entries in a Cluster.Clone attribute array, one of the plurality of cluster entries being a particular cluster entry with an associated particular cluster offset value, wherein the first set of clone entries is associated with the particular cluster such that the first set of clone offset values is stored in a subsidiary array at a location associated with a particular subsidiary array offset value; and
  
  the particular subsidiary array offset value is stored in the Cluster.Clone attribute array at a location designated by the particular cluster offset value for the particular cluster entry; and
  
  the data retrieval process further includes instructions for retrieving the first set of clone offset values from the subsidiary array; and
  
  accessing the entries in the clone entity using the first set of clone offset values.
22. The computer system of claim 21 wherein clusters have high scoring pairs having a region of sequence homology such that the region of sequence homology comprises a first predetermined number of matches that exceeds a threshold number of matches, wherein the threshold number of matches was increased when the region of sequence homology was not 100% similar.
23. The computer system of claim 21 wherein the set of entities further includes:
- a MapPos entity wherein a portion of the entries of the clone entity and a portion of the entries of the cluster entity are associated with the MapPos entity.
24. The computer system of claim 17 wherein the set of entities includes:
- a tissueclass entity for establishing a predefined hierarchy of tissue classes; and
  
  a library entity including a library.tissueclass attribute array for associating an entry of the library entity with at least one entry of the tissueclass entity.
25. The computer system of claim 17 wherein the set of entities includes:
- a cluster entity;
  
  a HitID entity for associating entries of the cluster entity with entries of the HitID entity, the entries of the HitID entity being identifiers of known biomolecules, and a proteinfunction entity for establishing a predefined hierarchy of biomolecules by protein function, having a proteinclass.hitid attribute array for associating entries of the proteinfunction entity with entries of the HitID entity, wherein the proteinclass.hitid attribute array associates a biomolecule entry with at least one identifier of a known biomolecule in the HitID entity, allowing a user to search for biomolecular information by protein function and to identify clusters performing that protein function.

26. A computer program product for a computer system that stores a biomolecular database comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program product comprising:
- a biomolecular database stored on memory media associated with the computer system, the biomolecular database having a set of entities, each entity storing attributes for a plurality of entries, at least one attribute being stored in an array wherein data associated with an entry is stored at a location in the array, an entity offset designating the location of the data associated with each entry within the array, wherein the same entity offset is used to access data associated with a particular entry for all attributes within the entity; and
  
  a data retrieval module that includes instructions for:
  
  receiving a request for data associated with a particular attribute of the particular entry of a particular entity and returning the retrieved data to the requester, determining a particular entity offset value associated with the particular entry of the particular entity, and retrieving data using the particular entity offset value.
27. The computer program product of claim 26 wherein at least one entity of the set of entities is a clone entity having a clone name attribute associated with a clone name, and
28. The computer program product of claim 27 wherein the set of entities includes:
- a cluster entity;
  
  a HitID entity for associating entries of the cluster entity with entries of the HitID entity, the entries of the HitID entity being identifiers of known biomolecules; and
  
  a proteinfunction entity for establishing a predefined hierarchy of biomolecules by protein function, having a proteinclass.hitid attribute array for associating entries of the proteinfunction entity with entries of the HitID entity, wherein the proteinclass.hitid attribute array associates a biomolecule entry with at least one identifier of a known biomolecule in the HitID entity, allowing a user to search for biomolecular information by protein function and to identify clusters performing that protein function.
29. The computer program product of claim 26 wherein at least one entity of the set of entities has a first attribute, and the data retrieval module includes instructions for:
- generating the first attribute value based on the associated entity offset.
30. The computer program product of claim 26 wherein the set of entities includes:
- a library entity having library entries including a particular library entry associated with a particular library offset value, and a clone entity having a clone.library array storing associated library entity offset values, the particular library offset value being stored in the clone.library array to associate a particular clone entry with the particular library entry; and
  
  the data retrieval module includes instructions for:
  
  accessing the particular library entry of the library entity using the particular library offset value stored in the clone.library array.
31. The computer program product of claim 26, wherein the set of entities includes:
- a clone entity with a plurality of clone entries, a portion of the plurality of clone entries being a first set of clone entries and having a first set of clone offset values, and a cluster entity with a plurality of cluster entries in a Cluster.Clone attribute array, one of the plurality of cluster entries being a particular cluster entry with an associated particular cluster offset value, wherein the first set of clone entries is associated with the particular cluster such that the first set of clone offset values is stored in a subsidiary array at a location associated with a particular subsidiary array offset value; and
  
  the particular subsidiary array offset value is stored in the Cluster.Clone attribute array at a location designated by the particular cluster offset value for the particular cluster entry; and
  
  the data retrieval module further includes instructions for retrieving the first set of clone offset values from the subsidiary array; and
  
  accessing the entries in the clone entity using the first set of clone offset values.
32. The computer program product of claim 31 wherein clone entries associated with cluster entries of the cluster entity include high scoring pairs of associated clone entries having a region of sequence homology such that the region of sequence homology comprises a first predetermined number of matches that exceeds a threshold number of matches wherein the threshold number of matches was increased when the region of sequence homology was not 100% similar.
33. The computer program product of claim 30 wherein the set of entities includes:
- a MapPos entity, wherein a portion of the entries of the clone entity and a portion of the entries of the cluster entity are associated with the MapPos entity.
34. The computer program product of claim 26 wherein the set of entities includes:
- a tissueclass entity for establishing a predefined hierarchy of tissue classes; and
  
  a library entity including a library.tissueclass attribute array wherein the library.tissueclass attribute array associates an entry of the library entity with at least one entry of the tissueclass entity.

Current Assignee: Incyte Pharmaceuticals, Inc.
Original Assignee: Incyte Pharmaceuticals, Inc.
Inventors: Klingler, Tod M.; Walker, Michael G.; Stuve, Laura L.; Lagace, Robert E.; Hadley, David A.; Goold, Richard D.; Rigault, Philippe E.; Wood, Michael P.; Curtis, Anne L.; Hibbert, Harold H.
Primary Examiner(s): Alam, Hosain T.
Assistant Examiner(s): Alam, Shahid Al
Application Number: US09/175,738
Time in Patent Office: 917 Days
Field of Search: 707/100, 707/104, 707/2, 707/5, 707/6-10, 707/101, 707/102, 707/103
US Class Current: N/A
CPC Class Codes:
G16B 50/00   ICT programming tools or da...
G16B 50/30   Data warehousing; Computing...
Y10S 707/99936   Pattern matching access
Y10S 707/99945   Object-oriented database st...
Y10S 707/99948   Application of database or ...
