Database system employing protein function hierarchies for viewing biomolecular sequence data
Abstract
Disclosed is a relational database system for storing biomolecular sequence information in a manner that allows sequences to be catalogued and searched according to one or more protein function hierarchies. The hierarchies allow searches for sequences based upon a protein's biological function or molecular function. Also disclosed is a mechanism for automatically grouping new sequences into protein function hierarchies. This mechanism uses descriptive information obtained from "external hits," which are matches of stored sequences against gene sequences stored in an external database such as GenBank. The descriptive information provided with the external database is evaluated according to a specific algorithm and used to automatically group the external hits (or the sequences associated with the hits) in the categories. Ultimately, the biomolecular sequences stored in databases of this invention are provided with both descriptive information from the external hit and category information from a relevant hierarchy or hierarchies.
45 Claims
1. A computer system comprising:
a database containing records pertaining to a plurality of biomolecular sequences;
a first hierarchy of protein function categories into which at least some of said biomolecular sequences are grouped, said protein function categories specifying biological functions of proteins corresponding to said biomolecular sequences and said first hierarchy including (i) a first set of protein function categories specifying biological functions at a cellular level, and (ii) a second set of protein function categories specifying biological functions at a level above the cellular level; and
a user interface allowing a user to selectively view information regarding said plurality of said biomolecular sequences as it relates to said first hierarchy.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
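As a rough illustration of the two-level hierarchy recited in claim 1, the following Python sketch groups sequence identifiers under categories tagged as cellular or above-cellular and renders a simple text view. It is only a sketch under stated assumptions: the class `Category`, the function `render`, the category names, and the `SEQ-...` identifiers are all hypothetical and do not come from the patent.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """One node of a protein function hierarchy (hypothetical structure)."""
    name: str
    level: str  # "root", "cellular", or "above-cellular" (assumed labels)
    children: list[Category] = field(default_factory=list)
    sequence_ids: set[str] = field(default_factory=set)

    def add_child(self, child: Category) -> Category:
        self.children.append(child)
        return child

    def all_sequence_ids(self) -> set[str]:
        """Sequences grouped in this category or in any descendant."""
        ids = set(self.sequence_ids)
        for child in self.children:
            ids |= child.all_sequence_ids()
        return ids


def render(category: Category, depth: int = 0) -> list[str]:
    """Return indented text lines, a stand-in for the user interface view."""
    count = len(category.all_sequence_ids())
    lines = [f"{'  ' * depth}{category.name} [{category.level}] ({count} sequences)"]
    for child in category.children:
        lines.extend(render(child, depth + 1))
    return lines


# Placeholder categories and identifiers, chosen only for illustration.
root = Category("protein function", "root")
cell = root.add_child(Category("cell division", "cellular"))
above = root.add_child(Category("inflammation", "above-cellular"))
cell.sequence_ids.add("SEQ-001")
above.sequence_ids.update({"SEQ-002", "SEQ-003"})

if __name__ == "__main__":
    print("\n".join(render(root)))
```

Counting sequences recursively lets a parent category summarize everything grouped beneath it, which is one plausible way to support selective viewing at different levels.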
14. A method of using a computer system to present information pertaining to a plurality of biomolecular sequence records stored in a database, the method comprising:
displaying a list of said records or a field for entering information identifying one or more of said records;
identifying one or more of said records that a user has selected from said list or field;
matching said one or more selected records with one or more protein function categories from a first hierarchy of protein function categories into which at least some of said biomolecular sequence records are grouped; and
displaying the one or more categories matching said one or more selected records,
wherein said protein function categories specify biological functions of proteins corresponding to said biomolecular sequences and said first hierarchy includes (i) a first set of protein function categories specifying biological functions at a cellular level, and (ii) a second set of protein function categories specifying biological functions at a tissue level.
Dependent claims: 15, 16, 17, 18, 19, 20
21. A method of using a computer system to present information pertaining to a plurality of biomolecular sequence records stored in a database, the method comprising:
displaying a list of one or more protein biological function categories from a first hierarchy of protein biological function categories into which at least some of said biomolecular sequence records are grouped;
identifying one or more of said protein biological function categories that a user has selected from said list;
matching said one or more selected protein biological function categories with one or more biomolecular sequence records which are grouped in the selected protein biological function categories; and
displaying the one or more sequence records matching said one or more selected protein biological function categories,
wherein said protein biological function categories specify biological functions of proteins corresponding to said biomolecular sequences and said first hierarchy includes (i) a first set of protein biological function categories specifying biological functions at a cellular level, and (ii) a second set of protein biological function categories specifying biological functions at a tissue level.
Dependent claims: 22, 23, 24, 25, 26
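Claims 14 and 21 describe the same matching step in opposite directions: from selected records to their categories, and from selected categories to their records. A minimal sketch of such a two-way lookup follows, assuming a simple in-memory index; the class `CategoryIndex`, its method names, and the sample data are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


class CategoryIndex:
    """Two-way mapping between sequence records and function categories."""

    def __init__(self):
        self._categories_by_record = defaultdict(set)
        self._records_by_category = defaultdict(set)

    def group(self, record_id, category):
        """Record that a sequence record is grouped in a category."""
        self._categories_by_record[record_id].add(category)
        self._records_by_category[category].add(record_id)

    def categories_for(self, record_ids):
        """Claim 14 direction: selected records -> matching categories."""
        found = set()
        for record_id in record_ids:
            found |= self._categories_by_record.get(record_id, set())
        return sorted(found)

    def records_in(self, categories):
        """Claim 21 direction: selected categories -> matching records."""
        found = set()
        for category in categories:
            found |= self._records_by_category.get(category, set())
        return sorted(found)


# Placeholder data: one cellular-level and one tissue-level category.
index = CategoryIndex()
index.group("SEQ-001", "cell division")
index.group("SEQ-002", "wound healing")
index.group("SEQ-003", "wound healing")

if __name__ == "__main__":
    print(index.categories_for(["SEQ-001", "SEQ-002"]))
    print(index.records_in(["wound healing"]))
```

Keeping both mappings in step on every `group` call makes either lookup direct, at the cost of storing each association twice.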
27. A database system having a plurality of internal records, the database comprising:
a plurality of sequence records specifying biomolecular sequences, at least some of said records referencing hits to an external database, which hits specify genes having sequences that at least partially match those of the biomolecular sequences;
a plurality of external hit records specifying said hits to said external database, at least some of said records referencing protein function hierarchy categories which specify at least one of biological functions of proteins or molecular functions of proteins.
Dependent claims: 28, 29, 30, 31, 32, 33, 34
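The record structure of claim 27 (sequence records that reference external hits, and external hit records that reference categories) maps naturally onto relational tables. The sketch below uses Python's built-in `sqlite3` module to show one possible layout. All table names, column names, and sample rows are assumptions made for illustration; the accession string is a placeholder, not a real GenBank identifier.

```python
import sqlite3

# Hypothetical schema: nullable foreign keys model "at least some" records
# referencing a hit, and "at least some" hits referencing a category.
SCHEMA = """
CREATE TABLE category (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('biological', 'molecular'))
);
CREATE TABLE external_hit (
    id          INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL,
    description TEXT,
    category_id INTEGER REFERENCES category(id)
);
CREATE TABLE sequence_record (
    id       INTEGER PRIMARY KEY,
    residues TEXT NOT NULL,
    hit_id   INTEGER REFERENCES external_hit(id)
);
"""


def build_demo():
    """Create an in-memory database with a few placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO category VALUES (1, 'kinase', 'molecular')")
    conn.execute(
        "INSERT INTO external_hit VALUES (1, 'PLACEHOLDER-ACC-1', 'example protein kinase', 1)"
    )
    conn.execute("INSERT INTO sequence_record VALUES (1, 'ATGGCC', 1)")
    conn.execute("INSERT INTO sequence_record VALUES (2, 'TTGACA', NULL)")
    return conn


def categories_of(conn, sequence_id):
    """Follow sequence -> external hit -> category and return category names."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM sequence_record AS s
        JOIN external_hit AS h ON s.hit_id = h.id
        JOIN category AS c ON h.category_id = c.id
        WHERE s.id = ?
        """,
        (sequence_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo()
    print(categories_of(conn, 1))
    print(categories_of(conn, 2))
```

In this layout a sequence acquires its category indirectly through its external hit, which mirrors the chain of references in the claim; a sequence with no hit simply has no category.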
35. In an internal database, a method of using a computer system to automatically categorize biomolecular sequence records into protein function categories, the method comprising the steps of:
(a) receiving descriptive information about a biomolecular sequence in the internal database from a record in an external database, which record pertains to a gene having a sequence that at least partially matches that of the biomolecular sequence;
(b) determining whether the descriptive information contains one or more terms matching one or more keywords associated with a first protein function category, the keywords being terms consistent with a classification in the first protein function category;
(c) when at least one keyword is found to match a term in the descriptive information, determining whether the descriptive information contains a term matching one or more anti-keywords associated with the first protein function category, the anti-keywords being terms inconsistent with a classification in the first protein function category;
(d) grouping said biomolecular sequence in the first protein function category when the descriptive information contains a term matching a keyword but contains no term matching an anti-keyword.
Dependent claims: 36, 37, 38, 39, 40, 41
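Steps (b) through (d) of claim 35 amount to a keyword test followed by an anti-keyword veto. The following Python sketch shows one way that decision could look. It assumes whole-word, case-insensitive matching of single-word terms, which the claim does not specify; the function names and the example keyword sets are hypothetical.

```python
import re


def tokenize(description):
    """Split descriptive text into lowercase alphanumeric terms (an assumption)."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))


def belongs_in_category(description, keywords, anti_keywords):
    """Apply the keyword / anti-keyword test to one description.

    Step (b): require at least one keyword among the terms.
    Step (c): only then check for anti-keywords.
    Step (d): group the sequence if a keyword matched and no anti-keyword did.
    """
    terms = tokenize(description)
    if not terms & keywords:
        return False
    return not (terms & anti_keywords)


# Illustrative sets for a hypothetical "kinase" category.
KINASE_KEYWORDS = {"kinase"}
KINASE_ANTI_KEYWORDS = {"inhibitor"}

if __name__ == "__main__":
    for text in ("Protein kinase C", "Kinase inhibitor protein", "Ribosomal protein"):
        print(text, belongs_in_category(text, KINASE_KEYWORDS, KINASE_ANTI_KEYWORDS))
```

The anti-keyword check exists to reject descriptions that mention a category's keyword without belonging to it, as with the inhibitor example here, which names a kinase but is not one.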
42. A computer readable medium including program instructions for performing the following steps:
(a) receiving descriptive information about a biomolecular sequence in the internal database from a record in an external database, which record pertains to a gene having a sequence that at least partially matches that of the biomolecular sequence;
(b) determining whether the descriptive information contains one or more terms matching one or more keywords associated with a first protein function category, the keywords being terms consistent with a classification in the first protein function category;
(c) when at least one keyword is found to match a term in the descriptive information, determining whether the descriptive information contains a term matching one or more anti-keywords associated with the first protein function category, the anti-keywords being terms inconsistent with a classification in the first protein function category;
(d) grouping said biomolecular sequence in the first protein function category when the descriptive information contains a term matching a keyword but contains no term matching an anti-keyword.
Dependent claims: 43, 44, 45
Specification