Semantic gene organizer
First Claim
1. A semantic gene organization method comprising:
producing at least one gene document for a plurality of selected genes by compiling textual information for citations which are cross-referenced in a database for said selected genes;
processing said gene documents according to a latent semantic indexing (LSI) model to measure similarities between gene documents based upon similar word usage patterns; and
parsing said gene documents to produce a result set of semantically relevant gene relationships responsive to receiving a query vector of at least one term.
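The first step of the claimed method, compiling a gene document from cross-referenced citations, can be illustrated with a minimal sketch. The function name `build_gene_documents`, the two lookup tables, and all gene and citation identifiers below are hypothetical toy data, not taken from the patent; the sketch simply assumes that a gene document is the concatenated text of every citation linked to that gene.

```python
def build_gene_documents(gene_to_citations, citation_text):
    """Concatenate the text of every citation cross-referenced to each gene."""
    docs = {}
    for gene, citation_ids in gene_to_citations.items():
        # Skip citation IDs with no stored text instead of raising an error.
        parts = [citation_text[cid] for cid in citation_ids if cid in citation_text]
        docs[gene] = " ".join(parts)
    return docs


# Hypothetical cross-reference table: gene -> citation IDs (toy data).
gene_to_citations = {
    "GENE_A": ["c1", "c2"],
    "GENE_B": ["c2", "c3"],
}
# Hypothetical citation texts, standing in for titles and abstracts.
citation_text = {
    "c1": "kinase signaling in cell growth",
    "c2": "kinase inhibitor blocks tumor growth",
    "c3": "tumor suppressor regulates apoptosis",
}

gene_docs = build_gene_documents(gene_to_citations, citation_text)
```

A citation shared by two genes (here `c2`) contributes its text to both gene documents, which is one way overlapping word usage between genes can arise.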
Abstract
A semantic gene classification and annotation system, method and computer program can utilize Latent Semantic Indexing (LSI) to identify conceptually related genes based on textual information in biomedical literature, including MEDLINE citations. In addition, term weights calculated from the usage of the gene terms in and across gene documents can be used to automatically assign gene aliases and extend gene function annotation based upon primary biomedical literature.
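The abstract does not say how term weights are calculated. As a rough illustration of weighting a term by its usage "in and across" gene documents, the sketch below assumes a TF-IDF-style scheme (local term count times the log of inverse document frequency); this scheme, the function name `term_weights`, and the toy gene documents are all assumptions rather than the patent's method.

```python
import math
from collections import Counter


def term_weights(gene_docs):
    """Return {gene: {term: weight}} with an assumed TF-IDF-style weighting."""
    counts = {g: Counter(text.lower().split()) for g, text in gene_docs.items()}
    n_docs = len(counts)
    # Number of gene documents in which each term appears at least once.
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())
    return {
        g: {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in c.items()}
        for g, c in counts.items()
    }


# Hypothetical gene documents; "abc1" plays the role of a candidate alias.
toy_docs = {
    "GENE_A": "abc1 tumor suppressor abc1 apoptosis",
    "GENE_B": "insulin glucose metabolism insulin",
    "GENE_C": "tumor growth kinase",
}
weights = term_weights(toy_docs)
top_term_a = max(weights["GENE_A"], key=weights["GENE_A"].get)
```

Under this assumed weighting, a term used often within one gene document and rarely elsewhere scores highest, which is the kind of term that might be proposed as an alias or annotation keyword for that gene.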
16 Citations
20 Claims
1. A semantic gene organization method comprising:
producing at least one gene document for a plurality of selected genes by compiling textual information for citations which are cross-referenced in a database for said selected genes;
processing said gene documents according to a latent semantic indexing (LSI) model to measure similarities between gene documents based upon similar word usage patterns; and
parsing said gene documents to produce a result set of semantically relevant gene relationships responsive to receiving a query vector of at least one term.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A semantic gene organization data processing system comprising:
a term-by-gene matrix generator configured to generate a term-by-gene document matrix based upon terms identified within gene documents;
singular value decomposition (SVD) logic enabled to generate a plurality of factor matrices based upon said term-by-gene document matrix; and
a document-to-document similarity processor having a configuration to receive said factor matrices and to generate one of similarity and distance scores based upon a received query vector to produce results for said query vector.
Dependent claims: 10, 11, 12.
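The three components named in claim 9 (matrix generator, SVD logic, similarity processor) can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: it uses raw term counts, folds the query into the reduced space with the common convention q̂ = qᵀ U_k Σ_k⁻¹, scores genes by cosine similarity, and assumes the rank `k` does not exceed the rank of the matrix. The function names and the toy gene documents are hypothetical.

```python
import numpy as np


def term_by_gene_matrix(gene_docs):
    """Build a raw-count term-by-gene matrix (rows: terms, columns: genes)."""
    genes = sorted(gene_docs)
    vocab = sorted({t for text in gene_docs.values() for t in text.lower().split()})
    row = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(genes)))
    for j, g in enumerate(genes):
        for t in gene_docs[g].lower().split():
            A[row[t], j] += 1.0
    return A, row, genes


def rank_genes(gene_docs, query_terms, k=2):
    """Rank genes by cosine similarity to a query in a rank-k LSI space."""
    A, row, genes = term_by_gene_matrix(gene_docs)
    # Factor matrices from the SVD: A = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T
    # Query vector in term space; terms outside the vocabulary are ignored.
    q = np.zeros(A.shape[0])
    for t in query_terms:
        if t.lower() in row:
            q[row[t.lower()]] += 1.0
    # Fold the query into the reduced space (assumes s_k has no zeros).
    q_hat = (q @ U_k) / s_k
    # Cosine similarity between the folded query and each gene's coordinates.
    denom = np.linalg.norm(V_k, axis=1) * np.linalg.norm(q_hat)
    sims = np.divide(V_k @ q_hat, denom, out=np.zeros(len(genes)), where=denom > 0)
    return sorted(zip(genes, sims.tolist()), key=lambda p: p[1], reverse=True)


# Hypothetical gene documents (toy data).
lsi_docs = {
    "GENE_A": "kinase tumor growth tumor",
    "GENE_B": "tumor growth inhibitor",
    "GENE_C": "insulin glucose metabolism",
    "GENE_D": "glucose metabolism diabetes",
}
ranking = rank_genes(lsi_docs, ["kinase"], k=2)
scores = dict(ranking)
```

In this toy data `GENE_B` never mentions "kinase", yet it ranks alongside `GENE_A` because both share the "tumor growth" usage pattern; `GENE_C` and `GENE_D` score near zero. A distance score could be reported instead of a similarity, for example as one minus the cosine value.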
13. A computer program product comprising a computer usable medium having computer usable program code for semantic gene organization, said computer program product including:
computer usable program code for producing at least one gene document for a plurality of selected genes by compiling textual information for citations which are cross-referenced in a database for said selected genes;
computer usable program code for processing said gene documents according to a latent semantic indexing (LSI) model to measure similarities between gene documents based upon similar word usage patterns; and
computer usable program code for parsing said gene documents to produce a result set of semantically relevant gene relationships responsive to receiving a query vector of at least one term.
Dependent claims: 14, 15, 16, 17, 18, 19, 20.
Specification