System and method of prediction through the use of latent semantic indexing
First Claim
1. An optimizing process for optimizing a predictive modeling method implemented on a computer for predicting patient outcomes and conditions from medical database records of a population of patients, said predictive modeling method comprising the steps of:
(a) providing medical database records of a population of patients, each patient of said population of patients having corresponding outcome health score values;
(b) processing said medical database records by using Natural Language Processing;
(c) building a patient document corpus from said medical database records processed by using Natural Language Processing;
(d) weighting terms in said corpus by assigning a weight to each term in the corpus, said weight representing said term's frequency for a patient's document with respect to said term's frequency across all documents in said corpus;
(e) constructing a high-dimensional and sparse term-by-document matrix from said weighted terms, each said term in said corpus being represented as a vector across said population of patients;
(f) performing matrix factorization of said term-by-document matrix using Latent Semantic Indexing to reduce the dimensionality of said term-by-document matrix into a lower-dimensional matrix concept space;
(g) querying said lower-dimensional matrix concept space using a term or combination of terms to produce a single ranking of patients in said corpus using a similarity score;
(h) given a single threshold of said similarity score, combining multiple said single rankings to re-rank said population of patients in said corpus based on relatedness to multiple queries;
(i) optimizing said predictive modeling method through iterative variation of certain parameters to achieve a best precision fit, said certain parameters comprising;
(1) the number of patients used for each said query of said multiple queries;
(2) said threshold of said similarity score;
(3) a frequency of association to query of said patients of said corpus;
(4) a recall value of said patients returned by said query; and
(5) a precision value of said patients returned by said query; and
transmitting data associated with optimized re-ranked population of patients to one or more practitioners.
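Steps (d) through (g) of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the whitespace tokenizer, the `log(N / df)` weighting, the power iteration used here in place of a library SVD routine, and the function names `tfidf_matrix`, `top_left_singular_vectors`, and `query_scores` are all assumptions made for illustration, and the six toy patient documents are invented.

```python
import math
import random
from collections import Counter


def tfidf_matrix(docs):
    """Steps (d)-(e): weighted term-by-document matrix, one row per term."""
    tokenized = [doc.lower().split() for doc in docs]
    terms = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    counts = [Counter(toks) for toks in tokenized]
    # Assumed weighting: count in the document times log(N / document frequency).
    matrix = [[counts[j][t] * math.log(n_docs / doc_freq[t]) for j in range(n_docs)]
              for t in terms]
    return terms, matrix


def top_left_singular_vectors(A, k, iters=300, seed=0):
    """Step (f): top-k left singular vectors of A (power iteration on A^T A)."""
    m, n = len(A), len(A[0])
    gram = [[sum(A[i][p] * A[i][q] for i in range(m)) for q in range(n)]
            for p in range(n)]
    rng = random.Random(seed)
    vectors = []
    for _component in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        for _step in range(iters):
            w = [sum(gram[p][q] * v[q] for q in range(n)) for p in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract: rank of A is below k
                return vectors
            v = [x / norm for x in w]
        Av = [sum(A[i][q] * v[q] for q in range(n)) for i in range(m)]
        sigma = math.sqrt(sum(x * x for x in Av))
        vectors.append([x / sigma for x in Av])
        # Deflate so the next pass converges to the next singular direction.
        gram = [[gram[p][q] - sigma * sigma * v[p] * v[q] for q in range(n)]
                for p in range(n)]
    return vectors


def cosine(x, y):
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return 0.0 if nx == 0 or ny == 0 else sum(a * b for a, b in zip(x, y)) / (nx * ny)


def query_scores(docs, query_terms, k=2):
    """Step (g): cosine similarity of each patient document to a term query."""
    terms, A = tfidf_matrix(docs)
    U = top_left_singular_vectors(A, k)
    q = [1.0 if t in query_terms else 0.0 for t in terms]
    # Project the query and every document column into the k-dimensional concept space.
    q_low = [sum(u[i] * q[i] for i in range(len(terms))) for u in U]
    scores = []
    for j in range(len(docs)):
        d_low = [sum(u[i] * A[i][j] for i in range(len(terms))) for u in U]
        scores.append(cosine(q_low, d_low))
    return scores


# Invented toy corpus: one short "patient document" per patient.
patients = [
    "diabetes insulin glucose",
    "diabetes insulin",
    "diabetes glucose",
    "cardiac angina stent",
    "cardiac angina",
    "cardiac stent",
]
scores = query_scores(patients, {"insulin"}, k=2)
ranking = sorted(range(len(patients)), key=lambda j: -scores[j])
```

In this toy corpus the third document never mentions insulin, yet it scores close to 1 for the query because it shares a latent concept with the other diabetes documents, while the cardiac documents score close to 0.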
Abstract
A predictive modeling method implemented on a computer for predicting patient outcomes and conditions from medical database records of a population of patients, and an optimization process of iterative variation of parameters of the method to achieve a best precision fit. Individual patient documents are created by concatenation of unstructured text fields from the patient's medical record, and these are processed using Natural Language Processing. A patient document corpus is built, and terms in the corpus are weighted and mapped to standard vocabularies. A term-by-document matrix is built and its dimensionality is reduced by Latent Semantic Indexing. Patient and term queries are combined and scored, producing a ranked list. The parameters of the model are iteratively optimized for an input list of patients with corresponding health score values.
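The combination and optimization stages described above can be sketched as follows. This is one possible reading, not the method as specified: it assumes that a patient's "frequency of association" is the number of queries whose similarity score meets the threshold, that patients are re-ranked by that count, and that the optimization is a plain grid search over the threshold and a minimum count, keeping the setting with the best precision subject to a recall floor. The names `rerank`, `precision_recall`, `optimize`, and `min_recall`, and all score values, are hypothetical.

```python
from itertools import product


def rerank(score_lists, threshold):
    """Re-rank patients by how many queries score them at or above the threshold."""
    n = len(score_lists[0])
    freq = [sum(1 for scores in score_lists if scores[j] >= threshold)
            for j in range(n)]
    order = sorted(range(n), key=lambda j: (-freq[j], j))
    return order, freq


def precision_recall(predicted, positives):
    hits = len(set(predicted) & set(positives))
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(positives) if positives else 0.0
    return precision, recall


def optimize(score_lists, positives, thresholds, min_freqs, min_recall=0.5):
    """Grid search: best (precision, recall, threshold, min_freq) above a recall floor."""
    best = None
    for threshold, min_freq in product(thresholds, min_freqs):
        _, freq = rerank(score_lists, threshold)
        predicted = [j for j, f in enumerate(freq) if f >= min_freq]
        precision, recall = precision_recall(predicted, positives)
        if recall < min_recall:
            continue
        candidate = (precision, recall, threshold, min_freq)
        if best is None or candidate[:2] > best[:2]:
            best = candidate
    return best


# Invented similarity scores for five patients under two queries.
score_lists = [
    [0.90, 0.80, 0.40, 0.10, 0.70],
    [0.85, 0.30, 0.60, 0.20, 0.75],
]
positives = {0, 4}  # patients with the outcome of interest (hypothetical labels)

order, freq = rerank(score_lists, threshold=0.5)
best = optimize(score_lists, positives, thresholds=[0.5, 0.7], min_freqs=[1, 2])
```

With these invented numbers, requiring a patient to match both queries removes the two false positives that a single-query match lets through, which is the kind of trade-off the iterative optimization is meant to explore.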
3 Claims
Claims 2 and 3 depend on claim 1.
Specification