System and method for data sensitive filtering of patient demographic record queries
First Claim
1. A computer implemented method for data sensitive filtering in patient database searches, said method comprising the steps of:
- providing a search criteria for searching a database, the search criteria comprising characters entered into multiple fields of an available plurality of search locator fields;
determining, with a processor, a retrieval formula based on said search criteria that maximizes error tolerance prior to the execution of the retrieval formula, wherein the error tolerance is maximized to achieve a candidate record range bounded by a maximum and a minimum number of records to be returned, wherein the candidate record range is achieved by establishing a probability of a subset of the characters entered into the multiple fields using a comparison of the subset of the characters to a predetermined collection of high probability character strings of said database, and wherein the error tolerance is maximized by determining a candidate record filtering condition that allows the maximum number of candidate records within the candidate record range to be retrieved within a response time requirement;
retrieving, by the processor and based on said retrieval formula, candidate records from said database, wherein the determination of the retrieval formula occurs prior to the retrieving;
scoring, by the processor, each said candidate record by comparing a search criteria locator field with a corresponding retrieved record field, wherein comparing comprises performing a field by field comparison of said locator field and said candidate record field pair to fill in components of a comparison result vector cj for a field pair j and using a field comparison method predefined for each field pair;
scoring said comparison result cj based on one or more probabilities using a formula
$\mathrm{score}(c_j) = \log P_{1j}(c_j) - \log P_{0j}(c_j) = \log\frac{P_{1j}(c_j)}{P_{0j}(c_j)}$,
wherein $P_{0j}(c_j)$ and $P_{1j}(c_j)$ are probabilities that are functions of the number of matching characters in said pair of fields;
summing $\mathrm{score}(c_j)$ over all fields j where both the locator field and the corresponding field in said candidate record are not blank to calculate a first score; and
determining, by the processor, whether said score of said candidate record exceeds a predefined threshold, and if said candidate score does exceed said threshold, adding said candidate record to a list of records to be returned in response to said search criteria.
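The scoring and thresholding steps of the claim can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the probability tables `P1` and `P0` contain invented values, `P1` is read as the probability under a "same patient" hypothesis and `P0` under a "different patient" hypothesis, and a simple position-by-position character count stands in for the per-field comparison method.

```python
import math

# Invented probability tables (NOT from the patent). Index k is the number of
# matching characters, capped at the last entry of each table.
P1 = {  # assumed: probability of k matches when both records are the same patient
    "last_name": [0.01, 0.02, 0.04, 0.08, 0.25, 0.60],
    "first_name": [0.02, 0.03, 0.10, 0.25, 0.60],
}
P0 = {  # assumed: probability of k matches when the records are different patients
    "last_name": [0.55, 0.25, 0.12, 0.05, 0.02, 0.01],
    "first_name": [0.50, 0.30, 0.14, 0.05, 0.01],
}

def matching_chars(a, b):
    """Count positions where both strings hold the same character (case-insensitive)."""
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)

def field_score(field, c):
    """score(c_j) = log(P1j(c_j) / P0j(c_j)) for one field pair."""
    k = min(c, len(P1[field]) - 1)
    return math.log(P1[field][k] / P0[field][k])

def record_score(locator, candidate):
    """Sum field scores over fields that are non-blank on both sides."""
    total = 0.0
    for field, query in locator.items():
        value = candidate.get(field, "")
        if not query or not value or field not in P1:
            continue  # blank or unscored fields contribute nothing
        total += field_score(field, matching_chars(query, value))
    return total

def filter_candidates(locator, candidates, threshold):
    """Keep only candidates whose summed score exceeds the threshold."""
    return [r for r in candidates if record_score(locator, r) > threshold]
```

With these invented tables, a locator of `{"last_name": "Smith", "first_name": ""}` scores only the last name, because the blank first name is skipped.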
Abstract
A method for data sensitive filtering in patient database searches includes providing a search criteria comprising one or more search locator fields, determining a retrieval formula from the search criteria that maximizes error tolerance in the search criteria while satisfying a predefined response time requirement, and retrieving candidate records from the database. If no retrieval formula can be found that satisfies the response time requirement, the method includes requesting additional search criteria. The method further includes scoring each candidate record by comparing a search criteria locator field with a corresponding retrieved record field, and determining whether the score of the candidate record exceeds a predefined threshold. If the candidate score does exceed the threshold, the candidate record is added to a list of records to be returned in response to the search criteria.
54 Citations
21 Claims
7. A computer implemented method for data sensitive filtering in patient database searches, said method comprising the steps of:
- providing a search criteria for retrieving one or more records from a database comprising one or more search locator fields;
determining, with a processor, a maximum number of candidate records to be retrieved from said database;
determining, with the processor, a number of candidate records to be retrieved based on a number of characters in said search criteria being correct;
determining, with the processor, a retrieval formula from said search criteria that maximizes error tolerance in said search criteria while satisfying said maximum number of candidate records, wherein the retrieval formula comprises selecting a sub-string of said search field, searching a dictionary of high frequency strings of said database for said sub-string, and performing a database query when said search sub-string is not found in said dictionary;
retrieving, with the processor and based on the retrieval formula, said candidate records from said database;
scoring, by the processor, each said candidate record by comparing a search criteria locator field with a corresponding retrieved record field, wherein comparing comprises performing a field by field comparison of said locator field and said candidate record field pair to fill in components of a comparison result vector cj for a field pair j and using a field comparison method predefined for each field pair;
scoring said comparison result cj based on one or more probabilities using a formula
$\mathrm{score}(c_j) = \log P_{1j}(c_j) - \log P_{0j}(c_j) = \log\frac{P_{1j}(c_j)}{P_{0j}(c_j)}$,
wherein $P_{0j}(c_j)$ and $P_{1j}(c_j)$ are probabilities that are functions of the number of matching characters in said pair of fields;
summing $\mathrm{score}(c_j)$ over all fields j where both the locator field and the corresponding field in said candidate record are not blank to calculate a first score; and
determining, by the processor, whether said score of said candidate record exceeds a predefined threshold, and if said candidate score does exceed said threshold, adding said candidate record to a list of records to be returned in response to said search criteria.
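The sub-string and dictionary step of claim 7 might look roughly like the following Python sketch. It is one reading among several: it assumes the sub-string is a prefix, that shorter prefixes are tried first because they tolerate more typing errors, and that the dictionary is a plain set. The names `HIGH_FREQUENCY`, `ALLOWED_FIELDS`, `choose_prefix`, `build_query`, and the `patients` table are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical dictionary of prefixes known to match too many records.
HIGH_FREQUENCY = {"S", "SM", "SMI", "J", "JO"}

# Hypothetical whitelist so the column name is never taken from raw user input.
ALLOWED_FIELDS = {"last_name", "first_name"}

def choose_prefix(value: str, min_len: int = 1) -> Optional[str]:
    """Return the shortest prefix that is not a high frequency string, else None."""
    value = value.strip().upper()
    for n in range(min_len, len(value) + 1):
        prefix = value[:n]
        if prefix not in HIGH_FREQUENCY:
            return prefix  # selective enough to query the database
    return None  # every prefix is too common

def build_query(field: str, value: str) -> Optional[Tuple[str, Tuple[str, ...]]]:
    """Build a parameterized LIKE query, or None if more criteria are needed."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field}")
    prefix = choose_prefix(value)
    if prefix is None:
        return None  # the caller would ask the user for additional search criteria
    return (f"SELECT * FROM patients WHERE {field} LIKE ?", (prefix + "%",))
```

For example, `build_query("last_name", "Smith")` skips the common prefixes `S`, `SM`, and `SMI` and queries on `SMIT%`, while `build_query("first_name", "Jo")` returns `None`.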
-
-
16. A non-transitory program storage device readable by a computer, tangibly embodying a program of instructions executable by the computer to perform method steps for data sensitive filtering in patient database searches, said method comprising the steps of:
- the computer; providing a search criteria comprising at least one string of characters entered into a plurality of search locator fields;
determining a retrieval formula from said search criteria that maximizes error tolerance in said search criteria while satisfying a candidate record range prior to the execution of the retrieval formula, wherein the error tolerance is maximized to achieve the candidate record range, and the candidate record range is achieved by establishing a probability of a subset of the characters entered into the multiple fields using a comparison of the subset of the characters to a predetermined collection of high probability character strings of said database, and wherein the error tolerance is maximized by determining a candidate record filtering condition that allows the maximum number of candidate records within the candidate record range to be retrieved within a response time requirement;
retrieving, based on the retrieval formula, candidate records from said database;
scoring each said candidate record by comparing a search criteria locator field with a corresponding retrieved record field, wherein comparing comprises performing a field by field comparison of said locator field and said candidate record field pair to fill in components of a comparison result vector cj for a field pair j and using a field comparison method predefined for each field pair;
scoring said comparison result cj based on one or more probabilities using a formula
$\mathrm{score}(c_j) = \log P_{1j}(c_j) - \log P_{0j}(c_j) = \log\frac{P_{1j}(c_j)}{P_{0j}(c_j)}$,
wherein $P_{0j}(c_j)$ and $P_{1j}(c_j)$ are probabilities that are functions of the number of matching characters in said pair of fields; and
summing $\mathrm{score}(c_j)$ over all fields j where both the locator field and the corresponding field in said candidate record are not blank to calculate a first score; and
determining whether said score of said candidate record exceeds a predefined threshold, and if said candidate score does exceed said threshold, adding said candidate record to a list of records to be returned in response to said search criteria.
wherein $\mathrm{maxScore} = \sum_j \mathrm{maxScore}_j$, $\mathrm{minScore} = \sum_j \mathrm{minScore}_j$, wherein the locator field that is used to generate the jth component of the comparison vector is not blank in the search criteria, and wherein $\mathrm{maxScore}_j = \max(\mathrm{score}(c_j))$ and $\mathrm{minScore}_j = \min(\mathrm{score}(c_j))$ over all possible values of $c_j$.
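The score bounds in this fragment can be computed as in the short Python sketch below. The probability tables are invented for illustration, each table index stands for one possible value of the comparison result, and the fragment does not say how the bounds are used, so the sketch only computes them.

```python
import math

# Invented probability tables (NOT from the patent); index = possible value of c_j.
P1 = {"last_name": [0.1, 0.9], "first_name": [0.2, 0.8]}
P0 = {"last_name": [0.8, 0.2], "first_name": [0.9, 0.1]}

def score_bounds(locator):
    """Return (minScore, maxScore), summed over non-blank locator fields."""
    min_total = 0.0
    max_total = 0.0
    for field, query in locator.items():
        if not query or field not in P1:
            continue  # blank locator fields are excluded from both sums
        # score(c_j) for every possible comparison result of this field
        scores = [math.log(p1 / p0) for p1, p0 in zip(P1[field], P0[field])]
        min_total += min(scores)  # minScore_j
        max_total += max(scores)  # maxScore_j
    return min_total, max_total
```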
-
-
20. The computer readable program storage device of claim 17, wherein said field comparison method for a field is one of an exact distance match, a Hamming distance, an edit distance, an edit distance with swap, a first name distance, and a last name distance.
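Three of the comparison methods named in claim 20 have well-known textbook forms, sketched below in Python. Two assumptions apply: "edit distance with swap" is taken to mean that exchanging two adjacent characters counts as a single edit, and Hamming distance is extended to unequal lengths by counting the length difference as mismatches. The first name and last name distances are not defined in this section, so they are left out.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions; extra trailing characters count as mismatches."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

def edit_distance(a: str, b: str, allow_swap: bool = False) -> int:
    """Levenshtein distance; with allow_swap, an adjacent transposition costs 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (allow_swap and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]
```

The swap variant matters for typing errors: `SMTIH` is two edits from `SMITH` under plain edit distance but one edit when swaps are allowed.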
-
21. The computer readable program storage device of claim 16, the method further comprising, for each candidate record in said list of records to be returned:
-
retrieving a most recent complete record from said database for said search locator field; for each field that has a non-empty value in said candidate record, replacing the field value in the retrieved complete record with the corresponding value in the candidate record; and adding the altered complete record to said database.
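The update step of claim 21 amounts to overlaying the non-empty candidate values on the latest complete record and storing the result as a new record. A minimal Python sketch follows, assuming records are dictionaries and the database is an in-memory list; the function name `overlay_and_store` is hypothetical.

```python
def overlay_and_store(database, complete, candidate):
    """Copy the complete record, overwrite fields the candidate fills in, store it."""
    merged = dict(complete)  # leave the retrieved complete record untouched
    for field, value in candidate.items():
        if value not in (None, ""):
            merged[field] = value  # a non-empty candidate value wins
    database.append(merged)  # add the altered complete record to the database
    return merged
```

Empty candidate fields leave the complete record's values in place, so a partial candidate never erases existing data.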
-
Specification