Classification confidence estimating tool
Abstract
Disclosed is a tool that assesses and classifies data in a data set and compares those assessments with nominal attributes and text attributes of data present in a new record to assign a classification to the new record. The classification assigned to the new record is given a confidence level based on both a qualitative factor and a quantitative factor. The qualitative factor may be calculated by forming a list of important words for each class, comparing the list to data in the new record, and converting the comparison into a confidence level. The quantitative factor may be calculated by estimating the importance or weight of several factors, finding ratios of certain probabilities related to the most likely class and to the second most likely class, using the importance of the factors and matchfactors to scale the resulting ratio, and then transforming the resulting ratio into a confidence level.
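The qualitative factor described in the abstract (a per-class list of important words compared against the new record) might be sketched as follows. This is only an illustration under stated assumptions: the function name `qualitative_confidence`, the whitespace tokenization, the overlap fraction, and the threshold values are all hypothetical and are not taken from the disclosure.

```python
def qualitative_confidence(important_words, record_text, thresholds=(0.25, 0.6)):
    """Map the overlap between a class's important words and a record to a level.

    Assumption: the comparison is the fraction of important words that appear
    in the record, and the cut-offs in `thresholds` are arbitrary examples.
    """
    tokens = set(record_text.lower().split())
    important = {w.lower() for w in important_words}
    if not important:
        # With no important words there is nothing to match against.
        return "low", 0.0
    overlap = len(important & tokens) / len(important)
    low_cut, high_cut = thresholds
    if overlap >= high_cut:
        level = "high"
    elif overlap >= low_cut:
        level = "medium"
    else:
        level = "low"
    return level, overlap


level, overlap = qualitative_confidence(
    ["pump", "valve", "seal", "gasket"], "Replacement valve and seal kit"
)
print(level, overlap)
```

Here two of the four important words occur in the record, so the overlap is 0.5 and falls into the middle band of the example thresholds.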
13 Claims
1. A non-transitory computer readable medium for estimating a confidence level of a classification assignment, the non-transitory computer readable medium having instructions configured to cause a processor of a computer to:
create a data model having a plurality of classes of records, the data model being based on a set of pre-classified records;
assign one of said classes to a new record;
obtain a qualitative confidence level for the assignment of said class;
obtain a quantitative confidence level for the assignment of said class; and
combine the qualitative confidence level and the quantitative confidence level, wherein the quantitative confidence level is obtained by finding a ratio of the probability of said class to a second most likely class, finding a ratio of a complement probability of said second most likely class to a complement probability of said class, and using matchfactors to generate a scaled ratio.
Dependent claims: 2, 3, 4, 5, 6
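The two ratios recited in claim 1 can be illustrated with a short sketch. The ratio definitions follow the claim language: probability of the assigned class over that of the second most likely class, and complement probability of the second most likely class over the complement probability of the assigned class. How the matchfactors enter the scaling is not stated in this text, so multiplying the ratios by the product of the matchfactors is purely an assumption, as are the function names and the small `eps` guard against division by zero.

```python
def quantitative_ratios(probs):
    """Return (assigned class, probability ratio, complement ratio).

    probs: dict mapping class label -> probability for one new record.
    """
    if len(probs) < 2:
        raise ValueError("need at least two classes")
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (best, p1), (_, p2) = ranked[0], ranked[1]
    eps = 1e-12  # hypothetical guard, not part of the claim
    ratio = p1 / max(p2, eps)
    complement_ratio = (1.0 - p2) / max(1.0 - p1, eps)
    return best, ratio, complement_ratio


def scaled_ratio(ratio, complement_ratio, matchfactors):
    """Assumed scaling: product of both ratios and all matchfactors."""
    scale = 1.0
    for m in matchfactors:
        scale *= m
    return ratio * complement_ratio * scale


best, r, cr = quantitative_ratios({"A": 0.6, "B": 0.3, "C": 0.1})
print(best, r, cr, scaled_ratio(r, cr, [0.5]))
```

With probabilities 0.6 and 0.3 for the top two classes, the probability ratio is 0.6/0.3 and the complement ratio is 0.7/0.4; both grow as the assigned class pulls further ahead of the runner-up.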
7. A non-transitory computer readable medium for estimating a confidence level of a classification assignment, the non-transitory computer readable medium comprising a set of instructions stored on a computer readable medium which, when read by a processor, cause a computer to:
create a data model having a plurality of classes of records, the data model being based on a set of pre-classified records,
assign one of said classes to a new record,
obtain a qualitative confidence level for the assignment of said class,
obtain a quantitative confidence level for the assignment of said class, and
combine the qualitative confidence level and the quantitative confidence, wherein the non-transitory computer readable medium is configured to estimate a weight for each of a plurality of factors affecting said quantitative confidence level, find a ratio of the probability of said class to the probability of second most likely class, find a ratio of a complement probability of said second most likely class to a complement probability of said class, and use matchfactors to generate a scaled ratio.
Dependent claims: 8, 9, 10
11. A non-transitory computer readable medium for estimating the confidence level of a classification assignment comprising, the non-transitory computer readable medium including instructions configured to cause a processor of a computer to:
a) create a data model from a set of pre-classified records, the data model having a plurality of classes of records;
b) assign one of said classes to a new record;
c) obtain a qualitative confidence level for the assignment of said class comprising finding a list of words belonging to the class;
filtering said list to eliminate any noise words, and for each word on said list, finding a Mutual Information Score for each possible combination of one of the list of words and one of said plurality of classes using the following;
12. A non-transitory computer readable medium for estimating the confidence level of a classification assignment comprising, the non-transitory computer readable medium being configured to cause a processor of a computer to:
a) create a data model from a set of pre-classified records, the data model having a plurality of classes of records;
b) assign one of said classes to a new record;
c) obtain a qualitative confidence level for the assignment of said class comprising finding a list of words belonging to the class;
d) filter said list to eliminate any noise words, and for each word on said list, find a Mutual Information Score for each possible combination of one of the list of words and one of said plurality of classes;
e) obtain a quantitative confidence level for the assignment of said class using the following;
determining a ratio of probabilities of a most likely class to a second most likely class;
13. A non-transitory computer readable medium for estimating the confidence level of a classification assignment comprising, the non-transitory computer readable medium being configured to cause a processor of a computer to:
a) create a data model from a set of pre-classified records, the data model having a plurality of classes of records;
b) assign one of said classes to a new record;
c) obtain a qualitative confidence level for the assignment of said class comprising finding a list of words belonging to the class;
d) filter said list to eliminate any noise words, and for each word on said list, find a Mutual Information Score for each possible combination of one of the list of words and one of said plurality of classes using the following;
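Claims 11 to 13 recite a Mutual Information Score for each word and class combination, but the formula itself is not reproduced in this text. As a rough illustration only, the sketch below uses the textbook pointwise mutual information, log2 of P(word, class) / (P(word) P(class)), estimated from counts over pre-classified records. It may differ from the claimed formula; the function name `pointwise_mi`, the record layout, and the whitespace tokenization are hypothetical.

```python
import math
from collections import Counter


def pointwise_mi(records):
    """Score (word, class) pairs that co-occur in at least one record.

    records: list of (class_label, text) pairs, standing in for the
    pre-classified records. Each word is counted once per record.
    """
    n = len(records)
    class_counts = Counter()
    word_counts = Counter()
    joint_counts = Counter()
    for label, text in records:
        class_counts[label] += 1
        for w in set(text.lower().split()):
            word_counts[w] += 1
            joint_counts[(w, label)] += 1
    scores = {}
    for (w, label), joint in joint_counts.items():
        # log2( P(w, c) / (P(w) * P(c)) ) with probabilities as count / n
        scores[(w, label)] = math.log2(
            joint * n / (word_counts[w] * class_counts[label])
        )
    return scores


records = [
    ("tools", "steel hammer"),
    ("tools", "steel wrench"),
    ("food", "apple juice"),
    ("food", "steel cut oats"),
]
scores = pointwise_mi(records)
print(scores[("hammer", "tools")], scores[("steel", "tools")])
```

Pairs that never co-occur are omitted rather than scored, since their pointwise value would be negative infinity. A word such as "hammer" that appears only in one class scores higher for that class than a word such as "steel" that is spread across classes.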
Specification