System and method for multiple instance learning for computer aided detection
First Claim
1. A computer-implemented method of training a classifier for computer aided detection of digitized medical images, comprising the steps of:
- providing a plurality of bags, each bag containing a plurality of feature samples of a single region-of-interest in said medical image, wherein said feature samples include texture, shape, intensity, and contrast of said region-of-interest, wherein each region-of-interest has been labeled as either malignant or healthy; and
training said classifier on said plurality of bags of feature samples, subject to the constraint that at least one point in a convex hull of each bag, corresponding to said feature sample, is correctly classified according to the label of the associated region-of-interest, wherein said classifier is trained on a computer, and wherein said classifier is trained by minimizing the expression
$$\nu E(\xi) + \Phi(\omega, \eta) + \Psi(\lambda)$$
over arguments $(\xi, \omega, \eta, \lambda) \in \mathbb{R}^{r+n+1+\gamma}$ subject to the conditions
$$\xi_i = d_i - (\lambda_i^j B_i^j \omega - e\eta), \qquad \xi \in \Omega, \qquad e'\lambda_i^j = 1, \qquad 0 \le \lambda_i^j,$$
wherein $\xi = \{\xi_1, \ldots, \xi_r\}$ are slack terms, $E: \mathbb{R}^r \to \mathbb{R}$ represents a loss function, $\omega$ is a hyperplane coefficient, $\eta$ is the bias term, $\lambda$ is a vector containing the coefficients of the convex combination that defines the representative point of bag $i$ in class $j$ wherein $0 \le \lambda_i^j$, $e'\lambda_i^j = 1$, $\gamma$ is the total number of convex hull coefficients corresponding to the representative points in class $j$, $\Phi: \mathbb{R}^{(n+1)} \to \mathbb{R}$ is a regularization function on the hyperplane coefficients, $\Psi$ is a regularization function on the convex combination coefficients $\lambda_i^j$, $\Omega$ represents a feasible set for $\xi$, matrix $B_i^j \in \mathbb{R}^{m_i^j \times n}$, $i = 1, \ldots, r_j$, $j \in \{\pm 1\}$ is the $i$th bag of class label $j$, $r$ is the total number of representative points, $n$ is the number of features, $m_i^j$ is the number of rows in $B$, vector $d \in \{\pm 1\}^{r_j}$ represents binary bag-labels for the malignant and healthy sets, respectively, and the vector $e$ represents a vector with all its elements equal to one.
Abstract
A method of training a classifier for computer aided detection of digitized medical images includes providing a plurality of bags, each bag containing a plurality of feature samples of a single region-of-interest in a medical image, where each region-of-interest has been labeled as either malignant or healthy. The training uses groups of spatially adjacent candidates, each group modeled as a “bag”, rather than treating each candidate by itself. A classifier is trained on the plurality of bags of feature samples, subject to the constraint that at least one point in a convex hull of each bag, corresponding to a feature sample, is correctly classified according to the label of the associated region-of-interest. This replaces a large set of discrete constraints requiring that at least one instance in each bag be correctly classified.
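The convex-hull constraint described above can be illustrated with a small sketch. It is not the patented training procedure; it only shows how a representative point of a bag is formed as a convex combination of the bag's feature samples and then checked against a linear classifier. The function names (`representative_point`, `decision_value`, `bag_correct`) and the sign convention "score > 0 means positive class" are assumptions made for this illustration.

```python
from typing import List, Sequence


def representative_point(bag: Sequence[Sequence[float]],
                         lam: Sequence[float]) -> List[float]:
    """Convex combination lam' * B of the rows (feature samples) of one bag."""
    if len(bag) != len(lam):
        raise ValueError("need one coefficient per feature sample in the bag")
    # Convex-hull conditions from the claims: 0 <= lam and e' lam = 1.
    if any(c < 0 for c in lam) or abs(sum(lam) - 1.0) > 1e-9:
        raise ValueError("coefficients must be nonnegative and sum to 1")
    n_features = len(bag[0])
    return [sum(c * row[k] for c, row in zip(lam, bag))
            for k in range(n_features)]


def decision_value(point: Sequence[float], w: Sequence[float],
                   eta: float) -> float:
    """Signed score w'x - eta of a point under a linear classifier."""
    return sum(wk * xk for wk, xk in zip(w, point)) - eta


def bag_correct(bag, lam, w, eta, label: int) -> bool:
    """True if the bag's representative point lies on the side given by label.

    Assumed convention for this sketch: label is +1 or -1, and a positive
    score means the positive class.
    """
    score = decision_value(representative_point(bag, lam), w, eta)
    return label * score > 0


if __name__ == "__main__":
    bag = [[0.0, 0.0], [4.0, 0.0]]   # two feature samples, two features
    w, eta = [1.0, 0.0], 1.0         # hypothetical hyperplane and bias
    print(bag_correct(bag, [0.5, 0.5], w, eta, +1))  # midpoint of the bag
    print(bag_correct(bag, [1.0, 0.0], w, eta, +1))  # first sample only
```

The point of the relaxation is visible here: the bag counts as correctly classified as soon as some point in its convex hull is on the right side, so the choice of coefficients is part of the optimization rather than a discrete selection of one instance.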
62 Citations
19 Claims
-
1. A computer-implemented method of training a classifier for computer aided detection of digitized medical images, comprising the steps of:
-
providing a plurality of bags, each bag containing a plurality of feature samples of a single region-of-interest in said medical image, wherein said feature samples include texture, shape, intensity, and contrast of said region-of-interest, wherein each region-of-interest has been labeled as either malignant or healthy; and training said classifier on said plurality of bags of feature samples, subject to the constraint that at least one point in a convex hull of each bag, corresponding to said feature sample, is correctly classified according to the label of the associated region-of-interest, wherein said classifier is trained on a computer, and wherein said classifier is trained by minimizing the expression
$$\nu E(\xi) + \Phi(\omega, \eta) + \Psi(\lambda)$$
over arguments $(\xi, \omega, \eta, \lambda) \in \mathbb{R}^{r+n+1+\gamma}$ subject to the conditions
$$\xi_i = d_i - (\lambda_i^j B_i^j \omega - e\eta), \qquad \xi \in \Omega, \qquad e'\lambda_i^j = 1, \qquad 0 \le \lambda_i^j,$$
wherein $\xi = \{\xi_1, \ldots, \xi_r\}$ are slack terms, $E: \mathbb{R}^r \to \mathbb{R}$ represents a loss function, $\omega$ is a hyperplane coefficient, $\eta$ is the bias term, $\lambda$ is a vector containing the coefficients of the convex combination that defines the representative point of bag $i$ in class $j$ wherein $0 \le \lambda_i^j$, $e'\lambda_i^j = 1$, $\gamma$ is the total number of convex hull coefficients corresponding to the representative points in class $j$, $\Phi: \mathbb{R}^{(n+1)} \to \mathbb{R}$ is a regularization function on the hyperplane coefficients, $\Psi$ is a regularization function on the convex combination coefficients $\lambda_i^j$, $\Omega$ represents a feasible set for $\xi$, matrix $B_i^j \in \mathbb{R}^{m_i^j \times n}$, $i = 1, \ldots, r_j$, $j \in \{\pm 1\}$ is the $i$th bag of class label $j$, $r$ is the total number of representative points, $n$ is the number of features, $m_i^j$ is the number of rows in $B$, vector $d \in \{\pm 1\}^{r_j}$ represents binary bag-labels for the malignant and healthy sets, respectively, and the vector $e$ represents a vector with all its elements equal to one.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
wherein
-
-
6. The method of claim 5, further comprising:
initializing
-
7. The method of claim 6, further comprising setting convex-hull coefficients of negative bags to be 1.
-
8. The method of claim 6, further comprising transforming said feature samples into a higher dimensional space using a kernel transformation $K(X^{+}, X)$ for the positive class and $K(X^{-}, X)$ for the negative class, wherein $X^{+}$, $X^{-}$, and $X$ are data matrices for positive, negative and all samples respectively, wherein each row is a sample vector in these matrices, wherein if the size of $X$ is too large, subsampling a random subset from said original feature samples.
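As a hedged illustration of the kernel transformation with subsampling recited in claim 8, the sketch below builds kernel matrices between the class-specific samples and a (possibly subsampled) set of all samples. The claim does not name a kernel, so the Gaussian (RBF) kernel, the `gamma` width, the `max_rows` threshold, and all function names are assumptions chosen only for this example.

```python
import math
import random
from typing import List, Sequence

Matrix = List[List[float]]


def rbf(u: Sequence[float], v: Sequence[float], gamma: float = 1.0) -> float:
    """Gaussian kernel value exp(-gamma * ||u - v||^2) (assumed kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def kernel_matrix(A: Matrix, X: Matrix, gamma: float = 1.0) -> Matrix:
    """K(A, X): one row per sample in A, one column per sample in X."""
    return [[rbf(a, x, gamma) for x in X] for a in A]


def subsample(X: Matrix, max_rows: int, seed: int = 0) -> Matrix:
    """Return X unchanged if small enough, else a random subset of its rows."""
    if len(X) <= max_rows:
        return list(X)
    return random.Random(seed).sample(list(X), max_rows)


if __name__ == "__main__":
    X_pos = [[0.0, 0.0], [1.0, 0.0]]          # positive samples (rows)
    X_neg = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]]  # negative samples (rows)
    X_all = subsample(X_pos + X_neg, max_rows=4)  # hypothetical size limit
    K_pos = kernel_matrix(X_pos, X_all)       # K(X+, X)
    K_neg = kernel_matrix(X_neg, X_all)       # K(X-, X)
    print(len(K_pos), len(K_pos[0]), len(K_neg), len(K_neg[0]))
```

Each row of the resulting matrices plays the role of a transformed feature sample, and the number of columns is bounded by the subsample size rather than by the full data set.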
-
9. The method of claim 5, wherein $\Phi(\omega) = \varepsilon \|\omega\|_2^2$ and $\Psi(\lambda) = \varepsilon \|\lambda\|_2^2$, wherein $\varepsilon$ is a positive regularization parameter.
-
10. A method of training a classifier for computer aided detection of digitized medical images, comprising the steps of:
-
providing a plurality of bags, each bag containing a plurality of feature samples of a single region-of-interest in said medical image, wherein each region-of-interest has been labeled as either malignant or healthy, wherein each bag is represented by a matrix $B_i^j \in \mathbb{R}^{m_i^j \times n}$, $i = 1, \ldots, r_j$, $j \in \{\pm 1\}$ is the $i$th bag of class label $j$, $r$ is the total number of representative points, $n$ is the number of features, $m_i^j$ is the number of rows in $B$; and training said classifier by minimizing the expression
$$\|\xi\|_2^2 + \Phi(\omega, \eta) + \Psi(\lambda)$$
over arguments $(\xi, \omega, \eta, \lambda) \in \mathbb{R}^{r+n+1+\gamma}$ subject to the conditions
$$\xi_i = d_i - (\lambda_i^j B_i^j \omega - e\eta), \qquad e'\xi^j = 0, \qquad e'\lambda_i^j = 1, \qquad 0 \le \lambda_i^j,$$
wherein $\xi = \{\xi_1, \ldots, \xi_r\}$ are slack terms, $\omega$ is a hyperplane coefficient, $\eta$ is the bias (offset from the origin) term, $\lambda$ is a vector containing the coefficients of the convex combination that defines the representative point of bag $i$ in class $j$ wherein $0 \le \lambda_i^j$, $e'\lambda_i^j = 1$, $\gamma$ is the total number of convex hull coefficients corresponding to the representative points in class $j$, $\Phi: \mathbb{R}^{(n+1)} \to \mathbb{R}$ is a regularization function on the hyperplane coefficients, $\Psi$ is a regularization function on the convex combination coefficients $\lambda_i^j$, matrix $B_i^j \in \mathbb{R}^{m_i^j \times n}$, $i = 1, \ldots, r_j$, $j \in \{\pm 1\}$ is the $i$th bag of class label $j$, $r$ is the total number of representative points, $n$ is the number of features, $m_i^j$ is the number of rows in $B$, vector $d \in \{\pm 1\}^{r_j}$ represents binary bag-labels for the malignant and healthy sets, respectively, and the vector $e$ represents a vector with all its elements equal to one.
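To make the objective of claim 10 concrete, the following sketch evaluates it for fixed values of the variables; it does not perform the minimization. It assumes the quadratic regularizers of the form given in claim 9 (a squared 2-norm scaled by a positive parameter, here `eps`) applied to the hyperplane coefficients only, treats the bias as a scalar, and uses hypothetical function names throughout.

```python
from typing import Dict, List, Sequence, Tuple

Bag = Sequence[Sequence[float]]


def slack(d_i: int, lam: Sequence[float], bag: Bag,
          w: Sequence[float], eta: float) -> float:
    """xi_i = d_i - (lam' B w - eta) for one bag with label d_i."""
    rep = [sum(c * row[k] for c, row in zip(lam, bag)) for k in range(len(w))]
    return d_i - (sum(wk * xk for wk, xk in zip(w, rep)) - eta)


def objective(bags: Sequence[Bag], labels: Sequence[int],
              lams: Sequence[Sequence[float]], w: Sequence[float],
              eta: float, eps: float = 0.5) -> Tuple[float, List[float]]:
    """Return (||xi||^2 + eps*||w||^2 + eps*||lam||^2, xi).

    The two regularizers are an assumed choice following the form in claim 9.
    """
    xi = [slack(d, lam, bag, w, eta)
          for d, lam, bag in zip(labels, lams, bags)]
    loss = sum(x * x for x in xi)
    phi = eps * sum(wk * wk for wk in w)
    psi = eps * sum(c * c for lam in lams for c in lam)
    return loss + phi + psi, xi


def class_slack_sums(xi: Sequence[float],
                     labels: Sequence[int]) -> Dict[int, float]:
    """Per-class sums e' xi^j, which the claim constrains to be zero."""
    return {j: sum(x for x, d in zip(xi, labels) if d == j) for j in (1, -1)}


if __name__ == "__main__":
    bags = [[[2.0, 0.0], [0.0, 0.0]], [[-1.0, 0.0]]]
    labels = [1, -1]
    lams = [[1.0, 0.0], [1.0]]       # one coefficient vector per bag
    value, xi = objective(bags, labels, lams, w=[1.0, 0.0], eta=0.0)
    print(value, xi, class_slack_sums(xi, labels))
```

In this toy input the per-class slack sums are not both zero, so the point is not feasible for claim 10; a solver would adjust the hyperplane, bias, and convex-hull coefficients until the equality constraints hold.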
-
-
11. A program storage device readable by a computer, tangibly embodying a non-transitory program of instructions executable by the computer to perform the method steps for training a classifier for computer aided detection of digitized medical images, said method comprising the steps of:
-
providing a plurality of bags, each bag containing a plurality of feature samples of a single region-of-interest in said medical image, wherein said feature samples include texture, shape, intensity, and contrast of said region-of-interest, wherein each region-of-interest has been labeled as either malignant or healthy; and training said classifier on said plurality of bags of feature samples, subject to the constraint that at least one point in a convex hull of each bag, corresponding to said feature sample, is correctly classified according to the label of the associated region-of-interest, wherein said classifier is trained by minimizing the expression
$$\nu E(\xi) + \Phi(\omega, \eta) + \Psi(\lambda)$$
over arguments $(\xi, \omega, \eta, \lambda) \in \mathbb{R}^{r+n+1+\gamma}$ subject to the conditions
$$\xi_i = d_i - (\lambda_i^j B_i^j \omega - e\eta), \qquad \xi \in \Omega, \qquad e'\lambda_i^j = 1, \qquad 0 \le \lambda_i^j,$$
wherein $\xi = \{\xi_1, \ldots, \xi_r\}$ are slack terms, $E: \mathbb{R}^r \to \mathbb{R}$ represents a loss function, $\omega$ is a hyperplane coefficient, $\eta$ is the bias term, $\lambda$ is a vector containing the coefficients of the convex combination that defines the representative point of bag $i$ in class $j$ wherein $0 \le \lambda_i^j$, $e'\lambda_i^j = 1$, $\gamma$ is the total number of convex hull coefficients corresponding to the representative points in class $j$, $\Phi: \mathbb{R}^{(n+1)} \to \mathbb{R}$ is a regularization function on the hyperplane coefficients, $\Psi$ is a regularization function on the convex combination coefficients $\lambda_i^j$, $\Omega$ represents a feasible set for $\xi$, matrix $B_i^j \in \mathbb{R}^{m_i^j \times n}$, $i = 1, \ldots, r_j$, $j \in \{\pm 1\}$ is the $i$th bag of class label $j$, $r$ is the total number of representative points, $n$ is the number of features, $m_i^j$ is the number of rows in $B$, vector $d \in \{\pm 1\}^{r_j}$ represents binary bag-labels for the malignant and healthy sets, respectively, and the vector $e$ represents a vector with all its elements equal to one.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
subject to the conditions $\omega^T(\mu_+ - \mu_-) = b$, $e'\lambda_i^j = 1$, $0 \le \lambda_i^j$, wherein
-
-
16. The computer readable program storage device of claim 15, the method further comprising:
initializing
-
17. The computer readable program storage device of claim 16, the method further comprising setting convex-hull coefficients of negative bags to be 1.
-
18. The computer readable program storage device of claim 16, the method further comprising transforming said feature samples into a higher dimensional space using a kernel transformation $K(X^{+}, X)$ for the positive class and $K(X^{-}, X)$ for the negative class, wherein $X^{+}$, $X^{-}$, and $X$ are data matrices for positive, negative and all samples respectively, wherein each row is a sample vector in these matrices, wherein if the size of $X$ is too large, subsampling a random subset from said original feature samples.
-
19. The computer readable program storage device of claim 15, wherein $\Phi(\omega) = \varepsilon \|\omega\|_2^2$ and $\Psi(\lambda) = \varepsilon \|\lambda\|_2^2$, wherein $\varepsilon$ is a positive regularization parameter.
Specification