×

Enhanced max margin learning on multimodal data mining in a multimedia database

  • US 8,463,053 B1
  • Filed: 08/10/2009
  • Issued: 06/11/2013
  • Est. Priority Date: 08/08/2008
  • Status: Active Grant
First Claim
Patent Images

1. A method for multimodal data mining, comprising:

  • defining a multimodal data set comprising image information;

    representing image information of a data object as a set of feature vectors in a feature space representing a plurality of constraints for each training example, wherein the feature vectors comprise a joint feature representation associated with Lagrange multipliers, the feature vectors being partitioned into a dual variable set comprising two partitions and having non-image representations associated with the respective image data object;

    clustering in the feature space to group similar features;

    associating a non-image representation with a respective data object based on the clustering;

    determining a joint feature representation of a respective data object as a mathematical weighted combination of a set of components of the joint feature representation;

    optimizing a weighting for a plurality of components of the mathematical weighted combination with respect to a prediction error between a predicted classification and a training classification by iteratively solving a Lagrange dual problem, with an automated data processor, by partitioning the Lagrange multipliers into an active set and an inactive set, wherein the Lagrange multiplier for a member of the active set is greater than or equal to zero and the Lagrange multiplier for a member of the inactive set is zero, the iteratively solving comprising moving members of the active set having zero-valued Lagrange multipliers to the inactive set without changing an objective function, and moving members of the inactive set to the active set which result in a decrease in the objective function; and

    employing the mathematical weighted combination for automatically classifying a new data object,wherein;

    the set of feature vectors in the feature space represents a plurality of constraints for each training example, the feature vectors comprise joint feature representation defined by Φ

    , having a Lagrange multiplier μ

    i, y for each constraint to form the Lagrangian, wherein i and j denote different elements of a respective set, yi is an annotation and a member of the set Y, xi and xj are each annotations and members of the set X, superscript T denotes a transpose matrix, n is the number of elements in the respective set, y represents a prediction of yi from the set Yi, {tilde over (y)} is an operator of y, α



    i, yμ

    i, yΦ

    i,yi, y, l( y,yi) is a loss function defined as the number of the different entries in vectors y and yi, Φ

    i,yi, y

    i(yi)−

    Φ

    i( y), and a kernel function K((xi, y),(xj,{tilde over (y)}))=<

    Φ

    i,yi, y, Φ

    j,yj, y>

    , the feature vectors being partitioned into a dual variable set μ

    comprising two partitions, μ

    B and μ

    N and non-image representations S associated with the respective image data object, the dual variable set μ

    having i examples such that μ

    =[μ

    1T . . . μ

    nT]T and S=[S1T . . . SnT]T, wherein the lengths of μ and

    S are the same, and Ai is defined as a vector which has the same length as that of μ

    , where Ai, y=1 and Aj, y=0 for j≠

    i, such that A=[A1 . . . An]T, matrix D represents a kernel matrix where each entry is K((xi, y), (xj,{tilde over (y)})), and C represents a vector where each entry is a constant C;

    the feature vectors comprise a dual variable set μ

    comprising labeled examples which is decomposed into two partitions, μ

    B and μ

    N ; and

    said optimizing comprises iteratively solving for each member of the set;

View all claims
  • 3 Assignments
Timeline View
Assignment View
    ×
    ×