Structured prediction model learning apparatus, method, program, and recording medium

  • US 8,566,260 B2
  • Filed: 09/30/2010
  • Issued: 10/22/2013
  • Est. Priority Date: 09/30/2010
  • Status: Active Grant
First Claim

1. A structured prediction model learning apparatus, having a central processing unit, for learning a structured prediction model used to predict an output structure y corresponding to an input structure x, by using supervised data DL and unsupervised data DU, the structured prediction model learning apparatus comprising:

    an output candidate graph generator implemented by the central processing unit to generate a supervised data output candidate graph for the supervised data and an unsupervised data output candidate graph for the unsupervised data, by using a set of definition data for generating output candidates identified by a structured prediction problem;

    a feature vector generator extracting features from the supervised data output candidate graph and the unsupervised data output candidate graph by using a feature extraction template, generating a D-dimensional base-model feature vector f_{x,y} corresponding to a set of the features extracted from the supervised data output candidate graph, dividing a set of the features extracted from the unsupervised data output candidate graph into K subsets, and generating a D_k-dimensional auxiliary model feature vector g^{(k)}_{x,y} corresponding to features included in a subset k of the K subsets, where K is a natural number and k ∈ {1, 2, ..., K};

    a parameter generator generating a base-model parameter set λ which includes a first parameter set w formed of D first parameters in one-to-one correspondence with D elements of the base-model feature vector f_{x,y}, generating an auxiliary model parameter set θ^{(k)} formed of D_k auxiliary model parameters in one-to-one correspondence with D_k elements of the auxiliary model feature vector g^{(k)}_{x,y}, and generating a set Θ = {θ^{(1)}, θ^{(2)}, ..., θ^{(K)}} of auxiliary model parameter sets, formed of K auxiliary model parameter sets θ^{(k)};

    an auxiliary model parameter estimating unit estimating the set Θ of auxiliary model parameter sets which minimizes the Bregman divergence having a regularization term obtained from the auxiliary model parameter set θ^{(k)}, between each auxiliary model q_k and a reference function r̃(x, y) which is a nonnegative function and indicates the degree of pseudo accuracy of the output structure y corresponding to the input structure x, by using the regularization term and the unsupervised data DU, where the auxiliary model q_k is obtained by defining the auxiliary model parameter set θ^{(k)} with a log-linear model; and

    a base-model parameter estimating unit estimating a base-model parameter set λ which minimizes an empirical risk function defined beforehand, by using the supervised data DL and the set Θ of auxiliary model parameter sets, where the base-model parameter set λ includes a second parameter set v = {v_1, v_2, ..., v_K} formed of K second parameters in one-to-one correspondence with K auxiliary models;

    wherein the auxiliary model parameter estimating unit uses the auxiliary model parameter set θ^{(k)} to obtain an L1 norm regularization term |θ^{(k)}|_1, obtains the Bregman divergence having the regularization term as the following empirical generalized relative entropy having a regularization term
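
The formula that the claim refers to is not reproduced in this listing. Purely as an illustration of the kind of quantity described, the sketch below computes a generalized relative entropy (generalized KL divergence, one instance of a Bregman divergence) between nonnegative reference values and an unnormalized log-linear auxiliary model, plus an L1 penalty on the auxiliary parameters. The function names, the unnormalized form exp(θ · g), and the penalty weight `c` are assumptions made for this sketch, not taken from the patent.

```python
import math

def log_linear(theta, g):
    """Unnormalized log-linear value q = exp(theta . g) (assumed form)."""
    return math.exp(sum(t * gi for t, gi in zip(theta, g)))

def generalized_relative_entropy(r, q):
    """Generalized KL divergence between nonnegative r and positive q:
    sum_i r_i * log(r_i / q_i) - r_i + q_i, with 0 * log(0) taken as 0."""
    total = 0.0
    for ri, qi in zip(r, q):
        if ri > 0:
            total += ri * math.log(ri / qi)
        total += qi - ri
    return total

def l1_regularized_objective(theta, feats, r, c):
    """Divergence between reference values r and the log-linear model
    evaluated on each candidate's feature vector, plus c * |theta|_1.
    `feats` and `c` are hypothetical names introduced for this sketch."""
    q = [log_linear(theta, g) for g in feats]
    return generalized_relative_entropy(r, q) + c * sum(abs(t) for t in theta)

if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0]]   # two candidate outputs, two features
    r = [1.0, 1.0]                     # toy reference values
    print(l1_regularized_objective([0.0, 0.0], feats, r, c=0.5))
```

In this toy setup the L1 term pushes individual auxiliary parameters toward exactly zero, which is the usual motivation for choosing an L1 rather than an L2 penalty.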
