System and method for analyzing language using supervised machine learning method

  • US 7,542,894 B2
  • Filed: 07/08/2002
  • Issued: 06/02/2009
  • Est. Priority Date: 10/09/2001
  • Status: Expired due to Fees
First Claim

1. A system for analyzing Japanese language using supervised learning method, the system comprising:

    sentence data storage means for storing sentence data which do not include solutions for a target problem;

    problem expression storage means for storing problem expression data comprising a problem expression which indicates an object of a language analysis and information of expressions corresponding to said problem expression;

    problem expression extraction processing means for extracting a portion which corresponds to any one of the expressions corresponding to the problem expression from said sentence data by using a predetermined language analysis and replacing the extracted portion of the sentence data with the problem expression;

    supervised data creation processing means for creating a plurality of supervised data, which is formed as a pair of a problem and either a solution or a solution candidate, wherein the pair comprises the sentence data in which the portion is replaced with the problem expression as the problem and either the portion extracted from said sentence data by the problem expression extracting processing means as the solution or the portion extracted from other sentence data except said sentence data, which are stored in said sentence data storage means as the solution candidate;

    supervised data features obtaining processing means for obtaining a plurality of predetermined syntactic supervised data features, which include one or more of a part of speech, root form, lexical category, dependency structure and modification structure from each sentence of the supervised data using syntactic analysis and then generating solution/features pairs of each sentence of the supervised data, wherein the solution/features pairs are a positive example having the plurality of supervised data features and the solution and negative examples having the plurality of supervised data features and each one of the solution candidates;

    machine learning processing means for performing machine learning processing on the solution/features pairs using a kernel function executed as a support vector machine, by classifying the solution based upon generating a hyperplane which maximizes an interval of the positive and negative examples and divides these two examples by the hyperplane on a space having dimensions determined by the plurality of obtained features and storing the hyperplane as the result of the machine learning processing in the learning result storing database;

    object sentence data obtaining processing means for inputting object sentence data and obtaining a plurality of syntactic object sentence features, which include one or more of a part of speech, root form, lexical category, dependency structure and modification structure from the input object sentence data using the syntactic analysis; and

    solution extrapolation processing means for using the stored hyperplane to determine which divided part of the space does the plurality of the syntactic object sentence features belong to, and estimates a determined part with highest probability as the solution as classified for the plurality of syntactic object sentence features.
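
The supervised-data creation and hyperplane steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the sentences have already been split into tokens by some morphological analyzer, and it stands in for the claimed "predetermined language analysis" with a plain token match. The names `PROBLEM`, `Example`, `extract_and_replace`, `create_supervised_data`, and `side_of_hyperplane` are hypothetical, as are the toy sentences and weights. Skipping candidates identical to the true solution is a design choice of this sketch, not something the claim states. The last function only reports which side of a stored linear hyperplane a feature vector falls on; it does not train a support vector machine or compute a probability.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

PROBLEM = "<P>"  # hypothetical placeholder standing in for the problem expression


@dataclass(frozen=True)
class Example:
    problem: Tuple[str, ...]  # sentence with the extracted portion replaced
    candidate: str            # the solution or a solution candidate
    is_solution: bool         # True for the positive example


def extract_and_replace(
    tokens: Sequence[str], expressions: Sequence[str]
) -> Optional[Tuple[Tuple[str, ...], str]]:
    """Replace the first token matching a listed expression with PROBLEM."""
    for i, tok in enumerate(tokens):
        if tok in expressions:
            replaced = tuple(tokens[:i]) + (PROBLEM,) + tuple(tokens[i + 1:])
            return replaced, tok
    return None  # no listed expression occurs in this sentence


def create_supervised_data(
    sentences: Sequence[Sequence[str]], expressions: Sequence[str]
) -> List[Example]:
    """Build positive and negative (problem, candidate) pairs from raw sentences."""
    results = (extract_and_replace(s, expressions) for s in sentences)
    extracted = [r for r in results if r is not None]
    examples: List[Example] = []
    for problem, solution in extracted:
        # Positive example: the portion taken from this very sentence.
        examples.append(Example(problem, solution, True))
        # Negative examples: portions extracted from the other sentences.
        # Assumption of this sketch: candidates equal to the solution are skipped.
        others = sorted({sol for _, sol in extracted if sol != solution})
        examples.extend(Example(problem, cand, False) for cand in others)
    return examples


def side_of_hyperplane(
    weights: Sequence[float], bias: float, features: Sequence[float]
) -> int:
    """Return +1 or -1 for the side of w.x + b = 0 on which the features lie."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score >= 0 else -1


# Toy, pre-tokenized sentences; the case particles are the extracted portions.
sentences = [["猫", "が", "寝る"], ["本", "を", "読む"]]
data = create_supervised_data(sentences, ["が", "を"])
```

In this toy run each sentence yields one positive pair (its own particle) and one negative pair (the particle taken from the other sentence). A real system would map each pair to syntactic features before learning the hyperplane.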
