Method and system for bootstrapping statistical processing into a rule-based natural language parser

  • US 5,963,894 A
  • Filed: 05/20/1997
  • Issued: 10/05/1999
  • Est. Priority Date: 06/24/1994
  • Status: Expired due to Term
First Claim

1. A method in a computer system for bootstrapping statistical processing into a rule-based natural language parser to efficiently parse a principal input string using a plurality of sample input strings representative of strings to be parsed by the natural language parser, the natural language parser for producing one or more parse results from an input string comprised of words by applying rules from a set of conditioned rules that each combine words or already combined groups of words, certain subsets of the set of rules being applicable when parsing particular input strings, comprising the steps of:

  • for each rule, initializing a plurality of indications of the number of times that the rule has succeeded, each of the plurality of indications corresponding to a characteristic of at least one of the words or already combined groups of words that may be combined by the rule;

    for each sample input string:

    exhaustively parsing the sample input string by applying each applicable rule of the set of rules to produce one or more parse results, and

    if fewer than a maximum number of parse results were produced by exhaustively parsing the sample input string, updating for each rule that combined words or already combined groups of words in the parse result an indication of the number of times that the rule succeeded that corresponds to a characteristic of at least one of the words or already combined groups of words of the sample input string combined in the parse results by the rule; and

    efficiently parsing the principal input string by applying applicable rules to the principal input string from the set of rules in the decreasing order of their likelihood of success as indicated by updated indications of the number of times that each rule succeeded that corresponds to a characteristic of at least one of the words or already combined groups of words of the sample input string combined in the parse results by the rule.
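The three steps of the claim (initializing per-rule success counts, updating them from exhaustive parses of sample strings, and then ordering rules by those counts) can be illustrated with a small sketch. Everything in it is an assumption made for illustration and does not come from the patent: the toy grammar `RULES`, the `LEXICON`, the cutoff `MAX_PARSES`, the sample sentences, and in particular the choice of "number of words in the combined span" as the characteristic keyed alongside each rule. The sketch stops at producing the rule ordering; it does not implement the efficient parse itself.

```python
from collections import Counter
from itertools import product

# Hypothetical toy grammar: (rule name, parent category, (left child, right child)).
RULES = [
    ("NP->Det N", "NP", ("Det", "N")),
    ("NP->Adj N", "NP", ("Adj", "N")),
    ("VP->V NP", "VP", ("V", "NP")),
    ("S->NP VP", "S", ("NP", "VP")),
]
LEXICON = {"the": "Det", "a": "Det", "old": "Adj",
           "dog": "N", "cat": "N", "sees": "V"}
MAX_PARSES = 3  # assumed cutoff; the claim only says "a maximum number"

SAMPLES = [s.split() for s in (
    "the dog sees a cat",
    "a cat sees the dog",
    "old dog sees a cat",
)]


def exhaustive_parse(words):
    """Apply every applicable rule; return each full parse as a list of
    (rule name, characteristic) pairs, one pair per rule application."""
    n = len(words)
    # chart[i][j] holds (category, applications) pairs spanning words[i:j].
    chart = [[[] for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1].append((LEXICON[word], []))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                pairs = product(chart[i][k], chart[k][j])
                for (lcat, lapps), (rcat, rapps) in pairs:
                    for name, parent, rhs in RULES:
                        if rhs == (lcat, rcat):
                            # Assumed characteristic: width of the combined span.
                            apps = lapps + rapps + [(name, width)]
                            chart[i][j].append((parent, apps))
    return [apps for cat, apps in chart[0][n] if cat == "S"]


def train(samples):
    """Count successes per (rule, characteristic) over the sample strings."""
    counts = Counter()  # missing keys read as zero, i.e. initialized counts
    for words in samples:
        parses = exhaustive_parse(words)
        # Only learn from samples that were not excessively ambiguous.
        if len(parses) < MAX_PARSES:
            for apps in parses:
                counts.update(apps)
    return counts


def ordered_rules(counts, characteristic):
    """Rule names in decreasing order of recorded successes for a characteristic."""
    return sorted((name for name, _, _ in RULES),
                  key=lambda name: counts[(name, characteristic)],
                  reverse=True)


counts = train(SAMPLES)
print(ordered_rules(counts, 2))
```

A parser for the principal input string would consult `ordered_rules` when deciding which rule to try next for a pair of constituents, so that rules with more recorded successes are attempted first. Raw counts stand in here for "likelihood of success"; a real system might normalize them.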
