Method and apparatus for computerized extracting of scheduling information from a natural language e-mail

  • US 7,158,980 B2
  • Filed: 10/02/2003
  • Issued: 01/02/2007
  • Est. Priority Date: 10/02/2003
  • Status: Active Grant
First Claim

1. A method for computerized extracting of scheduling information from a natural language text for automatic entry into a calendar application, the method comprising the following steps:

  • (a) parsing the natural language text to build a dependency tree by segmenting each sentence in the natural language text into words, building the dependency tree containing dependency pairs by comparing word pairs in the natural language text with a dependency database, and adding the word pairs found in the dependency database as dependency pairs to the dependency tree;

    (b) determining if the natural language text contains scheduling information by calculating a probability sum for the dependency tree; and

    (c) if the probability sum exceeds a predetermined value, extracting scheduling information from the dependency tree and exporting the scheduling information to the calendar application;

    wherein building the dependency database includes the following steps:

    segmenting each sentence in a text corpus into words, wherein the text corpus contains a plurality of sample natural language texts containing scheduling information;

    for each sentence in the text corpus, checking all possible combinations of word pairs to determine if the word pair has a high co-occurrency in the text corpus;

    if the word pair has the high co-occurrency in the text corpus, determining a head word using a tagged corpus, and checking the validity of the word pair using violation constraints, wherein the tagged corpus specifies actual head words for sentences relevant to scheduling information in the text corpus and contains dependencies for all other words with respect to the actual head words, and the violation constraints specify illegal dependency structures;

    if the word pair is a valid dependency pair, computing a probability of the word pair, adding the word pair as a dependency pair to the dependency database, and adding the probability of the dependency pair to the dependency database, wherein the probability of the dependency pair corresponds to a frequency of the word pair in the text corpus; and

    repeating the above steps until no new dependency pairs are identified.
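
The steps recited in claim 1 can be illustrated with a small sketch. This is not the patented implementation, only a toy reading of the claim under stated assumptions: whitespace segmentation stands in for the word segmenter; "high co-occurrency" is taken to mean a pair count of at least `min_count`; the probability of a pair is its sentence frequency in the corpus; pairs are unordered, because head-word determination from a tagged corpus is not modelled; the violation constraints are reduced to a caller-supplied `is_valid` predicate; the repetition until no new pairs are found is collapsed into one pass; and the "tree" is a flat list of matched pairs. All function names, parameters, and the threshold value are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def segment(sentence):
    """Naive segmentation: split on whitespace, strip punctuation, lowercase."""
    words = (w.strip(".,;:!?").lower() for w in sentence.split())
    return [w for w in words if w]


def word_pairs(sentence):
    """All unordered pairs of distinct words in one sentence."""
    return combinations(sorted(set(segment(sentence))), 2)


def build_dependency_db(corpus, min_count=2, is_valid=lambda pair: True):
    """Build {pair: probability} from sample scheduling sentences.

    Assumption: a pair is kept when it occurs in at least min_count
    sentences and passes is_valid (a stand-in for violation constraints).
    The probability is the fraction of corpus sentences containing the pair.
    """
    counts = Counter()
    for sentence in corpus:
        counts.update(word_pairs(sentence))
    return {
        pair: count / len(corpus)
        for pair, count in counts.items()
        if count >= min_count and is_valid(pair)
    }


def build_dependency_tree(text, db):
    """Step (a): collect the word pairs of each sentence found in the database."""
    tree = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tree.extend(pair for pair in word_pairs(sentence) if pair in db)
    return tree


def probability_sum(tree, db):
    """Step (b): sum the stored probabilities of the matched pairs."""
    return sum(db[pair] for pair in tree)


def contains_scheduling_info(text, db, threshold=1.0):
    """Step (c) gate: compare the probability sum with a predetermined value."""
    return probability_sum(build_dependency_tree(text, db), db) > threshold


corpus = [
    "Let us meet on Monday at noon.",
    "Can we meet on Friday at noon?",
    "Please meet me on Tuesday.",
]
db = build_dependency_db(corpus)

print(contains_scheduling_info("Shall we meet on Sunday at noon?", db))  # True
print(contains_scheduling_info("The build failed again.", db))           # False
```

In this toy corpus only six pairs (for example `("meet", "on")`) occur often enough to enter the database, so a message that reuses them clears the threshold while an unrelated message scores zero. Extracting the date, time, and subject from the matched pairs and exporting them to a calendar application is left out of the sketch.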
