Method for dividing sentences into phrases using entropy calculations of word combinations based on adjacent words

  • US 6,505,151 B1
  • Filed: 03/15/2000
  • Issued: 01/07/2003
  • Est. Priority Date: 03/15/2000
  • Status: Expired due to Fees
First Claim

1. A method for automating the process of dividing sentences into phrases, comprising the following steps:

  • (a) dividing a sentence into sub-sentences using statistical analysis, including the following substeps:

    (a.1) for each pair of adjacent words in the sentence, calculating a metric which represents a strength of disconnection between the adjacent words, and

    (a.2) breaking the sentence into sub-sentences at locations in the sentence where the metric exceeds a first threshold; and

  • (b) dividing the sub-sentences into phrases, using statistical analysis;

    wherein in substep (a.1), the metric is a cutability measure that is calculated as a sum of backward entropy, forward entropy and mutual information; and

    wherein in substep (a.1), the forward entropy (FE) of a character Ci, which immediately precedes a character Cj in a sentence, is calculated using the following equation:

    $$FE(C_i) = -\sum_{C_j} P_F(C_j \mid C_i) \log P_F(C_j \mid C_i)$$

    where PF(Cj|Ci) is the probability of Cj following Ci.
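
The forward-entropy equation and the threshold test of substep (a.2) can be illustrated with a short sketch. This is not the patented implementation. It assumes that PF(Cj|Ci) is estimated as a relative bigram frequency over a sample corpus and that the logarithm is natural; the claim fixes neither. The names `forward_entropy_table` and `split_at_threshold` are hypothetical. Because this claim gives no formulas for backward entropy or mutual information, the cutability metric is passed in as a caller-supplied function rather than computed here.

```python
import math
from collections import Counter, defaultdict


def forward_entropy_table(corpus):
    """Estimate FE(Ci) for each character Ci that has at least one successor.

    Assumption: PF(Cj|Ci) is the relative frequency of Cj immediately
    following Ci in `corpus` (an iterable of sentence strings).
    """
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        for ci, cj in zip(sentence, sentence[1:]):
            follow_counts[ci][cj] += 1

    table = {}
    for ci, counts in follow_counts.items():
        total = sum(counts.values())
        # FE(Ci) = -sum over Cj of PF(Cj|Ci) * log PF(Cj|Ci)
        table[ci] = -sum(
            (n / total) * math.log(n / total) for n in counts.values()
        )
    return table


def split_at_threshold(tokens, metric, threshold):
    """Break `tokens` between neighbours whose metric exceeds `threshold`.

    `metric(left, right)` stands in for the cutability measure of
    substep (a.1); its definition is left to the caller.
    """
    if not tokens:
        return []
    pieces, current = [], [tokens[0]]
    for left, right in zip(tokens, tokens[1:]):
        if metric(left, right) > threshold:
            pieces.append(current)
            current = []
        current.append(right)
    pieces.append(current)
    return pieces
```

Under these assumptions, a character that is always followed by the same successor has a forward entropy of 0, and a character followed equally often by two different successors has a forward entropy of ln 2 (about 0.693). A higher value means the next character is less predictable, which is why it contributes to the strength of disconnection.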
