Building and updating of co-occurrence dictionary and analyzing of co-occurrence and meaning

  • US 5,406,480 A
  • Filed: 01/15/1993
  • Issued: 04/11/1995
  • Est. Priority Date: 01/17/1992
  • Status: Expired due to Term
First Claim

1. A computer implemented method, implemented by a programmed computer, of building a co-occurrence dictionary describing whether phrases co-occur in one sentence, the phrases belonging to first and second categories in a dictionary containing phrases of a natural language which is an object, said method comprising using the computer to build the co-occurrence dictionary by implementing the steps of:

  • selecting, as a first sub-group of phrases (11), phrases from a first group of phrases (1) comprising all phrases belonging to said first category in said dictionary;

    selecting, as a second sub-group of phrases (21), phrases from a second group of phrases (2) comprising all phrases belonging to said second category in the dictionary;

    preparing first co-occurrence information describing whether each phrase belonging to the first sub-group (11) and each phrase belonging to the second sub-group (21) co-occur in one sentence of the object language;

    preparing second co-occurrence information describing whether each phrase belonging to a third sub-group of phrases (12), comprising all the phrases in the first group (1) which do not belong to the first sub-group (11) and each phrase belonging to the second sub-group (21), co-occur in one sentence of the object language;

    preparing third co-occurrence information describing whether each phrase belonging to a fourth sub-group of phrases (22), comprising all the phrases in the second group (2) which do not belong to the second sub-group (21) and each phrase belonging to the first sub-group (11) co-occur in one sentence of the object language;

    arranging the first co-occurrence information such that each phrase belonging to the first sub-group (11) corresponds to a real number vector with a dimension below a common maximum dimension and each phrase belonging to the second sub-group (21) corresponds to a real number vector with a dimension below the common maximum dimension;

    calculating a value of the real number vector corresponding to each phrase in the first sub-group (11) and a value of the real number vector corresponding to each phrase in the second sub-group (21) on the basis of the first co-occurrence information so that the number of sets of two phrases, wherein;

    a value of an inner product of the real number vector corresponding to a first phrase and the real number vector corresponding to a second phrase becomes positive when describing, in the first co-occurrence information, that a first phrase belonging to said first sub-group (11) and a second phrase belonging to said second sub-group (21) co-occur in one sentence, and the value of an inner product of the real number vector corresponding to said first phrase and the real number vector corresponding to said second phrase becomes negative when describing, in said first co-occurrence information, that said first phrase belonging to said first sub-group (11) and said second phrase belonging to said second sub-group (21) do not co-occur in one sentence, becomes the greatest of all the numbers of sets each comprising phrases belonging to said first sub-group (11) and phrases belonging to the second sub-group (21);

    arranging said second co-occurrence information such that each phrase belonging to said third sub-group (12) corresponds to a real number vector with a dimension below the maximum dimension;

    calculating a value of the real number vector corresponding to each phrase in said third sub-group (12) on the basis of said second co-occurrence information so that the number of sets of two phrases, wherein;

    a value of the inner product of the real number vector corresponding to a third phrase belonging to said third sub-group (12) and the real number vector corresponding to a fourth phrase belonging to said second sub-group (21) and calculated on the basis of said first co-occurrence information becomes positive when describing, in said second co-occurrence information, that the third phrase and the fourth phrase co-occur in one sentence, and a value of an inner product of the real number vector corresponding to the third phrase and the real number vector corresponding to the fourth phrase becomes negative when describing, in said second co-occurrence information, that the third phrase and the fourth phrase do not co-occur in one sentence, becomes the largest of all the numbers of sets each comprising a phrase belonging to said third sub-group (12) and a phrase belonging to said second sub-group (21);

    arranging said third co-occurrence information such that each phrase belonging to the fourth sub-group (22) corresponds to a real number vector with a dimension below the maximum dimension; and

    calculating a value of the real number vector corresponding to each phrase in the fourth sub-group (22) on the basis of said third co-occurrence information so that the number of sets of two phrases, wherein;

    the inner product of the real number vector corresponding to a fifth phrase belonging to said first sub-group (11) and calculated on the basis of said first co-occurrence information and the real number vector corresponding to a sixth phrase belonging to the fourth sub-group (22) becomes positive when describing, in the third co-occurrence information, that the fifth phrase and the sixth phrase co-occur in one sentence and, on the other hand, the inner product of the real number vector corresponding to the fifth phrase calculated on the basis of the first co-occurrence information and the real number vector corresponding to the sixth phrase becomes negative when describing, in the third co-occurrence information, that the fifth phrase and the sixth phrase do not co-occur in one sentence, becomes the greatest of all the numbers of sets each comprising a phrase belonging to said first sub-group (11) and a phrase belonging to said fourth sub-group (22).
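
The three calculating steps of claim 1 can be illustrated with a small Python sketch. It is not the patented implementation: the claim only requires that the number of phrase pairs whose inner-product sign matches the co-occurrence information be made as large as possible, and does not name an optimization procedure. The perceptron-style update used here is an assumed heuristic, and the names `fit`, `agreement`, `init`, `DIM`, the sample phrases, and the co-occurrence tables are all hypothetical. Stage one fits vectors for sub-groups (11) and (21) together; stages two and three fit sub-groups (12) and (22) while holding the stage-one vectors fixed.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def agreement(left, right, cooc):
    """Count pairs whose inner-product sign matches the co-occurrence flag."""
    count = 0
    for (p, q), together in cooc.items():
        value = dot(left[p], right[q])
        if (together and value > 0) or (not together and value < 0):
            count += 1
    return count

def fit(left, right, cooc, move_left=True, move_right=True, epochs=200, lr=0.1):
    """Assumed perceptron-style heuristic; returns the best vectors seen."""
    left = {p: list(v) for p, v in left.items()}
    right = {q: list(v) for q, v in right.items()}
    best = (agreement(left, right, cooc), dict(left), dict(right))
    for _ in range(epochs):
        for (p, q), together in cooc.items():
            sign = 1.0 if together else -1.0
            lv, rv = left[p], right[q]
            if sign * dot(lv, rv) > 0:
                continue  # this pair already has the required sign
            # Nudge the movable vector(s) toward the required sign.
            if move_left:
                left[p] = [a + lr * sign * b for a, b in zip(lv, rv)]
            if move_right:
                right[q] = [b + lr * sign * a for a, b in zip(lv, rv)]
        score = agreement(left, right, cooc)
        if score > best[0]:
            best = (score, dict(left), dict(right))
    return best[1], best[2]

def init(phrases, dim, rng):
    return {p: [rng.gauss(0, 1) for _ in range(dim)] for p in phrases}

rng = random.Random(0)
DIM = 2  # hypothetical common maximum dimension

sub11 = ["eat", "drink"]    # first sub-group (11)
sub21 = ["bread", "water"]  # second sub-group (21)
sub12 = ["devour"]          # third sub-group (12): rest of the first group
sub22 = ["tea"]             # fourth sub-group (22): rest of the second group

# Hypothetical co-occurrence information (True = co-occur in one sentence).
first_info = {("eat", "bread"): True, ("eat", "water"): False,
              ("drink", "bread"): False, ("drink", "water"): True}
second_info = {("devour", "bread"): True, ("devour", "water"): False}
third_info = {("eat", "tea"): False, ("drink", "tea"): True}

# Stage 1: fit (11) and (21) jointly from the first co-occurrence information.
v11, v21 = fit(init(sub11, DIM, rng), init(sub21, DIM, rng), first_info)
# Stage 2: fit (12) against the fixed stage-1 vectors of (21).
v12, v21_after = fit(init(sub12, DIM, rng), v21, second_info, move_right=False)
# Stage 3: fit (22) against the fixed stage-1 vectors of (11).
v11_after, v22 = fit(v11, init(sub22, DIM, rng), third_info, move_left=False)

print(agreement(v11, v21, first_info), "of", len(first_info), "pairs agree")
```

Under this reading, the staged design means only the small sub-groups need a joint fit. Every remaining phrase is fitted on its own against vectors that are already fixed, so the full first-group by second-group table is never needed.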
