
Contextual weighting of words in a word grouping

  • US 9,201,876 B1
  • Filed: 05/29/2012
  • Issued: 12/01/2015
  • Est. Priority Date: 05/29/2012
  • Status: Active Grant
First Claim

1. A computer implemented method of determining a co-occurrence relationship between words in a corpus of word groupings, comprising:

    identifying a plurality of word pairs from a vocabulary of words;

    determining, utilizing one or more processors, a co-occurrence probability for each of the word pairs in a corpus having a plurality of word groupings, each of the co-occurrence probability is based on the probability of co-occurrence of a single of the word pairs in a single of the word groupings;

    wherein determining the co-occurrence probability for a word pair of the word pairs comprises determining a weighted count of the word groupings in which the word pair is present, wherein a word grouping of the word groupings in which the word pair is present is from a document, and wherein the weight of the contribution of the word grouping to the weighted count is based on at least two of frequency of occurrence, field weighting, and decorations of both words of the word pair in the document;

    determining, utilizing one or more processors, a co-occurrence consistency for each of the word pairs by comparing the co-occurrence probability for each of the word pairs to an incidental occurrence probability for each of the word pairs, the incidental occurrence probability for each of the word pairs being specific to a respective of the word pairs;

    creating a co-occurrence consistency matrix with the co-occurrence consistency for each of the word pairs;

    receiving, by a search engine, a query submitted to the search engine by a user, the query including a word grouping having a plurality of word grouping words;

    identifying, by the search engine and based on the co-occurrence consistency matrix, the co-occurrence consistency for each of a plurality of the word pairs in the word grouping words;

    performing, by the search engine, a link analysis on the word grouping words utilizing the identified co-occurrence consistencies for the plurality of the word pairs in the word grouping words as weighting factors in the link analysis;

    assigning, by the search engine, a contextual weight to each of a plurality of the word grouping words based on the link analysis; and

    providing, by the search engine and based on the assigned contextual weights, results to the query submitted to the search engine by the user.
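
The steps recited in claim 1 can be illustrated with a short Python sketch. It is not taken from the patent and is only one possible reading of the claim language. The following choices are assumptions made for illustration: each word's weight is its frequency multiplied by a field weight, a grouping contributes the product of the two word weights to a pair's weighted count, the co-occurrence consistency is a clamped pointwise mutual information score, and the link analysis is a PageRank-style power iteration. The function names (`word_weight`, `pair_probabilities`, `consistency_matrix`, `contextual_weights`) are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def word_weight(frequency, field_weight):
    """Assumed per-word weight: frequency in the document times a field weight."""
    return frequency * field_weight


def pair_probabilities(groupings):
    """Weighted co-occurrence probability for each unordered word pair.

    `groupings` is a list of dicts mapping word -> weight (see word_weight).
    Each grouping adds the product of the two word weights to the pair's
    weighted count (an assumption, the claim does not fix the formula).
    """
    counts = defaultdict(float)
    for words in groupings:
        for a, b in combinations(sorted(words), 2):
            counts[(a, b)] += words[a] * words[b]
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in counts.items()}


def consistency_matrix(pair_prob):
    """Compare each pair's probability with its incidental (chance) probability.

    The comparison used here is pointwise mutual information, clamped at zero.
    The matrix is stored sparsely as a dict keyed by the sorted word pair.
    """
    marginal = defaultdict(float)
    for (a, b), p in pair_prob.items():
        # Split each unordered pair evenly so the marginals sum to 1.
        marginal[a] += p / 2
        marginal[b] += p / 2
    matrix = {}
    for (a, b), p in pair_prob.items():
        incidental = marginal[a] * marginal[b]  # specific to this pair
        matrix[(a, b)] = max(0.0, math.log((p / 2) / incidental))
    return matrix


def contextual_weights(query_words, matrix, damping=0.85, iterations=50):
    """PageRank-style link analysis over the query words.

    Edges between query words are weighted by their co-occurrence
    consistency; the resulting scores serve as contextual weights.
    """
    words = list(dict.fromkeys(query_words))  # de-duplicate, keep order
    n = len(words)

    def edge(a, b):
        return matrix.get((a, b) if a < b else (b, a), 0.0)

    out = {a: sum(edge(a, b) for b in words if b != a) for a in words}
    score = {a: 1.0 / n for a in words}
    for _ in range(iterations):
        # Words with no outgoing weight spread their score uniformly.
        dangling = sum(score[a] for a in words if out[a] == 0.0)
        updated = {}
        for a in words:
            inflow = sum(
                score[b] * edge(b, a) / out[b]
                for b in words
                if b != a and out[b] > 0.0
            )
            updated[a] = (1 - damping) / n + damping * (inflow + dangling / n)
        score = updated
    return score
```

Under these assumptions, a search engine could call `pair_probabilities` and `consistency_matrix` once over the corpus, then call `contextual_weights` for each incoming query and use the returned scores when ranking results. Words that are strongly tied to the other query words receive larger weights than words that co-occur with them only loosely.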
