Query phrasification

  • US 8,166,021 B1
  • Filed: 03/30/2007
  • Issued: 04/24/2012
  • Est. Priority Date: 03/30/2007
  • Status: Active Grant
First Claim

1. A computer-implemented method for identifying valid phrases in an input text comprising a plurality of three or more words, the method comprising:

  • decomposing, by at least one processor of a computer system, the input text into a plurality of candidate phrasifications, including different groupings of words of the input text, each candidate phrasification comprising a disjoint union of component phrases, and each component phrase including at least one word or related word of the input text;

    scoring, by at least one of the processors of the computer system, at least two of the candidate phrasifications, wherein the candidate phrasifications include two or more component phrases, and wherein the scoring is based on a probability of occurrence of each of the candidate phrasification's component phrases, and is based on the number of component phrases constituting the candidate phrasification, wherein candidate phrasifications having relatively fewer component phrases are weighted higher than candidate phrasifications having relatively more component phrases;

    comparing, by at least one of the processors of the computer system, a score for each scored candidate phrasification to a threshold value;

    selecting, by at least one of the processors of the computer system, at least one candidate phrasification, wherein the scores of each selected phrasification exceeds a chosen threshold value; and

    identifying, by at least one of the processors of the computer system, the component phrases of each selected phrasification as valid phrases for the input text.
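
The steps of claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation: the claim does not say how the phrase probabilities and the phrase count are combined. The sketch assumes a product of per-phrase probabilities multiplied by a penalty for each additional component phrase. The function names, the `floor` and `penalty` parameters, and the toy probability table are all hypothetical. The sketch only groups contiguous words and ignores the claim's "related word" variants.

```python
from itertools import combinations
from math import prod


def phrasifications(words):
    """Yield every split of `words` into contiguous, non-overlapping phrases."""
    n = len(words)
    for k in range(n):  # k = number of cut points between words
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield tuple(" ".join(words[a:b]) for a, b in zip(bounds, bounds[1:]))


def score(candidate, phrase_prob, floor=1e-6, penalty=0.5):
    """Assumed scoring rule: product of phrase probabilities, with each
    extra component phrase multiplying the score by `penalty` (< 1), so
    candidates with fewer phrases are weighted higher."""
    likelihood = prod(phrase_prob.get(phrase, floor) for phrase in candidate)
    return likelihood * penalty ** (len(candidate) - 1)


def valid_phrases(words, phrase_prob, threshold):
    """Select candidates scoring above `threshold`; return their phrases."""
    selected = [c for c in phrasifications(words)
                if score(c, phrase_prob) > threshold]
    return {phrase for c in selected for phrase in c}


# Toy probabilities, invented purely for illustration.
toy_prob = {"new york": 0.6, "pizza": 0.5, "new": 0.3,
            "york": 0.1, "york pizza": 0.01}
print(sorted(valid_phrases(["new", "york", "pizza"], toy_prob, threshold=0.1)))
# → ['new york', 'pizza']
```

With these toy numbers, only the grouping "new york" + "pizza" clears the threshold, so its two component phrases are reported as the valid phrases for the input.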

  • 2 Assignments