Document ranking based on semantic distance between terms in a document

  • US 8,606,778 B1
  • Filed: 10/20/2011
  • Issued: 12/10/2013
  • Est. Priority Date: 03/31/2004
  • Status: Active Grant
First Claim

1. A method comprising:

    identifying, by one or more processors, a document based on two or more search terms;

    forming, by one or more processors, a tree structure based on the document, the tree structure including a plurality of items;

    analyzing, by the one or more processors, a repetition of one or more tags in the tree structure;

    determining, by the one or more processors and based on analyzing the repetition of the one or more tags, that the plurality of items are associated with a list, in the tree structure,
        the list not being defined by a list tag,
        the list including a header, and
        each item, of the plurality of items associated with the list, including a plurality of words that describe the item associated with the list;

    annotating, by the one or more processors, the tree structure to indicate that the list is present;

    determining, by the one or more processors, a metric associated with the two or more search terms in a first manner when the two or more search terms appear in a single item of the plurality of items associated with the list;

    determining, by the one or more processors, the metric associated with the two or more search terms in a second manner when:
        a first search term, of the two or more search terms, appears in a first item of the plurality of items associated with the list, and
        a second search term, of the two or more search terms, appears in a second item of the plurality of items associated with the list,
        the first manner being different than the second manner;

    determining, by the one or more processors, the metric associated with the two or more search terms in a third manner when:
        the first search term appears in the header, and
        the second search term appears in an item of the plurality of items associated with the list,
        the third manner being different than the first manner and the second manner;

    determining, by the one or more processors, a score for the document based on the metric associated with the two or more search terms; and

    ranking, by the one or more processors and based on the score, the document with respect to at least one other document.
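
The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the `Node` class, the function names, the repetition threshold, the two distance constants, the symmetric treatment of the header case, and the scoring formula are all assumptions chosen for illustration. The claim itself requires only that the three manners of determining the metric differ from one another.

```python
from __future__ import annotations
from collections import Counter
from dataclasses import dataclass, field

# All names and constants below are hypothetical, not taken from the patent.
LIST_TAGS = {"ul", "ol", "li", "dl"}   # explicit list markup, excluded by the claim
CROSS_ITEM_DISTANCE = 100              # assumed: terms in different items are far apart
HEADER_ITEM_DISTANCE = 5               # assumed: the header is close to every item


@dataclass
class Node:
    tag: str
    text: str = ""
    children: list[Node] = field(default_factory=list)
    is_list: bool = False              # annotation added when a list is detected


def find_implicit_list(node, min_repeats=3):
    """Return (header, items) for the first node whose children repeat a non-list tag."""
    counts = Counter(child.tag for child in node.children)
    if counts:
        tag, n = counts.most_common(1)[0]
        if n >= min_repeats and tag not in LIST_TAGS:
            node.is_list = True        # annotate the tree structure
            header = next((c.text for c in node.children if c.tag != tag), "")
            items = [c.text for c in node.children if c.tag == tag]
            return header, items
    for child in node.children:
        found = find_implicit_list(child, min_repeats)
        if found:
            return found
    return None


def term_distance(header, items, t1, t2):
    """Compute the metric in one of three manners, depending on where the terms appear."""
    for item in items:
        words = item.lower().split()
        if t1 in words and t2 in words:
            # First manner: both terms in a single item, so count intervening words.
            return abs(words.index(t1) - words.index(t2))
    header_words = header.lower().split()
    in_item = lambda t: any(t in it.lower().split() for it in items)
    # Third manner: one term in the header, the other in an item
    # (treated symmetrically here, which is an assumption).
    if (t1 in header_words and in_item(t2)) or (t2 in header_words and in_item(t1)):
        return HEADER_ITEM_DISTANCE
    # Second manner: the terms appear in different items.
    if in_item(t1) and in_item(t2):
        return CROSS_ITEM_DISTANCE
    return None


def score(distance):
    """Assumed scoring: a smaller distance yields a higher score."""
    return 0.0 if distance is None else 1.0 / (1.0 + distance)


doc = Node("div", children=[
    Node("h3", "Restaurants in Boston"),
    Node("p", "Luigi pizza on Main Street"),
    Node("p", "Saigon pho on Elm Street"),
    Node("p", "Taj curry on Oak Avenue"),
])
header, items = find_implicit_list(doc)
same_item = term_distance(header, items, "pizza", "main")     # first manner
cross_item = term_distance(header, items, "pizza", "pho")     # second manner
header_item = term_distance(header, items, "boston", "pho")   # third manner
```

Under these assumed constants, a document matching both terms within one item scores higher than one matching a header and an item, which in turn scores higher than one matching two different items. Ranking then amounts to sorting documents by `score`.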
