Identifying product references in user-generated content

  • US 9,256,593 B2
  • Filed: 11/28/2012
  • Issued: 02/09/2016
  • Est. Priority Date: 11/28/2012
  • Status: Active Grant
First Claim

1. A method for product extraction, the method comprising:

    receiving, by the computer system, a document;

    identifying, by the computer system, a product type for the document according to content of the document;

    extracting, by the computer system, product attributes and attribute values from the document;

    retrieving, by the computer system, an attribute set corresponding to the product type from a database;

    identifying, by the computer system, a first set of products that have at least the product attributes and the attribute values of the document that are included in the attribute set, the first set of products being nodes in a hierarchical taxonomy;

    filtering, by the computer system, the first set of products by:

    identifying a common ancestor node in the hierarchical taxonomy having all of the first set of products as descendants;

    identifying immediate child nodes of the common ancestor node;

    identifying a majority child node having a major portion of the first set of products as descendants; and

    identifying a second set of products including a portion of the first set of products that are descendants of the majority child node and excluding those products of the first set of products that are not descendants of the majority child node;

    selecting, by the computer system, an inferred product for the document from the second set of products;

    wherein:

    identifying the second set of products comprises:

    calculating a score for each product in the first set of products; and

    selecting the second set of products based at least in part on the calculated scores for the first set of products;

    selecting the second set of products comprises:

    removing products from the first set of products if application of a blacklist rule to the document so indicates; and

    selecting the inferred product comprises:

    selecting the inferred product as specified by a whitelist rule if application of the whitelist rule to the document so indicates; and

    at least one of the blacklist rule and the whitelist rule take as an input a list of keywords from the document.
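
The filtering and selection steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy taxonomy `PARENT`, the function names, and the rule signatures are all hypothetical. The sketch also assumes that "a major portion" means the child node covering the most products (ties go to the first one encountered), that rules are callables taking the document's keyword list, and that the highest-scoring product is chosen when no whitelist rule applies.

```python
from collections import Counter

# Hypothetical toy taxonomy, stored as child -> parent (None marks the root).
PARENT = {
    "electronics": None,
    "cameras": "electronics",
    "phones": "electronics",
    "cam_a": "cameras",
    "cam_b": "cameras",
    "cam_c": "cameras",
    "phone_x": "phones",
}


def path_from_root(node, parent):
    """Return the list of nodes from the root down to `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def filter_by_majority_child(products, parent):
    """Keep only products under the majority child of the common ancestor."""
    paths = {p: path_from_root(p, parent) for p in products}
    # Depth of the deepest node that is a proper ancestor of every product.
    depth = -1
    for level in zip(*(path[:-1] for path in paths.values())):
        if len(set(level)) != 1:
            break
        depth += 1
    if depth < 0:
        return list(products)  # no shared ancestor: nothing to filter on
    # Immediate child of the common ancestor on each product's path.
    child_of = {p: path[depth + 1] for p, path in paths.items()}
    # Assumption: "major portion" is read as "covers the most products".
    majority_child, _ = Counter(child_of.values()).most_common(1)[0]
    return [p for p in products if child_of[p] == majority_child]


def select_inferred_product(candidates, scores, keywords,
                            blacklist_rules=(), whitelist_rules=()):
    """Apply blacklist rules, then whitelist rules, then fall back to scores."""
    # Each blacklist rule returns the set of products it rules out.
    banned = set()
    for rule in blacklist_rules:
        banned |= rule(keywords)
    remaining = [p for p in candidates if p not in banned]
    # Each whitelist rule returns a product to force, or None if it does not apply.
    for rule in whitelist_rules:
        forced = rule(keywords)
        if forced is not None:
            return forced
    # Assumption: with no whitelist match, pick the highest-scoring product.
    return max(remaining, key=lambda p: scores.get(p, 0.0), default=None)


first_set = ["cam_a", "cam_b", "phone_x"]
second_set = filter_by_majority_child(first_set, PARENT)
scores = {"cam_a": 0.4, "cam_b": 0.9}
inferred = select_inferred_product(second_set, scores, ["camera", "zoom"])
```

Here `electronics` is the common ancestor, `cameras` is the majority child, and `phone_x` is dropped as an outlier before the final selection is made.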
