
Product line extraction

  • US 7,853,597 B2
  • Filed: 04/28/2008
  • Issued: 12/14/2010
  • Est. Priority Date: 04/28/2008
  • Status: Expired due to Fees
First Claim

1. One or more computer-readable media having computer-executable instructions embodied thereon for performing a method of extracting product lines from a plurality of product titles, the method comprising:

    receiving the plurality of product titles;

    breaking the plurality of product titles into a plurality of tokens, wherein the plurality of tokens includes unigrams and bigrams;

    generating an association rule for each of a plurality of token pairs, wherein a token pair may include two of the bigrams, two of the unigrams, or one bigram and one unigram, wherein the association rule includes a confidence factor and a support factor, wherein the support factor is a total number of times a first token and a second token occur together in a single product title divided by a total number of product titles within the plurality of product titles, and wherein the confidence factor is the total number of times the first token and the second token occur together within the plurality of product titles divided by a number of times the first token occurs within the plurality of product titles;

    generating a plurality of brand specific tokens that form part of a brand name;

    generating a plurality of product class specific tokens using the plurality of brand specific tokens and the association rule for each of the plurality of token pairs;

    generating a plurality of model specific tokens that form part of a product model; and

    generating a plurality of product lines from the plurality of tokens.
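The tokenization step and the support and confidence factors recited in the claim can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names `tokenize` and `association_rules` and the sample titles are hypothetical. The sketch also assumes lowercase whitespace tokenization, and it assumes that both "occur together" and "number of times the first token occurs" are counted once per title; the claim does not spell out either detail.

```python
from collections import Counter
from itertools import permutations


def tokenize(title):
    """Break one title into its set of unigrams and bigrams (assumed: lowercase, whitespace split)."""
    words = title.lower().split()
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return set(words) | set(bigrams)


def association_rules(titles):
    """Return {(first, second): {"support": ..., "confidence": ...}} for every ordered token pair."""
    token_sets = [tokenize(t) for t in titles]
    n_titles = len(token_sets)

    token_counts = Counter()  # number of titles containing each token (assumption: per-title count)
    pair_counts = Counter()   # number of titles containing both tokens of an ordered pair
    for tokens in token_sets:
        token_counts.update(tokens)
        pair_counts.update(permutations(sorted(tokens), 2))

    rules = {}
    for (first, second), together in pair_counts.items():
        rules[(first, second)] = {
            # co-occurrences divided by the total number of titles
            "support": together / n_titles,
            # co-occurrences divided by occurrences of the first token
            "confidence": together / token_counts[first],
        }
    return rules


# Made-up titles, for illustration only.
titles = [
    "Acme Turbo Blender 500",
    "Acme Turbo Blender 700",
    "Acme Classic Toaster",
    "Zenith Turbo Kettle",
]
rules = association_rules(titles)
print(rules[("blender", "acme")])  # support 0.5, confidence 1.0
print(rules[("acme", "blender")])  # support 0.5, confidence 2/3
```

The rule is directional: support is the same for both orderings of a pair, while confidence depends on which token is treated as the first. A pair made of a bigram and one of its own unigrams always has confidence 1.0 in that direction, so a practical system would probably filter such pairs out; the claim text does not say whether it does.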
