Method and system for determining sets of variant items

  • US 9,135,396 B1
  • Filed: 12/22/2008
  • Issued: 09/15/2015
  • Est. Priority Date: 12/22/2008
  • Status: Active Grant
First Claim

1. A computer-implemented method, comprising:

  • performing, by one or more computers having at least one processor and memory:

    for each particular item of a plurality of items:

    determining one or more other items of the plurality of items that are each distinct from but similar to the particular item, wherein said determining is based on accessing data that includes, for each item of the plurality of items, a textual description of the item that describes the item but is not itself an item in the plurality of items;

    for each given item of the determined one or more other items, identifying an item data pair with one member comprising a sequence of text strings from the textual description of the particular item, and the other member comprising another sequence of text strings from the textual description of the given item;

    subsequent to said identifying, aligning each identified item data pair, wherein said aligning the identified item data pair comprises aligning text in the sequence of text strings from the textual description of the particular item with text in the other sequence of text strings from the textual description of the given item; and

    for each aligned item data pair, determining one or more misalignments of the aligned item data pair, and assigning a similarity score to the aligned item data pair dependent on the one or more misalignments, wherein the similarity score indicates a degree of confidence that the given item and the particular item are distinct variants of each other; and

    based on a plurality of the aligned item data pairs and similarity scores assigned to each of those aligned item data pairs, determining a variant set comprising multiple ones of the plurality of items, wherein each item of the variant set is determined to be a variant of each other item of the variant set;

    wherein at least one of the aligned item data pairs comprises multiple misalignments;

    for each misalignment of the multiple misalignments, determining a respective subscore based on that misalignment;

    wherein said assigning the similarity score to said at least one aligned item data pair comprises assigning a result of a combination of each of said subscores to said at least one aligned item data pair.
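
The claimed steps (align the two descriptions, find the misalignments, score each one, combine the subscores, then group items that are mutual variants) can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything concrete here is an assumption made for illustration: whitespace tokenization, `difflib.SequenceMatcher` as the aligner, the `0.5 ** width` subscore, multiplication as the combination, the `0.4` threshold, and the greedy grouping. The candidate-selection step of the claim is skipped, so every pair of items is scored.

```python
from difflib import SequenceMatcher
from itertools import combinations
from math import prod


def misalignments(tokens_a, tokens_b):
    """Align two token sequences and return the spans where they differ."""
    matcher = SequenceMatcher(a=tokens_a, b=tokens_b, autojunk=False)
    return [
        (tokens_a[i1:i2], tokens_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def subscore(left, right):
    """Hypothetical subscore: halves for every token the misalignment spans."""
    return 0.5 ** max(len(left), len(right))


def similarity(tokens_a, tokens_b):
    """Combine the per-misalignment subscores by multiplying them.

    Identical descriptions have no misalignments and score 1 (empty product);
    a real system might treat those as duplicates rather than variants.
    """
    return prod(subscore(l, r) for l, r in misalignments(tokens_a, tokens_b))


def variant_sets(descriptions, threshold=0.4):
    """Greedily build groups in which every pair scores at least `threshold`."""
    scores = {
        frozenset((a, b)): similarity(descriptions[a].split(), descriptions[b].split())
        for a, b in combinations(descriptions, 2)
    }
    groups = []
    for item in descriptions:
        for group in groups:
            # An item joins a group only if it is a variant of every member.
            if all(scores[frozenset((item, m))] >= threshold for m in group):
                group.append(item)
                break
        else:
            groups.append([item])
    return [group for group in groups if len(group) > 1]


# Made-up catalog entries, keyed by hypothetical item identifiers.
items = {
    "A": "Acme cotton t-shirt blue large",
    "B": "Acme cotton t-shirt red large",
    "C": "Acme cotton t-shirt green large",
    "D": "Zenith stainless steel kettle 1.7 litre",
}
print(variant_sets(items))
```

With these assumptions, the three t-shirt descriptions differ from one another by a single substituted token, so each pair clears the threshold and the three items form one variant set, while the kettle shares no tokens with them and stays out. The all-pairs membership check mirrors the claim's requirement that each item of a variant set be a variant of each other item in it.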
