SYSTEMS AND METHODS FOR RECOGNIZING AMBIGUITY IN METADATA

  • US 20130332400A1
  • Filed: 06/07/2013
  • Published: 12/12/2013
  • Est. Priority Date: 06/08/2012
  • Status: Active Grant
First Claim

1. A method for estimating artist ambiguity in a dataset, comprising:

  • at an electronic device having one or more processors and memory storing one or more programs for execution by the one or more processors:

    applying a statistical classifier to a first dataset including a plurality of media items, wherein each media item is associated with one of a plurality of artist identifiers, each artist identifier identifies a real world artist, and the statistical classifier calculates a respective probability that each respective artist identifier is associated with media items from two or more different real world artists based on a respective feature vector corresponding to the respective artist identifier; and

    providing a report of the first dataset, including the calculated probabilities, to a user of the electronic device;

    wherein each respective feature vector includes features selected from the group consisting of:

    whether the corresponding respective artist identifier matches multiple artist entries in one or more second datasets;

    whether a respective number of countries of registration of media items associated with the corresponding respective artist identifier exceeds a predetermined country threshold;

    whether a respective number of characters in the corresponding respective artist identifier exceeds a predetermined character threshold;

    whether a respective number of record labels associated with the corresponding respective artist identifier exceeds a predetermined label threshold;

    whether the corresponding respective artist identifier is associated with albums in at least two different languages; and

    whether a difference between an earliest release date and a latest release date of media items associated with the corresponding respective artist identifier exceeds a predetermined time span threshold.
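
The steps recited in the claim can be illustrated with a minimal sketch. It builds the six binary features for each artist identifier and maps them to a probability. Everything named here is an assumption made for illustration: the `ArtistStats` record, the threshold values, and the hand-picked logistic weights are hypothetical, and the claim does not specify which statistical classifier is used. A logistic function stands in for it; in practice the weights would be learned from labeled data.

```python
from dataclasses import dataclass
from datetime import date
import math


@dataclass
class ArtistStats:
    """Hypothetical per-identifier summary of the first dataset."""
    artist_id: str
    matching_entries: int   # matching artist entries in the second dataset(s)
    countries: int          # countries of registration of the media items
    labels: int             # record labels associated with the identifier
    languages: int          # distinct album languages
    earliest_release: date
    latest_release: date


# Illustrative thresholds only; the claim calls them "predetermined".
COUNTRY_THRESHOLD = 2
CHARACTER_THRESHOLD = 20
LABEL_THRESHOLD = 3
TIME_SPAN_THRESHOLD_DAYS = 30 * 365


def feature_vector(s: ArtistStats) -> list[int]:
    """Return the six binary features in the order listed in the claim."""
    span_days = (s.latest_release - s.earliest_release).days
    return [
        int(s.matching_entries > 1),
        int(s.countries > COUNTRY_THRESHOLD),
        int(len(s.artist_id) > CHARACTER_THRESHOLD),
        int(s.labels > LABEL_THRESHOLD),
        int(s.languages >= 2),
        int(span_days > TIME_SPAN_THRESHOLD_DAYS),
    ]


# Hypothetical weights and bias, not taken from the patent.
WEIGHTS = [1.5, 0.8, -0.6, 0.9, 1.2, 1.0]
BIAS = -2.0


def ambiguity_probability(features: list[int]) -> float:
    """Logistic stand-in for the claimed statistical classifier."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def report(dataset: list[ArtistStats]) -> dict[str, float]:
    """Map each artist identifier to its estimated ambiguity probability."""
    return {s.artist_id: ambiguity_probability(feature_vector(s)) for s in dataset}


if __name__ == "__main__":
    dataset = [
        ArtistStats("Nirvana", 3, 5, 6, 2, date(1967, 1, 1), date(2010, 1, 1)),
        ArtistStats("Some Unique Band", 1, 1, 1, 1, date(2005, 1, 1), date(2009, 1, 1)),
    ]
    for artist_id, p in report(dataset).items():
        print(f"{artist_id}: {p:.3f}")
```

Thresholding each raw count into a binary feature mirrors the "whether ... exceeds a predetermined ... threshold" wording of the claim; a real system could equally feed the raw counts to the classifier, which the claim as quoted here does not address.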

  • 3 Assignments