Visual comparison of documents using latent semantic differences

  • US 10,360,302 B2
  • Filed: 09/15/2017
  • Issued: 07/23/2019
  • Est. Priority Date: 09/15/2017
  • Status: Active Grant
First Claim

1. A method for comparing documents using latent semantic differences, the method comprising:

    receiving a plurality of documents from a user;

    extracting a plurality of linguistic units associated with the received plurality of documents,
      wherein a canonical unit for each linguistic unit from the extracted plurality of linguistic units is determined,
      wherein that at least one variation of each linguistic unit from the extracted plurality of linguistic units are present is determined,
      wherein a number of variations of each linguistic unit from the extracted plurality of linguistic units by utilizing a dictionary is determined,
      wherein the determined number of variations of each linguistic unit is tracked by utilizing a set of tables,
      wherein at least one start position and at least one end position with each linguistic unit from the extracted plurality of linguistic units is stored,
      wherein each linguistic unit from the extracted plurality of linguistic units includes a plurality of words in a contiguous sequence;

    building a plurality of latent semantic dimensions based on the extracted plurality of linguistic units;

    weighting the extracted plurality of linguistic units utilizing the built plurality of latent semantic dimensions;

    determining a plurality of latent semantic differences between the received plurality of documents based on weighted plurality of linguistic units;

    mapping the weighted plurality of linguistic units to a scaled visual feature; and

    generating a visualization to the user of the received plurality of documents based on the determined plurality of latent semantic differences and the scaled visual feature,
      wherein a plurality of mark-ups is added to the mapped plurality of linguistic units,
      wherein at least one value associated with at least one dimension of the mapped plurality of linguistic units from each of the received plurality of documents is correlated with at least one value associated with a hue, a saturation and a lightness based on the determined plurality of latent semantic differences associated with the at least one dimension of the mapped plurality of linguistic units,
      wherein the at least one value associated with the hue, saturation and lightness is translated into a hexadecimal code.
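
As a rough illustration of two steps recited in the claim, the Python sketch below extracts contiguous multi-word units together with their start and end positions, and translates a hue, saturation and lightness triple into a hexadecimal code. It is not the patented implementation. The function names (extract_units, hsl_to_hex, weight_to_hex), the word-based tokenization, and the rule that a weight in [0, 1] drives saturation at a fixed lightness are all assumptions made for this example; the claim does not specify them. The latent semantic weighting itself is not modeled here.

```python
import colorsys
import re


def extract_units(text, n=2):
    """Return (unit, start, end) for every contiguous run of n words.

    Hypothetical helper: start and end are character offsets into `text`.
    """
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in re.finditer(r"\w+", text)]
    units = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        unit = " ".join(word for word, _, _ in window)
        units.append((unit, window[0][1], window[-1][2]))
    return units


def hsl_to_hex(hue_deg, saturation, lightness):
    """Translate hue (degrees), saturation and lightness (0..1) to '#rrggbb'."""
    # colorsys takes its arguments in H, L, S order, each scaled to 0..1.
    r, g, b = colorsys.hls_to_rgb((hue_deg % 360) / 360.0, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


def weight_to_hex(weight, hue_deg=0.0):
    """Assumed mapping: clamp weight to 0..1 and use it as the saturation."""
    clamped = min(max(weight, 0.0), 1.0)
    return hsl_to_hex(hue_deg, clamped, 0.5)


units = extract_units("the quick brown fox", n=2)
# units[0] is ("the quick", 0, 9)
full_red = weight_to_hex(1.0, hue_deg=0.0)
# full_red is "#ff0000"
```

In a mark-up step, the stored offsets would let a renderer wrap each unit in the source text with the color computed for its weight.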
