METHODS AND SYSTEMS FOR MAPPING DATA ITEMS TO SPARSE DISTRIBUTED REPRESENTATIONS

  • US 20190332619A1
  • Filed: 07/12/2019
  • Published: 10/31/2019
  • Est. Priority Date: 08/07/2014
  • Status: Active Application
First Claim

1. A computer-implemented method for identifying a level of similarity between a user-provided data item and a data item within a set of data documents, the method comprising:

    clustering, by a reference map generator executing on a first computing device, in a two-dimensional metric space, a set of data documents selected according to at least one criterion, generating a semantic map;

    associating, by the semantic map, a coordinate pair with each of the set of data documents;

    generating, by a parser executing on the first computing device, an enumeration of terms occurring in the set of data documents;

    determining, by a representation generator executing on the first computing device, for each term in the enumeration, occurrence information including:

    (i) a number of data documents in which the term occurs, (ii) a number of occurrences of the term in each data document, and (iii) the coordinate pair associated with each data document in which the term occurs;

    generating, by the representation generator, for each term in the enumeration, a sparse distributed representation (SDR) using the occurrence information;

    storing, in an SDR database, each of the generated SDRs;

    receiving, by a filtering module executing on a second computing device, from a third computing device, a filtering criterion;

    generating, by the representation generator, for the filtering criterion, at least one SDR;

    receiving, by the filtering module, a plurality of streamed documents from a data source;

    generating, by the representation generator, for a first of the plurality of streamed documents, a compound SDR for a first of the plurality of streamed documents;

    determining, by a similarity engine executing on the second computing device, a distance between the filtering criterion SDR and the generated compound SDR for the first of the plurality of streamed documents; and

    acting, by the filtering module, on the first streamed document, based upon the determined distance.
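
The steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions the claim does not specify: the semantic map is taken as already computed (the `coords` argument), an SDR is modeled as a set of active cell indices on a small grid, the active cells are chosen by keeping the highest-weighted ones up to a hypothetical `sparsity` limit, the compound SDR is built by summing term SDRs and re-sparsifying, the distance is Jaccard distance, and the "action" is a simple forward/drop decision against a hypothetical `threshold`. The sketch uses only items (ii) and (iii) of the occurrence information; the document count (i) is not used.

```python
from collections import Counter, defaultdict

GRID = 4  # hypothetical side length of the two-dimensional semantic map


def _top_cells(weights, grid, sparsity):
    """Keep the highest-weighted cells, up to the sparsity limit."""
    max_bits = max(1, int(grid * grid * sparsity))
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return frozenset(cell for cell, _ in ranked[:max_bits])


def term_sdrs(docs, coords, grid=GRID, sparsity=0.25):
    """Build one SDR per term.

    docs:   {doc_id: text}
    coords: {doc_id: (x, y)} as assigned by the (assumed) semantic map
    """
    weights = defaultdict(Counter)
    for doc_id, text in docs.items():
        x, y = coords[doc_id]
        # Occurrences of each term in this document, credited to the
        # map cell at the document's coordinate pair.
        for term, count in Counter(text.lower().split()).items():
            weights[term][y * grid + x] += count
    return {t: _top_cells(w, grid, sparsity) for t, w in weights.items()}


def compound_sdr(text, sdrs, grid=GRID, sparsity=0.25):
    """Combine the SDRs of the known terms in a text into one SDR."""
    votes = Counter()
    for term in text.lower().split():
        votes.update(sdrs.get(term, ()))  # unknown terms contribute nothing
    return _top_cells(votes, grid, sparsity) if votes else frozenset()


def distance(a, b):
    """Jaccard distance between two SDRs (0.0 = identical, 1.0 = disjoint)."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def act(dist, threshold=0.5):
    """Forward a streamed document if it is close enough to the criterion."""
    return "forward" if dist <= threshold else "drop"


# Toy reference corpus with hand-assigned map coordinates (assumed data).
docs = {"d1": "cat dog", "d2": "cat fish", "d3": "stock bond"}
coords = {"d1": (0, 0), "d2": (1, 0), "d3": (3, 3)}
sdr_db = term_sdrs(docs, coords)

criterion = compound_sdr("cat", sdr_db)
stream = ["dog fish", "stock bond"]
decisions = [act(distance(criterion, compound_sdr(doc, sdr_db))) for doc in stream]
```

In this toy setup, terms that occur in documents placed near each other on the map share active cells, so a streamed document can match the filtering criterion without containing the criterion's terms verbatim.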
