Methods and systems for mapping data items to sparse distributed representations

  • US 10,394,851 B2
  • Filed: 08/03/2015
  • Issued: 08/27/2019
  • Est. Priority Date: 08/07/2014
  • Status: Active Grant
First Claim

1. A method performed by at least one computer processor of each of a plurality of computing devices executing computer program instructions stored on at least one non-transitory computer-readable medium, wherein the computer program instructions are executable by the at least one computer processor to perform a method for enhancing a computing networking including a full-text search system through enhancement of queries based upon determining similarities between data items mapped to sparse distributed representations, the method comprising:

  • clustering in a two-dimensional metric space, by a reference map generator, executing on a first computing device, a set of data documents selected according to at least one criterion, generating a semantic map;

    associating, by the semantic map, a coordinate pair with each of the set of data documents;

    generating, by a parser executing on the first computing device, an enumeration of data items occurring in the set of data documents;

    determining, by a representation generator executing on the first computing device, for each data item in the enumeration, occurrence information including: (i) a number of data documents in which the data item occurs, (ii) a number of occurrences of the data item in each data document, and (iii) the coordinate pair associated with each data document in which the data item occurs;

    generating, by the representation generator, a distributed representation using the occurrence information;

    receiving, by a sparsifying module executing on the first computing device, an identification of a maximum level of sparsity;

    reducing, by the sparsifying module, a total number of set bits within the distributed representation based on the maximum level of sparsity to generate a sparse distributed representation (SDR) having a normative fillgrade;

    generating, by the representation generator and the sparsifying module, at least one SDR for each data item in the enumeration of data items occurring in the set of data documents;

    storing, in an SDR database, each of the generated SDRs;

    receiving, by a query expansion module executing on a second computing device, from a third computing device, a first term;

    determining, by a similarity engine executing on a fourth computing device, a level of semantic similarity between a first SDR generated based on the first term and a second SDR of a second term, the second SDR retrieved from the SDR database;

    transmitting, by the query expansion module, to a full-text search system, using the first term and the second term, a query for an identification of each of a subset of a second set of documents containing at least one term similar to at least one of the first term and the second term; and

    transmitting, by the query expansion module, to the third computing device, the identification received from the full-text search system of each of the subset of the second set of documents containing at least one term similar to at least one of the first term and the second term.
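
The representation, sparsifying, similarity, and query expansion steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the semantic map has already assigned each document an integer coordinate pair on a small grid, that an SDR is stored as the set of its set-bit positions, that sparsification keeps the positions with the highest occurrence counts, and that similarity is measured as bit overlap. The claim fixes none of these choices, and the names `build_sdr`, `overlap_similarity`, `expand_query`, `grid_width`, `max_set_bits`, and `threshold` are hypothetical.

```python
from collections import defaultdict


def build_sdr(term, documents, coords, grid_width, max_set_bits):
    """Return the set-bit positions of a sparse representation for `term`.

    documents: dict of doc_id -> list of tokens (hypothetical input format)
    coords:    dict of doc_id -> (x, y) position on the semantic map
    """
    counts = defaultdict(int)
    for doc_id, tokens in documents.items():
        occurrences = tokens.count(term)   # occurrences in this document
        if occurrences:
            x, y = coords[doc_id]          # coordinate pair of the document
            counts[y * grid_width + x] += occurrences
    # Assumed sparsification rule: keep the positions with the highest
    # counts, breaking ties by position so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return frozenset(pos for pos, _ in ranked[:max_set_bits])


def overlap_similarity(sdr_a, sdr_b):
    """Fraction of shared set bits, relative to the smaller SDR (assumed metric)."""
    if not sdr_a or not sdr_b:
        return 0.0
    return len(sdr_a & sdr_b) / min(len(sdr_a), len(sdr_b))


def expand_query(first_term, sdr_db, threshold=0.5):
    """Return the first term plus every stored term whose SDR is similar enough."""
    first_sdr = sdr_db.get(first_term)
    if first_sdr is None:
        return [first_term]                # unknown term: nothing to expand
    similar = sorted(
        term
        for term, sdr in sdr_db.items()
        if term != first_term and overlap_similarity(first_sdr, sdr) >= threshold
    )
    return [first_term] + similar


# Toy corpus: four documents placed on a 2 x 2 semantic map.
documents = {
    "d1": ["cat", "kitten", "cat"],
    "d2": ["cat", "kitten"],
    "d3": ["car", "engine"],
    "d4": ["kitten"],
}
coords = {"d1": (0, 0), "d2": (1, 0), "d3": (0, 1), "d4": (1, 1)}
terms = {token for tokens in documents.values() for token in tokens}
sdr_db = {t: build_sdr(t, documents, coords, grid_width=2, max_set_bits=2) for t in terms}
expanded = expand_query("cat", sdr_db)
```

In this sketch the expanded term list would be passed to a full-text search system as a single query. How that query is constructed and transmitted is left out, since it depends on the search system in use.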
