Associative memory
First Claim
1. A computer-implemented method of simulating an associative memory capable of retrieving a document similar to an inputted query document from a set of stored documents, the method comprising:
coding all documents of the set through a plurality of feature vectors, each feature vector belonging to a particular document or part of the particular document of the set and comprising a series of vector elements which relate to certain features present or absent in the particular document or part of the particular document, the feature vectors each comprising a series of bits as vector elements, wherein each feature of the certain features is associated to one particular bit of the series of bits, the particular bit being set to a logical one if the associated feature is present in a respective document or part of the respective document, so that the series of bits codes for the presence or absence of certain features in the respective document or part of the respective document of the set;
arranging the feature vectors in a matrix, so that the rows of the matrix each are associated to the particular document or part of the particular document of the set and correspond to a respective feature vector of the plurality of feature vectors, the matrix being stored column-wise to provide that respective bits of each respective matrix column, which respectively are set or not set to a logical one, are simultaneously processed in one processor operation by a processor used for obtaining the result column;
generating a query feature vector based on the query document and according to the rules used for generating the feature vectors corresponding to the set of stored documents such that the query vector corresponds in its length to the length of a row of the matrix, the query feature vector consisting of the series of bits as vector elements, wherein each feature of the certain features is associated to one particular bit of the series of bits, the particular bit being set to a logical one if the associated feature is present in the query document, so that the series of bits codes for the presence or absence of certain features in the query document;
obtaining, on basis of the matrix and the query feature vector, a result column coding for a similarity measure to indicate a similarity between the query document and all documents or parts of documents represented in the matrix, wherein in order to obtain the result column a logical operation is performed on those columns of the matrix for which the respective bit of the query vector is set to a logical one, so a result vector is obtained which codes the similarity measure between the query document and the documents or parts of the documents of the set, the logical operations comprising a logical operation performed bitwise between the bits of those columns of the matrix;
retrieving one or more stored documents based on the obtained similarity measure;
wherein the method is characterized by:
treating the documents of the set as hierarchically structured sets of units, with the respective document of the set consisting of several units, in such a way that the feature vectors, which are arranged in the matrix, are unit feature vectors, which are obtained by coding a respective unit of the respective document of the set each through a corresponding unit feature vector comprising a series of bits which respectively are set to a logical one if the respective associated feature of the certain features is present in the respective unit to code for the presence or absence of the certain features in the respective unit, so that the rows of the matrix each are associated to the respective unit of the respective document of the set and correspond to the respective unit feature vector coded therefor;
obtaining scores from the result vector which represent as a similarity measure a frequency of occurrence of logical ones in the rows of the matrix at the positions indicated by the positions of logical ones in the query feature vector;
adding for the documents of the set the scores which are obtained for the different units of the respective document to obtain a document score of each document of the set; and
selecting for retrieval and retrieving a best document which has the highest document score and/or for which the document score fulfills a score threshold condition.
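The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the use of arbitrary-size Python integers as bit columns, the bit-sliced counter used to derive the scores, and the sample data are all assumptions made for illustration. Each matrix column is held as one integer whose bit r belongs to unit (row) r, so a single bitwise operation touches all rows of a column at once.

```python
from collections import defaultdict

def build_columns(unit_features, n_features):
    """Store the unit/feature bit matrix column-wise: columns[f] is an int
    whose bit r is 1 if unit (row) r contains feature f."""
    columns = [0] * n_features
    for row, features in enumerate(unit_features):
        for f in features:
            columns[f] |= 1 << row
    return columns

def unit_scores(columns, query_features, n_rows):
    """For every row, count how many query features it contains.
    planes[k] holds bit k of each row's counter, so adding a column is a
    ripple-carry addition built only from bitwise XOR and AND."""
    planes = []
    for f in query_features:          # only columns whose query bit is one
        carry = columns[f]
        for k in range(len(planes)):
            planes[k], carry = planes[k] ^ carry, planes[k] & carry
            if not carry:
                break
        if carry:
            planes.append(carry)
    # Read the per-row counters back out of the bit planes.
    return [sum(((p >> r) & 1) << k for k, p in enumerate(planes))
            for r in range(n_rows)]

def document_scores(scores, unit_to_doc):
    """Add the unit scores of each document to obtain its document score."""
    totals = defaultdict(int)
    for row, score in enumerate(scores):
        totals[unit_to_doc[row]] += score
    return dict(totals)

# Hypothetical data: two documents, each split into two units.
unit_features = [{0, 1}, {2}, {1, 2, 3}, {1, 4}]   # features present per unit
unit_to_doc = ["A", "A", "B", "B"]                 # row -> owning document
query_features = [1, 2]                            # set bits of the query vector

columns = build_columns(unit_features, n_features=5)
scores = unit_scores(columns, query_features, n_rows=len(unit_features))
totals = document_scores(scores, unit_to_doc)
best = max(totals, key=totals.get)                 # highest document score
```

In this sketch a score threshold condition could be applied by checking `totals[best]` against a chosen minimum before retrieving the document.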
Abstract
A computer-implemented method of realizing an associative memory capable of storing a set of documents and retrieving one or more stored documents similar to an inputted query document, said method comprising: coding each document or a part of it through a corresponding feature vector consisting of a series of bits which respectively code for the presence or absence of certain features in said document; arranging the feature vectors in a matrix; generating a query feature vector based on the query document and according to the rules used for generating the feature vectors corresponding to the stored documents such that the query vector corresponds in its length to the width of the matrix; storing the matrix column-wise; for those columns of the matrix where the query vector indicates the presence of a feature, bitwise performing one or more of preferably hardware supported logical operations between the columns of the matrix to obtain one or more additional result columns coding for a similarity measure between the query and parts or the whole of the stored documents; and said method further comprising one or a combination of the following: retrieval of one or more stored documents based on the obtained similarity measure; and/or storing a representation of a document through its feature vector into the above matrix.
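As a rough illustration of the coding and storing steps in the abstract, the following Python sketch encodes text into a bit vector, writes it into a column-wise matrix, and combines the query-selected columns with a bitwise AND to obtain one result column. The abstract does not say what the features are; treating features as words from a fixed vocabulary, the names `encode` and `store`, and the sample texts are assumptions for illustration only.

```python
def encode(text, vocabulary):
    """Return a feature vector as an int: bit i is 1 if vocabulary[i]
    occurs as a word in the text (word features are an assumption)."""
    words = set(text.lower().split())
    vector = 0
    for i, feature in enumerate(vocabulary):
        if feature in words:
            vector |= 1 << i
    return vector

def store(columns, row, vector):
    """Write one feature vector into row `row` of the column-wise matrix."""
    for i in range(len(columns)):
        if (vector >> i) & 1:
            columns[i] |= 1 << row

vocabulary = ["invoice", "total", "contract", "signature"]  # hypothetical features
columns = [0] * len(vocabulary)       # matrix width equals the number of features
store(columns, 0, encode("Invoice with total amount", vocabulary))
store(columns, 1, encode("Contract needs signature", vocabulary))

# The query is coded by the same rules, so its length matches the matrix width.
query = encode("total of the invoice", vocabulary)

# One bitwise AND per selected column yields a result column whose set bits
# mark the rows that contain every feature present in the query.
n_rows = 2
result = (1 << n_rows) - 1
for i in range(len(columns)):
    if (query >> i) & 1:
        result &= columns[i]
```

A logical AND gives an all-or-nothing match; other logical operations over the same columns would be needed to code a graded similarity measure.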
Specification