Method and system for unified information representation and applications thereof
Abstract
Method, system, and programs for information search and retrieval. A query is received and processed to generate a feature-based vector that characterizes the query. A unified representation that integrates semantic and feature-based characterizations of the query is then created based on the feature-based vector. Information relevant to the query is then retrieved from an information archive based on the unified representation of the query. A query response is generated based on the retrieved information and is then transmitted to respond to the query.
71 Citations
31 Claims
1. A method, implemented on a machine having at least one processor, storage, and a communication platform connected to a network for archiving a document, comprising the steps of:
receiving a document via the communication platform;
analyzing, by a feature extractor, the received document in accordance with at least one model to form a feature-based vector characterizing the document;
generating, by a semantic extractor, a semantic-based representation of the document based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
constructing, by a reconstruction unit, a reconstructed feature-based vector based on the semantic-based representation of the document, by mapping the semantic-based representation to a feature space of the feature-based vector;
comparing, by a discrepancy analyzer, the feature-based vector with the reconstructed feature-based vector to identify a difference between the feature-based vector and the reconstructed feature-based vector;
forming a residual feature-based representation of the document based on the difference between the feature-based vector and the reconstructed feature-based vector;
generating, by a unified representation construction unit, a unified representation for the document based on the semantic-based representation and the residual feature-based representation; and
archiving the document in an information archive based on the unified representation of the document.
Dependent claims: 2, 3, 4, 5, 6, 7
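The steps of claim 1 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the claim leaves the model and the semantic extractor open, so the sketch assumes term counts as the feature-based vector, a fixed orthonormal linear projection (the hypothetical `BASIS`) as the semantic extractor, and an in-memory dict as the information archive. All function and variable names are invented for the example.

```python
from typing import Dict, List, Tuple

Vector = List[float]

R = 0.5 ** 0.5
# Hypothetical semantic basis: 2 "topics" over a 4-term feature space.
# Rows are orthonormal, so projecting and mapping back is well defined.
BASIS: List[Vector] = [
    [R, R, 0.0, 0.0],
    [0.0, 0.0, R, R],
]


def semantic_extract(x: Vector, basis: List[Vector]) -> Vector:
    """Project the feature vector onto the basis (reduced dimension)."""
    return [sum(b_i * x_i for b_i, x_i in zip(row, x)) for row in basis]


def reconstruct(s: Vector, basis: List[Vector]) -> Vector:
    """Map the semantic representation back into the feature space."""
    dim = len(basis[0])
    return [sum(s_j * row[i] for s_j, row in zip(s, basis)) for i in range(dim)]


def residual(x: Vector, x_hat: Vector) -> Vector:
    """Difference between the original and reconstructed feature vectors."""
    return [a - b for a, b in zip(x, x_hat)]


def unified_representation(x: Vector, basis: List[Vector]) -> Tuple[Vector, Vector]:
    """Pair the semantic part with the residual feature part."""
    s = semantic_extract(x, basis)
    return s, residual(x, reconstruct(s, basis))


def archive_document(archive: Dict[str, Tuple[Vector, Vector]],
                     doc_id: str, x: Vector, basis: List[Vector]) -> None:
    """Store the document under its unified representation."""
    archive[doc_id] = unified_representation(x, basis)


archive: Dict[str, Tuple[Vector, Vector]] = {}
archive_document(archive, "doc-1", [2.0, 0.0, 1.0, 1.0], BASIS)
semantic, resid = archive["doc-1"]
print(semantic, resid)
```

In this toy case the two-dimensional semantic part cannot tell whether term 0 or term 1 occurred; the residual part retains that distinction, which is why the unified representation carries both.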
8. A method, implemented on a machine having at least one processor, storage, and a communication platform connected to a network for archiving a document, comprising the steps of:
receiving a document via the communication platform;
analyzing, by a feature extractor, the received document in accordance with at least one model to form a feature-based vector characterizing the document;
generating, by a semantic extractor, a semantic-based representation of the document based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
constructing, by a reconstruction unit, a reconstructed feature-based vector based on the semantic-based representation of the document, by mapping the semantic-based representation to a feature space of the feature-based vector;
forming a blurred feature-based representation of the document based on a difference between the feature-based vector and the reconstructed feature-based vector;
generating, by a unified representation construction unit, a unified representation for the document based on the blurred feature-based representation and the semantic-based representation; and
archiving the document in an information archive based on the unified representation of the document.
Dependent claims: 9, 10, 11
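Claim 8 does not say how the blurred representation is computed, only that it is based on the difference between the two vectors. One possible reading, assumed here purely for illustration, is to add a fraction `alpha` of that difference back onto the reconstructed vector, which interpolates between the reconstruction (`alpha = 0`) and the original feature vector (`alpha = 1`). The `blur` function and the `alpha` parameter are hypothetical names, not terms from the claims.

```python
from typing import List

Vector = List[float]


def blur(x: Vector, x_hat: Vector, alpha: float) -> Vector:
    """Assumed blurring rule: x_hat + alpha * (x - x_hat).

    alpha = 0 returns the reconstruction, alpha = 1 the original vector.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [h + alpha * (a - h) for a, h in zip(x, x_hat)]


original = [2.0, 0.0, 1.0, 1.0]        # feature-based vector
reconstructed = [1.0, 1.0, 1.0, 1.0]   # mapped back from the semantic space
blurred = blur(original, reconstructed, alpha=0.5)
print(blurred)
```

Under this assumption, a single parameter controls how much document-specific detail survives next to the smoothed, semantic view of the document.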
12. A method, implemented on a machine having at least one processor, storage, and a communication platform connected to a network for search and retrieval of information archived based on a unified representation, comprising the steps of:
obtaining a query via the communication platform;
processing, by a query processor, the query to generate a feature-based vector characterizing the query;
generating, by a semantic extractor, a semantic-based representation of the query based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
constructing, by a reconstruction unit, a reconstructed feature-based vector based on the semantic-based representation of the query, by mapping the semantic-based representation to a feature space of the feature-based vector;
comparing, by a discrepancy analyzer, the feature-based vector with the reconstructed feature-based vector to identify a difference between the feature-based vector and the reconstructed feature-based vector;
forming a residual feature-based representation of the query based on the difference between the feature-based vector and the reconstructed feature-based vector;
generating, by a unified representation construction unit, a unified representation of the query based on the semantic-based representation and the residual feature-based representation;
retrieving, by a candidate search unit, information relevant to the query from an information archive based on the unified representation of the query;
generating, by a query response generator, a query response based on the information relevant to the query retrieved from the information archive; and
transmitting the query response to respond to the query.
Dependent claims: 13, 14
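The retrieval steps of claim 12 can be sketched as follows. The claim does not specify how relevance is measured, so this example assumes that the archive stores each document as a (semantic, residual) pair and that relevance is a weighted sum of the dot products of the two parts. The `score` function, the `weight` parameter, and the toy archive values are all hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Unified = Tuple[Vector, Vector]  # (semantic part, residual feature part)


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def score(query: Unified, doc: Unified, weight: float = 0.5) -> float:
    """Assumed relevance: weighted sum of semantic and residual dot products."""
    return weight * dot(query[0], doc[0]) + (1.0 - weight) * dot(query[1], doc[1])


def retrieve(query: Unified, archive: Dict[str, Unified], top_k: int = 2) -> List[str]:
    """Candidate search: rank archived documents by score, best first."""
    ranked = sorted(archive, key=lambda doc_id: score(query, archive[doc_id]),
                    reverse=True)
    return ranked[:top_k]


# Toy archive of unified representations (values are illustrative only).
archive: Dict[str, Unified] = {
    "doc-a": ([1.0, 0.0], [1.0, -1.0, 0.0, 0.0]),
    "doc-b": ([1.0, 0.0], [0.0, 0.0, 0.0, 0.0]),
    "doc-c": ([0.0, 1.0], [0.0, 0.0, 0.5, -0.5]),
}
query: Unified = ([1.0, 0.0], [1.0, -1.0, 0.0, 0.0])
response = retrieve(query, archive)  # ranked ids stand in for the query response
print(response)
```

Here `doc-a` and `doc-b` have identical semantic parts, so only the residual part separates them; a purely semantic index would rank the two as a tie.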
15. A system having at least one processor, storage, and a communication platform for generating a unified representation for a document, comprising:
a communication platform through which a document can be received;
a feature extractor configured for analyzing the received document in accordance with at least one model to form a feature-based vector characterizing the document;
a semantic extractor configured for generating a semantic-based representation of the document based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
a reconstruction unit configured for producing a reconstructed feature-based vector based on the semantic-based representation of the document by mapping the semantic-based representation to a feature space of the feature-based vector;
a residual feature identifier configured for forming a residual feature-based representation of the document based on the difference between the feature-based vector and the reconstructed feature-based vector; and
a unified representation construction unit configured for generating a unified representation for the document based on the semantic-based representation and the residual feature-based representation.
Dependent claims: 16, 17, 18
19. A system having at least one processor, storage, and a communication platform for search and retrieval of information archived based on a unified representation, comprising:
a communication platform for obtaining a query and transmitting a query response;
a query processor configured for processing the query to generate a feature-based vector characterizing the query;
a semantic extractor configured for generating a semantic-based representation of the query based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
a reconstruction unit configured to construct a reconstructed feature-based vector based on the semantic-based representation of the query by mapping the semantic-based representation to a feature space of the feature-based vector;
a residual feature identifier configured for forming a residual feature-based representation of the query based on the difference between the feature-based vector and the reconstructed feature-based vector;
a query representation generator configured for generating a unified representation for the query based on the semantic-based representation and the residual feature-based representation, wherein the unified representation integrates semantic and residual feature based characterizations of the query;
a candidate search unit configured for retrieving information relevant to the query from an information archive based on the unified representation for the query; and
a query response generator configured for generating the query response based on the information relevant to the query retrieved from the information archive and transmitting the query response to respond to the query.
20. A system having at least one processor, storage, and a communication platform for search and retrieval of information archived based on a unified representation, comprising:
a communication platform for obtaining a query and transmitting a query response;
a query processor configured for processing the query to generate a feature-based vector characterizing the query;
a semantic extractor configured for generating a semantic-based representation of the query based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
a reconstruction unit configured to construct a reconstructed feature-based vector based on the semantic-based representation of the query by mapping the semantic-based representation to a feature space of the feature-based vector;
a feature vector blurring unit configured for generating a blurred feature-based representation of the query based on a difference between the feature-based vector and the reconstructed feature-based vector;
a query representation generator configured for generating a unified representation for the query based on the semantic-based representation and the blurred feature-based representation;
a candidate search unit configured for retrieving information relevant to the query from an information archive based on the unified representation for the query; and
a query response generator configured for generating the query response based on the information relevant to the query retrieved from the information archive and transmitting the query response to respond to the query.
21. A machine-readable non-transitory medium having information recorded thereon related to document archiving, the information, when read by the machine, causes the machine to perform the following:
receiving a document via a communication platform;
analyzing the received document in accordance with at least one model to form a feature-based vector characterizing the document;
generating a semantic-based representation of the document based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
constructing a reconstructed feature-based vector based on the semantic-based representation of the document, by mapping the semantic-based representation to a feature space of the feature-based vector;
comparing the feature-based vector with the reconstructed feature-based vector to identify a difference between the feature-based vector and the reconstructed feature-based vector;
forming a residual feature-based representation of the document based on the difference between the feature-based vector and the reconstructed feature-based vector;
generating a unified representation for the document based on the semantic-based representation and the residual feature-based representation; and
archiving the document in an information archive based on the unified representation of the document.
Dependent claims: 22, 23, 24
25. A machine-readable non-transitory medium having information recorded thereon for document archiving, the information, when read by the machine, causes the machine to perform the following:
receiving a document via a communication platform;
analyzing the received document in accordance with at least one model to form a feature-based vector characterizing the document;
generating a semantic-based representation of the document based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
constructing a reconstructed feature-based vector based on the semantic-based representation of the document, by mapping the semantic-based representation to a feature space of the feature-based vector;
forming a blurred feature-based representation of the document based on a difference between the feature-based vector and the reconstructed feature-based vector;
generating a unified representation for the document based on the blurred feature-based representation and the semantic-based representation; and
archiving the document in an information archive based on the unified representation of the document.
Dependent claims: 26, 27, 28
29. A machine-readable non-transitory medium having information recorded thereon for information search and retrieval, when read by the machine, causes the machine to perform the following:
obtaining a query via a communication platform;
processing the query to generate a feature-based vector characterizing the query;
generating a semantic-based representation of the query based on the feature-based vector, wherein the semantic-based representation has a reduced dimension;
constructing a reconstructed feature-based vector based on the semantic-based representation of the query, by mapping the semantic-based representation to a feature space of the feature-based vector;
comparing the feature-based vector with the reconstructed feature-based vector to identify a difference between the feature-based vector and the reconstructed feature-based vector;
forming a residual feature-based representation of the query based on the difference between the feature-based vector and the reconstructed feature-based vector;
generating a unified representation of the query based on the semantic-based representation and the residual feature-based representation, wherein the unified representation integrates semantic and residual feature based characterizations of the query;
retrieving information relevant to the query from an information archive based on the unified representation of the query;
generating a query response based on the information relevant to the query retrieved from the information archive; and
transmitting the query response to respond to the query.
Dependent claims: 30, 31
Specification