Intelligent query system for automatically indexing information in a database and automatically categorizing users
Abstract
An Intelligent Query Engine (IQE) system automatically develops multiple information spaces in which different types of real-world objects (e.g., documents, users, products) can be represented. Machine learning techniques are used to facilitate automated emergence of information spaces in which objects are represented as vectors of real numbers. The system then delivers information to users based upon similarity measures applied to the representation of the objects in these information spaces. The system simultaneously classifies documents, users, products, and other objects. Documents are managed by collators that act as classifiers of overlapping portions of the database of documents. Collators evolve to meet the demands for information delivery expressed by user feedback. Liaisons act on behalf of users to elicit information from the population of collators. This information is then presented to users upon logging into the system via the Internet or another communication channel. Mites handle incoming documents from multiple information sources (e.g., in-house editorial staff, third-party news feeds, large databases, World Wide Web spiders) and feed documents to those collators that provide a good fit for the new documents.
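The abstract describes objects as vectors of real numbers and delivery based on similarity measures, without naming a specific measure. The sketch below assumes cosine similarity as one plausible choice; the function names, the three-dimensional space, and the sample vectors are all hypothetical and are not taken from the patent.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length real vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, objects):
    """Return (object_id, score) pairs sorted from most to least similar."""
    scored = [(oid, cosine_similarity(query_vec, vec)) for oid, vec in objects.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical information space holding two documents and one user vector.
documents = {"doc_a": [1.0, 0.0, 0.0], "doc_b": [0.0, 1.0, 0.0]}
user_vec = [0.9, 0.1, 0.0]
ranking = rank_by_similarity(user_vec, documents)
```

Because documents, users, and products would all live in the same kind of space, the same ranking function could in principle compare a user against documents or against other users.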
48 Claims
1. An evolutionary system for identifying information, comprising:
multiple information sets each representing a portion of the information;
multiple collators each independently deriving vector spaces from associated information sets and identifying concepts in the vector spaces; and
the multiple collators independently identifying information in the associated information sets according to the identified concepts in the vector spaces and competing against each other to identify relevant information in response to information queries.
Dependent claims: 2-14
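One way to picture collators "competing against each other" is to let each collator score a query against its own concepts and pick the highest bid. This is only an illustrative reading of claim 1: the `Collator` class, the `bid` method, the use of cosine similarity, and the sample concept vectors are assumptions made for the sketch.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class Collator:
    """Hypothetical collator holding the concept vectors of its own space."""

    def __init__(self, name, concepts):
        self.name = name
        self.concepts = concepts  # list of concept (centroid) vectors

    def bid(self, query_vec):
        """Score a query by its best match among this collator's concepts."""
        return max(cosine(query_vec, c) for c in self.concepts)

def compete(collators, query_vec):
    """Collect one bid per collator and return the winner with all bids."""
    bids = {c.name: c.bid(query_vec) for c in collators}
    winner = max(bids, key=bids.get)
    return winner, bids

sports = Collator("sports", [[1.0, 0.0, 0.0]])
finance = Collator("finance", [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
winner, bids = compete([sports, finance], [0.1, 0.9, 0.0])
```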
15. A method for identifying relevant information in an information source, comprising:
converting different sets of information into different vector spaces;
converting the vector spaces into associated centroid spaces that identify central concepts for the sets of information that comprise the vector spaces;
independently identifying in each of the different centroid spaces the information clustered around the identified central concepts; and
controlling genetic evolution for each of the vector spaces according to the similarity of the identified information to the central concepts.
Dependent claims: 16-29
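The conversion from a vector space to a centroid space, and the grouping of information around central concepts, can be sketched with a plain centroid-refinement loop in the style of k-means. The claim does not say which clustering procedure is used, so this choice, the seed centroids, and every function name here are assumptions. The genetic-evolution step of the claim is not modeled.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centroids):
    """Index of the centroid closest to vec (first one wins on ties)."""
    return min(range(len(centroids)), key=lambda i: squared_distance(vec, centroids[i]))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def centroid_space(vectors, centroids, iterations=10):
    """Refine seed centroids by repeated assign-then-average passes."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for vec in vectors:
            clusters[nearest(vec, centroids)].append(vec)
        # An empty cluster keeps its previous centroid.
        centroids = [mean_vector(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def clustered_around(vectors, centroids):
    """Map each centroid index to the indices of the vectors nearest to it."""
    groups = {i: [] for i in range(len(centroids))}
    for idx, vec in enumerate(vectors):
        groups[nearest(vec, centroids)].append(idx)
    return groups

# Hypothetical two-dimensional document vectors and seed centroids.
doc_vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
seeds = [[0.0, 0.0], [10.0, 10.0]]
concepts = centroid_space(doc_vectors, seeds)
groups = clustered_around(doc_vectors, concepts)
```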
30. A genetic system for information retrieval and information categorization, comprising:
a corpus of information;
a multidimensional vector space derived from the corpus of information, the vector space comprising a set of axes that locate contextual relationships in the corpus of information;
a centroid space that locates central concepts in the vector space; and
a collator that automatically controls evolution of the vector space over time according to the relevancy of the central concepts to information queries.
Dependent claims: 31-38
39. A method for processing queries in an information retrieval system, comprising:
initiating selectable query modes;
generating a query according to the query modes selected;
identifying the concepts in the information set most similar to the query;
identifying the information in the information set most closely clustered around the identified concepts;
generating a goodness score indicating how closely the query relates to the identified concepts;
combining the identified information and the goodness score into a recommendations list;
wherein one of the query modes comprises a knowledge-based query including:
retrieving user profile data;
creating an expert recommendations list from the profile data containing facts relevant to the user weighted by confidence levels;
broadcasting an identifier for each fact separately to the collator;
recalling stored topic vectors representing the fact identifiers in the collator; and
identifying information in the collator similar to each of the topic vectors.
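The steps shared by claims 39 to 42 (find the most similar concept, find the information clustered around it, compute a goodness score, and combine both into a recommendations list) can be sketched as follows. The claims do not define how the goodness score is computed; this sketch assumes it is the cosine similarity between the query and the chosen concept. The `recommend` function, the dictionary layout, and the sample data are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(query_vec, concepts, items, top_n=3):
    """Build a recommendations list for one collator.

    concepts: {concept_id: centroid_vector}
    items:    {item_id: (concept_id, item_vector)}
    """
    # Concept most similar to the query.
    best = max(concepts, key=lambda cid: cosine(query_vec, concepts[cid]))
    # Assumed goodness score: similarity of the query to that concept.
    goodness = cosine(query_vec, concepts[best])
    # Items of that concept, ordered by closeness to its centroid.
    members = [(iid, cosine(vec, concepts[best]))
               for iid, (cid, vec) in items.items() if cid == best]
    members.sort(key=lambda pair: pair[1], reverse=True)
    return {"concept": best,
            "goodness": goodness,
            "items": [iid for iid, _ in members[:top_n]]}

concepts = {"sports": [1.0, 0.0], "finance": [0.0, 1.0]}
items = {
    "a": ("sports", [0.9, 0.1]),
    "b": ("sports", [0.6, 0.4]),
    "c": ("finance", [0.1, 0.9]),
}
result = recommend([1.0, 0.2], concepts, items)
```

The different query modes in claims 39 to 42 would differ mainly in how `query_vec` is produced (from profile facts, a feedback event table, or other users), which this sketch leaves to the caller.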
40. A method for processing queries in an information retrieval system, comprising:
initiating selectable query modes;
generating a query according to the query modes selected;
identifying the concepts in the information set most similar to the query;
identifying the information in the information set most closely clustered around the identified concepts;
generating a goodness score indicating how closely the query relates to the identified concepts;
combining the identified information and the goodness score into a recommendations list;
wherein one of the query modes comprises a user query including:
identifying a feedback event table for a user;
broadcasting the identified feedback event table to the collator;
recalling a feedback event table vector in the collator for the identified feedback event table; and
identifying information in the collator similar to the feedback event vector.
41. A method for processing queries in an information retrieval system, comprising:
initiating selectable query modes;
generating a query according to the query modes selected;
identifying the concepts in the information set most similar to the query;
identifying the information in the information set most closely clustered around the identified concepts;
generating a goodness score indicating how closely the query relates to the identified concepts;
combining the identified information and the goodness score into a recommendations list;
wherein one of the query modes comprises a type 1 social query including:
generating a feedback event table rating information according to its relevancy to previous queries;
mapping the information in the feedback event table into the collator;
identifying a feedback event table vector in the collator according to the mapped set of information and the rating associated with the information;
locating in the collator other similar feedback event table vectors representing reading interests of other users;
generating a goodness score for the collator indicating how closely the feedback event table vector for the user relates to the central concepts of the collator; and
generating a recommendations list for the user listing the feedback event tables for the most similar other users and the goodness score.
42. A method for processing queries in an information retrieval system, comprising:
initiating selectable query modes;
generating a query according to the query modes selected;
identifying the concepts in the information set most similar to the query;
identifying the information in the information set most closely clustered around the identified concepts;
generating a goodness score indicating how closely the query relates to the identified concepts;
combining the identified information and the goodness score into a recommendations list;
wherein one of the query modes comprises a type 2 social query as follows:
using a knowledge-based system to look up facts about the user;
creating an expert recommendations list containing facts relevant to the user weighted by confidence levels;
identifying a key set of facts in the expert recommendations list;
locating other users according to similarity of the key facts and the confidence levels of the similar key facts; and
returning a recommendations list by the knowledge-based system of the identified similar users.
43. A method for categorizing users in an information retrieval system, comprising:
mapping reading histories for multiple users into multiple vector spaces;
identifying central concepts in the vector spaces;
mapping a reading history for a target user into the multiple vector spaces;
identifying which central concepts are most relevant to the reading history of the target user;
generating a recommendations list identifying the users most closely clustered; and
wherein mapping reading histories of multiple users includes:
maintaining a feedback event table identifying information supplied to the users during previous queries;
ranking the information in the feedback event table according to the relevance of the information to the previous queries;
mapping the information into the vector spaces;
generating a feedback event table vector that is located in the vector spaces according to the mapped information and the ratings associated with the mapped information;
locating similar feedback event table vectors in the vector spaces for other users; and
generating a recommendations list identifying the similar users.
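A feedback event table vector that depends on "the mapped information and the ratings" can be illustrated as a rating-weighted average of document vectors, followed by a similarity search over other users. The claim does not specify the weighting scheme, so the weighted average, the assumption of non-negative ratings, the cosine measure, and all names and data below are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def feedback_vector(feedback_events, doc_vectors):
    """Rating-weighted average of the vectors of the documents a user rated.

    feedback_events: list of (doc_id, rating) with non-negative ratings.
    """
    total = sum(rating for _, rating in feedback_events)
    if total <= 0:
        raise ValueError("feedback table needs at least one positive rating")
    dim = len(next(iter(doc_vectors.values())))
    acc = [0.0] * dim
    for doc_id, rating in feedback_events:
        for i, x in enumerate(doc_vectors[doc_id]):
            acc[i] += rating * x
    return [x / total for x in acc]

def similar_users(target, user_vectors, top_n=2):
    """Other users ranked by similarity of their feedback vectors to target's."""
    scored = [(user, cosine(user_vectors[target], vec))
              for user, vec in user_vectors.items() if user != target]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

doc_vectors = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
tables = {
    "alice": [("d1", 3), ("d2", 1)],
    "bob": [("d1", 1), ("d2", 3)],
    "carol": [("d1", 4)],
}
user_vectors = {u: feedback_vector(t, doc_vectors) for u, t in tables.items()}
recommendations = similar_users("alice", user_vectors)
```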
44. A method for adapting a semantic space comprising:
generating the semantic space from a resident set of information;
continuously checking for new information that becomes available in the information source;
computing a goodness value that characterizes the closeness of the new information to concepts in the semantic space for the resident set of information; and
automatically adding the new information to the resident set of information when the goodness value meets a given threshold.
Dependent claims: 45-47
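The threshold test in claim 44 can be sketched as a simple admission filter. Here the goodness value is assumed to be the best cosine similarity between the new item and any concept of the semantic space, and "continuously checking" is modeled as one pass over a batch of incoming items. The threshold of 0.8, the function names, and the sample vectors are illustrative assumptions only.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def goodness(vec, concepts):
    """Assumed goodness value: closeness to the nearest concept."""
    return max(cosine(vec, c) for c in concepts)

def admit_new(resident, concepts, incoming, threshold=0.8):
    """Add incoming items whose goodness meets the threshold.

    resident: {item_id: vector}, updated in place.
    incoming: iterable of (item_id, vector) pairs.
    Returns the ids that were admitted, in arrival order.
    """
    admitted = []
    for item_id, vec in incoming:
        if goodness(vec, concepts) >= threshold:
            resident[item_id] = vec
            admitted.append(item_id)
    return admitted

resident = {"r1": [1.0, 0.0]}
concepts = [[1.0, 0.0]]
incoming = [("n1", [0.9, 0.1]), ("n2", [0.0, 1.0])]
admitted = admit_new(resident, concepts, incoming)
```

In this sketch the concepts stay fixed after admission; whether and when the semantic space is regenerated from the enlarged resident set is not covered here.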
48. A system for classifying information, comprising:
a knowledge-based system that includes facts and sets of rules over the facts, the knowledge-based system inferring facts from initial information, assigning confidence levels for each of the inferred facts, and identifying key facts according to the assigned confidence levels;
an artificial neural network that converts a corpus of information into a multidimensional vector space having a set of axes that locate contextual relationships in the corpus of information, the neural network receiving a key fact from the knowledge-based system, mapping the key fact into the vector space, and identifying information in the vector space similar to the key fact; and
an information processor for representing, storing, and incrementally improving the representations of facts from the knowledge-based system within the vector space of the neural network.
Specification