Intelligent query system for automatically indexing in a database and automatically categorizing users
Abstract
An Intelligent Query Engine (IQE) system automatically develops multiple information spaces in which different types of real-world objects (e.g., documents, users, products) can be represented. Machine learning techniques are used to facilitate automated emergence of information spaces in which objects are represented as vectors of real numbers. The system then delivers information to users based upon similarity measures applied to the representations of the objects in these information spaces. The system simultaneously classifies documents, users, products, and other objects. Documents are managed by collators that act as classifiers of overlapping portions of the database of documents. Collators evolve to meet the demands for information delivery expressed by user feedback. Liaisons act on behalf of users to elicit information from the population of collators. This information is then presented to users when they log into the system via the Internet or another communication channel. Mites handle incoming documents from multiple information sources (e.g., in-house editorial staff, third-party news feeds, large databases, World Wide Web spiders) and feed documents to those collators that provide a good fit for the new documents.
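The routing step performed by mites can be illustrated with a minimal sketch. It assumes that each collator is summarized by a single centroid vector and that "good fit" means cosine similarity above a threshold; the function name `route_document`, the threshold value, and the sample data are hypothetical and are not taken from the patent.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def route_document(doc_vec, collator_centroids, threshold=0.5):
    """Return the names of all collators whose centroid fits the document.

    Assumption: a collator is a "good fit" when cosine similarity >= threshold.
    More than one collator may accept the same document, which mirrors the
    overlapping coverage described in the abstract.
    """
    return [name for name, centroid in collator_centroids.items()
            if cosine(doc_vec, centroid) >= threshold]

# Hypothetical two-dimensional information space.
collators = {"finance": [1.0, 0.0], "sports": [0.0, 1.0], "business": [0.8, 0.2]}
print(route_document([0.9, 0.1], collators))  # ['finance', 'business']
```

Returning a list rather than a single best match is a deliberate choice here, since the abstract describes collators as classifiers of overlapping portions of the document database.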
27 Claims
-
1. A method for categorizing information in an information source, comprising:
-
converting information into different vector spaces;
identifying central concepts in the vector spaces;
identifying in each of the different vector spaces the information clustered around the identified central concepts; and
displaying to a user through a graphical user interface the information according to the identified central concepts in the different vector spaces.
-
2. A method according to claim 1 wherein categorizing the information includes:
-
converting the information into information vectors;
displaying distribution of the information vectors in the vector spaces;
selecting centroid vectors representing the densest neighborhoods of information vectors; and
displaying the information having information vectors closest to the selected centroid vectors.
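The centroid-selection steps above can be sketched as follows. This is only an illustration under stated assumptions: a "neighborhood" is taken to be every vector within a cosine-similarity threshold of a seed vector, and the centroid is the mean of the most crowded neighborhood. The names `densest_centroid`, `closest_to`, and the `radius` parameter are hypothetical; the claims do not prescribe a particular density measure.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def densest_centroid(vectors, radius=0.9):
    """Return (centroid, member_indices) for the most crowded neighborhood.

    Assumption: the neighborhood of a seed is every vector whose cosine
    similarity to the seed is at least `radius`. `vectors` must be non-empty.
    """
    best = []
    for seed in vectors:
        members = [i for i, v in enumerate(vectors) if cosine(seed, v) >= radius]
        if len(members) > len(best):
            best = members
    dim = len(vectors[0])
    centroid = [sum(vectors[i][d] for i in best) / len(best) for d in range(dim)]
    return centroid, best

def closest_to(centroid, vectors, k=2):
    """Indices of the k vectors most similar to the centroid, best first."""
    order = sorted(range(len(vectors)),
                   key=lambda i: cosine(centroid, vectors[i]), reverse=True)
    return order[:k]

# Hypothetical information vectors: three near the first axis, one outlier.
docs = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
centroid, members = densest_centroid(docs)
print(members)                        # [0, 1, 2]
print(closest_to(centroid, docs, 1))  # [2]
```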
-
-
3. A method according to claim 1 wherein categorizing the information includes:
-
generating topics for a query;
casting the topics in terms of text descriptions;
converting the text descriptions into an artificial centroid vector;
projecting the artificial centroid vector into the vector spaces; and
displaying the information most closely related to the artificial centroid vector.
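A minimal sketch of the artificial-centroid idea follows. It assumes that each vector space is defined by its own vocabulary and that text is converted with a plain bag-of-words count; both are simplifications chosen for illustration, and the names `text_to_vector`, `rank_by_topic`, and the sample vocabularies are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def text_to_vector(text, vocabulary):
    """Bag-of-words count vector over a fixed vocabulary (an assumed encoding)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]

def rank_by_topic(topic_text, documents, spaces):
    """Project a topic description into each space and rank documents there.

    `spaces` maps a space name to its vocabulary. Documents with zero
    similarity to the artificial centroid are left out of that space's ranking.
    """
    results = {}
    for name, vocabulary in spaces.items():
        centroid = text_to_vector(topic_text, vocabulary)  # artificial centroid
        scored = [(cosine(centroid, text_to_vector(d, vocabulary)), d)
                  for d in documents]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        results[name] = [d for _, d in scored]
    return results

spaces = {"finance": ["stock", "market"], "sport": ["football", "final"]}
docs = ["stock market rally", "football final tonight", "market outlook"]
print(rank_by_topic("stock market", docs, spaces))
# {'finance': ['stock market rally', 'market outlook'], 'sport': []}
```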
-
-
4. A method according to claim 3 whereby a predefined set of words is used to generate the topics.
-
5. A method according to claim 1 including displaying to the user how closely the displayed information matches the central concepts.
-
6. A method according to claim 1 including automatically adapting the central concepts to the interests of the user by having the vector spaces compete against each other for supplying the most relevant information to the user.
-
7. A method according to claim 1 including generating offspring from the vector spaces that are successful over time in identifying information of most interest to the user.
-
8. A method according to claim 1 including:
-
receiving information queries from the user;
mapping the information queries into the different vector spaces;
identifying which central concepts in the vector spaces map closest to the information queries;
identifying the information closest to the identified concepts; and
supplying the identified information and the closest identified concepts to the user.
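The query steps above can be sketched in a few lines. The sketch assumes that each vector space carries its own vocabulary, a set of labeled concept vectors, and a set of item vectors, and that a query is mapped with a bag-of-words count. The dictionary layout and the names `embed` and `answer_query` are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def embed(text, vocabulary):
    """Map query text into one space as a bag-of-words vector (assumed encoding)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]

def answer_query(query, spaces):
    """For each space, return (space, closest concept, item closest to that concept)."""
    results = []
    for name, space in spaces.items():
        q = embed(query, space["vocab"])
        label, concept_vec = max(space["concepts"].items(),
                                 key=lambda kv: cosine(q, kv[1]))
        item = max(space["items"].items(),
                   key=lambda kv: cosine(concept_vec, kv[1]))[0]
        results.append((name, label, item))
    return results

# Hypothetical spaces with hand-made concept and item vectors.
spaces = {
    "news": {
        "vocab": ["rates", "bank", "goal"],
        "concepts": {"finance": [1.0, 1.0, 0.0], "sport": [0.0, 0.0, 1.0]},
        "items": {"Central bank holds rates": [1.0, 1.0, 0.0],
                  "Late goal wins cup": [0.0, 0.0, 1.0]},
    },
    "products": {
        "vocab": ["bank", "card"],
        "concepts": {"accounts": [1.0, 0.0], "cards": [0.0, 1.0]},
        "items": {"Savings account": [1.0, 0.0], "Travel card": [0.0, 1.0]},
    },
}
for row in answer_query("bank rates rise", spaces):
    print(row)
```

Returning the concept label together with the item reflects the last step of the claim, which supplies both the identified information and the closest concepts to the user.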
-
-
9. A method according to claim 1 including:
-
rating the displayed information;
mapping the rated information into each vector space;
identifying new information in each vector space similar to the mapped rated information; and
displaying the identified new information to the user.
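One way to picture the rating feedback loop is a Rocchio-style profile, sketched below for a single vector space. This is a swapped-in, well-known relevance-feedback technique used for illustration, not necessarily the mechanism the patent implements. Ratings are assumed to lie in the range -1 to 1, and the name `recommend_from_ratings` is hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend_from_ratings(ratings, items, k=1):
    """Rank unrated items by similarity to a rating-weighted profile vector.

    ratings: {item_id: rating}, assumed to lie in [-1, 1]
    items:   {item_id: vector} for one vector space
    """
    dim = len(next(iter(items.values())))
    profile = [0.0] * dim
    for item_id, rating in ratings.items():
        for d, x in enumerate(items[item_id]):
            profile[d] += rating * x  # liked items pull, disliked items push
    unseen = [item_id for item_id in items if item_id not in ratings]
    unseen.sort(key=lambda i: cosine(profile, items[i]), reverse=True)
    return unseen[:k]

items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1], "d": [0.1, 0.9]}
print(recommend_from_ratings({"a": 1.0, "b": -1.0}, items))  # ['c']
```

To cover several vector spaces, the same function could be called once per space with that space's item vectors.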
-
-
10. A method according to claim 1 including:
-
retrieving user profile data;
generating a list of facts from the profile data relevant to the user;
mapping the list of facts into the vector spaces;
identifying information in each of the vector spaces similar to the list of facts; and
displaying the identified information to the user.
-
-
11. A method according to claim 1 including:
-
creating a list containing facts associated with the user; and
mapping those facts into the vector spaces to locate other users having similar facts.
-
-
12. A method according to claim 11 including:
-
selecting the most similar other users;
identifying information closest to central concepts in the vector spaces of the selected other users; and
displaying the identified information to the user.
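The user-matching steps of claims 11 and 12 can be sketched with fact sets. The sketch treats each user's facts as a binary vector, for which cosine similarity reduces to the overlap count divided by the geometric mean of the set sizes. As a loudly labeled simplification, it then recommends items from the peers' reading lists rather than items closest to central concepts in the peers' vector spaces. All names and data are hypothetical.

```python
import math

def fact_similarity(a, b):
    """Cosine similarity of two fact sets viewed as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def similar_users(target, facts_by_user, n=1):
    """Names of the n users whose facts best match the target user's facts."""
    others = [u for u in facts_by_user if u != target]
    others.sort(key=lambda u: fact_similarity(facts_by_user[target],
                                              facts_by_user[u]),
                reverse=True)
    return others[:n]

def recommend_from_peers(target, facts_by_user, reading_by_user, n=1):
    """Items read by the most similar users that the target has not yet seen.

    Simplification: reading lists stand in for "information closest to
    central concepts in the vector spaces of the selected other users".
    """
    seen = set(reading_by_user.get(target, []))
    picks = []
    for peer in similar_users(target, facts_by_user, n):
        for item in reading_by_user.get(peer, []):
            if item not in seen:
                seen.add(item)
                picks.append(item)
    return picks

facts = {
    "ann": {"engineer", "seattle", "cycling"},
    "bob": {"engineer", "seattle", "golf"},
    "cy": {"chef", "paris"},
}
reading = {"ann": ["d1"], "bob": ["d1", "d2"], "cy": ["d3"]}
print(similar_users("ann", facts))                  # ['bob']
print(recommend_from_peers("ann", facts, reading))  # ['d2']
```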
-
-
13. A system for information retrieval and categorization, comprising:
-
an information space;
a vector space locating contextual relationships in the information space;
a centroid space categorizing the vector space into central concepts;
a collator that automatically adapts the central concepts to the reading interests of a user by controlling evolution of the vector space over time according to the relevancy of the central concepts to information queries; and
a liaison that retrieves and displays the information according to the central concepts.
Dependent claims: 14, 15, 16
-
-
17. A search engine for identifying information responsive to user queries, the search engine comprising:
-
an initial stage where an information space is formed and a vector space is generated that identifies central concepts in the information space;
a query phase where the central concepts most relevant to the user queries are identified;
a display phase where the information most closely tied to the identified central concepts is displayed to the user; and
an evolutionary phase where portions of the vector space most pertinent to the user queries reproduce while other portions of the vector space least similar to the central concepts are discarded.
Dependent claims: 18
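The evolutionary phase can be illustrated with a generic select-and-mutate step. This is a textbook evolutionary-algorithm sketch, not the patent's specific procedure: the fittest vectors survive unchanged, the rest are replaced by noisy copies of the survivors. The name `evolve`, the `survivors` and `noise` parameters, and the dot-product fitness function are assumptions.

```python
import random

def evolve(population, fitness, survivors=2, noise=0.05, rng=None):
    """One generation: keep the fittest vectors, replace the rest with offspring.

    population: list of vectors (lists of floats)
    fitness:    function mapping a vector to a score; higher is better
    Offspring are copies of surviving parents with Gaussian noise added.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:survivors]
    children = []
    while len(parents) + len(children) < len(population):
        parent = parents[len(children) % len(parents)]
        children.append([x + rng.gauss(0.0, noise) for x in parent])
    return parents + children

# Hypothetical fitness: dot product with a vector summarizing recent user queries.
query = [1.0, 0.0]
def relevance(vec):
    return sum(a * b for a, b in zip(vec, query))

population = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
next_gen = evolve(population, relevance)
print(next_gen[:2])  # [[1.0, 0.0], [0.5, 0.5]]
```

The population size is held constant, so every discarded vector is replaced by exactly one offspring.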
-
-
19. A method for categorizing users in an information retrieval system, comprising:
-
mapping reading histories for multiple users into vector spaces;
identifying central concepts in the vector spaces;
mapping a reading history for a target user into the vector spaces;
identifying the central concepts most relevant to the reading history of the target user; and
displaying information to the target user most closely clustered around the identified central concepts.
Dependent claims: 20
-
-
21. A method for categorizing information in an information source, comprising:
-
converting information into different vector spaces;
identifying central concepts in the vector spaces;
identifying in each of the different vector spaces the information clustered around the identified central concepts;
converting the information into information vectors;
displaying distribution of the information vectors in the vector spaces;
selecting centroid vectors representing the densest neighborhoods of information vectors;
displaying to a user through a graphical user interface the information according to the identified central concepts in the different vector spaces; and
displaying to the user through the graphical user interface the information having information vectors closest to the selected centroid vectors.
-
-
22. A method for categorizing information in an information source, comprising:
-
converting information into different vector spaces;
identifying central concepts in the vector spaces;
identifying in each of the different vector spaces the information clustered around the identified central concepts;
generating topics for a query;
casting the topics in terms of text descriptions;
converting the text descriptions into an artificial centroid vector;
projecting the artificial centroid vector into the vector spaces;
displaying to a user through a graphical user interface the information according to the identified central concepts in the different vector spaces; and
displaying to a user through a graphical user interface the information most closely related to the artificial centroid vector.
-
-
23. A method for categorizing information in an information source, comprising:
-
converting information into different vector spaces;
identifying central concepts in the vector spaces;
identifying in each of the different vector spaces the information clustered around the identified central concepts;
converting the information into information vectors;
identifying centroid vectors representing the densest neighborhoods of information vectors;
displaying to a user through a graphical user interface the information according to the identified central concepts in the different vector spaces;
displaying to the user through the graphical user interface the information having information vectors most closely related to the centroid vectors;
generating topics for a query;
casting the topics in terms of text descriptions;
converting the text descriptions into an artificial centroid vector;
projecting the artificial centroid vector into the vector spaces; and
displaying the information most closely related to the artificial centroid vector.
-
-
24. A method for categorizing information in an information source, comprising:
-
converting information into different vector spaces;
identifying central concepts in the vector spaces;
identifying in each of the different vector spaces the information clustered around the identified central concepts;
converting the information into information vectors;
identifying centroid vectors representing the densest neighborhoods of information vectors;
displaying to a user through a graphical user interface the information according to the identified central concepts in the different vector spaces;
displaying to the user through the graphical user interface the information having information vectors most closely related to the centroid vectors;
identifying a profile for a first user;
locating other users having similar profiles;
identifying vector spaces associated with the other users; and
using the vector spaces of the located other users to identify information for the first user.
-
-
25. A system for information retrieval and categorization, comprising:
-
an information space;
a vector space locating contextual relationships in the information space;
a centroid space categorizing the vector space into central concepts;
the centroid space representing the densest neighborhoods of information space;
a collator that automatically adapts the central concepts to the reading interests of a user by controlling evolution of the vector space over time according to the relevancy of the central concepts to information queries;
a liaison that retrieves and displays the information according to the central concepts;
the liaison displaying the information having information space most closely related to the centroid space;
feedback data from the user for mapping into the vector space, the feedback data used to identify others having similar feedback data;
a recommendations list that merges together information related to the other users having most similar feedback data; and
a display for displaying the recommendations list to the user.
-
-
26. A system for information retrieval and categorization, comprising:
-
an information space;
a vector space locating contextual relationships in the information space;
a centroid space categorizing the vector space into central concepts;
the centroid space representing the densest neighborhoods of information space;
a collator that automatically adapts the central concepts to the reading interests of a user by controlling evolution of the vector space over time according to the relevancy of the central concepts to information queries;
a liaison that retrieves and displays the information according to the central concepts;
the liaison displaying the information having information space most closely related to the centroid space; and
the centroid space classifying the multiple users into groups having similar profile characteristics.
-
-
27. A method for categorizing users in an information retrieval system, comprising:
-
mapping reading histories for multiple users into vector spaces, wherein the mapping of reading histories of multiple users includes:
maintaining a feedback event table identifying information supplied to the multiple users during previous queries;
ranking the information in the feedback event table according to the relevance of the information to the previous queries;
mapping the ranked information into the vector spaces;
generating a feedback event table vector that is located in the vector spaces according to the mapped information and the rankings associated with the mapped information;
locating similar feedback event table vectors in the vector spaces for other users; and
identifying the information associated with the similar feedback event table vectors;
identifying central concepts in the vector spaces;
mapping a reading history for a target user into the vector spaces;
identifying the central concepts most relevant to the reading history of the target user;
displaying information to the target user most closely clustered around the identified central concepts; and
identifying centroid vectors representing the densest neighborhoods of vector spaces.
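The feedback event table vector described in this claim can be pictured as a rank-weighted mean of item vectors. The sketch below assumes that rank 1 is the most relevant item and that each item is weighted by the reciprocal of its rank; the claim does not specify a weighting scheme, so this choice, like the name `feedback_vector`, is hypothetical.

```python
def feedback_vector(events, item_vectors):
    """Collapse a feedback event table into one vector.

    events:       list of (item_id, rank) pairs, rank 1 = most relevant
    item_vectors: {item_id: vector} for one vector space
    Assumption: each item contributes with weight 1 / rank.
    """
    dim = len(next(iter(item_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for item_id, rank in events:
        weight = 1.0 / rank
        weight_sum += weight
        for d, x in enumerate(item_vectors[item_id]):
            total[d] += weight * x
    if weight_sum == 0.0:
        return total  # empty table: return the zero vector
    return [t / weight_sum for t in total]

item_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
vec = feedback_vector([("a", 1), ("b", 2)], item_vectors)
print([round(x, 3) for x in vec])  # [0.667, 0.333]
```

Vectors built this way for different users live in the same space, so any similarity measure over that space could then be used to locate users with similar feedback event table vectors.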
-
Specification