Methods and apparatus for finding semantic information, such as usage logs, similar to a query using a pattern lattice data space
First Claim
1. For use in a computing system in which information is represented as graphs, each graph having entity nodes which depict properties or attributes of the information and relationship links which denote relationships among the entity nodes, a method for finding information based on a query, the method comprising:
- a) receiving the query;
b) generating a subcone lattice from the query to a selected one of a set depth and a set number of lattice nodes, the subcone lattice having lattice nodes corresponding to graphs and lattice links between lattice nodes corresponding to a graph and a sub-graph of the graph;
c) ranking the lattice nodes generated in the subcone lattice that correspond to graphs matching the graphs representing the information and the lattice nodes generated in the subcone lattice that correspond to graphs which are subgraphs of the graphs representing the information; and
d) causing the computing system to output at least one query result based on the ranking.
3 Assignments
0 Petitions
Accused Products
Abstract
A pattern lattice data space as a framework for analyzing data, in which both schema-based and statistical analysis are accommodated, is defined. Ways to manage the size of the lattice structures in the pattern lattice data space are described. Utilities to classify or cluster, search (find similar data), or relate data using lattice fragments in the pattern lattice data space are also described. Superpattern cone or lattice generation function, which may be used by the classification and clustering functions, is also described. In addition, a subpattern cone or lattice generation process, which may be used by the search (find similar data) and data relating functions, is also described. Finally, a function to label, in readily understandable “pidgin”, categories which classify information, is also described.
-
Citations
48 Claims
-
1. For use in a computing system in which information is represented as graphs, each graph having entity nodes which depict properties or attributes of the information and relationship links which denote relationships among the entity nodes, a method for finding information based on a query, the method comprising:
-
a) receiving the query;
b) generating a subcone lattice from the query to a selected one of a set depth and a set number of lattice nodes, the subcone lattice having lattice nodes corresponding to graphs and lattice links between lattice nodes corresponding to a graph and a sub-graph of the graph;
c) ranking the lattice nodes generated in the subcone lattice that correspond to graphs matching the graphs representing the information and the lattice nodes generated in the subcone lattice that correspond to graphs which are subgraphs of the graphs representing the information; and
d) causing the computing system to output at least one query result based on the ranking. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
i) representing the query as a graph having entity nodes and relationship links; - and
ii) for each link which is an element of the graph representing the query, A) legalizing all connected components of the graph representing the query, having the link, and B) for each legal connected entity, determining whether or not a pattern lattice node, corresponding to the legal connected entity, is new and if so, adding the pattern lattice node to the subcone and generating a subcone from the added pattern lattice node.
-
-
3. The method of claim 2 wherein the step of determining whether or not a pattern lattice node, corresponding to the legal connected entity, is new includes:
-
1) hashing the legal connected entity;
2) hashing the graphs corresponding to existing lattice nodes of the subcone lattice; and
3) comparing the legal connected entity to the graphs corresponding to existing lattice nodes of the subcone lattice that have the same hash values.
-
-
4. The method of claim 1 wherein the step of generating a subcone lattice from the query includes
i) representing the query as a graph having entity nodes and relationship links; - and
ii) for each separation link which is an element of the graph representing the query, A) determining the largest component of the graph, representing the query, having the separation link, and B) determining whether or not a pattern lattice node, corresponding to the determined largest component, is new and if so, adding the pattern lattice node to the subcone and generating a subcone from the added pattern lattice node.
- and
-
5. The method of claim 4 wherein the step of determining whether or not a pattern lattice node, corresponding to the legal connected entity, is new includes:
-
1) hashing the legal connected entity;
2) hashing the graphs corresponding to existing lattice nodes of the subcone lattice; and
3) comparing the legal connected entity to the graphs corresponding to existing lattice nodes of the subcone lattice that have the same hash values.
-
-
6. The method of claim 1 wherein the step of generating a subcone lattice from the query includes
i) representing the query as a graph having entity nodes and relationship links; - and
ii) for each of the levels of the subgraph A) for each subgraph of the graph representing the query, determining whether or not the subgraph is a tree and 1) if so, removing an entity node from the subgraph and determining whether or not the resulting subgraph is legal, and 2) if not, removing a relationship link from the subgraph, and B) determining whether or not the resulting subgraph is connected and if so, 1) establishing a lattice link from a lattice node corresponding to the subgraph to a lattice node corresponding to the parent of the subgraph, and 2) inserting the lattice node corresponding to the subgraph into a next level of the lattice.
- and
-
7. The method of claim 1 wherein at least one atomic subgraph, which may not be broken into smaller subgraphs, is defined.
-
8. The method of claim 1 wherein at least one subgraph, which must be kept by any and all subgraphs generated, is defined.
-
9. The method of claim 1 wherein the step of ranking the information lattice nodes reached, includes
i) for each graph corresponding to information, A) finding a maximal common pattern in the data graph and the graph corresponding to the query, and B) ranking the data graph based on the maximal common pattern. -
10. The method of claim 1 wherein the step of ranking the information lattice nodes reached, includes determining a lattice distance between the lattice node corresponding to the graph representing the query and each of the information lattice nodes reached.
-
11. The method of claim 10 wherein the distances are adjusted based on statistics associated with the information.
-
12. The method of claim 1 wherein the information represented as graphs is user actions.
-
13. The method of claim 1 wherein the information represented as graphs is user actions related to maintaining an address book.
-
14. The method of claim 13 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of person, name, contact information, organization, job title, telephone number, room number, e-mail address, personal e-mail address, and alias e-mail address.
-
15. The method of claim 1 wherein the information represented as graphs is user actions related to processing order forms.
-
16. The method of claim 15 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of order, item, requester, approver, order ID and destination.
-
17. The method of claim 1 wherein the information represented as graphs is user actions related to maintaining a calendar.
-
18. The method of claim 17 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of name, room, meeting, appointment, time stamp, job title, organization, telephone number, and person.
-
19. The method of claim 1 wherein the information represented as graphs is user actions related to entertainment.
-
20. The method of claim 19 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of restaurant, restaurant rating, cuisine type, restaurant price rating, company, location, neighborhood, name, theater, movie, and movie rating.
-
21. The method of claim 1 wherein the information represented as graphs is user actions related to computer events.
-
22. The method of claim 21 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of event, deleted, opened, created, saved, sent, from agent, to agent, time stamp, document, entity, name, mime type, URL location, file location, access type, e-mail message, e-mail address, priority status, message ID, thread ID and description.
-
23. A machine-readable medium storing machine-executable instructions which, when executed by a machine, effect the method of claim 1.
-
24. In a computing system in which information is represented as graphs, each graph having entity nodes which depict properties or attributes of the information and relationship links which denote relationships among the entity nodes, a method for finding information based on a received query, the method comprising:
-
a) determining whether or not a graph corresponding to the query is relatively large;
b) if it is determined that the graph corresponding to the query is relatively large, then i) factoring the graph corresponding to the query, and ii) for each piece of the factored graph, A) generating a subcone lattice from the query to a selected one of a set depth and a set number of information lattice nodes, the subcone lattice having lattice nodes corresponding to graphs and lattice links between lattice nodes corresponding to a graph and a sub-graph of the graph, B) searching the generated subcone lattice for lattice nodes corresponding to information graphs to generate a list of information lattice nodes related to the graph corresponding to the query, C) sorting the list of information lattice nodes into categories, and D) ranking the information lattice nodes in each category;
c) ranking the information lattice nodes across all categories; and
d) causing the computing system to output at least one query result based on the ranking. - View Dependent Claims (25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46)
i) representing the query as a graph having entity nodes and relationship links; - and
ii) for each link which is an element of the graph representing the query, A) legalizing all connected components of the graph, representing the query, having the link, and B) for each legal connected entity, determining whether or not a pattern lattice node, corresponding to the legal connected entity, is new and if so, adding the pattern lattice node to the subcone and generating a subcone from the added pattern lattice node.
-
-
26. The method of claim 25 wherein the step of determining whether or not a pattern lattice node, corresponding to the legal connected entity, is new includes:
-
1) hashing the legal connected entity;
2) hashing the graphs corresponding to existing lattice nodes of the subcone lattice; and
3) comparing the legal connected entity to the graphs corresponding to existing lattice nodes of the subcone lattice that have the same hash values.
-
-
27. The method of claim 24 wherein the step of generating a subcone lattice from the query includes
i) representing the query as a graph having entity nodes and relationship links; - and
ii) for each separation link which is an element of the graph representing the query, A) determining the largest component of the graph, representing the query, having the separation link, and B) determining whether or not a pattern lattice node, corresponding to the determined largest component, is new and if so, adding the pattern lattice node to the subcone and generating a subcone from the added pattern lattice node.
- and
-
28. The method of claim 27 wherein the step of determining whether or not a pattern lattice node, corresponding to the legal connected entity, is new includes:
-
1) hashing the legal connected entity;
2) hashing the graphs corresponding to existing lattice nodes of the subcone lattice; and
3) comparing the legal connected entity to the graphs corresponding to existing lattice nodes of the subcone lattice that have the same hash values.
-
-
29. The method of claim 24 wherein the step of generating a subcone lattice from the query includes
i) representing the query as a graph having entity nodes and relationship links; - and
ii) for each of the levels of the subgraph A) for each subgraph of the graph representing the query, determining whether or not the subgraph is a tree and 1) if so, removing an entity node from the subgraph and determining whether or not the resulting subgraph is legal, and 2) if not, removing a relationship link from the subgraph, and B) determining whether or not the resulting subgraph is connected and if so, 1) establishing a lattice link from a lattice node corresponding to the subgraph to a lattice node corresponding to the parent of the subgraph, and 2) inserting the lattice node corresponding to the subgraph into a next level of the lattice.
- and
-
30. The method of claim 24 wherein at least one atomic subgraph, which may not be broken into smaller subgraphs, is defined.
-
31. The method of claim 24 wherein at least one subgraph, which must be kept by any and all subgraphs generated, is defined.
-
32. The method of claim 24 wherein the step of ranking the information lattice nodes reached, includes
i) for each graph corresponding to information, A) finding a maximal common pattern in the data graph and the graph corresponding to the query, and B) ranking the data graph based on the maximal common pattern. -
33. The method of claim 24 wherein the step of ranking the information lattice nodes reached, includes determining a lattice distance between the lattice node corresponding to the graph representing the query and each of the information lattice nodes reached.
-
34. The method of claim 33 wherein the distances are adjusted based on statistics associated with the information.
-
35. The method of claim 24 wherein the information represented as graphs is user actions.
-
36. The method of claim 24 wherein the information represented as graphs is user actions related to maintaining an address book.
-
37. The method of claim 36 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of person, name, contact information, organization, job title, telephone number, room number, e-mail address, personal e-mail address, and alias e-mail address.
-
38. The method of claim 24 wherein the information represented as graphs is user actions related to processing order forms.
-
39. The method of claim 38 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of order, item, requester, approver, order ID and destination.
-
40. The method of claim 24 wherein the information represented as graphs is user actions related to maintaining a calendar.
-
41. The method of claim 40 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of name, room, meeting, appointment, time stamp, job title, organization, telephone number, and person.
-
42. The method of claim 24 wherein the information represented as graphs is user actions related to entertainment.
-
43. The method of claim 42 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of restaurant, restaurant rating, cuisine type, restaurant price rating, company, location, neighborhood, name, theater, movie, and movie rating.
-
44. The method of claim 24 wherein the information represented as graphs is user actions related to computer events.
-
45. The method of claim 44 wherein the information represented as graphs includes a least one entity node selected from a group of entity nodes consisting of event, deleted, opened, created, saved, sent, from agent, to agent, time stamp, document, entity, name, mime type, URL location, file location, access type, e-mail message, e-mail address, priority status, message ID, thread ID and description.
-
46. A machine-readable medium storing machine-executable instructions which, when executed by a machine, effect the method of claim 24.
-
47. For use in a computing system in which information is represented as graphs, each graph having entity nodes which depict properties or attributes of the information and relationship links which denote relationships among the entity nodes, an apparatus for finding information based on a query, the apparatus comprising:
-
a) a lattice generator comprising logic for generating a subcone lattice from the query to a selected one of a set depth and a set number of information lattice nodes, the subcone lattice having lattice nodes corresponding to graphs and lattice links between lattice nodes corresponding to a graph and a sub-graph of the graph; and
b) a ranker comprising logic that is operatively coupled to the lattice generator and configured to rank the lattice nodes generated in the subcone lattice that correspond to graphs matching the graphs representing the information and the lattice nodes generated in the subcone lattice that correspond to graphs which are subgraphs of the graphs representing the information.
-
-
48. For use in a computing system in which information is representable as graphs, each graph having entity nodes which depict properties or attributes of the information and relationship links which denote relationships among the entity nodes, an apparatus for finding information based on a query, the apparatus comprising:
-
a) means for determining whether or not a graph corresponding to the query is relatively large;
b) means for, if it is determined that the graph corresponding to the query is relatively large, i) factoring the graph corresponding to the query, and ii) for each piece of the factored graph, A) generating a subcone lattice from the query to a selected one of a set depth and a set number of information lattice nodes, the subcone lattice having lattice nodes corresponding to graphs and lattice links between lattice nodes corresponding to a graph and a sub-graph of the graph, B) searching the generated subcone lattice for lattice nodes corresponding to information graphs to generate a list of information lattice nodes related to the graph corresponding to the query, C) sorting the list of information lattice nodes into categories, and D) ranking the information lattice nodes in each category;
c) means for ranking the information lattice nodes across all categories; and
d) means for storing at least one query result based on the ranking.
-
Specification