Information cataloging
First Claim
1. A computer implemented method comprising:
detecting a plurality of observations about an entity from a plurality of sources, by a computer processor, the plurality of observations including one or more of various elements relating to the entity, the one or more of various elements including at least one of;
name, e-mail address, physical address, phone number, age, gender, employer, and social networking account;
representing the one or more of various elements from each of the plurality of observations about the entity using one or more nodes of a directed graph stored on a non-transitory computer readable media;
representing relations between the one or more nodes using edges connecting the one or more nodes in the directed graph;
computing, by the computer processor, a distance between two disparate nodes of the one or more nodes, each of the two disparate nodes representing an element related to the entity, wherein a shorter computed distance is associated to a higher confidence value that each of the two disparate nodes represent the entity;
inferring, by the computer processor, a relationship between the two disparate nodes based on the computed distance between the two disparate nodes;
assigning a noise level to the one or more of the nodes, wherein the noise level assigned to a node increasing the distance of an edge associated with that node and the noise level assigned to a node increasing if that node has multiple dissimilar associations to different nodes;
calculating a signal to noise ratio (SNR) of an edge based on noise levels of the nodes attached to that edge;
determining validity of a relationship between the two disparate nodes if the SNR of the edge connecting the two disparate nodes is above a threshold;
automatically generating and outputting an identity for the entity, by the computer processor, based on the one or more of various elements and the relationship inferred between the two disparate nodes, the identity including a listing of one or more of the various elements relating to the entity and reflecting the relationship inferred between the two disparate nodes; and
cataloging the identity for the entity in an information cataloging system by storing the identity in a database, the information cataloging system configured to respond to a user query based on the cataloged identity.
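The distance and confidence steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not name a traversal algorithm, edge weights, or a confidence formula, so the Dijkstra search, the unit weight per co-occurrence, and the `1 / (1 + distance)` mapping below are assumptions chosen for illustration, as are all function names.

```python
import heapq
from collections import defaultdict


def build_graph(observations):
    """Turn observations into a directed graph of (type, value) nodes.

    Assumption: elements seen in the same observation are linked in both
    directions by edges of weight 1.0.
    """
    graph = defaultdict(dict)
    for obs in observations:
        nodes = list(obs.items())
        for a in nodes:
            for b in nodes:
                if a != b:
                    graph[a][b] = 1.0
    return graph


def distance(graph, source, target):
    """Shortest-path distance (Dijkstra); infinity if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")


def confidence(dist):
    """Hypothetical mapping: a shorter distance yields a higher confidence."""
    return 1.0 / (1.0 + dist)


observations = [
    {"name": "Ann Lee", "email": "ann@example.com"},    # e.g. a business card
    {"email": "ann@example.com", "phone": "555-0100"},  # e.g. a web page
]
graph = build_graph(observations)
d = distance(graph, ("name", "Ann Lee"), ("phone", "555-0100"))
print(d, confidence(d))  # the name reaches the phone number through the shared e-mail
```

In this toy data the name and the phone number never appear in the same observation, yet the shared e-mail address places them two edges apart, which is the kind of inferred relationship the claim describes.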
Abstract
An information cataloging system disclosed herein provides a system and method for inferring relationships between various elements, such as e-mail addresses and phone numbers, of various observations, such as business cards and observations obtained from the Internet. The method comprises representing various elements, such as a name or an e-mail address, using nodes; representing the relations between the various elements using edges connecting these nodes; and computing a distance between two disparate nodes, wherein each of the two disparate nodes represents an element related to the entity. An implementation of the information cataloging system also provides a method of calculating the noise and signal-to-noise ratio attached to various nodes and using that noise information to calculate the confidence level of relationships between various elements.
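The noise and signal-to-noise idea in the abstract can be sketched in Python as follows. The text does not give formulas, so everything numeric here is an assumption: noise is counted as the number of surplus same-type neighbors of a node, the signal is fixed at 1.0, the SNR is `signal / (1 + noise_a + noise_b)` to avoid division by zero, and the 0.5 threshold is arbitrary. Edges are treated as undirected for counting purposes, and all names are hypothetical.

```python
from collections import Counter


def neighbors(node, edges):
    """Return every node that shares an edge with `node`."""
    found = set()
    for a, b in edges:
        if a == node:
            found.add(b)
        elif b == node:
            found.add(a)
    return found


def noise_level(node, edges):
    """Assumed noise measure: surplus neighbors of the same element type.

    A phone number tied to three different names gets noise 2.
    """
    counts = Counter(kind for kind, _ in neighbors(node, edges))
    return sum(n - 1 for n in counts.values())


def snr(edge, edges, signal=1.0):
    """Assumed SNR: fixed signal divided by one plus the endpoint noise."""
    a, b = edge
    return signal / (1.0 + noise_level(a, edges) + noise_level(b, edges))


def is_valid(edge, edges, threshold=0.5):
    """A relationship counts as valid when its edge SNR exceeds the threshold."""
    return snr(edge, edges) > threshold


phone = ("phone", "555-0100")  # e.g. a shared office line
ann = ("name", "Ann Lee")
edges = [
    (ann, phone),
    (("name", "Bob Roy"), phone),
    (("name", "Cy Diaz"), phone),
    (ann, ("email", "ann@example.com")),
]
print(is_valid((ann, phone), edges))                         # noisy shared phone
print(is_valid((ann, ("email", "ann@example.com")), edges))  # clean one-to-one link
```

Under these assumptions the shared phone number accumulates noise from its three dissimilar name associations, so its link to any single name falls below the threshold, while the one-to-one name and e-mail link stays above it.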
24 Claims
1. A computer implemented method comprising:
detecting a plurality of observations about an entity from a plurality of sources, by a computer processor, the plurality of observations including one or more of various elements relating to the entity, the one or more of various elements including at least one of;
name, e-mail address, physical address, phone number, age, gender, employer, and social networking account;
representing the one or more of various elements from each of the plurality of observations about the entity using one or more nodes of a directed graph stored on a non-transitory computer readable media;
representing relations between the one or more nodes using edges connecting the one or more nodes in the directed graph;
computing, by the computer processor, a distance between two disparate nodes of the one or more nodes, each of the two disparate nodes representing an element related to the entity, wherein a shorter computed distance is associated to a higher confidence value that each of the two disparate nodes represent the entity;
inferring, by the computer processor, a relationship between the two disparate nodes based on the computed distance between the two disparate nodes;
assigning a noise level to the one or more of the nodes, wherein the noise level assigned to a node increasing the distance of an edge associated with that node and the noise level assigned to a node increasing if that node has multiple dissimilar associations to different nodes;
calculating a signal to noise ratio (SNR) of an edge based on noise levels of the nodes attached to that edge;
determining validity of a relationship between the two disparate nodes if the SNR of the edge connecting the two disparate nodes is above a threshold;
automatically generating and outputting an identity for the entity, by the computer processor, based on the one or more of various elements and the relationship inferred between the two disparate nodes, the identity including a listing of one or more of the various elements relating to the entity and reflecting the relationship inferred between the two disparate nodes; and
cataloging the identity for the entity in an information cataloging system by storing the identity in a database, the information cataloging system configured to respond to a user query based on the cataloged identity.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A non-transitory computer-readable storage medium embodied with computer-executable instructions for executing on one or more processors and circuits of a device a process comprising:
detecting a plurality of observations about an entity from a plurality of sources, the plurality of observations including one or more of various elements relating to the entity, the one or more of various elements including at least one of;
name, e-mail address, physical address, phone number, age, gender, employer, and social networking account;
representing the one or more of various elements from each of the plurality of observations about the entity using one or more nodes of a directed graph;
representing relations between the one or more nodes using edges connecting the one or more nodes in the directed graph;
computing a distance between two disparate nodes of the one or more nodes, each of the two disparate nodes representing an element related to the entity, wherein a shorter computed distance is associated to a higher confidence value that each of the two disparate nodes represent the entity;
inferring a relationship between the two disparate nodes based on the computed distance between the two disparate nodes;
assigning a noise level to each of the two disparate nodes, wherein the noise level assigned to a node increasing the distance of an edge associated with that node and the noise level assigned to a node increasing if that node has multiple dissimilar associations to different nodes;
calculating a signal to noise ratio (SNR) of an edge based on noise levels of the nodes attached to that edge;
determining validity of a relationship between the two disparate nodes if the SNR of the edge connecting the two disparate nodes is above a threshold;
automatically generating and outputting an identity for the entity based on the one or more of various elements and the relationship inferred between the two disparate nodes, the identity including a listing of one or more of the various elements relating to the entity and reflecting the relationship inferred between the two disparate nodes; and
cataloging the identity for the entity in an information cataloging system by storing the identity in a database, the information cataloging system configured to respond to a user query based on the cataloged identity.
Dependent claims: 17, 18, 19, 20, 21, 22, 23
24. A system, comprising:
a processor and a non-transitory computer readable memory, wherein the non-transitory computer readable memory comprises instructions for executing on the processor, the instructions including;
a graph generation module configured to represent one or more of various elements from each of a plurality of observations about an entity, the plurality of observations including one or more of various elements relating to the entity, using one or more nodes of a directed graph in the memory and to represent relations between the one or more nodes using edges stored in the memory, the edges connecting the one or more nodes in the directed graph, the one or more of various elements including at least one of;
name, e-mail address, physical address, phone number, age, gender, employer, and social networking account;
a graph traversal module configured to compute a distance between two disparate nodes of the one or more nodes, each of the two disparate nodes representing an element related to the entity and the distance between the two disparate nodes indicating a likelihood that the two disparate nodes belong to the entity;
an inference module configured to;
infer a relationship between the two disparate nodes based on the computed distance between the two disparate nodes;
assign a noise level to one of the two disparate nodes if the one of the two disparate nodes has multiple dissimilar associations to different nodes of the same type;
propagating the noise level to an edge connecting the two disparate nodes;
calculate a signal to noise ratio (SNR) of the edge connecting the two disparate nodes, the SNR being based on the noise level of the two disparate nodes;
determine a validity of the relationship between the two disparate nodes if the SNR of the edge connecting the two disparate nodes is above a threshold;
wherein the graph traversal module is further configured to automatically generate and output an identity for the entity based on the one or more of various elements and the relationship inferred between the two disparate nodes, the identity including a listing of one or more of the various elements relating to the entity and reflecting the relationship inferred between the two disparate nodes; and
a remote procedure call module configured to catalog the identity for the entity in an information cataloging system by storing the identity in a database, the information cataloging system configured to respond to a user query based on the cataloged identity.
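The last two elements of claim 24, generating an identity and cataloging it so that it can answer a user query, might look roughly like the Python sketch below. It is an illustration under stated assumptions: the claim recites a remote procedure call module, whereas this sketch writes directly to a local in-memory SQLite database; validated edges are followed in both directions; and the table name `identity_elements` and all function names are hypothetical.

```python
import sqlite3


def generate_identity(seed, valid_edges):
    """List every element reachable from `seed` through validated edges."""
    seen, stack = {seed}, [seed]
    while stack:
        node = stack.pop()
        for a, b in valid_edges:
            for here, there in ((a, b), (b, a)):
                if here == node and there not in seen:
                    seen.add(there)
                    stack.append(there)
    return sorted(seen)


def catalog(conn, identity):
    """Store one identity as rows of (identity_id, kind, value)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS identity_elements "
        "(identity_id INTEGER, kind TEXT, value TEXT)"
    )
    (next_id,) = conn.execute(
        "SELECT COALESCE(MAX(identity_id), 0) + 1 FROM identity_elements"
    ).fetchone()
    conn.executemany(
        "INSERT INTO identity_elements VALUES (?, ?, ?)",
        [(next_id, kind, value) for kind, value in identity],
    )
    conn.commit()
    return next_id


def lookup(conn, kind, value):
    """Answer a query: return the full identity that contains one element."""
    row = conn.execute(
        "SELECT identity_id FROM identity_elements WHERE kind = ? AND value = ?",
        (kind, value),
    ).fetchone()
    if row is None:
        return []
    return conn.execute(
        "SELECT kind, value FROM identity_elements "
        "WHERE identity_id = ? ORDER BY kind, value",
        (row[0],),
    ).fetchall()


# Edges assumed to have already passed the SNR validity check.
valid_edges = [
    (("name", "Ann Lee"), ("email", "ann@example.com")),
    (("email", "ann@example.com"), ("phone", "555-0100")),
]
conn = sqlite3.connect(":memory:")
catalog(conn, generate_identity(("name", "Ann Lee"), valid_edges))
print(lookup(conn, "phone", "555-0100"))
```

Storing one row per element, rather than one serialized record per identity, lets a query on any single element (here the phone number) retrieve the whole listing with plain SQL.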
Specification