Determining key concepts in documents based on a universal concept graph
First Claim
1. A method comprising:
accessing a universal concept graph that includes a first set of nodes that represent concept phrases derived from one or more internal documents associated with a social networking service (SNS) and one or more external documents that are external to the SNS, and a first set of edges that connect a plurality of nodes of the first set of nodes;
accessing a content object associated with the SNS;
generating, using one or more hardware processors, an induced concept graph associated with the content object based on analysis of the content object and the universal concept graph, the induced graph including a second set of nodes that represent one or more concept phrases derived from the content object and a second set of edges that connect a plurality of nodes of the second set of nodes;
identifying one or more key concept phrases in the content object based on applying one or more key concept selection algorithms to the induced concept graph, a first key concept selection algorithm of the one or more key concept selection algorithms iteratively removing leaf nodes from the induced concept graph associated with the content object until a desired number of nodes representing key concept phrases are left, a leaf node being connected solely to one other node; and
storing the one or more key concept phrases in a record of a database, the record referencing the content object.
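The leaf-removal step recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `prune_leaves`, the edge-list representation, the alphabetical tie-breaking, and the example phrases are all assumptions made for illustration. The claim only states that leaf nodes (nodes connected solely to one other node) are removed iteratively until a desired number of nodes is left.

```python
def prune_leaves(edges, target_count):
    """Iteratively remove leaf nodes until at most target_count nodes remain.

    edges: iterable of (phrase_a, phrase_b) pairs of an undirected induced graph.
    Returns the set of surviving node labels.
    """
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    while len(adj) > target_count:
        # A leaf is connected solely to one other node.
        # Sorting makes the removal order deterministic (an assumption).
        leaves = sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)
        if not leaves:
            break  # nothing left to prune, e.g. only cycles remain
        for leaf in leaves:
            if len(adj) <= target_count:
                break
            if len(adj[leaf]) != 1:
                continue  # its only neighbor was removed earlier in this pass
            (neighbor,) = adj[leaf]
            adj[neighbor].discard(leaf)
            del adj[leaf]
    return set(adj)


# Hypothetical induced concept graph for one content object.
example_edges = [
    ("machine learning", "deep learning"),
    ("machine learning", "data mining"),
    ("machine learning", "statistics"),
    ("data mining", "statistics"),
    ("deep learning", "neural networks"),
    ("neural networks", "backpropagation"),
]

key_phrases = prune_leaves(example_edges, 3)
```

In this example the chain "backpropagation", "neural networks", "deep learning" is peeled away one leaf at a time, leaving the three phrases that form the densely connected core of the graph.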
Abstract
A machine may be configured to determine key concepts in documents. For example, the machine accesses a universal concept graph that includes a first set of nodes that represent concept phrases derived from internal documents associated with a social networking service (SNS) and external documents that are external to the SNS, and a first set of edges that connect a plurality of nodes of the first set of nodes. The machine accesses a content object associated with the SNS. The machine generates an induced concept graph associated with the content object based on an analysis of the content object and the universal concept graph. The machine identifies one or more key concept phrases in the content object based on applying one or more key concept selection algorithms to the induced concept graph. The machine stores the one or more key concept phrases in a record of a database.
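One way to picture the induced concept graph is as the part of the universal concept graph whose concept phrases actually occur in the content object. The sketch below assumes this reading; the abstract does not say how phrases are matched, so the case-insensitive substring test, the function name `induce_concept_graph`, and the sample data are hypothetical.

```python
def induce_concept_graph(universal_edges, text):
    """Return (nodes, edges) of the subgraph induced by phrases found in text.

    universal_edges: iterable of (phrase_a, phrase_b) pairs from the universal graph.
    text: the content object's text.
    """
    lowered = text.lower()
    # Naive matching (an assumption): a phrase is present if it appears as a
    # case-insensitive substring of the content object.
    nodes = {
        phrase
        for edge in universal_edges
        for phrase in edge
        if phrase.lower() in lowered
    }
    # Keep only universal edges whose endpoints are both present.
    edges = {(a, b) for a, b in universal_edges if a in nodes and b in nodes}
    return nodes, edges


# Hypothetical universal graph and content object.
universal_edges = [
    ("machine learning", "data mining"),
    ("machine learning", "neural networks"),
    ("data mining", "recruiting"),
]
post = "We use machine learning and data mining to rank job postings."

nodes, edges = induce_concept_graph(universal_edges, post)
```

A production system would likely need tokenization and phrase normalization in place of substring matching, which can produce false hits such as "data" inside "database".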
20 Claims
1. A method comprising:
accessing a universal concept graph that includes a first set of nodes that represent concept phrases derived from one or more internal documents associated with a social networking service (SNS) and one or more external documents that are external to the SNS, and a first set of edges that connect a plurality of nodes of the first set of nodes;
accessing a content object associated with the SNS;
generating, using one or more hardware processors, an induced concept graph associated with the content object based on analysis of the content object and the universal concept graph, the induced graph including a second set of nodes that represent one or more concept phrases derived from the content object and a second set of edges that connect a plurality of nodes of the second set of nodes;
identifying one or more key concept phrases in the content object based on applying one or more key concept selection algorithms to the induced concept graph, a first key concept selection algorithm of the one or more key concept selection algorithms iteratively removing leaf nodes from the induced concept graph associated with the content object until a desired number of nodes representing key concept phrases are left, a leaf node being connected solely to one other node; and
storing the one or more key concept phrases in a record of a database, the record referencing the content object.
11. A system comprising:
one or more hardware processors; and
a non-transitory machine-readable medium for storing instructions that, when executed by one or more hardware processors, cause the system to perform operations comprising:
accessing a universal concept graph that includes a first set of nodes that represent concept phrases derived from one or more internal documents associated with a social networking service (SNS) and one or more external documents that are external to the SNS, and a first set of edges that connect a plurality of nodes of the first set of nodes;
accessing a content object associated with the SNS;
generating an induced concept graph associated with the content object based on analysis of the content object and the universal concept graph, the induced graph including a second set of nodes that represent one or more concept phrases derived from the content object and a second set of edges that connect a plurality of nodes of the second set of nodes;
identifying one or more key concept phrases in the content object based on applying one or more key concept selection algorithms to the induced concept graph, a first key concept selection algorithm of the one or more key concept selection algorithms iteratively removing leaf nodes from the induced concept graph associated with the content object until a desired number of nodes representing key concept phrases are left, a leaf node being connected solely to one other node; and
storing the one or more key concept phrases in a record of a database, the record referencing the content object.
20. A non-transitory machine-readable storage medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the machine to perform operations comprising:
accessing a universal concept graph that includes a first set of nodes that represent concept phrases derived from one or more internal documents associated with a social networking service (SNS) and one or more external documents that are external to the SNS, and a first set of edges that connect a plurality of nodes of the first set of nodes;
accessing a content object associated with the SNS;
generating an induced concept graph associated with the content object based on analysis of the content object and the universal concept graph, the induced graph including a second set of nodes that represent one or more concept phrases derived from the content object and a second set of edges that connect a plurality of nodes of the second set of nodes;
identifying one or more key concept phrases in the content object based on applying one or more key concept selection algorithms to the induced concept graph, a first key concept selection algorithm of the one or more key concept selection algorithms iteratively removing leaf nodes from the induced concept graph associated with the content object until a desired number of nodes representing key concept phrases are left, a leaf node being connected solely to one other node; and
storing the one or more key concept phrases in a record of a database, the record referencing the content object.
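The final operation in each independent claim, storing the key concept phrases in a database record that references the content object, could look roughly like the following. This is only a sketch under stated assumptions: the claims do not name a database or schema, so the use of SQLite, the table `key_concepts`, its columns, and the identifier `post-123` are all hypothetical.

```python
import json
import sqlite3


def store_key_concepts(conn, content_id, phrases):
    """Write one record that references the content object by its id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS key_concepts ("
        "content_id TEXT PRIMARY KEY, phrases TEXT NOT NULL)"
    )
    # The phrases are serialized as a sorted JSON list (an assumption).
    conn.execute(
        "INSERT OR REPLACE INTO key_concepts (content_id, phrases) VALUES (?, ?)",
        (content_id, json.dumps(sorted(phrases))),
    )
    conn.commit()


def load_key_concepts(conn, content_id):
    """Read back the stored phrases, or an empty list if no record exists."""
    row = conn.execute(
        "SELECT phrases FROM key_concepts WHERE content_id = ?", (content_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


conn = sqlite3.connect(":memory:")
store_key_concepts(conn, "post-123", {"machine learning", "data mining"})
stored = load_key_concepts(conn, "post-123")
```

Keying the record on the content object's identifier keeps one record per content object, so rerunning the selection simply overwrites the earlier result.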
Specification