Selectively deleting clusters of conceptually related words from a generative model for text
First Claim
1. A method for selectively deleting clusters of conceptually-related words from a probabilistic generative model for textual documents, comprising:
- receiving a current model, which contains terminal nodes representing random variables for words and contains one or more cluster nodes representing clusters of conceptually related words;
wherein nodes in the current model are coupled together by weighted links, so that for a cluster node with an incoming link from a node that has fired which causes the cluster node in the current model to fire with a probability proportionate to a weight of the incoming link, an outgoing link from the cluster node to another node causes the other node to fire with a probability proportionate to the weight of the outgoing link; and
processing, at a computer system, a given cluster node in the current model for possible deletion by,
determining a number of outgoing links from the given cluster node to terminal nodes and/or cluster nodes in the current model;
determining that the determined number of outgoing links is less than a minimum value; and
deleting the given cluster node from the current model.
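The deletion step recited in this claim can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Model` class, its field names, the `prune_sparse_clusters` function, and the example node names and weights are all hypothetical, and the sketch assumes links are stored as a nested dictionary from source node to target node to weight.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    # Terminal nodes stand for words; cluster nodes stand for clusters of related words.
    terminals: set = field(default_factory=set)
    clusters: set = field(default_factory=set)
    # links[source][target] = weight of the link from source to target (assumed layout).
    links: dict = field(default_factory=dict)

    def outgoing_count(self, node):
        """Count links from `node` to terminal or cluster nodes present in the model."""
        known = self.terminals | self.clusters
        return sum(1 for target in self.links.get(node, {}) if target in known)

    def delete_cluster(self, node):
        """Remove a cluster node together with its outgoing and incoming links."""
        self.clusters.discard(node)
        self.links.pop(node, None)
        for targets in self.links.values():
            targets.pop(node, None)


def prune_sparse_clusters(model, min_links):
    """Delete each cluster node with fewer than `min_links` outgoing links.

    This is a single pass over the clusters in sorted order; it does not
    re-check clusters whose link count dropped because a later one was deleted.
    """
    deleted = []
    for node in sorted(model.clusters):  # sorted() copies, so deletion is safe
        if model.outgoing_count(node) < min_links:
            model.delete_cluster(node)
            deleted.append(node)
    return deleted


# Hypothetical example: "stub" has one outgoing link, below the minimum of two.
model = Model(
    terminals={"cat", "dog", "pet"},
    clusters={"animals", "stub"},
    links={
        "animals": {"cat": 0.4, "dog": 0.4, "pet": 0.2},
        "stub": {"cat": 0.1},
    },
)
deleted = prune_sparse_clusters(model, min_links=2)
```

In this example only the `stub` cluster is removed, because it has a single outgoing link while `animals` has three. The minimum value of two is chosen for illustration; the claim does not fix a particular value.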
Abstract
One embodiment of the present invention provides a system that selectively deletes clusters of conceptually-related words from a probabilistic generative model for textual documents. During operation, the system receives a current model, which contains terminal nodes representing random variables for words and contains one or more cluster nodes representing clusters of conceptually related words. Nodes in the current model are coupled together by weighted links, so that if an incoming link from a node that has fired causes a cluster node to fire with a probability proportionate to a weight of the incoming link, an outgoing link from the cluster node to another node causes the other node to fire with a probability proportionate to the weight of the outgoing link. Next, the system processes a given cluster node in the current model for possible deletion. This involves determining a number of outgoing links from the given cluster node to terminal nodes or cluster nodes in the current model. If the determined number of outgoing links is less than a minimum value, or if the frequency with which the given cluster node fires is less than a minimum frequency, the system deletes the given cluster node from the current model.
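The abstract names two deletion criteria: too few outgoing links, or a firing frequency below a minimum. The following Python sketch illustrates both under stated assumptions. It assumes link weights lie in [0, 1] and are used directly as firing probabilities, that firing starts from a single hypothetical `root` node, and that firing frequency is estimated by repeated sampling; the abstract does not specify how the frequency is obtained, and all function and node names here are illustrative.

```python
import random


def sample_firing(links, root, rng):
    """Propagate firing from `root`; a fired node fires each target of its
    outgoing links with probability equal to the link weight (assumption)."""
    fired = {root}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for target, weight in links.get(node, {}).items():
            if target not in fired and rng.random() < weight:
                fired.add(target)
                frontier.append(target)
    return fired


def firing_frequency(links, root, nodes, samples, rng):
    """Estimate how often each node in `nodes` fires over `samples` runs."""
    counts = {node: 0 for node in nodes}
    for _ in range(samples):
        fired = sample_firing(links, root, rng)
        for node in nodes:
            if node in fired:
                counts[node] += 1
    return {node: counts[node] / samples for node in nodes}


def clusters_to_delete(links, clusters, frequency, min_links, min_frequency):
    """Return clusters with too few outgoing links or too low a firing frequency."""
    doomed = []
    for node in sorted(clusters):
        too_few_links = len(links.get(node, {})) < min_links
        too_rare = frequency.get(node, 0.0) < min_frequency
        if too_few_links or too_rare:
            doomed.append(node)
    return doomed


# Hypothetical model: weights of 1.0 and 0.0 keep the example deterministic.
links = {
    "root": {"common": 1.0, "rare": 0.0},
    "common": {"cat": 0.5, "dog": 0.5},
    "rare": {"emu": 0.5, "yak": 0.5},
}
rng = random.Random(0)
freq = firing_frequency(links, "root", ["common", "rare"], samples=100, rng=rng)
doomed = clusters_to_delete(
    links, {"common", "rare"}, freq, min_links=2, min_frequency=0.05
)
```

Here `rare` has enough outgoing links but never fires, so it is selected on the frequency criterion alone. A system working from training documents would more plausibly take the firing frequency from observed data than from sampling; the sampling step is only a stand-in.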
41 Citations
20 Claims
1. A method for selectively deleting clusters of conceptually-related words from a probabilistic generative model for textual documents, comprising:
receiving a current model, which contains terminal nodes representing random variables for words and contains one or more cluster nodes representing clusters of conceptually related words;
wherein nodes in the current model are coupled together by weighted links, so that for a cluster node with an incoming link from a node that has fired which causes the cluster node in the current model to fire with a probability proportionate to a weight of the incoming link, an outgoing link from the cluster node to another node causes the other node to fire with a probability proportionate to the weight of the outgoing link; and
processing, at a computer system, a given cluster node in the current model for possible deletion by,
determining a number of outgoing links from the given cluster node to terminal nodes and/or cluster nodes in the current model;
determining that the determined number of outgoing links is less than a minimum value; and
deleting the given cluster node from the current model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for selectively deleting clusters of conceptually-related words from a probabilistic generative model for textual documents, the method comprising:
receiving a current model, which contains terminal nodes representing random variables for words and contains one or more cluster nodes representing clusters of conceptually related words;
wherein nodes in the current model are coupled together by weighted links, so that for a cluster node with an incoming link from a node that has fired which causes the cluster node in the current model to fire with a probability proportionate to a weight of the incoming link, an outgoing link from the cluster node to another node causes the other node to fire with a probability proportionate to the weight of the outgoing link; and
processing a given cluster node in the current model for possible deletion by,
determining a number of outgoing links from the given cluster node to terminal nodes and/or cluster nodes in the current model;
determining that the determined number of outgoing links is less than a minimum value; and
deleting the given cluster node from the current model.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. An apparatus that selectively deletes clusters of conceptually-related words from a probabilistic generative model for textual documents, comprising:
a processor;
a memory;
a receiving mechanism configured to receive a current model, which contains terminal nodes representing random variables for words and contains one or more cluster nodes representing clusters of conceptually related words;
wherein nodes in the current model are coupled together by weighted links, so that for a cluster node with an incoming link from a node that has fired which causes the cluster node in the current model to fire with a probability proportionate to a weight of the incoming link, an outgoing link from the cluster node to another node causes the other node to fire with a probability proportionate to the weight of the outgoing link; and
a deletion mechanism configured to use the processor to selectively delete cluster nodes from the current model, wherein for a given cluster node the deletion mechanism is configured to,
determine a number of outgoing links from the given cluster node to terminal nodes and/or cluster nodes in the current model;
determine that the determined number of outgoing links is less than a minimum value; and
delete the given cluster node from the current model.
Dependent claims: 18, 19, 20
Specification