Efficient parsing with structured prediction cascades
First Claim
1. A computer-implemented method, comprising:
- receiving, at a computing device having one or more processors, a sentence including one or more words;
determining, at the computing device, an index set of possible head-modifier dependencies for the sentence, the index set including inner arcs and outer arcs, the inners arcs representing possible head-modifier dependency between words in the sentence separated by a distance less than or equal to a first distance threshold and outer arcs representing possible head-modifier dependency between words in the sentence separated by a distance greater than the first distance threshold;
pruning, at the computing device, the outer arcs to exclude arcs representing possible head-modifier dependency between words in the sentence separated by a distance greater than a second distance threshold to obtain a first pruned index set, the second distance threshold being based on a determination of a longest head-modifier dependency distance observed in training data;
pruning, at the computing device, the first pruned index set based on an augmented vine parsing algorithm to obtain a second pruned index set, the second pruned index set including;
(i) each specific inner arc when a likelihood that the specific inner arc is appropriate is greater than a first threshold, and (ii) the outer arcs in the first pruned index set when a likelihood that there exists a possible outer arc that is appropriate is greater than the first threshold, wherein each specific inner arc corresponds to a specific index and wherein the likelihood that the specific inner arc is appropriate is determined based on a max-marginal value of its corresponding specific index;
pruning, at the computing device, the second pruned index set based on a second parsing algorithm to obtain a third pruned index set, the second parsing algorithm being a first-order parsing model;
pruning, at the computing device, the third pruned index set based on a third parsing algorithm to obtain a fourth pruned index set, the third parsing algorithm being a second-order parsing model;
pruning, at the computing device, the fourth pruned index set based on a fourth parsing algorithm to obtain a fifth pruned index set, the fourth parsing algorithm being a third-order parsing model;
determining, at the computing device, a most-likely parse for the sentence from the fifth pruned index set; and
outputting, from the computing device, the most-likely parse.
2 Assignments
0 Petitions
Accused Products
Abstract
A dependency parsing method can include determining an index set of possible head-modifier dependencies for a sentence. The index set can include inner arcs and outer arcs, inners arcs representing possible dependency between words in the sentence separated by a distance less than or equal to a threshold and outer arcs representing possible dependency between words in the sentence separated by a distance greater than the threshold. The index set can be pruned to include: (i) each specific inner arc when a likelihood that the specific inner arc is appropriate is greater than a first threshold, and (ii) the outer arcs when a likelihood that there exists any possible outer arc that is appropriate is greater than the first threshold. The method can include further pruning the pruned index set based on a second parsing algorithm, and determining a most-likely parse for the sentence from the pruned index set.
-
Citations
20 Claims
-
1. A computer-implemented method, comprising:
-
receiving, at a computing device having one or more processors, a sentence including one or more words; determining, at the computing device, an index set of possible head-modifier dependencies for the sentence, the index set including inner arcs and outer arcs, the inners arcs representing possible head-modifier dependency between words in the sentence separated by a distance less than or equal to a first distance threshold and outer arcs representing possible head-modifier dependency between words in the sentence separated by a distance greater than the first distance threshold; pruning, at the computing device, the outer arcs to exclude arcs representing possible head-modifier dependency between words in the sentence separated by a distance greater than a second distance threshold to obtain a first pruned index set, the second distance threshold being based on a determination of a longest head-modifier dependency distance observed in training data; pruning, at the computing device, the first pruned index set based on an augmented vine parsing algorithm to obtain a second pruned index set, the second pruned index set including;
(i) each specific inner arc when a likelihood that the specific inner arc is appropriate is greater than a first threshold, and (ii) the outer arcs in the first pruned index set when a likelihood that there exists a possible outer arc that is appropriate is greater than the first threshold, wherein each specific inner arc corresponds to a specific index and wherein the likelihood that the specific inner arc is appropriate is determined based on a max-marginal value of its corresponding specific index;pruning, at the computing device, the second pruned index set based on a second parsing algorithm to obtain a third pruned index set, the second parsing algorithm being a first-order parsing model; pruning, at the computing device, the third pruned index set based on a third parsing algorithm to obtain a fourth pruned index set, the third parsing algorithm being a second-order parsing model; pruning, at the computing device, the fourth pruned index set based on a fourth parsing algorithm to obtain a fifth pruned index set, the fourth parsing algorithm being a third-order parsing model; determining, at the computing device, a most-likely parse for the sentence from the fifth pruned index set; and outputting, from the computing device, the most-likely parse.
-
-
2. A computer-implemented method, comprising:
-
receiving, at a computing device, a sentence including one or more words; determining, at the computing device, an index set of possible head-modifier dependencies for the sentence, the index set including inner arcs and outer arcs, the inners arcs representing possible head-modifier dependency between words in the sentence separated by a first distance less than or equal to a distance threshold and outer arcs representing possible head-modifier dependency between words in the sentence separated by a second distance greater than the distance threshold; pruning, at the computing device, the index set based on an augmented vine parsing algorithm to obtain a first pruned index set, the first pruned index set including;
(i) each specific inner arc when a likelihood that the specific inner arc is appropriate is greater than a first threshold, and (ii) the outer arcs when a likelihood that there exists any possible outer arc that is appropriate is greater than the first threshold;pruning, at the computing device, the first pruned index set based on a second parsing algorithm to obtain a second pruned index set; determining, at the computing device, a most-likely parse for the sentence from the second pruned index set; and outputting, from the computing device, the most-likely parse. - View Dependent Claims (3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. A computing device, comprising:
-
at least one processor; and a non-transitory computer-readable storage medium storing executable computer program code, the at least one processor configured to execute the executable computer program code to perform operations including; receiving a sentence including one or more words; determining an index set of possible head-modifier dependencies for the sentence, the index set including inner arcs and outer arcs, the inners arcs representing possible head-modifier dependency between words in the sentence separated by a first distance less than or equal to a distance threshold and outer arcs representing possible head-modifier dependency between words in the sentence separated by a second distance greater than the distance threshold; pruning the index set based on an augmented vine parsing algorithm to obtain a first pruned index set, the first pruned index set including;
(i) each specific inner arc when a likelihood that the specific inner arc is appropriate is greater than a first threshold, and (ii) the outer arcs when a likelihood that there exists any possible outer arc that is appropriate is greater than the first threshold;pruning the first pruned index set based on a second parsing algorithm to obtain a second pruned index set; determining a most-likely parse for the sentence from the second pruned index set; and outputting the most-likely parse. - View Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20)
-
Specification