Abstractive sentence summarization
First Claim
1. A method comprising, by one or more computer server devices:
receiving an input sentence comprising a sequence of input words, wherein a set of words in a vocabulary comprises the input words;
encoding each of the input words as an indicator vector, wherein the indicator vector captures features of the input word and a sequence of the indicator vectors captures features of the sequence of the input words;
mapping, using a neural network language model, the sequence of the indicator vectors to a distribution of a contextual probability of a first output word in a sequence of output words, wherein the set of words in the vocabulary comprises the output words;
for each of one or more next output words of the sequence of output words:
encoding the sequence of the indicator vectors with a context, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous ones of the output words; and
mapping the encoded sequence of the indicator vectors and the context to the distribution of the contextual probability of the next output word; and
generating a condensed summary using a decoder by maximizing the contextual probability of each of the output words in the sequence of the output words.
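The steps above describe a feed-forward neural network language model paired with an attention-style encoder over the input indicator vectors. A minimal sketch of the claimed scoring step follows; the vocabulary, dimensions, and randomly initialized weights are illustrative assumptions standing in for a trained model, not details from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shared vocabulary for input and output words (assumed).
vocab = ["<s>", "the", "police", "arrested", "a", "man", "held", "suspect"]
V = len(vocab)
word_to_id = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Encode a word as an indicator vector over the vocabulary."""
    v = np.zeros(V)
    v[word_to_id[word]] = 1.0
    return v

D = 4  # embedding size (assumed)
C = 2  # fixed window of previous output words (assumed)

# Random parameters stand in for trained weights.
E = rng.normal(size=(D, V))       # input-word embeddings
F = rng.normal(size=(D, V))       # output-word (context) embeddings
U = rng.normal(size=(V, C * D))   # context window -> vocabulary scores
W = rng.normal(size=(V, D))       # encoded input -> vocabulary scores
P = rng.normal(size=(D, C * D))   # maps context into input-embedding space

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def next_word_distribution(input_words, prev_output_words):
    """P(next output word | input sentence, fixed window of previous outputs)."""
    X = np.stack([one_hot(w) for w in input_words], axis=1)  # V x n indicators
    inp = E @ X                                              # D x n embedded input
    ctx = np.concatenate([F @ one_hot(w) for w in prev_output_words])  # C*D context
    attn = softmax(inp.T @ (P @ ctx))  # weight each input position by the context
    enc = inp @ attn                   # D-dim encoding of the input sentence
    return softmax(U @ ctx + W @ enc)  # contextual probability over the vocabulary
```

Each call returns a probability distribution over the vocabulary for the next output word, conditioned on the encoded input sentence and the fixed window of previously generated output words.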
Abstract
In one embodiment, a sequence of input words is received. Each of the input words is encoded as an indicator vector, wherein a sequence of the indicator vectors captures features of the sequence of input words. The sequence of the indicator vectors is then mapped to a distribution of a contextual probability of a first output word in a sequence of output words. For each subsequent output word, the sequence of the indicator vectors is encoded with a context, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous output words; and the encoded sequence of the indicator vectors and the context is mapped to the distribution of the contextual probability of the subsequent output word. Finally, a condensed summary is generated using a decoder by maximizing the contextual probability of each of the output words.
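The final step, generating the condensed summary by maximizing the contextual probability of each output word, can be sketched as greedy (argmax) decoding; beam search is the common refinement. This is a self-contained sketch in which `toy_model` is a hypothetical stand-in for a trained neural network language model:

```python
import numpy as np

def greedy_summary(input_words, next_word_distribution, vocab, length, C=2):
    """Condense by greedily maximizing the contextual probability of each word."""
    context = ["<s>"] * C  # pad the fixed window with start symbols
    summary = []
    for _ in range(length):
        dist = next_word_distribution(input_words, context[-C:])
        word = vocab[int(np.argmax(dist))]  # most probable next output word
        summary.append(word)
        context.append(word)
    return summary

# Hypothetical stand-in model: prefers the first input word absent from the
# current context window, so the toy decode copies input words in order.
def toy_model(input_words, prev_window):
    vocab_local = ["<s>", "police", "arrested", "man"]
    scores = np.full(len(vocab_local), 1e-6)
    for w in input_words:
        if w in vocab_local and w not in prev_window:
            scores[vocab_local.index(w)] = 1.0
            break
    return scores / scores.sum()
```

Swapping the argmax for a k-best expansion over partial summaries turns this into the beam-search decoder typically used in practice.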
18 Claims
1. A method comprising, by one or more computer server devices:
receiving an input sentence comprising a sequence of input words, wherein a set of words in a vocabulary comprises the input words;
encoding each of the input words as an indicator vector, wherein the indicator vector captures features of the input word and a sequence of the indicator vectors captures features of the sequence of the input words;
mapping, using a neural network language model, the sequence of the indicator vectors to a distribution of a contextual probability of a first output word in a sequence of output words, wherein the set of words in the vocabulary comprises the output words;
for each of one or more next output words of the sequence of output words:
encoding the sequence of the indicator vectors with a context, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous ones of the output words; and
mapping the encoded sequence of the indicator vectors and the context to the distribution of the contextual probability of the next output word; and
generating a condensed summary using a decoder by maximizing the contextual probability of each of the output words in the sequence of the output words.
(Dependent claims: 2, 3, 4, 5, 6)
7. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
receive an input sentence comprising a sequence of input words, wherein a set of words in a vocabulary comprises the input words;
encode each of the input words as an indicator vector, wherein the indicator vector captures features of the input word and a sequence of the indicator vectors captures features of the sequence of the input words;
map, using a neural network language model, the sequence of the indicator vectors to a distribution of a contextual probability of a first output word in a sequence of output words, wherein the set of words in the vocabulary comprises the output words;
for each of one or more next output words of the sequence of output words:
encode the sequence of the indicator vectors with a context, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous ones of the output words; and
map the encoded sequence of the indicator vectors and the context to the distribution of the contextual probability of the next output word; and
generate a condensed summary using a decoder by maximizing the contextual probability of each of the output words in the sequence of the output words.
(Dependent claims: 8, 9, 10, 11, 12)
13. A system comprising:
one or more processors; and
a memory coupled to the processors comprising instructions executable by the processors, the processors being operable when executing the instructions to:
receive an input sentence comprising a sequence of input words, wherein a set of words in a vocabulary comprises the input words;
encode each of the input words as an indicator vector, wherein the indicator vector captures features of the input word and a sequence of the indicator vectors captures features of the sequence of the input words;
map, using a neural network language model, the sequence of the indicator vectors to a distribution of a contextual probability of a first output word in a sequence of output words, wherein the set of words in the vocabulary comprises the output words;
for each of one or more next output words of the sequence of output words:
encode the sequence of the indicator vectors with a context, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous ones of the output words; and
map the encoded sequence of the indicator vectors and the context to the distribution of the contextual probability of the next output word; and
generate a condensed summary using a decoder by maximizing the contextual probability of each of the output words in the sequence of the output words.
(Dependent claims: 14, 15, 16, 17, 18)
Specification