DEEP REINFORCED MODEL FOR ABSTRACTIVE SUMMARIZATION
First Claim
1. A text summarization system comprising:
an encoder for encoding input tokens of a document to be summarized; and
a decoder for emitting summary tokens which summarize the document based on the encoded input tokens, wherein at each iteration the decoder:
generates attention scores between a current hidden state of the decoder and previous hidden states of the decoder;
generates a current decoder context from the attention scores and the previous hidden states of the decoder; and
selects a next summary token based on the current decoder context and a current encoder context of the encoder;
wherein the attention scores penalize candidate summary tokens having high attention scores in previous iterations.
Abstract
A system for text summarization includes an encoder for encoding input tokens of a document and a decoder for emitting summary tokens which summarize the document based on the encoded input tokens. At each iteration the decoder generates attention scores between a current hidden state of the decoder and previous hidden states of the decoder, generates a current decoder context from the attention scores and the previous hidden states of the decoder, and selects a next summary token based on the current decoder context and a current encoder context of the encoder. The attention scores penalize candidate summary tokens having high attention scores in previous iterations. In some embodiments, the attention scores include an attention score for each of the previous hidden states of the decoder. In some embodiments, the selection of the next summary token prevents emission of repeated summary phrases in a summary of the document.
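The penalization described in the abstract can be sketched as a temporal attention step: each exponentiated attention score is divided by the sum of that state's exponentiated scores from earlier iterations, so decoder states that already drew high attention are down-weighted. The following is a minimal plain-Python sketch under that assumption; the function name and list-based representation are illustrative, not the patent's formulation:

```python
import math

def temporal_attention(scores_history, current_scores):
    """Normalize attention scores while penalizing hidden states that
    drew high attention in earlier iterations (illustrative sketch).

    scores_history: list of raw score vectors from previous iterations.
    current_scores: raw alignment scores between the decoder's current
    hidden state and each previous hidden state."""
    exp_scores = [math.exp(s) for s in current_scores]
    if scores_history:
        # divide by the sum of each state's exponentiated past scores
        penalized = [
            e / sum(math.exp(past[i]) for past in scores_history)
            for i, e in enumerate(exp_scores)
        ]
    else:
        penalized = exp_scores
    total = sum(penalized)
    weights = [p / total for p in penalized]
    scores_history.append(current_scores)
    return weights
```

The current decoder context would then be the weighted sum of the previous decoder hidden states under these weights, which is what steers token selection away from already-attended (and hence already-emitted) phrases.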
20 Claims
1. A text summarization system comprising:
an encoder for encoding input tokens of a document to be summarized; and
a decoder for emitting summary tokens which summarize the document based on the encoded input tokens, wherein at each iteration the decoder:
generates attention scores between a current hidden state of the decoder and previous hidden states of the decoder;
generates a current decoder context from the attention scores and the previous hidden states of the decoder; and
selects a next summary token based on the current decoder context and a current encoder context of the encoder;
wherein the attention scores penalize candidate summary tokens having high attention scores in previous iterations.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A method for summarizing text, the method comprising:
receiving a document to be summarized;
encoding, using an encoder, input tokens of the document;
generating, using a decoder, attention scores between a current hidden state of the decoder and previous hidden states of the decoder;
generating, using the decoder, a current decoder context from the attention scores and the previous hidden states of the decoder; and
selecting, using the decoder, a next summary token based on the current decoder context and a current encoder context of the encoder;
wherein:
the next summary token from each iteration summarizes the document; and
the attention scores penalize candidate summary tokens having high attention scores in previous iterations.
Dependent claims: 9, 10, 11, 12, 13, 14.
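The steps of the method claim can be sketched as one decoder iteration. Dot-product scoring, the token-scoring `vocab` table, and the running exponentiated-score totals below are all assumptions made for illustration, not the patent's exact formulation:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def decode_step(dec_hidden, prev_hiddens, prev_exp_scores, enc_context, vocab):
    """One decoder iteration (illustrative sketch).

    prev_exp_scores: running sums of exponentiated attention scores for
    each previous hidden state; vocab maps token -> scoring vector."""
    # 1. attention scores between the current and previous decoder states
    raw = [dot(dec_hidden, h) for h in prev_hiddens]
    exp_raw = [math.exp(s) for s in raw]
    # penalize states that received high attention in previous iterations
    pen = [e / p for e, p in zip(exp_raw, prev_exp_scores)]
    total = sum(pen)
    weights = [p / total for p in pen]
    # 2. current decoder context = weighted sum of previous hidden states
    dec_context = [sum(w * h[i] for w, h in zip(weights, prev_hiddens))
                   for i in range(len(dec_hidden))]
    # 3. select the next summary token from decoder + encoder contexts
    combined = dec_context + enc_context
    next_token = max(vocab, key=lambda t: dot(vocab[t], combined))
    # update the running exponentiated-score totals for the next iteration
    new_totals = [p + e for p, e in zip(prev_exp_scores, exp_raw)]
    return next_token, new_totals
```

Repeating this step, with the encoder run once over the received document, yields the summary token sequence; because `new_totals` grows for heavily attended states, repeated phrases are suppressed across iterations.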
15. A tangible non-transitory computer readable storage medium impressed with computer program instructions that, when executed on a processor, implement a method comprising:
receiving a document to be summarized;
encoding, using an encoder, input tokens of the document;
generating, using a decoder, attention scores between a current hidden state of the decoder and previous hidden states of the decoder;
generating, using the decoder, a current decoder context from the attention scores and the previous hidden states of the decoder; and
selecting, using the decoder, a next summary token based on the current decoder context and a current encoder context of the encoder;
wherein:
the next summary token from each iteration summarizes the document; and
the attention scores penalize candidate summary tokens having high attention scores in previous iterations.
Dependent claims: 16, 17, 18, 19, 20.
Specification