ABSTRACTIVE SENTENCE SUMMARIZATION
Abstract
In one embodiment, a sequence of input words is received. Each of the input words is encoded as an indicator vector, wherein a sequence of the indicator vectors captures features of the sequence of input words. The sequence of the indicator vectors is then mapped to a distribution of a contextual probability of a first output word in a sequence of output words. For each subsequent output word, the sequence of the indicator vectors is encoded with a context, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous output words; and the encoded sequence of the indicator vectors and the context is mapped to the distribution of the contextual probability of the subsequent output word. Finally, a condensed summary is generated using a decoder by maximizing the contextual probability of each of the output words.
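The abstract's pipeline (indicator vectors, a fixed context window, a contextual probability distribution, and a decoder) can be sketched end to end. The following is a minimal Python sketch, not the patented implementation: the vocabulary, `encode_indicators`, `nnlm_distribution`, and the greedy argmax decoder are all illustrative stand-ins.

```python
# Hedged sketch of the abstract's generation loop. The "language model"
# below is a random placeholder standing in for a trained network.
import numpy as np

VOCAB = ["<s>", "police", "arrest", "man", "after", "protest", "summary"]
V = len(VOCAB)
C = 3  # fixed window of previous output words

def encode_indicators(words):
    """Encode each input word as a one-hot indicator vector."""
    X = np.zeros((len(words), V))
    for i, w in enumerate(words):
        X[i, VOCAB.index(w)] = 1.0
    return X

def nnlm_distribution(X, context_ids, rng):
    """Stand-in for the neural network language model: maps the input
    indicator vectors plus the window of previous output words to a
    distribution over the next output word. (context_ids would feed a
    real model; the toy ignores it.)"""
    logits = rng.standard_normal(V)   # placeholder for the trained net
    logits += X.sum(axis=0)           # crude conditioning on the input
    e = np.exp(logits - logits.max())
    return e / e.sum()

def greedy_summary(words, length, seed=0):
    rng = np.random.default_rng(seed)
    X = encode_indicators(words)
    context = [0] * C                 # start-symbol padding
    out = []
    for _ in range(length):
        p = nnlm_distribution(X, context[-C:], rng)
        y = int(p.argmax())           # maximize each word's contextual probability
        out.append(VOCAB[y])
        context.append(y)
    return " ".join(out)

print(greedy_summary(["police", "arrest", "man", "after", "protest"], 3))
```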
Claims
1. A method comprising, by one or more computer server devices:
receiving an input sentence comprising a sequence of input words, wherein a set of words in a vocabulary comprises the input words;
determining, using a neural network language model, a contextual probability of a first output word in a sequence of output words, wherein the set of words in the vocabulary comprises the output words;
for each of one or more next output words of the sequence of output words:
determining a context of a sequence of the input words, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous ones of the output words; and
determining the contextual probability of the next output word; and
generating a condensed summary using a decoder by maximizing the contextual probability of each of the output words in the sequence of the output words.
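Restated compactly (an editorial gloss in assumed notation, not claim language): for input x, an output of N words, and a fixed window of C previous output words y_c, the claimed decoder maximizes the summed contextual log-probabilities.

```latex
% y* is the condensed summary; y_c is the fixed window of previous output words.
\[
  y^{*} \;=\; \arg\max_{y}\; \sum_{i=0}^{N-1} \log p\!\left(y_{i+1} \mid x,\, \mathbf{y}_c;\, \theta\right),
  \qquad \mathbf{y}_c = y_{[i-C+1,\, \dots,\, i]}
\]
```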
2. The method of claim 1, further comprising:
encoding each of the input words as an indicator vector, wherein the indicator vector captures features of the input word and a sequence of the indicator vectors captures features of the sequence of the input words, wherein the determining the context of the sequence of the input words comprises encoding the sequence of the indicator vectors with the context.
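Because an indicator vector is one-hot, encoding words this way makes any downstream linear map a simple row lookup. A minimal sketch follows; the embedding table F and all dimensions are assumptions for illustration.

```python
# One-hot indicator encoding: multiplying the indicator sequence by an
# embedding table is equivalent to selecting rows of that table.
import numpy as np

V, D = 7, 4                          # vocabulary size, feature width (assumed)
F = np.random.default_rng(1).standard_normal((V, D))  # illustrative embeddings

def indicator(i, size):
    v = np.zeros(size)
    v[i] = 1.0
    return v

x = [2, 5, 3]                                  # word ids of the input sentence
X = np.stack([indicator(i, V) for i in x])     # sequence of indicator vectors
E = X @ F                                      # features of the word sequence
assert np.allclose(E, F[x])                    # the indicator form is a lookup
print(E.shape)                                 # (3, 4): one feature row per word
```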
3. The method of claim 2, wherein determining the contextual probability of the first output word comprises mapping the sequence of the indicator vectors to a distribution of the contextual probability of the first output word.
4. The method of claim 3, wherein determining the contextual probability of the next output word comprises mapping the encoded sequence of the indicator vectors and the context to the distribution of the contextual probability of the next output word.
5. The method of claim 2, wherein the encoding comprises using an attention-based encoder which is used to find a latent soft alignment between the indicator vectors and the context, and wherein the latent soft alignment points to a position in the sequence of the indicator vectors where a block of highly relevant information for generating the summary is concentrated.
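A soft alignment of the kind claim 5 describes can be sketched as a softmax over input positions, scored against the output-side context window. The parameter matrix P, the dimensions, and the random values below are assumptions; the argmax at the end picks out the position where relevant information is concentrated.

```python
# Hedged sketch of an attention-based encoder: score each input position
# against the context window, softmax into a latent soft alignment, and
# form a context-weighted encoding of the input.
import numpy as np

rng = np.random.default_rng(0)
M, D, C = 5, 4, 3                        # input length, feature width, window
X_emb = rng.standard_normal((M, D))      # embedded input sequence
y_ctx = rng.standard_normal(C * D)       # embedded context window, flattened
P = rng.standard_normal((D, C * D))      # alignment parameters (assumed trained)

scores = X_emb @ P @ y_ctx               # relevance of each input position
align = np.exp(scores - scores.max())
align /= align.sum()                     # latent soft alignment (softmax)

enc = align @ X_emb                      # context-weighted input encoding
peak = int(align.argmax())               # position of concentrated relevance
print(align.round(3), peak)
```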
6. The method of claim 1, wherein a number of the output words in the sequence of the output words is pre-determined.
7. The method of claim 1, wherein the decoder is a Viterbi decoder that finds an exact solution by searching through an entire distribution of the contextual probability.
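Because the model of claim 1 conditions on only a fixed window of C previous output words, the output process is Markov, which is what makes an exact Viterbi-style dynamic program possible. A compact statement of the recurrence, in assumed notation:

```latex
% pi[i][c]: best log score of any i-word prefix whose last C words are c.
% A new word y extends context c' = (c'_1,...,c'_C) to c = (c'_2,...,c'_C, y).
\[
  \pi[i][c] \;=\; \max_{c',\, y \,:\; c = (c'_2, \dots, c'_C,\, y)}
    \Bigl( \pi[i-1][c'] \;+\; \log p\bigl(y \mid x,\, c'\bigr) \Bigr)
\]
% Exact, but the state space has size V^C, exponential in the window C,
% which is what motivates the approximate beam search of claim 8.
```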
8. The method of claim 1, wherein the decoder is a beam search decoder that finds an approximate solution by searching through a limited distribution of the contextual probability.
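A beam-search decoder of the kind claim 8 describes keeps only the K highest-scoring partial summaries at each step, rather than the full search space claim 7's exact Viterbi decoder would explore. A minimal sketch; `step_logprobs` and the toy fixed distribution are assumptions, not the patented decoder.

```python
# Hedged beam-search sketch: approximate argmax over output sequences by
# pruning to the K best partial hypotheses at every step.
import numpy as np

def beam_search(step_logprobs, length, K=4):
    beams = [([], 0.0)]                       # (output ids, running log prob)
    for _ in range(length):
        candidates = []
        for seq, score in beams:
            lp = step_logprobs(seq)           # log-probs over the vocabulary
            for w in np.argsort(lp)[-K:]:     # top-K extensions per beam
                candidates.append((seq + [int(w)], score + float(lp[w])))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:K]                # prune to the K best hypotheses
    return beams[0]

# Toy model: one fixed random distribution, independent of the prefix.
rng = np.random.default_rng(3)
table = np.log(rng.dirichlet(np.ones(7)))
print(beam_search(lambda seq: table, length=3))
```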
9. The method of claim 1, further comprising modifying a scoring function to find extractive word matches from the input sentences by directly estimating the contextual probability using a log-linear model.
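One reading of claim 9 is a log-linear rescoring that mixes the model's contextual estimate with extractive features rewarding words copied from the input sentence. The feature set and weights below are illustrative assumptions, not the patented tuning.

```python
# Hedged sketch of a log-linear scoring function: a weighted sum of the
# NNLM contextual log-probability and an extractive word-match feature.
import math

def loglinear_score(model_logprob, next_word, input_words, weights):
    feats = {
        "lm": model_logprob,                           # contextual estimate
        "unigram_match": 1.0 if next_word in input_words else 0.0,
    }
    return sum(weights[k] * v for k, v in feats.items())

w = {"lm": 1.0, "unigram_match": 0.5}                  # assumed weights
print(loglinear_score(math.log(0.02), "protest",
                      {"police", "arrest", "man", "after", "protest"}, w))
```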
10. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
receive an input sentence comprising a sequence of input words, wherein a set of words in a vocabulary comprises the input words;
determine, using a neural network language model, a contextual probability of a first output word in a sequence of output words, wherein the set of words in the vocabulary comprises the output words;
for each of one or more next output words of the sequence of output words:
determine a context of a sequence of the input words, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous ones of the output words; and
determine the contextual probability of the next output word; and
generate a condensed summary using a decoder by maximizing the contextual probability of each of the output words in the sequence of the output words.
11. The non-transitory storage media of claim 10, wherein the software is further operable when executed to:
encode each of the input words as an indicator vector, wherein the indicator vector captures features of the input word and a sequence of the indicator vectors captures features of the sequence of the input words, wherein the context of the sequence of the input words is determined by encoding the sequence of the indicator vectors with the context.
12. The non-transitory storage media of claim 11, wherein the software operable to determine the contextual probability of the first output word comprises software operable to map the sequence of the indicator vectors to a distribution of the contextual probability of the first output word.
13. The non-transitory storage media of claim 12, wherein the software operable to determine the contextual probability of the next output word comprises the software operable to map the encoded sequence of the indicator vectors and the context to the distribution of the contextual probability of the next output word.
14. The non-transitory storage media of claim 11, wherein the software operable to encode uses an attention-based encoder to find a latent soft alignment between the indicator vectors and the context, and wherein the latent soft alignment points to a position in the sequence of the indicator vectors where a block of highly relevant information for generating the summary is concentrated.
15. The non-transitory storage media of claim 10, wherein a number of the output words in the sequence of the output words is pre-determined.
16. The non-transitory storage media of claim 10, wherein the decoder is a Viterbi decoder that finds an exact solution by searching through an entire distribution of the contextual probability.
17. The non-transitory storage media of claim 10, wherein the decoder is a beam search decoder that finds an approximate solution by searching through a limited distribution of the contextual probability.
18. The non-transitory storage media of claim 10, further comprising software operable to modify a scoring function to find extractive word matches from the input sentences by directly estimating the contextual probability using a log-linear model.
19. A system comprising:
one or more processors; and
a memory coupled to the processors comprising instructions executable by the processors, the processors being operable when executing the instructions to:
receive an input sentence comprising a sequence of input words, wherein a set of words in a vocabulary comprises the input words;
determine, using a neural network language model, a contextual probability of a first output word in a sequence of output words, wherein the set of words in the vocabulary comprises the output words;
for each of one or more next output words of the sequence of output words:
determine a context of a sequence of the input words, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous ones of the output words; and
determine the contextual probability of the next output word; and
generate a condensed summary using a decoder by maximizing the contextual probability of each of the output words in the sequence of the output words.
20. The system of claim 19, the processors being further operable when executing the instructions to:
encode each of the input words as an indicator vector, wherein the indicator vector captures features of the input word and a sequence of the indicator vectors captures features of the sequence of the input words, wherein the context of the sequence of the input words is determined by encoding the sequence of the indicator vectors with the context.
Specification