Modeling interestingness with deep neural networks
First Claim
1. A computer-implemented process, comprising:
applying a computer to perform process actions for:
receiving a collection of source and target document pairs;
identifying a separate context for each source document, the context for each source document comprising a selection within the source document and a window of multiple words in the source document around that selection;
identifying a separate context for each target document, the context for each target document comprising a first fixed number of the first words in that target document;
mapping each context to a separate vector;
mapping each of the vectors to a convolutional layer of a neural network;
mapping the convolutional layer to a plurality of hidden layers of the neural network;
generating a learned interestingness model by learning weights for each of a plurality of transitions between the layers of the neural network, such that the learned weights minimize a distance between the vectors of the contexts of the source and target documents;
the interestingness model configured to determine a conditional likelihood of a user interest in transitioning to an arbitrary target document when that user is consuming an arbitrary source document in view of a context extracted from that arbitrary source document and a context extracted from that arbitrary target document; and
applying the interestingness model to recommend one or more arbitrary target documents to the user relative to an arbitrary source document being consumed by the user.
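The steps of claim 1 can be sketched end to end. The following is a minimal illustration only: the toy vocabulary, layer sizes, tanh activations, max pooling over convolution windows, and the random (untrained) weights are all assumptions for demonstration, none of which the claim itself specifies.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = {w: i for i, w in enumerate(
    "the quick brown fox jumps over a lazy dog news story".split())}
EMBED_DIM, CONV_DIM, HIDDEN_DIM = 8, 6, 4

# Illustrative parameters; per the claim, these would be learned so that the
# distance between clicked source/target context vectors is minimized.
embed = rng.normal(size=(len(VOCAB), EMBED_DIM))
W_conv = rng.normal(size=(3 * EMBED_DIM, CONV_DIM))   # width-3 word window
W_hidden = rng.normal(size=(CONV_DIM, HIDDEN_DIM))

def source_context(words, selection_idx, window=2):
    """Claim 1, source side: a selection plus a window of words around it."""
    lo, hi = max(0, selection_idx - window), selection_idx + window + 1
    return words[lo:hi]

def target_context(words, first_n=5):
    """Claim 1, target side: a first fixed number of the target's first words."""
    return words[:first_n]

def map_context(words):
    """Context -> word vectors -> convolutional layer -> hidden layer."""
    vecs = np.stack([embed[VOCAB[w]] for w in words])
    # Slide a width-3 window over the word embeddings, then max-pool.
    windows = [np.concatenate(vecs[i:i + 3]) for i in range(len(vecs) - 2)]
    conv = np.tanh(np.stack(windows) @ W_conv).max(axis=0)
    return np.tanh(conv @ W_hidden)

def distance(src_vec, tgt_vec):
    """Cosine distance: the quantity the learned weights minimize."""
    cos = src_vec @ tgt_vec / (np.linalg.norm(src_vec) * np.linalg.norm(tgt_vec))
    return 1.0 - cos

src = source_context("the quick brown fox jumps over a lazy dog".split(), 4)
tgt = target_context("a news story over the lazy dog".split())
print(distance(map_context(src), map_context(tgt)))
```

With untrained weights the printed distance is arbitrary; training would drive it down for source and target documents actually connected by a user click.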
Abstract
An “Interestingness Modeler” uses deep neural networks to learn deep semantic models (DSM) of “interestingness.” The DSM, consisting of two branches of deep neural networks or their convolutional versions, identifies and predicts target documents that would interest users reading source documents. The model is trained on naturally occurring signals of interestingness: click transitions between source and target documents derived from web browser logs. Interestingness is modeled with deep neural networks that map source-target document pairs to feature vectors in a latent space, trained on document transitions in view of a “context” and optional “focus” of source and target documents. Network parameters are learned to minimize distances in that space between source documents and their corresponding “interesting” targets. The resulting interestingness model has many uses, including, but not limited to, contextual entity search, automatic text highlighting, prefetching documents of likely interest, automated content recommendation, automated advertisement placement, etc.
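The abstract does not give a formula for the conditional likelihood of a transition. A common way to turn latent-space cosine similarity into such a likelihood, in related deep-semantic-model work, is a softmax over candidate targets; the smoothing factor `gamma` below and the toy vectors are assumptions for illustration.

```python
import numpy as np

def interestingness(src_vec, tgt_vecs, gamma=10.0):
    """P(target | source): softmax over smoothed cosine similarities.

    A target whose latent vector lies close to the source's (small cosine
    distance, the quantity training minimizes) receives high likelihood."""
    sims = np.array([
        src_vec @ t / (np.linalg.norm(src_vec) * np.linalg.norm(t))
        for t in tgt_vecs])
    exp = np.exp(gamma * sims)
    return exp / exp.sum()

src = np.array([1.0, 0.0, 0.0])
targets = [np.array([0.9, 0.1, 0.0]),   # close to the source
           np.array([0.0, 1.0, 0.0])]   # orthogonal to it
probs = interestingness(src, targets)
print(probs)  # the nearby target receives most of the probability mass
```

Ranking candidate targets by this likelihood is what drives the recommendation, prefetching, and advertisement-placement uses listed above.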
90 Citations
19 Claims
1. A computer-implemented process, comprising:

applying a computer to perform process actions for:
receiving a collection of source and target document pairs;
identifying a separate context for each source document, the context for each source document comprising a selection within the source document and a window of multiple words in the source document around that selection;
identifying a separate context for each target document, the context for each target document comprising a first fixed number of the first words in that target document;
mapping each context to a separate vector;
mapping each of the vectors to a convolutional layer of a neural network;
mapping the convolutional layer to a plurality of hidden layers of the neural network;
generating a learned interestingness model by learning weights for each of a plurality of transitions between the layers of the neural network, such that the learned weights minimize a distance between the vectors of the contexts of the source and target documents;
the interestingness model configured to determine a conditional likelihood of a user interest in transitioning to an arbitrary target document when that user is consuming an arbitrary source document in view of a context extracted from that arbitrary source document and a context extracted from that arbitrary target document; and
applying the interestingness model to recommend one or more arbitrary target documents to the user relative to an arbitrary source document being consumed by the user.

- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
16. A system comprising:
a general purpose computing device; and
a computer program comprising program modules executable by the computing device, wherein the computing device is directed by the program modules of the computer program to:
receive a collection of source and target document pairs;
identify a separate focus and a separate context for each source document and each target document;
the context of each source document comprising a selection of one or more words within the source document and a window of multiple words in the source document around that selection;
the focus of each source document comprising a selected anchor within the source document;
the context of each target document comprising a first fixed number of the first words in that target document;
the focus of each target document comprising a second fixed number of the first words in that target document, the second fixed number being smaller than the first fixed number;
map the words of each focus to a separate vector and the words of each context to a separate vector;
for each document, concatenate the corresponding focus and context vectors into a combined vector;
map each of the combined vectors to a convolutional layer of a neural network;
map the convolutional layer to a hidden layer of the neural network;
generate a learned interestingness model by learning weights for each of a plurality of transitions between the layers of the neural network, such that the learned weights minimize a distance between the combined vectors of the source and target documents;
the interestingness model configured to determine a conditional likelihood of a user interest in transitioning to an arbitrary target document when that user is consuming an arbitrary source document in view of a context extracted from that arbitrary source document and a context extracted from that arbitrary target document; and
apply the interestingness model to recommend one or more arbitrary target documents to the user relative to an arbitrary source document being consumed by the user.

- View Dependent Claims (17, 18)
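Claim 16 differs from claim 1 mainly by adding a per-document "focus" and concatenating it with the context before the convolutional layer. A sketch of just that bookkeeping for the target side, with the two fixed numbers (5 and 2, the second smaller than the first) chosen arbitrarily for illustration:

```python
import numpy as np

CONTEXT_WORDS, FOCUS_WORDS = 5, 2   # first fixed number > second fixed number

def target_focus_and_context(words):
    """Claim 16, target side: the context is the first CONTEXT_WORDS words,
    the focus the first FOCUS_WORDS words (a smaller fixed number)."""
    return words[:FOCUS_WORDS], words[:CONTEXT_WORDS]

def combine(focus_vec, context_vec):
    """Concatenate the per-document focus and context vectors (claim 16)
    before they enter the convolutional layer."""
    return np.concatenate([focus_vec, context_vec])

focus, context = target_focus_and_context("storm hits coast late on tuesday".split())
print(focus, context)
```

On the source side, the focus would instead be a selected anchor within the document, as the claim recites; the concatenation step is identical.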
19. A computer-readable storage device having computer executable instructions stored therein, said instructions causing a computing device to execute a method comprising:
receiving a collection of source and target document pairs;
identifying a separate context for each source document and each target document;
the context for each source document comprising a selection within the source document and a window of multiple words in the source document around that selection;
the context for each target document comprising a first fixed number of the first words in that target document;
mapping each context to a separate vector;
mapping each of the vectors to a convolutional layer of a neural network;
mapping the convolutional layer to a plurality of hidden layers of the neural network;
generating a learned interestingness model by learning weights for each of a plurality of transitions between the layers of the neural network, such that the learned weights minimize a distance between the vectors of the source and target documents;
training a discriminative model from an output layer of the learned interestingness model; and
applying the discriminative model to automatically highlight content in an arbitrary document being consumed by a user.
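Claim 19 trains a discriminative model on the interestingness model's output layer and applies it to highlight content. A sketch using plain logistic regression, which the claim does not prescribe; the output-layer feature values and labels below are synthetic, fabricated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training data: the interestingness model's output-layer vector
# for each candidate span, labeled 1 if the span led to an "interesting" click.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # synthetic labels

w = np.zeros(4)
for _ in range(500):           # logistic regression via gradient ascent
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w += 0.1 * X.T @ (y - p) / len(y)

def highlight(spans, features, threshold=0.5):
    """Claim 19: apply the discriminative model to pick spans to highlight."""
    p = 1.0 / (1.0 + np.exp(-(features @ w)))
    return [s for s, pi in zip(spans, p) if pi > threshold]

spans = ["deep neural networks", "the weather"]
feats = np.array([[2.0, 1.0, 0.0, 0.0], [-2.0, -1.0, 0.0, 0.0]])
print(highlight(spans, feats))   # prints the spans scored above threshold
```

Any discriminative classifier over the learned output-layer features would fit the claim's wording equally well; logistic regression is used here only because it is compact.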
Specification