Principles and methods for personalizing newsfeeds via an analysis of information novelty and dynamics
Abstract
A system and methodology are provided for filtering temporal streams of information, such as news stories, by statistical measures of information novelty. Various techniques can be applied to custom-tailor news feeds or other types of information based on information that a user has already reviewed. Methods for analyzing information novelty are provided, along with a system that personalizes and filters information for users by identifying the novelty of stories in the context of stories they have already reviewed. The system employs novelty-analysis algorithms that represent articles as a bag of words and named entities. The algorithms analyze inter- and intra-document dynamics by considering how information evolves over time from article to article, as well as within individual articles.
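As an illustration of the bag-of-words representation mentioned in the abstract, the following is a minimal Python sketch, not the patented implementation. The tokenizer, the function names `bag_of_words` and `cosine_distance`, and the choice of cosine distance are all assumptions made for the example; the abstract does not name a specific distance measure, and named-entity extraction is omitted.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count word occurrences (a simple bag of words)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity of two word-count vectors.

    Cosine distance is an assumed stand-in for the unspecified novelty measure.
    """
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical example stories: one already read, one rehash, one genuine update.
already_read = bag_of_words("Storm hits the coast; thousands lose power.")
repeat = bag_of_words("Thousands lose power as storm hits the coast.")
update = bag_of_words("Officials open shelters and estimate repair costs.")
```

Under these assumptions, `cosine_distance(already_read, repeat)` is small because the two stories share almost all of their words, while `cosine_distance(already_read, update)` is large because the update introduces new vocabulary.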
222 Citations
39 Claims
1. A machine implemented system for distributing personalized information, comprising:
a comparator that determines differences between two or more related information items, and
an analyzer that automatically determines a subset of the related information items as personalized information based in part on the differences and as data relating to the information items evolves over time and at least one of:
stores the personalized information in a computer storage medium; or
displays the personalized information on an output device,
wherein the personalized information adds maximum novel information to the subset of the related information, and
wherein the subset of information items is at least one of stored in a computer storage medium or displayed on an output device and the analyzer employs the following algorithm:
Algorithm RANK NEWS BY NOVELTY (dist, seed, D, n)
    R ← seed // initialization
    for i = 1 to min(n, |D|) do
        d ← argmax over d_i ∈ D of {dist(d_i, R)}
        R ← R ∪ {d}; D ← D \ {d}
where dist is a distance metric, seed is the seed story, D is a set of relevant updates, d is a document, n is the desired number of updates to select, and R is the list of articles ordered by novelty.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A method for creating personalized information, comprising:
automatically analyzing documents from different information sources;
automatically determining novelty of the documents;
creating a personalized feed of information based on the novelty of the documents by implementing at least the following algorithm; and
at least one of storing or displaying the personalized feed,
wherein the personalized feed of information is at least one of stored in a computer storage medium or displayed on an output device and the analyzer employs the following algorithm:
Algorithm RANK NEWS BY NOVELTY (dist, seed, D, n)
    R ← seed // initialization
    for i = 1 to min(n, |D|) do
        d ← argmax over d_i ∈ D of {dist(d_i, R)}
        R ← R ∪ {d}; D ← D \ {d}
where dist is a distance metric, seed is the seed story, D is a set of relevant updates, d is a document, n is the desired number of updates to select, and R is the list of articles ordered by novelty.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
28. A method for performing a document analysis, comprising:
constructing a language model for each document in a set of documents;
analyzing the documents based at least upon determining a fixed distance metric;
sliding at least one window over words in the documents, wherein for each document a distance score of the sliding window versus a seed story is calculated and the results are passed through a median filter, the median filter identifies novel information in each; and
at least one of storing or displaying the results,
wherein the results are at least one of stored in a computer storage medium or displayed on an output device and the median filter comprises the following algorithm:
Algorithm RANK NEWS BY NOVELTY (dist, seed, D, n)
    R ← seed // initialization
    for i = 1 to min(n, |D|) do
        d ← argmax over d_i ∈ D of {dist(d_i, R)}
        R ← R ∪ {d}; D ← D \ {d}
where dist is a distance metric, seed is the seed story, D is a set of relevant updates, d is a document, n is the desired number of updates to select, and R is the list of articles ordered by novelty.
Dependent claims: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38
39. A machine implemented system for creating personalized information, comprising:
means for analyzing a plurality of documents from different information sources;
means for determining a similarity of the documents;
means for providing a personalized feed of novel information based on determined differences in similarity of the documents by implementing the following algorithm; and
means for at least one of storing or displaying the personalized feed,
wherein the personalized feed of information is at least one of stored in a computer storage medium or displayed on an output device and the algorithm implemented is:
Algorithm RANK NEWS BY NOVELTY (dist, seed, D, n)
    R ← seed // initialization
    for i = 1 to min(n, |D|) do
        d ← argmax over d_i ∈ D of {dist(d_i, R)}
        R ← R ∪ {d}; D ← D \ {d}
where dist is a distance metric, seed is the seed story, D is a set of relevant updates, d is a document, n is the desired number of updates to select, and R is the list of articles ordered by novelty.
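The greedy selection loop recited in the independent claims can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not a definitive implementation. The claims leave `dist` abstract, so `jaccard_distance_to_set` is a hypothetical distance (one minus the word overlap between a document and everything already selected) supplied only to make the example runnable, and documents are modeled as plain strings.

```python
from typing import Callable, List, Sequence


def rank_news_by_novelty(
    dist: Callable[[str, Sequence[str]], float],
    seed: str,
    updates: Sequence[str],
    n: int,
) -> List[str]:
    """Greedily pick up to n updates, each the farthest from everything chosen so far."""
    ranked = [seed]                # R <- seed
    remaining = list(updates)      # working copy of D, so the caller's data is untouched
    for _ in range(min(n, len(remaining))):
        # d <- argmax over the remaining documents of dist(d_i, R)
        best = max(remaining, key=lambda doc: dist(doc, ranked))
        ranked.append(best)        # R <- R ∪ {d}
        remaining.remove(best)     # D <- D \ {d}
    return ranked                  # R: the seed first, then updates in order of selection


def jaccard_distance_to_set(doc: str, selected: Sequence[str]) -> float:
    """Hypothetical dist: 1 - Jaccard overlap of doc's words with all words in selected."""
    words = set(doc.lower().split())
    seen = set()
    for text in selected:
        seen.update(text.lower().split())
    union = words | seen
    if not union:
        return 0.0
    return 1.0 - len(words & seen) / len(union)


# Hypothetical seed story and candidate updates.
seed = "storm hits coast"
updates = [
    "storm hits coast again",
    "shelters open for residents",
    "storm hits coast shelters open",
]
ranked = rank_news_by_novelty(jaccard_distance_to_set, seed, updates, n=2)
```

Because each step measures distance against the whole of R rather than against the seed alone, an update that merely repeats an earlier selection scores low even if it differs from the seed. In the example, the third update is skipped because its words are already covered by the seed and the first selected update.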
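Claim 28 describes sliding a window over the words of a document, scoring each window against a seed story, and passing the scores through a median filter. The following Python sketch shows one way those two steps could fit together. The window size, the filter width, and the scoring rule (the share of window words that do not occur in the seed) are assumptions chosen for brevity; the claim's language model and distance metric are not specified in this section and are not reproduced here.

```python
from statistics import median
from typing import List


def window_scores(doc: str, seed: str, window: int = 5) -> List[float]:
    """Score each sliding window of words by the share of words absent from the seed."""
    seed_words = set(seed.lower().split())
    words = doc.lower().split()
    last_start = max(len(words) - window, 0)
    scores = []
    for start in range(last_start + 1):
        chunk = words[start:start + window]
        if not chunk:
            break  # empty document: nothing to score
        unseen = sum(1 for w in chunk if w not in seed_words)
        scores.append(unseen / len(chunk))
    return scores


def median_filter(values: List[float], width: int = 3) -> List[float]:
    """Replace each value with the median of its neighbourhood (truncated at the edges)."""
    half = width // 2
    return [
        median(values[max(i - half, 0): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical usage: smooth the per-window scores of one update against a seed story.
smoothed = median_filter(window_scores("a b c x y z", "a b c", window=3))
```

The median filter suppresses isolated spikes, such as a single unfamiliar word inside otherwise familiar text, while preserving sustained runs of high scores that are more likely to mark a genuinely novel passage.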
Specification