System and method for automating categorization and aggregation of content from network sites
First Claim
1. A method for providing aggregated content from a network, the method being implemented by one or more processors that perform steps comprising:
creating a category definition for each of a plurality of categories, the category definition for each category comprising a category name and one or more terms, wherein each of the one or more terms associated with a particular category definition (1) identifies a term that is pertinent in determining whether an article should be associated with a particular category associated with the particular category definition, and (2) excludes any terms that are not pertinent in determining whether an article should be associated with the particular category associated with the particular category definition;
retrieving a plurality of articles from over a network;
analyzing each article of the plurality of articles, in order to associate each article of the plurality of articles, with one or more categories in the plurality of categories, wherein analyzing each article includes associating a particular article of the plurality of articles with a corresponding category based on (1) a presence of one or more character strings that appear in the particular article, wherein each of the one or more character strings (i) correspond to a particular term in the one or more terms in the category definition of the corresponding category, and (ii) is not the category name of the corresponding category, and (2) one or more additional criteria to weight the presence of said one or more character strings over other character strings that correspond to a term of a category definition of another category;
prior to analyzing each article, of the plurality of articles, assigning a portion of each of a plurality of web pages to one or more corresponding categories in the plurality of categories, so that each category in the plurality of categories is assigned to at least a portion of the plurality of web pages;
displaying, on each of the plurality of web pages, at least a portion of individual articles that have been associated with the one or more corresponding categories assigned to the web page, wherein each web page, of the plurality of web pages, displays at least one category name that is assigned thereto.
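The matching step recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `CategoryDefinition` class, the per-term numeric weights, and the `threshold` parameter are all assumptions made for illustration, with the weights standing in for the "additional criteria" that the claim leaves unspecified. Note that the category name itself is never used as evidence; only the terms in the definition are matched.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CategoryDefinition:
    """Hypothetical category definition: a name plus pertinent terms."""
    name: str
    # Assumption: each pertinent term carries an illustrative weight.
    terms: Dict[str, float] = field(default_factory=dict)


def count_term(text: str, term: str) -> int:
    """Count whole-word, case-insensitive occurrences of a term in the text."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def score_article(text: str, category: CategoryDefinition) -> float:
    """Sum weight * occurrences over the category's terms (name is ignored)."""
    return sum(w * count_term(text, t) for t, w in category.terms.items())


def categorize(text: str, categories: List[CategoryDefinition],
               threshold: float = 1.0) -> List[str]:
    """Return names of categories scoring at or above threshold, best first."""
    scored = [(score_article(text, c), c.name) for c in categories]
    kept = [(s, n) for s, n in scored if s >= threshold]
    return [n for s, n in sorted(kept, key=lambda p: p[0], reverse=True)]


categories = [
    CategoryDefinition("San Francisco", {"Golden Gate": 2.0, "BART": 1.0}),
    CategoryDefinition("New York", {"Broadway": 2.0, "subway": 1.0}),
]
article = "BART added trains near the Golden Gate Bridge before a Broadway tour arrived."
print(categorize(article, categories))
```

Raising `threshold` narrows the result to the strongest category, which is one simple way to let terms of one category outweigh terms of another.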
Abstract
A plurality of content items are retrieved from multiple network sites. Content from each content item is programmatically analyzed in order to associate that content item with one or more categories. The one or more categories may be part of a larger set of predefined categories. A network page is assigned to one or more corresponding categories in the set of predefined categories. At least some content is provided on the network page using one or more content items that were associated with the one or more categories assigned to that network page.
70 Claims
23. A method for providing aggregated content from a network, the method being implemented by one or more processors that perform steps comprising:
retrieving a plurality of content items from one or more network sites;
for each of the plurality of content items that are retrieved, programmatically analyzing content contained in each content item in order to associate that content item with one or more categories in a plurality of categories,
wherein each of the one or more categories, in the plurality of categories, is associated with a category definition, wherein the category definition for each of the one or more categories comprises a category name and one or more terms that each (1) are pertinent in determining whether a content item should be associated with a particular category associated with the particular category definition, and (2) exclude any terms that are not pertinent in determining whether a content item should be associated with the particular category associated with the particular category definition,
wherein the plurality of categories include categories that correspond to a plurality of geographic locations, and
wherein programmatically analyzing content contained in each content item includes determining that at least some of the content items are each associated with a corresponding geographic location by identifying words, terms, or names in the analyzed content other than a proper name or zip code of the geographic location, and
wherein analyzing each content item includes evaluating additional criteria to weight the presence of the identified words, terms or names in the analyzed content over other words, terms or names that correspond to a term of a category definition of another category.
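Claim 23 associates a content item with a geographic location using words other than the location's proper name or zip code. The following Python sketch shows one way that could look. The `GEO_TERMS` table, the landmark terms in it, and the `min_hits` rule are hypothetical choices made for illustration; the claim does not specify how the weighting criteria are computed.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical geographic category definitions. The terms deliberately omit
# the proper name and zip code of each location.
GEO_TERMS: Dict[str, Set[str]] = {
    "Seattle": {"space needle", "pike place market", "puget sound"},
    "Chicago": {"wrigley field", "the loop", "lake michigan"},
}


def matched_terms(text: str, terms: Set[str]) -> Set[str]:
    """Return the subset of terms that appear as whole words in the text."""
    lowered = text.lower()
    return {t for t in terms
            if re.search(r"\b" + re.escape(t) + r"\b", lowered)}


def infer_location(text: str,
                   geo_terms: Dict[str, Set[str]] = GEO_TERMS,
                   min_hits: int = 2) -> Optional[str]:
    """Pick the location with the most distinct term hits, if it has enough."""
    hits = {place: len(matched_terms(text, terms))
            for place, terms in geo_terms.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] >= min_hits else None


story = "Ferry traffic on Puget Sound slowed as crowds gathered near the Space Needle."
print(infer_location(story))
```

The sample story never mentions the city by name, yet two landmark terms are enough to place it. A single weak hit is treated as insufficient under the assumed `min_hits` rule.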
33. A method for providing aggregated content from a network, the method comprising the steps of:
(a) assigning each category, in a set of categories, with a corresponding network location, wherein the set of categories includes one thousand or more categories;
(b) retrieving a plurality of content items from one or more network sites;
(c) for each of the plurality of content items, programmatically analyzing each content item in order to associate that content item with one or more categories in the set, wherein programmatically analyzing comprises:
creating a category definition for each category in the set of categories, wherein the category definition for each category in the set of categories includes a category name and one or more terms, wherein each of the one or more terms associated with a particular category definition (1) identifies a term that is pertinent in determining whether a content item should be associated with a particular category associated with the particular category definition, and (2) excludes any terms that are not pertinent in determining whether a content item should be associated with the particular category associated with the particular category definition, and
associating each of the plurality of content items with at least one of the one or more categories based on (1) a presence of one or more character strings that appear in the particular article, wherein each of the one or more character strings (i) correspond to a particular term in the one or more terms in the category definition of the corresponding category, and (ii) is not the category name of the corresponding category, and (2) one or more additional criteria to weight the presence of said one or more character strings over other character strings that correspond to a term of a category definition of another category; and
(d) displaying one or more content items for each category in the set at the corresponding network location for that category.
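Steps (a) and (d) of claim 33 pair each category with a network location and then present the associated content items there. A minimal Python sketch of that bookkeeping follows. The URL scheme, the `example.com` base address, and the function names are assumptions for illustration only; the claim does not say how locations are formed.

```python
from typing import Dict, List, Tuple


def assign_locations(category_names: List[str],
                     base_url: str = "https://example.com/topics/") -> Dict[str, str]:
    """Step (a): map each category to a hypothetical URL derived from its name."""
    return {name: base_url + name.lower().replace(" ", "-")
            for name in category_names}


def build_pages(locations: Dict[str, str],
                associations: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Step (d): group item titles under the location of their category."""
    # Every category gets a page, even if no item has been associated yet.
    pages: Dict[str, List[str]] = {url: [] for url in locations.values()}
    for title, category in associations:
        if category in locations:  # items in unknown categories are skipped
            pages[locations[category]].append(title)
    return pages


locations = assign_locations(["Palo Alto", "Jazz"])
pages = build_pages(locations, [
    ("Council vote", "Palo Alto"),
    ("Festival lineup", "Jazz"),
    ("Road work", "Palo Alto"),
    ("Stray item", "Unknown"),
])
print(pages)
```

Because the mapping is built before any item is analyzed, the same structure would scale to the one thousand or more categories the claim recites.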
38. A non-transitory computer readable medium storing instructions for providing aggregated content from a network, wherein when executed by one or more processors, the instructions cause the one or more processors to perform the steps comprising:
creating a category definition for each of a plurality of categories, the category definition for each category comprising a category name and one or more terms, wherein each of the one or more terms associated with a particular category definition (1) identifies a term that is pertinent in determining whether an article should be associated with a particular category associated with the particular category definition, and (2) excludes any terms that are not pertinent in determining whether an article should be associated with the particular category associated with the particular category definition;
retrieving a plurality of articles from over a network;
analyzing each article of the plurality of articles, in order to associate each article of the plurality of articles, with one or more categories in the plurality of categories, wherein analyzing each article includes associating a particular article of the plurality of articles with a corresponding category based on (1) a presence of one or more character strings that appear in the particular article, wherein each of the one or more character strings (i) correspond to a particular term in the one or more terms in the category definition of the corresponding category, and (ii) is not the category name of the corresponding category, and (2) one or more additional criteria to weight the presence of said one or more character strings over other character strings that correspond to a term of a category definition of another category;
prior to analyzing each article, of the plurality of articles, assigning a portion of each of a plurality of web pages to one or more corresponding categories in the plurality of categories, so that each category in the plurality of categories is assigned to at least a portion of the plurality of web pages;
displaying, on each of the plurality of web pages, at least a portion of individual articles that have been associated with the one or more corresponding categories assigned to the web page, wherein each web page, of the plurality of web pages, displays at least one category name that is assigned thereto.
39. A method for providing aggregated content from a network, the method being implemented by one or more processors that perform steps comprising:
performing an analysis on a text content of each of a plurality of content items, wherein the plurality of content items are provided at a plurality of network locations on one or more network sites, wherein performing the analysis comprises creating a category definition for each category in a set of categories, wherein the category definition for each category in the set of categories includes a category name and one or more terms, wherein each of the one or more terms associated with a particular category definition (1) identifies a term that is pertinent in determining whether a content item should be associated with a particular category associated with the particular category definition, and (2) excludes any terms that are not pertinent in determining whether a content item should be associated with the particular category associated with the particular category definition;
for at least some of the plurality of content items, determining a geographic location that is pertinent to the text content of that content item based at least in part on the analysis, including identifying one or more words, terms or names that are associated with the geographic location but which are not a proper name of the geographic location, wherein the geographic location pertinent to a particular content item is determined by associating the particular content item with a category in the set of categories using the category definitions for the set of categories;
wherein performing an analysis on a text content includes evaluating additional criteria to weight the presence of the identified words, terms or names in the analyzed content over other words, terms or names that correspond to a term of a category definition of another category; and
generating a presentation for each of a plurality of geographic locations, wherein each presentation makes available at least a portion of one or more content items that have been determined to be pertinent to that geographic location.
45. A method for providing aggregated content from a network, the method being implemented by one or more processors and comprising the steps of:
retrieving a plurality of content items from one or more network sites;
for each of the plurality of content items that are retrieved, analyzing content contained in each content item in order to associate that content item with one or more categories in a plurality of categories, wherein each category in the plurality of categories is associated with a category definition that comprises a category name and one or more terms, wherein each of the one or more terms associated with a particular category definition (1) identifies a term that is pertinent in determining whether a content item should be associated with a particular category associated with the particular category definition, and (2) excludes any terms that are not pertinent in determining whether a content item should be associated with the particular category associated with the particular category definition, and wherein analyzing content contained in each content item includes (i) comparing words present within each content item with the one or more terms associated with each category definition, and (ii) evaluating additional criteria to weight the presence of said words present within each content item with words that correspond to a term of a category definition of another category; and
wherein the plurality of categories include categories that correspond to a plurality of names of persons or places, and wherein analyzing content contained in each content item includes determining that at least some of the content items are each associated with one or more of the plurality of names, including associating individual content items with a corresponding one of the plurality of names based in part on identification of a character string that (i) corresponds to a term that is defined as being pertinent to the corresponding name, (ii) but not an explicit statement of the corresponding name.
52. A method for providing aggregated content from a network, the method being implemented using one or more processors that perform steps comprising:
retrieving a plurality of content items from one or more network sites;
for each of the plurality of content items that are retrieved, analyzing content contained in each content item by:
for at least some of the plurality of content items, determining a geographic location that is pertinent to the text content of that content item based at least in part on the analysis, including identifying one or more words, terms or names that are associated with the geographic location but which are not a proper name of the geographic location, wherein each category in a plurality of categories is associated with a category definition that comprises a category name and one or more terms, wherein each of the one or more terms associated with a particular category definition (1) identifies a term that is pertinent in determining whether a content item should be associated with a particular category associated with the particular category definition, and (2) excludes any terms that are not pertinent in determining whether a content item should be associated with the particular category associated with the particular category definition;
determining that at least some of the content items are associated with one of a plurality of current event topics by comparing words present within each content item with the one or more terms associated with each category definition, wherein at least a portion of the plurality of categories are each associated with each of the plurality of current event topics; and
evaluating additional criteria to weight the presence of the identified words, terms or names that are associated with a geographic location over other words, terms or names that are associated with another geographic location; and
wherein the method further comprises:
generating a presentation for each of a plurality of geographic locations, wherein each presentation makes available at least a portion of one or more content items that have been determined to be pertinent to that geographic location and which are associated with one or more of the current event topics, so that the presentation displays content items that are about current events that pertain to the geographic location of that presentation.
59. A method for providing aggregated content from a network, the method being implemented using one or more processors that perform steps comprising:
for each of a plurality of content items, programmatically analyzing a text of each content item in order to determine a subject of the content item, including identifying at least one term that is required for determining the subject, and one or more terms that are pertinent but not required for determining the subject, wherein each of the one or more terms excludes any terms that are not pertinent in determining the subject, wherein programmatically analyzing includes evaluating one or more additional criteria to weight the presence of the text of each content item that corresponds to a subject over other text that corresponds to a different subject;
associating the content item with at least one of a presentation or a network location that is used to present content about the subject or a category of the subject; and
making at least a portion of the content item available from a presentation provided at the network location.
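Claim 59 distinguishes terms that are required for determining a subject from terms that are pertinent but not required. One possible reading is sketched below in Python. The `SubjectDefinition` class, the rule that all required terms must be present, and the simple count-based score are assumptions for illustration; the claim does not fix any of them.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SubjectDefinition:
    """Hypothetical subject definition with required and optional terms."""
    subject: str
    required: List[str]
    optional: List[str] = field(default_factory=list)


def has_term(text_lower: str, term: str) -> bool:
    """True if the term occurs as a whole word in the lowercased text."""
    return re.search(r"\b" + re.escape(term.lower()) + r"\b", text_lower) is not None


def subject_score(text: str, definition: SubjectDefinition) -> int:
    """0 if any required term is missing; otherwise 1 plus optional-term hits."""
    lowered = text.lower()
    if not all(has_term(lowered, t) for t in definition.required):
        return 0
    return 1 + sum(has_term(lowered, t) for t in definition.optional)


def determine_subject(text: str,
                      definitions: List[SubjectDefinition]) -> Optional[str]:
    """Return the best-scoring subject, or None when nothing qualifies."""
    if not definitions:
        return None
    best = max(definitions, key=lambda d: subject_score(text, d))
    return best.subject if subject_score(text, best) > 0 else None


definitions = [
    SubjectDefinition("Baseball", ["inning"], ["pitcher", "home run"]),
    SubjectDefinition("Cricket", ["wicket"], ["bowler", "innings"]),
]
report = "The pitcher left in the seventh inning after a home run."
print(determine_subject(report, definitions))
```

Under this reading, optional terms alone never establish a subject; they only strengthen a match that a required term has already opened.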
62. A method for providing aggregated content from a network, the method being implemented using one or more processors and comprising the steps of:
defining a plurality of geographic categories, each geographic category corresponding to a geographic location, each geographic category including a category definition that comprises (i) one or more category names that include a proper name of a geographic location that corresponds to that category, and (ii) one or more words, terms, and/or names other than the one or more category names, wherein each of the one or more words, terms, and/or names associated with a particular category definition (1) identifies a term that is pertinent in determining whether a content item should be associated with a particular category associated with the particular category definition, and (2) excludes any terms that are not pertinent in determining whether a content item should be associated with the particular category associated with the particular category definition;
retrieving a plurality of content items from one or more network sites;
for each of the plurality of content items that are retrieved, programmatically analyzing content contained in each content item in order to associate that content item with one or more of the plurality of geographic categories, wherein programmatically analyzing the content of each content item includes identifying, from the content of the analyzed content item, one or more words, terms, and/or names that are part of the definition of the associated geographic category and which are different than the name of the associated geographic category, wherein programmatically analyzing the content of each item includes evaluating additional criteria to weight the presence of the one or more words, terms, and/or names that are part of the definition of the associated category over other words, terms, and/or names that are part of a definition of another category.
70. A method for providing aggregated content from a network, the method being performed using one or more processors and comprising steps of:
for at least some of a plurality of geographic categories, defining that geographic category using one or more words, terms or names that are (i) known to be associated with a corresponding geographic location of that geographic category, (ii) but are not a proper name of the geographic category, wherein each of the one or more words, terms or names associated with a particular category (1) is pertinent in determining whether a content item should be associated with the particular category, and (2) excludes any terms that are not pertinent in determining whether a content item should be associated with the particular category;
performing an analysis on a text content of each of a plurality of content items, wherein the plurality of content items are provided at a plurality of network locations on one or more network sites;
wherein performing the analysis includes identifying, from the text content of individual content items, one of the one or more words, terms or names that are used to define one or more of the geographic categories, and wherein performing the analysis includes evaluating one or more additional criteria to weight the presence of the one or more words, terms or names that are used to define one or more of the geographic categories over other words, terms or names that are used to define another geographic category;
as a result of performing the analysis, associating one or more content items with one or more of the plurality of geographic categories;
generating a presentation for at least some of the plurality of geographic categories, wherein each presentation makes available at least a portion of one or more content items that are associated with that geographic category.
Specification