Data gathering and distribution system
Abstract
A system and method for gathering and distributing data. The system is for extracting information from the world wide web and classifying the information in accordance with certain profiles. The information may include business intelligence, which may be categorized according to its relevance to predefined industry profiles or company profiles. The information may be further categorized according to its relevance to particular countries. If the information relates to a new company, the system builds a new company profile based upon the information. Users may create a user profile containing their information preferences, such as industry groups or particular countries or companies, and the system provides reports or alerts to the users referencing extracted information that is filtered by the user profile.
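The profile-based filtering described in the abstract can be illustrated with a minimal sketch. The record fields, the `UserProfile` class, and the rule that a record is reported when it matches any one stated preference are all assumptions made for illustration; the abstract does not specify how preferences are combined.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One piece of extracted information (hypothetical fields)."""
    title: str
    industry: str
    country: str
    company: str


@dataclass
class UserProfile:
    """A user's information preferences (hypothetical structure)."""
    industries: set = field(default_factory=set)
    countries: set = field(default_factory=set)
    companies: set = field(default_factory=set)


def matches(profile: UserProfile, record: Record) -> bool:
    # Assumption: a record is relevant if it matches ANY stated preference.
    return (
        record.industry in profile.industries
        or record.country in profile.countries
        or record.company in profile.companies
    )


def build_report(profile: UserProfile, records: list) -> list:
    """Return the titles of the records that pass the profile filter."""
    return [r.title for r in records if matches(profile, r)]


records = [
    Record("Copper output rises", "Mining", "Chile", "Acme Mining Ltd"),
    Record("Bank opens new branch", "Banking", "Canada", "Example Bank"),
    Record("Retail sales flat", "Retail", "France", "Sample Stores"),
]
profile = UserProfile(industries={"Mining"}, countries={"Canada"})
report = build_report(profile, records)
```

Here `report` lists the mining story (matched by industry) and the banking story (matched by country), while the retail story is filtered out.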
37 Claims
1. A system for gathering and classifying relevant data from the world wide web, the system comprising:
an extractor for crawling the world wide web and producing extracted information from at least one website;
an industry database containing a list of industry groups;
a company database containing profiles of companies;
an information database containing data records, each of said data records having an associated industry group selected from said list of industry groups;
a classifier for receiving said extracted information, said classifier including a company comparison component for determining if said extracted information relates to a company profiled in said company database, and an industry component for determining if said extracted information relates to an industry listed in said list of industry groups, and a classification component responsive to said company comparison component and said industry comparison component for storing said extracted information in said information database as one of said data records.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system for gathering and classifying relevant data from the world wide web, the system comprising:
an extractor for navigating the world wide web and extracting information from at least one website;
a first memory storing a categorization scheme, said categorization scheme defining a plurality of categories;
a second memory storing a plurality of entity profiles, each entity profile being associated with at least one of said categories;
a third memory storing information records, each of said information records having an associated category selected from said plurality of categories;
a classifier for receiving said extracted information, said classifier including an entity comparison component for determining if said extracted information relates to one of said entity profiles, and a categorization component for determining if said extracted information relates to one of said categories, and a classification component responsive to said entity comparison component and said categorization component for storing said extracted information in said third memory as one of said information records.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35
24. A method for gathering and classifying information from the world wide web, the method comprising the steps of:
providing a categorization scheme of industry groups;
extracting information from a website;
classifying said extracted information by determining if said extracted information is relevant to at least one of said industry groups; and
linking said extracted information with said at least one of said industry groups.
Dependent claims: 36, 37
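As an illustration only, the classifier arrangement recited in claim 1 (a company comparison component, an industry comparison component, and a classification component that stores the result) might be sketched as follows. The keyword tables, the whole-word matching rule, the fallback to a matched company's industry, and all names are hypothetical; the claims do not prescribe any particular matching technique.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical industry database: industry group -> indicative keywords.
INDUSTRY_KEYWORDS = {
    "Mining": ("mine", "ore"),
    "Banking": ("bank", "loan"),
}

# Hypothetical company database: company name -> its industry group.
COMPANY_PROFILES = {"Acme Mining Ltd": "Mining"}


@dataclass
class DataRecord:
    text: str
    industry: str
    company: Optional[str]


def compare_company(text: str) -> Optional[str]:
    """Company comparison: return a profiled company named in the text."""
    lowered = text.lower()
    for name in COMPANY_PROFILES:
        if name.lower() in lowered:
            return name
    return None


def compare_industry(text: str) -> Optional[str]:
    """Industry comparison: return the first industry whose keyword appears
    as a whole word in the text (a deliberately naive assumption)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    for industry, keywords in INDUSTRY_KEYWORDS.items():
        if any(keyword in words for keyword in keywords):
            return industry
    return None


def classify(text: str, information_db: list) -> Optional[DataRecord]:
    """Classification: combine both comparisons and store a data record."""
    company = compare_company(text)
    industry = compare_industry(text)
    if industry is None and company is not None:
        # Assumption: fall back to the industry in the company's profile.
        industry = COMPANY_PROFILES[company]
    if industry is None:
        return None  # Not relevant to any listed industry; nothing stored.
    record = DataRecord(text, industry, company)
    information_db.append(record)
    return record


information_db = []
first = classify("Acme Mining Ltd opens a new copper mine", information_db)
second = classify("Local bank raises loan rates", information_db)
third = classify("Weather will be sunny", information_db)
```

In this sketch, every stored record carries an industry group drawn from the industry list, and text that matches neither a company nor an industry is discarded rather than stored.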
Specification