Systems and methods for enhancing web-based searching
Abstract
A system for enhancing web-based searching is provided. Categorizing and clustering techniques are used to optimize searching. Businesses are classified using a control group of predetermined categories. The predetermined categories may be SIC codes or headings that are used to describe business activities. The website address for a business listed in the control group is determined, and the content of the business's website is extracted. The extracted content is associated with the predetermined category that the business is classified under. The extracted content is used to further enhance the overall classification scheme. The system may compare and match the extracted content with the content of other businesses' websites that are similarly categorized. If a relevant keyword match is identified in several of the websites, the keyword may be used to update the classification scheme. A new category or sub-category can be created based on this keyword. Furthermore, when a search is performed, the search results are organized by these categories, and using various processes, the most common results are kept and the less relevant results are discarded.
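The keyword-matching step described in the abstract can be illustrated with a small sketch. This is not the patented implementation; it only assumes that "a relevant keyword match in several websites" means a word that appears on at least `min_sites` websites within one category. The function names, the stopword list, and the threshold are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "and", "a", "of", "for", "in", "to", "we", "our", "is"}


def tokenize(text):
    """Return the set of lowercase non-stopword words on one website."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def candidate_subcategories(site_texts, existing_terms, min_sites=2):
    """Return keywords shared by at least `min_sites` websites in one category.

    Keywords already present in the classification scheme are skipped, so the
    result is a list of candidates for new categories or sub-categories.
    """
    counts = Counter()
    for text in site_texts:
        # tokenize() returns a set, so each website counts a word only once.
        counts.update(tokenize(text))
    return sorted(
        w for w, n in counts.items() if n >= min_sites and w not in existing_terms
    )


# Made-up website content for three businesses filed under one category.
plumbers = [
    "Emergency plumbing and drain cleaning",
    "Drain cleaning, water heaters",
    "Residential plumbing, water heaters installed",
]
print(candidate_subcategories(plumbers, existing_terms={"plumbing"}))
```

With this sample data, words such as "drain" and "heaters" occur on two websites each and would be proposed as sub-category candidates, while "plumbing" is skipped because it is already part of the scheme.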
95 Claims
1. An information gathering system for optimizing searching comprising:
a persistent storage storing data concerning entities, where each entity is stored according to a classification scheme that includes one or more predefined classifications;
a data extraction tool that extracts website content for a plurality of the entities associated with at least one of the predefined classifications; and
a content analyzer analyzing the extracted content, where the analyzed content is used to update the classification scheme.
Dependent claims: 2-27.

28. A method of optimizing searching in a data processing system comprising:
storing data concerning entities, where each entity is stored based on a classification scheme that includes one or more predefined classifications;
extracting website content for a plurality of the entities associated with at least one of the predefined classifications; and
analyzing the extracted content, where the analyzed content is used to update the classification scheme.
Dependent claims: 29, 30, 34, 37, 38, 39, 43, 44, 45, 46, 50, 51.

31. (canceled).
32. (canceled).
33. (canceled).
35. (canceled).
36. (canceled).
40. (canceled).
41. (canceled).
42. (canceled).
47. (canceled).
48. (canceled).
49. (canceled).

52. A system for optimizing searching comprising:
means for storing data concerning entities, where each entity is stored according to a classification scheme that includes one or more predefined classifications;
means for extracting website content for a plurality of the entities associated with at least one of the predefined classifications; and
means for analyzing the extracted content, where the analyzed content is used to update the classification scheme.

53. A method of optimizing searching in a data processing system comprising:
storing data concerning entities, where each entity is stored according to a classification scheme that includes one or more predefined classifications;
storing website content for a plurality of the classified entities associated with at least one of the predefined classifications, where the website content is stored according to the classification scheme;
processing at least a portion of the website content to update the classification scheme;
searching for unclassified entities using at least a portion of the website content;
classifying the unclassified entities according to the classification scheme by identifying relationships between the unclassified entities and the classified entities; and
using the classification scheme, clustering search results.
Dependent claims: 54, 57, 59, 60, 61.

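For illustration only, the classifying step of claim 53 could be sketched as follows. The claim does not say how "relationships between the unclassified entities and the classified entities" are identified; this sketch assumes keyword overlap measured by Jaccard similarity, and the function names, data layout, and threshold are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(entity_terms, classified, threshold=0.2):
    """Assign an unclassified entity to the category of its closest match.

    `classified` maps a category name to a list of keyword sets, one per
    already-classified entity. Returns None when no classified entity is
    similar enough, which leaves the entity unclassified.
    """
    best_category, best_score = None, 0.0
    for category, term_sets in classified.items():
        score = max((jaccard(entity_terms, t) for t in term_sets), default=0.0)
        if score > best_score:
            best_category, best_score = category, score
    return best_category if best_score >= threshold else None


# Made-up keyword sets taken from the websites of classified entities.
classified = {
    "Plumbers": [{"drain", "pipe", "leak"}],
    "Florists": [{"roses", "bouquet", "delivery"}],
}
print(classify({"pipe", "leak", "repair"}, classified))
print(classify({"tax", "audit"}, classified))
```

The threshold keeps weak matches from forcing an entity into an unrelated category; a production system would likely weight keywords rather than treat them all equally.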
55. (canceled).
56. (canceled).
58. (canceled).
62. (canceled).
63. (canceled).
64. (canceled).
65. (canceled).
66. (canceled).
67. (canceled).
68. (canceled).
69. (canceled).

70. A method of optimizing searching in a data processing system comprising:
obtaining information about the entities that are included in a control group, where the control group includes categorized entities;
searching for unclassified entities using the information about the entities in the control group; and
classifying the unclassified entities based on the information about the entities in the control group.
Dependent claims: 71, 72, 73, 74, 75, 78, 79, 80, 81.

76. (canceled).
77. (canceled).

82. A method of optimizing searching in a data processing system comprising:
storing information about the entities that are included in a control group, where the control group includes categorized entities stored according to a classification scheme;
responding to a search request by processing search results responsive to the request; and
clustering the search results according to the classification scheme by comparing information about the control group with the search results.
Dependent claims: 94, 95.

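The clustering step of claim 82 can be pictured with a minimal sketch. It assumes, purely for illustration, that the "information about the control group" is a set of keywords per category and that each search result goes to the category with the largest keyword overlap. The names `cluster_results` and `category_terms` and the "Uncategorized" bucket are hypothetical.

```python
from collections import defaultdict


def cluster_results(results, category_terms):
    """Group search results by the category whose keywords they match best.

    `results` is a list of (title, text) pairs; `category_terms` maps each
    category in the classification scheme to a set of control-group keywords.
    Results that match no category are placed under "Uncategorized".
    """
    clusters = defaultdict(list)
    for title, text in results:
        words = set(text.lower().split())
        overlap = {cat: len(words & terms) for cat, terms in category_terms.items()}
        best = max(overlap, key=overlap.get) if overlap else None
        if best is None or overlap[best] == 0:
            best = "Uncategorized"
        clusters[best].append(title)
    return dict(clusters)


# Made-up control-group keywords and search results.
category_terms = {"Plumbers": {"drain", "pipe"}, "Florists": {"roses", "bouquet"}}
results = [
    ("A", "pipe and drain repair"),
    ("B", "fresh roses daily"),
    ("C", "tax advice"),
]
print(cluster_results(results, category_terms))
```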
83. (canceled).
84. (canceled).
85. (canceled).
86. (canceled).
87. (canceled).
88. (canceled).
89. (canceled).
90. (canceled).
91. (canceled).
92. (canceled).
93. (canceled).
Specification