Categorizing objects, such as documents and/or clusters, with respect to a taxonomy and data structures derived from such categorization
First Claim
1. A computer-implemented method to associate a semantic cluster with one or more categories of a predefined taxonomy, the method comprising:
a) accepting, by a computer system including at least one computer, a plurality of semantic clusters of re-occurring terms within a document, and having a frequency based on the reoccurrence of the term;
b) identifying, by the computer system based on the accepted clusters of re-occurring terms within the document, one or more concepts for the document, each concept identifying different re-occurring terms having identical meanings;
c) scoring, by the computer system, the identified one or more concepts, the score of each of the one or more concepts weighted by cluster frequency of each of the re-occurring terms identified by said concept;
d) identifying, by the computer system, a set of one or more categories using at least some of the one or more scored concepts to look up one or more categories in a concept-category index, wherein a category corresponds to a node of the predefined taxonomy, which defines a structured set of categories; and
e) associating, by the computer system, at least some of the one or more categories with the semantic cluster.
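Steps a) through e) can be illustrated with a minimal Python sketch. This is not the patented implementation: the cluster data, the `term_to_concept` synonym table, the `concept_category_index` contents, and the choice to sum cluster frequencies as the concept weight are all assumptions made for illustration.

```python
from collections import defaultdict

# Step a (hypothetical input): each semantic cluster is a re-occurring term
# paired with its frequency in the document.
clusters = {"automobile": 5, "car": 3, "loan": 2}

# Hypothetical synonym table: different terms with the same meaning share a concept.
term_to_concept = {"automobile": "CAR", "car": "CAR", "loan": "LOAN"}

# Hypothetical concept-category index; each category names a taxonomy node.
concept_category_index = {
    "CAR": ["Autos/Vehicles"],
    "LOAN": ["Finance/Credit"],
}


def score_concepts(clusters, term_to_concept):
    """Steps b and c: group terms into concepts and sum their cluster frequencies.

    Summing is one possible reading of "weighted by cluster frequency".
    """
    scores = defaultdict(float)
    for term, frequency in clusters.items():
        concept = term_to_concept.get(term)
        if concept is not None:
            scores[concept] += frequency
    return dict(scores)


def lookup_categories(concept_scores, index, top_n=2):
    """Step d: look up categories for the top_n highest-scoring concepts."""
    ranked = sorted(concept_scores, key=concept_scores.get, reverse=True)[:top_n]
    categories = []
    for concept in ranked:
        for category in index.get(concept, []):
            if category not in categories:
                categories.append(category)
    return categories


concept_scores = score_concepts(clusters, term_to_concept)
categories = lookup_categories(concept_scores, concept_category_index)

# Step e: associate the selected categories with the accepted clusters.
association = {"clusters": sorted(clusters), "categories": categories}
```

Here "automobile" and "car" fall under one concept, so their frequencies combine before the index lookup. Using only the `top_n` concepts reflects the claim's "at least some of the one or more scored concepts"; the cutoff itself is an arbitrary choice.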
Abstract
A Website may be automatically categorized by (a) accepting Website information, (b) determining a set of scored clusters (e.g., semantic, term co-occurrence, etc.) for the Website using the Website information, and (c) determining at least one category (e.g., a vertical category) of a predefined taxonomy using at least some of the set of clusters.
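The abstract refers to a predefined taxonomy and to a vertical category within it. One way to picture such a structure is the sketch below, which stores each taxonomy node with a pointer to its parent. The node names, the parent-pointer layout, and the reading of "vertical category" as the top-level ancestor of a node are assumptions, not details taken from the patent.

```python
# Hypothetical taxonomy: each node maps to its parent; None marks a top-level node.
taxonomy_parent = {
    "Autos": None,
    "Autos/Vehicles": "Autos",
    "Autos/Vehicles/Hybrids": "Autos/Vehicles",
    "Finance": None,
    "Finance/Credit": "Finance",
}


def path_to_root(category, parents):
    """Return the chain of nodes from `category` up to its top-level ancestor."""
    if category not in parents:
        raise KeyError(f"{category!r} is not a node of the taxonomy")
    path = [category]
    while parents[path[-1]] is not None:
        path.append(parents[path[-1]])
    return path


def vertical_of(category, parents):
    """Assumption: the vertical category is the top-level ancestor of the node."""
    return path_to_root(category, parents)[-1]
```

For example, `vertical_of("Autos/Vehicles/Hybrids", taxonomy_parent)` walks two levels up and returns `"Autos"`.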
13 Claims
1. A computer-implemented method to associate a semantic cluster with one or more categories of a predefined taxonomy, the method comprising:
a) accepting, by a computer system including at least one computer, a plurality of semantic clusters of re-occurring terms within a document, and having a frequency based on the reoccurrence of the term;
b) identifying, by the computer system based on the accepted clusters of re-occurring terms within the document, one or more concepts for the document, each concept identifying different re-occurring terms having identical meanings;
c) scoring, by the computer system, the identified one or more concepts, the score of each of the one or more concepts weighted by cluster frequency of each of the re-occurring terms identified by said concept;
d) identifying, by the computer system, a set of one or more categories using at least some of the one or more scored concepts to look up one or more categories in a concept-category index, wherein a category corresponds to a node of the predefined taxonomy, which defines a structured set of categories; and
e) associating, by the computer system, at least some of the one or more categories with the semantic cluster.
Dependent claims: 2, 3, 4, 5, 6, 12, 13
7. A computer-implemented method to associate a property with one or more categories of a predefined taxonomy, the method comprising:
a) accepting, by a computer system including at least one computer, information about the property;
b) identifying, by the computer system, a set of one or more scored semantic clusters using the accepted property information, wherein each of the one or more scored semantic clusters comprise a re-occurring term within (A) a search session on a search engine, or (B) a document available on the World Wide Web, at a frequency based on the reoccurrence;
c) identifying, by the computer system based on the identified scored semantic clusters of re-occurring terms within the search session or document, one or more concepts for the property, each concept identifying different re-occurring terms having identical meanings;
d) scoring, by the computer system, the identified one or more concepts, the score of each of the one or more concepts weighted by cluster frequency of each of the re-occurring terms identified by said concept;
e) identifying, by the computer system, a set of one or more categories according to the scores of the one or more concepts; and
f) associating, by the computer system, at least some of the one or more categories with the property.
Dependent claims: 8, 9, 10, 11
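Two steps of claim 7 differ from claim 1: step b) derives scored clusters from the property information itself, and step e) selects categories according to concept scores rather than through a named index. The sketch below illustrates both under simplifying assumptions: a raw term count stands in for a scored semantic cluster, and the stopword list, the `min_count` and `min_score` thresholds, and the concept-to-category mapping are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list so common function words do not count as clusters.
STOPWORDS = {"a", "an", "and", "of", "the"}


def scored_clusters(text, min_count=2):
    """Step b (simplified): terms that re-occur in the text, scored by their count."""
    terms = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    counts = Counter(terms)
    return {term: n for term, n in counts.items() if n >= min_count}


def categories_by_score(concept_scores, concept_to_category, min_score):
    """Step e (simplified): keep categories of concepts scoring at least min_score."""
    selected = []
    # Highest-scoring concepts first; ties keep their original order.
    for concept, score in sorted(concept_scores.items(), key=lambda kv: -kv[1]):
        category = concept_to_category.get(concept)
        if score >= min_score and category is not None and category not in selected:
            selected.append(category)
    return selected


page_text = "Hybrid cars and electric cars. Compare hybrid car loans and car loans."
property_clusters = scored_clusters(page_text)
```

A single-occurrence term such as "electric" is dropped because it does not re-occur. Note that "car" and "cars" remain separate clusters here; merging them is the job of the concept-identification step c).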
Specification