Rule based product classification
First Claim
1. A non-transitory computer-readable medium embodying a program executable in a computing device, the program comprising a set of instructions that, when executed by a processor of the computing device, cause the computing device to at least:
facilitate access to a catalog of items, the catalog of items comprising an amount of unstructured text for describing each item, wherein each item corresponds to at least one category;
identify a plurality of keywords expressed within the amount of unstructured text with respect to a selected category;
select a plurality of subsets from the plurality of keywords, wherein each of the subsets comprises a unique combination of the plurality of keywords with respect to each other and wherein each of the subsets includes a predetermined number of search terms;
generate a respective rule for each corresponding subset, wherein an application of each respective rule specifies a respective binary result depending upon whether the corresponding subset is included in a prospective seller product description;
apply each respective rule to the prospective seller product description to determine whether a predefined number of the rules indicate a like binary result for indicating whether the prospective seller product description expresses an item within the selected category;
calculate a defect rate corresponding to each respective rule, the defect rate representing a percentage of a plurality of product descriptions correctly classified with each respective rule; and
update each respective rule based at least in part on the corresponding defect rate.
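The rule-application step of this claim can be sketched as a simple vote. This is a minimal illustration, not the patented implementation: the names `tokenize`, `apply_rule`, `in_category` and `min_agreeing` are hypothetical, a rule is represented as the frozen set of keywords it tests for, and "a predefined number of the rules indicate a like binary result" is read here as "at least `min_agreeing` rules return True".

```python
import re
from typing import FrozenSet, Iterable, Set

# Assumption: a rule is modeled as the keyword subset it checks for.
Rule = FrozenSet[str]


def tokenize(description: str) -> Set[str]:
    """Lowercase the text and keep runs of letters and digits as tokens."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))


def apply_rule(rule: Rule, description: str) -> bool:
    """Binary result: True only if every keyword of the subset is present."""
    return rule <= tokenize(description)


def in_category(rules: Iterable[Rule], description: str, min_agreeing: int) -> bool:
    """Count the rules that fire and compare against the predefined number."""
    votes = sum(apply_rule(rule, description) for rule in rules)
    return votes >= min_agreeing


rules = [
    frozenset({"usb", "cable"}),
    frozenset({"usb", "charger"}),
    frozenset({"cable", "charger"}),
]
description = "Braided USB cable, 2 m, fast charging"
print(in_category(rules, description, min_agreeing=1))
print(in_category(rules, description, min_agreeing=2))
```

In this example only the first rule fires, because the description contains "charging" rather than the exact keyword "charger"; exact token matching is a simplification chosen for the sketch.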
Abstract
Disclosed are various embodiments for an item categorizer. The item categorizer is configured to parse, in at least one computing device, at least a plurality of product descriptions of a like item category for identifying a plurality of keywords with regard to the like item category. Furthermore, the item categorizer selects a plurality of subsets of keywords from the plurality of keywords, wherein each subset of keywords comprises a unique combination of the plurality of keywords with respect to each other. Moreover, the item categorizer is configured to generate a respective rule for each corresponding subset of keywords, wherein an application of each respective rule specifies a respective binary result depending upon whether the corresponding subset of keywords is included in a seller product description.
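The pipeline summarized in the abstract (keywords, unique keyword subsets, one binary rule per subset) might look roughly like the sketch below. Everything here is an assumption made for illustration: the abstract does not say how keywords are identified, so `extract_keywords` simply ranks tokens by how many descriptions contain them, and `STOPWORDS`, `keyword_subsets` and `make_rule` are hypothetical names.

```python
import re
from collections import Counter
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List

# Assumption: a tiny stopword list; a real system would use something richer.
STOPWORDS = frozenset({"a", "an", "and", "for", "in", "of", "the", "to", "with"})


def tokenize(text: str) -> set:
    """Lowercase the text and keep runs of letters and digits as tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extract_keywords(descriptions: Iterable[str], top_n: int = 4) -> List[str]:
    """Rank tokens by the number of descriptions in which they appear."""
    counts = Counter()
    for text in descriptions:
        counts.update(tokenize(text) - STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def keyword_subsets(keywords: Iterable[str], size: int = 2) -> List[FrozenSet[str]]:
    """Every unique combination of `size` keywords, one frozenset per subset."""
    return [frozenset(combo) for combo in combinations(keywords, size)]


def make_rule(subset: FrozenSet[str]) -> Callable[[str], bool]:
    """Build a rule that returns True when the whole subset is present."""
    def rule(description: str) -> bool:
        return subset <= tokenize(description)
    return rule


category_descriptions = [
    "USB cable for phone",
    "USB cable with charger",
    "USB charger",
]
keywords = extract_keywords(category_descriptions, top_n=3)
rules = [make_rule(subset) for subset in keyword_subsets(keywords, size=2)]
print(sorted(keywords), len(rules))
```

Using `itertools.combinations` guarantees that no two subsets contain the same keywords, which matches the "unique combination" wording; the fixed `size` parameter stands in for the predetermined number of terms per subset.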
23 Claims
1. A non-transitory computer-readable medium embodying a program executable in a computing device, the program comprising a set of instructions that, when executed by a processor of the computing device, cause the computing device to at least:
facilitate access to a catalog of items, the catalog of items comprising an amount of unstructured text for describing each item, wherein each item corresponds to at least one category;
identify a plurality of keywords expressed within the amount of unstructured text with respect to a selected category;
select a plurality of subsets from the plurality of keywords, wherein each of the subsets comprises a unique combination of the plurality of keywords with respect to each other and wherein each of the subsets includes a predetermined number of search terms;
generate a respective rule for each corresponding subset, wherein an application of each respective rule specifies a respective binary result depending upon whether the corresponding subset is included in a prospective seller product description;
apply each respective rule to the prospective seller product description to determine whether a predefined number of the rules indicate a like binary result for indicating whether the prospective seller product description expresses an item within the selected category;
calculate a defect rate corresponding to each respective rule, the defect rate representing a percentage of a plurality of product descriptions correctly classified with each respective rule; and
update each respective rule based at least in part on the corresponding defect rate.
Dependent claims: 2, 3, 4
5. A system, comprising:
at least one computing device comprising a processor and a memory; and
an application stored in the memory and comprising instructions executable by the processor of the at least one computing device, wherein the application, when executed, causes the at least one computing device to at least:
access a catalog that organizes a plurality of items into a categorical hierarchy, wherein each item is expressed in terms of a product description;
parse at least a portion of the product descriptions associated with a selected category to generate a plurality of keywords for the selected category;
select a subset of the plurality of keywords;
generate at least one rule based on the subset, wherein an application of the at least one rule specifies a respective binary result depending upon whether the subset of the plurality of keywords is included in a prospective seller product description;
apply the at least one rule to the prospective seller product description to determine that the prospective seller product description expresses an item within the selected category;
calculate a percentage of a plurality of product descriptions correctly categorized with the at least one rule; and
update the at least one rule based at least in part on the calculated percentage.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13
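The catalog-access step of claim 5 presupposes a categorical hierarchy from which the descriptions of a selected category can be gathered before parsing. A minimal sketch of one way to model that follows. The `Category` class, `find_category` and `descriptions_under` are hypothetical, and treating descriptions in subcategories as "associated with" the selected category is an assumption of this sketch, not something the claim states.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Category:
    """Hypothetical catalog node: a name, its own items, and subcategories."""
    name: str
    descriptions: List[str] = field(default_factory=list)
    children: List["Category"] = field(default_factory=list)


def find_category(root: Category, name: str) -> Optional[Category]:
    """Depth-first search for the category with the given name."""
    if root.name == name:
        return root
    for child in root.children:
        found = find_category(child, name)
        if found is not None:
            return found
    return None


def descriptions_under(category: Category) -> Iterator[str]:
    """Yield the category's own descriptions, then those of its descendants."""
    yield from category.descriptions
    for child in category.children:
        yield from descriptions_under(child)


catalog = Category("Electronics", children=[
    Category("Cables", ["HDMI cable 2 m", "Ethernet patch cable"], [
        Category("USB Cables", ["Braided USB cable"]),
    ]),
    Category("Phones", ["Unlocked smartphone 128 GB"]),
])

selected = find_category(catalog, "Cables")
print(list(descriptions_under(selected)))
```

The collected descriptions would then feed the keyword-generation step; restricting the traversal to the selected node only is an equally plausible reading.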
14. A method, comprising:
parsing, in at least one computing device, at least a plurality of product descriptions of a like item category for identifying a plurality of keywords associated with the like item category;
selecting, in the at least one computing device, a plurality of subsets of keywords from the plurality of keywords, wherein each of the subsets of keywords comprises a unique combination of the plurality of keywords with respect to each other;
generating, in the at least one computing device, a respective rule for each corresponding subset of keywords, wherein an application of each respective rule specifies a respective binary result depending upon whether the corresponding subset of keywords is included in a seller product description;
calculating, in the at least one computing device, a percentage of the plurality of product descriptions correctly classified by the respective rule to generate a defect rate for the respective rule; and
updating, in the at least one computing device, the respective rule based at least in part on the defect rate.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23
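The last two steps of the method (scoring each rule, then updating it) can be illustrated as follows. The claim derives a defect rate from the percentage of descriptions a rule classifies correctly but does not give a formula, so this sketch assumes defect rate = 100 minus percent correct. It also assumes that "updating" means discarding rules whose defect rate exceeds a threshold; other updates, such as swapping keywords, would fit the claim wording equally well. `percent_correct`, `update_rules` and `max_defect_rate` are hypothetical names.

```python
import re
from typing import FrozenSet, List, Sequence, Tuple

Rule = FrozenSet[str]            # assumption: a rule is its keyword subset
Labeled = Tuple[str, bool]       # (description, truly in the category?)


def tokenize(text: str) -> set:
    """Lowercase the text and keep runs of letters and digits as tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def percent_correct(rule: Rule, labeled: Sequence[Labeled]) -> float:
    """Share of labeled descriptions on which the rule's result is right."""
    if not labeled:
        raise ValueError("need at least one labeled description")
    correct = sum((rule <= tokenize(text)) == label for text, label in labeled)
    return 100.0 * correct / len(labeled)


def update_rules(rules: Sequence[Rule], labeled: Sequence[Labeled],
                 max_defect_rate: float) -> List[Rule]:
    """Keep only rules whose assumed defect rate is within the threshold."""
    kept = []
    for rule in rules:
        defect_rate = 100.0 - percent_correct(rule, labeled)  # assumed formula
        if defect_rate <= max_defect_rate:
            kept.append(rule)
    return kept


labeled = [
    ("USB cable 1m", True),
    ("USB charger", False),
    ("HDMI cable", False),
    ("USB-C cable braided", True),
]
rules = [frozenset({"usb", "cable"}), frozenset({"cable"}), frozenset({"usb"})]
for rule in rules:
    print(sorted(rule), percent_correct(rule, labeled))
print(update_rules(rules, labeled, max_defect_rate=10.0))
```

On this toy data the two-keyword rule classifies all four descriptions correctly, while each single-keyword rule misclassifies one of them, so only the two-keyword rule survives a 10 percent threshold.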
Specification