ATTRIBUTE FILL USING TEXT EXTRACTION
Abstract
Systems and methods involve filling missing attribute values from unstructured text. A computing device may provide a plurality of items, such as an item catalog for an electronic marketplace. When an item is found to have a missing attribute value, a plurality of existing values for that attribute is compiled by mining other items. Text associated with the item is parsed to determine possible values for the attribute. From those possible values, the most likely value is identified and the missing attribute value is populated with that value.
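The flow described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the patented implementation: the catalog is a list of dictionaries with hypothetical `text` and `attributes` keys, the function names are invented, and "most likely" is interpreted as "the mentioned value that is most common among other items".

```python
import re
from collections import Counter


def mine_existing_values(catalog, attribute):
    """Count the values that items already carry for `attribute`."""
    return Counter(
        item["attributes"][attribute]
        for item in catalog
        if item["attributes"].get(attribute)
    )


def most_likely_value(text, existing):
    """Return the known value mentioned in `text` that is most common in the catalog."""
    mentioned = [
        value
        for value in existing
        if re.search(rf"\b{re.escape(value)}\b", text, flags=re.IGNORECASE)
    ]
    # Assumed ranking: prefer values that occur more often among other items.
    return max(mentioned, key=lambda v: existing[v], default=None)


def fill_missing(catalog, attribute):
    """Populate `attribute` on every item that lacks it, when the text allows."""
    existing = mine_existing_values(catalog, attribute)
    for item in catalog:
        if not item["attributes"].get(attribute):
            value = most_likely_value(item["text"], existing)
            if value is not None:
                item["attributes"][attribute] = value
    return catalog


# Hypothetical catalog: the last item has no color attribute.
catalog = [
    {"text": "Classic cotton tee", "attributes": {"color": "red"}},
    {"text": "Slim fit jeans", "attributes": {"color": "blue"}},
    {"text": "Relaxed denim jacket", "attributes": {"color": "blue"}},
    {"text": "Soft hoodie in Blue with red trim", "attributes": {}},
]
fill_missing(catalog, "color")
print(catalog[3]["attributes"])  # {'color': 'blue'}
```

Both "Blue" and "red" appear in the hoodie text; the sketch picks "blue" only because it is the more frequent existing value, which is one of many possible ranking rules.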
20 Claims
1. A computer-implemented method, comprising:
identifying, for a first item of a plurality of items, at least one category associated with the first item;
determining, by a computer system, a plurality of attributes common to the identified at least one category;
identifying at least one attribute of the plurality of attributes that is not populated for the first item;
extracting, from at least one second item of the plurality of items, a plurality of existing values associated with the at least one attribute of the plurality of attributes;
identifying, from text associated with the first item, a plurality of candidate values, the plurality of candidate values comprising at least one candidate value for the at least one attribute of the plurality of attributes;
filtering, based at least in part on the plurality of existing values associated with the at least one attribute of the plurality of attributes, the plurality of candidate values to determine a likely value; and
populating, by the computing system, the at least one attribute of the plurality of attributes of the first item with the likely value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
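The sequence of steps in claim 1 can be sketched as follows. This is one possible reading under stated assumptions, not the claimed implementation: items are dictionaries with hypothetical `categories`, `text`, and `attributes` keys; an attribute counts as "common" to a category when at least half of the category's items populate it (the `min_share` parameter is invented); candidate values are one- and two-word runs from the text; and values are compared in lower case.

```python
from collections import Counter


def common_attributes(items, category, min_share=0.5):
    """Attributes populated on at least `min_share` of the category's items (assumed rule)."""
    members = [i for i in items if category in i["categories"]]
    counts = Counter(a for i in members for a, v in i["attributes"].items() if v)
    return {a for a, n in counts.items() if n / len(members) >= min_share}


def existing_values(items, first, attribute):
    """Lower-cased values that other items hold for `attribute`, with counts."""
    return Counter(
        i["attributes"][attribute].lower()
        for i in items
        if i is not first and i["attributes"].get(attribute)
    )


def candidate_values(text, max_words=2):
    """Every run of 1..max_words consecutive words in the text."""
    words = text.lower().replace(",", " ").split()
    return [
        " ".join(words[s:s + n])
        for n in range(1, max_words + 1)
        for s in range(len(words) - n + 1)
    ]


def fill_first_item(items, first):
    for category in first["categories"]:
        for attribute in sorted(common_attributes(items, category)):
            if first["attributes"].get(attribute):
                continue  # already populated
            known = existing_values(items, first, attribute)
            # Filter: keep only candidates that some other item already uses.
            kept = [c for c in candidate_values(first["text"]) if c in known]
            if kept:
                # Assumed tie-break: the value most common among other items.
                first["attributes"][attribute] = max(kept, key=lambda c: known[c])
    return first


# Hypothetical data: `first` has no attributes populated.
first = {"categories": ["shoes"], "text": "Brown leather boot, lace up front", "attributes": {}}
items = [
    {"categories": ["shoes"], "text": "", "attributes": {"material": "Leather", "closure": "Lace up"}},
    {"categories": ["shoes"], "text": "", "attributes": {"material": "Canvas", "closure": "Slip on"}},
    {"categories": ["shoes"], "text": "", "attributes": {"material": "Leather", "closure": "Lace up"}},
    first,
]
fill_first_item(items, first)
print(first["attributes"])  # {'closure': 'lace up', 'material': 'leather'}
```

Words such as "brown" and "boot" are generated as candidates but discarded by the filter, because no other item uses them as a value for either attribute.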
9. A non-transitory computer-readable storage medium storing computer-executable instructions that, when executed by a processor, configure the processor to perform operations comprising:
identifying at least one first item of a plurality of items for which an attribute is unpopulated;
extracting a plurality of existing values for the attribute based at least in part on at least one second item of the plurality of items, the at least one second item having the attribute populated;
determining, based at least in part on the plurality of existing values, an appropriate value for the attribute from text associated with the at least one first item; and
populating the attribute of the at least one first item of the plurality of items with the determined appropriate value.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A system, comprising:
a processor; and
a memory device including instructions that, when executed by the processor, cause the system to:
provide a first item categorization for a set of items;
determine at least one attribute associated with the first item categorization;
identify at least one first item of the set of items for which the associated at least one attribute is empty;
determine, from one or more second items of the set of items, a range of existing values for the attribute;
identify, from data associated with the first item, at least one potential value for the attribute based at least in part on the determined range of values for the attribute;
determine, from the at least one potential value for the attribute, a probable value; and
set the at least one attribute associated with the first item to the probable value.
Dependent claims: 18, 19, 20
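Claim 17 speaks of a "range of existing values". One way to picture this, assuming the attribute is numeric and the range is read as a simple minimum-to-maximum interval (the claim does not say so), is sketched below. The `data` and `attributes` keys, the `screen_in` attribute, the function names, and the "closest to the median" rule for the probable value are all hypothetical choices made for illustration.

```python
import re
from statistics import median


def value_range(items, attribute):
    """Lowest, highest, and median existing value for a numeric attribute."""
    values = [
        i["attributes"][attribute]
        for i in items
        if i["attributes"].get(attribute) is not None
    ]
    return min(values), max(values), median(values)


def potential_values(text, low, high):
    """Numbers found in the text that fall inside the existing range."""
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    return [n for n in numbers if low <= n <= high]


def probable_value(potentials, typical):
    """Assumed rule: the potential value closest to the typical (median) value."""
    return min(potentials, key=lambda n: abs(n - typical), default=None)


def set_empty_attribute(items, first, attribute):
    others = [i for i in items if i is not first]
    low, high, typical = value_range(others, attribute)
    value = probable_value(potential_values(first["data"], low, high), typical)
    if value is not None:
        first["attributes"][attribute] = value
    return first


# Hypothetical laptop listings; `first` has no screen size set.
first = {
    "data": "2023 model, 16 GB RAM, 512 GB SSD, 15.6 inch display, 1.8 kg",
    "attributes": {},
}
items = [
    {"data": "", "attributes": {"screen_in": 13.3}},
    {"data": "", "attributes": {"screen_in": 14.0}},
    {"data": "", "attributes": {"screen_in": 15.6}},
    {"data": "", "attributes": {"screen_in": 17.3}},
    first,
]
set_empty_attribute(items, first, "screen_in")
print(first["attributes"])  # {'screen_in': 15.6}
```

The range (13.3 to 17.3) rules out 2023, 512, and 1.8, leaving 16 and 15.6 as potential values; 15.6 is chosen because it lies closer to the median of the existing values. The sketch assumes at least one other item already has the attribute populated.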
Specification