EXTRACTION OF ATTRIBUTES AND VALUES FROM NATURAL LANGUAGE DOCUMENTS
Abstract
One or more classification algorithms are applied to at least one natural language document in order to extract both attributes and values of a given product. Supervised classification algorithms, semi-supervised classification algorithms, unsupervised classification algorithms or combinations of such classification algorithms may be employed for this purpose. The at least one natural language document may be obtained via a public communication network. Two or more attributes (or two or more values) thus identified may be merged to form one or more attribute phrases or value phrases. Once attributes and values have been extracted in this manner, association or linking operations may be performed to establish attribute-value pairs that are descriptive of the product. In a presently preferred embodiment, an (unsupervised) algorithm is used to generate seed attributes and values which can then support a supervised or semi-supervised classification algorithm.
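The merging and linking steps described above can be illustrated with a minimal sketch. It assumes a classifier has already labeled each token as `attr`, `value`, or `other`; those label names, the sample tokens, and the nearest-phrase linking heuristic are all hypothetical choices made for illustration, not the method disclosed in the specification.

```python
from itertools import groupby

def merge_phrases(tagged):
    """Merge runs of adjacent tokens that share a label into one phrase.

    Returns (phrase_text, label, start_position) tuples for attr/value runs.
    """
    phrases = []
    pos = 0
    for label, run in groupby(tagged, key=lambda pair: pair[1]):
        words = [word for word, _ in run]
        if label in ("attr", "value"):
            phrases.append((" ".join(words), label, pos))
        pos += len(words)
    return phrases

def link_pairs(phrases):
    """Pair each value phrase with the closest attribute phrase.

    Assumption: closeness in token position is a reasonable proxy for
    association. This is only a stand-in for a real linking step.
    """
    attrs = [p for p in phrases if p[1] == "attr"]
    pairs = []
    for text, label, pos in phrases:
        if label != "value" or not attrs:
            continue
        nearest = min(attrs, key=lambda a: abs(a[2] - pos))
        pairs.append((nearest[0], text))
    return pairs

# Hypothetical classifier output for "stainless steel blade with rubber grip handle"
tagged = [
    ("stainless", "value"), ("steel", "value"), ("blade", "attr"),
    ("with", "other"),
    ("rubber", "value"), ("grip", "attr"), ("handle", "attr"),
]
pairs = link_pairs(merge_phrases(tagged))
print(pairs)
```

Here "stainless" and "steel" merge into one value phrase, "grip" and "handle" merge into one attribute phrase, and each value phrase is then linked to the attribute phrase nearest to it.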
39 Claims
1. A method for identifying at least one attribute and at least one value for a product based on at least one natural language document, the method comprising:
- identifying the at least one attribute and the at least one value for the product via a classification algorithm operating upon the at least one natural language document; and
- storing the at least one attribute and the at least one value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. An apparatus for identifying at least one attribute and at least one value for a product based on at least one natural language document, comprising:
- a classification algorithm module operative to identify the at least one attribute and the at least one value for the product based on the at least one natural language document; and
- a machine readable store operative to store the at least one attribute and the at least one value.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A computer-readable medium having stored thereon executable instructions that, when executed, cause a computer to:
- identify at least one attribute and at least one value for a product via a classification algorithm operating upon at least one natural language document; and
- store the at least one attribute and the at least one value.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
31. A method for identifying at least one attribute and at least one value of a product based on at least one natural language document, the method comprising:
- identifying a first set of attributes and a first set of values of the product via a supervised algorithm as applied to the at least one natural language document;
- identifying a second set of attributes and a second set of values of the product via a semi-supervised algorithm as applied to the at least one natural language document based at least in part upon the first set of attributes and the first set of values;
- providing the first set of attributes and the second set of attributes as the at least one attribute;
- providing the first set of values and the second set of values as the at least one value; and
- storing the at least one attribute and the at least one value.
Dependent claims: 32, 33
34. An apparatus for identifying at least one attribute and at least one value of a product based on at least one natural language document, comprising:
- a supervised classification algorithm module operative to identify a first set of attributes and a first set of values of the product based on the at least one natural language document;
- a semi-supervised classification algorithm module operative to identify a second set of attributes and a second set of values of the product based at least in part upon the at least one natural language document and the first set of attributes and the first set of values; and
- a machine readable store operative to store the first set of attributes and the second set of attributes as the at least one attribute, and to store the first set of values and the second set of values as the at least one value.
Dependent claims: 35, 36
37. A computer-readable medium having stored thereon executable instructions that, when executed, cause the computer to:
- identify a first set of attributes and a first set of values of a product via a supervised classification algorithm as applied to at least one natural language document;
- identify a second set of attributes and a second set of values of the product via a semi-supervised classification algorithm as applied to the at least one natural language document based at least in part upon the first set of attributes and the first set of values;
- provide the first set of attributes and the second set of attributes as at least one attribute;
- provide the first set of values and the second set of values as at least one value; and
- store the at least one attribute and the at least one value.
Dependent claims: 38, 39
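The two-stage flow recited in claims 31, 34 and 37 (a supervised pass yielding a first set, a semi-supervised pass that builds on it to yield a second set, and the union of both being stored) might be sketched as follows. This is a toy illustration under stated assumptions: a majority-vote token lookup stands in for the supervised classifier, and a single round of context-based bootstrapping stands in for the semi-supervised classifier. All function names and sample data are hypothetical, and the claims do not prescribe these particular techniques.

```python
from collections import Counter, defaultdict

def train_supervised(labeled):
    """Toy supervised step: remember the majority label seen for each token."""
    votes = defaultdict(Counter)
    for token, label in labeled:
        votes[token][label] += 1
    return {tok: counts.most_common(1)[0][0] for tok, counts in votes.items()}

def classify(model, tokens):
    """Apply the lookup model; returns (first set of attributes, first set of values)."""
    attrs = {t for t in tokens if model.get(t) == "attr"}
    values = {t for t in tokens if model.get(t) == "value"}
    return attrs, values

def bootstrap(tokens, attrs, values):
    """Toy semi-supervised step seeded by the first sets.

    An unlabeled token takes the label most often carried by already-known
    tokens that are followed by the same word.
    """
    known = {t: "attr" for t in attrs}
    known.update({t: "value" for t in values})

    def following(i):
        return tokens[i + 1] if i + 1 < len(tokens) else "</s>"

    context_votes = defaultdict(Counter)
    for i, tok in enumerate(tokens):
        if tok in known:
            context_votes[following(i)][known[tok]] += 1

    new_attrs, new_values = set(), set()
    for i, tok in enumerate(tokens):
        if tok in known or following(i) not in context_votes:
            continue
        label = context_votes[following(i)].most_common(1)[0][0]
        (new_attrs if label == "attr" else new_values).add(tok)
    return new_attrs, new_values

# Hypothetical labeled seed data and an unlabeled product description
model = train_supervised([("leather", "value"), ("strap", "attr")])
tokens = ["leather", "strap", "nylon", "strap"]

first_attrs, first_values = classify(model, tokens)
second_attrs, second_values = bootstrap(tokens, first_attrs, first_values)

# Provide both sets together as the stored result
attributes = first_attrs | second_attrs
values = first_values | second_values
print(attributes, values)
```

In this example "nylon" never appears in the labeled data, but it precedes "strap" just as the known value "leather" does, so the second pass adds it to the values.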
Specification