Attribute scoring for unstructured content
Abstract
The present invention provides a system and method for assigning attribute scores, including scores for abstract and diverse concepts such as sentiment (e.g., happiness, anger, sadness), to documents containing unstructured content (natural language content). Such content may include, but is not limited to, Web pages, e-mails, word processing documents, computer logs, chat logs, audio files, graphical images, text files, books, magazines, and articles. The attributes that are scored are specialized views of the unstructured content. Accordingly, attribute scoring denotes the process of assigning one or more new measures to a document containing unstructured content. Each attribute score or measure indicates the degree to which the attribute is found within the document.
58 Claims
1. A method for scoring of an attribute exhibited in a corpus of unstructured documents, the method comprising:
decomposing each unstructured document into subdocuments;
determining an attribute score for each subdocument in said corpus corresponding to the level said subdocument exhibits a feature associated with the attribute; and
combining subdocument attribute scores for each document to produce a document attribute score.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24, 27, 31, 32, 36, 37, 42, 47
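The three steps recited in claim 1 (decompose, score each subdocument, combine) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: sentences stand in for subdocuments, individual words stand in for features, the weights in `FEATURE_WEIGHTS` are invented, and the arithmetic mean is only one possible way to combine subdocument scores.

```python
import re
from statistics import mean

# Hypothetical feature weights for a "happiness" attribute (illustrative values only).
FEATURE_WEIGHTS = {"happy": 1.0, "glad": 0.8, "delighted": 1.0, "sad": -1.0}


def decompose(document: str) -> list[str]:
    """Split a document into sentence-like subdocuments (an assumed granularity)."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def score_subdocument(subdocument: str) -> float:
    """Sum the weights of the known features (words) found in the subdocument."""
    words = re.findall(r"[a-z']+", subdocument.lower())
    return sum(FEATURE_WEIGHTS.get(w, 0.0) for w in words)


def score_document(document: str) -> float:
    """Combine subdocument scores; the mean is one assumed choice of combiner."""
    scores = [score_subdocument(s) for s in decompose(document)]
    return mean(scores) if scores else 0.0


corpus = ["I am happy. The weather is fine.", "I am sad. Very sad!"]
document_scores = [score_document(d) for d in corpus]
```

Under these assumptions the first document scores 0.5 (one sentence at 1.0, one at 0.0) and the second scores -1.0. Swapping the mean for a maximum or a length-weighted sum would change how a single strongly scored subdocument influences the document score.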
20. A computer readable medium containing computer readable instructions for scoring of an attribute exhibited in a corpus of unstructured documents, comprising:
a learning component for learning to what level features are associated with the attribute; and
a scoring component for determining attribute scores by:
decomposing each unstructured document into subdocuments;
determining an attribute score for each subdocument in said corpus corresponding to the level said subdocument exhibits a feature associated with the attribute; and
combining subdocument attribute scores for each document to produce a document attribute score.
Dependent claims: 25, 26, 28, 29, 30, 33, 34, 38, 39, 46, 51, 52, 53, 56, 57, 58
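Claim 20 separates a learning component from a scoring component. The sketch below shows one way such a split might look. It assumes, purely for illustration, that features are words, that training data is a list of short texts labeled with a numeric attribute level, and that a feature's learned level is the average label of the training texts containing it. The class names `LearningComponent` and `ScoringComponent` are hypothetical, and the patent does not prescribe this learning rule.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; words serve as the assumed features."""
    return re.findall(r"[a-z']+", text.lower())


class LearningComponent:
    """Learns to what level each feature is associated with the attribute."""

    def learn(self, labeled_examples: list[tuple[str, float]]) -> dict[str, float]:
        totals: dict[str, float] = defaultdict(float)
        counts: dict[str, int] = defaultdict(int)
        for text, label in labeled_examples:
            for word in set(tokenize(text)):  # count each word once per example
                totals[word] += label
                counts[word] += 1
        # Assumed rule: a feature's level is the mean label of examples containing it.
        return {word: totals[word] / counts[word] for word in totals}


class ScoringComponent:
    """Scores documents using the feature levels produced by the learner."""

    def __init__(self, weights: dict[str, float]) -> None:
        self.weights = weights

    def score_document(self, document: str) -> float:
        # Decompose into sentences, score each one, then average the scores.
        subdocs = [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]
        scores = [sum(self.weights.get(w, 0.0) for w in tokenize(s)) for s in subdocs]
        return sum(scores) / len(scores) if scores else 0.0


training = [("great service", 1.0), ("terrible service", 0.0), ("great food", 1.0)]
weights = LearningComponent().learn(training)
scorer = ScoringComponent(weights)
result = scorer.score_document("Great food. Terrible service.")
```

With this toy training set, "service" appears in one positive and one negative example and so learns a level of 0.5, while "great" and "food" learn 1.0 and "terrible" learns 0.0.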
35. A computing apparatus, comprising a processor and a memory containing computer executable instructions operative to score an attribute exhibited in a corpus of unstructured documents by:
decomposing each unstructured document into subdocuments;
determining an attribute score for each subdocument in said corpus corresponding to the level said subdocument exhibits a feature associated with the attribute; and
combining subdocument attribute scores for each document to produce a document attribute score.
Dependent claims: 40, 41, 43, 44, 45, 48, 49, 54
50. A computer readable medium containing:
a data structure comprised of:
a plurality of unstructured documents; and
a plurality of attribute scores associated with said unstructured documents, wherein said attribute scores were derived by:
separating said unstructured documents into component parts;
determining to what extent each of said component parts embodies a desired attribute;
assigning a value corresponding to the extent that each component part embodies said desired attribute; and
aggregating said values to form document values that embody the extent that said unstructured documents embody said desired attribute.
Dependent claims: 55
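The data structure of claim 50 pairs each unstructured document with values assigned to its component parts and with an aggregated document value. A minimal sketch of such a structure follows. The names `ScoredDocument` and `ScoredCorpus`, the per-attribute dictionary layout, and the use of the mean as the aggregate are all assumptions for illustration, and the example values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ScoredDocument:
    """An unstructured document plus the values assigned to its component parts."""

    text: str
    # Maps an attribute name to one value per component part (assumed layout).
    component_values: dict[str, list[float]] = field(default_factory=dict)

    def document_value(self, attribute: str) -> float:
        """Aggregate component values into a document value (mean, by assumption)."""
        values = self.component_values.get(attribute, [])
        return sum(values) / len(values) if values else 0.0


@dataclass
class ScoredCorpus:
    """A plurality of documents together with their attribute scores."""

    documents: list[ScoredDocument] = field(default_factory=list)

    def attribute_scores(self, attribute: str) -> list[float]:
        return [doc.document_value(attribute) for doc in self.documents]


scored_corpus = ScoredCorpus(
    [
        ScoredDocument("First part. Second part.", {"anger": [0.25, 0.75]}),
        ScoredDocument("Only part.", {"anger": [1.0]}),
    ]
)
anger_scores = scored_corpus.attribute_scores("anger")
```

Keeping the per-part values alongside the aggregate lets a reader of the structure see which component parts drove a document's score; an attribute with no recorded values falls back to 0.0 in this sketch.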
Specification