Mining semi-structured social media
Abstract
Methods and apparatus for analysis of semi-structured social media are described. A method comprises classifying a plurality of user-generated content entries into a plurality of categories based at least in part on an analysis of respective structured components of at least a subset of the plurality of entries. The method further includes determining, based at least in part on an analysis of additional components of entries of a particular category, a set of representative content elements of entries of the particular category, and generating a report that comprises one or more representative content elements of the particular category.
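The classification step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each entry is a dictionary with structured `product_id` and `rating` fields and an unstructured `text` field, and that categories are derived from the rating. The field names, the thresholds, and the helpers `category_for` and `classify_entries` are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical entries: "product_id" and "rating" are structured fields with
# fixed data types; "text" is the unstructured, arbitrary-length component.
ENTRIES = [
    {"product_id": "A1", "rating": 1, "text": "battery life is terrible"},
    {"product_id": "A1", "rating": 5, "text": "great screen and battery life"},
    {"product_id": "A1", "rating": 2, "text": "battery life too short"},
    {"product_id": "A1", "rating": 3, "text": "screen is fine"},
]


def category_for(entry):
    """Map the structured rating field to a category label (assumed thresholds)."""
    rating = entry["rating"]
    if rating <= 2:
        return "negative"
    if rating >= 4:
        return "positive"
    return "neutral"


def classify_entries(entries):
    """Group entries by category using only their structured components."""
    categories = defaultdict(list)
    for entry in entries:
        categories[category_for(entry)].append(entry)
    return dict(categories)


categories = classify_entries(ENTRIES)
```

Only the structured fields drive the grouping here; the free text is carried along untouched so that a later step can analyze it per category.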
20 Claims
1. A method, comprising:
performing, by one or more computing devices:
collecting user-generated content entries from one or more social media data sources;
classifying the user-generated content entries into categories based at least in part on an analysis of respective structured components of at least a subset of the entries, the structured components including data organized in a defined set of fields of respective data types;
determining, based at least in part on a semantic analysis of unstructured components of the entries of a particular category, a set of representative user-generated content elements of the entries of the particular category comprising a set of frequent word combinations, the unstructured components including additional data of arbitrary length and not organized in the defined set of fields of the respective data types;
generating a report that comprises one or more of the representative user-generated content elements of the set of representative user-generated content elements of the entries of the particular category; and
generating, based at least in part on the report, a survey question in requesting additional feedback associated with a particular frequent word combination of the set of frequent word combinations.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system, comprising:
one or more processors; and
a memory comprising program instructions executable by the one or more processors to:
collect user-generated content entries from one or more social media data sources;
classify the user-generated content entries into categories based at least in part on an analysis of respective structured components of at least a subset of the entries, the structured components including data organized in a defined set of fields of respective data types;
determine, based at least in part on a statistical analysis of unstructured components of the entries of a particular category, a set of representative user-generated content elements of the entries of the particular category comprising a set of frequent word combinations, the unstructured components including additional data of arbitrary length and not organized in the defined set of fields of the respective data types;
generate a report that comprises one or more of the representative user-generated content elements of the set of representative user-generated content elements of the entries of the particular category; and
generate, based at least in part on the report, a survey question in requesting additional feedback associated with a particular frequent word combination of the set of frequent word combinations.
Dependent claims: 10, 11, 12, 13, 14
15. A non-transitory computer-readable storage medium storing program instructions that when executed by a computing device implement:
collecting user-generated content entries from one or more social media data sources;
classifying the user-generated content entries into categories based at least in part on an analysis of respective structured components of at least a subset of the entries, the structured components including data organized in a defined set of fields of respective data types;
determining, based at least in part on a combination of statistics and semantic analysis of unstructured components of the entries of a particular category, a set of representative user-generated content elements of the entries of the particular category comprising a set of frequent word combinations, the unstructured components including additional data of arbitrary length and not organized in the defined set of fields of the respective data types;
generating a report that comprises one or more of the representative user-generated content elements of the set of representative user-generated content elements of the entries of the particular category; and
generating, based at least in part on the report, a survey question in requesting additional feedback associated with a particular frequent word combination of the set of frequent word combinations.
Dependent claims: 16, 17, 18, 19, 20
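The determining, report, and survey-question steps recited in the independent claims can be sketched as follows. This is an illustrative approximation only: the claims recite semantic analysis, statistical analysis, or a combination of the two, whereas this sketch uses a plain count of adjacent word pairs across entries. The function names, the `min_count` threshold, the report layout, and the question wording are all assumptions and do not come from the patent.

```python
from collections import Counter


def word_pairs(text):
    """Return the adjacent word pairs (bigrams) of a lower-cased text."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def frequent_combinations(texts, min_count=2):
    """Count each pair once per entry; keep pairs found in >= min_count entries."""
    counts = Counter()
    for text in texts:
        counts.update(set(word_pairs(text)))
    return [(" ".join(pair), n) for pair, n in counts.most_common() if n >= min_count]


def build_report(category, texts, min_count=2):
    """Assemble a simple report for one category (hypothetical layout)."""
    return {
        "category": category,
        "frequent_combinations": frequent_combinations(texts, min_count),
    }


def survey_question(report):
    """Turn the most frequent combination into a follow-up question, or None."""
    combos = report["frequent_combinations"]
    if not combos:
        return None
    phrase, _ = combos[0]
    return (
        f'Several entries mention "{phrase}". '
        "Could you tell us more about your experience with it?"
    )


# Hypothetical unstructured components of entries in a "negative" category.
NEGATIVE_TEXTS = [
    "battery life is terrible",
    "battery life too short",
    "poor battery life overall",
]
report = build_report("negative", NEGATIVE_TEXTS)
question = survey_question(report)
```

Counting each pair once per entry keeps a single long, repetitive entry from dominating the result; a fuller treatment would also normalize punctuation and filter stop words.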
Specification