SYSTEM FOR MONITORING GLOBAL ONLINE OPINIONS VIA SEMANTIC EXTRACTION
First Claim
1. A system for transforming domain specific unstructured data into structured data, said system comprising:
an intake platform comprising;
an intake acquisition module acquiring structured data, semi-structured data, and associated metadata related to a domain and problem of interest, said intake acquisition module developing baseline data related to said domain and problem of interest from said structured data, semi-structured data, and associated metadata;
an intake pre-processing module receiving structured and unstructured content related to said domain and problem of interest;
an intake language module providing word equivalents to words within said structured and unstructured content according to a domain and problem of interest;
an intake application descriptors module providing definitions of key descriptors within said domain and problem of interest; and
an intake adjudication module processing said structured and unstructured content using said baseline data, said word equivalents, and said key descriptors to develop a workflow for classifying said structured and unstructured content for said domain and problem of interest; and
a control platform comprising;
a control data acquisition module identifying data acquisition and data analysis errors relating to said receiving of said structured and unstructured content and said classifying of said structured and unstructured content;
a control data consistency collator analyzing states of said data within said workflow to identify state errors;
a control auditor monitoring sources of said structured and unstructured content and monitoring said data within said workflow to identify source and processing errors;
a control event definition and policy repository maintaining policies for resolving said data acquisition and data analysis errors, said state errors, and said source and processing errors; and
an error resolver resolving said data acquisition and data analysis errors, said state errors, and said source and processing errors according to said policies; and
an output outputting results of said workflow, said results of said workflow comprising said structured and unstructured content classified into structured data enabled to be used in data analytics.
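The intake elements recited above (word equivalents, key descriptors, baseline data, and adjudication into structured records) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the lookup tables, the `Record` type, the `normalize` and `classify` helpers, and the product-review domain. The claim does not prescribe any particular data structures or scoring rule.

```python
import re
from dataclasses import dataclass

# Hypothetical word equivalents for an illustrative product-review domain
# (stands in for the intake language module).
WORD_EQUIVALENTS = {"batt": "battery", "gr8": "great", "luv": "love"}

# Hypothetical key descriptors: descriptor name -> terms that signal it
# (stands in for the intake application descriptors module).
KEY_DESCRIPTORS = {
    "battery_life": {"battery", "charge"},
    "display": {"screen", "display"},
}

# Hypothetical baseline data: polarity of known opinion words.
BASELINE_POLARITY = {"great": 1, "love": 1, "poor": -1, "terrible": -1}


@dataclass
class Record:
    """One structured output row derived from unstructured content."""
    source: str
    descriptors: list
    polarity: int


def normalize(text):
    """Lower-case, tokenize, and replace broken-language tokens with equivalents."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [WORD_EQUIVALENTS.get(token, token) for token in tokens]


def classify(source, text):
    """Assumed adjudication step: tag descriptors and sum baseline polarity."""
    tokens = normalize(text)
    token_set = set(tokens)
    descriptors = sorted(
        name for name, terms in KEY_DESCRIPTORS.items() if terms & token_set
    )
    polarity = sum(BASELINE_POLARITY.get(token, 0) for token in tokens)
    return Record(source=source, descriptors=descriptors, polarity=polarity)
```

Calling `classify("forum", "luv the batt, gr8 charge")` in this sketch yields a record tagged with `battery_life` and a polarity of 2, which is the kind of structured row a downstream analytics step could consume.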
Abstract
A system for transforming domain specific unstructured data into structured data including an intake platform controlled by feedback from a control platform. The intake platform includes an intake acquisition module for acquiring data and building baseline data related to a domain and problem of interest, an intake pre-processing module, an intake language module, an intake application descriptors module, and an intake adjudication module. The control platform includes a control data acquisition module, a control data consistency collator, a control auditor, a control event definition and policy repository, an error resolver, and an output that outputs results of the workflow into structured data enabled to be used in data analytics.
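As a rough illustration of the control side described in the abstract, the sketch below detects three kinds of errors on workflow items and looks up a resolution in a policy table. The item fields, the `ErrorEvent` type, the `POLICIES` and `BLOCKED_SOURCES` tables, and the action names are all assumptions made for this example; the patent text does not specify them.

```python
from dataclasses import dataclass


@dataclass
class ErrorEvent:
    """One detected error: kind is 'acquisition', 'state', or 'source'."""
    kind: str
    item_id: str


# Hypothetical policy repository: error kind -> resolution action.
POLICIES = {"acquisition": "refetch", "state": "reprocess", "source": "quarantine"}

# Hypothetical list of sources the auditor has flagged as unreliable.
BLOCKED_SOURCES = {"spam-forum.example"}


def check_item(item):
    """Return the error events found on a single workflow item (a dict)."""
    events = []
    if not item.get("text"):
        # Nothing was received for this item: treat as an acquisition error.
        events.append(ErrorEvent("acquisition", item["id"]))
    if item.get("state") == "classified" and "label" not in item:
        # Item claims to be classified but carries no label: inconsistent state.
        events.append(ErrorEvent("state", item["id"]))
    if item.get("source") in BLOCKED_SOURCES:
        # Content came from a flagged source: source error.
        events.append(ErrorEvent("source", item["id"]))
    return events


def resolve(events, policies=POLICIES):
    """Map each event to an action; unknown kinds fall back to 'escalate'."""
    return [(event.item_id, policies.get(event.kind, "escalate")) for event in events]
```

In this sketch the resolved actions would be fed back to the intake side, for example by re-queuing an item marked `refetch`; how that feedback is wired is left open here.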
17 Claims
9. A programmable storage medium tangibly embodying/for storing a program of machine-readable instructions executable by a digital processing apparatus to perform operations supporting a method of transforming domain specific unstructured and broken language data into structured data for enabling custom data analytics, the operations comprising:
acquiring structured data, semi-structured data, and associated metadata related to a domain and problem of interest;
developing baseline data related to said domain and problem of interest from said structured data, semi-structured data, and associated metadata;
receiving structured and unstructured content related to said domain and problem of interest;
providing word equivalents to words within said structured and unstructured content according to a domain and problem of interest;
providing definitions of key descriptors within said domain and problem of interest;
processing said structured and unstructured content using said baseline data, said word equivalents, and said key descriptors to develop a workflow for classifying said structured and unstructured content for said domain and problem of interest;
identifying data acquisition and data analysis errors relating to said receiving of said structured and unstructured content and said classifying of said structured and unstructured content;
analyzing states of said data within said workflow to identify state errors;
monitoring sources of said structured and unstructured content and monitoring said data within said workflow to identify source and processing errors;
maintaining policies for resolving said data acquisition and data analysis errors, said state errors, and said source and processing errors;
resolving said data acquisition and data analysis errors, said state errors, and said source and processing errors according to said policies; and
outputting results of said workflow, said results of said workflow comprising said structured and unstructured content classified into structured data enabled to be used in data analytics.
17. A programmable storage medium tangibly embodying/for storing a program of machine-readable instructions executable by a digital processing apparatus to perform operations supporting a method of transforming domain specific unstructured and broken language data into structured data for enabling custom data analytics, the operations comprising:
acquiring structured data, semi-structured data, and associated metadata related to a domain and problem of interest;
developing baseline data related to said domain and problem of interest from said structured data, semi-structured data, and associated metadata;
receiving structured and unstructured content related to said domain and problem of interest;
providing word equivalents to words within said structured and unstructured content according to a domain and problem of interest;
providing definitions of key descriptors within said domain and problem of interest;
processing said structured and unstructured content using said baseline data, said word equivalents, and said key descriptors to develop a workflow for classifying said structured and unstructured content for said domain and problem of interest;
identifying data acquisition and data analysis errors relating to said receiving of said structured and unstructured content and said classifying of said structured and unstructured content;
analyzing states of said data within said workflow to identify state errors;
monitoring sources of said structured and unstructured content and monitoring said data within said workflow to identify source and processing errors;
maintaining policies for resolving said data acquisition and data analysis errors, said state errors, and said source and processing errors;
resolving said data acquisition and data analysis errors, said state errors, and said source and processing errors according to said policies; and
outputting results of said workflow, said results of said workflow comprising said structured and unstructured content classified into structured data enabled to be used in data analytics,
wherein said acquiring is further to one of websites, forums and feeds which provide one of user opinions, behavior and preferences, and said user opinions, behavior and preferences are supplemented by one of user demographics, domain and application relevant structured and semi-structured repositories are used to provide baseline data.
Specification