Learning-based data decontextualization
First Claim
1. A computer-implemented method, comprising:
accessing a question having a predetermined answer, the question being associated with operations of at least one computing system;
accessing at least one dataset associated with the question, the at least one dataset including data describing the operations of the at least one computing system and contextual information about the data;
selecting at least one decontextualization operation from a plurality of decontextualization operations, wherein the at least one decontextualization operation at least partly alters the contextual information about the data included in the at least one dataset;
applying the at least one decontextualization operation to the at least one dataset to determine at least one modified dataset in which the contextual information about the data included in the at least one dataset is at least partly altered;
sending the at least one modified dataset including the data and the altered contextual information and the question to a plurality of worker devices associated with a plurality of workers in a crowdsourcing framework;
receiving, from the plurality of worker devices, a plurality of answers to the question, the plurality of answers generated by the plurality of workers analyzing the at least one modified dataset having the contextual information about the data at least partly altered and the data in view of the question;
incorporating, into training data, the plurality of answers and information describing the at least one decontextualization operation; and
employing the training data in machine learning to train a decontextualizer to be used in subsequent data decontextualization to answer subsequent questions using the crowdsourcing framework.
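The steps recited in the claim above can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the record fields (`host`, `metric`, `value`), the two operations, the simulated workers, and the toy "training" step, which simply averages worker accuracy per operation set. The claim does not specify a particular learning algorithm, so this stands in for whatever machine learning technique an implementation would actually use.

```python
from collections import defaultdict

# Hypothetical decontextualization operations. Each alters contextual
# information (host name, metric name) and leaves the data value intact.
def redact_host(record):
    return {**record, "host": "REDACTED"}

def generalize_metric(record):
    return {**record, "metric": "metric_A"}

OPERATIONS = {"redact_host": redact_host, "generalize_metric": generalize_metric}

def apply_operations(dataset, op_names):
    """Return a modified copy of the dataset; the input is not mutated."""
    modified = [dict(r) for r in dataset]
    for name in op_names:
        modified = [OPERATIONS[name](r) for r in modified]
    return modified

def collect_training_example(question, expected_answer, dataset, op_names, ask_workers):
    """Send the modified dataset to workers and record answers plus the operations used."""
    modified = apply_operations(dataset, op_names)
    answers = ask_workers(question, modified)
    accuracy = sum(a == expected_answer for a in answers) / len(answers)
    return {"operations": tuple(op_names), "answers": answers, "accuracy": accuracy}

def train_decontextualizer(training_data):
    """Toy stand-in for training: pick the operation set with the best mean accuracy."""
    scores = defaultdict(list)
    for example in training_data:
        scores[example["operations"]].append(example["accuracy"])
    return max(scores, key=lambda ops: sum(scores[ops]) / len(scores[ops]))

def simulated_workers(question, modified):
    # Assumption: three simulated workers answer correctly only when the
    # metric name survives decontextualization.
    readable = all(r["metric"] == "cpu_percent" for r in modified)
    return ["yes"] * 3 if readable else ["yes", "no", "unsure"]

dataset = [{"host": "db1.example.com", "metric": "cpu_percent", "value": 97}]
question = "Is the system overloaded?"  # predetermined answer: "yes"
training_data = [
    collect_training_example(question, "yes", dataset, ops, simulated_workers)
    for ops in (["redact_host"], ["redact_host", "generalize_metric"])
]
best = train_decontextualizer(training_data)
print(best)
```

In this sketch the question's predetermined answer is what makes the feedback loop possible: worker answers can be scored against it, so each operation set receives a measurable accuracy that the learner can use in later decontextualization.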
Abstract
Techniques are described for employing a crowdsourcing framework to analyze data related to the performance or operations of computing systems, or to analyze other types of data. A question is analyzed to determine data that is relevant to the question. The relevant data may be decontextualized to remove or alter contextual information included in the data, such as sensitive, personal, or business-related data. The question and the decontextualized data may then be presented to workers in a crowdsourcing framework, and the workers may determine an answer to the question based on an analysis or an examination of the decontextualized data. The answers may be combined, correlated, or otherwise processed to determine a processed answer to the question. Machine learning techniques are employed to adjust and refine the decontextualization.
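As one possible reading of how answers "may be combined, correlated, or otherwise processed," the following Python sketch uses a simple majority vote. The function name `process_answers` and the normalization rule (trim whitespace, lowercase) are assumptions made for illustration; the abstract does not prescribe a specific aggregation method.

```python
from collections import Counter

def process_answers(answers):
    """Return the most common answer and the share of workers who gave it."""
    if not answers:
        raise ValueError("no answers received")
    # Assumed normalization so that "Yes" and "yes " count as the same answer.
    normalized = [a.strip().lower() for a in answers]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer, count / len(normalized)

processed, agreement = process_answers(["Yes", "yes ", "no"])
print(processed, agreement)
```

The agreement share gives a rough confidence signal: a low share may indicate that the decontextualization removed too much context for workers to answer reliably.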
16 Citations
20 Claims
1. A computer-implemented method, comprising:
accessing a question having a predetermined answer, the question being associated with operations of at least one computing system;
accessing at least one dataset associated with the question, the at least one dataset including data describing the operations of the at least one computing system and contextual information about the data;
selecting at least one decontextualization operation from a plurality of decontextualization operations, wherein the at least one decontextualization operation at least partly alters the contextual information about the data included in the at least one dataset;
applying the at least one decontextualization operation to the at least one dataset to determine at least one modified dataset in which the contextual information about the data included in the at least one dataset is at least partly altered;
sending the at least one modified dataset including the data and the altered contextual information and the question to a plurality of worker devices associated with a plurality of workers in a crowdsourcing framework;
receiving, from the plurality of worker devices, a plurality of answers to the question, the plurality of answers generated by the plurality of workers analyzing the at least one modified dataset having the contextual information about the data at least partly altered and the data in view of the question;
incorporating, into training data, the plurality of answers and information describing the at least one decontextualization operation; and
employing the training data in machine learning to train a decontextualizer to be used in subsequent data decontextualization to answer subsequent questions using the crowdsourcing framework.
Dependent claims: 2, 3, 4
5. A system, comprising:
at least one computing device configured to implement one or more services, wherein the one or more services are configured to:
access at least one dataset associated with a question, the at least one dataset including data and contextual information about the data and the question having a predetermined answer;
determine at least one decontextualization operation that at least partly alters the contextual information about the data included in the at least one dataset;
apply the at least one decontextualization operation to the contextual information about the data of the at least one dataset to determine at least one modified dataset in which the contextual information about the data included in the at least one dataset is at least partly altered;
send the at least one modified dataset that includes the data and the at least partly altered contextual information about the data and the question to a plurality of worker devices associated with a plurality of workers in a crowdsourcing framework;
receive, from the plurality of worker devices, a plurality of answers to the question, the plurality of answers generated by the plurality of workers analyzing the at least one modified dataset that includes the data and the at least partly altered contextual information about the data in view of the question;
incorporate, into training data, the plurality of answers and information describing the at least one decontextualization operation; and
employ the training data in machine learning to train a decontextualizer to be used in subsequent data decontextualization to answer subsequent questions using the crowdsourcing framework.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13
14. One or more computer-readable media storing instructions which, when executed by at least one processor, instruct the at least one processor to perform actions comprising:
accessing at least one dataset associated with a question, the at least one dataset including data and contextual information about the data;
determining at least one decontextualization operation that at least partly alters the contextual information about the data included in the at least one dataset with the data;
applying the at least one decontextualization operation to the at least one dataset to determine at least one modified dataset that includes the data and the contextual information about the data included in the at least one modified dataset is at least partly altered;
receiving, from a plurality of worker devices associated with a plurality of workers in a crowdsourcing framework, a plurality of answers to the question, the plurality of answers generated by the plurality of workers analyzing the at least one modified dataset having data and at least partly altered contextual information about the data in view of the question;
incorporating, into training data, the plurality of answers and information describing the at least one decontextualization operation; and
employing the training data in machine learning to train a decontextualizer to be used in subsequent data decontextualization to answer subsequent questions using the crowdsourcing framework.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification