One-step data mining with natural language specification and results
Abstract
A method and apparatus in various embodiments for controlling a data mining operation by specifying the goal of data mining in natural language, processing the data mining operation without any further input beyond the problem specification, and displaying key performance results of a data mining operation in natural language. One embodiment provides a user interface having a control for receiving natural language input and receives natural language input describing the goal of the data mining operation from the control on the user interface. A second embodiment identifies key performance results, provides a user interface having a control for communicating information, and communicates a natural language description of the key performance results using the control on the user interface. In a third embodiment, input data determining a data mining operation goal is the only input required by the data mining application.
35 Claims
1. A method for describing the goal of a data mining operation, the method comprising:
providing a user interface having a control for receiving natural language input;
receiving natural language input describing the goal of the data mining operation from the control on the user interface.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method in a computer system for communicating results of a data mining operation, the method comprising:
identifying key performance results;
providing a user interface having a control for communicating information;
communicating a natural language description of the key performance results using the control on the user interface.
Dependent claims: 13, 14, 15, 16
17. A method in a computer system for controlling a data mining operation, the method comprising:
receiving problem specification input determining a data mining operation goal, wherein the input data determining a data mining operation goal is the only input required by the data mining application.
Dependent claims: 18, 19, 20
21. A data mining application user interface comprising:
a control that receives natural language input describing the goal of a data mining operation; and
an interface that sends the natural language input to a text parser.
Dependent claims: 22, 23
24. A computer data signal stream for communicating the goal of a data mining operation, the data signal stream comprising:
natural language input data describing the goal of the data mining operation, the natural language input data being available for lexical analysis to identify at least one candidate data field;
problem specification data which specifies a goal of the data mining operation based on the at least one candidate data field identified by lexical analysis.
25. A computer data signal stream for controlling a data mining operation, the data signal stream consisting essentially of input data specifying the goal of the data mining operation, whereby no additional input is required to obtain useful results.
26. An article of manufacture for a data mining application, the data mining application being available to perform a data mining operation on a database having fields, the data mining operation based on a dependent variable, the article of manufacture comprising a computer readable medium, the computer readable medium containing:
computer program code that provides for receiving natural language data describing the goal of a data mining operation;
computer program code that provides for sending the natural language data to a text parser;
computer program code that provides for performing a lexical analysis of the natural language data using a Bayesian network;
computer program code that compares results of the lexical analysis to a database field to calculate a maximum a posteriori probability that the database field is the dependent variable;
computer program code that outputs the identity of candidate database fields more likely than other database fields to be the dependent variable; and
computer program code that provides for receiving problem specification data based on the candidate database fields.
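The matching step recited in claim 26 can be illustrated with a small sketch. This is not the patented implementation: it assumes the simplest possible Bayesian network (a two-layer naive Bayes model in which the hidden node is "which field is the dependent variable" and each field-name token is an observed child), and the names `tokenize`, `rank_fields`, `hit`, and `miss` are hypothetical, as are the probability values.

```python
import math
import re


def tokenize(text):
    """Lowercase and split on non-alphanumerics, so 'churn_flag' -> ['churn', 'flag']."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def rank_fields(goal_text, field_names, priors=None, hit=0.8, miss=0.05):
    """Return (field, posterior) pairs, most probable dependent variable first.

    Assumed model: a token of the dependent variable's name appears in the
    goal text with probability `hit`; a token of any other field's name
    appears with probability `miss`. Terms shared by all hypotheses cancel,
    so each field's score only needs its own tokens.
    """
    if not field_names:
        return []
    goal_tokens = set(tokenize(goal_text))
    if priors is None:
        priors = {name: 1.0 / len(field_names) for name in field_names}

    log_scores = {}
    for name in field_names:
        score = math.log(priors[name])
        for token in set(tokenize(name)):
            if token in goal_tokens:
                score += math.log(hit / miss)
            else:
                score += math.log((1 - hit) / (1 - miss))
        log_scores[name] = score

    # Normalize in a numerically stable way to obtain posteriors.
    top = max(log_scores.values())
    weights = {n: math.exp(s - top) for n, s in log_scores.items()}
    total = sum(weights.values())
    posteriors = {n: w / total for n, w in weights.items()}
    return sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_fields(
    "Predict which customers will churn next quarter",
    ["customer_id", "churn_flag", "monthly_revenue", "region"],
)
map_field = ranked[0][0]  # the maximum a posteriori candidate
```

The first entry of `ranked` is the maximum a posteriori field, and the leading few entries correspond to the "candidate database fields more likely than other database fields to be the dependent variable". A fuller treatment would add stemming and synonyms so that, for example, "customers" can match `customer_id`.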
27. An article of manufacture for a data mining application, the article of manufacture comprising a computer readable medium containing:
a plurality of natural language text templates for communicating the key performance results;
computer program code that selects one text template from among the plurality of text templates for communicating the key performance results, whereby the user interface does not display the same text template for every data mining operation.
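A minimal sketch of the template selection in claim 27 follows. The claim does not say how a template is chosen; random choice that excludes the previously used template is an assumption made here for illustration, and the template wording, the `TemplateSelector` class, and the result keys `target` and `accuracy` are all hypothetical.

```python
import random

# Hypothetical templates; each expects the same named results.
TEMPLATES = [
    "The model predicts {target} correctly {accuracy:.0%} of the time.",
    "About {accuracy:.0%} of {target} predictions were correct.",
    "For {target}, the model reached an accuracy of {accuracy:.0%}.",
]


class TemplateSelector:
    """Pick a template at random, never repeating the one used last time."""

    def __init__(self, templates, rng=None):
        if len(templates) < 2:
            raise ValueError("need at least two templates to vary the wording")
        self._templates = list(templates)
        self._rng = rng or random.Random()
        self._last = None  # index of the template used most recently

    def render(self, **results):
        choices = [i for i in range(len(self._templates)) if i != self._last]
        self._last = self._rng.choice(choices)
        return self._templates[self._last].format(**results)


selector = TemplateSelector(TEMPLATES, rng=random.Random(0))
first = selector.render(target="churn_flag", accuracy=0.87)
second = selector.render(target="churn_flag", accuracy=0.87)
```

Excluding the last-used index guarantees that two consecutive operations are described with different wording, which is one way to satisfy the "does not display the same text template for every data mining operation" limitation.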
28. An article of manufacture for a data mining application, the article of manufacture comprising a computer readable medium containing computer program code that provides for receiving input determining a data mining operation goal, wherein the input determining a data mining operation goal is the only input required by the data mining application.
29. An article of manufacture for a data mining application, the article of manufacture comprising a computer readable medium containing computer program code selected from the group consisting of:
computer program code that receives natural language text providing a data mining operation goal;
computer program code that displays key data mining performance results in natural language text; and
computer program code that receives input providing a data mining operation goal, wherein the input providing a data mining operation goal is the only input required by the data mining application.
Dependent claims: 31, 32, 33, 34
30. A user control method for a data mining application, the user control method comprising:
specifying a goal of data mining in natural language text; and
displaying key data mining performance results in natural language text.
35. A problem specification method for mapping a data mining goal expressed in natural language to data fields, the method comprising:
providing a set of fields having field names,
receiving natural language text describing a data mining operation goal, wherein a data mining operation goal includes at least one dependent variable,
identifying key words in the natural language text,
performing lexical analysis on the natural language text with a Bayesian network,
calculating maximum a posteriori probabilities for fields by comparing lexical analysis results with field names,
recommending a small number of fields relatively likely to be candidates for the at least one dependent variable of the data mining operation goal,
communicating the fields relatively likely to be candidates to the user,
receiving additional user input specifying the dependent variable, and
for each target candidate, ranking input features based on their level of contribution to the expected data mining performance.
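The final step of claim 35, ranking input features for each target candidate, might look like the sketch below. The claim does not define "level of contribution to the expected data mining performance"; absolute Pearson correlation with the target is swapped in here purely as a simple stand-in, and the column names and values are made-up illustration data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, target_candidates):
    """For each candidate target, rank every other column by |correlation|.

    `columns` maps a field name to its list of numeric values.
    Returns {target: [(feature, score), ...]} with the strongest feature first.
    """
    rankings = {}
    for target in target_candidates:
        scores = [
            (name, abs(pearson(values, columns[target])))
            for name, values in columns.items()
            if name != target
        ]
        rankings[target] = sorted(scores, key=lambda kv: kv[1], reverse=True)
    return rankings


# Made-up sample data for illustration only.
columns = {
    "churn_flag": [0, 0, 1, 1, 0, 1],
    "support_calls": [1, 0, 4, 5, 1, 6],
    "tenure_months": [30, 42, 5, 3, 28, 8],
    "region_code": [1, 2, 1, 2, 1, 2],
}
rankings = rank_features(columns, ["churn_flag"])
```

In this sample, `support_calls` and `tenure_months` rank well above `region_code` for the target `churn_flag`. A measure tied to the actual mining algorithm, such as the drop in model accuracy when a feature is removed, would be closer to the claim's wording than a pairwise correlation.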
Specification