Interactive concept editing in computer-human interactive learning
First Claim
1. One or more hardware computer-storage device having embodied thereon computer-usable instructions that, when executed, facilitate a method of interactively generating dictionaries for machine learning, the method comprising:
presenting a user interface which receives user input that is utilized for generating a new dictionary that represents a single concept, wherein the new dictionary includes a list of one or both of words or n-grams that are positive examples of the single concept and is usable as a feature for training a classifier;
presenting on the user interface a positive-example field configured to receive user-input words or n-grams that are positive examples of the single concept, wherein the user-input words or n-grams are utilized to determine a generalized concept that corresponds to the positive examples, and wherein the positive examples are received from one or more of a typed entry or a selection of one or more suggested words or n-grams from one or more suggestion-set fields;
presenting on the user interface the one or more suggestion-set fields configured to display one or more system-generated lists that contain suggested words or n-grams that are generated based on the generalized concept, wherein the suggested words or n-grams are examples of the generalized concept, and wherein the suggested words or n-grams are selectable for inclusion in the positive-example field; and
presenting on the user interface a working set field configured to display words or n-grams added to a working set from the one or more suggestion-set fields, wherein the words or n-grams displayed in the working set field are saved in the new dictionary.
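The claim does not say how the generalized concept is determined or how suggestions are ranked. The following is only a minimal sketch of the interaction it describes, assuming a hypothetical table of context features per word and a simple overlap score. The names `CONTEXTS`, `DictionaryEditor`, and `dictionary_feature` are illustrative and do not come from the patent.

```python
from collections import Counter

# Hypothetical context features per candidate word (illustrative data only).
CONTEXTS = {
    "apple": {"eat", "juice", "tree"},
    "banana": {"eat", "peel", "tree"},
    "cherry": {"eat", "pie", "tree"},
    "mango": {"eat", "juice"},
    "car": {"drive", "road"},
    "truck": {"drive", "road", "load"},
}


class DictionaryEditor:
    """Toy model of the positive-example, suggestion-set, and working-set fields."""

    def __init__(self, contexts):
        self.contexts = contexts
        self.positive_examples = []  # positive-example field
        self.working_set = []        # working set field

    def type_positive(self, word):
        """Typed entry into the positive-example field."""
        if word not in self.positive_examples:
            self.positive_examples.append(word)

    def generalized_concept(self):
        """Assumption: the concept is the bag of context features of the positives."""
        concept = Counter()
        for word in self.positive_examples:
            concept.update(self.contexts.get(word, ()))
        return concept

    def suggestions(self, limit=3):
        """Rank unseen words by how strongly their features overlap the concept."""
        concept = self.generalized_concept()
        scored = []
        for word, features in self.contexts.items():
            if word in self.positive_examples:
                continue
            score = sum(concept[f] for f in features)
            if score > 0:
                scored.append((score, word))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [word for _, word in scored[:limit]]

    def select_suggestion(self, word):
        """Selecting a suggestion adds it to the positives and to the working set."""
        self.type_positive(word)
        if word not in self.working_set:
            self.working_set.append(word)

    def save(self):
        """The words shown in the working set field become the new dictionary."""
        return list(self.working_set)


def dictionary_feature(dictionary, tokens):
    """Use the dictionary as a classifier feature: count of matching tokens."""
    members = set(dictionary)
    return sum(1 for token in tokens if token in members)
```

In this sketch, typing "apple" and "banana" yields suggestions for other fruit-like words, and only the words the user picks from the suggestions land in the working set that is saved. Whether typed entries should also be saved is a design choice the sketch leaves out.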
Abstract
A collection of data that is extremely large can be difficult to search and/or analyze. Relevance may be dramatically improved by automatically classifying queries and web pages in useful categories, and using these classification scores as relevance features. A thorough approach may require building a large number of classifiers, corresponding to the various types of information, activities, and products. Creation of classifiers and schematizers is provided on large data sets. Exercising the classifiers and schematizers on hundreds of millions of items may expose value that is inherent to the data by adding usable meta-data. Some aspects include active labeling exploration, automatic regularization and cold start, scaling with the number of items and the number of classifiers, active featuring, and segmentation and schematization.
58 Citations
20 Claims
1. One or more hardware computer-storage device having embodied thereon computer-usable instructions that, when executed, facilitate a method of interactively generating dictionaries for machine learning, the method comprising:
presenting a user interface which receives user input that is utilized for generating a new dictionary that represents a single concept, wherein the new dictionary includes a list of one or both of words or n-grams that are positive examples of the single concept and is usable as a feature for training a classifier;
presenting on the user interface a positive-example field configured to receive user-input words or n-grams that are positive examples of the single concept, wherein the user-input words or n-grams are utilized to determine a generalized concept that corresponds to the positive examples, and wherein the positive examples are received from one or more of a typed entry or a selection of one or more suggested words or n-grams from one or more suggestion-set fields;
presenting on the user interface the one or more suggestion-set fields configured to display one or more system-generated lists that contain suggested words or n-grams that are generated based on the generalized concept, wherein the suggested words or n-grams are examples of the generalized concept, and wherein the suggested words or n-grams are selectable for inclusion in the positive-example field; and
presenting on the user interface a working set field configured to display words or n-grams added to a working set from the one or more suggestion-set fields, wherein the words or n-grams displayed in the working set field are saved in the new dictionary.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of interactively generating dictionaries for machine learning, the method comprising:
presenting a user interface which receives user input that is utilized for generating a new dictionary that represents a single concept, wherein the new dictionary includes a list of one or both of words or n-grams that are positive examples of the single concept and is usable as a feature for training a classifier;
presenting on the user interface a positive-example field configured to receive user-input words or n-grams that are positive examples of the single concept, wherein the user-input words or n-grams are utilized to determine a generalized concept that corresponds to the positive examples, and wherein the positive examples are received from one or more of a typed entry or a selection of one or more suggested words or n-grams from one or more suggestion-set fields;
presenting on the user interface the one or more suggestion-set fields configured to display one or more system-generated lists that contain suggested words or n-grams that are generated based on the generalized concept, wherein the suggested words or n-grams are examples of the generalized concept, and wherein the suggested words or n-grams are selectable for inclusion in the positive-example field; and
presenting on the user interface a working set field configured to display words or n-grams added to a working set from the one or more suggestion-set fields, wherein the words or n-grams displayed in the working set field are saved in the new dictionary.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
18. A system for interactively constructing dictionaries that represent respective concepts usable for machine learning, the system comprising:
one or more memory devices; and
one or more processors configured to:
generate an interface for editing a dictionary that represents a respective concept, wherein the dictionary represents the respective concept by including a list of words that are positive examples of the respective concept, and wherein the dictionary is usable as a feature for training a classifier;
present on the interface a positive-example input field configured to receive user-input words that are positive examples of the respective concept represented by the dictionary;
present on the user interface a negative-example input field configured to receive user-input words that are negative examples of the respective concept represented by the dictionary;
present on the user interface a suggestion-set field configured to display a system-generated list of suggested words, wherein the suggested words are examples of a generalized concept that is determined from words in one or both of the positive-example input field or the negative-example field;
receive one or more user-input words that are positive or negative examples of the respective concept represented by the dictionary, wherein the one or more user-input words are received from one or more of a typed entry, or a selection of one or more suggested words from the suggestion-set field;
determine a first generalized concept from the one or more user-input positive or negative examples;
based on the first generalized concept, generate a set of suggested words that are examples of the first generalized concept;
present the set of suggested words in the suggestion-set field on the user interface;
receive a user selection of a first suggested word from the set of suggested words;
include the first suggested word in the positive-example field or the negative-example field;
refine the set of suggested words based at least on the first suggested word that was included in the positive-example field or the negative-example field, wherein the refined set of suggested words represents a refined generalized concept;
present the refined set of suggested words in the first suggestion-set field;
receive an indication that the user has finished editing the dictionary; and
save the contents of the positive-example input field, comprising the user-input words that are positive examples of the respective concept, in the dictionary that represents the respective concept, in the one or more memory devices.
Dependent claims: 19, 20
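The suggest, select, and refine loop recited in claim 18 can be illustrated with a small sketch. The claim leaves the scoring method open, so this example assumes a hypothetical feature table and scores each candidate by its overlap with the positive examples minus its overlap with the negative examples. The names `CONTEXTS`, `concept_weights`, `suggest`, and `run_session` are illustrative only.

```python
from collections import Counter

# Hypothetical context features per candidate word (illustrative data only).
CONTEXTS = {
    "apple": {"eat", "sweet", "tree"},
    "pear": {"eat", "sweet", "tree"},
    "cherry": {"eat", "sweet", "tree", "pie"},
    "tomato": {"eat", "salad", "vine"},
    "cucumber": {"eat", "salad", "vine"},
    "lettuce": {"eat", "salad"},
}


def concept_weights(words, contexts):
    """Bag of context features contributed by a list of example words."""
    weights = Counter()
    for word in words:
        weights.update(contexts.get(word, ()))
    return weights


def suggest(contexts, positives, negatives, limit=5):
    """Assumed scoring: positive feature overlap minus negative feature overlap."""
    pos = concept_weights(positives, contexts)
    neg = concept_weights(negatives, contexts)
    used = set(positives) | set(negatives)
    scored = []
    for word, features in contexts.items():
        if word in used:
            continue
        score = sum(pos[f] - neg[f] for f in features)
        if score > 0:
            scored.append((score, word))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:limit]]


def run_session():
    """One pass through the loop: suggest, mark a negative, refine, save."""
    positives, negatives = ["apple"], []
    first = suggest(CONTEXTS, positives, negatives)
    negatives.append("tomato")    # user moves a suggestion to the negative field
    refined = suggest(CONTEXTS, positives, negatives)
    positives.append("cherry")    # user accepts a refined suggestion
    dictionary = list(positives)  # only the positive field is saved
    return first, refined, dictionary
```

Marking "tomato" as a negative example pushes the other salad-related words out of the refined list, which is the kind of narrowing the refinement step is meant to produce. As in the claim, only the contents of the positive-example field are saved as the dictionary.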
Specification