Crowdsourced training of textual natural language understanding systems
First Claim
1. A method in a computing system for adapting a virtual assistant to operate with respect to a plurality of user intents, comprising:
for each user intent of the plurality of user intents, receiving, by the computing system, (1) a sample expression of the user intent, and (2) an enumeration of entities of direct relevance to the user intent;
for each worker in a first pool crowdsourced workers;
providing, by the computing system, the sample expression of at least a portion of the plurality of user intents;
for each provided sample expression, obtaining, by the computing system, one or more alternative expressions of user intent that each use an expression of the user intent that differs from the sample expression of the user intent;
from each worker in a third pool crowdsourced workers;
for each of at least a portion of the obtained alternative expressions of user intent;
providing, by the computing system, the obtained one or more alternative expressions;
obtaining, by the computing system, a selection of the user intent expressed by the obtained one or more alternative expressions;
obtaining, by the computing system, a selection of the entities included in the obtained one or more alternative expressions, wherein at least one of the first pool of crowdsourced workers and the third pool of crowdsourced workers comprise a user, a bot, or a combination thereof;
training, by the computing system, a virtual assistant using the obtained one or more alternative expressions and their selected user intents and included entities; and
for each worker in a fourth set of crowdsourced users;
providing at least a portion of the selection of the user intents expressed by the obtained one or more alternative expressions;
obtaining expression validation indicators that each represent whether the at least a portion of the selection of the user intents expressed by the obtained one or more alternative expressions are correct;
providing at least a portion of the selection of the entities included in the obtained one or more alternative expressions; and
obtaining entities validation indicators that each represent whether the at least a portion of the selection of the entities included in the obtained one or more alternative expressions are correct.
Abstract
A facility for crowdsourcing the training of virtual assistants and other textual natural language understanding systems is described. The facility first specifies a set of possible user intents (e.g., a kind of question asked by users). As part of specifying an intent, entities that represent salient items of information associated with the intent are identified. Then, for each intent, the facility directs users of a crowdsourcing platform to input a number of different textual queries they might use to express that intent. Additional crowdsourcing platform users are then asked to perform semantic annotation of the cleaned queries, selecting for each query its intent and entities from predefined lists. Next, still other crowdsourcing platform users are asked whether the intents and entities selected during semantic annotation were correct for each query. Once validated, the annotated queries are used to train the assistant.
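The flow summarized in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the class names `Intent` and `Annotation`, the functions `majority` and `validated_training_set`, the sample queries, and in particular the strict-majority rule used to combine the validators' answers (the text says only that validators indicate whether a selection was correct, not how their answers are aggregated).

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """A predefined intent: a sample expression plus its enumerated entities."""
    name: str
    sample_expression: str
    entities: list[str]


@dataclass
class Annotation:
    """One crowdsourced query with the intent and entities an annotator selected."""
    query: str
    intent: str
    entities: dict[str, str]  # entity name -> text span found in the query
    intent_votes: list[bool] = field(default_factory=list)  # validators: intent correct?
    entity_votes: list[bool] = field(default_factory=list)  # validators: entities correct?


def majority(votes: list[bool]) -> bool:
    """Hypothetical aggregation rule: strict majority; no votes counts as not validated."""
    return sum(votes) * 2 > len(votes)


def validated_training_set(annotations, intents):
    """Keep annotations that use predefined lists and pass both validation votes."""
    kept = []
    for a in annotations:
        known = intents.get(a.intent)
        if known is None or not set(a.entities) <= set(known.entities):
            continue  # selection was not drawn from the predefined lists
        if majority(a.intent_votes) and majority(a.entity_votes):
            kept.append((a.query, a.intent, a.entities))
    return kept


# Illustrative data only.
intents = {
    "get_weather": Intent(
        "get_weather", "What is the weather in Paris tomorrow?", ["city", "date"]
    )
}
annotations = [
    Annotation("Will it rain in Rome on Friday?", "get_weather",
               {"city": "Rome", "date": "Friday"},
               intent_votes=[True, True, False], entity_votes=[True, True, True]),
    Annotation("Book me a table in Rome", "get_weather",
               {"city": "Rome"},
               intent_votes=[False, False, True], entity_votes=[True, True, False]),
]
training = validated_training_set(annotations, intents)
print(training)
```

In this sketch only the first query survives, because validators rejected the intent selected for the second one. A real system would pass the surviving tuples to whatever training routine the assistant uses.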
22 Claims
1. A method in a computing system for adapting a virtual assistant to operate with respect to a plurality of user intents, comprising:
for each user intent of the plurality of user intents, receiving, by the computing system, (1) a sample expression of the user intent, and (2) an enumeration of entities of direct relevance to the user intent;
for each worker in a first pool crowdsourced workers;
providing, by the computing system, the sample expression of at least a portion of the plurality of user intents;
for each provided sample expression, obtaining, by the computing system, one or more alternative expressions of user intent that each use an expression of the user intent that differs from the sample expression of the user intent;
from each worker in a third pool crowdsourced workers;
for each of at least a portion of the obtained alternative expressions of user intent;
providing, by the computing system, the obtained one or more alternative expressions;
obtaining, by the computing system, a selection of the user intent expressed by the obtained one or more alternative expressions;
obtaining, by the computing system, a selection of the entities included in the obtained one or more alternative expressions, wherein at least one of the first pool of crowdsourced workers and the third pool of crowdsourced workers comprise a user, a bot, or a combination thereof;
training, by the computing system, a virtual assistant using the obtained one or more alternative expressions and their selected user intents and included entities; and
for each worker in a fourth set of crowdsourced users;
providing at least a portion of the selection of the user intents expressed by the obtained one or more alternative expressions;
obtaining expression validation indicators that each represent whether the at least a portion of the selection of the user intents expressed by the obtained one or more alternative expressions are correct;
providing at least a portion of the selection of the entities included in the obtained one or more alternative expressions; and
obtaining entities validation indicators that each represent whether the at least a portion of the selection of the entities included in the obtained one or more alternative expressions are correct.
Dependent claims: 2, 3, 4, 5, 6
7. A method in a computing system for adapting a virtual assistant to operate with respect to a plurality of user intents, comprising:
for each user intent of the plurality of user intents;
receiving a set of entities of direct relevance to the user intent;
receiving, from a first set of crowdsourced users, a set of queries of direct relevance to the user intent;
for each query in the set of queries of direct relevance to the user intent;
presenting the query to a third set of crowdsourced users;
receiving, from the third set of crowdsourced users, a selection of the user intents expressed by the query; and
receiving, from the third set of crowdsourced users, a selection of the set of entities included in the query;
for each user in a fourth set of crowdsourced users;
presenting at least a portion of the selection of the user intents received from the third set of crowdsourced users and at least a portion of the selection of the set of entities included in the query received from the third set of crowdsourced users; and
receiving validating indicators that represent whether the selection of the user intents received from the third set of crowdsourced users and the selection of the set of entities included in the query received from the third set of crowdsourced users are correct; and
training a virtual assistant using the set of queries and their selected user intents and the set of entities.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A memory, which is not a signal per se, and whose contents are capable of causing a computing system to perform a method adapting a virtual assistant to operate with respect to a plurality of user intents, the method comprising:
for each user intent of the plurality of user intents, receiving (1) a sample expression of the user intent, and (2) an enumeration of entities of direct relevance to the user intent;
for each worker in a first pool crowdsourced workers;
providing the sample expression of at least a portion of the plurality of user intents;
for each provided sample expression, obtaining one or more alternative expressions of user intent that each use an expression of the user intent that differs from the sample expression of the distinguished user intent;
from each worker in a third pool crowdsourced workers;
for each of at least a portion of the obtained alternative expressions of user intent;
providing alternative expression;
obtaining a selection of the user intent expressed by the alternative expression;
obtaining a selection of the entities included in the alternative expression, wherein at least one of the first pool of crowdsourced works and the third pool of crowdsourced workers comprise a user, a bot, or a combination thereof;
training a virtual assistant using the alternative expressions and their selected user intents and included entities; and
for each worker in fourth set of crowdsourced users;
providing at least a portion of the selection of the user intents expressed by the alternative expression;
obtaining expression validation indicators that represent whether the at least a portion of the selection of the user intents expressed by the alternative expression are correct;
providing at least a portion of the selection of the entities included in the alternative expression; and
obtaining entities validation indicators that represent whether the at least a portion of the selection of the entities included in the alternative expression are correct.
Dependent claims: 18, 19, 20, 21, 22
Specification