Task-independent conversational systems
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating responses using task-independent conversational systems.
30 Claims
1. A method comprising:
- obtaining multi-task training data, the multi-task training data comprising a plurality of sequences of conversational inputs, wherein each sequence corresponds to a respective task, and the multi-task training data comprises sequences corresponding to multiple different tasks, wherein the multi-task training data comprises a respective reward and a respective conversational output for each conversational input, and wherein the respective rewards are generated based on one or more observable metrics that relate to a quality of conversational outputs generated by the conversational machine learning model; and
- training a conversational machine learning model on the multi-task training data to determine trained values of the parameters of the conversational machine learning model, wherein the conversational machine learning model is configured to receive as input a conversational input and to generate as output a conversational output that defines a response to a user that is independent of a task being performed when the conversational input was generated, wherein training the conversational machine learning model comprises training the conversational machine learning model using the respective rewards using reinforcement learning.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system comprising one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
- obtaining multi-task training data, the multi-task training data comprising a plurality of sequences of conversational inputs, wherein each sequence corresponds to a respective task, and the multi-task training data comprises sequences corresponding to multiple different tasks, wherein the multi-task training data comprises a respective reward and a respective conversational output for each conversational input, and wherein the respective rewards are generated based on one or more observable metrics that relate to a quality of conversational outputs generated by the conversational machine learning model; and
- training a conversational machine learning model on the multi-task training data to determine trained values of the parameters of the conversational machine learning model, wherein the conversational machine learning model is configured to receive as input a conversational input and to generate as output a conversational output that defines a response to a user that is independent of a task being performed when the conversational input was generated, wherein training the conversational machine learning model comprises training the conversational machine learning model using the respective rewards using reinforcement learning.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. One or more non-transitory computer readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
- obtaining multi-task training data, the multi-task training data comprising a plurality of sequences of conversational inputs, wherein each sequence corresponds to a respective task, and the multi-task training data comprises sequences corresponding to multiple different tasks, wherein the multi-task training data comprises a respective reward and a respective conversational output for each conversational input, and wherein the respective rewards are generated based on one or more observable metrics that relate to a quality of conversational outputs generated by the conversational machine learning model; and
- training a conversational machine learning model on the multi-task training data to determine trained values of the parameters of the conversational machine learning model, wherein the conversational machine learning model is configured to receive as input a conversational input and to generate as output a conversational output that defines a response to a user that is independent of a task being performed when the conversational input was generated, wherein training the conversational machine learning model comprises training the conversational machine learning model using the respective rewards using reinforcement learning.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
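The two steps recited in the independent claims (obtaining rewarded multi-task conversational data, then training with reinforcement learning) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Turn` record, the metric names `task_completed` and `user_rephrased`, the three canned responses, the hand-written `features` function, and the choice of a reward-weighted REINFORCE-style update on a softmax policy. The task label is stored with the data but never passed to the model, which is one simple way to make the response independent of the task.

```python
import math
import random
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Turn:
    task: str                   # labels the sequence only; never fed to the model
    conversational_input: str
    conversational_output: str
    reward: float

# Hypothetical observable metrics and weights (assumed, not from the patent).
METRIC_WEIGHTS = {"task_completed": 1.0, "user_rephrased": -1.0}

def reward_from_metrics(metrics: Dict[str, float]) -> float:
    """Combine observable quality metrics into one scalar reward."""
    return sum(METRIC_WEIGHTS[name] * value for name, value in metrics.items())

# Toy action space: the model picks one canned response.
RESPONSES = ["Sure, done.", "Could you clarify?", "Sorry, I can't help with that."]

def features(text: str) -> List[float]:
    """Toy features computed from the input text alone (no task identifier)."""
    return [1.0, float("?" in text), float("please" in text.lower())]

class SoftmaxPolicy:
    def __init__(self, n_features: int, n_actions: int) -> None:
        self.w = [[0.0] * n_features for _ in range(n_actions)]

    def probs(self, x: List[float]) -> List[float]:
        logits = [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]
        top = max(logits)                       # subtract max for stability
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def update(self, x: List[float], action: int, reward: float, lr: float = 0.1) -> None:
        """REINFORCE-style step: w += lr * reward * grad log pi(action | x)."""
        p = self.probs(x)
        for a, row in enumerate(self.w):
            coeff = (1.0 if a == action else 0.0) - p[a]
            for j, xj in enumerate(x):
                row[j] += lr * reward * coeff * xj

def train(policy: SoftmaxPolicy, data: List[Turn], epochs: int = 50, seed: int = 0) -> None:
    rng = random.Random(seed)
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)                       # interleave turns from all tasks
        for turn in data:
            action = RESPONSES.index(turn.conversational_output)
            policy.update(features(turn.conversational_input), action, turn.reward)

def respond(policy: SoftmaxPolicy, text: str) -> str:
    p = policy.probs(features(text))
    return RESPONSES[p.index(max(p))]

GOOD = reward_from_metrics({"task_completed": 1.0, "user_rephrased": 0.0})
BAD = reward_from_metrics({"task_completed": 0.0, "user_rephrased": 1.0})

training_data = [
    Turn("book_flight", "Please book a flight to Paris", "Sure, done.", GOOD),
    Turn("book_flight", "Is my flight booked?", "Sorry, I can't help with that.", BAD),
    Turn("set_alarm", "Please set an alarm for 7am", "Sure, done.", GOOD),
    Turn("set_alarm", "What time is it?", "Could you clarify?", BAD),
]

policy = SoftmaxPolicy(n_features=3, n_actions=len(RESPONSES))
train(policy, training_data)
print(respond(policy, "Please order a pizza"))  # input from a task unseen in training
```

The update uses logged rewards directly, without a baseline or importance weighting, so it should be read as the simplest possible stand-in for "training using the respective rewards using reinforcement learning", not as the training procedure the patent describes.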
Specification