
Dialogue act estimation method, dialogue act estimation apparatus, and storage medium

  • US 10,460,721 B2
  • Filed: 06/07/2017
  • Issued: 10/29/2019
  • Est. Priority Date: 06/23/2016
  • Status: Active Grant
First Claim

1. A dialogue act estimation method, in a dialogue act estimation system, comprising:

  • acquiring sounds by a microphone in a terminal;

  • determining, by a processor in the terminal, whether the acquired sounds are uttered sentences of one or more speakers or noise;

  • outputting the uttered sentences to a communication transmitter only when the processor determines that the acquired sounds are uttered sentences of the one or more speakers and are not noise;

  • converting the uttered sentences of the one or more speakers to one or more formatted communication signals when the processor determines that the acquired sounds are uttered sentences of the one or more speakers;

  • transmitting the one or more formatted communication signals from the terminal over a communication network to a server;

  • receiving the one or more formatted communication signals by the server;

  • converting the received one or more formatted communication signals by a processor in the server to the uttered sentences of the one or more speakers;

  • acquiring first training data by the server from the converted uttered sentences of the one or more speakers, the first training data indicating, in a mutually associated manner, text data of a first sentence that can be a current uttered sentence, text data of a second sentence that can be an uttered sentence immediately previous to the first sentence, first speaker change information indicating whether a speaker of the first sentence is the same as a speaker of the second sentence, and dialogue act information indicating a class of the first sentence;

  • learning an association between the current uttered sentence and the dialogue act information by applying the first training data to a model;

  • storing a result of the learning as learning result information in a memory in the server;

  • acquiring dialogue data including text data of a third sentence of a current uttered sentence uttered by a user, text data of a fourth sentence of an uttered sentence immediately previous to the third sentence, and second speaker change information indicating whether the speaker of the third sentence is the same as a speaker of the fourth sentence;

  • estimating a dialogue act to which the third sentence is classified by applying the dialogue data to the model based on the learning result information; and

  • generating a correct response to the uttered sentences of the one or more speakers,

wherein the model includes a first model that outputs a first feature vector based on the text data of the first sentence, the text data of the second sentence, the first speaker identification information, the second speaker identification information, and a first weight parameter, and a second model that outputs a second feature vector based on the text data of the first sentence, the text data of the second sentence, the first speaker change information, and a second weight parameter,

wherein the first model determines the first feature vector from the first sentence and the second sentence according to a first RNN-LSTM (Recurrent Neural Network-Long Short Term Memory) having the first weight parameter dependent on the first speaker identification information and the second speaker identification information, and

wherein the second model determines the second feature vector from the first sentence and the second sentence according to a second RNN-LSTM having the second weight parameter dependent on the first speaker change information.
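The two-model structure of the claim can be sketched in code. The following is a minimal illustrative sketch, not the patented implementation: a toy LSTM runs over the previous and current sentences twice, once with weights selected by the speaker identification pair (first model) and once with weights selected by the speaker-change flag (second model), and the two feature vectors are concatenated for a hypothetical dialogue-act classifier. All names, dimensions, and random weights here are assumptions standing in for the learned weight parameters.

```python
import math
import random

random.seed(0)
D_EMB, D_HID, N_ACTS = 4, 6, 4  # toy embedding size, hidden size, act classes

def randmat(rows, cols):
    """Random weight matrix; stands in for a learned weight parameter."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h, c, W):
    """One LSTM step: W maps [x; h] to the four gate pre-activations."""
    z = matvec(W, x + h)
    i, f, o, g = (z[k * D_HID:(k + 1) * D_HID] for k in range(4))
    c = [sigmoid(fj) * cj + sigmoid(ij) * math.tanh(gj)
         for ij, fj, gj, cj in zip(i, f, g, c)]
    h = [sigmoid(oj) * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

def encode(seq, W):
    """Run the LSTM over a sequence of embeddings; the final hidden state
    is the feature vector."""
    h, c = [0.0] * D_HID, [0.0] * D_HID
    for x in seq:
        h, c = lstm_step(x, h, c, W)
    return h

# First model: one weight set per (speaker of second sentence, speaker of
# first sentence) pair -- weights dependent on speaker identification info.
W_first = {(a, b): randmat(4 * D_HID, D_EMB + D_HID) for a in "AB" for b in "AB"}
# Second model: one weight set per value of the speaker change flag.
W_second = {flag: randmat(4 * D_HID, D_EMB + D_HID) for flag in (True, False)}
W_out = randmat(N_ACTS, 2 * D_HID)  # hypothetical dialogue-act classifier

def estimate_dialogue_act(first_sent, second_sent, spk_first, spk_second):
    seq = second_sent + first_sent  # previous sentence, then current sentence
    f1 = encode(seq, W_first[(spk_second, spk_first)])   # first feature vector
    f2 = encode(seq, W_second[spk_first != spk_second])  # second feature vector
    logits = matvec(W_out, f1 + f2)
    return logits.index(max(logits))  # index of the estimated act class

# Toy embeddings standing in for word vectors of the two sentences.
sent1 = [[random.gauss(0, 1) for _ in range(D_EMB)] for _ in range(5)]
sent2 = [[random.gauss(0, 1) for _ in range(D_EMB)] for _ in range(3)]
act = estimate_dialogue_act(sent1, sent2, "A", "B")
```

In this sketch the dependence of each weight parameter on the speaker information is realized by table lookup over a small discrete key; the patent instead describes the parameters as learned from the first training data.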
