
Information processing method, information processing device, and recording medium for determining registered speakers as target speakers in speaker recognition

  • US 11,417,344 B2
  • Filed: 10/21/2019
  • Issued: 08/16/2022
  • Est. Priority Date: 10/24/2018
  • Status: Active Grant
First Claim

1. An information processing method performed by a computer, the information processing method comprising:

  • detecting at least one speech segment from speech utterances that are sequentially input to a speech input unit;

    extracting, from each of the at least one speech segment, a first feature quantity identifying a speaker whose voice is contained in the speech segment;

    performing a comparison between the first feature quantity extracted and each of second feature quantities stored in a second storage as targets in speaker recognition for identifying respective voices of registered speakers, the second feature quantities being among second feature quantities pre-stored in a first storage and identifying respective voices of registered speakers;

    performing a parsing and management of the registered speakers in the second storage, based on results of the comparison, which is performed for each consecutive speech segment of the at least one speech segment, of:

    deleting, from the second storage, at least one second feature quantity among the second feature quantities when a degree of similarity between the first feature quantity in the consecutive speech segments, which is present for a fixed period of time or for a fixed number of times, and the at least one second feature quantity stored in the second storage is less than or equal to a threshold and a predetermined condition is satisfied, to remove at least one registered speaker identified by the at least one second feature quantity from the registered speakers stored in the second storage and reduce a total number of registered speakers as target speakers for speaker recognition; and

    when a first feature quantity having a degree of similarity between the first feature quantity and each of the second feature quantities stored in the second storage, which is less than or equal to a threshold, appears among first feature quantities in speech segments that follow the consecutive speech segments, storing, in the second storage, a second feature quantity having a degree of similarity between the first feature quantity that appeared among the first feature quantities and the second feature quantities stored in the first storage that is greater than a threshold, based on comparing the first feature quantity that appeared among the first feature quantities and each of the second feature quantities stored in the first storage; and

    adding, to the second storage, the first feature quantity that appeared among the first feature quantities as a feature quantity identifying a voice of a new registered speaker when a degree of similarity between the first feature quantity that appeared among the first feature quantities and each of the second feature quantities stored in the first storage is less than or equal to a threshold based on a result of comparing the first feature quantity that appeared among the first feature quantities and each of the second feature quantities stored in the first storage, to increase the total number of registered speakers who are target speakers for speaker recognition.
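
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `SpeakerRegistry`, the use of cosine similarity as the "degree of similarity", the single shared `threshold`, the `max_misses` counter standing in for "a fixed number of times", and the `new_speaker_N` labels are all assumptions made for illustration. The claim's "predetermined condition" is not specified in this excerpt and is left out.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class SpeakerRegistry:
    # first_storage: every pre-registered speaker (name -> feature vector).
    # second_storage: the subset currently used as recognition targets.
    first_storage: dict
    second_storage: dict
    threshold: float = 0.8   # hypothetical value
    max_misses: int = 3      # hypothetical "fixed number of times"
    misses: dict = field(default_factory=dict)
    new_count: int = 0

    def process_segment(self, feature):
        """Handle the first feature quantity extracted from one speech segment."""
        scores = {n: cosine(feature, v) for n, v in self.second_storage.items()}
        matched = [n for n, s in scores.items() if s > self.threshold]

        # Deleting step: drop targets that stayed at or below the threshold
        # for max_misses consecutive segments.
        for name in list(self.second_storage):
            if name in matched:
                self.misses[name] = 0
                continue
            self.misses[name] = self.misses.get(name, 0) + 1
            if self.misses[name] >= self.max_misses:
                del self.second_storage[name]
                del self.misses[name]

        if matched:
            return max(matched, key=scores.get)

        # Storing step: no active target matched, so search the first storage
        # and promote the best entry if it exceeds the threshold.
        best = max(
            self.first_storage.items(),
            key=lambda kv: cosine(feature, kv[1]),
            default=None,
        )
        if best is not None and cosine(feature, best[1]) > self.threshold:
            self.second_storage[best[0]] = best[1]
            self.misses[best[0]] = 0
            return best[0]

        # Adding step: unknown voice, so register the feature itself as a new target.
        self.new_count += 1
        name = f"new_speaker_{self.new_count}"
        self.second_storage[name] = list(feature)
        self.misses[name] = 0
        return name
```

Calling `process_segment` once per detected speech segment shrinks the target set when a registered speaker stays silent and grows it when a known or previously unseen voice appears, which is the trade-off the claim describes between the number of target speakers and recognition coverage.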
