INFORMATION PROCESSING METHOD, INFORMATION PROCESSING DEVICE, AND RECORDING MEDIUM

US 20200135211A1
Filed: 10/21/2019
Published: 04/30/2020
Est. Priority Date: 10/24/2018
Status: Active Grant

Abstract

The information processing method in the present disclosure is performed as below. At least one speech segment is detected from speech input to a speech input unit. A first feature quantity is extracted from each speech segment detected, the first feature quantity identifying a speaker whose voice is contained in the speech segment. The first feature quantity extracted is compared with each of second feature quantities stored in storage and identifying the respective voices of registered speakers who are target speakers in speaker recognition. The comparison is performed for each of consecutive speech segments, and under a predetermined condition, among the second feature quantities stored in the storage, at least one second feature quantity whose similarity with the first feature quantity is less than or equal to a threshold is deleted, thereby removing the at least one registered speaker identified by the at least one second feature quantity.
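
As an illustration only, the comparison and deletion steps summarized above might be sketched in Python as follows. The cosine similarity measure, the reading of the "predetermined condition" as "similarity stayed at or below the threshold for m consecutive segments", and every name used here (`cosine_similarity`, `prune_registered`, `storage`, `threshold`, `m`) are assumptions made for the sketch and are not taken from the disclosure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def prune_registered(storage, segment_features, threshold=0.5, m=3):
    """Delete speakers whose similarity stayed <= threshold for m segments in a row.

    storage: dict mapping a speaker id to its stored (second) feature quantity.
    segment_features: first feature quantities, one per consecutive speech segment.
    Returns the list of speaker ids that were removed from storage.
    """
    misses = {speaker: 0 for speaker in storage}  # consecutive low-similarity counts
    removed = []
    for first in segment_features:
        for speaker in list(storage):  # copy the keys so deletion is safe
            if cosine_similarity(first, storage[speaker]) <= threshold:
                misses[speaker] += 1
            else:
                misses[speaker] = 0  # assumed: a match resets the count
            if misses[speaker] >= m:
                del storage[speaker]
                removed.append(speaker)
    return removed
```

Under these assumptions, a registered speaker who is not heard across m consecutive segments drops out of the comparison set, so later segments are compared against fewer stored feature quantities.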

12 Claims

1. An information processing method performed by a computer, the information processing method comprising:
- detecting at least one speech segment from speech input to a speech input unit;
  
  extracting, from each of the at least one speech segment, a first feature quantity identifying a speaker whose voice is contained in the speech segment;
  
  performing a comparison between the first feature quantity extracted and each of second feature quantities stored in storage and identifying respective voices of registered speakers who are target speakers in speaker recognition; and
  
  determining registered speakers by performing the comparison for each of consecutive speech segments detected in the detecting and, under a predetermined condition, deleting, from the storage, at least one second feature quantity having a degree of similarity less than or equal to a threshold among the second feature quantities stored in the storage, to remove at least one registered speaker identified by the at least one second feature quantity, the degree of similarity being a degree of similarity with the first feature quantity.
  - 2. The information processing method according to claim 1, wherein in the determining, as a result of the comparison, when degrees of similarity between the first feature quantity and all the second feature quantities stored in the storage are less than or equal to the threshold, the storage stores the first feature quantity as a feature quantity identifying a voice of a new registered speaker.
  - 3. The information processing method according to claim 1, wherein in the determining, when the second feature quantities stored in the storage include a second feature quantity having a degree of similarity higher than the threshold, the second feature quantity having a degree of similarity higher than the threshold is updated to a feature quantity including the first feature quantity and the second feature quantity having a degree of similarity higher than the threshold, to update information on a registered speaker identified by the second feature quantity having a degree of similarity higher than the threshold and stored in the storage, the degree of similarity being a degree of similarity with the first feature quantity.
  - 4. The information processing method according to claim 1, wherein the storage pre-stores the second feature quantities.
  - 5. The information processing method according to claim 1, further comprising:
    - registering target speakers before the computer performs the determining, by (i) instructing each of the target speakers to utter first speech and inputting the respective first speech to the speech input unit, (ii) detecting first speech segments from the respective first speech, (iii) extracting, from the first speech segments, feature quantities in speech identifying the respective target speakers, and (iv) storing the feature quantities in the storage as the second feature quantities.
  - 6. The information processing method according to claim 1, wherein in the determining, as the predetermined condition, the comparison is performed a total of m times for the consecutive speech segments, where m is an integer greater than or equal to 2, and as a result of the comparison performed m times, when at least one second feature quantity having a degree of similarity less than or equal to the threshold is included, at least one registered speaker identified by the at least one second feature quantity is removed, the degree of similarity being a degree of similarity with the first feature quantity extracted in each of the consecutive speech segments.
  - 7. The information processing method according to claim 1, wherein in the determining, as the predetermined condition, the comparison is performed for a predetermined period, and as a result of the comparison performed for the predetermined period, when at least one second feature quantity having a degree of similarity less than or equal to the threshold is included, at least one registered speaker identified by the at least one second feature quantity is removed, the degree of similarity being a degree of similarity with the first feature quantity.
  - 8. The information processing method according to claim 1, wherein in the determining, when the storage stores, as the second feature quantities, second feature quantities identifying two or more respective registered speakers who are target speakers in speaker recognition, at least one registered speaker identified by the at least one second feature quantity is removed.
  - 9. The information processing method according to claim 1, wherein in the detecting, speech segments are detected consecutively in a time sequence from speech input to the speech input unit.
  - 10. The information processing method according to claim 1, wherein in the detecting, speech segments are detected at predetermined intervals from speech input to the speech input unit.
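
The registration and update behavior recited in claims 2 and 3 could be sketched roughly as below. This is a hedged illustration, not the claimed implementation: the class `SpeakerStore`, its `observe` method, the integer speaker ids, the cosine similarity measure, the choice to update only the best-matching entry, and the running-mean merge are all assumptions introduced for the example. The claims only require that the updated feature quantity include both the first and the matching second feature quantity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerStore:
    """Hypothetical storage of second feature quantities, keyed by speaker id."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.features = {}  # speaker id -> (mean feature, segments averaged so far)
        self._next_id = 0   # never reused, so ids stay unique after deletions

    def observe(self, first):
        """Register a new speaker or update the best match; return the speaker id."""
        best_id, best_sim = None, self.threshold
        for speaker, (second, _) in self.features.items():
            sim = cosine_similarity(first, second)
            if sim > best_sim:  # strictly higher than the threshold (claim 3)
                best_id, best_sim = speaker, sim
        if best_id is None:
            # All similarities <= threshold: store as a new registered speaker (claim 2).
            best_id = self._next_id
            self._next_id += 1
            self.features[best_id] = (list(first), 1)
        else:
            # Assumed merge rule: running mean of the stored and the new feature.
            second, n = self.features[best_id]
            merged = [(s * n + f) / (n + 1) for s, f in zip(second, first)]
            self.features[best_id] = (merged, n + 1)
        return best_id
```

For example, observing `[1.0, 0.0]`, then `[0.0, 1.0]`, then `[1.0, 0.2]` with the default threshold registers two speakers and folds the third feature into the first speaker's stored entry.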

11. An information processing device comprising:
- a detector that detects at least one speech segment from speech input to a speech input unit;
  
  a feature quantity extraction unit configured to extract, from each of the at least one speech segment, a first feature quantity identifying a speaker whose voice is contained in the speech segment;
  
  a comparator that performs a comparison between the first feature quantity extracted and each of second feature quantities stored in storage and identifying respective registered speakers who are target speakers in speaker recognition; and
  
  a registered speaker determination unit configured to perform the comparison for each of consecutive speech segments detected in the detecting and, under a predetermined condition, remove at least one registered speaker identified by at least one second feature quantity having a degree of similarity less than or equal to a threshold among the second feature quantities stored in the storage, the degree of similarity being a degree of similarity with the first feature quantity.
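
One way to picture the unit decomposition in claim 11 is a small pipeline in which each unit is a pluggable callable. The sketch below is an assumption-laden illustration: the `Device` dataclass, the `energy_detector` helper (fixed-length frames gated by mean energy), and the signature of the `determine` callable are hypothetical and do not come from the disclosure, which leaves the detection and feature extraction techniques open in the claims.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Samples = List[float]
Feature = List[float]


def energy_detector(frame_len: int, min_energy: float) -> Callable[[Samples], List[Samples]]:
    """Hypothetical detector: keep fixed-length frames whose mean energy exceeds min_energy."""
    def detect(speech: Samples) -> List[Samples]:
        frames = [speech[i:i + frame_len] for i in range(0, len(speech), frame_len)]
        return [f for f in frames if sum(x * x for x in f) / len(f) > min_energy]
    return detect


@dataclass
class Device:
    detect: Callable[[Samples], List[Samples]]     # detector
    extract: Callable[[Samples], Feature]          # feature quantity extraction unit
    compare: Callable[[Feature, Feature], float]   # comparator
    # Registered speaker determination unit: receives the per-speaker
    # similarities for one segment and may modify the storage in place.
    determine: Callable[[Dict[str, float], Dict[str, Feature]], None]
    storage: Dict[str, Feature] = field(default_factory=dict)

    def process(self, speech: Samples) -> int:
        """Run each detected segment through the units; return the segment count."""
        segments = self.detect(speech)
        for segment in segments:
            first = self.extract(segment)
            scores = {spk: self.compare(first, second)
                      for spk, second in self.storage.items()}
            self.determine(scores, self.storage)
        return len(segments)
```

Keeping the determination unit separate from the comparator means the removal condition (a count of m comparisons, a predetermined period, or something else) can be swapped without touching detection or feature extraction.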

12. A non-transitory computer-readable recording medium for use in a computer, the recording medium having a program recorded thereon for causing the computer to perform an information processing method, the information processing method comprising:
- detecting at least one speech segment from speech input to a speech input unit;
  
  extracting, from each of the at least one speech segment, a first feature quantity identifying a speaker whose voice is contained in the speech segment;
  
  performing a comparison between the first feature quantity extracted and each of second feature quantities stored in storage and identifying respective registered speakers who are target speakers in speaker recognition; and
  
  determining registered speakers by performing the comparison for each of consecutive speech segments detected in the detecting and, under a predetermined condition, removing at least one registered speaker identified by at least one second feature quantity having a degree of similarity less than or equal to a threshold among the second feature quantities stored in the storage, the degree of similarity being a degree of similarity with the first feature quantity.

Current Assignee
Panasonic Intellectual Property Corporation of America (Panasonic Holdings Corporation)
Original Assignee
Panasonic Intellectual Property Corporation of America (Panasonic Holdings Corporation)
Inventors
DOI, Misaki

Granted Patent

US 11,417,344 B2

CPC Class Codes

G10L 17/02   Preprocessing operations, e...

G10L 17/04   Training, enrolment or mode...

G10L 17/06   Decision making techniques;...

G10L 17/22   Interactive procedures; Man...
