Voice focus enabled by predetermined triggers
Abstract
Provided are techniques for voice focus enabled by predetermined triggers. Voice recognition is used to identify one or more pre-determined triggers from a voice of a speaker. In response to identifying the one or more pre-determined triggers, a voice recognition template is dynamically created for the voice of the speaker, and the voice recognition template and voice isolation are used to focus on the voice from the speaker.
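The flow in the abstract (identify a trigger, then dynamically create a template for that speaker) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the audio has already been transcribed to text and reduced to a feature vector, and the trigger phrases, the `VoiceTemplate` class, and the `on_utterance` function are all hypothetical names introduced here.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical trigger phrases; the source does not enumerate any.
TRIGGERS = ("excuse me", "hello", "help")


@dataclass
class VoiceTemplate:
    """Illustrative stand-in for stored voice biometrics."""
    speaker_id: str
    features: List[float]


def find_triggers(transcript: str, triggers=TRIGGERS) -> List[str]:
    """Return the triggers that occur as whole words in a transcript."""
    # Normalize case and collapse runs of whitespace before matching.
    text = " ".join(transcript.lower().split())
    return [t for t in triggers
            if re.search(r"\b" + re.escape(t) + r"\b", text)]


def on_utterance(speaker_id: str, transcript: str, features: List[float],
                 store: Dict[str, VoiceTemplate]) -> Optional[VoiceTemplate]:
    """Create and store a template only when a trigger is identified."""
    if not find_triggers(transcript):
        return None
    template = VoiceTemplate(speaker_id, list(features))
    store[speaker_id] = template
    return template
```

The voice isolation step that uses the template to focus on the speaker is signal processing that this sketch does not attempt.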
29 Citations
14 Claims
1. A computer system, comprising:
- one or more processors, one or more computer-readable memories and one or more computer-readable, tangible storage devices; and
- program instructions, stored on at least one of the one or more computer-readable, tangible storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to perform operations comprising:
- using voice recognition to identify one or more pre-determined triggers from each voice of multiple speakers addressing a listener nearly simultaneously; and
- in response to identifying the one or more pre-determined triggers, for each of the multiple speakers that does not have a previously stored voice recognition template, dynamically creating a voice recognition template to store voice biometrics of that speaker;
- for each of the multiple speakers that does have a stored voice recognition template, updating the voice recognition template;
- selecting a speaker from among the multiple speakers to focus on based on clarity of that speaker, direction of that speaker, one or more keywords spoken by that speaker, and whether there is a previously stored voice recognition template for that speaker; and
- using the voice recognition template and voice isolation to focus on the voice from the selected speaker.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer program product, the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code executable by at least one processor to perform:
- using, by the at least one processor, voice recognition to identify one or more pre-determined triggers from each voice of multiple speakers addressing a listener nearly simultaneously; and
- in response to identifying the one or more pre-determined triggers, for each of the multiple speakers that does not have a previously stored voice recognition template, dynamically creating, by the at least one processor, a voice recognition template to store voice biometrics of that speaker;
- for each of the multiple speakers that does have a stored voice recognition template, updating the voice recognition template;
- selecting a speaker from among the multiple speakers to focus on based on clarity of that speaker, direction of that speaker, one or more keywords spoken by that speaker, and whether there is a previously stored voice recognition template for that speaker; and
- using, by the at least one processor, the voice recognition template and voice isolation to focus on the voice from the selected speaker.
Dependent claims: 9, 10, 11, 12, 13, 14
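The template and selection steps shared by claims 1 and 8 can be sketched as below. The claims name four selection factors (clarity, direction, keywords, and whether a template was previously stored) but do not say how they are combined, so the weighted sum, the weights, the 0..1 clarity scale, the feature-averaging update rule, and every identifier here are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Observation:
    speaker_id: str
    clarity: float         # assumed scale: 0..1, higher is clearer
    angle_deg: float       # offset from the listener's facing direction
    keywords: int          # count of keywords of interest spoken
    features: List[float]  # stand-in for voice biometrics


def upsert_template(templates: Dict[str, List[float]], obs: Observation) -> bool:
    """Create a template for a new speaker or update a stored one.

    Returns True if a template was already stored before this call.
    """
    old = templates.get(obs.speaker_id)
    if old is None:
        templates[obs.speaker_id] = list(obs.features)
        return False
    # Assumed update rule: blend stored and new features equally.
    templates[obs.speaker_id] = [0.5 * a + 0.5 * b
                                 for a, b in zip(old, obs.features)]
    return True


def score(obs: Observation, had_template: bool,
          weights=(0.4, 0.2, 0.3, 0.1)) -> float:
    """Combine the four claimed factors with hypothetical weights."""
    w_clarity, w_direction, w_keywords, w_known = weights
    direction = 1.0 - min(abs(obs.angle_deg), 180.0) / 180.0
    keyword = min(obs.keywords, 3) / 3.0  # cap so keywords cannot dominate
    known = 1.0 if had_template else 0.0
    return (w_clarity * obs.clarity + w_direction * direction
            + w_keywords * keyword + w_known * known)


def select_speaker(observations: List[Observation],
                   templates: Dict[str, List[float]]) -> Optional[str]:
    """Create or update every template, then pick the speaker to focus on."""
    if not observations:
        return None
    had = {o.speaker_id: upsert_template(templates, o) for o in observations}
    best = max(observations, key=lambda o: score(o, had[o.speaker_id]))
    return best.speaker_id
```

For example, with a stored template for `"bob"` only, a slightly clearer but unknown speaker who says no keywords loses to `"bob"` speaking three keywords from straight ahead. The final claimed step, using the selected speaker's template with voice isolation, would consume the returned id and is outside this sketch.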
Specification