Voice authentication and speech recognition system and method

US 9,424,837 B2
Filed: 01/23/2013
Issued: 08/23/2016
Est. Priority Date: 01/24/2012
Status: Active Grant

Abstract

A method for configuring a speech recognition system comprises obtaining a speech sample utilized by a voice authentication system in a voice authentication process. The speech sample is processed to generate acoustic models for units of speech associated with the speech sample. The acoustic models are stored for subsequent use by the speech recognition system as part of a speech recognition process.

16 Claims

1. A method for configuring a speech recognition system, the method comprising:
- obtaining a speech sample from a user utilised to authenticate the user as part of an authentication process;
  
  processing the speech sample to train one or more generic acoustic model(s) for units of speech associated with the speech sample;
  
  storing the trained acoustic model(s) in a personalised acoustic model set for the user;
  
  selectively re-training the acoustic model(s) in the personalised model set based on additional speech samples provided by the user containing corresponding units of speech;
  
  responsive to determining that the user has accessed a speech recognition function, directing a speech recognition process to access the personalised model set for recognising subsequent user utterances; and
  
  further comprising determining a measure of quality for each of the stored acoustic models and wherein the acoustic modules are re-trained based on additional speech samples until the corresponding quality measure meets a predefined threshold.
- Dependent claims (2, 3, 4, 5, 8, 14):
  - 2. A method in accordance with claim 1, wherein the speech units comprise triphones, diphones, senones, phonemes, words or phrases.
  - 3. A method in accordance with claim 2, further comprising evaluating speech content data associated with the speech sample to determine an audible identifier for each of the speech units and classifying the acoustic models based on the determined audible identifier.
  - 4. A method in accordance with claim 1, wherein the acoustic models comprise language models for the units of speech.
  - 5. A method in accordance with claim 1, wherein the speech recognition process is automatically directed to access the personalised model set responsive to the user being successfully authenticated.
  - 8. A non-transitory computer readable medium implementing a computer program comprising one or more instructions for controlling a computer system to implement a method in accordance with claim 1.
  - 14. A method in accordance with claim 1, wherein the acoustic models comprise grammar models for the units of speech.
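
The re-training step of claim 1 (re-train each stored model on additional samples until its quality measure meets a predefined threshold) can be illustrated with a minimal sketch. The claims do not say what form the acoustic models or the quality measure take, so everything concrete here is an assumption made for illustration: `UnitModel` is a toy model holding a mean and variance over scalar features, the quality measure is simply the number of user samples absorbed, and `QUALITY_THRESHOLD` is an arbitrary value.

```python
from dataclasses import dataclass, field
from statistics import fmean, pvariance


@dataclass
class UnitModel:
    """Toy acoustic model for one unit of speech (hypothetical, not from the patent)."""
    unit: str
    mean: float
    variance: float
    samples: list = field(default_factory=list)

    def quality(self) -> int:
        # Assumed quality measure: how many user samples the model has absorbed.
        return len(self.samples)

    def retrain(self, features) -> None:
        # Re-estimate the toy parameters from all user samples seen so far.
        self.samples.extend(features)
        self.mean = fmean(self.samples)
        self.variance = pvariance(self.samples)


QUALITY_THRESHOLD = 3  # hypothetical predefined threshold


def update_personalised_set(model_set, generic_models, labelled_features):
    """Fold one speech sample into a user's personalised model set.

    labelled_features is an iterable of (unit, feature) pairs taken from a
    speech sample that was supplied for authentication.
    """
    for unit, feature in labelled_features:
        model = model_set.get(unit)
        if model is None:
            # Start from a copy of the generic model for this unit.
            generic = generic_models[unit]
            model = UnitModel(unit, generic.mean, generic.variance)
            model_set[unit] = model
        # Selective re-training: stop once the quality measure meets the threshold.
        if model.quality() < QUALITY_THRESHOLD:
            model.retrain([feature])
    return model_set


generic_models = {"ah": UnitModel("ah", 0.0, 1.0)}
personalised = {}
for value in (1.0, 2.0, 3.0, 4.0, 5.0):
    update_personalised_set(personalised, generic_models, [("ah", value)])
```

In this sketch the model for "ah" stops changing after its third sample, because the assumed quality measure has reached the assumed threshold; the generic model is never modified.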

6. A combined speech recognition and voice authentication method, comprising:
- responsive to a user being successfully authenticated by a voice authentication function, accessing a personalised set of acoustic language models for use by a speech recognition function in recognising one or more utterances by the user, the acoustic model set containing acoustic language models which have been trained using voice data derived from utterances provided by the user either during enrolment with the authentication function or during one or more subsequent authentications.
- Dependent claims (7):
  - 7. A method in accordance with claim 6, wherein the personalised model set comprises acoustic models of speech units which have been trained utilising speech samples provided by one or more other users having a shared voice profile to the user.
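
Claims 6 and 7 can be read as two small decisions: which model set the recogniser uses once authentication succeeds, and whose samples feed the training. The sketch below is one possible reading, not the patented implementation. The function names, the dictionary layout, the use of a simple profile label to represent a "shared voice profile", and the fallback to a generic model set when authentication fails are all assumptions introduced here.

```python
def select_model_set(user_id, authenticated, personalised_sets, generic_set):
    """Return the model set the speech recognition function should use.

    Falling back to the generic set is an assumption; the claims only
    describe the successfully authenticated case.
    """
    if authenticated and user_id in personalised_sets:
        return personalised_sets[user_id]
    return generic_set


def pooled_training_samples(user_id, profiles, samples_by_user):
    """Collect samples from the user and from others sharing the same profile label.

    profiles maps user id to a hypothetical voice profile label.
    """
    target_profile = profiles[user_id]
    pooled = []
    for other_id, other_profile in profiles.items():
        if other_profile == target_profile:
            pooled.extend(samples_by_user.get(other_id, []))
    return pooled
```

A caller would run `select_model_set` immediately after the voice authentication function reports its result, and would use `pooled_training_samples` only when the user's own samples are too few to train a unit.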

9. A speech recognition system comprising:
- a processing module programmed to;
  
  obtain a speech sample utilised to authenticate a user as part of an authentication process;
  
  process the speech sample to train one or more generic acoustic models for speech units associated with the speech sample and to subsequently store the trained acoustic model(s) in a personalised acoustic model set;
  
  selectively re-train the acoustic model(s) based on additional speech samples provided by the user containing corresponding units of speech;
  
  responsive to determining that the user has accessed a speech recognition function, the processing module is further arranged to access the personalised acoustic model set for recognising subsequent user utterances; and
  
  the processing module being further programmed to determine a measure of quality for each of the stored acoustic models and continuing to regenerate the acoustic models until the quality measure reaches a predefined threshold.
- Dependent claims (10, 11, 12, 13):
  - 10. A system in accordance with claim 9, wherein the speech units comprise triphones, diphones, senones, phonemes, words or phrases.
  - 11. A system in accordance with claim 9, wherein the processing module is further operable to evaluate speech content data associated with the speech sample to determine an audible identifier for each of the speech units and classify the acoustic models based on the relevant identifier.
  - 12. A system in accordance with claim 9, wherein the additional speech samples are provided by the user either during enrolment with the authentication system or as part of a subsequent authentication session.
  - 13. A system in accordance with claim 9, wherein the personalised acoustic model set is automatically accessed for performing the speech recognition process responsive to the user being successfully authenticated by the authentication system.
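
Claims 10 and 11 refer to speech units such as triphones and to classifying models by an identifier derived from speech content data. As a hedged illustration, the sketch below assumes the speech content data is a known phoneme sequence for the spoken phrase, and that each phoneme already has an aligned audio segment. It labels each phoneme with its left and right context in the `left-centre+right` notation commonly used for triphones, then groups segments under those labels. The `sil` padding symbol, the function names, and the example phoneme sequence are assumptions, not details from the patent.

```python
from collections import defaultdict


def triphone_labels(phonemes):
    """Label each phoneme with its neighbours, padding the ends with silence."""
    padded = ["sil"] + list(phonemes) + ["sil"]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def classify_segments(phonemes, segments):
    """Group aligned audio segments by the triphone label of their phoneme."""
    if len(phonemes) != len(segments):
        raise ValueError("expected one segment per phoneme")
    grouped = defaultdict(list)
    for label, segment in zip(triphone_labels(phonemes), segments):
        grouped[label].append(segment)
    return dict(grouped)


# Hypothetical phoneme sequence for a spoken digit, with placeholder segments.
labels = triphone_labels(["w", "ah", "n"])
grouped = classify_segments(["w", "ah", "n"], ["seg0", "seg1", "seg2"])
```

Each key of `grouped` would then name the acoustic model that the corresponding segments train or re-train.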

15. A combined speech recognition and voice authentication method, comprising:
- responsive to a user being successfully authenticated by a voice authentication function, accessing a personalised set of acoustic grammar models for use by a speech recognition function in recognising one or more utterances by the user, the acoustic model set containing acoustic grammar models which have been trained using voice data derived from utterances provided by the user either during enrolment with the authentication function or during one or more subsequent authentications; and
  
  further comprising determining a measure of quality for each of the stored acoustic models and wherein the acoustic modules are re-trained based on additional speech samples until the corresponding quality measure meets a predefined threshold.
- Dependent claims (16):
  - 16. A method in accordance with claim 15, wherein the personalised model set comprises acoustic models of speech units which have been trained utilising speech samples provided by one or more other users having a shared voice profile to the user.

Current Assignee: Auraya Pty Limited
Original Assignee: Auraya Pty Limited
Inventors: Talhami, Habib Emile; Malegaonkar, Amit Sadanand; Malegaonkar, Renuka Amit; Summerfield, Clive David
Primary Examiner(s): Riley, Marcus T
Application Number: US14/374,225
Publication Number: US 20150019220A1
Time in Patent Office: 1,308 Days
Field of Search: None
US Class Current: 1/1

CPC Class Codes

G10L 15/063   Training
G10L 15/065   Adaptation
G10L 15/07   to the speaker
G10L 17/00   Speaker identification or v...
G10L 17/04   Training, enrolment or mode...
G10L 2015/0635   updating or merging of old ...
G10L 2015/0638   Interactive procedures
