Mobile phone having speaker dependent voice recognition method and apparatus

US 6,260,012 B1
Filed: 03/01/1999
Issued: 07/10/2001
Est. Priority Date: 02/27/1998
Status: Expired due to Term


Abstract

An apparatus and method for performing improved speech recognition in a communication terminal, e.g., a mobile phone with a hands-free voice dialing function. In a speech recognition mode, a user's input speech such as a desired called party name, number or a phone command, is converted to feature data and compared to individual pre-stored feature data sets corresponding to pre-recorded speech obtained during a registration process. Difference values representing the respective differences between the current user's input speech and the respective data sets are computed. A first closest (most similar) and second closest feature data set correspond to the first smallest and second smallest difference values so obtained. A closeness threshold is computed as the sum of a small, predetermined threshold and a differential value between the first and second difference values. If the first difference value is less than the computed closeness threshold, then the input speech is determined to match the first feature data set, whereby a positive speech recognition result is obtained. When a match occurs, an automatic dialing operation may be carried out in one application.
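The matching rule in the abstract can be illustrated with a minimal Python sketch. The function name `match_utterance`, its parameters, and the reading of the "differential value" as the second smallest difference minus the smallest are assumptions made for illustration; the abstract does not specify how difference values are computed, so they are taken here as given inputs.

```python
from typing import Optional, Sequence


def match_utterance(differences: Sequence[float],
                    base_threshold: float) -> Optional[int]:
    """Return the index of the matching stored entry, or None if no match.

    differences[i] is the difference value between the input speech and
    stored feature data set i (smaller means more similar).
    """
    if len(differences) < 2:
        raise ValueError("need at least two stored feature data sets")

    # Rank the stored entries from most similar to least similar.
    order = sorted(range(len(differences)), key=lambda i: differences[i])
    first, second = order[0], order[1]
    d1, d2 = differences[first], differences[second]

    # Assumption: the differential value is d2 - d1, so a clear gap between
    # the best and second-best candidates relaxes the threshold.
    closeness_threshold = base_threshold + (d2 - d1)

    return first if d1 < closeness_threshold else None
```

Under this reading, a best candidate that is well separated from the runner-up is accepted, while two nearly tied candidates leave the threshold close to the small predetermined value and the input is rejected.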

72 Citations


21 Claims

1. An apparatus for performing speech recognition in a communication terminal with a voice dialing function, comprising:
- a memory having a first region for registration of feature data with respect to an input voice, a second region for storing a number of trials upon every recognition with respect to the feature data, a third region for storing an accumulative mean value with respect to a series of threshold values obtained from a corresponding number of trials, stored in the second region to and through the preceding number of trials, and a fourth region for storing a specified threshold value;
  
  a vocoder for generating packet data according to an input voice;
  
  a voice recognition means for analyzing the packet data currently provided from the vocoder to thereby generate corresponding feature data, comparing the generated feature data with feature data of reference voices pre-registered in the memory to thereby search any similar data, and if it is searched the similar data, then outputting an index of the searched feature data and a difference value between the generated feature data and the registered feature data; and
  
  a controller for comparing the difference value outputted from the voice recognition means with a predetermined threshold value, so that if the difference value is less than the threshold value, then the feature data corresponding to the index are read out from the memory and delivered to the vocoder, calculating an accumulative mean value of threshold values for every trial of recognition with respect to the feature data to and through the present time, the accumulative mean value being stored in the third region of the memory, and by reflecting the accumulative mean value into the threshold value, updating the threshold value stored in the fourth region of the memory.
- View Dependent Claims (2)
- - 2. The apparatus as claimed in claim 1, wherein the updated threshold value is a value that a standard deviation multiplied by a given weighted value is added to the accumulative mean value.
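As a rough illustration of claims 1 and 2, the sketch below keeps the per-entry state described for the memory regions (trial count, accumulative mean, current threshold) and updates the threshold to the mean plus a weighted standard deviation. The names `ThresholdState` and `update_threshold`, the extra field `m2` used to track variance incrementally, and the use of the population standard deviation are assumptions; the claims do not say how the standard deviation is obtained.

```python
import math
from dataclasses import dataclass


@dataclass
class ThresholdState:
    trials: int = 0         # second region: number of recognition trials
    mean: float = 0.0       # third region: accumulative mean of thresholds
    threshold: float = 0.0  # fourth region: current threshold value
    m2: float = 0.0         # assumption: running sum of squared deviations


def update_threshold(state: ThresholdState,
                     trial_threshold: float,
                     weight: float) -> float:
    """Fold one trial's threshold value into the running statistics."""
    state.trials += 1
    delta = trial_threshold - state.mean
    state.mean += delta / state.trials
    # Welford-style update, so earlier values need not be stored.
    state.m2 += delta * (trial_threshold - state.mean)
    std = math.sqrt(state.m2 / state.trials)  # population standard deviation

    # Claim 2: weighted standard deviation added to the accumulative mean.
    state.threshold = state.mean + weight * std
    return state.threshold
```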

3. A method for performing speech recognition in a communication terminal equipment, comprising the steps of:
- (a) entering a speech recognition mode;
  
  (b) upon receipt of a voice input in the speech recognition mode, processing the voice input and transmitting the processed voice input to a speech recognition circuit;
  
  (c) receiving from the voice recognition means previously stored first data, being most similar to the processed voice input, and previously stored second data, being second most similar thereto, together with first and second difference values corresponding to the first and second data respectively, and then calculating a new threshold value based on a differential value between the first and second difference values; and
  
  (d) comparing the new threshold value with the first difference value, and if the first difference value is less than the new threshold value, generating an audible output of speech corresponding to said first data.
- View Dependent Claims (4, 5, 6)
- - 4. The method as claimed in claim 3, further comprising the step of providing an information message when the first difference value exceeds the new threshold value in step (d).
  - 5. The method as claimed in claim 3, wherein the new threshold value is substantially equal to a value that the differential value between the first and second difference values, multiplied by a given weighted value, is added to a preceding threshold value.
  - 6. The method as claimed in claim 3, wherein the first difference value is less than the second difference value.
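Steps (c) and (d) of claim 3, together with the weighted update of claim 5 and the information message of claim 4, might be sketched as follows. The names `RecognitionResult` and `recognize`, the message text, and the reading of the differential value as the second difference minus the first are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecognitionResult:
    accepted: bool
    new_threshold: float
    message: Optional[str] = None


def recognize(first_diff: float, second_diff: float,
              preceding_threshold: float, weight: float) -> RecognitionResult:
    """Decide whether the most similar stored entry is accepted."""
    # Claim 5: weighted differential value added to the preceding threshold.
    new_threshold = preceding_threshold + weight * (second_diff - first_diff)

    if first_diff < new_threshold:
        # Step (d): the caller would play back the speech for the first data.
        return RecognitionResult(True, new_threshold)

    # Claim 4: provide an information message when the input is rejected.
    return RecognitionResult(False, new_threshold, "voice input not recognized")
```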

7. A speech recognition method in a communication terminal, comprising the steps of:
- (a) entering a speech recognition mode;
  
  (b) upon receipt of a voice input in the speech recognition mode, processing the voice input and transmitting the processed voice input to a speech recognition circuit;
  
  (c) receiving from the speech recognition circuit previously stored first data that are most similar to the processed voice input, and previously stored second data that are second most similar thereto, together with first and second difference values corresponding to the first and second data respectively;
  
  (d) comparing a predetermined threshold value with the first difference value, and if the first difference value is less than the predetermined threshold value, then generating an audible output of speech corresponding to the first data; and
  
  (e) calculating an accumulative mean value for preceding threshold values obtained from every last recognition with respect to the voice data, in order to compensate for an error resulting from selection of feature data of the corresponding voice data subsequently to the above step (d), and reflecting the calculated accumulative mean value into the present threshold value to thereby set a new threshold value, thereafter returning to the above step (b) of transmitting.
- View Dependent Claims (8, 9, 10, 11)
- - 8. The method as claimed in claim 7, wherein the new threshold value is substantially equal to a value that a standard deviation value multiplied by a given weighted value and the present threshold value are added together.
  - 9. The method as claimed in claim 7, wherein the new threshold value is substantially equal to a value that a standard deviation value multiplied by a given weighted value and the accumulative mean value are added together.
  - 10. The method as claimed in claim 7, wherein the new threshold value is substantially equal to a value that the accumulative mean value multiplied by a given weighted value and the present threshold value are added together.
  - 11. The method as claimed in claim 7, further comprising the step of providing a message informing a user that the corresponding voice data have not been registered, in case where the first difference value is no less than the threshold value, in the above step (d).
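Dependent claims 8, 9 and 10 describe three alternative ways of forming the new threshold. The sketch below places them side by side for comparison. The function names are hypothetical, `history` stands for the preceding threshold values, and the population standard deviation (`statistics.pstdev`) is an assumption, since the claims do not say which form of standard deviation is meant.

```python
from statistics import fmean, pstdev
from typing import Sequence


def threshold_claim8(history: Sequence[float], present: float,
                     weight: float) -> float:
    """Claim 8: present threshold plus weighted standard deviation."""
    return present + weight * pstdev(history)


def threshold_claim9(history: Sequence[float], weight: float) -> float:
    """Claim 9: accumulative mean plus weighted standard deviation."""
    return fmean(history) + weight * pstdev(history)


def threshold_claim10(history: Sequence[float], present: float,
                      weight: float) -> float:
    """Claim 10: present threshold plus weighted accumulative mean."""
    return present + weight * fmean(history)
```

Claims 13, 14 and 15 recite the same three formulas as dependents of claim 12.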

12. A method for performing speech recognition in a communication terminal, comprising the steps of:
- (a) entering a speech recognition mode;
  
  (b) upon receipt of a voice input in the speech recognition mode, processing the voice input in the form of packet data and then transmitting the processed voice input to a speech recognition circuit;
  
  (c) receiving from the speech recognition circuit a first set of data being most similar to a pre-registered voice feature, and a second set of data being second most similar thereto, together with first and second difference values corresponding to the first and second set of data respectively;
  
  (d) comparing a predetermined threshold value with the first difference value, so that if the first difference value is less than the predetermined threshold value, then an audible tone responsive to a corresponding voice data is reproduced in a speaker; and
  
  (e) calculating an accumulative mean value for preceding threshold values obtained in every past recognition with respect to all the recorded voice data, in order to compensate for an error resulting from a diversity of users subsequently to reproduction in the above step (d), and adding the calculated accumulative mean value multiplied by a weighted value to the present threshold value to thereby set up a new threshold value, then returning to the above step (b).
- View Dependent Claims (13, 14, 15, 16)
- - 13. The method as claimed in claim 12, wherein the new threshold value is substantially equal to a value that a standard deviation value multiplied by a given weighted value and the accumulative mean value are added together.
  - 14. The method as claimed in claim 12, wherein the new threshold value is substantially equal to a value that a standard deviation value multiplied by a given weighted value and the present threshold value are added together.
  - 15. The method as claimed in claim 12, wherein the new threshold value is substantially equal to a value that the accumulative mean value multiplied by a given weighted value and the present threshold value are added together.
  - 16. The method as claimed in claim 12, further comprising the step of providing a message informing a user that the corresponding voice data have not been registered, in case where the first difference value is no less than the threshold value, in the above step (d).

17. A method for performing speech recognition in a communication terminal, comprising the steps of:
- (a) entering a speech recognition mode;
  
  (b) upon receipt of a voice input in the speech recognition mode, processing the voice input in the form of packet data and then transmitting the processed voice input to a speech recognition circuit;
  
  (c) receiving from the speech recognition circuit a first set of data being most similar to a pre-registered voice feature, and a second set of data being secondly similar thereto, together with first and second difference values corresponding to the first and second set of data respectively;
  
  (d) comparing a predetermined threshold value with the first difference value, so that if the first difference value is less than the predetermined threshold value, then an audible tone responsive to a corresponding voice data is reproduced in a speaker; and
  
  (e) determining whether a response from a user is detected, and calculating a rate of recognition error upon absence of said response, and if the calculated rate of recognition error is less than a predetermined reference value, then returning to the step (b) for re-recognition of the voice input.
- View Dependent Claims (18, 19, 20, 21)
- - 18. The method as claimed in claim 17, further comprising the step of determining that there existed an error upon an initial registration of the corresponding voice data, in case where the calculated rate of recognition error is no less than the predetermined reference value, and then proceeding to a registration processing routine for enabling a re-registration.
  - 19. The method as claimed in claim 17, further comprising the step of determining that upon detection of the response of the user, a normal recognition has been carried out, and increasing a number of trials by a unit value, thereafter returning to the preceding step (b).
  - 20. The method as claimed in claim 19, further comprising the sub-step of increasing a number of retrials by a unit value prior to calculation of the rate of recognition error upon absence of detection of the response from the user in the above step (e), and determining as the rate of recognition error a value that the number of retrials is divided by the number of trials.
  - 21. The method as claimed in claim 17, further comprising the sub-step of providing a message informing the user that the corresponding voice data have not been registered, in case where the first difference value is no less than the threshold value, in the above step (d).
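The bookkeeping in claims 17 to 20 can be sketched as a small state update: a detected user response counts as a normal recognition, while a missing response increments the retrial count and the error rate (retrials divided by trials) decides between re-recognition and re-registration. The names `EntryStats`, `NextStep` and `handle_response` are hypothetical, and guarding against a zero trial count with a denominator of 1 is an assumption the claims do not address.

```python
from dataclasses import dataclass
from enum import Enum


class NextStep(Enum):
    CONTINUE = "return to step (b)"            # claim 19
    RETRY = "re-recognize the voice input"     # claim 17, step (e)
    REREGISTER = "go to registration routine"  # claim 18


@dataclass
class EntryStats:
    trials: int = 0
    retrials: int = 0


def handle_response(stats: EntryStats, user_responded: bool,
                    reference_rate: float) -> NextStep:
    """Update counters after playback and choose the next step."""
    if user_responded:
        # Claim 19: normal recognition, increase the number of trials.
        stats.trials += 1
        return NextStep.CONTINUE

    # Claim 20: increase retrials first, then compute the error rate.
    stats.retrials += 1
    error_rate = stats.retrials / max(stats.trials, 1)  # assumption: avoid /0

    if error_rate < reference_rate:
        return NextStep.RETRY
    return NextStep.REREGISTER
```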

Current Assignee: Samsung Electronics Co. Ltd.
Original Assignee: Samsung Electronics Co. Ltd.
Inventors: Park, Joung-Kyou
Primary Examiner(s): Korzuch, William R.
Assistant Examiner(s): McFadden, Susan Iris
Application Number: US09/260,188
Time in Patent Office: 862 Days
Field of Search: 704/236, 704/246, 704/251, 704/253, 704/239, 704/243, 704/247, 379/120, 379/127, 379/207, 379/88
US Class Current: 704/236
CPC Class Codes:
G10L 15/08 Speech classification or se...
H04M 1/271 controlled by voice recogni...
