Apparatus, system and method for voice dialogue activation and/or conduct

US 9,020,823 B2
Filed: 10/29/2010
Issued: 04/28/2015
Est. Priority Date: 10/30/2009
Status: Active Grant

Abstract

An apparatus, a system and a method for voice dialogue activation and/or conduct. The apparatus for voice dialogue activation and/or conduct has a voice recognition unit, a speaker recognition unit and a decision-maker unit. The decision-maker unit is designed to activate a result action on the basis of results from the voice and speaker recognition units.
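
The decision step described in the abstract can be illustrated with a minimal sketch. It assumes that the voice recognition unit yields a command word (or nothing) and that the speaker recognition unit yields a matching profile (or nothing). The names `SpeakerProfile`, `OPEN_COMMANDS`, and `decide`, and the example command words, are hypothetical and do not come from the patent; the set of speaker-independent commands follows the idea of claim 14.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SpeakerProfile:
    """Hypothetical stored profile: a name plus the commands this speaker may trigger."""
    name: str
    allowed_commands: Set[str] = field(default_factory=set)


# Hypothetical command words that are performed regardless of who spoke.
OPEN_COMMANDS = {"volume up"}


def decide(command_word: Optional[str],
           speaker: Optional[SpeakerProfile]) -> Optional[str]:
    """Return the command to perform, or None if the result action is suppressed."""
    if command_word is None:
        return None  # nothing was recognized
    if command_word in OPEN_COMMANDS:
        return command_word  # speaker-independent command
    if speaker is not None and command_word in speaker.allowed_commands:
        return command_word  # recognized and authorized speaker
    return None  # unknown or unauthorized speaker


driver = SpeakerProfile("driver", {"navigate home"})
print(decide("navigate home", driver))  # performed
print(decide("navigate home", None))    # suppressed
```

In this sketch a command from an unrecognized speaker is silently dropped; a real system might instead prompt the user or log the attempt.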

35 Claims

1. An apparatus for at least one of voice dialogue activation and voice dialogue conduct, for use in a vehicle, comprising:
- at least one input for a voice signal;
  
  a voice recognition unit configured to establish one or more command words contained in the voice signal;
  
  a speaker recognition unit configured to determine a current speaker using the voice signal and at least one stored speaker profile;
  
  a decision-maker unit comprising:

  a voice recognition unit connection coupled to an output of the voice recognition unit configured to perform a result action based on the one or more command words, and

  a speaker recognition unit connection coupled to the speaker recognition unit,

  the decision-maker unit being configured such that the activation of the result action is dependent, at least in the case of at least one command word, on whether the at least one command word has been identified as coming from a speaker associated with a speaker profile; and

  an echo cancellation unit that receives a multichannel voice signal and, on the basis of transit time differences among components of the multichannel signal with respect to the at least one input, removes all components from non-authorized speakers,

  wherein:

  the speaker recognition unit is configured to identify the current speaker by extracting speaker features from the voice signal and comparing the speaker features with stored speaker-dependent features, and comprises a further unit configured for speaker adaptation to continually ascertain refined speaker-dependent features and store the refined speaker-dependent features in the stored speaker profiles, and

  the speaker recognition unit is configured to, in the case that a plurality of speakers are speaking simultaneously, attribute the voice signal to no speaker.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
- - 2. The apparatus as claimed in claim 1, wherein the decision-maker unit is configured to align and correlate results from the speaker recognition unit and from the voice recognition unit with speaker-specific information stored in a speaker profile, wherein performance of at least one command-word-dependent result action is suppressed if a current speaker is not authorized to perform the result actions.
  - 3. The apparatus as claimed in claim 1, wherein the apparatus is configured as a combined apparatus for voice dialogue conduct and activation.
  - 4. The apparatus as claimed in claim 1, wherein the voice evaluation unit comprises a word recognition unit configured to recognize words and also a downstream structure evaluation unit configured to recognize command-forming structures.
  - 5. The apparatus as claimed in claim 1, wherein the echo cancellation unit is connected directly or indirectly upstream of at least one of the speaker recognition unit and the voice recognition unit, wherein the echo cancellation unit has one or more inputs for loudspeaker signals that comprise at least one of mono, stereo, and multichannel loudspeaker signals, the echo cancellation unit configured to compensate for the influence of the loudspeaker signals on the voice signal.
  - 6. The apparatus as claimed in claim 5, wherein the echo cancellation unit comprises a subunit configured to compensate for voice components from other persons, said subunit connected to at least one input for the connection of additional microphones.
  - 7. The apparatus as claimed in claim 1, wherein at least one of the speaker recognition unit and the voice recognition unit has a noise rejection unit connected directly or indirectly upstream.
  - 8. The apparatus as claimed in claim 1, wherein at least one of the speaker recognition unit and the voice recognition unit is configured to synchronize an output from a speaker recognized by the speaker recognition unit to the decision-maker unit with an output of command words recognized by the voice recognition unit.
  - 9. The apparatus as claimed in claim 1, wherein a driver state sensing unit for sensing a state of the driver using the voice signal is arranged in parallel with the speaker recognition unit and the voice recognition unit.
  - 10. The apparatus as claimed in claim 1, wherein the voice recognition unit comprises an additional unit configured to capture time-related alterations in the speaker features of a speaker as an attribute and to store them in a stored speaker profile associated with the speaker.
  - 11. The apparatus as claimed in claim 1, further comprising at least one memory apparatus configured to store at least one of user profiles and speaker profiles.
  - 12. The apparatus as claimed in claim 11, wherein the at least one memory apparatus has at least one interface configured to input or output the stored at least one of the user profiles and speaker profiles such that the stored at least one of the user profiles and speaker profiles may be transferred to/from another vehicle.
  - 13. The apparatus as claimed in claim 1, wherein the apparatus is activated to evaluate the voice signals even during the performance of a result action, such that recognition of a command from an authorized speaker prompts at least partial interruption of the performance of a result action triggered by a prior command.
  - 14. The apparatus as claimed in claim 1, wherein the decision-maker unit is configured such that some command words are performed independently of the recognition of a speaker associated with the speaker profile.
  - 15. The apparatus as claimed in claim 1, further comprising at least one memory apparatus configured to store speaker profiles, wherein the at least one memory apparatus has at least one interface configured to input or output the stored speaker profiles such that the stored speaker profiles may be transferred to/from another vehicle.
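
The speaker recognition behaviour recited in claim 1 (feature comparison, continual adaptation, and attribution to no speaker during simultaneous speech) can be sketched as follows. This is only an illustration under stated assumptions: features are plain lists of floats, matching uses Euclidean distance against a threshold, adaptation is an exponential moving average, and overlapping speech is signalled by a flag from some upstream detector. None of these choices, nor the names `SpeakerRecognizer`, `threshold`, `rate`, and `overlapping_speech`, are specified by the patent.

```python
import math


class SpeakerRecognizer:
    """Hypothetical recognizer holding one stored feature vector per speaker profile."""

    def __init__(self, profiles, threshold=1.0, rate=0.1):
        self.profiles = {name: list(vec) for name, vec in profiles.items()}
        self.threshold = threshold  # assumed maximum distance for a match
        self.rate = rate            # assumed adaptation step size

    def identify(self, features, overlapping_speech=False):
        """Return the matching profile name, or None if no speaker is attributed."""
        if overlapping_speech:
            return None  # several speakers at once: attribute to no speaker
        best_name, best_dist = None, float("inf")
        for name, stored in self.profiles.items():
            dist = math.dist(features, stored)
            if dist < best_dist:
                best_name, best_dist = name, dist
        if best_name is None or best_dist > self.threshold:
            return None  # no stored profile is close enough
        # Speaker adaptation: move the stored features toward the new observation.
        stored = self.profiles[best_name]
        self.profiles[best_name] = [
            s + self.rate * (f - s) for s, f in zip(stored, features)
        ]
        return best_name


rec = SpeakerRecognizer({"driver": [0.0, 0.0], "passenger": [5.0, 5.0]})
print(rec.identify([0.2, 0.0]))                           # matches "driver" and adapts
print(rec.identify([0.2, 0.0], overlapping_speech=True))  # attributed to no speaker
```

Adapting only after a successful match keeps an impostor from dragging a profile toward their own voice, at the cost of slower adaptation when a speaker's voice changes abruptly.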

16. A system for voice dialogue activation and/or voice dialogue conduct comprising:
- at least one input for a voice signal;
  
  a voice recognition unit configured to establish one or more command words contained in the voice signal;
  
  a speaker recognition unit configured to determine a current speaker using the voice signal and at least one stored speaker profile;
  
  a decision-maker unit comprising:

  a voice recognition unit connection coupled to an output of the voice recognition unit configured to perform a result action based on the one or more command words, and

  a speaker recognition unit connection coupled to the speaker recognition unit,

  the decision-maker unit being configured such that the activation of the result action is dependent, at least in the case of at least one command word, on whether the at least one command word has been identified as coming from a speaker associated with a speaker profile;

  at least one microphone coupled to the voice recognition unit; and

  at least one loudspeaker coupled to the voice recognition unit; and

  an echo cancellation unit that receives a multichannel voice signal and, on the basis of transit time differences among components of the multichannel signal with respect to the at least one input, removes all components from non-authorized speakers,

  wherein:

  the speaker recognition unit is configured to identify the current speaker by extracting speaker features from the voice signal and comparing the speaker features with stored speaker-dependent features, and comprises a further unit configured for speaker adaptation to continually ascertain refined speaker-dependent features and store the refined speaker-dependent features in the stored speaker profiles, and

  the speaker recognition unit is configured to, in the case that a plurality of speakers are speaking simultaneously, attribute the voice signal to no speaker.
- Dependent Claims (17, 18)
- - 17. The system as claimed in claim 16, further comprising at least one of a plurality of microphones and at least one microphone array arranged such that areas of optimum reception provided by directional characteristics of the microphones, for at least some of the microphones overlap in the presumed area of authorized speakers.
  - 18. The system as claimed in claim 17, wherein the microphones are configured to automatically orient to a position of the speaker sensed by the microphones.
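
Claim 18 recites microphones that orient toward a sensed speaker position. One common way to derive a direction from two microphones is the far-field relation between the arrival-time difference and the angle of incidence, sketched below. The far-field model, the speed of sound of 343 m/s, and the name `steering_angle_deg` are assumptions for illustration; the patent does not state how the position is sensed.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees Celsius (assumed)


def steering_angle_deg(delay_s: float, mic_spacing_m: float) -> float:
    """Angle of arrival relative to broadside for a two-microphone pair.

    delay_s is the arrival-time difference between the two microphones and
    mic_spacing_m is their distance; a far-field (plane wave) source is assumed.
    """
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))


# A source straight ahead produces no delay between the microphones.
print(steering_angle_deg(0.0, 0.2))
```

The resulting angle could serve as the control signal for a steerable microphone or as the look direction of a beamformer.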

19. A method for voice dialogue activation and/or conduct comprising:
- picking up a voice signal;
  
  recognizing at least one of a command word and a command word structure from the voice signal;
  
  recognizing a speaker using the voice signal and at least one stored speaker profile;
  
  performing a result action based on a recognized command word and a recognized speaker, wherein the voice signal is a multichannel voice signal;
  
  removing, on the basis of transit time differences among components of the multichannel signal with respect to at least one microphone, all components from non-authorized speakers,

  wherein recognizing an authorized speaker involves speaker features being extracted from the voice signal and being aligned with individual speaker features stored in a speaker profile,

  wherein speaker adaptation is performed which continuously refines and complements the individual speaker features stored in the speaker profile, and

  wherein speaker recognition, in the case that a plurality of speakers are speaking simultaneously, attributes the voice signal to no speaker.
- Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35)
- - 20. The method as claimed in claim 19, wherein the recognizing the at least one of the command word or the command word structure contains further comprises:
    - recognizing words contained in the voice signal; and
      
      recognizing command structures formed by the words.
  - 21. The method as claimed in claim 19, wherein at least one of the recognition of the command word and the recognition of a speaker is preceded by performance of echo cancellation such that overlays from a loudspeaker signal produced by reflections in a passenger compartment are removed from the voice signal by calculating the overlays produced by the loudspeaker signal and subtracting them from the voice signal.
  - 22. The method as claimed in claim 21, wherein voice signal components of the voice signal by further persons are determined and at least partially removed from the voice signal.
  - 23. The method as claimed in claim 22, wherein the voice signal is a multichannel voice signal.
  - 24. The method as claimed in claim 23, further comprising chronological correlation of differently time-shifted signals from different channels of the multichannel voice signal to extract and separate those components of the voice signal which come from one of the locations of authorized speakers.
  - 25. The method as claimed in claim 19, wherein a dependency of performance of a result action on a recognized command word and a recognized speaker involves performance of a result action being suppressed if the associated speaker one of has not been recognized and is not authorized to instruct the result action.
  - 26. The method as claimed in claim 19, wherein the recognition of one of the command word or of the speaker is preceded by performance of noise rejection.
  - 27. The method as claimed in claim 19, wherein time-related alterations in the speaker features are captured as an attribute and stored in the speaker profile.
  - 28. The method as claimed in claim 27, wherein the recognizing command words comprises extracting voice features from the voice signal aligned with individual voice features stored in the speaker profile.
  - 29. The method as claimed in claim 28, wherein speaker adaptation is performed which continuously refines and complements the individual voice features stored in the speaker profile.
  - 30. The method as claimed in claim 29, wherein time-related alterations in the voice features are captured as an attribute and stored in the speaker profile.
  - 31. The method as claimed in claim 30, wherein the voice signal is used to sense a driver state.
  - 32. The method as claimed in claim 19, wherein subsequent performance of the result action is interrupted by input of a further voice signal, containing a further command word from an authorized speaker.
  - 33. The method as claimed in claim 19, wherein the voice signal is used to ascertain locations of authorized speakers comprising producing a control signal for orienting at least one microphone to locations of the authorized speakers independent of a command word contained in the voice signal.
  - 34. The method as claimed in claim 33, wherein the performance of the result action involves output of a voice dialogue signal.
  - 35. The method as claimed in claim 34, wherein the performance of the result action comprises signal output of a control signal to controls a function of an apparatus integrated in a vehicle.
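
The transit time differences in claim 19 and the chronological correlation of time-shifted channels in claim 24 can be illustrated with a small two-channel sketch. It estimates the lag between channels by brute-force cross-correlation and then aligns and averages them (delay-and-sum), which reinforces sound arriving with that lag. This is an assumed textbook technique, not the patent's stated implementation, and it only emphasizes the aligned source rather than removing all other components. The names `estimate_lag` and `delay_and_sum` are hypothetical.

```python
def estimate_lag(ref, other, max_lag):
    """Return the lag in samples by which `other` trails `ref` (cross-correlation peak)."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_and_sum(ref, other, lag):
    """Average `ref` with `other` shifted back by `lag` samples."""
    out = []
    for i, r in enumerate(ref):
        j = i + lag
        o = other[j] if 0 <= j < len(other) else 0.0  # zero-pad past the end
        out.append(0.5 * (r + o))
    return out


# Toy example: the second channel hears the same pulse two samples later.
ch1 = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
ch2 = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
lag = estimate_lag(ch1, ch2, max_lag=3)
print(lag, delay_and_sum(ch1, ch2, lag))
```

A source whose lag matches the expected seat position of an authorized speaker adds coherently after alignment, while sound from other positions is attenuated by the averaging.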

Current Assignee
Continental Automotive GmbH (Continental AG)
Original Assignee
Continental Automotive GmbH (Continental AG)
Inventors
Knobl, Karl-Heinz; Ruehl, Hans-Wilhelm; Hoepken, Harro; Kaempf, David
Primary Examiner(s)
JACKSON, JAKIEDA R

Application Number
US12/915,879
Publication Number
US 20110145000A1
Time in Patent Office
1,642 Days
Field of Search
704/275
US Class Current
704/275
CPC Class Codes
G10L 15/06   Creation of reference templ...
G10L 15/20   Speech recognition techniqu...
G10L 15/22   Procedures used during a sp...
G10L 17/00   Speaker identification or v...
G10L 17/26   Recognition of special voic...
G10L 2021/02166   Microphone arrays; Beamforming
G10L 21/0208   Noise filtering
