Process for automatic control of one or more devices by voice commands or by real-time voice dialog and apparatus for carrying out this process

  • US 6,839,670 B1
  • Filed: 09/09/1996
  • Issued: 01/04/2005
  • Est. Priority Date: 09/11/1995
  • Status: Expired due to Term
First Claim

1. A process for the automatic control of one or several devices by speech command or by speech dialog in real-time operation, wherein:

  • entered speech commands are recognized by a speaker-independent compound-word speech recognizer and a speaker-dependent speech recognizer and are classified according to their recognition probability;

    recognized, admissible speech commands are checked for their plausibility, and the admissible and plausible speech command with the highest recognition probability is identified as the entered speech command, and the functions assigned to this speech command of the device or devices or responses of the speech dialog system are initiated or generated;

    a process wherein;

    one of the speech commands and the speech dialogs is formed or controlled, respectively on the basis of at least one syntax structure, at least one base command vocabulary and, if necessary, at least one speaker-specific additional command vocabulary;

    the syntax structure and the base command vocabulary are provided in speaker-independent form and are fixed during real-time operation;

    the speaker-specific additional command vocabulary or vocabularies are entered or changed by the respective speaker in that during training phases during or outside of the real-time operation, the speech recognizer that operates on the basis of a speaker-dependent recognition method is trained by the respective speaker through single or multiple input of the additional commands for the speech-specific features of the respective speaker;

    in the real-time operation, the speech dialog or the control of the device or devices takes place as follows;

    speech commands spoken in by the respective speaker are transmitted to a speaker-independent compound-word recognizer operating on the basis of phonemes or whole-word models and to the speaker-dependent speech recognizer, where they are respectively subjected to a feature extraction and are examined and classified in the compound-word speech recognizer with the aid of the features extracted there to determine the existence of base commands from the respective base command vocabulary according to the respectively specified syntax structure, and are examined and classified in the speaker-dependent speech recognizer with the aid of the features extracted there to determine the existence of additional commands from the respective additional command vocabulary;

    the commands that have been classified as recognized with a certain probability and the syntax structures of the two speech recognizers are then joined to form hypothetical speech commands, and that these are examined and classified according to the specified syntax structure as to their reliability and recognition probability;

    the admissible hypothetical speech commands are subsequently examined as to their plausibility on the basis of predetermined criteria, and that among the hypothetical speech commands recognized as plausible, the one with the highest recognition probability is selected and is identified as the speech command entered by the respective speaker;

    that subsequently a function or functions assigned to the identified speech command of the respective device to be controlled are initiated or a response or responses are generated in accordance with a specified speech dialog structure for continuing the speech dialog.
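The selection logic recited in the claim (join the candidates of the two recognizers into hypothetical commands, keep those admissible under the syntax structure, check plausibility, pick the one with the highest recognition probability) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `Candidate` type, the toy syntax rule, the phone-book plausibility criterion, and in particular the scoring of a joined hypothesis as the product of the two recognizer probabilities, which assumes the two recognizers are independent.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    """A recognizer result: a word sequence with its recognition probability."""
    words: tuple[str, ...]
    probability: float


def join_hypotheses(base: list[Candidate], additional: list[Candidate]) -> list[Candidate]:
    """Form hypothetical commands from base and speaker-specific candidates.

    Assumption: a joined hypothesis is scored as the product of both
    probabilities; the claim does not prescribe a scoring rule.
    """
    joined = list(base)  # base commands may also stand alone
    for b, a in product(base, additional):
        joined.append(Candidate(b.words + a.words, b.probability * a.probability))
    return joined


def select_command(
    base: list[Candidate],
    additional: list[Candidate],
    is_admissible: Callable[[tuple[str, ...]], bool],
    is_plausible: Callable[[tuple[str, ...]], bool],
) -> Optional[Candidate]:
    """Return the admissible, plausible hypothesis with the highest probability."""
    hypotheses = join_hypotheses(base, additional)
    admissible = [h for h in hypotheses if is_admissible(h.words)]
    plausible = [h for h in admissible if is_plausible(h.words)]
    if not plausible:
        return None  # nothing to execute; a dialog system might ask again
    return max(plausible, key=lambda h: h.probability)


# Hypothetical toy data: a trained name vocabulary and a phone book.
TRAINED_NAMES = {"anna", "willi"}
PHONE_BOOK = {"anna"}


def toy_syntax(words: tuple[str, ...]) -> bool:
    """Admissible: 'radio on', or 'call' followed by one trained name."""
    if words == ("radio", "on"):
        return True
    return len(words) == 2 and words[0] == "call" and words[1] in TRAINED_NAMES


def toy_plausibility(words: tuple[str, ...]) -> bool:
    """Plausible: a called name must have a phone-book entry."""
    return words[0] != "call" or words[1] in PHONE_BOOK


base_candidates = [Candidate(("call",), 0.8), Candidate(("radio", "on"), 0.3)]
additional_candidates = [Candidate(("willi",), 0.7), Candidate(("anna",), 0.6)]

best = select_command(base_candidates, additional_candidates, toy_syntax, toy_plausibility)
```

In this toy run, "call willi" has the highest joined score but fails the plausibility check because no phone-book entry exists, so "call anna" is selected ahead of the lower-scoring "radio on". A real system would derive admissibility from the fixed syntax structure and would dispatch the selected command to the device or the dialog manager.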

  • 5 Assignments