AUTOMATIC SPEECH RECOGNITION WITH DETECTION OF AT LEAST ONE CONTEXTUAL ELEMENT, AND APPLICATION MANAGEMENT AND MAINTENANCE OF AIRCRAFT

US 20170076722A1
Filed: 09/14/2016
Published: 03/16/2017
Est. Priority Date: 09/15/2015
Status: Active Grant

Abstract

An automatic speech recognition device with detection of at least one contextual element, and its application to the piloting and maintenance of aircraft, are provided. The automatic speech recognition device comprises a unit for acquiring an audio signal, a device for detecting the state of at least one contextual element, and a language decoder for determining an oral instruction corresponding to the audio signal. The language decoder comprises at least one acoustic model defining an acoustic probability law and at least two syntax models each defining a syntax probability law. The language decoder also comprises an oral instruction construction algorithm implementing the acoustic model and a plurality of active syntax models taken from among the syntax models, a contextualization processor to select, based on the state of the or each contextual element detected by the detection device, at least one syntax model from among the plurality of active syntax models, and a processor for determining the oral instruction corresponding to the audio signal.
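
Read together with claim 16, the decoding rule summarized above can be written compactly. The notation is an assumption introduced here for illustration and does not appear in the patent: x_1, ..., x_T are the frames of the audio signal, p = (p_1, ..., p_T) is a phoneme sequence (one phoneme per frame, a simplification), P_a is the acoustic probability law, P_s^(m) is the syntax probability law of syntax model m, A is the set of active syntax models, and S is the subset of A selected by the contextualization processor.

```latex
\begin{align}
\hat{p}^{(m)} &= \arg\max_{p_1,\dots,p_T} \prod_{t=1}^{T}
    P_a(p_t, x_t)\; P_s^{(m)}\!\left(p_t \mid p_1,\dots,p_{t-1}\right),
    \qquad m \in A \\
\hat{m} &= \arg\max_{m \in S} \prod_{t=1}^{T}
    P_a\!\left(\hat{p}^{(m)}_t, x_t\right)\;
    P_s^{(m)}\!\left(\hat{p}^{(m)}_t \mid \hat{p}^{(m)}_1,\dots,\hat{p}^{(m)}_{t-1}\right)
\end{align}
```

Under this reading, the first line is the candidate sequence built for each active syntax model, and the oral instruction is the candidate of the selected model with the largest product. When S contains a single model, the second line is trivial.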

31 Claims

1-15. (canceled)

16. An automatic speech recognition device comprising:
- an acquisition unit for acquiring an audio signal; a forming member for forming the audio signal, to divide the audio signal into frames; a detection device to detect the state of at least one contextual element; and a language decoder for determining an oral instruction corresponding to the audio signal, the language decoder comprising:
  
  at least one acoustic model defining an acoustic probability law for calculating, for each phoneme of a sequence of phonemes, an acoustic probability of that phoneme and a corresponding frame of the audio signal matching;
  
  at least two syntax models defining a syntax probability law for calculating, for each phoneme of a sequence of phonemes analyzed using said acoustic model, a syntax probability of that phoneme following the phoneme or group of phonemes preceding said phoneme in the sequence of phonemes;
  
  an oral instruction construction algorithm implementing the acoustic model and a plurality of active syntax models from among the syntax models to build, for each active syntax model, a candidate sequence of phonemes associated with said active syntax model so that the product of the acoustic and syntax probabilities of the different phonemes making up said candidate sequence of phonemes is maximal;
  
  a contextualization processor to select, based on the state of the or each contextual element detected by the detection device, at least one syntax model selected from among the plurality of active syntax models; and
  
  a determination processor for determining the oral instruction corresponding to the audio signal, to define the candidate sequence of phonemes associated with the selected syntax model or, if several syntax models are selected, the sequence of phonemes, from among the candidate sequences of phonemes associated with the selected syntax models, for which the product of the acoustic and syntax probabilities of different phonemes making up said sequence of phonemes is maximal, as constituting the oral instruction corresponding to the audio signal.
  - 17. The automatic speech recognition device according to claim 16, wherein the contextualization processor is configured for:
    - assigning, based on the state of the detected contextual element, an order number to each active syntax model,
    - seeking, among the active syntax models, candidate syntax models with which candidate sequences of phonemes are associated for which the product of the acoustic and syntax probabilities of the different phonemes making up said candidate sequences of phonemes is above a predetermined threshold, and
    - selecting the candidate syntax model(s) having the highest order number.
  - 18. The automatic speech recognition device according to claim 16, wherein at least one contextual element is independent from the audio signal.
  - 19. The automatic speech recognition device according to claim 16, wherein the detection device comprises a gaze detector configured for detecting the direction of a user's gaze or a pointing detector configured for detecting the position of a pointing member.
  - 20. The automatic speech recognition device according to claim 19, wherein the pointing member is a cursor.
  - 21. The automatic speech recognition device as recited in claim 16 wherein the contextualization processor is configured for:
    - assigning, based on the state of the detected contextual element, an order number to each active syntax model,
    - seeking, among the active syntax models, candidate syntax models with which candidate sequences of phonemes are associated for which the product of the acoustic and syntax probabilities of the different phonemes making up said candidate sequences of phonemes is above a predetermined threshold, and
    - selecting the candidate syntax model(s) having the highest order number,
    - wherein at least one contextual element is independent from the audio signal,
    - wherein the detection device comprises a gaze detector configured for detecting the direction of a user's gaze or a pointing detector configured for detecting the position of a pointing member,
    - the automatic speech recognition device further comprising a display device displaying objects, each syntax model being associated with a respective object from among the displayed objects, the contextualization processor being configured for assigning an order number thereof to each syntax model based on the distance between the direction of the user's gaze or the position of the pointer and the displayed object with which said syntax model is associated.
  - 22. An assistance system to assist with the piloting or maintenance of an aircraft, comprising:
    - the automatic speech recognition device according to claim 16; and
    - a command execution unit configured to execute the oral instruction corresponding to the audio signal.
  - 23. The assistance system according to claim 22, wherein the detection device comprises a detector for detecting a flight phase of the aircraft or a system status of the aircraft.
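
The selection described in claims 17 and 21 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Candidate` container, the function names, the 2D screen coordinates, and the rule "nearer object gets a higher order number" are all hypothetical choices made here. Candidate scores (the product of acoustic and syntax probabilities) are assumed to have been computed already.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Best phoneme sequence found for one active syntax model (hypothetical container)."""
    model: str                      # name of the active syntax model
    phonemes: list[str]             # candidate sequence of phonemes
    score: float                    # product of acoustic and syntax probabilities
    object_xy: tuple[float, float]  # position of the displayed object tied to the model


def order_numbers(candidates, gaze_xy):
    """Assign an order number per model: the nearer the object to the gaze, the higher."""
    distance = {c.model: math.dist(c.object_xy, gaze_xy) for c in candidates}
    # Farthest distance gets rank 1; equal distances share the same order number.
    rank = {d: i for i, d in enumerate(sorted(set(distance.values()), reverse=True), start=1)}
    return {model: rank[d] for model, d in distance.items()}


def select_instruction(candidates, gaze_xy, threshold):
    """Return the winning candidate, or None if no candidate passes the threshold."""
    orders = order_numbers(candidates, gaze_xy)
    # Keep only models whose candidate sequence is plausible enough.
    plausible = [c for c in candidates if c.score > threshold]
    if not plausible:
        return None
    # Select the plausible model(s) with the highest order number.
    top = max(orders[c.model] for c in plausible)
    selected = [c for c in plausible if orders[c.model] == top]
    # If several models are selected, the largest product wins.
    return max(selected, key=lambda c: c.score)


# Illustrative data only: three displayed objects and a gaze point near "map".
demo = [
    Candidate("radio", ["r", "ey"], 0.30, (0.0, 0.0)),
    Candidate("map", ["m", "ae"], 0.002, (10.0, 0.0)),
    Candidate("autopilot", ["ao", "t"], 0.12, (5.0, 5.0)),
]
best = select_instruction(demo, gaze_xy=(9.0, 1.0), threshold=0.01)
print(best.model)
```

In this made-up example the object nearest the gaze ("map") is rejected by the threshold, so the next-nearest plausible model ("autopilot") is chosen even though "radio" has the larger score. That is the behavior the order-number scheme appears intended to produce: context decides first, and the probability product only breaks ties.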

24. An automatic speech recognition method comprising:
- determining an oral instruction corresponding to an audio signal, the determining of the oral instruction being implemented by an automatic speech recognition device comprising:
  
  at least one acoustic model defining an acoustic probability law for calculating, for each phoneme of a sequence of phonemes, an acoustic probability of that phoneme and a corresponding frame of the audio signal matching,
  
  at least two syntax models defining a syntax probability law for calculating, for each phoneme of a sequence of phonemes analyzed using said acoustic model, a syntax probability of that phoneme following the phoneme or group of phonemes preceding said phoneme in the sequence of phonemes,
  
  acquiring the audio signal,
  
  detecting the status of at least one contextual element,
  
  activating a plurality of syntax models forming active syntax models,
  
  forming the audio signal, said forming comprising dividing the audio signal into frames,
  
  building, for each active syntax model, using the acoustic model and said active syntax model, a candidate sequence of phonemes associated with said active syntax model so that the product of the acoustic and syntax probabilities of the different phonemes making up said candidate sequence of phonemes is maximal,
  
  selecting, based on the state of the detected contextual element, at least one syntax model from among the active syntax models, and
  
  defining the candidate sequence of phonemes associated with the selected syntax model or, if several syntax models are selected, the sequence of phonemes, from among the candidate sequences of phonemes associated with the selected syntax models, for which the product of the acoustic and syntax probabilities of different phonemes making up said sequence of phonemes is maximal, as constituting the oral instruction corresponding to the audio signal.
  - 25. The automatic speech recognition method according to claim 24, wherein the selection step comprises the following sub-steps:
    - assigning, based on the state of the detected contextual element, an order number to each active syntax model,
    - seeking, among the active syntax models, candidate syntax models with which candidate sequences of phonemes are associated for which the product of the acoustic and syntax probabilities of the different phonemes making up said candidate sequences of phonemes is above a predetermined threshold, and
    - selecting candidate syntax model(s) having the highest order number.
  - 26. The automatic speech recognition method according to claim 24, wherein at least one contextual element is independent from the audio signal.
  - 27. The automatic speech recognition method according to claim 24, wherein the contextual element comprises a direction of a user's gaze or a position of a pointing member such as a cursor.
  - 28. The automatic speech recognition method according to claim 24, wherein the selection step comprises the following sub-steps:
    - assigning, based on the state of the detected contextual element, an order number to each active syntax model,
    - seeking, among the active syntax models, candidate syntax models with which candidate sequences of phonemes are associated for which the product of the acoustic and syntax probabilities of the different phonemes making up said candidate sequences of phonemes is above a predetermined threshold, and
    - selecting candidate syntax model(s) having the highest order number,
    - wherein at least one contextual element is independent from the audio signal,
    - wherein the contextual element comprises a direction of a user's gaze or a position of a pointing member such as a cursor,
    - wherein objects are displayed on a display device, each syntax model being associated with a respective object from among the displayed objects, and the order number is assigned to each syntax model based on the distance between the direction of the user's gaze or the position of the pointing member and the displayed object with which said syntax model is associated.
  - 29. The automatic speech recognition method according to claim 28, wherein the direction of the user's gaze consists in a direction of the user's gaze at the end of the acquisition of the audio signal.
  - 30. An assistance method for assisting with the piloting or maintenance of an aircraft, implemented by a piloting aid system or a maintenance aid system of said aircraft, the assistance method comprising:
    - determining, using the automatic speech recognition method according to claim 24, an oral instruction corresponding to a recorded audio signal; and
    - executing the oral instruction via the assistance system.
  - 31. The assistance method according to claim 30, wherein the contextual element comprises a flight phase of the aircraft or a system status of the aircraft.
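
The "building" step of claim 24 asks for the phoneme sequence that maximizes the product of acoustic and syntax probabilities under each active syntax model. The claims do not say how that maximum is found; the sketch below uses Viterbi-style dynamic programming, which is a technique chosen here for illustration. It also assumes one phoneme per frame and a syntax model that conditions only on the single preceding phoneme (a bigram), the simplest case of "the phoneme or group of phonemes preceding". The function name, the dictionary layouts, and all numbers are hypothetical.

```python
import math


def best_sequence(acoustic, start, bigram):
    """Return (phonemes, probability) maximizing the acoustic-times-syntax product.

    acoustic: list over frames of {phoneme: probability that the frame matches it}
    start:    {phoneme: syntax probability that the phoneme begins the sequence}
    bigram:   {(previous, current): syntax probability that current follows previous}
    """
    def log(p):
        # Work in log space so long products do not underflow.
        return math.log(p) if p > 0 else -math.inf

    phonemes = list(start)
    # best[ph] = (log-probability, path) of the best sequence ending in ph.
    best = {
        ph: (log(start[ph]) + log(acoustic[0].get(ph, 0.0)), [ph])
        for ph in phonemes
    }
    for frame in acoustic[1:]:
        step = {}
        for cur in phonemes:
            emit = log(frame.get(cur, 0.0))
            # Extend every previous best path by `cur` and keep the strongest.
            step[cur] = max(
                (
                    (lp + log(bigram.get((prev, cur), 0.0)) + emit, path + [cur])
                    for prev, (lp, path) in best.items()
                ),
                key=lambda item: item[0],
            )
        best = step
    lp, path = max(best.values(), key=lambda item: item[0])
    return path, math.exp(lp)


# Two frames, two phonemes, and two hypothetical syntax models sharing one acoustic model.
frames = [{"a": 0.9, "b": 0.1}, {"a": 0.4, "b": 0.6}]
start = {"a": 0.5, "b": 0.5}
syntax_x = {("a", "a"): 0.9, ("a", "b"): 0.1, ("b", "a"): 0.5, ("b", "b"): 0.5}
syntax_y = {("a", "a"): 0.1, ("a", "b"): 0.9, ("b", "a"): 0.5, ("b", "b"): 0.5}

candidates = {
    "x": best_sequence(frames, start, syntax_x),
    "y": best_sequence(frames, start, syntax_y),
}
```

With these made-up numbers the same audio yields a different candidate sequence under each syntax model, which is why the method keeps one candidate per active model and lets the detected context choose among them.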

Current Assignee
Dassault Aviation (Groupe Industriel Marcel Dassault)
Original Assignee
Dassault Aviation (Groupe Industriel Marcel Dassault)
Inventors
GIROD, Hervé; SAEZ, Jean-François; KOU, Paul

Granted Patent

US 10,403,274 B2
US Class Current

1/1
CPC Class Codes

B64D 43/00   Arrangements or adaptations...

B64U 2201/20   Remote controls

G06F 3/012   Head tracking input arrange...

G06F 3/013   Eye tracking input arrangem...

G10L 15/02   Feature extraction for spee...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/187   Phonemic context, e.g. pron...

G10L 15/22   Procedures used during a sp...

G10L 2015/025   Phonemes, fenemes or fenone...

G10L 2015/223   Execution procedure of a sp...

G10L 2015/227   of the speaker; Human-fact...

G10L 2015/228   of application context
