Multiple-user voice-based control of devices in an endoscopic imaging system

US 7,752,050 B1
Filed: 09/03/2004
Issued: 07/06/2010
Est. Priority Date: 09/03/2004
Status: Active Grant

Abstract

A multi-user voice control system for use in an endoscopic imaging system includes a first input channel, a second input channel, an automatic speech recognizer (ASR), a control unit, and a selector. The first input channel receives speech of a first user, and the second input channel receives speech of a second user. The ASR recognizes speech received on the first channel and recognizes speech received on the second channel. The control unit enables the voice control system to control a device in the endoscopic imaging system in response to recognized speech. The selector selectively determines whether recognized speech associated with the first channel or recognized speech associated with the second channel is used to control the device, by applying a selection priority to the first and second channels.

36 Citations

34 Claims

1. A multi-user voice control system for use in an endoscopic imaging system, the multi-user voice control system comprising:
- a first input channel to receive speech of a first user of the endoscopic imaging system;
  
  a second input channel to receive speech of a second user of the endoscopic imaging system, wherein the first and second input channels are formed in a single device;
  
  a selection unit to select the first input channel or the second input channel by applying a selection priority to the first and second input channels, wherein the selection unit comprises a voice activity detector (VAD) module to determine a first signal received on the first input channel when the first user starts speaking exceeds a first threshold and to determine whether a second signal received on the second input channel exceeds a second threshold, wherein the first threshold is less than the second threshold;
  
  an automatic speech recognizer (ASR) to recognize speech received on a channel selected by the channel selector; and
  
  a control unit to enable the multi-user voice control system to control a device in the endoscopic imaging system in response to speech recognized by the ASR.
- - 2. A multi-user voice control system as recited in claim 1, wherein the selection unit is to selectively determine whether recognized speech associated with the first input channel or recognized speech associated with the second input channel is used to control the device, by applying the selection priority to the first and second input channels.
  - 3. A multi-user voice control system as recited in claim 1, wherein the selection unit is to pass the first signal for recognition by the ASR and to not pass the second signal for recognition by the ASR, when the first signal exceeds the first threshold.
  - 4. A multi-user voice control system as recited in claim 3, wherein the selection unit further is to pass the second signal for recognition by the ASR only when the second signal exceeds the second threshold while the first signal is below the first threshold.
  - 5. A multi-user voice control system as recited in claim 1, further comprising a device control interface through which to communicate with the device.
  - 6. A multi-user voice control system as recited in claim 1, the endoscopic imaging system including a plurality of voice-controllable devices, the multi-user voice control system further comprising:
    - means for allowing speech received on the first input channel to control any of the plurality of voice-controllable devices; and
      
      means for allowing speech received on the second input channel to control only a subset of the plurality of voice-controllable devices.
  - 7. A multi-user voice control system as recited in claim 1, wherein the selection unit overrides the selection priority in response to a predetermined utterance on the second input channel and selects the second input channel.
  - 8. A multi-user voice control system as recited in claim 1, further comprising:
    - means for buffering a segment of a first signal received on the first input channel, which is below a first threshold; and
      
      means for providing the buffered segment of the first signal to the ASR for recognition.
  - 9. A multi-user voice control system as recited in claim 8, wherein said buffering comprises buffering a sliding segment of the first signal.
  - 10. A multi-user voice control system as recited in claim 8, further comprising:
    - means for buffering a segment of a second signal received on the second input channel, which is below a second threshold; and
      
      means for providing the buffered segment of the second signal to the ASR for recognition.
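
Read together, claims 1, 3 and 4 describe a priority gate: the first channel wins whenever its signal exceeds the lower first threshold, and the second channel is passed to the ASR only when it exceeds the higher second threshold while the first is below its threshold. A minimal Python sketch of that reading follows. The names `select_channel`, `FIRST_THRESHOLD` and `SECOND_THRESHOLD`, the numeric values, and the use of a single signal level per channel are assumptions for illustration; the claims only require that the first threshold be less than the second.

```python
from enum import Enum
from typing import Optional


class Channel(Enum):
    FIRST = 1
    SECOND = 2


# Hypothetical signal-level thresholds (not from the patent).
# The only constraint taken from the claims is FIRST_THRESHOLD < SECOND_THRESHOLD.
FIRST_THRESHOLD = 0.2
SECOND_THRESHOLD = 0.5


def select_channel(first_level: float, second_level: float) -> Optional[Channel]:
    """Return the channel whose signal is passed to the ASR, or None."""
    # Claim 3: the first signal is passed, and the second is not,
    # whenever the first signal exceeds the first threshold.
    if first_level > FIRST_THRESHOLD:
        return Channel.FIRST
    # Claim 4: the second signal is passed only when it exceeds the
    # second threshold while the first signal is below the first threshold.
    if second_level > SECOND_THRESHOLD:
        return Channel.SECOND
    return None
```

Under these assumed values, a first-channel level of 0.3 selects the first channel even if the second channel is louder, which is the effect of the selection priority.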

11. A multi-user apparatus for use in an endoscopic imaging system, the apparatus comprising:
- a first input channel to receive a first signal representing speech of a first user of the endoscopic imaging system;
  
  a second input channel to receive a second signal representing speech of a second user of the endoscopic imaging system, wherein the first and second input channels are formed in a single device;
  
  means for selecting the first signal and ignoring the second signal when the first signal exceeds a first threshold, and for selecting the second signal when the second signal exceeds a second threshold and the first signal is below the first threshold, wherein the means for selecting comprises a voice activity detection means to determine whether a first signal received on the first input channel when the first user starts speaking exceeds a first threshold and to determine whether a second signal received on the second input channel exceeds a second threshold, wherein the first threshold is less than the second threshold;
  
  an automatic speech recognizer (ASR) to recognize speech of the first or second user from a signal selected by the selecting means; and
  
  means for controlling a device in the endoscopic imaging system external to the apparatus in response to speech of the first user or the second user recognized by the ASR.
- - 12. A multi-user apparatus as recited in claim 11, wherein the endoscopic imaging system includes a plurality of voice-controllable devices;
    - the voice control system further comprising;
      
      means for allowing speech received on the first input channel to control any of the plurality of voice-controllable devices; and
      
      means for allowing speech received on the second input channel to control only a subset of the plurality of voice-controllable devices.
  - 13. A multi-user apparatus as recited in claim 11, wherein the selecting means selects the second signal in response to a predetermined utterance on the second input channel regardless of whether the first signal exceeds the first threshold.
  - 14. A multi-user apparatus as recited in claim 11, further comprising:
    - means for buffering a segment of the first signal which is below the first threshold; and
      
      means for providing the buffered segment of the first signal to the ASR for recognition.
  - 15. A multi-user apparatus as recited in claim 14, wherein said buffering comprises buffering a sliding segment of the first signal.
  - 16. A multi-user apparatus as recited in claim 15, further comprising:
    - means for buffering a segment of the second signal which is below the second threshold; and
      
      means for providing the buffered segment of the second signal to the ASR for recognition.
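
Claims 14 to 16 (and the parallel claims 8 to 10) recite buffering a sliding segment of a signal that is still below its threshold, then providing that segment to the ASR. One plausible way to picture this is a fixed-length buffer that always holds the most recent frames, so that the quiet onset of an utterance is not lost when the threshold is finally crossed. The sketch below is an assumption-laden illustration: `SlidingBuffer`, `frames_for_asr`, the frame representation (one level per frame) and the buffer length are all hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, List


class SlidingBuffer:
    """Holds only the most recent `max_frames` frames (hypothetical helper)."""

    def __init__(self, max_frames: int) -> None:
        # A deque with maxlen silently discards the oldest frame when full.
        self._frames: Deque[float] = deque(maxlen=max_frames)

    def push(self, frame_level: float) -> None:
        self._frames.append(frame_level)

    def drain(self) -> List[float]:
        frames = list(self._frames)
        self._frames.clear()
        return frames


def frames_for_asr(levels: Iterable[float], threshold: float,
                   max_frames: int = 3) -> List[float]:
    """Return the frames handed to the ASR at the first threshold crossing:
    the buffered sub-threshold frames followed by the triggering frame."""
    buffer = SlidingBuffer(max_frames)
    for level in levels:
        if level > threshold:
            return buffer.drain() + [level]
        buffer.push(level)
    return []  # threshold never exceeded, nothing is sent
```

With a three-frame buffer and a threshold of 0.5, the input `[0.01, 0.02, 0.05, 0.1, 0.6]` yields `[0.02, 0.05, 0.1, 0.6]`: the oldest frame has slid out of the buffer, and the three most recent sub-threshold frames precede the frame that crossed the threshold.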

17. A method of controlling a device in an endoscopic imaging system based on speech, the method comprising:
- receiving speech of a first user on a first channel and speech of a second user on a second channel, wherein the first and second users are users of the endoscopic imaging system, and the first and second channels are formed in a single device;
  
  determining whether speech associated with the first channel or speech associated with the second channel will be used to control the device in the endoscopic imaging system, by applying a prioritization to the first and second channels wherein said determining comprises determining whether a first signal received on the first channel when the first user starts speaking exceeds a first threshold, and determining whether a second signal received on the second channel exceeds a second threshold, wherein the first threshold is less than the second threshold;
  
  automatically recognizing speech of the first or second user according to a result of said determining; and
  
  using the automatically recognized speech to control the device.
- - 18. A method as recited in claim 17, wherein said determining comprises passing the first signal for automatic speech recognition and not passing the second signal for automatic speech recognition, if the first signal exceeds the first threshold.
  - 19. A method as recited in claim 18, wherein said determining comprises passing the second signal for automatic speech recognition only if the second signal exceeds the second threshold and the first signal is below the first threshold.
  - 20. A method as recited in claim 18, further comprising:
    - buffering a segment of the first signal which is below the first threshold; and
      
      providing the buffered segment of the first signal to an automatic speech recognizer for recognition.
  - 21. A method as recited in claim 20, wherein said buffering comprises buffering a sliding segment of the first signal.
  - 22. A method as recited in claim 20, further comprising:
    - buffering a segment of the second signal which is below the second threshold; and
      
      providing the buffered segment of the second signal to the automatic speech recognizer for recognition.
  - 23. A method as recited in claim 18, further comprising:
    - buffering a sliding segment of the first signal, including a segment of the first signal which is below the first threshold;
      
      providing the buffered segment of the first signal, including the segment of the first signal which is below the first threshold, to an automatic speech recognizer for recognition;
      
      buffering a sliding segment of the second signal, including a segment of the second signal which is below the second threshold; and
      
      providing the buffered segment of the second signal, including the segment of the second signal which is below the second threshold, to the automatic speech recognizer for recognition.
  - 24. A method as recited in claim 17, wherein the endoscopic imaging system includes a plurality of voice-controllable devices;
    - the method further comprising;
      
      allowing speech received on the first channel to control each of the plurality of voice-controllable devices; and
      
      allowing speech received on the second channel to control only a subset of the plurality of voice-controllable devices.
  - 25. A method as recited in claim 17, wherein the prioritization is such that the first channel has a higher priority than the second channel;
    - and the determining further comprises overriding the prioritization in response to a predetermined utterance on the second channel, and determining that speech associated with the second channel will be used to control the device in the endoscopic imaging system.
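
Claims 24 and 25 add two refinements to the method of claim 17: the second channel may control only a subset of the devices, and a predetermined utterance on the second channel overrides the normal prioritization. The Python sketch below illustrates one way these two rules could combine. Everything named here is hypothetical: the device names, the override phrase, and the assumption that some upstream component reports the second channel's utterance as text are not taken from the patent.

```python
from typing import Optional

# Hypothetical device names and override phrase, for illustration only.
ALL_DEVICES = {"camera", "light_source", "insufflator"}
SECOND_CHANNEL_DEVICES = {"light_source"}  # the "subset" of claim 24
OVERRIDE_UTTERANCE = "assistant override"


def choose_channel(first_active: bool, second_active: bool,
                   second_utterance: str = "") -> Optional[str]:
    """Pick the channel whose speech will control the device."""
    # Claim 25: a predetermined utterance on the second channel
    # overrides the prioritization.
    if second_utterance == OVERRIDE_UTTERANCE:
        return "second"
    # Otherwise the first channel has the higher priority.
    if first_active:
        return "first"
    if second_active:
        return "second"
    return None


def may_control(channel: str, device: str) -> bool:
    """Claim 24: the first channel may control each device,
    the second channel only a subset."""
    allowed = ALL_DEVICES if channel == "first" else SECOND_CHANNEL_DEVICES
    return device in allowed
```

In this sketch the override changes only which channel is selected; the device restriction of claim 24 is checked separately, so an overriding second user is still limited to the subset.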

26. A method comprising:
- receiving a first signal representing speech of a first user on a first channel and a second signal representing speech of a second user on a second channel, wherein the first and second users are users of an endoscopic imaging system, and the first and second channels are formed in a single device;
  
  if the first signal exceeds a first threshold when the first user starts speaking, then enabling automatic speech recognition with respect to the first signal while preventing automatic speech recognition with respect to the second signal;
  
  if the second signal exceeds a second threshold while the first signal is below the first threshold, then enabling automatic speech recognition with respect to the second signal, wherein the first threshold is less than the second threshold; and
  
  controlling a device in the endoscopic imaging system in response to the recognized speech.
- - 27. A method as recited in claim 26, further comprising:
    - buffering a segment of the first signal which is below the first threshold; and
      
      providing the buffered segment of the first signal to an automatic speech recognizer for recognition.
  - 28. A method as recited in claim 27, wherein said buffering comprises buffering a sliding segment of the first signal.
  - 29. A method as recited in claim 27, further comprising:
    - buffering a segment of the second signal which is below the second threshold; and
      
      providing the buffered segment of the second signal to the automatic speech recognizer for recognition.
  - 30. A method as recited in claim 26, further comprising:
    - buffering a sliding segment of the first signal, including a segment of the first signal which is below the first threshold;
      
      providing the buffered segment of the first signal, including the segment of the first signal which is below the first threshold, to an automatic speech recognizer for recognition;
      
      buffering a sliding segment of the second signal, including a segment of the second signal which is below the second threshold; and
      
      providing the buffered segment of the second signal, including the segment of the second signal which is below the second threshold, to the automatic speech recognizer for recognition.
  - 31. A method as recited in claim 26, wherein the method is performed in a voice control system within the endoscopic imaging system that includes a plurality of voice-controllable devices, the method further comprising:
    - allowing speech received on the first channel to control any of the plurality of voice-controllable devices; and
      
      allowing speech received on the second channel to control only a subset of the plurality of voice-controllable devices.

32. A method of operating a voice control system (VCS) for controlling a voice-controllable device in an endoscopic imaging system, the method comprising:
- receiving at the VCS a first signal for conveying speech of a first user on a first channel, wherein the first user is a user of the endoscopic imaging system;
  
  receiving at the VCS a second signal for conveying speech of a second user on a second channel, wherein the second user is a user of the endoscopic imaging system, and the first and second channels are formed in a single device;
  
  buffering a sliding segment of the first signal and a sliding segment of the second signal;
  
  detecting when the first signal exceeds a first threshold and detecting when the second signal exceeds a second threshold;
  
  in response to the first signal exceeding the first threshold when the first user starts speaking, enabling automatic speech recognition to be performed with respect to the first signal, including a leading segment and a trailing segment of the first signal which are below the first threshold, in the VCS, while preventing automatic recognition from being performed with respect to the second signal;
  
  in response to the second signal exceeding the second threshold while the first signal is below the first threshold, enabling automatic speech recognition to be performed with respect to the second signal, including a leading segment and a trailing segment of the second signal which are below the second threshold, in the VCS, wherein the first threshold is less than the second threshold; and
  
  using recognized speech associated with the first or second signal to control the voice-controllable device.
- - 33. A method as recited in claim 32, wherein the endoscopic imaging system includes a plurality of voice-controllable devices;
    - the method further comprising;
      
      allowing speech received on the first channel to control any of the plurality of voice-controllable devices; and
      
      allowing speech received on the second channel to control only a subset of the plurality of voice-controllable devices.
  - 34. A method as recited in claim 32, further comprising in response to a predetermined utterance on the second channel, enabling automatic speech recognition to be performed with respect to the second signal, including a leading segment and a trailing segment of the second signal which are below the second threshold, in the VCS, regardless of whether the first signal exceeds the first threshold.
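
Claim 32 requires that recognition cover not only the part of a signal above its threshold but also a leading and a trailing segment that fall below it. As a rough illustration, the offline Python sketch below finds the above-threshold span in a recorded sequence of frame levels and widens it by a fixed margin on each side. The function name `utterance_with_margins`, the `margin` parameter and the per-frame level representation are assumptions; a real-time system would need a streaming variant built on the sliding buffers the claim recites.

```python
from typing import List, Sequence


def utterance_with_margins(levels: Sequence[float], threshold: float,
                           margin: int) -> List[float]:
    """Return the above-threshold span of `levels`, extended by up to
    `margin` sub-threshold frames before (leading) and after (trailing)."""
    above = [i for i, level in enumerate(levels) if level > threshold]
    if not above:
        return []  # the threshold was never exceeded
    start = max(above[0] - margin, 0)               # leading segment
    end = min(above[-1] + margin + 1, len(levels))  # trailing segment
    return list(levels[start:end])
```

For example, with a threshold of 0.5 and a margin of one frame, `[0.0, 0.1, 0.6, 0.7, 0.1, 0.0, 0.0]` yields `[0.1, 0.6, 0.7, 0.1]`: one quiet frame on each side of the two loud frames.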

Current Assignee: Stryker Corporation
Original Assignee: Stryker Corporation
Inventors: Javadekar, Kiran A.; Mahadik, Amit A.; Hameed, Salmaan
Primary Examiner(s): Wozniak, James S
Application Number: US10/934,019
Time in Patent Office: 2,132 Days
Field of Search: 704/233, 704/270, 704/275, 600/101, 600/118
US Class Current: 704/275
CPC Class Codes:
A61B 1/00042   for mechanical operation
G10L 15/26   Speech to text systems G10L...
G10L 25/78   Detection of presence or ab...