Speechreading using facial feature parameters from a non-direct frontal view of the speaker

US 5,806,036 A
Filed: 08/17/1995
Issued: 09/08/1998
Est. Priority Date: 08/17/1995
Status: Expired due to Term

Abstract

A system for performing recognition having a telephone transmitter, a camera, a data channel and recognition processing logic, in which the camera is directly mounted to and positioned with respect to the telephone housing to obtain video information from a non-direct frontal view of the speaker corresponding to at least one facial feature for use in speechreading. The facial features that may be obtained include the position of the tongue, separation of the teeth and the rounding protrusion of the lips. Using this data, recognition processing logic performs speechreading recognition of the video information.
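
As a rough illustration of the three facial features named in the abstract, the sketch below packs them into one per-frame record and classifies a frame by nearest-template matching. The field names, the 0-to-1 normalization, the template labels, and the matching rule are all assumptions made for this example. The patent text shown here does not say how the recognition processing logic works, so nearest-template matching is only a stand-in.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class FacialFeatureFrame:
    """One video frame reduced to hypothetical speechreading parameters (0..1)."""
    tongue_position: float          # position of the tongue
    teeth_separation: float         # separation of the teeth
    lip_rounding_protrusion: float  # rounding protrusion of the lips

    def as_vector(self) -> tuple:
        return astuple(self)


def nearest_template(frame, templates):
    """Return the label whose template is closest to `frame`.

    Uses squared Euclidean distance; this stands in for the unspecified
    recognition processing logic and is not taken from the patent.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a.as_vector(), b.as_vector()))

    return min(templates, key=lambda label: dist(frame, templates[label]))


# Hypothetical templates for two mouth shapes.
TEMPLATES = {
    "oo": FacialFeatureFrame(0.2, 0.1, 0.9),
    "ah": FacialFeatureFrame(0.5, 0.8, 0.1),
}
```

For example, `nearest_template(FacialFeatureFrame(0.3, 0.2, 0.8), TEMPLATES)` picks the rounded-lip template `"oo"`.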

42 Citations

22 Claims

1. A system for performing recognition comprising:
- a telephone transmitter contained in a movable telephone transmitter housing;
  
  a camera directly mounted to and positioned with respect to the telephone transmitter to obtain video information, corresponding to at least one facial feature for speechreading, from a non-direct frontal view of a speaker;
  
  a data channel coupled to the camera to transfer the video information from the camera; and
  
  a recognition processing logic coupled to the data channel to perform speechreading recognition of the video information.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 21):
  - 2. The system defined in claim 1 wherein the telephone transmitter housing comprises a telephone handset.
  - 3. The system defined in claim 1 wherein the telephone transmitter housing comprises a telephone headset.
  - 4. The system defined in claim 1 wherein the camera comprises a digital camera.
  - 5. The system defined in claim 1 wherein the video information comprises position of a tongue of a user and the rounding protrusion of the lips.
  - 6. The system defined in claim 5 wherein the video information further comprises the position of the jaw.
  - 7. The system defined in claim 6 wherein the position of the jaw is based on separation of teeth.
  - 8. The system defined in claim 1 further comprising a light source mounted on the telephone device to illuminate a user's mouth.
  - 9. The system defined in claim 1 further comprising an infrared (IR) source mounted to the telephone device to illuminate a user's mouth.
  - 10. The system defined in claim 9 wherein the camera comprises an IR-sensitive camera.
  - 11. The system defined in claim 9 wherein the camera comprises an optical camera.
  - 21. The system defined in claim 1 wherein the camera images a portion of the speaker's mouth from a location which is at an angle with respect to a direct frontal view of the speaker that is dependent on the speaker's facial features and the speaker's positioning of the telephone transmitter.

12. An apparatus for obtaining data for use by recognition processing logic, said apparatus comprising:
- a telephone transmitter;
  
  a camera coupled to the telephone transmitter and positioned with respect to the telephone transmitter to obtain video information, corresponding to at least one facial feature for speechreading, from a non-direct frontal view of a speaker, wherein the video information comprises position of a tongue of a user and the rounding protrusion of the lips;
  
  a data channel coupled to the camera to transfer the video information from the camera to the recognition processing logic to enable speechreading recognition of the video information.
- Dependent claims (13, 14, 15, 16, 17, 18, 19, 22):
  - 13. The apparatus defined in claim 12 wherein the video information further comprises the position of the jaw.
  - 14. The apparatus defined in claim 13 wherein the position of the jaw is based on separation of teeth.
  - 15. The apparatus defined in claim 12 wherein the camera comprises a digital camera.
  - 16. The apparatus defined in claim 12 further comprising a light source mounted to the telephone device to illuminate a user's mouth.
  - 17. The apparatus defined in claim 12 further comprising an infrared (IR) source mounted to the telephone device to illuminate a user's mouth.
  - 18. The apparatus defined in claim 17 wherein the camera comprises an IR-sensitive camera.
  - 19. The apparatus defined in claim 17 wherein the camera comprises an IR optical-sensitive camera.
  - 22. The apparatus defined in claim 12 wherein the camera images a portion of the speaker's mouth from a location which is at an angle with respect to a direct frontal view of the speaker that is dependent on the speaker's facial features and the speaker's positioning of the telephone transmitter.

20. A method for performing recognition comprising the steps of:
- receiving audio information from a speaker using a telephone transmitter;
  
  receiving video information, corresponding to at least one facial feature of the speaker, from a non-direct frontal view of the speaker, using a camera coupled to the telephone transmitter;
  
  transferring the audio and video information by a data channel to recognition logic for speech and pattern recognition.
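
The three steps of claim 20 can be read as a simple producer and consumer data flow. The sketch below is only an illustration under stated assumptions: `Packet`, `capture`, and `recognize` are hypothetical names, a standard-library `queue.Queue` stands in for the data channel, the `None` end-of-stream marker is an invention of this example, and the "recognition" step merely tallies received bytes instead of performing speech or pattern recognition.

```python
import queue
from typing import NamedTuple


class Packet(NamedTuple):
    kind: str       # "audio" (from the transmitter) or "video" (from the camera)
    payload: bytes


def capture(channel, audio_chunks, video_frames):
    """Receive audio and video, then transfer both over the data channel."""
    for audio, video in zip(audio_chunks, video_frames):
        channel.put(Packet("audio", audio))
        channel.put(Packet("video", video))
    channel.put(None)  # end-of-stream sentinel (assumption of this sketch)


def recognize(channel):
    """Placeholder for the recognition logic: counts bytes per stream."""
    totals = {"audio": 0, "video": 0}
    while (packet := channel.get()) is not None:
        totals[packet.kind] += len(packet.payload)
    return totals


if __name__ == "__main__":
    ch = queue.Queue()
    capture(ch, [b"aa", b"bb"], [b"vvv", b"www"])
    print(recognize(ch))
```

Keeping audio and video as tagged packets on one channel mirrors the claim's single "data channel" carrying both kinds of information. A real system could just as well use separate channels.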

Current Assignee: Ricoh Company Limited, Ricoh Corporation (Ricoh Company Limited)
Original Assignee: Ricoh Company Limited, Ricoh Corporation (Ricoh Company Limited)
Inventors: Stork, David G.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Smits, Talivaldis Ivars
Application Number: US08/516,090
Time in Patent Office: 1,118 Days
Field of Search: 395/2.69, 395/2.8, 379/52, 348/14, 704/260, 704/271
US Class Current: 704/260
CPC Class Codes:
G10L 15/25   using position of the lips,...
H04M 1/22   Illumination; Arrangements ...
H04M 1/271   controlled by voice recogni...
H04M 2250/52   including functional featur...
