Method and apparatus for combining information from speech signals for adaptive interaction in teaching and testing

US 5,870,709 A
Filed: 11/25/1996
Issued: 02/09/1999
Est. Priority Date: 12/04/1995
Status: Expired due to Term

Abstract

A computer system with a speech recognition component provides a method and apparatus for instructing and evaluating the proficiency of human users in skills that can be exhibited through speaking. The computer system tracks linguistic, indexical and paralinguistic characteristics of the spoken input of users, and implements games, data access, instructional systems, and tests. The computer system combines characteristics of the spoken input automatically to select appropriate material and present it in a manner suitable for the user. In one embodiment, the computer system measures the response latency and speaking rate of the user and presents its next spoken display at an appropriate speaking rate. In other embodiments, the computer system identifies the gender and native language of the user, and combines that information with the relative accuracy of the linguistic content of the user's utterance to select and display material that may be easier or more challenging for speakers with these characteristics.
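
The embodiment in the abstract that measures response latency and speaking rate, then paces the next spoken display accordingly, might be sketched as follows. This is only an illustration under stated assumptions: the `Response` fields, the 1.5 s latency threshold, the 0.8 slow-down factor, and the 1.5 to 4.0 words-per-second limits are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Timing of one spoken response (all fields are illustrative assumptions)."""
    prompt_end_s: float    # time at which the prompt finished playing
    speech_start_s: float  # time at which the user started speaking
    speech_end_s: float    # time at which the user stopped speaking
    word_count: int        # number of words recognized in the response


def latency_s(r: Response) -> float:
    """Response latency: gap between the end of the prompt and speech onset."""
    return max(0.0, r.speech_start_s - r.prompt_end_s)


def speaking_rate_wps(r: Response) -> float:
    """Speaking rate in words per second over the spoken interval."""
    duration = r.speech_end_s - r.speech_start_s
    return r.word_count / duration if duration > 0 else 0.0


def next_prompt_rate_wps(r: Response, default_wps: float = 3.0,
                         slow_latency_s: float = 1.5) -> float:
    """Choose a speaking rate for the next prompt (hypothetical rule)."""
    # Start from the user's own rate, or a default if none could be measured.
    rate = speaking_rate_wps(r) or default_wps
    # A long latency is treated here as a sign the user needs slower speech.
    if latency_s(r) > slow_latency_s:
        rate *= 0.8
    # Keep the result inside an assumed intelligible range.
    return min(max(rate, 1.5), 4.0)
```

Matching the prompt to the user's own rate and slowing down after a long latency is one plausible reading of "an appropriate speaking rate"; the abstract does not fix a particular rule.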

59 Claims

1. A computer-assisted method, comprising selecting a next prompt for presentation to a user based at least in part on one or more extra-linguistic measures estimated from one or more spoken responses received from the user in response to one or more prior prompts, the extra-linguistic measures being chosen from a group including the latency of the spoken response, the amplitude of the spoken response and fundamental frequency values of the spoken response.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14):
  - 2. The computer-assisted method of claim 1 wherein the next prompt is selected based at least in part on the one or more extra-linguistic measures and one or more prosodic measures estimated from the one or more spoken responses, the prosodic measures being chosen from a group including the rate of speech of the user during the period of the spoken response and the fluency of the spoken response.
  - 3. The computer-assisted method of claim 2 wherein the next prompt is selected based at least in part on the extra-linguistic and prosodic measures and one or more indexical values, the indexical values being chosen from a group including speaker identity, speaker native language, speaker age, and speaker gender, the indexical values being estimated from the one or more spoken responses or directly provided by the user.
  - 4. The computer-assisted method of claim 3 wherein the next prompt is selected based at least in part on the extra-linguistic measures, prosodic measures and indexical values and one or more production quality measures estimated from the one or more spoken responses, the production quality measures from a group including the pronunciation quality of the spoken response.
  - 5. The computer-assisted method of claim 4 wherein the next prompt is selected based at least in part on the indexical values, the extra-linguistic, prosodic, and production quality measures and the identity of one or more linguistic units which comprise the one or more spoken responses.
  - 6. The computer-assisted method of claim 3 wherein the next prompt is selected based at least in part on the extra-linguistic measures, prosodic measures and indexical values and the identity of one or more linguistic units which comprise the one or more spoken responses.
  - 7. The computer-assisted method of claim 2 wherein the next prompt is selected based at least in part on the extra-linguistic and prosodic measures and one or more production quality measures estimated from the one or more spoken responses, the production quality measures being chosen from a group including the pronunciation quality of the spoken response.
  - 8. The computer-assisted method of claim 2 wherein the next prompt is selected based at least in part on the extra-linguistic and prosodic measures and the identity of one or more linguistic units which comprise the one or more spoken responses.
  - 9. The computer-assisted method of claim 1 wherein the next prompt is selected based at least in part on the one or more extra-linguistic measures and one or more indexical values, the indexical values being chosen from a group including speaker identity, speaker native language, speaker age, and speaker gender, the indexical values being estimated from the one or more spoken responses or directly provided by the user.
  - 10. The computer-assisted method of claim 1 wherein the next prompt is selected based at least in part on the one or more extra-linguistic measures and one or more production quality measures estimated from the one or more spoken responses, the production quality measures being chosen from a group including the pronunciation quality of the spoken response.
  - 11. The computer-assisted method of claim 1 wherein the next prompt is selected based at least in part on the one or more extra-linguistic measures and the identity of one or more linguistic units which comprise the one or more spoken responses.
  - 12. The computer-assisted method of claim 1 wherein the next prompt comprises at least one of:
    - a request for information;
    - a request to read a linguistic unit;
    - a request to repeat a linguistic unit;
    - or a request to complete, fill in or identify a verbal aggregate.
  - 13. The computer-assisted method of claim 1 wherein the one or more spoken responses from the user are received at an interactive computer system via telephone or other telecommunication or data information network.
  - 14. The computer-assisted method of claim 1 wherein the next prompt comprises at least one of:
    - a graphical prompt, an audio prompt, or a combination of verbal and graphical elements.
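
As a rough illustration of claim 1, the sketch below selects the next prompt from a bank ordered from easy to hard, using the three extra-linguistic measures the claim names (latency, amplitude, fundamental frequency). The claim does not specify any decision rule; the `ExtraLinguistic` type, the thresholds, and the treatment of rising final pitch as hesitation are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ExtraLinguistic:
    """Extra-linguistic measures of one response (hypothetical representation)."""
    latency_s: float       # delay before the user began to answer
    rms_amplitude: float   # loudness of the response on an assumed 0.0 to 1.0 scale
    f0_hz: list[float]     # fundamental frequency values of voiced frames


def select_next_prompt(prompts: list[str], current: int, m: ExtraLinguistic) -> int:
    """Return the index of the next prompt in a bank ordered easy to hard."""
    step = 0
    if m.latency_s < 1.0 and m.rms_amplitude > 0.3:
        step = 1    # quick, firmly voiced answer: move to harder material
    elif m.latency_s > 3.0 or m.rms_amplitude < 0.1:
        step = -1   # slow or faint answer: move to easier material
    # Assumption: pitch that rises sharply at the end signals hesitation,
    # so the difficulty is not raised in that case.
    if len(m.f0_hz) >= 4 and mean(m.f0_hz[-2:]) > 1.2 * mean(m.f0_hz[:-2]):
        step = min(step, 0)
    # Stay inside the bounds of the prompt bank.
    return min(max(current + step, 0), len(prompts) - 1)
```

A stepwise rule over an ordered bank is only one possible design; a scoring model over all candidate prompts would fit the claim language equally well.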

15. A computer-assisted method, comprising selecting a next prompt for presentation to a user based at least in part on a value of the user's native language, age or gender as estimated at least in part from one or more spoken responses of the user to one or more prior prompts.
- Dependent claims (16, 17, 18, 19):
  - 16. The computer-assisted method of claim 15 wherein the next prompt comprises at least one of:
    - a request for information;
    - a request to read a linguistic unit;
    - a request to repeat a linguistic unit;
    - or a request to complete, fill in or identify a verbal aggregate.
  - 17. The computer-assisted method of claim 15 wherein the user's spoken responses are received at an interactive computer system via telephone or other telecommunication or data information network.
  - 18. The computer-assisted method of claim 15 wherein the next prompt comprises at least one of:
    - a graphical prompt, an audio prompt, or a combination of verbal and graphical elements.
  - 19. The computer-assisted method of claim 15 wherein the next prompt is further selected based at least in part on one or more measures of extra-linguistic, prosodic or production quality measures or linguistic units estimated from the one or more spoken responses, and the user's identity estimated from the one or more spoken responses or provided directly by the user.

20. A computer-assisted method, comprising selecting a next prompt for presentation to a user based at least in part on prosodic measures estimated from one or more spoken responses received from the user in response to one or more prior prompts and indexical values, the prosodic measures being chosen from a group including the rate of speech of the user during the period of the spoken response and the fluency of the spoken response, and the indexical values being chosen from a group including user identity, user native language, user age, and user gender, the indexical values being estimated from the one or more spoken responses or directly provided by the user.
- Dependent claims (21, 22, 23):
  - 21. The computer-assisted method of claim 20 wherein the next prompt comprises at least one of:
    - a request for information;
    - a request to read a linguistic unit;
    - a request to repeat a linguistic unit;
    - or a request to complete, fill in or identify a verbal aggregate.
  - 22. The computer-assisted method of claim 20 wherein the spoken responses from the user are received at an interactive computer system via telephone or other telecommunication or data information network.
  - 23. The computer-assisted method of claim 20 wherein the next prompt comprises at least one of:
    - a graphical prompt, an audio prompt, or a combination of verbal and graphical elements.
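
The two prosodic measures named in claim 20, rate of speech and fluency, could be estimated from word-level timestamps roughly as shown below. This sketch covers only the prosodic half of the claim and leaves the indexical values aside. The input format, the 0.5 s pause threshold, and the definition of fluency as the share of word boundaries without a long pause are assumptions, not definitions taken from the patent.

```python
def prosodic_measures(words, max_gap_s=0.5):
    """Estimate speech rate and a fluency proxy from word timings.

    words: list of (start_s, end_s) tuples, one per recognized word, in time order.
    Returns a dict with 'rate_wps' (words per second) and 'fluency' (0.0 to 1.0).
    """
    if not words:
        return {"rate_wps": 0.0, "fluency": 0.0}
    # Rate of speech over the whole period of the spoken response.
    total = words[-1][1] - words[0][0]
    rate = len(words) / total if total > 0 else 0.0
    # Silent gaps between consecutive words; long gaps count as pauses.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(words, words[1:])]
    pauses = sum(g > max_gap_s for g in gaps)
    # Fluency proxy: fraction of word boundaries crossed without a long pause.
    fluency = 1.0 - pauses / len(gaps) if gaps else 1.0
    return {"rate_wps": rate, "fluency": fluency}
```

For example, four words spoken over three seconds with one long pause in the middle give a rate of about 1.33 words per second and a fluency of about 0.67 under these definitions.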

24. A computer-assisted method, comprising estimating skills, traits or states of a user from extra-linguistic measures derived from one or more spoken responses received from the user in response to one or more prompts, the extra-linguistic measures being chosen from a group including the latency of at least one of the spoken responses, the amplitude of at least one of the spoken responses and fundamental frequency values of at least one of the spoken responses.
- Dependent claims (25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38):
  - 25. The computer-assisted method of claim 24 wherein the skills, traits or states are chosen from a group including language proficiency, subject matter skill, cognitive skill, verbal skill and vocal skill.
  - 26. The computer-assisted method of claim 25 wherein at least one of the prompts comprises at least one of:
    - a request for information;
    - a request to read a linguistic unit;
    - a request to repeat a linguistic unit;
    - or a request to complete, fill in or identify a verbal aggregate.
  - 27. The computer-assisted method of claim 25 wherein the spoken responses from the user are received at an interactive computer system via telephone or other telecommunication or data information network.
  - 28. The computer-assisted method of claim 25 wherein at least one of the prompts comprises at least one of:
    - one or more graphical prompts, one or more audio prompts, or a combination of verbal and graphical elements.
  - 29. The computer-assisted method of claim 24 wherein the skills, traits or states are estimated from the one or more extra-linguistic measures and one or more prosodic measures derived from the one or more spoken responses, the prosodic measures being chosen from a group including the rate of speech of the user during the period of the spoken response and the fluency of the spoken response.
  - 30. The computer-assisted method of claim 29 wherein the skills, traits or states are estimated from the extra-linguistic and prosodic measures and one or more indexical values, the indexical values being chosen from a group including speaker identity, speaker native language, speaker age, and speaker gender, the indexical values being estimated from the one or more spoken responses or directly provided by the user.
  - 31. The computer-assisted method of claim 30 wherein the skills, traits or states are estimated from the extra-linguistic measures, prosodic measures and indexical values and one or more production quality measures derived from the one or more spoken responses, the production quality measures from a group including the pronunciation quality of the spoken response.
  - 32. The computer-assisted method of claim 30 wherein the skills, traits or states are estimated from the extra-linguistic measures, prosodic measures and indexical values and the identity of one or more linguistic units which comprise the one or more spoken responses.
  - 33. The computer-assisted method of claim 32 wherein the skills, traits or states are estimated from the indexical values, the extra-linguistic, prosodic, and production quality measures and the identity of one or more linguistic units which comprise the one or more spoken responses.
  - 34. The computer-assisted method of claim 29 wherein the skills, traits or states are estimated from the extra-linguistic and prosodic measures and one or more production quality measures derived from the one or more spoken responses, the production quality measures being chosen from a group including the pronunciation quality of the spoken response.
  - 35. The computer-assisted method of claim 29 wherein the skills, traits or states are estimated from the extra-linguistic and prosodic measures and the identity of one or more linguistic units which comprise the one or more spoken responses.
  - 36. The computer-assisted method of claim 24 wherein the skills, traits or states are estimated from the one or more extra-linguistic measures and one or more indexical values, the indexical values being chosen from a group including speaker identity, speaker native language, speaker age, and speaker gender, the indexical values being estimated from the one or more spoken responses or directly provided by the user.
  - 37. The computer-assisted method of claim 24 wherein the skills, traits or states are estimated from the one or more extra-linguistic measures and one or more production quality measures derived from the one or more spoken responses, the production quality measures being chosen from a group including the pronunciation quality of the spoken response.
  - 38. The computer-assisted method of claim 24 wherein the skills, traits or states are estimated from the one or more extra-linguistic measures and the identity of one or more linguistic units which comprise the one or more spoken responses.
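
Claims 24 to 38 describe estimating a skill, trait or state from combinations of measures without saying how the measures are combined. One simple possibility is a weighted average of normalized scores, sketched below for a language-proficiency estimate. The function name, the weights, the 4 s latency ceiling, and the 3 words-per-second reference rate are all hypothetical choices for illustration.

```python
def clamp01(x: float) -> float:
    """Limit a value to the range 0.0 to 1.0."""
    return min(1.0, max(0.0, x))


def proficiency_estimate(latencies_s, rates_wps, pron_scores,
                         weights=(0.4, 0.3, 0.3)):
    """Combine per-response measures into one 0.0 to 1.0 score (illustrative only).

    latencies_s: response latencies in seconds (extra-linguistic measure)
    rates_wps:   speaking rates in words per second (prosodic measure)
    pron_scores: pronunciation quality, assumed already scaled to 0.0 to 1.0
    """
    n = len(latencies_s)
    if n == 0 or len(rates_wps) != n or len(pron_scores) != n:
        raise ValueError("need the same non-zero number of values for each measure")
    # Shorter latency scores higher: 0 s maps to 1.0, 4 s or more maps to 0.0.
    lat = sum(clamp01(1.0 - t / 4.0) for t in latencies_s) / n
    # Speaking rate is scored against an assumed fluent rate of 3 words/s.
    rate = sum(clamp01(r / 3.0) for r in rates_wps) / n
    pron = sum(clamp01(p) for p in pron_scores) / n
    w_lat, w_rate, w_pron = weights
    return w_lat * lat + w_rate * rate + w_pron * pron
```

In practice the weights would have to be fitted against human ratings or another external criterion; the fixed values here only make the example concrete.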

39. A computer-assisted method, comprising using one or more semaphore values measured from one or more spoken responses of a user received in response to one or more prompts to select extra-linguistic characteristics, prosodic characteristics, indexical values or production quality characteristics with which to present linguistic units in a next prompt, the extra-linguistic characteristics being chosen from a group including the latency of the spoken response, the amplitude of the spoken response and fundamental frequency values of the spoken response, the prosodic characteristics being chosen from a group including the rate of speech of the user during the period of the spoken response and the fluency of the spoken response, the indexical values being chosen from a group including speaker identity, speaker native language, speaker age, and speaker gender, and the production quality characteristics being chosen from a group including the pronunciation style or quality of the spoken response.
- Dependent claims (40, 41, 42):
  - 40. The computer-assisted method of claim 39 wherein the semaphore values comprise at least one of:
    - extra-linguistic measures, prosodic measures, indexical values, production quality measures or linguistic units.
  - 41. The computer-assisted method of claim 39 wherein the next prompt comprises at least one of:
    - a graphical prompt, an audio prompt, or a combination of verbal and graphical elements.
  - 42. The computer-assisted method of claim 39 wherein the next prompt comprises at least one of:
    - a request to repeat a linguistic unit;
    - a request to complete, fill in or identify a verbal aggregate;
    - a request to read a linguistic unit;
    - or a request for information.

43. An interactive computer-based system wherein spoken responses are elicited from a user in response to prompts presented by the system, the system comprising:
- a) means for extracting linguistic, indexical, or paralinguistic values in the user's spoken response; and
- b) means for automatically selecting a next prompt to be presented to the user according to combined values of (i) linguistic units, including words, phrases or sentences contained in the response and (ii) a latency of the spoken response relative to the prompt.
- Dependent claims (44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59):
  - 44. The system of claim 43 wherein the means for extracting indexical or paralinguistic values comprises means for extracting semaphore values including speaker identity, fundamental frequency values, speech signal amplitudes, pronunciation quality, fluency, speech rate, speaker native language, speaker age or speaker gender from the user's spoken responses.
  - 45. The system of claim 44 wherein the means for automatically selecting the next prompt comprises means for combining (a) one or more of the extracted semaphore values from the user's spoken responses with (b) a measure of the latency of the user's response and with (c) the linguistic units of a previous response to select the next prompt.
  - 46. The system of claim 44 wherein the means for automatically selecting the next prompt comprises means for estimating a user state by combining two or more of the extracted semaphore values to select the next prompt.
  - 47. The system of claim 44 wherein the means for automatically selecting the next prompt further comprises means for estimating a user skill or trait by combining two or more of the extracted semaphore values to select the next prompt.
  - 48. The system of claim 47 wherein the user skill or trait comprises at least one of:
    - language proficiency, subject matter knowledge, user age or user gender.
  - 49. The system of claim 48 further comprising means for selecting the linguistic, paralinguistic or indexical characteristics of the prompt, at least in part, by the linguistic, paralinguistic or indexical content of a spoken response from the user.
  - 50. The system of claim 44 wherein the linguistic, paralinguistic or indexical characteristics of the prompt presented include linguistic units, latency relative to the user's response, speech rate, fundamental frequency values, speech signal amplitudes, pronunciation quality, fluency, speaker identity, speaker age or speaker gender.
  - 51. The system of claim 44 wherein the prompt comprises a request for information.
  - 52. The system of claim 44 wherein the prompt comprises a request to read a linguistic unit.
  - 53. The system of claim 44 wherein the prompt comprises a request to repeat a linguistic unit.
  - 54. The system of claim 44 wherein the prompt comprises a request to complete, fill in or identify a verbal aggregate.
  - 55. The system of claim 43 wherein the means for extracting indexical or paralinguistic values comprises means for extracting semaphore values including user's native language derived from the user's spoken responses in a target language.
  - 56. The system of claim 43 wherein the spoken responses from the user are received at the interactive computer system via telephone or other telecommunication or data information network.
  - 57. The system of claim 43 wherein the prompts are graphical prompts.
  - 58. The system of claim 43 wherein the prompts are audio prompts.
  - 59. The system of claim 43 wherein the prompts combine verbal and graphical elements.
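
Element (b) of claim 43 combines the linguistic units found in the response with the latency of the response. A minimal sketch of such a combination follows, assuming the recognizer returns a plain word string and the expected answer is known. The word-overlap accuracy measure, the thresholds, and the four action labels are hypothetical and serve only to make the combination concrete.

```python
def word_accuracy(expected: str, recognized: str) -> float:
    """Fraction of the expected words that appear in the recognized response."""
    want = expected.lower().split()
    got = set(recognized.lower().split())
    if not want:
        return 0.0
    return sum(w in got for w in want) / len(want)


def next_action(expected: str, recognized: str, latency_s: float) -> str:
    """Decide what kind of prompt to present next (illustrative rule).

    Combines (i) linguistic content, via word accuracy, with (ii) the latency
    of the spoken response relative to the prompt.
    """
    acc = word_accuracy(expected, recognized)
    if acc >= 0.8 and latency_s <= 2.0:
        return "advance"      # correct and prompt: present harder material
    if acc >= 0.8:
        return "same_level"   # correct but slow: stay at this difficulty
    if acc >= 0.4:
        return "repeat"       # partly correct: present the same item again
    return "easier"           # mostly wrong: present easier material
```

The point of the combination is that two responses with identical wording can lead to different next prompts when their latencies differ, as the `"advance"` and `"same_level"` branches show.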

Current Assignee: Ordinate Corp. (Pearson plc)
Original Assignee: Ordinate Corp. (Pearson plc)
Inventors: Bernstein, Jared C.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Smits, Talivaldis Ivars
Application Number: US08/753,580
Time in Patent Office: 806 Days
Field of Search: 704/270, 704/275, 434/156, 434/185
US Class Current: 704/275
CPC Class Codes:
G09B 19/04   Speaking with audible prese...
G09B 7/04   characterised by modifying ...
G10L 15/22   Procedures used during a sp...
