Method and apparatus for combining information from speech signals for adaptive interaction in teaching and testing
Abstract
A computer system with a speech recognition component provides a method and apparatus for instructing and evaluating the proficiency of human users in skills that can be exhibited through speaking. The computer system tracks linguistic, indexical and paralinguistic characteristics of the spoken input of users, and implements games, data access, instructional systems, and tests. The computer system combines characteristics of the spoken input automatically to select appropriate material and present it in a manner suitable for the user. In one embodiment, the computer system measures the response latency and speaking rate of the user and presents its next spoken display at an appropriate speaking rate. In other embodiments, the computer system identifies the gender and native language of the user, and combines that information with the relative accuracy of the linguistic content of the user's utterance to select and display material that may be easier or more challenging for speakers with these characteristics.
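The embodiment that measures response latency and speaking rate can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `ResponseMeasures` fields, the syllables-per-second unit, the equal-weight blend, the 1.5 second latency threshold, and the rate limits.

```python
from dataclasses import dataclass


@dataclass
class ResponseMeasures:
    """Hypothetical container for measures taken from one spoken response."""
    latency_s: float   # time from end of prompt to start of speech, in seconds
    syllables: int     # syllables detected in the response
    duration_s: float  # duration of the spoken response, in seconds


def speaking_rate(m: ResponseMeasures) -> float:
    """Return the user's speaking rate in syllables per second."""
    if m.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return m.syllables / m.duration_s


def next_prompt_rate(
    m: ResponseMeasures,
    base_rate: float = 4.0,       # assumed default prompt rate (syllables/s)
    min_rate: float = 2.5,        # assumed lower limit
    max_rate: float = 5.5,        # assumed upper limit
    slow_latency_s: float = 1.5,  # assumed threshold for a "slow" response
) -> float:
    """Pick a speaking rate for the next prompt (illustrative heuristic only)."""
    # Blend the default rate with the user's own rate, weighted equally.
    target = 0.5 * base_rate + 0.5 * speaking_rate(m)
    # A long latency is treated as a sign of difficulty, so slow down further.
    if m.latency_s > slow_latency_s:
        target *= 0.85
    # Keep the result inside the allowed range.
    return max(min_rate, min(max_rate, target))


hesitant = ResponseMeasures(latency_s=2.0, syllables=9, duration_s=3.0)
fluent = ResponseMeasures(latency_s=0.4, syllables=12, duration_s=2.0)
print(next_prompt_rate(hesitant), next_prompt_rate(fluent))
```

The hesitant speaker gets a slower next prompt than the fluent one. A real system would presumably calibrate such thresholds against data rather than fix them by hand.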
59 Claims
- 1. A computer-assisted method, comprising selecting a next prompt for presentation to a user based at least in part on one or more extra-linguistic measures estimated from one or more spoken responses received from the user in response to one or more prior prompts, the extra-linguistic measures being chosen from a group including the latency of the spoken response, the amplitude of the spoken response and fundamental frequency values of the spoken response.
- 15. A computer-assisted method, comprising selecting a next prompt for presentation to a user based at least in part on a value of the user's native language, age or gender as estimated at least in part from one or more spoken responses of the user to one or more prior prompts.
- 20. A computer-assisted method, comprising selecting a next prompt for presentation to a user based at least in part on prosodic measures estimated from one or more spoken responses received from the user in response to one or more prior prompts and indexical values, the prosodic measures being chosen from a group including the rate of speech of the user during the period of the spoken response and the fluency of the spoken response, and the indexical values being chosen from a group including user identity, user native language, user age, and user gender, the indexical values being estimated from the one or more spoken responses or directly provided by the user.
- 24. A computer-assisted method, comprising estimating skills, traits or states of a user from extra-linguistic measures derived from one or more spoken responses received from the user in response to one or more prompts, the extra-linguistic measures being chosen from a group including the latency of at least one of the spoken responses, the amplitude of at least one of the spoken responses and fundamental frequency values of at least one of the spoken responses.
- 39. A computer-assisted method, comprising using one or more semaphore values measured from one or more spoken responses of a user received in response to one or more prompts to select extra-linguistic characteristics, prosodic characteristics, indexical values or production quality characteristics with which to present linguistic units in a next prompt, the extra-linguistic characteristics being chosen from a group including the latency of the spoken response, the amplitude of the spoken response and fundamental frequency values of the spoken response, the prosodic characteristics being chosen from a group including the rate of speech of the user during the period of the spoken response and the fluency of the spoken response, the indexical values being chosen from a group including speaker identity, speaker native language, speaker age, and speaker gender, and the production quality characteristics being chosen from a group including the pronunciation style or quality of the spoken response.
- 43. An interactive computer-based system wherein spoken responses are elicited from a user in response to prompts presented by the system, the system comprising:
  a) means for extracting linguistic, indexical, or paralinguistic values in the user's spoken response; and
  b) means for automatically selecting a next prompt to be presented to the user according to combined values of (i) linguistic units, including words, phrases or sentences contained in the response and (ii) a latency of the spoken response relative to the prompt.
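One way to read the combination recited in claim 43, where the next prompt depends on both the linguistic content of the response and its latency, is sketched below. This is an illustration under stated assumptions, not the claimed implementation: the function names, the word-match accuracy measure, the numeric difficulty levels, and all thresholds are hypothetical.

```python
def word_accuracy(expected: list[str], recognized: list[str]) -> float:
    """Fraction of expected words found in the recognized response."""
    if not expected:
        raise ValueError("expected must contain at least one word")
    heard = {w.lower() for w in recognized}
    hits = sum(1 for w in expected if w.lower() in heard)
    return hits / len(expected)


def select_next_level(
    current_level: int,
    expected: list[str],
    recognized: list[str],
    latency_s: float,
    max_level: int = 10,  # assumed number of difficulty levels
) -> int:
    """Choose the difficulty level of the next prompt (illustrative rule)."""
    accuracy = word_accuracy(expected, recognized)
    if accuracy >= 0.8 and latency_s <= 1.0:
        # Accurate and prompt: offer harder material.
        level = current_level + 1
    elif accuracy < 0.5 or latency_s > 3.0:
        # Inaccurate or very slow: offer easier material.
        level = current_level - 1
    else:
        # Mixed evidence: stay at the same level.
        level = current_level
    # Keep the level inside the valid range 1..max_level.
    return max(1, min(max_level, level))


target = ["the", "cat", "sat"]
print(select_next_level(3, target, ["the", "cat", "sat"], latency_s=0.5))
print(select_next_level(3, target, ["the"], latency_s=0.5))
```

Here neither measure decides alone: a fully correct response delivered after a two-second pause leaves the level unchanged, since only one of the two conditions for advancing is met.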
Specification