Speech dialogue systems with repair facility
First Claim
1. An automated dialogue apparatus comprising:
a buffer (10) for storing coded representations;
speech generation means (6) operable to generate a speech signal from the coded representation for confirmation by a user;
speech recognition means (2) operable to recognise speech received from the user and generate a coded representation thereof;
means (5) operable to compare the coded representation from the recogniser of a response from the user with the contents of the buffer to determine, for each of a plurality of different alignments between the coded response and the buffer contents, a respective similarity measure, wherein at least some of said comparisons involve comparing only a leading portion of the coded response with a part of the buffer contents already uttered by the speech generation means; and
means (5) for replacing at least part of the buffer contents with at least part of said recognised response, in accordance with the alignment having the similarity measure indicative of the greatest similarity.
Abstract
The system has a speech recogniser (2) for recognising speech from a user and a synthesiser (6) for replying to him and engages in a dialogue with the object of enabling the user to convey to the system a piece of information such as a telephone number. The system builds up the number in a buffer (10). Each time it receives a string of digits, it reads it back for confirmation. When a number (or part of one) is read back, it is divided into “chunks” according to certain criteria: the positions of these divisions can be recorded to be taken into account in later processing. Responses are compared with the current buffer contents to determine whether they should be interpreted as a correction, partial correction or pure continuation of the existing contents. Positions in the buffer at which pure continuations are entered are marked, to allow a “final repair” process in which, if the final result fails to match some criterion of acceptability (e.g. length), the marked positions can be reexamined to determine whether interpretation instead as correction or partial correction would meet the criterion. Algorithms are described for comparing new input with digits already received, to decide how it is to be interpreted.
51 Citations
54 Claims
-
1. An automated dialogue apparatus comprising:
-
a buffer (10) for storing coded representations;
speech generation means (6) operable to generate a speech signal from the coded representation for confirmation by a user;
speech recognition means (2) operable to recognise speech received from the user and generate a coded representation thereof;
means (5) operable to compare the coded representation from the recogniser of a response from the user with the contents of the buffer to determine, for each of a plurality of different alignments between the coded response and the buffer contents, a respective similarity measure, wherein at least some of said comparisons involve comparing only a leading portion of the coded response with a part of the buffer contents already uttered by the speech generation means; and
means (5) for replacing at least part of the buffer contents with at least part of said recognised response, in accordance with the alignment having the similarity measure indicative of the greatest similarity. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
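Illustrative sketch (not part of the claims): the comparison and replacement steps of claim 1 might look roughly as follows for digit strings. The function names, the match-fraction similarity measure, the `threshold` value and the rule of appending when nothing matches well are all assumptions made for this example; the claim itself does not fix any of them.

```python
def best_alignment(buffer, response, uttered_len):
    """Return (offset, score) for the best alignment of response against buffer.

    offset is the buffer index at which response[0] is placed. Only the part of
    the response overlapping buffer[:uttered_len] (the part already read back)
    is compared, so for late offsets just a leading portion of the response
    takes part in the comparison.
    """
    best = (uttered_len, 0.0)  # default: nothing matched
    for offset in range(uttered_len):
        overlap = min(len(response), uttered_len - offset)
        if overlap == 0:
            continue
        matches = sum(
            1 for i in range(overlap) if buffer[offset + i] == response[i]
        )
        score = matches / overlap  # hypothetical similarity measure
        if score > best[1]:
            best = (offset, score)
    return best


def repair(buffer, response, uttered_len, threshold=0.5):
    """Replace buffer contents according to the most similar alignment."""
    offset, score = best_alignment(buffer, response, uttered_len)
    if score < threshold:
        # No convincing overlap: treat the response as a pure continuation.
        return buffer + response
    return buffer[:offset] + response + buffer[offset + len(response):]
```

For example, with the buffer `"01473645"` fully read back, the response `"646"` aligns best at offset 5 and overwrites the last three digits, whereas `"999"` on the buffer `"0147"` matches nowhere and is appended.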
-
-
18. A method of speech recognition comprising
(a) receiving a coded representation;
(b) performing at least once the steps of (b1) recognising speech from a speaker to generate a coded representation thereof;
(b2) updating the previous coded representation by concatenation of at least part thereof with this recognised coded representation;
(b3) marking the position within the updated representation at which said concatenation occurred; and
(c) comparing a part of the updated representation immediately following the marked position with a part immediately preceding the same marked position to determine whether or not said immediately following part can be interpreted as a correction or partial correction of said immediately preceding part. - View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 41)
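Illustrative sketch (not part of the claims): steps (b2), (b3) and (c) of claim 18 could be modelled as below. The names `concatenate` and `correction_candidates`, the mismatch tolerance and the minimum comparison length are hypothetical choices for this example only.

```python
def concatenate(buffer, marks, recognised):
    """Append the newly recognised digits and record where they were joined."""
    return buffer + recognised, marks + [len(buffer)]


def correction_candidates(buffer, mark, max_mismatches=1, min_len=2):
    """Return the lengths n for which the n digits just after the mark
    resemble the n digits just before it, i.e. the lengths for which the
    following part could be read as a (partial) correction of the
    preceding part."""
    preceding, following = buffer[:mark], buffer[mark:]
    candidates = []
    for n in range(min_len, min(len(preceding), len(following)) + 1):
        old = preceding[-n:]
        new = following[:n]
        mismatches = sum(a != b for a, b in zip(old, new))
        if mismatches <= max_mismatches:
            candidates.append(n)
    return candidates
```

If `"736455"` is first concatenated onto `"014736"`, the mark is recorded at position 6, and the later comparison finds that the three digits after the mark repeat the three digits before it, so the input may have been a restart rather than a continuation.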
-
-
19. A method of speech recognition comprising
(a) recognising an utterance from a speaker to generate a coded representation thereof;
(b) detecting in the utterance a position that is followed by input having a correcting function and marking this position within the coded representation; and
(c) comparing a part of the updated representation immediately following the marked position with a part immediately preceding the same marked position to determine whether or not said immediately following part can be interpreted as a correction or partial correction of said immediately preceding part.
-
31. A method of speech recognition comprising
(a) recognising speech received from a speaker and generating a coded representation of each discrete utterance thereof; and storing a plurality of representations of discrete utterances in sequence in a buffer, including markers indicative of divisions between units corresponding to the discrete utterances;
(b) performing a comparison process having a plurality of comparison steps, wherein each comparison step comprises comparing a first comparison sequence (each of which comprises a unit or leading portion thereof) with a second comparison sequence which, in the stored sequence, immediately precedes the first comparison sequence, so as to determine whether the first and second comparison sequences meet a predetermined criterion of similarity;
(c) in the event that the comparison process identifies only one instance of first and second comparison sequences meeting the criterion, deleting the second comparison sequence of that instance from the stored sequence. - View Dependent Claims (33)
-
32. A method of speech recognition comprising
(a) recognising speech received from a speaker and generating a coded representation of each discrete utterance thereof; and storing a plurality of representations of discrete utterances in sequence in a buffer, including markers indicative of divisions between units corresponding to the discrete utterances;
in response to a parameter which defines an expected length for the stored sequence, the step of comparing the actual length of the stored sequence with the parameter and in the event that the actual length exceeds the parameter;
(b) performing a comparison process having a plurality of comparison steps, wherein each comparison step comprises comparing a first comparison sequence (each of which comprises a unit or leading portion thereof) with a second comparison sequence which, in the stored sequence, immediately precedes the first comparison sequence, so as to determine whether the first and second comparison sequences meet a predetermined criterion of similarity;
(c) in the event that the comparison process identifies only one instance where both (i) the length of the second comparison sequence is equal to the difference between the actual and expected length and (ii) the first and second comparison sequences meet the criterion, deleting the second comparison sequence of that instance from the stored sequence.
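Illustrative sketch (not part of the claims): the length-driven repair of claim 32 could be approximated as follows, with the stored units held as a list of digit strings so that the list boundaries play the role of the markers. The function name, the mismatch-count similarity criterion and its tolerance are assumptions for this example.

```python
def final_repair(units, expected_len, max_mismatches=1):
    """Delete one over-long stretch if exactly one unit boundary explains it."""
    seq = "".join(units)
    excess = len(seq) - expected_len
    if excess <= 0:
        return seq  # nothing to repair

    hits = []
    pos = 0
    for prev, unit in zip(units, units[1:]):
        pos += len(prev)  # marker position between prev and unit
        if pos < excess or len(unit) < excess:
            continue
        second = seq[pos - excess:pos]  # digits just before the marker
        first = unit[:excess]           # leading portion of the next unit
        mismatches = sum(a != b for a, b in zip(first, second))
        if mismatches <= max_mismatches:
            hits.append(pos)

    if len(hits) == 1:
        pos = hits[0]
        return seq[:pos - excess] + seq[pos:]
    return seq  # zero or several candidates: leave the sequence unchanged
```

With units `["01473", "645", "646", "123"]` and an expected length of 11, only the boundary before `"646"` qualifies, so `"645"` is deleted. When several boundaries qualify, as with `["12", "12", "12"]` and an expected length of 4, this sketch leaves the sequence alone, mirroring the "only one instance" condition.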
-
34. A method of speech recognition comprising
(a) storing a coded representation;
(b) selecting a portion of the stored coded representation;
(c) supplying the selected portion to speech generation means operable to generate a speech signal therefrom for confirmation by a user;
(d) recognising a spoken response from the user to generate a coded representation thereof; and
(e) updating the stored coded representation on the basis of the recognised response;
wherein said updating includes updating at least one part of the stored coded representation other than the selected portion. - View Dependent Claims (35, 36, 37, 38, 39, 40)
-
-
42. An automated dialogue apparatus comprising
speech generation means operable to generate a speech signal from a coded representation for confirmation by a user, characterised by means operable in dependence on the length of the coded representation to divide the coded representation into at least two portions, to supply a first portion to the speech generation means and to await a response from the user before supplying any further portion to the speech generation means.
-
44. An automated dialogue apparatus comprising:
-
speech generation means operable to generate a speech signal from a coded representation for confirmation by a user; and
means operable to divide the coded representation into at least two portions, to supply a first portion to the speech generation means and to await a response from the user before supplying any further portion to the speech generation means;
characterised by means for recognising predetermined patterns in the coded representation and wherein upon such recognition one of the portions is determined by reference to a recognised pattern.
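Illustrative sketch (not part of the claims): the division into portions described in claims 42 and 44 might be sketched as below, using a recognised leading pattern as the first portion and a length limit for the remainder. The prefix list, the `max_len` value and the function name are hypothetical and chosen only for this example.

```python
KNOWN_PREFIXES = ("01473", "020")  # hypothetical predetermined patterns


def split_for_readback(digits, max_len=4):
    """Divide a digit string into portions to be read back one at a time."""
    portions = []
    rest = digits
    for prefix in KNOWN_PREFIXES:
        # A recognised pattern becomes a portion of its own.
        if digits.startswith(prefix) and len(digits) > len(prefix):
            portions.append(prefix)
            rest = digits[len(prefix):]
            break
    # The remainder is divided according to its length.
    portions.extend(rest[i:i + max_len] for i in range(0, len(rest), max_len))
    return portions
```

A dialogue controller would then supply the first portion to the speech generator and wait for the user's response before supplying the next one.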
-
-
49. An automated dialogue apparatus comprising:
-
a buffer (10) for storing coded representations;
speech recognition means (2) operable to recognise speech received from the user, including detecting phrasal boundaries in said input speech, and to store in the buffer a coded representation of the recognised speech and markers indicative of the positions of said phrasal boundaries;
speech generation means (6) operable to generate a speech signal from the coded representation for confirmation by a user;
control means operable in response to the phrase boundary markers to divide the coded representation into at least two portions, to supply a first portion to the speech generation means for a response from the user before supplying any further portion to the speech generation means.
-
-
50. An automated dialogue method comprising:
storing coded representations including markers indicative of points of ambiguity;
comparing, for each of a plurality of different alignments thereof, a part of the coded representations immediately following a marked point with a part immediately preceding the same marked point to determine whether or not said immediately following part can be interpreted as a correction or partial correction of said immediately preceding part;
wherein at least some of said comparisons involve comparing only a leading portion of said immediately following part with said immediately preceding part.
-
51. An automated dialogue apparatus comprising:
-
speech recognition means operable to recognise speech received from a speaker and generate a coded representation thereof;
timeout means operable to determine in accordance with a silence duration parameter when an utterance being recognised is deemed to have ended;
characterised by means operable, during an utterance, in dependence on the contents of the utterance to date, to vary the timeout parameter for the continuation of that utterance. - View Dependent Claims (52, 53)
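Illustrative sketch (not part of the claims): one way the timeout of claim 51 could depend on the utterance so far is to shorten the permitted silence once a plausibly complete number has been heard. The expected length and both durations are assumed values, and the rule itself is only one possible reading of "in dependence on the contents of the utterance to date".

```python
def silence_timeout(digits_so_far, expected_len=11, short=0.6, long=1.5):
    """Return the silence duration (seconds) after which the utterance
    is deemed to have ended, given the digits recognised so far."""
    if len(digits_so_far) >= expected_len:
        return short  # number looks complete: end the utterance quickly
    return long       # more digits are likely: tolerate a longer pause
```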
-
-
54. An automated dialogue apparatus comprising:
-
speech recognition means operable to recognise speech received from a speaker and generate a coded representation thereof;
timeout means operable to determine in accordance with a silence duration parameter when an utterance being recognised is deemed to have ended;
characterised by means operable in dependence on a dialogue state to vary the timeout parameter.
-
Specification