Learning of dialogue states and language model of spoken information system
Abstract
In this invention dialogue states for a dialogue model are created using a training corpus of example human-human dialogues. Dialogue states are modelled at the turn level rather than at the move level, and the dialogue states are derived from the training corpus.
The range of operator dialogue utterances is actually quite small in many services and therefore may be categorized into a set of predetermined meanings. This is an important assumption which is not true of general conversation, but is often true of conversations between telephone operators and people.
Phrases are specified which have specific substitution and deletion penalties. For example, the two phrases ‘I would like to’ and ‘can I’ may be specified as a possible substitution with low or zero penalty. This allows common equivalent phrases to be given low substitution penalties. Insignificant phrases such as ‘erm’ are given low or zero deletion penalties.
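One way to picture the phrase-level penalties is as a word-level edit distance extended with two extra moves: skipping an insignificant phrase and swapping one equivalent phrase for another, each at low or zero cost. The following is a minimal sketch under stated assumptions, not the method of the specification: the function name `phrase_distance`, the `low_cost` parameter, and the unit cost for ordinary edits are all hypothetical choices made for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple

Phrase = Tuple[str, ...]


def phrase_distance(
    a: Sequence[str],
    b: Sequence[str],
    insignificant: Iterable[Phrase],
    equivalents: Iterable[Tuple[Phrase, Phrase]],
    low_cost: float = 0.0,  # assumed penalty for "free" phrase operations
) -> float:
    """Word-level edit distance with phrase-level low-cost operations."""
    a, b = tuple(a), tuple(b)
    skip: Set[Phrase] = set(insignificant)
    # Treat equivalence as symmetric: (p, q) also allows (q, p).
    pairs = set()
    for p, q in equivalents:
        pairs.add((p, q))
        pairs.add((q, p))

    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = cheapest cost of aligning a[:i] with b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            cur = d[i][j]
            if cur == inf:
                continue
            if i < n:  # ordinary deletion of one word from a
                d[i + 1][j] = min(d[i + 1][j], cur + 1.0)
            if j < m:  # ordinary insertion of one word from b
                d[i][j + 1] = min(d[i][j + 1], cur + 1.0)
            if i < n and j < m:  # match (cost 0) or substitution (cost 1)
                step = 0.0 if a[i] == b[j] else 1.0
                d[i + 1][j + 1] = min(d[i + 1][j + 1], cur + step)
            for p in skip:  # skip an insignificant phrase on either side
                k = len(p)
                if k and a[i:i + k] == p:
                    d[i + k][j] = min(d[i + k][j], cur + low_cost)
                if k and b[j:j + k] == p:
                    d[i][j + k] = min(d[i][j + k], cur + low_cost)
            for p, q in pairs:  # swap one equivalent phrase for the other
                if p and q and a[i:i + len(p)] == p and b[j:j + len(q)] == q:
                    ni, nj = i + len(p), j + len(q)
                    d[ni][nj] = min(d[ni][nj], cur + low_cost)
    return d[n][m]


first = "erm i would like to check my balance".split()
second = "can i check my balance".split()
fillers = {("erm",)}
same_meaning = {(("i", "would", "like", "to"), ("can", "i"))}

print(phrase_distance(first, second, fillers, same_meaning))  # 0.0
print(phrase_distance(first, second, set(), set()))           # 4.0
```

With the two phrase sets supplied, the example utterances are treated as identical; without them, the same pair is four word edits apart, which illustrates why the penalties matter when utterances are later compared.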
20 Claims
1. A method of classifying a plurality of sequences of symbols to form a plurality of sets of sequences of symbols comprising the steps of
a) determining a distance between each sequence and each other sequence in said plurality of sequences in dependence upon a set of insignificant symbol sequences and a set of equivalent symbol sequence pairs; and
b) grouping the plurality of sequences into a plurality of sets in dependence upon said distances.
Dependent claims: 2, 3, 4, 5, 6, 7, 13
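The claim does not say how sequences are grouped once the pairwise distances are known. As an illustrative sketch only, the code below assumes a simple rule: any two sequences whose distance is at or below a threshold end up in the same set (single-link grouping via union-find). The names `group_by_distance` and `word_edit_distance`, the threshold value, and the plain word-level edit distance used as a stand-in for the claimed distance are all hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence


def word_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Plain word-level Levenshtein distance (stand-in distance measure)."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete wa
                           cur[j - 1] + 1,             # insert wb
                           prev[j - 1] + (wa != wb)))  # match or substitute
        prev = cur
    return prev[-1]


def group_by_distance(
    sequences: List[Sequence[str]],
    distance: Callable[[Sequence[str], Sequence[str]], float],
    threshold: float,
) -> List[List[Sequence[str]]]:
    """Put sequences linked by a distance <= threshold into the same set."""
    parent = list(range(len(sequences)))

    def find(x: int) -> int:
        # Follow parent links to the set representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare each sequence with each other sequence exactly once.
    for i, j in combinations(range(len(sequences)), 2):
        if distance(sequences[i], sequences[j]) <= threshold:
            parent[find(i)] = find(j)

    groups: Dict[int, List[Sequence[str]]] = {}
    for i, seq in enumerate(sequences):
        groups.setdefault(find(i), []).append(seq)
    return list(groups.values())


utterances = [
    "which number do you require".split(),
    "what number do you require".split(),
    "thank you for calling".split(),
    "thanks for calling".split(),
]

for group in group_by_distance(utterances, word_edit_distance, threshold=2):
    print([" ".join(words) for words in group])
```

Any distance function with the same call shape can be passed in, including one that takes insignificant and equivalent phrases into account; other clustering rules would fit the claim wording equally well.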
8. An apparatus for classifying a plurality of sequences of symbols to form a plurality of sets of sequences of symbols comprising
a store for storing a set of insignificant symbol sequences;
a store for storing a set of equivalent symbol sequence pairs;
means for determining a distance between each sequence and each other sequence in said plurality of sequences in dependence upon the set of insignificant symbol sequences and the set of equivalent symbol sequence pairs; and
means for grouping the plurality of sequences into a plurality of sets in dependence upon said distances.
Dependent claims: 9, 10, 11, 14
12. An apparatus for generating a grammar for enquiries made to a call centre comprising
a store for storing a plurality of sets of sequences of words;
means for transcribing a plurality of enquiries according to which of the sets the sequences of words in the enquiry occur; and
means for generating a grammar in dependence upon the resulting transcription.
15. A method of classifying a plurality of sequences of words to form a plurality of sets of sequences of words, the method comprising the steps of:
transcribing the plurality of sequences of words from operator speech signals generated during an enquiry to a call centre;
determining a distance between each sequence of words and each other sequence of words in said plurality of sequences; and
grouping the plurality of sequences of words into a plurality of sets in dependence upon said distances.
Dependent claims: 16, 17, 18
19. An apparatus for classifying a plurality of sequences of words to form a plurality of sets of sequences of words comprising:
transcribing means for transcribing the plurality of sequences of words from operator speech signals generated during an enquiry to a call centre;
means for determining a distance between each sequence and each other sequence in said plurality of sequences in dependence upon the set of insignificant symbol sequences and the set of equivalent symbol sequence pairs; and
means for grouping the plurality of sequences into a plurality of sets in dependence upon said distances.
Dependent claims: 20
Specification