Library of existing spoken dialog data for use in generating new natural language spoken dialog systems
Abstract
A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.
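The storage arrangement described in the abstract can be pictured with a minimal sketch. It assumes an in-memory dictionary in place of the database, and every name in it (`Utterance`, `Dataset`, `Library`, `audio_ref`, the phase strings) is hypothetical and chosen only for illustration; the patent does not prescribe a schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Utterance:
    """One previously collected utterance and its labels (hypothetical schema)."""
    audio_ref: str               # pointer to the stored audio, e.g. a file name
    call_type: str               # call-type label assigned by an annotator
    named_entities: tuple = ()   # (entity_type, value) pairs


@dataclass
class Dataset:
    """Reusable components for one industry sector, grouped by collection phase."""
    sector: str
    phases: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, phase: str, utterance: Utterance) -> None:
        # Each collection phase keeps its own group of utterances.
        self.phases[phase].append(utterance)


class Library:
    """In-memory stand-in for the database of datasets (an assumption)."""

    def __init__(self) -> None:
        self._datasets = {}

    def store(self, dataset: Dataset) -> None:
        # One dataset per sector; storing again replaces the earlier one.
        self._datasets[dataset.sector] = dataset

    def utterances(self, sector: str, phase: str) -> list:
        # Return a copy so callers cannot mutate the stored group.
        return list(self._datasets[sector].phases.get(phase, []))
```

Keying the groups by collection phase lets a new application draw on data from an early phase of an existing system without pulling in the later phases.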
22 Claims
1. A non-transitory computer-readable medium comprising:
- a plurality of reusable components for building a natural language spoken dialog system, each of the plurality of reusable components comprising a plurality of groups of previously collected audible utterances and associated labels for call-types and named entities, wherein:
(1) the plurality of reusable components is organized into a plurality of datasets;
(2) each of the plurality of datasets comprises data pertaining to an industrial sector in a different task domain;
(3) data in the plurality of datasets is collected during a plurality of collection phases, each of the plurality of collection phases comprising a respective defined period of time;
(4) each group of the plurality of groups of previously collected audible utterances was collected in a separate spoken dialog system operating within a respective industry sector; and
(5) an annotation guide comprising guideline utterances and descriptions, the guideline utterances comprising both positive and negative utterances for an associated call-type category, wherein the previously collected audible utterances are associated with an occurrence of utterance data comprising information indicating the associated call-type category, and wherein each respective industry sector is in a different task domain from other respective industry sectors.
Dependent claims: 2, 3, 4, 5, 6
7. A method comprising:
- storing via a processor a plurality of reusable components for building a natural language spoken dialog system, each of the plurality of reusable components comprising a plurality of groups of previously collected audible utterances and associated labels for call-types and named entities, wherein:
(1) the plurality of reusable components is organized into a plurality of datasets;
(2) each of the plurality of datasets comprises data pertaining to an industrial sector in a different task domain;
(3) data in the plurality of datasets is collected during a plurality of collection phases, each of the plurality of collection phases comprising a respective defined period of time;
(4) each group of the plurality of groups of previously collected audible utterances was collected in a separate spoken dialog system operating within a respective industry sector; and
(5) an annotation guide comprising guideline utterances and descriptions, the guideline utterances comprising both positive and negative utterances for an associated call-type category, wherein the previously collected audible utterances are associated with an occurrence of utterance data comprising information indicating an associated call-type category, and wherein each respective industry sector is in a different task domain from other respective industry sectors.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A system comprising:
- a processor; and
- a computer readable storage medium storing instructions for controlling the processor to perform steps comprising:
storing a plurality of reusable components for building a natural language spoken dialog system, each of the plurality of reusable components comprising a plurality of groups of previously collected audible utterances and associated labels for call-types and named entities, wherein:
(1) the plurality of reusable components is organized into a plurality of datasets;
(2) each of the plurality of datasets comprises data pertaining to an industrial sector in a different task domain;
(3) data in the plurality of datasets is collected during a plurality of collection phases, each of the plurality of collection phases comprising a respective defined period of time;
(4) each group of previously collected audible utterances was collected in a separate spoken dialog system operating within a respective industry sector; and
(5) an annotation guide comprising guideline utterances and descriptions, the guideline utterances comprising both positive and negative utterances for an associated call-type category, wherein the previously collected audible utterances are associated with an occurrence of utterance data comprising information indicating an associated call-type category, and wherein each respective industry sector is in a different task domain from other respective industry sectors.
Dependent claims: 19, 20, 21, 22
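Two conditions recited in the independent claims lend themselves to a simple check: each call-type category in the annotation guide carries both positive and negative guideline utterances, and each industry sector sits in a different task domain. The sketch below is illustrative only; `GuideEntry`, `incomplete_entries`, and `domains_are_distinct` are hypothetical names, and the mapping from sector to task domain is assumed to be supplied by whoever maintains the library.

```python
from dataclasses import dataclass, field


@dataclass
class GuideEntry:
    """Annotation-guide entry for one call-type category (hypothetical schema)."""
    call_type: str
    description: str
    positive: list = field(default_factory=list)  # utterances that belong to the category
    negative: list = field(default_factory=list)  # utterances that do not belong

    def is_complete(self) -> bool:
        # The claims call for both positive and negative guideline utterances.
        return bool(self.positive) and bool(self.negative)


def incomplete_entries(guide):
    """Return the call-types whose entries lack positive or negative examples."""
    return [entry.call_type for entry in guide if not entry.is_complete()]


def domains_are_distinct(sector_to_domain):
    """True when no two industry sectors share a task domain."""
    domains = list(sector_to_domain.values())
    return len(domains) == len(set(domains))
```

Negative examples matter to annotators because they mark the boundary of a category, which is why an entry lacking them is reported as incomplete here.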
Specification