Method and apparatus for selecting a vocabulary sub-set from a speech recognition dictionary for use in real time automated directory assistance
Abstract
A vocabulary sub-set is selected from a large speech recognition dictionary. The selected vocabulary sub-set may be used in a real-time directory assistance system to improve the system's real-time performance. The selection process is effected on the basis of a cost-benefit ratio: the benefit is measured in savings in operator working time, while the cost is measured in terms of hardware limitations, namely processor throughput. Typically, the vocabulary sub-set is limited to the maximum number of orthographies that would enable the system to achieve real-time performance.
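As a rough illustration of how a throughput limit could translate into a cap on the number of orthographies, the sketch below assumes a simple linear cost model: a fixed overhead per utterance plus a constant cost per orthography. The abstract does not state such a model; the function name `max_vocabulary_size` and its parameters `budget_ms`, `fixed_ms`, and `per_item_ms` are hypothetical.

```python
def max_vocabulary_size(budget_ms, fixed_ms, per_item_ms):
    """Largest N whose estimated recognition time fits a real-time budget.

    Assumed (not from the source) cost model:
        recognition_time = fixed_ms + N * per_item_ms
    """
    if per_item_ms <= 0:
        raise ValueError("per_item_ms must be positive")
    # Floor division gives the largest whole number of orthographies that fit;
    # a budget smaller than the fixed overhead yields 0 rather than a negative N.
    return max(0, int((budget_ms - fixed_ms) // per_item_ms))
```

Under this assumed model, a larger budget or a faster processor (smaller `per_item_ms`) raises the cap, which is the trade-off the abstract describes in general terms.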
34 Claims
1. A process for generating a vocabulary sub-set from a speech recognition dictionary suitable for use in an automated directory assistance system, said process being performed before the automated directory assistance system performs speech recognition with the vocabulary sub-set, said process including the steps of:
providing a speech recognition dictionary including a plurality of vocabulary items;
providing a plurality of call records;
matching said call records to vocabulary items in said speech recognition dictionary;
computing for a group of the vocabulary items in said speech recognition dictionary a linkage value for each vocabulary item in the group, said linkage value being indicative of the probability of successful linkage of the vocabulary item to a desired listing containing telephone number information by a link layer of an automated directory assistance system;
computing for a group of the vocabulary items in said speech recognition dictionary a frequency of occurrence for each vocabulary item in the group, said frequency of occurrence being computed at least in part on the basis of said call records;
computing for said group of vocabulary items in the speech recognition dictionary a benefit value for each vocabulary item of the group, said benefit value being computed at least in part on the basis of said linkage value and said frequency of occurrence;
ranking the vocabulary items in said group on a basis of benefit values;
selecting N vocabulary items from said group that have a highest benefit value to form said vocabulary sub-set, N being less than the total number of vocabulary items in said group;
storing said vocabulary sub-set on a computer readable medium suitable for use in the automated directory assistance system to perform speech recognition.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
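The steps of claim 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the record layout (a requested text paired with a flag saying whether the link layer reached the desired listing), the `normalize` matching rule, the estimate of the linkage value as a success ratio, and the benefit formula `frequency * linkage` are all hypothetical. The claim says only that the benefit value is computed "at least in part" from the linkage value and the frequency of occurrence.

```python
import json
from collections import Counter


def normalize(text):
    """Hypothetical matching rule: case- and whitespace-insensitive equality."""
    return " ".join(text.lower().split())


def select_vocabulary_subset(dictionary, call_records, n):
    """Return up to n vocabulary items with the highest benefit value.

    dictionary   -- iterable of vocabulary items (orthography strings)
    call_records -- iterable of (requested_text, linked_ok) pairs; this
                    record layout is an assumption made for the sketch
    """
    vocab = {normalize(item): item for item in dictionary}
    matched = Counter()  # call records matched to each vocabulary item
    linked = Counter()   # matched records that linked to the desired listing

    # Match call records to vocabulary items.
    for requested_text, linked_ok in call_records:
        key = normalize(requested_text)
        if key in vocab:
            matched[key] += 1
            if linked_ok:
                linked[key] += 1

    # Compute a benefit value for every item that matched at least one record.
    total = sum(matched.values())
    benefit = {}
    for key, count in matched.items():
        frequency = count / total           # frequency of occurrence
        linkage = linked[key] / count       # estimated linkage probability
        benefit[key] = frequency * linkage  # assumed combination rule

    # Rank by descending benefit (ties broken alphabetically) and keep the top n.
    ranked = sorted(benefit, key=lambda k: (-benefit[k], k))
    return [vocab[k] for k in ranked[:n]]


def store_subset(subset, path):
    """Write the selected sub-set to a file as a JSON list."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(subset, handle)
```

In this sketch, items that match no call record receive no benefit value and are never selected, so fewer than `n` items are returned when fewer than `n` items were matched.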
9. A process for enhancing a real-time operation of an automated directory assistance system, said process being performed before the automated directory assistance system performs speech recognition, said process including the steps of:
providing a speech recognition dictionary including a plurality of vocabulary items;
providing a plurality of call records;
matching said call records to vocabulary items in said speech recognition dictionary;
computing for a group of the vocabulary items in said speech recognition dictionary a linkage value for each vocabulary item in the group, said linkage value being indicative of the probability of successful linkage of the vocabulary item to a desired listing containing telephone number information by a link layer of an automated directory assistance system;
computing for a group of the vocabulary items in said speech recognition dictionary a frequency of occurrence for each vocabulary item in the group, said frequency of occurrence being computed at least in part on the basis of said call records;
computing for at least a group of vocabulary items in the speech recognition dictionary a benefit value for each vocabulary item of the group, said benefit value being computed at least in part on the basis of said linkage value and said frequency of occurrence;
ranking the vocabulary items of the speech recognition dictionary on a basis of benefit value;
selecting N vocabulary items from said speech recognition dictionary that have a highest benefit value to form said vocabulary sub-set, N being less than the total number of vocabulary items in said group;
operating the automated directory assistance system to perform speech recognition with said vocabulary sub-set.
Dependent claims: 10, 11, 12, 13, 14, 15, 16.
17. An apparatus for generating a vocabulary sub-set from a speech recognition dictionary for use in an automated directory assistance system, the speech recognition dictionary including a plurality of vocabulary items, said vocabulary sub-set being generated before the automated directory assistance system performs speech recognition with the vocabulary sub-set, said apparatus comprising:
first memory means containing at least a group of vocabulary items of the speech recognition dictionary;
second memory means containing a plurality of call records;
a processor in operative relationship with said first memory means and said second memory means;
a program element providing means for:
a) matching said call records to vocabulary items in said speech recognition dictionary;
b) computing for a group of the vocabulary items in said speech recognition dictionary a linkage value for each vocabulary item in the group, said linkage value being indicative of the probability of successful linkage of the vocabulary item to a desired listing containing telephone number information by a link layer of an automated directory assistance system;
c) computing for a group of the vocabulary items in said speech recognition dictionary a frequency of occurrence for each vocabulary item in the group, said frequency of occurrence being computed at least in part on the basis of said call records;
d) directing said processor to compute a benefit value for each vocabulary item of the group, said benefit value being computed at least in part on the basis of said linkage value and said frequency of occurrence;
e) ranking the vocabulary items in said group on a basis of benefit value; and
f) selecting N vocabulary items from said group that have a highest benefit value to form said vocabulary sub-set, N being less than the total number of vocabulary items in said group.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26.
27. A machine readable medium containing a program element for instructing a computer for generating a vocabulary sub-set from a speech recognition dictionary for use in an automated directory assistance system, the speech recognition dictionary including a plurality of vocabulary items, said vocabulary sub-set being generated before the automated directory assistance system performs speech recognition with the vocabulary sub-set, said computer including:
first memory means containing at least a group of vocabulary items of the speech recognition dictionary;
second memory means containing a plurality of call records;
a processor in operative relationship with said first memory means and said second memory means;
said program element providing means for:
a) matching said call records to vocabulary items in said speech recognition dictionary;
b) computing for a group of the vocabulary items in said speech recognition dictionary a linkage value for each vocabulary item in the group, said linkage value being indicative of the probability of successful linkage of the vocabulary item to a desired listing containing telephone number information by a link layer of an automated directory assistance system;
c) computing for a group of the vocabulary items in said speech recognition dictionary a frequency of occurrence for each vocabulary item in the group, said frequency of occurrence being computed at least in part on the basis of said call records;
d) directing said processor to compute a benefit value for each vocabulary item of the group, said benefit value being computed at least in part on the basis of said linkage value and said frequency of occurrence;
e) ranking the vocabulary items in said group on a basis of benefit value; and
f) selecting N vocabulary items from said group that have a highest benefit value to form said vocabulary sub-set, N being less than the total number of vocabulary items in said group.
Dependent claims: 28, 29, 30, 31, 32, 33, 34.
Specification