Method and apparatus using discriminative training in natural language call routing and document retrieval
Abstract
A method and apparatus for performing discriminative training of, for example, call routing training data (or, alternatively, other classification training data) which improves the subsequent classification of a user's natural language based requests. An initial scoring matrix is generated based on the training data and then the scoring matrix is adjusted so as to improve the discrimination between competing classes (e.g., destinations). In accordance with one illustrative embodiment of the present invention, a Generalized Probabilistic Descent (GPD) algorithm may be advantageously employed to provide the improved discrimination. More specifically, the present invention provides a method and apparatus comprising steps or means for generating an initial scoring matrix comprising a numerical value for each of a set of n classes in association with each of a set of m features, the initial scoring matrix based on a set of training data and, for each element of said set of training data, based on a subset of said features which are comprised in the natural language text of said element of said set of training data and on one of said classes which has been identified therefor; and based on the initial scoring matrix and the set of training data, generating a discriminatively trained scoring matrix for use by said classification system by adjusting one or more of said numerical values such that a greater degree of discrimination exists between competing ones of said classes when said classification requests are performed, thereby resulting in a reduced classification error rate.
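The two steps described in the abstract (build an initial n-by-m scoring matrix from labeled text, then adjust its values to separate competing classes) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the whitespace tokenizer, the count-then-normalize initialization, the sigmoid loss, the use of only the single strongest competing class, the hyperparameters `epochs`, `lr`, and `gamma`, and the toy training data. It is a GPD-style gradient update in spirit, not the patented procedure.

```python
import math

def featurize(text, features):
    """Binary feature vector: 1.0 if the feature word occurs in the text."""
    tokens = set(text.lower().split())  # assumed tokenizer: whitespace split
    return [1.0 if f in tokens else 0.0 for f in features]

def build_initial_matrix(training_data, features, classes):
    """Initial matrix[j][i]: normalized co-occurrence of class j and feature i."""
    c_index = {c: j for j, c in enumerate(classes)}
    matrix = [[0.0] * len(features) for _ in classes]
    for text, label in training_data:
        x = featurize(text, features)
        row = matrix[c_index[label]]
        for i, xi in enumerate(x):
            row[i] += xi
    for row in matrix:  # scale each class row to unit length (assumed choice)
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        for i in range(len(row)):
            row[i] /= norm
    return matrix

def scores(matrix, x):
    """One score per class: dot product of the class row with the features."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]

def margin(matrix, text, label, features, classes):
    """Correct-class score minus the best competing-class score."""
    s = scores(matrix, featurize(text, features))
    k = classes.index(label)
    return s[k] - max(v for j, v in enumerate(s) if j != k)

def discriminative_train(matrix, training_data, features, classes,
                         epochs=20, lr=0.5, gamma=2.0):
    """Gradient steps on a sigmoid of (best competitor score - correct score).

    Simplification (assumption): only the strongest competitor is updated.
    Requires at least two classes. Returns a new matrix; input is not mutated.
    """
    trained = [row[:] for row in matrix]
    for _ in range(epochs):
        for text, label in training_data:
            x = featurize(text, features)
            s = scores(trained, x)
            k = classes.index(label)
            j = max((c for c in range(len(classes)) if c != k),
                    key=lambda c: s[c])
            d = s[j] - s[k]                      # > 0 means misclassified
            loss = 1.0 / (1.0 + math.exp(-gamma * d))
            grad = gamma * loss * (1.0 - loss)   # derivative of the sigmoid
            for i, xi in enumerate(x):
                trained[k][i] += lr * grad * xi  # raise the correct class
                trained[j][i] -= lr * grad * xi  # lower the competitor
    return trained

# Hypothetical toy data, for illustration only.
training_data = [
    ("check my account balance", "billing"),
    ("open a new account", "sales"),
    ("balance on my bill", "billing"),
    ("buy a new phone", "sales"),
]
features = ["account", "balance", "new", "bill", "buy", "phone"]
classes = ["billing", "sales"]

initial = build_initial_matrix(training_data, features, classes)
trained = discriminative_train(initial, training_data, features, classes)
```

Comparing `margin(initial, ...)` with `margin(trained, ...)` for a training utterance shows how far the adjustment has pushed the correct class away from its strongest competitor.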
40 Claims
1. A method of training a scoring matrix for use by a classification system, the classification system for use in performing classification requests based on natural language text and with use of said scoring matrix which has been based on a set of training data comprising natural language text, the method comprising the steps of:
generating an initial scoring matrix comprising a numerical value for each of a set of n classes in association with each of a set of m features, the initial scoring matrix based on said set of training data and, for each element of said set of training data, based on a subset of said features which are comprised in said natural language text of said element of said set of training data and on one of said classes which has been identified therefor; and
based on the initial scoring matrix and said set of training data, generating a discriminatively trained scoring matrix for use by said classification system by adjusting one or more of said numerical values such that a greater degree of discrimination exists between competing ones of said classes when said classification requests are performed, thereby resulting in a reduced classification error rate.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method of performing classification requests based on natural language text and with use of a discriminatively trained scoring matrix which has been trained based on a set of training data comprising natural language text, the scoring matrix having been discriminatively trained by a method comprising the steps of:
generating an initial scoring matrix comprising a numerical value for each of a set of n classes in association with each of a set of m features, the initial scoring matrix based on said set of training data and, for each element of said set of training data, based on a subset of said features which are comprised in said natural language text of said element of said set of training data and on one of said classes which has been identified therefor; and
based on the initial scoring matrix and said set of training data, generating said discriminatively trained scoring matrix by adjusting one or more of said numerical values such that a greater degree of discrimination exists between competing ones of said classes when said classification requests are performed, thereby resulting in a reduced classification error rate.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20, 37
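As a rough illustration of how a classification request might be served once a scoring matrix is available, the following Python sketch scores a request against every class and returns the highest-scoring one. The function name `classify`, the whitespace tokenizer, the binary feature vector, the dot-product scoring, and all matrix values are assumptions invented for this example; the claim itself does not specify them.

```python
def classify(scoring_matrix, features, classes, request):
    """Return the best class and all class scores for a text request.

    scoring_matrix[j][i] is the value for class j in association with
    feature i (n classes by m features).
    """
    tokens = set(request.lower().split())  # assumed tokenizer
    x = [1.0 if f in tokens else 0.0 for f in features]
    class_scores = [sum(w * xi for w, xi in zip(row, x))
                    for row in scoring_matrix]
    best = max(range(len(classes)), key=lambda j: class_scores[j])
    return classes[best], class_scores

# Hypothetical, hand-written values standing in for a trained matrix.
features = ["account", "balance", "new", "phone"]
classes = ["billing", "sales"]
scoring_matrix = [
    [0.4, 0.9, -0.2, -0.3],   # billing
    [0.3, -0.4, 0.8, 0.7],    # sales
]

label_a, _ = classify(scoring_matrix, features, classes, "what is my balance")
label_b, _ = classify(scoring_matrix, features, classes, "I want a new phone")
```

Returning the full score list alongside the winning class leaves room for a caller to inspect how close the competing classes were, which is the quantity discriminative training aims to widen.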
21. An apparatus for training a scoring matrix for use by a classification system, the classification system for use in performing classification requests based on natural language text and with use of said scoring matrix which has been based on a set of training data comprising natural language text, the apparatus comprising:
means for generating an initial scoring matrix comprising a numerical value for each of a set of n classes in association with each of a set of m features, the initial scoring matrix based on said set of training data and, for each element of said set of training data, based on a subset of said features which are comprised in said natural language text of said element of said set of training data and on one of said classes which has been identified therefor; and
based on the initial scoring matrix and said set of training data, means for generating a discriminatively trained scoring matrix for use by said classification system by adjusting one or more of said numerical values such that a greater degree of discrimination exists between competing ones of said classes when said classification requests are performed, thereby resulting in a reduced classification error rate.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
31. An apparatus for performing classification requests based on natural language text and with use of a discriminatively trained scoring matrix which has been trained based on a set of training data comprising natural language text, the scoring matrix having been discriminatively trained by an apparatus comprising:
means for generating an initial scoring matrix comprising a numerical value for each of a set of n classes in association with each of a set of m features, the initial scoring matrix based on said set of training data and, for each element of said set of training data, based on a subset of said features which are comprised in said natural language text of said element of said set of training data and on one of said classes which has been identified therefor; and
based on the initial scoring matrix and said set of training data, means for generating said discriminatively trained scoring matrix by adjusting one or more of said numerical values such that a greater degree of discrimination exists between competing ones of said classes when said classification requests are performed, thereby resulting in a reduced classification error rate.
Dependent claims: 32, 33, 34, 35, 36, 38, 39, 40
Specification