Method and apparatus for detecting data anomalies in statistical natural language applications
Abstract
Techniques for detecting data anomalies in a natural language understanding (NLU) system are provided. A number of categorized sentences, categorized into a number of categories, are obtained. Sentences within a given one of the categories are clustered into a number of subclusters, and the subclusters are analyzed to identify data anomalies. The clustering can be based on surface forms of the sentences. The anomalies can be, for example, ambiguities or inconsistencies. The clustering can be performed, for example, with a K-means clustering algorithm.
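The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the bag-of-words representation of surface forms, the function names (`vectorize`, `kmeans`, `subcluster_by_category`, `flag_small_subclusters`), the fixed `k`, and the "small subcluster" review heuristic are all assumptions made here for illustration only.

```python
from collections import Counter, defaultdict
import random


def vectorize(sentence, vocab):
    """Bag-of-words count vector over a fixed vocabulary (surface form only)."""
    counts = Counter(sentence.lower().split())
    return [float(counts[w]) for w in vocab]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-means; returns a cluster index for each point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def subcluster_by_category(labeled, k=2):
    """Group sentences by category, then split each category into at most k subclusters."""
    by_cat = defaultdict(list)
    for sentence, category in labeled:
        by_cat[category].append(sentence)
    result = {}
    for category, sentences in by_cat.items():
        vocab = sorted({w for s in sentences for w in s.lower().split()})
        points = [vectorize(s, vocab) for s in sentences]
        assign = kmeans(points, min(k, len(points)))
        clusters = defaultdict(list)
        for s, a in zip(sentences, assign):
            clusters[a].append(s)
        result[category] = list(clusters.values())
    return result


def flag_small_subclusters(result, max_size=1):
    """Hypothetical analysis step: list very small subclusters for human review."""
    return [(cat, sub) for cat, subs in result.items()
            for sub in subs if len(sub) <= max_size]


if __name__ == "__main__":
    # Toy, made-up training data: (sentence, category) pairs.
    data = [
        ("pay my bill", "billing"),
        ("i want to pay my bill", "billing"),
        ("pay the bill now", "billing"),
        ("my phone is broken", "billing"),
        ("reset my password", "account"),
        ("i forgot my password", "account"),
    ]
    for cat, sub in flag_small_subclusters(subcluster_by_category(data)):
        print(cat, sub)
```

A flagged subcluster is only a candidate for review; whether it reflects an ambiguity, an inconsistency, or simply a rare but valid phrasing is left to the analysis step, which this sketch does not attempt to reproduce.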
20 Claims
1. A computer-implemented method of detecting data anomalies in a natural language understanding (NLU) system, comprising the steps of:
obtaining a plurality of categorized sentences that are categorized into a plurality of categories;
clustering those of said sentences within a given one of said categories into a plurality of subclusters; and
analyzing said subclusters to identify data anomalies therein.
(Dependent claims: 2-17)

18. A computer program product comprising a computer usable medium having computer usable program code for detecting data anomalies in a natural language understanding (NLU) system, said computer program product including:
computer usable program code for obtaining a plurality of categorized sentences that are categorized into a plurality of categories;
computer usable program code for clustering those of said sentences within a given one of said categories into a plurality of subclusters; and
computer usable program code for analyzing said subclusters to identify data anomalies therein.
(Dependent claim: 19)

20. An apparatus for detecting data anomalies in a natural language understanding (NLU) system, comprising:
a memory; and
at least one processor coupled to said memory and operative to:
obtain a plurality of categorized sentences that are categorized into a plurality of categories;
cluster those of said sentences within a given one of said categories into a plurality of subclusters; and
analyze said subclusters to identify data anomalies therein.

Specification