Classification filter for processing data for creating a language model
Abstract
The method and apparatus use a filter to remove a variety of non-dictated words from data based on probability, which improves the effectiveness of creating a language model.
66 Citations
20 Claims
1. A computer-implemented method of processing data for creating a language model, the method comprising:
receiving data which is suitable for creating a language model and data which is not suitable for creating a language model;
processing the data using a classifier to identify at least the data suitable for creating the language model; and
outputting the data suitable for creating the language model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
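The three steps of claim 1 (receive mixed data, classify it, output the suitable part) can be illustrated with a minimal Python sketch. The claim does not say what kind of classifier is used, so the rule-based `looks_suitable` function below is a hypothetical stand-in, and the patterns it checks (e-mail headers, quoted replies, URLs) are assumptions chosen only for illustration.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical stand-in classifier (not taken from the patent): treats lines
# that look like e-mail headers, quoted replies, or URLs as unsuitable.
_UNSUITABLE = re.compile(r"^(from|to|cc|subject|sent):|^>|https?://", re.IGNORECASE)


def looks_suitable(line: str) -> bool:
    """Return True when the line is non-empty and matches no 'unsuitable' pattern."""
    stripped = line.strip()
    return bool(stripped) and _UNSUITABLE.search(stripped) is None


def filter_for_language_model(
    lines: Iterable[str],
    classifier: Callable[[str], bool] = looks_suitable,
) -> List[str]:
    """Receive mixed data, classify each item, and output only the suitable items."""
    return [line for line in lines if classifier(line)]


# Mixed input: some lines resemble dictated text, others do not.
mixed = [
    "From: alice@example.com",
    "Subject: lunch",
    "Let us meet at noon tomorrow.",
    "> earlier quoted reply",
    "See https://example.com for details",
    "I will bring the report.",
]
kept = filter_for_language_model(mixed)
```

Passing the classifier as a parameter keeps the filtering step independent of how suitability is decided, so a statistical classifier could replace the regular-expression rules without changing the rest of the sketch.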
13. A computer-readable medium having computer-executable instructions for performing steps to process data for creating a language model, the steps comprising:
receiving data which is suitable for creating a language model and data which is not suitable for creating the language model;
dividing the data into a sequence of text units; and
ascertaining based on a probability each text unit that is suitable for creating the language model.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
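As a rough illustration of claim 13, the sketch below divides text into units and keeps each unit whose estimated probability of being suitable exceeds a threshold. Everything specific here is an assumption: the claim does not name a probability model, so a small naive Bayes classifier with add-one smoothing is swapped in, one line is treated as one text unit, and the training examples, the function names, and the 0.5 threshold are all hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def split_into_units(text: str) -> List[str]:
    """Divide the data into text units; one non-empty line per unit (an assumption)."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def tokenize(unit: str) -> List[str]:
    """Lowercase and split on whitespace."""
    return unit.lower().split()


def train(examples: List[Tuple[str, bool]]):
    """Count tokens per label. Both labels must appear in the examples."""
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for unit, label in examples:
        counts[label].update(tokenize(unit))
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def probability_suitable(unit: str, model) -> float:
    """Naive Bayes estimate of P(suitable | unit) with add-one smoothing."""
    counts, docs, vocab = model
    log_score = {}
    for label in (True, False):
        total = sum(counts[label].values())
        score = math.log(docs[label] / sum(docs.values()))  # class prior
        for tok in tokenize(unit):
            # The extra +1 in the denominator reserves mass for unseen tokens.
            score += math.log((counts[label][tok] + 1) / (total + len(vocab) + 1))
        log_score[label] = score
    top = max(log_score.values())  # subtract the max before exp for stability
    weight = {k: math.exp(v - top) for k, v in log_score.items()}
    return weight[True] / (weight[True] + weight[False])


# Hypothetical labeled examples: True = suitable, False = not suitable.
model = train([
    ("please schedule the meeting for monday", True),
    ("the patient reports mild pain", True),
    ("i will send the report tomorrow", True),
    ("from: bob@example.com", False),
    ("sent: monday 9:00", False),
    ("subject: re: report", False),
])

text = "From: carol@example.com\nPlease send the report\nSubject: meeting"
units = split_into_units(text)
suitable = [u for u in units if probability_suitable(u, model) > 0.5]
```

With these toy examples only the middle unit passes the threshold. In practice the threshold would be tuned to trade off discarding useful text against letting unsuitable text through.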
Specification