
Classification filter for processing data for creating a language model

  • US 8,165,870 B2
  • Filed: 02/10/2005
  • Issued: 04/24/2012
  • Est. Priority Date: 02/10/2005
  • Status: Active Grant
First Claim

1. A computer-implemented method of processing textual adaptation data for creating a statistical language model that provides prior probability estimates for word sequences, the method comprising, with a computer:

    receiving textual adaptation data comprising textual data which is suitable for creating the statistical language model and non-dictated textual data which is not suitable for creating the statistical language model;

    segmenting the textual adaptation data into a sequence of units;

    extracting a first set of features for each unit in the sequence;

    normalizing the sequence of units to form a normalized sequence of units;

    extracting a second set of features for each unit in the normalized sequence of units;

    processing the data using a processor operating as a classifier to filter out the non-dictated textual data from the textual adaptation data, thereby identifying at least the textual data suitable for creating the language model, the processing including using a classification model which uses a combination of the first and second sets of features;

    outputting the textual data suitable for creating the statistical language model; and

    generating the statistical language model from the suitable data, wherein the statistical language model provides prior probability estimates for word sequences to guide a hypothesis search for a likely intended word sequence.
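The steps recited in claim 1 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the unit is a line of text, the first feature set is the digit and symbol ratio of the raw unit, the second feature set is the token count of the normalized unit, a hand-set threshold rule (`is_suitable`) stands in for a trained classification model, and the statistical language model is an unsmoothed bigram model. All function names are hypothetical.

```python
import re
from collections import Counter, defaultdict


def segment(text):
    """Segment the adaptation data into units (assumption: one unit per non-empty line)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def raw_features(unit):
    """First feature set, extracted from the unit before normalization (illustrative choice)."""
    n = max(len(unit), 1)
    return {
        "digit_ratio": sum(c.isdigit() for c in unit) / n,
        "symbol_ratio": sum(not (c.isalnum() or c.isspace()) for c in unit) / n,
    }


def normalize(unit):
    """Normalize a unit: lowercase it and keep only alphabetic word tokens."""
    return re.findall(r"[a-z']+", unit.lower())


def normalized_features(tokens):
    """Second feature set, extracted from the normalized unit (illustrative choice)."""
    return {"token_count": len(tokens)}


def is_suitable(first, second):
    """Hypothetical stand-in for the classification model; combines both feature sets."""
    return (
        first["digit_ratio"] < 0.2
        and first["symbol_ratio"] < 0.2
        and second["token_count"] >= 3
    )


def filter_units(text):
    """Return the normalized token lists of the units classified as suitable."""
    kept = []
    for unit in segment(text):
        first = raw_features(unit)
        tokens = normalize(unit)
        second = normalized_features(tokens)
        if is_suitable(first, second):
            kept.append(tokens)
    return kept


def bigram_model(token_lists):
    """Estimate P(word | previous word) by relative frequency, without smoothing."""
    counts = defaultdict(Counter)
    for tokens in token_lists:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            counts[prev][word] += 1
    return {
        prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
        for prev, nxt in counts.items()
    }
```

For example, given three lines of which the middle one is a header-like record (`ID: 4471-0092 | 2005-02-10 | 14:32`), `filter_units` drops that line because of its high digit ratio, and `bigram_model` is then estimated from the two remaining sentences only. A real system would replace the threshold rule with a classifier trained on labelled units and would smooth the language model.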
