Hybrid baseform generation
Abstract
A method, a computer system and a computer program product for generating baseforms or phonetic spellings from input text are disclosed. The baseforms are initially generated using rules defined for a particular language. Then, phones are identified in the language that are exceptions to the defined rules, and an action is associated with each identified phone. A statistical technique is applied to determine whether the identified phones can be modified. Finally, baseforms containing the identified phones that can be modified are corrected according to the associated actions. Preferably, the statistical technique is only applied to baseforms containing phones that are exceptions to the defined rules. The defined rules can comprise spelling-to-sound rules for a particular phonetic language that incorporate all possible alternative pronunciations of each baseform.
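The steps summarized in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the toy letter-to-phone table `RULES`, the choice of `AX` as the exception phone, the `Action` type, the counts in `CONTEXT_COUNTS`, and the 0.5 threshold. In particular, the "statistical technique" is stood in for by a simple relative-frequency estimate; the source does not say which technique is used.

```python
from dataclasses import dataclass

# Hypothetical spelling-to-sound rules for a toy phonetic language:
# every letter maps to exactly one phone.
RULES = {"k": "K", "m": "M", "l": "L", "r": "R", "n": "N", "a": "AX"}


@dataclass(frozen=True)
class Action:
    kind: str              # "delete" or "replace"
    replacement: str = ""  # used only when kind == "replace"


# Phones assumed to be exceptions to the rules, each with an associated action.
EXCEPTION_ACTIONS = {"AX": Action("delete")}

# Hypothetical counts: context -> (times the phone was modified, times seen).
CONTEXT_COUNTS = {"final": (90, 100), "medial": (10, 100)}


def generate_baseform(word):
    """Generate a baseform purely from the defined rules."""
    return [RULES[ch] for ch in word]


def context_of(phones, i):
    """Classify the position of phone i within the baseform."""
    return "final" if i == len(phones) - 1 else "medial"


def can_modify(phones, i, threshold=0.5):
    """Stand-in statistical test: relative frequency of modification."""
    modified, seen = CONTEXT_COUNTS[context_of(phones, i)]
    return modified / seen > threshold


def correct_baseform(phones):
    """Apply the associated action to exception phones that can be modified."""
    corrected = []
    for i, phone in enumerate(phones):
        action = EXCEPTION_ACTIONS.get(phone)
        # The statistical test runs only for exception phones.
        if action is None or not can_modify(phones, i):
            corrected.append(phone)
        elif action.kind == "replace":
            corrected.append(action.replacement)
        # kind == "delete": the phone is dropped.
    return corrected


baseform = generate_baseform("kamala")
print(baseform)                    # rule-based baseform
print(correct_baseform(baseform))  # baseform after correction
```

In this sketch the rule-based pass stays fully deterministic, and the statistical check is consulted only at positions holding an exception phone, which mirrors the preference stated in the abstract.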
33 Claims
1. A method for generating baseforms or phonetic spellings from input text, said method including the steps of:
defining rules for generating a pronunciation dictionary for a particular language;
generating baseforms using said defined rules;
identifying phones, in said language, that are exceptions to said defined rules;
associating an action with each said phone;
applying a statistical technique to determine whether said identified phones can be modified; and
correcting baseforms containing said identified phones that can be modified according to said associated actions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer system for generating baseforms or phonetic spellings from input text, including:
processor means for defining rules for generating a pronunciation dictionary for a particular language;
processor means for generating baseforms using said defined rules;
processor means for identifying phones, in said language, that are exceptions to said defined rules;
processor means for associating an action with each said phone;
processor means for applying a statistical technique to determine whether said identified phones can be modified; and
processor means for correcting baseforms containing said identified phones that can be modified according to said associated actions.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A computer program product having a computer readable medium having a computer program recorded therein for generating baseforms or phonetic spellings from input text, said computer program product including:
computer program code means for defining rules for generating a pronunciation dictionary for a particular language;
computer program code means for generating baseforms using said defined rules;
computer program code means for identifying phones, in said language, that are exceptions to said defined rules;
computer program code means for associating an action with each said phone;
computer program code means for applying a statistical technique to determine whether said identified phones can be modified; and
computer program code means for correcting baseforms containing said identified phones that can be modified according to said associated actions.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
28. A computer program product having a computer readable medium having a computer program recorded therein for generating baseforms or phonetic spellings from input text, said computer program product including:
a pronunciation dictionary for a particular language;
a list of actions associated with phones that are exceptions to rules defined for said language;
computer program code means for generating baseforms according to said defined rules;
computer program code means for identifying phones, in said baseforms, that are exceptions to said defined rules;
computer program code means for applying a statistical technique to determine whether said identified phones can be modified; and
computer program code means for correcting baseforms containing said identified phones that can be modified according to said associated actions.
Dependent claims: 29, 30, 31, 32, 33
Specification