Construction of phonetic representation of a string of characters
First Claim
1. A method, comprising:
accessing a string of characters;
parsing the string of characters into a first string of graphemes;
adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes;
grouping the second string of graphemes into a plurality of pseudo-graphemes based on a probability determined by a trained discrete estimator, wherein two or more graphemes in the string of graphemes that are phonetized together are grouped to a single pseudo-grapheme;
accessing a first data structure that maps each pseudo-grapheme in the string of pseudo-graphemes to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a pronunciation of a grapheme;
determining one or more phonetic representations for each pseudo-grapheme in the string of pseudo-graphemes based on the first data structure;
accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a grapheme;
determining at least one grapheme representation for one or more of the one or more phonetic representations based on the second data structure; and
constructing a phonetic representation of the string of characters based on the at least one grapheme representation that was determined.
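The padding and grouping steps of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `PAD` placeholder, the `GROUP_PROB` table (a hand-written stand-in for the trained discrete estimator), the threshold, and all function names are assumptions made for the example.

```python
from typing import Dict, List, Tuple

PAD = "#"  # hypothetical placeholder standing in for a "missing" character

# Hypothetical stand-in for a trained discrete estimator: the probability
# that a run of adjacent graphemes is phonetized together as one unit.
GROUP_PROB: Dict[Tuple[str, ...], float] = {
    ("p", "h"): 0.90,
    ("s", "h"): 0.85,
    ("t", "h"): 0.80,
    ("o", "o"): 0.60,
}


def parse_graphemes(text: str) -> List[str]:
    """Split a string into graphemes (assumed here: one lowercase character each)."""
    return list(text.lower())


def pad_graphemes(graphemes: List[str]) -> List[str]:
    """Add a placeholder at each end (one assumed reading of 'missing characters')."""
    return [PAD] + graphemes + [PAD]


def group_pseudo_graphemes(
    graphemes: List[str], threshold: float = 0.5, max_len: int = 2
) -> List[str]:
    """Greedily merge adjacent graphemes whose grouping probability meets the threshold."""
    grouped: List[str] = []
    i = 0
    while i < len(graphemes):
        merged = False
        # Try the longest candidate group first, down to pairs.
        for n in range(max_len, 1, -1):
            chunk = tuple(graphemes[i:i + n])
            if len(chunk) == n and GROUP_PROB.get(chunk, 0.0) >= threshold:
                grouped.append("".join(chunk))
                i += n
                merged = True
                break
        if not merged:
            grouped.append(graphemes[i])
            i += 1
    return grouped


if __name__ == "__main__":
    print(group_pseudo_graphemes(pad_graphemes(parse_graphemes("Phone"))))
```

In this sketch "p" and "h" are merged into the single pseudo-grapheme "ph" because their assumed grouping probability exceeds the threshold, while the remaining graphemes stay separate.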
1 Assignment
0 Petitions
Abstract
Provided are methods, devices, and computer-readable media for accessing a string of characters; parsing the string of characters into a string of graphemes; determining one or more phonetic representations for one or more graphemes in the string of graphemes based on a first data structure; determining at least one grapheme representation for one or more of the one or more phonetic representations based on a second data structure; and constructing the phonetic representation of the string of characters based on the grapheme representation that was determined.
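The two lookups described in the abstract can be illustrated with a small sketch. It assumes each data structure is a plain dictionary of weighted candidates and that the highest-weighted candidate is always chosen; the tables `G2P` and `P2G`, their weights, and the function names are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple

Weighted = List[Tuple[str, float]]  # (candidate, weight) pairs

# Hypothetical first data structure: grapheme -> weighted IPA candidates.
G2P: Dict[str, Weighted] = {
    "ph": [("f", 0.9), ("p", 0.1)],
    "o": [("oʊ", 0.7), ("ɒ", 0.3)],
    "t": [("t", 1.0)],
}

# Hypothetical second data structure: IPA symbol -> weighted grapheme candidates.
P2G: Dict[str, Weighted] = {
    "f": [("f", 0.8), ("ph", 0.2)],
    "oʊ": [("o", 0.6), ("ow", 0.4)],
    "t": [("t", 1.0)],
}


def best(candidates: Weighted) -> str:
    """Return the candidate with the highest weight."""
    return max(candidates, key=lambda pair: pair[1])[0]


def to_phonemes(graphemes: List[str]) -> List[str]:
    """Map each grapheme to its most likely IPA symbol; unknown units pass through."""
    return [best(G2P.get(g, [(g, 1.0)])) for g in graphemes]


def to_graphemes(phonemes: List[str]) -> List[str]:
    """Map each IPA symbol to its most likely grapheme; unknown symbols pass through."""
    return [best(P2G.get(p, [(p, 1.0)])) for p in phonemes]


def phonetic_representation(graphemes: List[str]) -> str:
    """Chain both lookups and join the resulting grapheme representations."""
    return "".join(to_graphemes(to_phonemes(graphemes)))


if __name__ == "__main__":
    print(phonetic_representation(["ph", "o", "t", "o"]))
```

With these assumed weights, the input graphemes "ph", "o", "t", "o" pass through the IPA symbols "f", "oʊ", "t", "oʊ" and come back as the spelling "foto". A fuller treatment might keep several weighted candidates per step instead of only the best one.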
71 Citations
18 Claims
1. A method, comprising:
accessing a string of characters;
parsing the string of characters into a first string of graphemes;
adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes;
grouping the second string of graphemes into a plurality of pseudo-graphemes based on a probability determined by a trained discrete estimator, wherein two or more graphemes in the string of graphemes that are phonetized together are grouped to a single pseudo-grapheme;
accessing a first data structure that maps each pseudo-grapheme in the string of pseudo-graphemes to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a pronunciation of a grapheme;
determining one or more phonetic representations for each pseudo-grapheme in the string of pseudo-graphemes based on the first data structure;
accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a grapheme;
determining at least one grapheme representation for one or more of the one or more phonetic representations based on the second data structure; and
constructing a phonetic representation of the string of characters based on the at least one grapheme representation that was determined.
Dependent claims: 2, 3, 4, 5, 6
7. A device, comprising:
a memory storing instructions; and
at least one processor, operably connected to the memory, implemented at least in part in hardware, and configured to execute the instructions to perform operations comprising:
receiving the string of characters;
parsing the string of characters into a first string of graphemes;
adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes;
grouping the second string of graphemes into a plurality of pseudo-graphemes based on a probability determined by a trained discrete estimator, wherein two or more graphemes in the string of graphemes that are phonetized together are grouped to a single pseudo-grapheme;
accessing a first data structure that maps each pseudo-grapheme in the string of pseudo-graphemes to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a likely pronunciation of a grapheme;
determining one or more phonetic representations for each pseudo-grapheme in the string of pseudo-graphemes based on the first data structure;
accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a grapheme;
determining at least one grapheme representation for one or more of the one or more phonetic representations based on the second data structure; and
constructing a phonetic representation of the string of characters based on the at least one grapheme representation that was determined.
Dependent claims: 8, 9, 10, 11, 12
13. A non-transitory computer-readable medium comprising computer-interpretable instructions which, when executed by at least one electronic processor, cause the at least one electronic processor to perform a method of converting a string of characters into a phonetic representation, the method comprising:
receiving the string of characters;
parsing the string of characters into a first string of graphemes;
adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes;
grouping the second string of graphemes into a plurality of pseudo-graphemes based on a probability determined by a trained discrete estimator, wherein two or more graphemes in the string of graphemes that are phonetized together are grouped to a single pseudo-grapheme;
accessing a first data structure that maps graphemes to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a likely pronunciation of a grapheme;
determining one or more phonetic representations for each pseudo-grapheme in the string of pseudo-graphemes based on the first data structure;
accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a grapheme;
determining at least one grapheme representation for one or more of the one or more phonetic representations based on the second data structure; and
constructing the phonetic representation of the string of characters based on the at least one grapheme representation that was determined.
Dependent claims: 14, 15, 16, 17, 18
Specification