LANGUAGE PHONETIC PROCESSING BASED ON FINE-GRAINED MAPPING OF PHONETIC COMPONENTS
First Claim
1. A computer-implemented method for determining a phonetic distance between two words of a particular language, the computer-implemented method comprising:
obtaining a pronunciation of a first word of a particular language;
identifying a phonetic component of the pronunciation of the first word, wherein the phonetic component corresponds to a type of phonetic component of the particular language;
obtaining a phonetic component mapping table for the type of phonetic component identified in the pronunciation of the first word;
assigning a phonetic value to the identified phonetic component using the phonetic component mapping table;
obtaining a pronunciation of a second word of the particular language, wherein the first word and the second word are different;
identifying a phonetic component of the pronunciation of the second word;
assigning a phonetic value to the identified phonetic component of the second word using the phonetic component mapping table;
calculating a phonetic distance between (i) the identified phonetic component of the first word and (ii) the identified phonetic component of the second word, using (a) the assigned phonetic value of the identified phonetic component of the first word and (b) the assigned phonetic value of the identified phonetic component of the second word; and
storing the calculated phonetic distance in association with the identified phonetic component of the first word.
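The steps recited in claim 1 can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the table name `ONSET_TABLE`, its numeric values, the toy `identify_onset` rule, and the choice of an absolute difference as the distance. The claim itself does not specify any of these.

```python
# Hypothetical mapping table for one type of phonetic component (word onsets).
# The numeric values are invented for illustration only.
ONSET_TABLE = {"b": 1.0, "p": 1.5, "d": 3.0, "t": 3.5}


def identify_onset(pronunciation: str) -> str:
    """Toy identification step: treat the first symbol as the component."""
    return pronunciation[0]


def phonetic_distance(value_a: float, value_b: float) -> float:
    """Assumed distance: absolute difference of the two assigned values."""
    return abs(value_a - value_b)


def component_distance(pron_first: str, pron_second: str, table: dict):
    """Identify one component per word, assign values, and compute the distance."""
    comp_a = identify_onset(pron_first)
    comp_b = identify_onset(pron_second)
    value_a = table[comp_a]  # assign a phonetic value via the mapping table
    value_b = table[comp_b]
    return comp_a, comp_b, phonetic_distance(value_a, value_b)


# Store the distance in association with the first word's component.
stored = {}
comp_a, comp_b, dist = component_distance("bat", "pat", ONSET_TABLE)
stored[comp_a] = {comp_b: dist}
```

In this sketch the stored structure is a plain dictionary keyed by the first word's component; a file or database would serve equally well.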
Abstract
In one embodiment, a computer-implemented method includes obtaining a pronunciation of a first word of a particular language and identifying a phonetic component of the pronunciation. The method includes obtaining a phonetic component mapping table for the type of phonetic component identified in the pronunciation of the first word and assigning a phonetic value to the identified phonetic component using the phonetic component mapping table. For a second word, the method includes obtaining a pronunciation of the second word, identifying a phonetic component of the pronunciation, and assigning a phonetic value to the identified phonetic component. In addition, the method includes calculating a phonetic distance between the identified phonetic component of the first word and the identified phonetic component of the second word, using the assigned phonetic values of the respective identified phonetic components of the first word and the second word, and storing the calculated phonetic distance in a file.
5 Citations
21 Claims
1. A computer-implemented method for determining a phonetic distance between two words of a particular language, the computer-implemented method comprising:
obtaining a pronunciation of a first word of a particular language;
identifying a phonetic component of the pronunciation of the first word, wherein the phonetic component corresponds to a type of phonetic component of the particular language;
obtaining a phonetic component mapping table for the type of phonetic component identified in the pronunciation of the first word;
assigning a phonetic value to the identified phonetic component using the phonetic component mapping table;
obtaining a pronunciation of a second word of the particular language, wherein the first word and the second word are different;
identifying a phonetic component of the pronunciation of the second word;
assigning a phonetic value to the identified phonetic component of the second word using the phonetic component mapping table;
calculating a phonetic distance between (i) the identified phonetic component of the first word and (ii) the identified phonetic component of the second word, using (a) the assigned phonetic value of the identified phonetic component of the first word and (b) the assigned phonetic value of the identified phonetic component of the second word; and
storing the calculated phonetic distance in association with the identified phonetic component of the first word.
Dependent claims: 2, 3, 4, 5, 6
7. A computer-implemented method for ranking a series of candidate words with pronunciation similar to that of a seed word, the computer-implemented method comprising:
obtaining a pronunciation of a seed word of a particular language;
identifying a phonetic component of the pronunciation of the seed word, wherein the phonetic component corresponds to a type of phonetic component of the particular language;
obtaining a phonetic component mapping table for the type of phonetic component identified in the pronunciation of the seed word;
assigning a phonetic value to the identified phonetic component using the phonetic component mapping table;
obtaining a pronunciation of a given one of a plurality of candidate words of the particular language, wherein the plurality of candidate words and the seed word are different;
identifying a phonetic component of the pronunciation of the given one of the plurality of candidate words;
assigning a phonetic value to the identified phonetic component using the phonetic component mapping table;
for each type of phonetic component identified in the seed word, calculating a phonetic distance between (i) the identified phonetic component of the seed word and (ii) the identified phonetic component of the candidate word, using (a) the assigned phonetic value of the identified phonetic component of the seed word and (b) the assigned phonetic value of the identified phonetic component of the candidate word;
determining a phonetic similarity distance between the seed word and the candidate word, wherein the phonetic similarity distance comprises calculating a sum of a plurality of phonetic distances between the seed word and the candidate word, each phonetic distance representing a given type of phonetic component;
generating a series of candidate words, wherein each candidate word in the series of candidate words has a pronunciation similar to that of the seed word based on a value of the determined phonetic similarity distance between the seed word and each candidate word; and
ranking the candidate words that have a pronunciation similar to the seed word in order of the value of the determined phonetic similarity distance between the seed word and each candidate word.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
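The summing and ranking steps of claim 7 can be sketched in Python as follows. The component types (`"onset"`, `"vowel"`), the table values, and the function names are all hypothetical, and the per-component distance is assumed to be an absolute difference of assigned values; the claim only requires that the similarity distance be a sum of per-type phonetic distances.

```python
# Hypothetical mapping tables, one per type of phonetic component.
# All numeric values are invented for illustration only.
TABLES = {
    "onset": {"b": 1.0, "p": 1.5, "m": 4.0},
    "vowel": {"a": 1.0, "e": 2.0, "i": 3.0},
}


def similarity_distance(seed: dict, candidate: dict, tables: dict) -> float:
    """Sum the per-type distances for every component type found in the seed."""
    total = 0.0
    for ctype, seed_comp in seed.items():
        table = tables[ctype]
        total += abs(table[seed_comp] - table[candidate[ctype]])
    return total


def rank_candidates(seed: dict, candidates: dict, tables: dict) -> list:
    """Return candidate words ordered from smallest to largest distance."""
    scored = [
        (similarity_distance(seed, comps, tables), word)
        for word, comps in candidates.items()
    ]
    return [word for _, word in sorted(scored)]


# Words are given here as already-identified components, keyed by type.
seed_word = {"onset": "b", "vowel": "a"}
candidate_words = {
    "mi": {"onset": "m", "vowel": "i"},
    "be": {"onset": "b", "vowel": "e"},
    "pa": {"onset": "p", "vowel": "a"},
}
ranking = rank_candidates(seed_word, candidate_words, TABLES)
```

Under these assumed values, the candidate differing only in a nearby onset ranks first and the candidate differing in both components ranks last.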
19. A computer program product for ranking a series of candidate words with pronunciation similar to that of a seed word, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions executable by a computer to cause the computer to perform a method comprising:
obtaining, by computer, a pronunciation of a seed word of a particular language;
identifying, by computer, a phonetic component of the pronunciation of the seed word, wherein the phonetic component corresponds to a type of phonetic component of the particular language;
obtaining, by computer, a phonetic component mapping table for the type of phonetic component identified in the pronunciation of the seed word;
assigning, by computer, a phonetic value to the identified phonetic component using the phonetic component mapping table;
obtaining, by computer, a pronunciation of a given one of a plurality of candidate words of the particular language, wherein the plurality of candidate words and the seed word are different;
identifying, by computer, a phonetic component of the pronunciation of the given one of a plurality of candidate words;
assigning, by computer, a phonetic value to the identified phonetic component using the phonetic component mapping table;
for each type of phonetic component identified in the seed word, calculating, by computer, a phonetic distance between (i) the identified phonetic component of the seed word and (ii) the identified phonetic component of the candidate word, using (a) the assigned phonetic value of the identified phonetic component of the seed word and (b) the assigned phonetic value of the identified phonetic component of the candidate word;
determining, by computer, a phonetic similarity distance between the seed word and the candidate word, wherein the phonetic similarity distance comprises calculating a sum of a plurality of phonetic distances between the seed word and the candidate word, each phonetic distance representing a given type of phonetic component;
generating, by computer, a series of candidate words, wherein each candidate word in the series of candidate words has a pronunciation similar to that of the seed word based on a value of the determined phonetic similarity distance between the seed word and each candidate word; and
ranking, by computer, the candidate words that have a pronunciation similar to the seed word in order of the value of the determined phonetic similarity distance between the seed word and each candidate word.
20. A computer-implemented method for ranking a series of candidate words with pronunciation similar to that of a seed word, wherein the candidate words and the seed word are of the Chinese language, the computer-implemented method comprising:
obtaining a Pinyin pronunciation of a seed word of the Chinese language, wherein the seed word is comprised of a series of characters, wherein each character has a Pinyin pronunciation;
identifying a Pinyin phonetic component of the Pinyin pronunciation of one character of the seed word, wherein the Pinyin phonetic component is selected from the group of Pinyin phonetic components consisting of: an initial, a final, and a tone;
obtaining a Pinyin component mapping table selected from the group consisting of: a Pinyin initial mapping table, a Pinyin final mapping table, and a Pinyin tone mapping table;
assigning a phonetic value to the identified Pinyin phonetic component of the Pinyin pronunciation of the character of the seed word using the respective Pinyin phonetic component mapping table;
obtaining a Pinyin pronunciation of a given one of a plurality of candidate words of the Chinese language, wherein the given one of the candidate words is comprised of a series of characters, wherein each character has a Pinyin pronunciation;
identifying a Pinyin phonetic component of the Pinyin pronunciation of a character of the given one of the candidate words;
assigning a phonetic value to the identified Pinyin phonetic component of the Pinyin pronunciation of the character using the respective Pinyin phonetic component mapping table;
for each type of phonetic component identified in the character of the seed word, calculating a phonetic distance between (i) the identified Pinyin phonetic component of the character of the seed word and (ii) the identified Pinyin phonetic component of the character of the candidate word using (a) the assigned phonetic value of the identified Pinyin phonetic component of the character of the seed word and (b) the assigned phonetic value of the identified Pinyin phonetic component of the character of the candidate word;
determining a phonetic similarity distance between the seed word and the candidate word, wherein the phonetic similarity distance comprises calculating a sum of a plurality of phonetic distances between the seed word and the candidate word, each phonetic distance representing a given type of phonetic component;
generating a series of candidate words, wherein each candidate word in the series of candidate words has a Pinyin pronunciation similar to that of the seed word based on a value of the determined phonetic similarity distance between the seed word and each candidate word; and
ranking the candidate words that have a Pinyin pronunciation similar to the seed word in order of the value of the determined phonetic similarity distance between the seed word and each candidate word.
Dependent claims: 21
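For the Pinyin case in claim 20, a minimal Python sketch might split each syllable into an initial, a final, and a tone, then sum the per-component distances across characters. The three tables and their values, the tone-number spelling (`"zhang1"`), the function names, the absolute-difference distance, and the assumption that the seed and candidate have the same number of characters are all illustrative choices, not details taken from the claim.

```python
# Standard Pinyin initials, with two-letter initials listed first so that
# "zh", "ch" and "sh" are matched before "z", "c" and "s".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")

# Hypothetical mapping tables; values are invented for illustration only.
INITIAL_TABLE = {"z": 1.0, "zh": 1.25, "s": 2.0, "sh": 2.25, "l": 5.0}
FINAL_TABLE = {"an": 1.0, "ang": 1.25}
TONE_TABLE = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}


def split_pinyin(syllable: str):
    """Split e.g. 'zhang1' into ('zh', 'ang', 1); tone is a trailing digit."""
    tone = int(syllable[-1])
    body = syllable[:-1]
    for initial in INITIALS:
        if body.startswith(initial):
            return initial, body[len(initial):], tone
    return "", body, tone  # syllable with no initial


def character_distance(seed_syl: str, cand_syl: str) -> float:
    """Sum the initial, final, and tone distances for one character pair."""
    s_ini, s_fin, s_tone = split_pinyin(seed_syl)
    c_ini, c_fin, c_tone = split_pinyin(cand_syl)
    return (abs(INITIAL_TABLE[s_ini] - INITIAL_TABLE[c_ini])
            + abs(FINAL_TABLE[s_fin] - FINAL_TABLE[c_fin])
            + abs(TONE_TABLE[s_tone] - TONE_TABLE[c_tone]))


def word_distance(seed: str, candidate: str) -> float:
    """Sum character distances; assumes equal numbers of syllables."""
    return sum(character_distance(s, c)
               for s, c in zip(seed.split(), candidate.split()))


def rank_by_pinyin(seed: str, candidates: list) -> list:
    """Order candidates from most to least similar to the seed."""
    return sorted(candidates, key=lambda cand: word_distance(seed, cand))


ranked = rank_by_pinyin(
    "zhang1 san1", ["lang2 san1", "zhang1 shan4", "zang1 san1"]
)
```

The tables here cover only the symbols used in the example; a usable version would need entries for every initial, final, and tone, including the empty initial.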
Specification