Method for writing a foreign language in a pseudo language phonetically resembling native language of the speaker

US 10,102,203 B2
Filed: 12/21/2015
Issued: 10/16/2018
Est. Priority Date: 12/21/2015
Status: Active Grant

Assignments: 1
Petitions: 0

Abstract

Provided are a method, a device, and a computer-readable medium for converting a string of characters in a first language into a phonetic representation of a second language. The conversion uses a first data structure that maps graphemes in the first language to one or more universal phonetic representations based on an international phonetic alphabet; the first data structure comprises a plurality of first nodes, each having a respective weight assigned that corresponds to a likely pronunciation of a grapheme. A second data structure maps the one or more universal phonetic representations to one or more graphemes in the second language; the second data structure comprises a plurality of second nodes, each having a respective weight assigned that corresponds to a likely representation of a grapheme in the second language.
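
The two weighted mappings described in the abstract can be illustrated with a minimal sketch. It assumes flat Python dictionaries in place of the node-based data structures, and every table entry, weight, and function name (`GRAPHEME_TO_IPA`, `IPA_TO_GRAPHEME`, `most_likely`, `convert`) is hypothetical and chosen only for illustration; it is not the patented implementation.

```python
# Hypothetical first structure: source pseudo-grapheme -> [(IPA symbol, weight)].
# Weights are invented; an empty IPA string stands for a silent grapheme.
GRAPHEME_TO_IPA = {
    "ch": [("ʃ", 0.9), ("k", 0.1)],
    "a": [("a", 1.0)],
    "t": [("", 0.7), ("t", 0.3)],
}

# Hypothetical second structure: IPA symbol -> [(target grapheme, weight)].
IPA_TO_GRAPHEME = {
    "ʃ": [("sh", 0.9), ("ch", 0.1)],
    "k": [("k", 0.6), ("c", 0.4)],
    "a": [("a", 0.6), ("ah", 0.4)],
    "t": [("t", 1.0)],
}


def most_likely(options):
    """Return the option with the highest weight."""
    return max(options, key=lambda pair: pair[1])[0]


def convert(pseudo_graphemes):
    """Map each source unit to its likeliest phoneme, then to a target grapheme."""
    output = []
    for unit in pseudo_graphemes:
        phoneme = most_likely(GRAPHEME_TO_IPA[unit])
        if phoneme:  # silent units contribute nothing to the output
            output.append(most_likely(IPA_TO_GRAPHEME[phoneme]))
    return "".join(output)


print(convert(["ch", "a", "t"]))
```

With these invented weights, the French-like input units `ch`, `a`, `t` pass through the phonemes `ʃ` and `a` (the `t` is treated as silent) and come out as an English-like spelling.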

18 Claims

1. A method of converting a string of characters in a first language into a phonetic representation of a second language, the method comprising:
- receiving the string of characters in the first language;
- parsing the string of characters in the first language into a first string of graphemes in the first language;
- adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes;
- grouping the second string of graphemes into a plurality of pseudo-graphemes based on a probability determined by a trained discrete estimator, wherein two or more graphemes in the string of graphemes that are phonetized together are grouped to a single pseudo-grapheme;
- accessing a first data structure that maps each pseudo-grapheme in the string of pseudo-graphemes in the first language to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a likely pronunciation of a grapheme;
- determining one or more phonetic representations for each pseudo-grapheme in the string of pseudo-graphemes in the first language based on the first data structure;
- accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes in the second language, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a grapheme in the second language;
- determining at least one grapheme representation in the second language for one or more of the one or more phonetic representations based on the second data structure; and
- constructing the phonetic representation of the string of characters in the second language based on the grapheme representation in the second language that was determined.

2. The method of claim 1, further comprising:
- ranking each grapheme representation in the second language to produce a ranked list, wherein the ranking is based on a likelihood that a grapheme representation in the second language sounds similar to a pronunciation sound of the string of characters in the second language; and
- filtering the ranked list to produce a subset of grapheme representations in the second language.

3. The method of claim 2, further comprising determining a first composite weight for the one or more phonetic representations based on the first data structure.

4. The method of claim 2, further comprising determining a second composite weight for the one or more grapheme representations based on the second data structure.

5. The method of claim 4, wherein the filtering is based on the second composite weight.

6. The method of claim 1, further comprising creating the first data structure and the second data structure as information gain trees.
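
The parsing and grouping steps of claim 1 can be sketched as follows. This is a rough illustration under stated assumptions: a lookup table (`GROUP_PROBABILITY`, with invented values) stands in for the trained discrete estimator, graphemes are taken to be single characters, and grouping is done greedily against a threshold. The step that adds characters for missing characters is omitted because the claim does not say how they are identified. All names here are hypothetical.

```python
# Hypothetical stand-in for the trained discrete estimator: the probability
# that a run of graphemes is phonetized together. Values are invented.
GROUP_PROBABILITY = {
    ("c", "h"): 0.95,
    ("e", "a", "u"): 0.90,
    ("a", "u"): 0.85,
    ("e", "a"): 0.40,
}


def parse_graphemes(text):
    """Split a string into single-character graphemes (a simplification)."""
    return list(text.lower())


def group_pseudo_graphemes(graphemes, threshold=0.5, max_len=3):
    """Greedily merge the longest run whose probability meets the threshold."""
    groups = []
    i = 0
    while i < len(graphemes):
        # Try the longest candidate run first, down to runs of two graphemes.
        for size in range(min(max_len, len(graphemes) - i), 1, -1):
            run = tuple(graphemes[i:i + size])
            if GROUP_PROBABILITY.get(run, 0.0) >= threshold:
                groups.append("".join(run))
                i += size
                break
        else:
            # No run qualified: the grapheme stands alone.
            groups.append(graphemes[i])
            i += 1
    return groups


print(group_pseudo_graphemes(parse_graphemes("chateau")))
```

A greedy longest-match is only one possible grouping strategy; a real estimator could equally score whole segmentations and pick the most probable one.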

7. A device for converting a string of characters in a first language into a phonetic representation of a second language, the device comprising:
- a memory containing instructions; and
- at least one processor, operably connected to the memory, that executes the instructions to perform operations comprising:
  - receiving the string of characters in the first language;
  - parsing the string of characters in the first language into a first string of graphemes in the first language;
  - adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes;
  - grouping the second string of graphemes into a plurality of pseudo-graphemes based on a probability determined by a trained discrete estimator, wherein two or more graphemes in the string of graphemes that are phonetized together are grouped to a single pseudo-grapheme;
  - accessing a first data structure that maps each pseudo-grapheme in the string of pseudo-graphemes in the first language to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a likely pronunciation of a grapheme;
  - determining one or more phonetic representations for each pseudo-grapheme in the string of pseudo-graphemes in the first language based on the first data structure;
  - accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes in the second language, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a grapheme in the second language;
  - determining at least one grapheme representation in the second language for one or more of the one or more phonetic representations based on the second data structure; and
  - constructing the phonetic representation of the string of characters in the second language based on the grapheme representation in the second language that was determined.

8. The device of claim 7, wherein the at least one processor is further operable to perform the method comprising:
- ranking each grapheme representation in the second language to produce a ranked list, wherein the ranking is based on a likelihood that a grapheme representation in the second language sounds similar to a pronunciation sound of the string of characters in the second language; and
- filtering the ranked list to produce a subset of grapheme representations in the second language.

9. The device of claim 8, wherein the at least one processor is further operable to perform the method comprising determining a first composite weight for the one or more phonetic representations based on the first data structure.

10. The device of claim 8, wherein the at least one processor is further operable to perform the method comprising determining a second composite weight for the one or more grapheme representations based on the second data structure.

11. The device of claim 10, wherein the filtering is based on the second composite weight.

12. The device of claim 7, wherein the at least one processor is further operable to perform the method comprising creating the first data structure and the second data structure as information gain trees.
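
The ranking and filtering recited in the dependent claims can be sketched as below. The claims do not define how a composite weight is computed, so this sketch assumes it is the product of the per-grapheme weights; the candidate table, the weights, and the names `rank_candidates` and `filter_ranked` are all hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical candidates per phoneme: [(target grapheme, weight)].
# Values are invented for illustration.
CANDIDATES = [
    [("sh", 0.9), ("ch", 0.1)],  # options for the first phoneme
    [("a", 0.6), ("ah", 0.4)],   # options for the second phoneme
]


def rank_candidates(candidates):
    """Build every spelling and sort by composite weight, highest first."""
    ranked = []
    for combo in product(*candidates):
        spelling = "".join(grapheme for grapheme, _ in combo)
        # Assumption: composite weight = product of the individual weights.
        composite = prod(weight for _, weight in combo)
        ranked.append((spelling, composite))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def filter_ranked(ranked, min_weight):
    """Keep only spellings whose composite weight meets the cutoff."""
    return [spelling for spelling, weight in ranked if weight >= min_weight]


ranked = rank_candidates(CANDIDATES)
print(ranked)
print(filter_ranked(ranked, min_weight=0.3))
```

Enumerating every combination grows exponentially with input length, so a practical system would likely prune low-weight partial spellings early, for example with a beam search.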

13. A non-transitory computer-readable medium comprising computer-interpretable instructions which, when executed by at least one electronic processor, cause the at least one electronic processor to perform a method of converting a string of characters in a first language into a phonetic representation of a second language, the method comprising:
- receiving the string of characters in the first language;
- parsing the string of characters in the first language into a first string of graphemes in the first language;
- adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes;
- grouping the second string of graphemes into a plurality of pseudo-graphemes based on a probability determined by a trained discrete estimator, wherein two or more graphemes in the string of graphemes that are phonetized together are grouped to a single pseudo-grapheme;
- accessing a first data structure that maps each pseudo-grapheme in the string of pseudo-graphemes in the first language to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a likely pronunciation of a grapheme;
- determining one or more phonetic representations for each pseudo-grapheme in the string of pseudo-graphemes in the first language based on the first data structure;
- accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes in the second language, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a grapheme in the second language;
- determining at least one grapheme representation in the second language for one or more of the one or more phonetic representations based on the second data structure; and
- constructing the phonetic representation of the string of characters in the second language based on the grapheme representation in the second language that was determined.

14. The non-transitory computer-readable medium of claim 13, further comprising:
- ranking each grapheme representation in the second language to produce a ranked list, wherein the ranking is based on a likelihood that a grapheme representation in the second language sounds similar to a pronunciation sound of the string of characters in the second language; and
- filtering the ranked list to produce a subset of grapheme representations in the second language.

15. The non-transitory computer-readable medium of claim 14, further comprising determining a first composite weight for the one or more phonetic representations based on the first data structure.

16. The non-transitory computer-readable medium of claim 14, further comprising determining a second composite weight for the one or more grapheme representations based on the second data structure.

17. The non-transitory computer-readable medium of claim 16, wherein the filtering is based on the second composite weight.

18. The non-transitory computer-readable medium of claim 13, further comprising creating the first data structure and the second data structure as information gain trees.
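
The claims name information gain trees without defining them. One common reading, assumed here, is a decision tree whose splits are chosen by information gain, that is, the reduction in entropy of the pronunciation labels. The sketch below computes that quantity for a hypothetical set of training samples for the grapheme "c"; the samples, the feature (the following letter), and the function names are illustrative assumptions only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(samples, predicate):
    """Entropy reduction from splitting (feature, label) samples by a predicate."""
    labels = [label for _, label in samples]
    matched = [label for feature, label in samples if predicate(feature)]
    unmatched = [label for feature, label in samples if not predicate(feature)]
    # Weighted entropy of the two branches; empty branches contribute nothing.
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in (matched, unmatched)
        if part
    )
    return entropy(labels) - remainder


# Hypothetical samples for the grapheme "c": (following letter, phoneme).
SAMPLES = [("h", "ʃ"), ("h", "ʃ"), ("a", "k"), ("o", "k")]

print(information_gain(SAMPLES, lambda nxt: nxt == "h"))
print(information_gain(SAMPLES, lambda nxt: nxt in "ha"))
```

A tree builder would evaluate several candidate questions at each node and keep the one with the highest gain; in this toy data the question "is the next letter h?" separates the two pronunciations completely.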

Current Assignee: VeriSign, Inc.
Original Assignee: VeriSign, Inc.
Inventors: Raemy, Vincent; Russo, Vincenzo; Hennebert, Jean; Wicht, Baptiste
Primary Examiner(s): Spooner, Lamont
Application Number: US14/977,022
Publication Number: US 20170177569A1
Time in Patent Office: 1,030 Days
Field of Search: 704/1, 704/9, 704/10, 704/257, 704/2-8
CPC Class Codes:
- G06F 40/205   Parsing
- G06F 40/47   Machine-assisted translatio...
- G06F 40/55   Rule-based translation
