IDENTIFYING SUBSTITUTE PRONUNCIATIONS

US 20150170642A1
Filed: 12/17/2013
Published: 06/18/2015
Est. Priority Date: 12/17/2013
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying substitute pronunciations. The described operations include: selecting terms; obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the terms; receiving audio data corresponding to a particular user speaking the terms in the natural language; obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the terms in the natural language; aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user; identifying, based on the aligning, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and, based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
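
The aligning and identifying steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes transcriptions are lists of phoneme symbols, it swaps in the standard library's `difflib.SequenceMatcher` as a stand-in aligner, and the function names and the phoneme symbols in the usage note are hypothetical.

```python
from difflib import SequenceMatcher


def differing_portions(expected, actual):
    """Align two phoneme sequences and return the portions that differ.

    Each result is a pair (expected_portion, actual_portion) of tuples.
    SequenceMatcher is an assumed stand-in for whatever aligner a real
    system would use.
    """
    matcher = SequenceMatcher(a=expected, b=actual, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tuple(expected[i1:i2]), tuple(actual[j1:j2])))
    return diffs


def substitute_pronunciations(expected, actual):
    """Map each differing actual portion to the expected portion.

    The expected portion plays the role of the "substitute pronunciation".
    Pure insertions or deletions (an empty side) are skipped in this sketch.
    """
    return {
        act: exp
        for exp, act in differing_portions(expected, actual)
        if exp and act
    }
```

For example, with the hypothetical transcriptions `["f", "uh", "ch", "uh", "n"]` (expected) and `["f", "ao", "ch", "uh", "n"]` (actual), `substitute_pronunciations` returns `{("ao",): ("uh",)}`: the speaker's "ao" is mapped to the expected "uh".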

39 Citations

23 Claims

1. A computer-implemented method comprising:
- selecting one or more terms;
  
  obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the one or more terms;
  
  receiving audio data corresponding to a particular user speaking the one or more terms in the natural language;
  
  obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the one or more terms in the natural language;
  
  aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user;
  
  identifying, based on aligning the expected phonetic transcription of the idealized native speaker with the actual phonetic transcription of the particular user, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and
  
  based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
  - 2. The computer-implemented method of claim 1, further comprising storing data of a mapping between the substitute pronunciation and the corresponding portion of the actual phonetic transcription for the term in a database.
  - 3. The computer-implemented method of claim 1, wherein selecting further comprises:
    - providing, on a user interface, a representation of the one or more terms to the particular user;
      
      receiving a confirmation from the particular user that the representation of the one or more terms corresponds to the actual phonetic transcription; and
      
      associating the one or more terms with the actual phonetic transcription.
  - 4. The computer-implemented method of claim 3, wherein associating the one or more terms with the actual phonetic transcription further comprises storing data of a mapping between the one or more terms and the actual phonetic transcription in a database.
  - 5. The computer-implemented method of claim 3, wherein the representation of the one or more terms is provided in response to receiving the audio corresponding to the particular user speaking the one or more terms in the natural language.
  - 6. The computer-implemented method of claim 3, wherein the one or more terms include names of one or more contacts of the particular user.
  - 7. The computer-implemented method of claim 1, further comprising:
    - receiving data identifying one or more characteristics associated with the particular user;
      
      generating an identifier based on at least one of the one or more characteristics; and
      
      associating the identifier with the substitute pronunciation.
  - 8. The computer-implemented method of claim 7, wherein associating the identifier with the substitute pronunciation further comprises storing data of a mapping between the substitute pronunciation and the identifier in a confusion matrix.
  - 9. The computer-implemented method of claim 7, further comprising assigning the identifier to the particular user.
  - 10. The computer-implemented method of claim 7, wherein the one or more characteristics include one or more of a geographic location, a family name, an origin group, and a like-pronunciation group.
  - 11. The computer-implemented method of claim 7, further comprising:
    - identifying a mapping between the substitute pronunciation and an actual pronunciation for the corresponding portion of the actual phonetic transcription;
      
      associating the mapping with the identifier; and
      
      storing the association and the mapping in a confusion matrix.
  - 12. The computer-implemented method of claim 1, wherein obtaining the expected phonetic transcription further comprises:
    - identifying one or more rules associated with the one or more terms; and
      
      generating the expected phonetic transcription based on the one or more rules.
  - 13. The computer-implemented method of claim 1, wherein obtaining the expected phonetic transcription further comprises:
    - identifying a phonetic transcription dictionary, the phonetic transcription dictionary including one or more mappings between one or more terms and one or more expected phonetic transcriptions of the one or more terms; and
      
      identifying a particular mapping of the one or more mappings of the phonetic transcription dictionary between the one or more terms and the expected phonetic transcription.
  - 14. The computer-implemented method of claim 1, wherein the portion of the expected phonetic transcription includes a sequence of at least three phonemes.
  - 15. The computer-implemented method of claim 1, further comprising:
    - receiving additional audio data corresponding to the particular user speaking one or more additional terms in the natural language, the one or more additional terms including the corresponding portion of the actual phonetic transcription;
      
      identifying the substitute pronunciation; and
      
      obtaining, based on the additional audio data and the substitute pronunciation, a text-based transcription of the additional audio data corresponding to the particular user speaking the one or more additional terms in the natural language.
  - 16. The computer-implemented method of claim 15, further comprising:
    - receiving data of one or more characteristics associated with the particular user;
      
      identifying the identifier that is associated with the one or more characteristics; and
      
      based on identifying the identifier that is associated with the one or more characteristics, identifying the substitute pronunciation associated with the identifier.
  - 17. The computer-implemented method of claim 16, wherein obtaining the text-based transcription of the additional audio data further comprises:
    - obtaining, based on the additional audio data, an additional actual phonetic transcription of the particular user speaking the one or more additional terms in the natural language;
      
      identifying an additional portion of the additional actual phonetic transcription that corresponds to the corresponding portion of the actual phonetic transcription;
      
      replacing the additional portion of the additional actual phonetic transcription with the substitute pronunciation; and
      
      based on the replacing, obtaining an updated phonetic transcription of the additional audio data corresponding to the particular user speaking the one or more additional terms in the natural language.
  - 18. The computer-implemented method of claim 17, wherein the text-based transcription is a text-based transcription of the updated phonetic transcription of the additional audio data corresponding to the particular user speaking the one or more terms in the natural language.
  - 19. The computer-implemented method of claim 18, further comprising providing the text-based transcription of the updated phonetic transcription of the additional audio data corresponding to the particular user speaking the one or more terms in the natural language to a user interface manager.
  - 20. The computer-implemented method of claim 18, further comprising providing the text-based transcription of the updated phonetic transcription of the additional audio data corresponding to the particular user speaking the one or more terms in the natural language as a search query to a search engine.
  - 21. The computer-implemented method of claim 1, further comprising designating the expected phonetic transcription as the substitute pronunciation for the portion of the corresponding actual phonetic transcription for a particular group of users, the particular group of users including the particular user.
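
Claims 7 to 11 and 15 to 17 describe storing substitute pronunciations under an identifier and later replacing matching portions of a new phonetic transcription. A minimal Python sketch of that idea follows. It is only an illustration under stated assumptions: the `SubstituteStore` class, its method names, the `"group-a"` identifier, and the phoneme symbols are all hypothetical, and a plain nested dictionary stands in for the "confusion matrix" named in the claims.

```python
from collections import defaultdict


class SubstituteStore:
    """Hypothetical store of substitute pronunciations keyed by identifier."""

    def __init__(self):
        # identifier -> {actual portion (tuple) -> substitute (tuple)}
        self._by_identifier = defaultdict(dict)

    def add(self, identifier, actual_portion, substitute):
        """Record that `substitute` should replace `actual_portion`."""
        if not actual_portion:
            raise ValueError("actual_portion must contain at least one phoneme")
        self._by_identifier[identifier][tuple(actual_portion)] = tuple(substitute)

    def apply(self, identifier, phonemes):
        """Return an updated transcription with known portions replaced."""
        mappings = self._by_identifier.get(identifier, {})
        # Try longer portions first so they win over shorter overlapping ones.
        portions = sorted(mappings, key=len, reverse=True)
        updated, i = [], 0
        while i < len(phonemes):
            for portion in portions:
                if tuple(phonemes[i:i + len(portion)]) == portion:
                    updated.extend(mappings[portion])
                    i += len(portion)
                    break
            else:
                # No stored portion starts here; keep the phoneme unchanged.
                updated.append(phonemes[i])
                i += 1
        return updated
```

In use, `store.add("group-a", ["ao"], ["uh"])` followed by `store.apply("group-a", ["m", "ao", "ch"])` yields `["m", "uh", "ch"]`, while the same call with an unknown identifier leaves the transcription unchanged. The updated phoneme sequence would then be handed to whatever component produces the text-based transcription; that component is outside this sketch.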

22. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  selecting one or more terms;
  
  obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the one or more terms;
  
  receiving audio data corresponding to a particular user speaking the one or more terms in the natural language;
  
  obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the one or more terms in the natural language;
  
  aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user;
  
  identifying, based on aligning the expected phonetic transcription of the idealized native speaker with the actual phonetic transcription of the particular user, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and
  
  based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.

23. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- selecting one or more terms;
  
  obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the one or more terms;
  
  receiving audio data corresponding to a particular user speaking the one or more terms in the natural language;
  
  obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the one or more terms in the natural language;
  
  aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user;
  
  identifying, based on aligning the expected phonetic transcription of the idealized native speaker with the actual phonetic transcription of the particular user, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and
  
  based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Peng, Fuchun; Beaufays, Francoise; Mengibar, Pedro J. Moreno; Strope, Brian Patrick
Granted Patent: US 9,747,897 B2
US Class Current: 1/1
CPC Class Codes:
G10L 15/005   Language recognition
G10L 15/187   Phonemic context, e.g. pron...
G10L 2015/025   Phonemes, fenemes or fenone...
G10L 2015/227   of the speaker; Human-fact...
