Learning personalized entity pronunciations

  • US 10,152,965 B2
  • Filed: 02/03/2016
  • Issued: 12/11/2018
  • Est. Priority Date: 02/03/2016
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving audio data corresponding to an utterance that is spoken by a user of a device and that includes a voice command trigger term and an entity name that is a proper noun;

    generating, by an automated speech recognizer, a first phonetic representation of a first portion of the utterance that is associated with the entity name that is a proper noun, wherein the first phonetic pronunciation does not phonetically correspond to a previously available phonetic pronunciation of the entity name;

    generating, by the automated speech recognizer, an initial transcription that (i) is based on the first phonetic representation of the first portion of the utterance, and (ii) includes a transcription of a term that is not a proper noun;

    in response to the generation of the initial transcription that includes a transcription of the term that is not a proper noun, prompting a user for feedback, wherein prompting the user for feedback comprises:

    providing, for output to the user on a graphical user interface of the device, a representation of the initial transcription that (i) is based on the first phonetic pronunciation of the first portion of the utterance, and (ii) includes the transcription of the term that is not a proper noun;

    providing, for output to the user on the graphical user interface, multiple entity names from a set of entity names stored in the pronunciation dictionary, wherein the multiple entity names that are provided for output on the graphical user interface include both (i) entity names that are phonetically close to the entity name included in the utterance, and (ii) entity names that are phonetically unrelated to the entity name included in the utterance; and

    receiving data corresponding to a selection by the user of a particular entity name of the multiple entity names;

    generating a different transcription based on the received data corresponding to the particular entity name selected by the user, wherein the different transcription includes an entity name that does not phonetically correspond to the first phonetic representation;

    updating the pronunciation dictionary to associate (i) the first phonetic representation of the first portion of the utterance that corresponds to the portion of the utterance that is associated with the entity name that is a proper noun with (ii) the entity name in the pronunciation dictionary corresponding to the different transcription that does not phonetically correspond to the first phonetic representation;

    receiving a subsequent utterance that includes the entity name; and

    transcribing the subsequent utterance based at least in part on the updated pronunciation dictionary.
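
The dictionary update and later re-transcription recited in the claim can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation: the class `PronunciationDictionary`, the functions `transcribe` and `learn_from_selection`, the space-separated phoneme strings, and the example name "Joaquin" are all hypothetical and do not come from the patent. Speech recognition itself is not modeled; the phonetic representation and the recognizer's fallback word are supplied as plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class PronunciationDictionary:
    """Hypothetical store mapping an entity name to its known pronunciations."""
    entries: Dict[str, Set[str]] = field(default_factory=dict)

    def lookup(self, phonemes: str) -> Optional[str]:
        """Return the entity name registered for this pronunciation, if any."""
        for name, pronunciations in self.entries.items():
            if phonemes in pronunciations:
                return name
        return None

    def associate(self, entity_name: str, phonemes: str) -> None:
        """Record a pronunciation for an entity name, keeping earlier ones."""
        self.entries.setdefault(entity_name, set()).add(phonemes)


def transcribe(phonemes: str, dictionary: PronunciationDictionary,
               fallback: str) -> str:
    """Prefer a dictionary match; otherwise use the recognizer's fallback word."""
    name = dictionary.lookup(phonemes)
    return name if name is not None else fallback


def learn_from_selection(dictionary: PronunciationDictionary,
                         phonemes: str, selected_name: str) -> str:
    """Bind the user's pronunciation to the entity name the user selected."""
    dictionary.associate(selected_name, phonemes)
    return selected_name


# Assumed starting state: one previously available pronunciation.
dictionary = PronunciationDictionary({"Joaquin": {"hh w aa k iy n"}})

# The user's pronunciation does not match it, so the initial transcription
# falls back to a term that is not a proper noun (hypothetical value).
heard = "w aa k iy n"
initial = transcribe(heard, dictionary, fallback="walking")

# The user picks the intended name from the list shown on the device.
corrected = learn_from_selection(dictionary, heard, "Joaquin")

# A subsequent utterance with the same pronunciation now resolves correctly.
subsequent = transcribe(heard, dictionary, fallback="walking")
print(initial, corrected, subsequent)
```

The sketch keeps the earlier pronunciation alongside the newly learned one, which is a design choice of this example; the claim only requires that the new phonetic representation be associated with the selected entity name.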
