Systems and methods for performing ASR in the presence of heterographs

US 9,721,564 B2
Filed: 07/31/2014
Issued: 08/01/2017
Est. Priority Date: 07/31/2014
Status: Active Grant

Abstract

Systems and methods for performing ASR in the presence of heterographs are provided. Verbal input is received from the user that includes a plurality of utterances. A first of the plurality of utterances is matched to a first word. It is determined that a second utterance in the plurality of utterances matches a plurality of words that is in a same heterograph set. It is identified which one of the plurality of words is associated with a context of the first word. A function is performed based on the first word and the identified one of the plurality of words.
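The selection step summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the heterograph sets, the example words, the distance table, and the function names `heterograph_candidates` and `resolve` do not come from the patent text. The sketch only shows the general idea of trying each word in a heterograph set against a context word and keeping the one with the smallest stored distance.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical heterograph sets: words that sound alike but are spelled differently.
HETEROGRAPH_SETS: List[FrozenSet[str]] = [
    frozenset({"uconn", "yukon"}),
    frozenset({"knight", "night"}),
]

# Hypothetical pairwise distances (smaller means more closely related).
DISTANCE: Dict[FrozenSet[str], float] = {
    frozenset({"basketball", "uconn"}): 1.0,
    frozenset({"basketball", "yukon"}): 4.0,
}


def heterograph_candidates(word: str) -> List[str]:
    """Return every word in the same heterograph set, or just the word itself."""
    for group in HETEROGRAPH_SETS:
        if word in group:
            return sorted(group)
    return [word]


def resolve(context_word: str, ambiguous: str) -> Tuple[str, float]:
    """Combine each candidate with the context word and keep the closest pair."""
    best: Tuple[str, float] = (ambiguous, float("inf"))
    for candidate in heterograph_candidates(ambiguous):
        # Unknown pairs are treated as infinitely far apart (an assumption).
        value = DISTANCE.get(frozenset({context_word, candidate}), float("inf"))
        if value < best[1]:
            best = (candidate, value)
    return best


if __name__ == "__main__":
    print(resolve("basketball", "yukon"))
```

With the illustrative table above, the recognizer's raw match "yukon" would be replaced by "uconn" because its stored distance to the context word is smaller.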

18 Claims

1. A method for performing automatic speech recognition (ASR) when a heterographic word is present, the method comprising:
- receiving verbal input from a user that comprises a plurality of utterances;
  
  matching a first of the plurality of utterances to a first word;
  
  determining a word that describes the context for the first word;
  
  determining that a second utterance in the plurality of utterances matches a plurality of words that are in a same heterograph set;
  
  combining a second word chosen from the plurality of words with the word that describes the context for the first word to generate a first combined set of words;
  
  storing a first value representing a distance between words in the first combined set of words;
  
  combining a third word chosen from the plurality of words with the word that describes the context for the first word to generate a second combined set of words;
  
  storing a second value representing a distance between words in the second combined set of words;
  
  in response to determining that the second value is smaller than the first value, performing a media guidance application function on an available media asset based on the second combined set of words.
  - 2. The method of claim 1 further comprising:
    - storing a knowledge graph of a relationship between words, wherein a distance between words in the knowledge graph is indicative of strength in relationship between the words; and
      
      calculating the first value and the second value based on the distance between the words in the first combined set of words and the distance between the words in the second combined set of words.
  - 3. The method of claim 2 further comprising:
    - identifying positions, in the knowledge graph, of the context of the first word and each of the plurality of words; and
      
      computing, based on the identified positions, a distance between the context of the first word and each of the plurality of words.
  - 4. The method of claim 1, wherein the first word is a name of a competitor in a sporting event, further comprising:
    - setting the context to be the sporting event; and
      
      determining which of the plurality of words corresponds to the sporting event, wherein the third word corresponds to another competitor in the sporting event.
  - 5. The method of claim 1, wherein the plurality of words that are in the same heterograph set are phonetically similar to each other.
  - 6. The method of claim 1 further comprising generating a recommendation based on the first word and the third word.
  - 7. The method of claim 1, wherein matching the first of the plurality of utterances to the first word comprises determining that the first utterance phonetically corresponds to the first word.
  - 8. The method of claim 1, wherein the first word is a name of an actor in a media asset, further comprising:
    - setting the context to be the media asset; and
      
      determining which of the plurality of words corresponds to the media asset, wherein the third word corresponds to another actor in the media asset.
  - 9. The method of claim 1 further comprising determining the context based on a conjunction between two of the plurality of utterances.
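Claims 2 and 3 refer to a knowledge graph in which the distance between words indicates how strongly they are related. The sketch below is one possible reading, not the patented implementation: it assumes an undirected, unweighted graph and uses breadth-first hop count as the distance. The claims do not fix a particular metric, and the graph contents and the names `add_edge`, `build_example_graph`, and `hop_distance` are hypothetical.

```python
from collections import deque
from typing import Dict, Optional, Set

Graph = Dict[str, Set[str]]


def add_edge(graph: Graph, a: str, b: str) -> None:
    """Record an undirected relationship between two words."""
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def build_example_graph() -> Graph:
    """Build a tiny illustrative graph; the contents are assumptions."""
    graph: Graph = {}
    for a, b in [
        ("basketball", "duke"),
        ("basketball", "uconn"),
        ("basketball", "sports"),
        ("hockey", "sports"),
        ("canada", "hockey"),
        ("canada", "yukon"),
    ]:
        add_edge(graph, a, b)
    return graph


def hop_distance(graph: Graph, start: str, goal: str) -> Optional[int]:
    """Breadth-first search; returns the number of edges on a shortest path."""
    if start not in graph or goal not in graph:
        return None  # a word has no position in the graph
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # the words are not connected


if __name__ == "__main__":
    g = build_example_graph()
    first_value = hop_distance(g, "basketball", "yukon")
    second_value = hop_distance(g, "basketball", "uconn")
    print(first_value, second_value)
```

In this toy graph the second value is smaller than the first, so the second combined set of words would be the one acted on. A weighted graph with a shortest-path algorithm would be an equally plausible reading of "distance".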

10. A system for performing automatic speech recognition (ASR) when a heterographic word is present, the system comprising:
- control circuitry configured to:
  
  receive verbal input from a user that comprises a plurality of utterances;
  
  match a first of the plurality of utterances to a first word;
  
  determine a word that describes the context for the first word;
  
  determine that a second utterance in the plurality of utterances matches a plurality of words that are in a same heterograph set;
  
  combine a second word chosen from the plurality of words with the word that describes the context for the first word to generate a first combined set of words;
  
  store a first value representing a distance between words in the first combined set of words;
  
  combine a third word chosen from the plurality of words with the word that describes the context for the first word to generate a second combined set of words;
  
  store a second value representing a distance between words in the second combined set of words; and
  
  in response to determining that the second value is smaller than the first value, perform a media guidance application function on an available media asset based on the second combined set of words.
  - 11. The system of claim 10, wherein the control circuitry is further configured to:
    - store a knowledge graph of a relationship between words, wherein a distance between words in the knowledge graph is indicative of strength in relationship between the words; and
      
      calculate the first value and the second value based on a distance between the words in the first combined set of words and the words in the second combined set of words.
  - 12. The system of claim 11, wherein the control circuitry is further configured to:
    - identify positions, in the knowledge graph, of the first word and each of the plurality of words; and
      
      compute, based on the identified positions, a distance between the first word and each of the plurality of words.
  - 13. The system of claim 10, wherein the first word is a name of a competitor in a sporting event, and wherein the control circuitry is further configured to:
    - set the context to be the sporting event;
      
      determine which of the plurality of words corresponds to the sporting event, wherein the third word corresponds to another competitor in the sporting event.
  - 14. The system of claim 10, wherein the plurality of words that are in the same heterograph set are phonetically similar to each other.
  - 15. The system of claim 10, wherein the control circuitry is further configured to generate a recommendation based on the first word and the third word.
  - 16. The system of claim 10, wherein the control circuitry is further configured to match the first of the plurality of utterances to the first word by determining that the first utterance phonetically corresponds to the first word.
  - 17. The system of claim 10, wherein the first word is a name of an actor in a media asset, and wherein the control circuitry is further configured to:
    - set the context to be the media asset; and
      
      determine which of the plurality of words corresponds to the media asset, wherein the third word corresponds to another actor in the media asset.
  - 18. The system of claim 10, wherein the control circuitry is further configured to determine the context based on a conjunction between two of the plurality of utterances.
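Claims 9 and 18 mention determining the context from a conjunction between two utterances. A rough sketch of that idea follows, under loudly stated assumptions: the table mapping connecting words to contexts, the choice of "v.", "vs", and "versus" as triggers, and the function name `context_from_conjunction` are all hypothetical and are not taken from the claims.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from a connecting word to the context it suggests.
CONJUNCTION_CONTEXT: Dict[str, str] = {
    "v.": "sporting event",
    "vs": "sporting event",
    "versus": "sporting event",
}


def context_from_conjunction(words: List[str]) -> Optional[str]:
    """Return a context if a known conjunction sits between two other words."""
    # Skip the first and last positions: a conjunction must join two utterances.
    for index in range(1, len(words) - 1):
        context = CONJUNCTION_CONTEXT.get(words[index].lower())
        if context is not None:
            return context
    return None


if __name__ == "__main__":
    print(context_from_conjunction(["Duke", "v.", "Yukon"]))
```

A context obtained this way could then serve as the context word against which each member of the heterograph set is compared.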

Current Assignee: Rovi Guides, Inc. (Adeia Inc.)
Original Assignee: Rovi Guides, Inc. (Adeia Inc.)
Inventors: Agarwal, Akshat; Barve, Rakesh
Primary Examiner(s): Yang, Qian
Application Number: US14/448,308
Publication Number: US 20160035347A1
Time in Patent Office: 1,097 Days

CPC Class Codes
G10L 15/1815   Semantic context, e.g. disa...
G10L 15/187   Phonemic context, e.g. pron...
G10L 15/193   Formal grammars, e.g. finit...
