Generating topic-specific language models

US 10,559,301 B2
Filed: 12/15/2017
Issued: 02/11/2020
Est. Priority Date: 07/01/2009
Status: Active Grant

Abstract

Speech recognition may be improved by generating and using a topic-specific language model. A topic-specific language model may be created by performing an initial pass on an audio signal using a generic or basis language model. A speech recognition device may then determine topics relating to the audio signal based on the words identified in the initial pass and retrieve a corpus of text relating to those topics. Using the retrieved corpus of text, the speech recognition device may create a topic-specific language model. In one example, the speech recognition device may adapt or otherwise modify the generic language model based on the retrieved corpus of text.
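
As an illustration of the adaptation step described above, the following minimal Python sketch interpolates bigram probabilities from a generic model with those estimated from a retrieved topic corpus. It is not the patented implementation: the function names, the toy sentences, and the `weight` parameter are assumptions made for this example, and linear interpolation is only one common way to adapt a language model.

```python
from collections import Counter, defaultdict


def bigram_counts(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        counts[prev][word] += 1
    return counts


def bigram_prob(counts, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 if prev was never seen."""
    total = sum(counts[prev].values()) if prev in counts else 0
    return counts[prev][word] / total if total else 0.0


def adapted_prob(generic, topic, prev, word, weight=0.5):
    """Linearly interpolate the generic and topic estimates.

    `weight` is a hypothetical tuning parameter: 0.0 keeps the generic
    model unchanged, 1.0 uses only the topic corpus.
    """
    return ((1 - weight) * bigram_prob(generic, prev, word)
            + weight * bigram_prob(topic, prev, word))


# Toy data standing in for a generic model and a retrieved topic corpus.
generic = bigram_counts("the bank raised the rate and the bank closed".split())
topic = bigram_counts("the river bank flooded and the river rose".split())

# "river" never follows "the" in the generic text, so the generic model alone
# gives it zero probability; after adaptation it becomes a likely successor.
p_river = adapted_prob(generic, topic, "the", "river")
p_bank = adapted_prob(generic, topic, "the", "bank")
```

In this toy setting the adapted model shifts probability mass toward word pairs seen in the topic corpus while still retaining pairs known only to the generic model.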

30 Claims

1. An apparatus comprising:
- one or more processors; and
  
  memory storing instructions that, when executed by the one or more processors, cause the apparatus to:
  
  perform, using a first language model, a first speech recognition process on an audio signal;
  
  determine, based on the first speech recognition process, a plurality of topics associated with the audio signal;
  
  determine, based on the first speech recognition process, a respective significance, for each of the plurality of topics, based on a respective quantity of terms, in the audio signal, associated with each of the plurality of topics;
  
  determine, based on the respective significance for each of the plurality of topics, a respective term threshold;
  
  cause, for each of the plurality of topics, a respective set of one or more searches such that a respective quantity of terms identified by the respective set of one or more searches satisfies the respective term threshold for the topic;
  
  determine, based on the terms identified by the searches, a second language model; and
  
  perform, using the second language model, a second speech recognition process on the audio signal.
  - 2. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further cause the apparatus to determine the plurality of topics based on a different term from a term, in the audio signal, that appears in a list of stop words.
  - 3. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further cause the apparatus to determine the plurality of topics based on determining that a frequency of a term, in the audio signal, associated with a first topic of the plurality of topics satisfies a frequency threshold.
  - 4. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further cause the apparatus to determine the plurality of topics based on data extracted from a web page associated with the audio signal.
  - 5. The apparatus of claim 1, wherein the first language model comprises a generic language model and indicates a probability of a first term following a second term.
  - 6. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further cause the apparatus to:
    - based on determining that a quantity of terms returned by a first search does not satisfy the respective term threshold for a first topic of the plurality of topics, conduct a second search associated with the first topic.
  - 7. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further cause the apparatus to determine the respective term threshold by dividing a total number of terms needed to generate the second language model by a quantity of the plurality of topics.
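
The significance and term-threshold steps recited in claims 1, 2, 3, and 7 might be sketched as follows. This is an illustrative reading, not the patented implementation: the stop-word list, the `TERM_TOPICS` mapping, the `min_freq` parameter, and the proportional allocation rule are all assumptions made for the example. Only the even split in `even_thresholds` corresponds directly to the division described in claim 7.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is"}  # hypothetical list

# Hypothetical mapping from recognized terms to topics.
TERM_TOPICS = {
    "goal": "soccer", "striker": "soccer", "penalty": "soccer",
    "election": "politics", "senate": "politics",
}


def topic_significance(transcript_terms, min_freq=1):
    """Count, per topic, the non-stop-word terms from the first pass mapped to it.

    Terms whose frequency falls below `min_freq` are ignored.
    """
    term_freq = Counter(t for t in transcript_terms if t not in STOP_WORDS)
    significance = Counter()
    for term, freq in term_freq.items():
        topic = TERM_TOPICS.get(term)
        if topic is not None and freq >= min_freq:
            significance[topic] += freq
    return significance


def proportional_thresholds(significance, total_terms):
    """Assumed rule: split the total term budget in proportion to significance."""
    total = sum(significance.values())
    return {t: total_terms * n // total for t, n in significance.items()}


def even_thresholds(significance, total_terms):
    """Divide the total term budget by the number of topics."""
    return {t: total_terms // len(significance) for t in significance}


first_pass = "the striker scored a goal and the senate debated the goal of the penalty".split()
sig = topic_significance(first_pass)
by_significance = proportional_thresholds(sig, 1000)
evenly = even_thresholds(sig, 1000)
```

With this toy transcript, four recognized terms map to the hypothetical "soccer" topic and one to "politics", so the proportional rule assigns the more significant topic a larger share of the term budget than the even split does.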

8. An apparatus comprising:
- one or more processors; and
  
  memory storing instructions that, when executed by the one or more processors, cause the apparatus to:
  
  determine, via a first speech recognition process, a first topic and a second topic associated with an audio signal, wherein the first speech recognition process uses an initial language model;
  
  determine a significance of the first topic based on a first quantity of terms, in the audio signal, identified as being associated with the first topic via the first speech recognition process;
  
  determine a significance of the second topic based on a second quantity of terms, in the audio signal, identified as being associated with the second topic via the first speech recognition process;
  
  receive, based on a first search associated with the first topic, a first plurality of terms related to at least the first topic, wherein a quantity of the first plurality of terms satisfies a first threshold number of terms, wherein the first threshold number of terms is based on the significance of the first topic;
  
  cause, based on the first plurality of terms, modification of the initial language model; and
  
  perform, using the modified initial language model, a second speech recognition process on the audio signal.
  - 9. The apparatus of claim 8, wherein the instructions, when executed by the one or more processors, further cause the apparatus to:
    - receive, based on a second search associated with the second topic, a second plurality of terms related to the second topic, wherein a quantity of the second plurality of terms satisfies a second threshold number of terms, wherein the second threshold number of terms is based on the significance of the second topic; and
      
      cause, based at least in part on the second plurality of terms, the modification of the initial language model.
  - 10. The apparatus of claim 8, wherein the instructions, when executed by the one or more processors, further cause the apparatus to determine the first topic and the second topic based on metadata associated with the audio signal.
  - 11. The apparatus of claim 8, wherein the instructions, when executed by the one or more processors, further cause the apparatus to determine the first topic and the second topic based on a different term from a term, in the audio signal, that appears in a list of stop words.

12. An apparatus comprising:
- one or more processors; and
  
  memory storing instructions that, when executed by the one or more processors, cause the apparatus to:
  
  perform, using a first language model, a first speech recognition process on an input signal;
  
  determine, based on the first speech recognition process, a plurality of topics associated with the input signal;
  
  determine, for each topic of the plurality of topics:
  
  a respective significance based on a respective quantity of terms, in the input signal, associated with each of the plurality of topics; and
  
  a respective term threshold based on the respective significance;
  
  cause, for each of the plurality of topics and using words recognized by the first speech recognition process, one or more searches such that a quantity of terms identified by the one or more searches satisfies the respective term threshold for the topic;
  
  determine a corpus of terms by combining the terms identified by the one or more searches conducted for each of the plurality of topics;
  
  determine, based on the corpus of terms, a second language model; and
  
  perform, using the second language model, a second speech recognition process on the input signal.
  - 13. The apparatus of claim 12, wherein the instructions, when executed by the one or more processors, further cause the apparatus to cause, for a first topic of the plurality of topics, the one or more searches by iteratively conducting a plurality of searches until a total quantity of terms identified by the iteratively conducted searches satisfies the respective term threshold for the first topic.
  - 14. The apparatus of claim 12, wherein the instructions, when executed by the one or more processors, further cause the apparatus to determine the second language model by causing, in the first language model, modification of a probability of two terms appearing consecutively.
  - 15. The apparatus of claim 12, wherein the instructions, when executed by the one or more processors, further cause the apparatus to cause, for a first topic of the plurality of topics, the one or more searches by determining, from a keyword table, a plurality of keywords previously associated with the first topic.
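
The iterative search and corpus-combination steps of claims 12, 13, and 15 could look roughly like the sketch below. Everything named here is hypothetical: `KEYWORD_TABLE`, the `fake_search` stand-in for a real search backend, and the toy documents exist only to make the example runnable, and the claims do not prescribe this particular loop.

```python
# Hypothetical keyword table: keywords previously associated with each topic.
KEYWORD_TABLE = {"soccer": ["soccer rules", "world cup", "penalty kick"]}


def fake_search(query):
    """Stand-in for a real search backend; returns the terms of the matching text."""
    documents = {
        "soccer rules": "the referee awards a free kick",
        "world cup": "the final match drew a record crowd",
        "penalty kick": "the keeper saved the penalty",
    }
    return documents.get(query, "").split()


def collect_terms(topic, threshold, search=fake_search):
    """Search keyword by keyword until the term count meets the threshold.

    If the keywords run out first, whatever was found is returned.
    """
    terms = []
    for query in KEYWORD_TABLE.get(topic, []):
        if len(terms) >= threshold:
            break
        terms.extend(search(query))
    return terms


def build_corpus(thresholds, search=fake_search):
    """Combine the terms gathered for every topic into a single corpus."""
    corpus = []
    for topic, threshold in thresholds.items():
        corpus.extend(collect_terms(topic, threshold, search))
    return corpus


corpus = build_corpus({"soccer": 10})
```

Here the first search returns six terms, which is below the threshold of ten, so a second search runs and the loop stops before issuing the third. The combined corpus would then feed the step that determines the second language model.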

16. A non-transitory computer-readable medium storing instructions that, when executed, cause a computing device to:
- perform, using a first language model, a first speech recognition process on an audio signal;
  
  determine, based on the first speech recognition process, a plurality of topics associated with the audio signal;
  
  determine, based on the first speech recognition process, a respective significance, for each of the plurality of topics, based on a respective quantity of terms, in the audio signal, associated with each of the plurality of topics;
  
  determine, based on the respective significance for each of the plurality of topics, a respective term threshold;
  
  cause, for each of the plurality of topics, a respective set of one or more searches such that a respective quantity of terms identified by the respective set of one or more searches satisfies the respective term threshold for the topic;
  
  determine, based on the terms identified by the searches, a second language model; and
  
  perform, using the second language model, a second speech recognition process on the audio signal.
  - 17. The non-transitory computer-readable medium of claim 16, wherein the instructions, when executed, further cause the computing device to determine the plurality of topics based on a different term from a term, in the audio signal, that appears in a list of stop words.
  - 18. The non-transitory computer-readable medium of claim 16, wherein the instructions, when executed, further cause the computing device to determine the plurality of topics based on determining that a frequency of a term, in the audio signal, associated with a first topic of the plurality of topics satisfies a frequency threshold.
  - 19. The non-transitory computer-readable medium of claim 16, wherein the instructions, when executed, further cause the computing device to determine the plurality of topics based on data extracted from a web page associated with the audio signal.
  - 20. The non-transitory computer-readable medium of claim 16, wherein the first language model comprises a generic language model and indicates a probability of a first term following a second term.
  - 21. The non-transitory computer-readable medium of claim 16, wherein the instructions, when executed, further cause the computing device to:
    - based on determining that a quantity of terms returned by a first search does not satisfy the respective term threshold for a first topic of the plurality of topics, conduct a second search associated with the first topic.
  - 22. The non-transitory computer-readable medium of claim 16, wherein the instructions, when executed, further cause the computing device to determine the respective term threshold by dividing a total number of terms needed to generate the second language model by a quantity of the plurality of topics.

23. A non-transitory computer-readable medium storing instructions that, when executed, cause a computing device to:
- determine, via a first speech recognition process, a first topic and a second topic associated with an audio signal, wherein the first speech recognition process uses an initial language model;
  
  determine a significance of the first topic based on a first quantity of terms, in the audio signal, identified as being associated with the first topic via the first speech recognition process;
  
  determine a significance of the second topic based on a second quantity of terms, in the audio signal, identified as being associated with the second topic via the first speech recognition process;
  
  receive, based on a first search associated with the first topic, a first plurality of terms related to at least the first topic, wherein a quantity of the first plurality of terms satisfies a first threshold number of terms, wherein the first threshold number of terms is based on the significance of the first topic;
  
  cause, based on the first plurality of terms, modification of the initial language model; and
  
  perform, using the modified initial language model, a second speech recognition process on the audio signal.
  - 24. The non-transitory computer-readable medium of claim 23, wherein the instructions, when executed, further cause the computing device to:
    - receive, based on a second search associated with the second topic, a second plurality of terms related to the second topic, wherein a quantity of the second plurality of terms satisfies a second threshold number of terms, wherein the second threshold number of terms is based on the significance of the second topic; and
      
      cause, based at least in part on the second plurality of terms, the modification of the initial language model.
  - 25. The non-transitory computer-readable medium of claim 23, wherein the instructions, when executed, further cause the computing device to determine the first topic and the second topic based on metadata associated with the audio signal.
  - 26. The non-transitory computer-readable medium of claim 23, wherein the instructions, when executed, further cause the computing device to determine the first topic and the second topic based on a different term from a term, in the audio signal, that appears in a list of stop words.

27. A non-transitory computer-readable medium storing instructions that, when executed, cause a computing device to:
- perform, using a first language model, a first speech recognition process on an input signal;
  
  determine, based on the first speech recognition process, a plurality of topics associated with the input signal;
  
  determine, for each topic of the plurality of topics:
  
  a respective significance based on a respective quantity of terms, in the input signal, associated with each of the plurality of topics; and
  
  a respective term threshold based on the respective significance;
  
  cause, for each of the plurality of topics and using words recognized by the first speech recognition process, one or more searches such that a quantity of terms identified by the one or more searches satisfies the respective term threshold for the topic;
  
  determine a corpus of terms by combining the terms identified by the one or more searches conducted for each of the plurality of topics;
  
  determine, based on the corpus of terms, a second language model; and
  
  perform, using the second language model, a second speech recognition process on the input signal.
  - 28. The non-transitory computer-readable medium of claim 27, wherein the instructions, when executed, further cause the computing device to cause, for a first topic of the plurality of topics, the one or more searches by iteratively conducting a plurality of searches until a total quantity of terms identified by the iteratively conducted searches satisfies the respective term threshold for the first topic.
  - 29. The non-transitory computer-readable medium of claim 27, wherein the instructions, when executed, further cause the computing device to determine the second language model by causing, in the first language model, modification of a probability of two terms appearing consecutively.
  - 30. The non-transitory computer-readable medium of claim 27, wherein the instructions, when executed, further cause the computing device to cause, for a first topic of the plurality of topics, the one or more searches by determining, from a keyword table, a plurality of keywords previously associated with the first topic.

Current Assignee: TiVo Corporation (Adeia Inc.)
Original Assignee: Comcast Interactive Media LLC (Comcast Corporation)
Inventors: Houghton, David F.; Murray, Seth Michael; Simon, Sibley Verbeck
Primary Examiner(s): Baker, Matthew H
Application Number: US15/843,846
Publication Number: US 20190035388A1
Time in Patent Office: 788 Days
CPC Class Codes:
G10L 15/183 using context dependencies,...
G10L 15/197 Probabilistic grammars, e.g...