Apparatus and methods for dynamically changing a language model based on recognized text

US 9,812,130 B1
Filed: 03/04/2015
Issued: 11/07/2017
Est. Priority Date: 03/11/2014
Status: Active Grant

Abstract

The technology of the present application provides a method and apparatus to manage speech resources. The method includes using a text recognizer to detect a change in a speech application that requires the use of different resources. On detection of the change, the method loads the different resources without the user needing to exit the currently executing speech application.

11 Claims

1. A method performed on at least one processor for managing speech resources of a speech recognition engine, the method comprising the steps of:
- initiating a speech recognition engine with a first language model;
  
  converting audio received by the speech recognition engine to first interim text using the first language model, wherein the converting step includes correlating portions of the audio with corresponding portions of the first interim text;
  
  determining whether the first interim text matches at least one trigger;
  
  if it is determined that the first interim text does not match the at least one trigger, outputting the first interim text as recognized text; and
  
  if it is determined that the first interim text does match the at least one trigger;
  
  replacing the first language model with a second language model in the speech recognition engine, pausing the converting step until the first language model is replaced by the second language model, rewinding the audio based on the correlation between the portions of the audio and the corresponding portions of the first interim text, deleting a given portion of the first interim text that corresponds to a rewound portion of the audio, and resuming the converting step, wherein the rewound portion of the audio is re-input into the speech recognition engine and converted by the speech recognition engine to second interim text using only the second language model.
- Dependent claims (2, 3, 4, 5, 6, 7)
  - 2. The method of claim 1 wherein the initiating step includes initiating the speech recognition engine with a first user profile and the replacing step further includes replacing the first user profile with a second user profile.
  - 3. The method of claim 1 wherein correlating the portions of the audio with the corresponding portions of the first interim text includes creating a plurality of smaller audio files from the audio and converting the plurality of smaller audio files into a corresponding plurality of first interim text files, and wherein the outputted recognized text is concatenated from the plurality of first interim text files.
  - 4. The method of claim 1 wherein correlating the portions of the audio with the corresponding portions of the first interim text includes placing a plurality of markers in the audio and placing a corresponding plurality of tags in the first interim text such that the markers and the tags provide audio and text pairs.
  - 5. The method of claim 1 wherein the at least one trigger is linked to the second language model.
  - 6. The method of claim 5 wherein the at least one trigger comprises a plurality of triggers and wherein the second language model comprises a plurality of second language models.
  - 7. The method of claim 1 wherein the at least one trigger is at least one of a word, a clause, and a phrase.
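
The steps of claim 1 can be illustrated with a small sketch. This is not the patented implementation: the function name `recognize`, the toy lookup-table "language models", and the choice to rewind to the start of the trigger phrase are all assumptions made for illustration, and the sketch returns the recognized text at the end rather than streaming it.

```python
# Hypothetical toy data: each audio "portion" is a token, and each language
# model is a lookup table from portion to text. A real engine decodes audio
# statistically; these tables only stand in for that step.
MODELS = {
    "general": {"begin": "begin", "radiology": "radiology",
                "report": "report", "pneumo": "new mow"},
    "radiology": {"radiology": "radiology", "report": "report",
                  "pneumo": "pneumothorax"},
}
TRIGGERS = {"radiology report": "radiology"}  # trigger phrase -> linked model


def recognize(audio, models, triggers, first_model):
    """Convert audio portions to text, swapping models when a trigger appears."""
    active = first_model
    interim = []  # (audio index, text) pairs: the audio/text correlation
    pos = 0
    while pos < len(audio):
        # Convert one portion with the active model and record the pairing.
        interim.append((pos, models[active].get(audio[pos], "<unk>")))
        pos += 1
        words = [text for _, text in interim]
        for phrase, target in triggers.items():
            parts = phrase.split()
            if target != active and words[-len(parts):] == parts:
                active = target                  # replace the language model
                pos = interim[-len(parts)][0]    # rewind audio (assumed: to trigger start)
                del interim[-len(parts):]        # delete text for the rewound audio
                break                            # resume converting with the new model
    return " ".join(text for _, text in interim), active


print(recognize(["begin", "radiology", "report", "pneumo"],
                MODELS, TRIGGERS, "general"))
```

Because the loop does nothing else while the model is swapped, conversion is effectively paused until the replacement completes, and the rewound portions are re-decoded by the second model only.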

8. At least one processor including a speech recognition engine specially programmed for speech recognition, the speech recognition engine comprising:
- a speech recognizer, the speech recognizer being configured to receive audio and to convert the audio to interim text using at least one language model, wherein the speech recognizer converts the audio to first interim text using a first language model, and when converting the audio to the first interim text, the speech recognizer correlates portions of the audio with corresponding portions of the first interim text;
  
  a text recognizer operationally coupled to the speech recognizer, the text recognizer being configured to receive the first interim text and to recognize whether the first interim text contains a trigger;
  
  wherein when the text recognizer recognizes a trigger in the first interim text, the speech recognizer pauses the conversion of the audio to the first interim text, replaces the first language model with a second language model, rewinds the audio based on the correlation between the portions of the audio and the corresponding portions of the first interim text, deletes a given portion of the first interim text that corresponds to a rewound portion of the audio, and resumes the conversion of the audio to second interim text, wherein the speech recognition engine is configured such that the rewound portion of the audio is re-input into the speech recognizer and converted by the speech recognizer to the second interim text using only the second language model; and
  
  wherein when the text recognizer does not recognize the trigger in the first interim text, the first interim text is provided as recognized text.
- Dependent claims (9, 10, 11)
  - 9. The speech recognition engine of claim 8 further comprising a memory, wherein the memory comprises a plurality of triggers and a plurality of language models, and wherein each of the plurality of triggers is linked to one of the plurality of language models.
  - 10. The speech recognition engine of claim 9 wherein when converting the audio to the first interim text, the speech recognizer creates a plurality of smaller audio files from the audio and converts the plurality of smaller audio files into a corresponding plurality of first interim text files.
  - 11. The speech recognition engine of claim 10 further comprising an index engine, wherein the index engine correlates the plurality of smaller audio files and the corresponding plurality of first interim text files.
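
Claims 10 and 11 describe splitting the audio into smaller files and an index engine that correlates them with the interim text files. A minimal sketch of one way such an index could work follows; the names `split_audio` and `IndexEngine`, the fixed segment size, and the in-memory byte strings standing in for files are assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def split_audio(samples: bytes, size: int) -> List[bytes]:
    """Cut one audio buffer into consecutive segments of at most `size` bytes."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]


@dataclass
class IndexEngine:
    # Ordered (audio segment, interim text) pairs.
    pairs: List[Tuple[bytes, str]] = field(default_factory=list)

    def correlate(self, segment: bytes, text: str) -> int:
        """Record that `text` was produced from `segment`; return the pair's id."""
        self.pairs.append((segment, text))
        return len(self.pairs) - 1

    def rewind(self, from_id: int) -> List[bytes]:
        """Delete the text from pair `from_id` onward and return its audio for re-input."""
        rewound = [segment for segment, _ in self.pairs[from_id:]]
        del self.pairs[from_id:]
        return rewound

    def text(self) -> str:
        """Concatenate the interim text that is still kept."""
        return " ".join(text for _, text in self.pairs)


segments = split_audio(b"abcdefghij", 4)
index = IndexEngine()
for segment, text in zip(segments, ["begin", "radiology", "report"]):
    index.correlate(segment, text)
rewound = index.rewind(1)  # audio for "radiology report" goes back to the recognizer
```

Keeping the pairs in arrival order means a rewind is a single slice: the same index identifies both the text to delete and the audio to convert again.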

Current Assignee: nVoq, Inc.
Original Assignee: nVoq, Inc.
Inventors: Corfield, Charles
Primary Examiner(s): Wozniak, James
Application Number: US14/638,619
Time in Patent Office: 979 Days
Field of Search: 704/235, 704/251, 704/257
CPC Class Codes:
G10L 15/183 using context dependencies,...
G10L 15/197 Probabilistic grammars, e.g...
