Apparatus and methods for dynamically changing a language model based on recognized text
Abstract
The technology of the present application provides a method and apparatus to manage speech resources. The method includes using a text recognizer to detect a change in a speech application that requires the use of different resources. On detection of the change, the method loads the different resources without the user needing to exit the currently executing speech application.
11 Claims
1. A method performed on at least one processor for managing speech resources of a speech recognition engine, the method comprising the steps of:
initiating a speech recognition engine with a first language model;
converting audio received by the speech recognition engine to first interim text using the first language model, wherein the converting step includes correlating portions of the audio with corresponding portions of the first interim text;
determining whether the first interim text matches at least one trigger;
if it is determined that the first interim text does not match the at least one trigger, outputting the first interim text as recognized text; and
if it is determined that the first interim text does match the at least one trigger;
  replacing the first language model with a second language model in the speech recognition engine,
  pausing the converting step until the first language model is replaced by the second language model,
  rewinding the audio based on the correlation between the portions of the audio and the corresponding portions of the first interim text,
  deleting a given portion of the first interim text that corresponds to a rewound portion of the audio, and
  resuming the converting step, wherein the rewound portion of the audio is re-input into the speech recognition engine and converted by the speech recognition engine to second interim text using only the second language model.
Dependent claims: 2, 3, 4, 5, 6, 7
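The steps of claim 1 can be illustrated with a small sketch. This is not the patented implementation: the "language models" here are toy lookup tables from frame labels to words, and the names `GENERAL_LM`, `MEDICAL_LM`, `recognize`, and the `chunk` parameter are all hypothetical. The sketch also assumes that the trigger word itself is kept in the output and that only the audio after the trigger is rewound; the claim does not fix either choice.

```python
# Toy language models: each maps an audio frame label to a word (illustrative only).
GENERAL_LM = {"f_rx": "prescription", "f_a": "a", "f_b": "bee"}
MEDICAL_LM = {"f_rx": "prescription", "f_a": "amoxicillin", "f_b": "benzonatate"}


def recognize(audio, first_lm, triggers, chunk=4):
    """Convert audio frames to text, swapping language models on a trigger.

    audio    -- list of frame labels standing in for real audio
    triggers -- dict mapping a trigger word to the language model it selects
    chunk    -- number of frames converted before the interim text is checked
    """
    lm = first_lm                      # initiate the engine with the first model
    recognized = []
    pos = 0                            # current position in the audio
    while pos < len(audio):
        frames = audio[pos:pos + chunk]
        # Converting step: each interim word is correlated with its frame index.
        interim = [(pos + i, lm[f]) for i, f in enumerate(frames)]

        # Determining step: look for a trigger that selects a different model.
        hit = next(
            (k for k, (_, word) in enumerate(interim)
             if word in triggers and triggers[word] is not lm),
            None,
        )
        if hit is None:
            # No trigger: the interim text is output as recognized text.
            recognized.extend(word for _, word in interim)
            pos += len(frames)
            continue

        trigger_frame, trigger_word = interim[hit]
        # Conversion is paused simply because this loop is synchronous:
        # nothing is decoded until the model has been replaced.
        lm = triggers[trigger_word]
        # Keep text up to and including the trigger (assumption).
        recognized.extend(word for _, word in interim[:hit + 1])
        # Rewind the audio using the frame/word correlation. The interim words
        # after the trigger are deleted by never being appended.
        pos = trigger_frame + 1
        # The loop resumes and re-converts the rewound frames with the new model.
    return recognized


audio = ["f_a", "f_rx", "f_a", "f_b"]
print(recognize(audio, GENERAL_LM, {"prescription": MEDICAL_LM}))
```

In this toy run, the frames after the trigger are first converted with the general model, discarded, and then converted again with the medical model, so the final text contains only the second model's reading of the rewound audio.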
8. At least one processor including a speech recognition engine specially programmed for speech recognition, the speech recognition engine comprising:
a speech recognizer, the speech recognizer being configured to receive audio and to convert the audio to interim text using at least one language model, wherein the speech recognizer converts the audio to first interim text using a first language model, and when converting the audio to the first interim text, the speech recognizer correlates portions of the audio with corresponding portions of the first interim text;
a text recognizer operationally coupled to the speech recognizer, the text recognizer being configured to receive the first interim text and to recognize whether the first interim text contains a trigger;
wherein when the text recognizer recognizes a trigger in the first interim text, the speech recognizer pauses the conversion of the audio to the first interim text, replaces the first language model with a second language model, rewinds the audio based on the correlation between the portions of the audio and the corresponding portions of the first interim text, deletes a given portion of the first interim text that corresponds to a rewound portion of the audio, and resumes the conversion of the audio to second interim text, wherein the speech recognition engine is configured such that the rewound portion of the audio is re-input into the speech recognizer and converted by the speech recognizer to the second interim text using only the second language model; and
wherein when the text recognizer does not recognize the trigger in the first interim text, the first interim text is provided as recognized text.
Dependent claims: 9, 10, 11
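The correlation between audio portions and interim text in claim 8 can be pictured as a list of time-aligned words. The following is only a sketch under assumptions: `AlignedWord`, `find_trigger`, and `rewind` are hypothetical names, the timestamps are invented, and the choice to resume immediately after the trigger phrase is one possible reading, not something the claim specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedWord:
    """One portion of interim text correlated with its portion of audio."""
    text: str
    start_ms: int
    end_ms: int


def find_trigger(interim, phrase):
    """Text recognizer: return the index of the last word of `phrase`, or None."""
    words = [w.text.lower() for w in interim]
    target = phrase.lower().split()
    if not target:
        return None
    n = len(target)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            return i + n - 1
    return None


def rewind(interim, trigger_end_index):
    """Split interim text at the trigger and compute where to resume the audio.

    Returns (kept_words, resume_ms). Words after the trigger are deleted from
    the interim text; resume_ms is the audio offset to re-input from.
    """
    kept = interim[:trigger_end_index + 1]
    deleted = interim[trigger_end_index + 1:]
    resume_ms = deleted[0].start_ms if deleted else kept[-1].end_ms
    return kept, resume_ms


# Invented example: "new prescription" acts as the trigger phrase.
interim = [
    AlignedWord("patient", 0, 400),
    AlignedWord("new", 400, 600),
    AlignedWord("prescription", 600, 1300),
    AlignedWord("a", 1300, 1450),
    AlignedWord("moxie", 1450, 1900),
]
end = find_trigger(interim, "new prescription")
kept, resume_ms = rewind(interim, end)
print([w.text for w in kept], resume_ms)
```

Storing start and end offsets per word is what makes the rewind cheap: once the text recognizer reports where the trigger ends, the speech recognizer can look up the matching audio offset directly instead of re-scanning the audio.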
Specification