Speech recognition method and system based on user personalized information
First Claim
1. A method, comprising:
receiving, by a processing device, a speech signal;
decoding, by the processing device, the speech signal according to a basic static decoding network to obtain a decoding path on each active node in the basic static decoding network, wherein the basic static decoding network is formed by extending words in a basic name language model into corresponding acoustic units, and wherein the basic name language model comprises a first statistical probability between two common words and a second statistical probability between a common word and a name;
generating, by the processing device, a user-specific name language model comprising a third statistical probability between the name and a user identifier;
building, by the processing device, an affiliated static decoding network associated with the user-specific name language model by extending words in the user-specific name language model into corresponding acoustic units, wherein building the affiliated static decoding network further comprises:
setting a first pronunciation of a first word at a beginning of a sentence in the user-specific name language model to a first virtual pronunciation;
setting a second pronunciation of a second word at an end of the sentence in the user-specific name language model to a second virtual pronunciation; and
extending a special pronunciation unit on an outgoing arc of a node corresponding to the beginning of the sentence and an incoming arc of the node corresponding to the end of the sentence to obtain the affiliated static decoding network associated with the user-specific name language model; and
responsive to identifying that a decoding path enters a name node in the basic static decoding network, extending, by the processing device, an extra network associated with the name node according to the affiliated static decoding network; and
returning, by the processing device, a recognition result after the decoding is completed, wherein a recognition accuracy rate for names is improved.
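The two language models recited in the claim can be illustrated with a small scoring example. This is only a sketch under stated assumptions, not the patented implementation: the `$NAME` class token, the bigram table, the user identifier `user_42`, the names and all probability values are hypothetical, and the "third statistical probability" is read here simply as the probability of a name given a user.

```python
import math

# Hypothetical basic name language model: bigram probabilities between
# common words, and between a common word and the class token "$NAME".
BASIC_BIGRAMS = {
    ("<s>", "call"): 0.5,
    ("call", "$NAME"): 0.8,
    ("$NAME", "now"): 0.4,
    ("now", "</s>"): 0.9,
}

# Hypothetical user-specific name models, keyed by user identifier.
USER_NAMES = {
    "user_42": {"zhang wei": 0.7, "li na": 0.3},
}


def sentence_log_prob(words, user_id):
    """Score a sentence; a full name is passed as a single token."""
    names = USER_NAMES[user_id]
    logp = 0.0
    prev = "<s>"
    for w in words + ["</s>"]:
        # A word found in the user's name list is mapped to the class token.
        token = "$NAME" if w in names else w
        logp += math.log(BASIC_BIGRAMS[(prev, token)])
        if token == "$NAME":
            # The user-specific model supplies the probability of this name.
            logp += math.log(names[w])
        prev = token
    return logp


score = sentence_log_prob(["call", "zhang wei", "now"], "user_42")
print(math.exp(score))
```

The basic model never needs to list individual names, so one basic model can be shared while each user contributes only a small name table.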
Abstract
The present invention relates to a speech recognition method and system based on user personalized information. The method comprises the following steps: receiving a speech signal; decoding the speech signal according to a basic static decoding network to obtain a decoding path on each active node in the basic static decoding network, wherein the basic static decoding network is a decoding network associated with a basic name language model; if a decoding path enters a name node in the basic static decoding network, extending the network at the name node according to an affiliated static decoding network of a user, wherein the affiliated static decoding network is a decoding network associated with a name language model of that particular user; and returning a recognition result after the decoding is completed. The invention may thereby raise the recognition accuracy rate for user personalized information in continuous speech recognition.
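The name-node extension described in the abstract can be sketched as a toy decoder. The example below is an illustrative simplification and not the patented method: it matches word tokens rather than acoustic units, keeps no scores or beam, and every node name, the `$NAME` slot label and the two users' networks are hypothetical.

```python
NAME_SLOT = "$NAME"  # hypothetical label marking a name node

# Hypothetical basic network: node -> list of (label, next_node).
BASIC_NET = {
    "start": [("call", "n1")],
    "n1": [(NAME_SLOT, "n2")],
    "n2": [("now", "end")],
    "end": [],
}

# Hypothetical affiliated networks, one per user.
AFFILIATED_NETS = {
    "user_42": {
        "a_start": [("zhang", "a1"), ("li", "a2")],
        "a1": [("wei", "a_end")],
        "a2": [("na", "a_end")],
        "a_end": [],
    },
    "user_7": {
        "a_start": [("bob", "a_end")],
        "a_end": [],
    },
}


def match_names(net, tokens, pos, node="a_start", prefix=()):
    """Yield (name, end_pos) for each affiliated path matching tokens[pos:]."""
    if node == "a_end":
        yield " ".join(prefix), pos
        return
    for label, nxt in net[node]:
        if pos < len(tokens) and tokens[pos] == label:
            yield from match_names(net, tokens, pos + 1, nxt, prefix + (label,))


def decode(tokens, user_id, node="start", pos=0):
    """Return the recognized units, or None if no path matches."""
    if node == "end":
        return [] if pos == len(tokens) else None
    for label, nxt in BASIC_NET[node]:
        if label == NAME_SLOT:
            # The path enters a name node: extend it with this user's network.
            for name, new_pos in match_names(AFFILIATED_NETS[user_id], tokens, pos):
                rest = decode(tokens, user_id, nxt, new_pos)
                if rest is not None:
                    return [("name", name)] + rest
        elif pos < len(tokens) and tokens[pos] == label:
            rest = decode(tokens, user_id, nxt, pos + 1)
            if rest is not None:
                return [("word", label)] + rest
    return None


print(decode("call zhang wei now".split(), "user_42"))
print(decode("call zhang wei now".split(), "user_7"))
```

The same utterance is accepted for one user and rejected for the other, because only the network consulted at the name node differs between them.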
16 Claims
1. A method, comprising:
receiving, by a processing device, a speech signal;
decoding, by the processing device, the speech signal according to a basic static decoding network to obtain a decoding path on each active node in the basic static decoding network, wherein the basic static decoding network is formed by extending words in a basic name language model into corresponding acoustic units, and wherein the basic name language model comprises a first statistical probability between two common words and a second statistical probability between a common word and a name;
generating, by the processing device, a user-specific name language model comprising a third statistical probability between the name and a user identifier;
building, by the processing device, an affiliated static decoding network associated with the user-specific name language model by extending words in the user-specific name language model into corresponding acoustic units, wherein building the affiliated static decoding network further comprises:
setting a first pronunciation of a first word at a beginning of a sentence in the user-specific name language model to a first virtual pronunciation;
setting a second pronunciation of a second word at an end of the sentence in the user-specific name language model to a second virtual pronunciation; and
extending a special pronunciation unit on an outgoing arc of a node corresponding to the beginning of the sentence and an incoming arc of the node corresponding to the end of the sentence to obtain the affiliated static decoding network associated with the user-specific name language model; and
responsive to identifying that a decoding path enters a name node in the basic static decoding network, extending, by the processing device, an extra network associated with the name node according to the affiliated static decoding network; and
returning, by the processing device, a recognition result after the decoding is completed, wherein a recognition accuracy rate for names is improved.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A speech recognition system based on user personalized information, comprising:
a memory; and
a processing device, communicatively coupled to the memory, to:
receive a speech signal;
decode the speech signal according to a basic static decoding network to obtain a decoding path on each active node in the basic static decoding network, wherein the basic static decoding network is formed by extending words in a basic name language model into corresponding acoustic units, and wherein the basic name language model comprises a first statistical probability between two common words and a second statistical probability between a common word and a name;
generate a user-specific name language model comprising a third statistical probability between the name and a user identifier;
build an affiliated static decoding network associated with the user-specific name language model by extending words in the user-specific name language model into corresponding acoustic units, wherein to build the affiliated static decoding network, the processing device is further to:
set a first pronunciation of a first word at a beginning of a sentence in the user-specific name language model to a first virtual pronunciation;
set a second pronunciation of a second word at an end of the sentence in the user-specific name language model to a second virtual pronunciation; and
extend a special pronunciation unit on an outgoing arc of a node corresponding to the beginning of the sentence and an incoming arc of the node corresponding to the end of the sentence to obtain the affiliated static decoding network associated with the user-specific name language model; and
responsive to identifying that a decoding path enters a name node in the basic static decoding network, extend an extra network associated with the name node according to the affiliated static decoding network; and
return a recognition result after the decoding is completed, wherein a recognition accuracy rate for names is improved.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
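The build step for the affiliated static decoding network can be illustrated as follows. This is a rough sketch of one possible reading of the claim language, not the patented construction: the lexicon, the acoustic unit symbols, the virtual pronunciation markers `#start` and `#end`, and the arc representation are all hypothetical.

```python
# Hypothetical pronunciation lexicon: word -> acoustic units.
LEXICON = {
    "zhang": ["zh", "ang"],
    "wei": ["w", "ei"],
    "li": ["l", "i"],
    "na": ["n", "a"],
}

# Hypothetical virtual pronunciations for the sentence-begin and
# sentence-end words; they carry no real acoustics.
VIRTUAL_START = "#start"
VIRTUAL_END = "#end"

ENTRY_NODE = 0  # node corresponding to the beginning of the sentence
EXIT_NODE = 1   # node corresponding to the end of the sentence


def build_affiliated_network(names):
    """Return arcs (src, dst, unit) for a network covering the given names."""
    arcs = []
    next_id = 2
    for name in names:
        # Extend each word into its acoustic units, bracketed by the
        # virtual units on the outgoing arc of the entry node and the
        # incoming arc of the exit node.
        units = [VIRTUAL_START]
        for word in name.split():
            units.extend(LEXICON[word])
        units.append(VIRTUAL_END)

        src = ENTRY_NODE
        for i, unit in enumerate(units):
            if i == len(units) - 1:
                dst = EXIT_NODE
            else:
                dst = next_id
                next_id += 1
            arcs.append((src, dst, unit))
            src = dst
    return arcs


for arc in build_affiliated_network(["zhang wei", "li na"]):
    print(arc)
```

Because every path starts and ends on a virtual unit at a fixed entry and exit node, a network of this shape can be attached at a name node without altering the surrounding basic network.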
Specification