Method for Speech Recognition on All Languages and for Inputing words using Speech Recognition

  • US 20110066434A1
  • Filed: 09/29/2009
  • Published: 03/17/2011
  • Est. Priority Date: 09/17/2009
  • Status: Active Grant
First Claim

1. A method for speech recognition on all languages and for inputting words using speech recognition, which provides speech recognition on all languages and a method to input words, comprising:

  • (1). a word may be in English, Chinese or any other language, and an unknown voice is the pronunciation of an unknown word; the invention needs m unknown (or known) voices and a database of commonly-used known words; each unknown voice has samples and all known words have no samples;

    (2). a pre-processor to delete noise and the time interval without speech signal;

    (3). a method to normalize the whole speech waveform of an unknown voice (or a word), using E equal elastic frames (windows) without filter and without overlap and to transform the waveform into an equal-sized E×P matrix of the linear predict coding cepstra (LPCC) such that the same unknown voices (or words) have about the same LPCC at the same time position in their equal-sized E×P matrices of LPCC;

    (4). for each unknown voice of m unknown voices, find the sample mean and variance of linear predict coding cepstra (LPCC), an E×P matrix of sample means and variances represents an unknown voice and an unknown voice represents a category of known words with similar pronunciation to the unknown voice;

    (5). a speaker with standard and clear utterance pronounces all words in the database and if the user pronounces using different languages or dialects or with special accents, let the user pronounce all words;

    (6). a method to normalize the whole speech waveform of a pronounced word, using E equal elastic frames (windows) without filter and without overlap to transform the waveform into an equal-sized E×P matrix of linear predict coding cepstra (LPCC);

    (7). a simplified Bayesian classifier to compare the E×P matrix of sample means and variances of LPCC of an unknown voice with the E×P matrix of linear predict coding cepstra (LPCC) of the pronounced word and use the Bayesian distance (similarity) to find the most similar unknown voice to the pronounced word, the pronounced word is put into the category of known words represented by its most similar unknown voice, all pronounced words are classified into m categories of known words, each category contains known words with similar pronunciation, a pronounced word may be classified into several categories;

    (8). a user pronounces a word, which is transformed into an E×P matrix of linear predict coding cepstra (LPCC);

    (9). the simplified Bayesian classifier finds the F most similar unknown voices for the pronounced word, i.e., the simplified Bayesian classifier uses the F least Bayesian distances (similarity) to the pronounced word to find the F most similar unknown voices;

    (10). all known words from the F categories represented by the F most similar unknown voices are arranged in decreasing similarity according to the (absolute) distances (similarity) of the E×P matrices of LPCC of the known words from the F categories to the matrix of LPCC of the pronounced word, the word pronounced by the user should be among the several top known words (left-hand side);

    (11). all known words in the F categories, after being arranged in decreasing similarity, are partitioned into several equal segments, each segment of known words is arranged in a line according to their alphabetic letters (or the number of strokes of a Chinese character), i.e., all known words in the F categories are arranged into a matrix according to their pronunciation similarity to the pronounced word and their alphabetic letters, so the pronounced word is easy to find in the matrix by using the pronunciation similarity and the alphabetic letters (the number of strokes in Chinese) of the word pronounced by the user;

    (12). a method to recognize a sentence or name;

    (13). a skill to help recognize unsuccessful words, unsuccessful sentences or names, and help input words;

    (14). the sample means and variances of m unknown voices are considered to be constants, which are independent of languages, accents, persons and sex, hence the recognition method of the invention is stable for all users and any user can easily use the invention to recognize and input a large number of words;

    (15). for the same word, a user can use any language (English, Chinese, Japanese, German, etc.) or any accent or any dialect to pronounce, even to pronounce incorrectly or completely wrong, the Bayesian classifier classifies the same word into several categories, hence a user can easily use the invention to recognize a word, a sentence or name and input a word.
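
Steps (3), (4), (7), (9) and (11) of the claim can be illustrated with a minimal Python sketch. It is not the patent's implementation. The claim does not say how the LPCC are computed or how the Bayesian distance is defined, so the sketch assumes the autocorrelation method with the Levinson-Durbin recursion for the LPC step, and a distance equal to the negative log-likelihood (up to an additive constant) of independent Gaussians with the stored means and variances. All function names (`elastic_frames`, `lpcc_matrix`, `mean_var`, `bayes_distance`, `top_f`, `arrange`) are hypothetical.

```python
import math

def elastic_frames(signal, E):
    """Step (3): split the whole waveform into E equal, non-overlapping frames."""
    n = len(signal)
    return [signal[i * n // E:(i + 1) * n // E] for i in range(E)]

def lpc(frame, P):
    """LPC coefficients a_1..a_P (assumed: autocorrelation + Levinson-Durbin)."""
    r = [sum(frame[t] * frame[t + k] for t in range(len(frame) - k))
         for k in range(P + 1)]
    a, err = [0.0] * (P + 1), r[0]
    for i in range(1, P + 1):
        if err <= 0.0:  # silent or degenerate frame: leave remaining a_i at 0
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]

def lpcc(frame, P):
    """P cepstra via c_n = a_n + sum_{k<n} (k/n) * c_k * a_{n-k}."""
    a, c = lpc(frame, P), []
    for n in range(1, P + 1):
        c.append(a[n - 1] + sum((k / n) * c[k - 1] * a[n - k - 1]
                                for k in range(1, n)))
    return c

def lpcc_matrix(signal, E, P):
    """E x P matrix of LPCC, one row per elastic frame."""
    return [lpcc(frame, P) for frame in elastic_frames(signal, E)]

def mean_var(matrices, floor=1e-8):
    """Step (4): element-wise sample mean and variance over >= 2 E x P matrices."""
    n, E, P = len(matrices), len(matrices[0]), len(matrices[0][0])
    mean = [[sum(m[e][p] for m in matrices) / n for p in range(P)]
            for e in range(E)]
    var = [[max(sum((m[e][p] - mean[e][p]) ** 2 for m in matrices) / (n - 1), floor)
            for p in range(P)] for e in range(E)]
    return mean, var

def bayes_distance(x, mean, var):
    """Step (7), assumed form: sum of 0.5*ln(var) + 0.5*(x - mean)^2 / var."""
    return sum(0.5 * math.log(v) + 0.5 * (xi - m) ** 2 / v
               for xr, mr, vr in zip(x, mean, var)
               for xi, m, v in zip(xr, mr, vr))

def top_f(x, models, F):
    """Step (9): names of the F unknown voices with the smallest distance to x."""
    return sorted(models, key=lambda name: bayes_distance(x, *models[name]))[:F]

def arrange(ranked_words, width):
    """Step (11): cut the similarity-ranked list into equal segments, sort each."""
    return [sorted(ranked_words[i:i + width])
            for i in range(0, len(ranked_words), width)]
```

Under these assumptions, `models` maps each unknown voice to the `(mean, var)` pair returned by `mean_var`, and a call such as `top_f(lpcc_matrix(samples, E, P), models, F)` would return the F categories whose known words are then ranked and laid out with `arrange`. The values of E, P and F are left to the caller because the claim does not fix them.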
