Speech recognition method on sentences in all languages

US 20120116764A1
Filed: 11/09/2010
Published: 05/10/2012
Est. Priority Date: 11/09/2010
Status: Abandoned Application


0 Assignments

0 Petitions

Abstract

A speech recognition method for sentences in all languages is provided. A sentence can be a word, a name or a sentence. Every sentence is represented by an E×P = 12×12 matrix of linear predict coding cepstra (LPCC). 1000 different voices are transformed into 1000 LPCC matrices that represent 1000 databases. The E×P matrix of each known sentence, after deletion of the time intervals between words, is put into its closest database. To classify an unknown sentence, the distance is used to find its F closest databases, and the known sentences in those F databases are then searched for the one that matches the unknown sentence. The invention needs no samples and can find a sentence in one second using Visual Basic. Any person without training can immediately and freely communicate with a computer in any language. It can recognize up to 7200 English words, 500 sentences of any language and 500 Chinese words.
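The enrollment and two-stage lookup described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the application's implementation: the names `SentenceIndex`, `add_known`, `closest_databases` and `classify` are hypothetical, the distance is a plain (unweighted) Euclidean distance, and the toy matrices are 2×2 rather than 12×12.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-sized matrices given as lists of rows."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

class SentenceIndex:
    """Hypothetical container: one bucket ("database") per reference voice matrix."""

    def __init__(self, voice_matrices):
        self.voices = voice_matrices
        self.databases = [[] for _ in voice_matrices]

    def closest_databases(self, matrix, f=1):
        """Indices of the f reference voices nearest to the given matrix."""
        order = sorted(range(len(self.voices)),
                       key=lambda i: distance(matrix, self.voices[i]))
        return order[:f]

    def add_known(self, label, matrix):
        """Enrollment: store a known sentence in its single closest database."""
        idx = self.closest_databases(matrix, 1)[0]
        self.databases[idx].append((label, matrix))
        return idx

    def classify(self, matrix, f=3):
        """Search only the f closest databases and return the nearest known label."""
        candidates = [entry
                      for i in self.closest_databases(matrix, f)
                      for entry in self.databases[i]]
        if not candidates:
            return None  # nothing enrolled nearby: the sentence is not identified
        return min(candidates, key=lambda entry: distance(matrix, entry[1]))[0]

# Toy usage with three reference voices and two enrolled sentences.
voices = [[[0.0, 0.0], [0.0, 0.0]],
          [[10.0, 10.0], [10.0, 10.0]],
          [[-10.0, -10.0], [-10.0, -10.0]]]
index = SentenceIndex(voices)
index.add_known("hello", [[1.0, 1.0], [1.0, 1.0]])
index.add_known("goodbye", [[9.0, 9.0], [9.0, 9.0]])
print(index.classify([[1.2, 0.9], [1.0, 1.0]], f=2))  # prints "hello"
```

Restricting the second search to F buckets is what keeps the lookup small: only the known sentences stored near the unknown one are compared. When `classify` returns `None` or a wrong label, calling `add_known` with a fresh pronunciation of the sentence corresponds to the re-enrollment step the claims describe.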

38 Citations

3 Claims

1. A speech recognition method on sentences in all languages comprising:

(1) a sentence can be a syllable, a word, a name or a sentence, and M=1000 different voices are prepared;

(2) a pre-processor to delete noise and all time intervals without real signal sampled points, before and after a voice (sentence), between two syllables and two words;

(3) a method to normalize the whole waveform of real signal sampled points of a voice (sentence), using E equal elastic frames (windows) without filter and without overlap over each other, and to transform the whole waveform of real signal sampled points into an equal-sized E×P matrix of the linear predict coding cepstra (LPCC);

(4) M=1000 different voices are transformed into 1000 different E×P matrices of linear predict coding cepstra (LPCC) to represent 1000 different databases;

(5) a user pronounces a known sentence, delete noise and all time intervals without real language signal points, before and after the known sentence, between two syllables and two words, and E=12 equal elastic frames normalize the whole waveform of real language signal points into an E×P matrix of LPCC;

(6) use the distance or weighted distance between the E×P matrix of LPCC of the known sentence and 1000 different E×P matrices of LPCC of 1000 different voices representing 1000 different databases to find its closest database, the E×P matrix of the known sentence is put into its closest database, and similarly, the E×P matrices of LPCC of all known sentences are put into their closest databases individually;

(7) to classify an unknown sentence, after deletion of noise and time intervals without language signal points, before and after the unknown sentence, between two syllables and two words, the unknown sentence with real language sampled points is transformed into an E×P matrix of LPCC, the invention uses the distance or weighted distance between the E×P matrix of LPCC of the unknown sentence and 1000 different E×P matrices of LPCC of 1000 different voices representing 1000 different databases to find its F closest databases and again uses the distance or weighted distance between the E×P matrix of LPCC of the unknown sentence and the E×P matrices of LPCC of the similar known sentences in its F closest databases to find a known sentence to be the unknown sentence; and

(8) if an unknown sentence is not identified, the unknown sentence is pronounced again, its E×P matrix of LPCC is put into the new closest database, and then it will be identified correctly.
2. The speech recognition method on sentences in all languages of claim 1 wherein said step (2) further includes two methods to delete noise and time intervals without real signal sampled points, before and after a voice (sentence), between two syllables and two words:

(a) in a small unit time interval, compute the variance of sampled points in the unit time interval and if the variance is less than the variance of noise, delete the small unit time interval; and

(b) in a small unit time interval, compute the total sum of absolute distances between two consecutive sampled points and if the total sum of absolute distances is less than that of noise, delete the small unit time interval.

3. The speech recognition method on sentences in all languages of claim 1 wherein said step (3) further includes a method for normalization of the signal waveform of a voice or a sentence into an equal-sized E×P matrix of linear predict coding cepstra (LPCC) using E equal elastic frames (windows) without filter and without overlap over each other;

(a) a method is used to uniformly and equally partition the whole waveform of a voice or a sentence into E equal sections, the length of each equal section is proportional to the whole waveform of a sentence (voice) and each equal section forms an elastic frame (window) without filter and without overlap over each other such that E equal elastic frames can contract and expand themselves to cover the whole waveform;

(b) in each equal elastic frame, use a linear regression model to estimate the nonlinear time-varying waveform to produce a set of P=12 regression coefficients, i.e., 12 linear predict coding (LPC) coefficients by the least squares method;

(c) use Durbin's recursive equations
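The pre-processing in claim 2 and the framing in claim 3 can be sketched as below. This is a hedged illustration, not the application's code: every function name, the `unit` interval length and the `noise_level` threshold are assumptions. Because the claim text breaks off at step (c), the recursion shown is the standard textbook Levinson-Durbin solution of the autocorrelation form of the least-squares fit, followed by the standard LPC-to-cepstrum recursion; the application's own equations may differ.

```python
def variance(chunk):
    """Population variance of the samples in one unit interval (claim 2(a))."""
    mean = sum(chunk) / len(chunk)
    return sum((s - mean) ** 2 for s in chunk) / len(chunk)

def total_abs_diff(chunk):
    """Sum of absolute differences between consecutive samples (claim 2(b))."""
    return sum(abs(b - a) for a, b in zip(chunk, chunk[1:]))

def delete_silence(samples, unit, noise_level, measure=variance):
    """Keep only the unit intervals whose measure is at least the noise level."""
    kept = []
    for start in range(0, len(samples), unit):
        chunk = samples[start:start + unit]
        if measure(chunk) >= noise_level:
            kept.extend(chunk)
    return kept

def elastic_frames(samples, e=12):
    """Partition the waveform into e contiguous, non-overlapping frames (claim 3(a))."""
    n = len(samples)
    bounds = [i * n // e for i in range(e + 1)]
    return [samples[bounds[i]:bounds[i + 1]] for i in range(e)]

def autocorrelation(frame, p):
    """Autocorrelation lags r[0..p] of one frame, with no window function applied."""
    return [sum(frame[t] * frame[t - k] for t in range(k, len(frame)))
            for k in range(p + 1)]

def levinson_durbin(r, p):
    """Textbook Levinson-Durbin: p predictor coefficients from lags r[0..p]."""
    a = [0.0] * (p + 1)
    error = r[0]
    for i in range(1, p + 1):
        if error <= 0:  # silent or degenerate frame: remaining coefficients stay 0
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        error *= 1.0 - k * k
    return a[1:]

def lpc_to_cepstrum(a):
    """Standard recursion c_m = a_m + sum_{k<m} (k/m) c_k a_{m-k}."""
    c = []
    for m in range(1, len(a) + 1):
        acc = a[m - 1]
        for k in range(1, m):
            acc += (k / m) * c[k - 1] * a[m - k - 1]
        c.append(acc)
    return c

def lpcc_matrix(samples, e=12, p=12):
    """E x P matrix: one row of p cepstral coefficients per elastic frame."""
    return [lpc_to_cepstrum(levinson_durbin(autocorrelation(frame, p), p))
            for frame in elastic_frames(samples, e)]

# Toy usage: the two all-zero intervals are deleted, the active one is kept.
signal = [0, 0, 0, 0, 5, -5, 5, -5, 0, 0, 0, 0]
print(delete_silence(signal, unit=4, noise_level=1.0))  # prints [5, -5, 5, -5]
```

Because the frame boundaries scale with the length of the waveform, a short and a long pronunciation both yield the same E×P shape, which is what makes the matrices directly comparable by distance.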

Current Assignee: Lee L.I. Tai-Jan, Li-Chuan Liao, Shih-Hon Li, Shih-Tzung Li, Tze Fen Li
Original Assignee: Lee L.I. Tai-Jan, Li-Chuan Liao, Shih-Hon Li, Shih-Tzung Li, Tze Fen Li
Inventors: Li, Tze Fen; Li, Shih-Tzung; Li, Shih-Hon; Liao, Li-Chuan; Lee Li, Tai-Jan
Application Number: US12/926,301
Publication Number: US 20120116764A1
US Class Current: 704/251
CPC Class Codes: G10L 15/005 Language recognition
