Voice enabled knowledge system

US 20070124142A1
Filed: 11/25/2005
Published: 05/31/2007
Est. Priority Date: 11/25/2005
Status: Abandoned Application

Abstract

This invention discloses a voice enabled knowledge system comprising a speech recognition engine and a text to speech engine. The speech recognition engine comprises a representation unit to represent the spoken words, a model classification unit to classify the spoken words, a training database to match the spoken words with preset words, and a search unit to search for the spoken word in said training database based on the results of said model classification. The text to speech engine, which converts input text to speech, comprises a text pre-processing unit for analyzing the input text in sentence form, a prosody unit for word recognition using said acoustic model, a concatenation unit for converting the diphone equivalents into words and thereafter into a sentence, and an audio output device for speech output.
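
The recognition path named in the abstract (represent, classify, then search the training database using the classification result) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the application: the two-number feature vectors, the `Entry` record, the contents of `TRAINING_DB`, and the nearest-centroid classifier are all hypothetical stand-ins for whatever the representation and model classification units actually do.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Entry:
    word: str        # preset word stored in the training database
    label: str       # class assigned to the word (hypothetical labels)
    features: tuple  # toy feature vector standing in for the representation


# Hypothetical training database; all values are invented for illustration.
TRAINING_DB = [
    Entry("open", "command", (0.9, 0.1)),
    Entry("close", "command", (0.8, 0.3)),
    Entry("seven", "digit", (0.1, 0.9)),
    Entry("nine", "digit", (0.2, 0.7)),
]


def classify(features):
    """Model classification: return the label whose centroid is nearest."""
    by_label = {}
    for entry in TRAINING_DB:
        by_label.setdefault(entry.label, []).append(entry.features)
    centroids = {
        label: tuple(sum(column) / len(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }
    return min(centroids, key=lambda label: dist(centroids[label], features))


def search(features):
    """Search unit: compare only against entries in the predicted class."""
    label = classify(features)
    candidates = [e for e in TRAINING_DB if e.label == label]
    return min(candidates, key=lambda e: dist(e.features, features)).word
```

With this toy data, `search((0.85, 0.15))` is first classified as `command` and then resolved to the preset word `open`. Restricting the search to the predicted class is one plausible reading of "based on the results of said model classification"; the application may intend something different.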

15 Claims

1. A system for converting speech to text comprising:
- a speech recognition engine for understanding the spoken words of a user, further comprising;
  
  a representation unit to represent the spoken words;
  
  a model classification unit to classify the spoken words;
  
  a training database to match the spoken words with preset words, and a search unit to search for the spoken word in said training database, based on the results of said model classification.

2. A system for converting text to speech comprising:
- a text to speech engine for understanding the spoken words of a user, further comprising;
  
  a text pre-processing unit for analyzing the input text in a sentence form;
  
  a prosody unit for word recognition using said acoustic model;
  
  a concatenation unit for converting the diphone equivalents into words and thereafter into a sentence; and
  
  an audio output device for speech output.

3. A voice enabled knowledge system, comprising:
- a speech recognition engine for understanding the spoken words of a user, further comprising;
  
  a representation unit to represent the spoken words;
  
  a model classification unit to classify the spoken words;
  
  a training database to match the spoken words with preset words, a search unit to search for the spoken word in said training database, based on the results of said model classification; and
  
  a text to speech engine for conversion of an input text to speech, further comprising;
  
  a text pre-processing unit for analyzing the input text in a sentence form;
  
  a prosody unit for word recognition using said acoustic model;
  
  a concatenation unit for converting the diphone equivalents into words and thereafter into a sentence; and
  
  an audio output device for speech output.
- - 4. The tool to audio enable the documents of claim 3, wherein the training database further comprises:
    - an acoustic model to recognize the pitch and flow of the spoken word;
      
      a lexical model to recognize the punctuations of the spoken word; and
      
      a language model for information classification.
  - 5. The voice enabled knowledge system of claim 3, wherein the text pre-processing unit further comprises:
    - a number converter to convert numbers to their textual equivalents;
      
      an acronym converter to replace acronyms with their single letter components and convert abbreviations to their textual equivalents;
      
      a word-segmenter to fragment sentences created from said input text into words;
      
      a word to diphone translator to convert said words to their diphone equivalents;
      
      a diphone dictionary to map diphones with the words; and
      
      a multi level data structure for storing the diphone equivalents of the input text.
  - 6. The voice enabled knowledge system of claim 3, wherein the prosody unit further comprises:
    - a diphone retrieval unit for retrieval of said diphone equivalents;
      
      a diphone dictionary to choose the word corresponding to its diphone equivalent; and
      
      an acoustic manipulation unit for recognition of appropriate file format.
  - 7. The voice enabled knowledge system of claim 3, wherein said document includes hyper-text markup language documents.
  - 8. The voice enabled knowledge system of claim 3, further comprising a summarizer to prepare and play the summary of an input request.
  - 9. The voice enabled knowledge system of claim 3, wherein the text to speech engine reads out text highlighted on a document by a user.
  - 10. The voice enabled knowledge system of claim 3, wherein the voice enabled knowledge system edits text documents.
  - 11. The voice enabled knowledge system of claim 3, wherein the voice enabled knowledge system is installed in personal digital assistants, mobile devices and personal computers.
  - 12. The voice enabled knowledge system of claim 3, wherein the speech recognition engine interprets the user's tone, pitch, accent and other speech characteristics.
  - 13. The voice enabled system of claim 3, wherein the voice enabled system reads all the pages from a Microsoft word document and all the slides from a Microsoft power point file even while only one page or slide is visible in the active window.
  - 14. The voice enabled system of claim 3, wherein the voice enabled system searches the world-wide web using voice commands and to create voice enabled business critical information, data entry forms and electronic commerce applications.
  - 15. The voice enabled system of claim 3, wherein the voice enabled system provides a voice tune up process, wherein the pronunciation and dictation can be fine tuned during voice recognition.
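
The first three elements of the text pre-processing unit in claim 5 (number converter, acronym converter, word-segmenter) can be sketched as below. This is a minimal illustration under stated assumptions, not the applicant's implementation: the `ABBREVIATIONS` table is hypothetical, the number converter covers only 0 to 99, acronyms are detected simply as all-uppercase tokens, and the sketch stops before the word to diphone translation because no diphone dictionary is given in the claims.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

# Hypothetical abbreviation table; entries are invented for illustration.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street"}


def number_to_words(n):
    """Number converter: spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def expand_token(token):
    """Expand one whitespace-delimited token to its textual equivalent."""
    if token.lower() in ABBREVIATIONS:
        return ABBREVIATIONS[token.lower()]
    core = token.strip(".,;:!?")  # drop surrounding punctuation
    if core.isdecimal() and int(core) < 100:
        return number_to_words(int(core))
    if len(core) > 1 and core.isalpha() and core.isupper():
        return " ".join(core)  # acronym: read out letter by letter
    return core


def preprocess(text):
    """Word-segmenter: expand every token, then split into lowercase words."""
    expanded = " ".join(expand_token(token) for token in text.split())
    return expanded.lower().split()
```

For example, `preprocess("Dr. Rao owns 42 PDA units.")` expands the abbreviation, the number and the acronym before segmenting, so the acronym reaches later stages as the separate letters `p`, `d`, `a`.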

Current Assignee: Santosh Mukherjee
Original Assignee: Santosh Mukherjee
Inventors: Mukherjee, Santosh
Application Number: US11/287,139
Publication Number: US 20070124142A1
US Class Current: 704/235
CPC Class Codes:
G10L 13/10 Prosody rules derived from ...
G10L 15/26 Speech to text systems G10L...