Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition

US 20050273337A1
Filed: 06/02/2004
Published: 12/08/2005
Est. Priority Date: 06/02/2004
Status: Abandoned Application

Abstract

When a speaker-independent voice-recognition (SIVR) system recognizes a spoken utterance that matches a phonetic representation of a speech element belonging to a predefined vocabulary, it may play a synthesized speech fragment as a means for the user to verify that the utterance was correctly recognized. When a speech element in the vocabulary has more than one possible pronunciation, the system may select the one most closely matching the user's utterance, and play a synthesized speech fragment corresponding to that particular representation.
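
The selection step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not in the source: the utterance is taken to be already decoded into a phoneme sequence, closeness is measured with a plain edit distance (a real recognizer would more likely score acoustic features against models), and `VOCABULARY`, `select_representation`, and `synthesize` are hypothetical names with illustrative ARPAbet-like symbols.

```python
from typing import Dict, List, Sequence, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical vocabulary: each speech element maps to one or more
# phonetic representations (illustrative symbols, not from the source).
VOCABULARY: Dict[str, List[Phonemes]] = {
    "tomato": [
        ("T", "AH", "M", "EY", "T", "OW"),
        ("T", "AH", "M", "AA", "T", "OW"),
    ],
    "call": [("K", "AO", "L")],
}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def select_representation(
    utterance: Sequence[str], vocabulary: Dict[str, List[Phonemes]]
) -> Tuple[str, Phonemes]:
    """Return the (speech element, pronunciation variant) closest to the utterance."""
    best = None
    for element, variants in vocabulary.items():
        for variant in variants:
            score = edit_distance(utterance, variant)
            if best is None or score < best[0]:
                best = (score, element, variant)
    if best is None:
        raise ValueError("vocabulary is empty")
    return best[1], best[2]


def synthesize(representation: Phonemes) -> str:
    """Stand-in for a speech synthesizer: returns the phoneme string it would speak."""
    return " ".join(representation)


element, variant = select_representation(
    ("T", "AH", "M", "AA", "T", "OW"), VOCABULARY
)
print(element, "->", synthesize(variant))
```

The point of the sketch is that the playback follows the variant the user actually spoke, rather than a single canonical pronunciation of the matched word.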

24 Claims

1. A method comprising:
- selecting one of a plurality of phonetic representations of speech elements of a predefined vocabulary that most closely matches an utterance, wherein said plurality of phonetic representations includes multiple phonetic representations of any of said speech elements having different possible pronunciations; and
- synthesizing an audible speech fragment according to said one of said phonetic representations.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1, further comprising:
    - storing said phonetic representations.
  - 3. The method of claim 1, further comprising:
    - generating said phonetic representations from textual representations of said speech elements.
  - 4. The method of claim 1, further comprising:
    - displaying information identifying the speech element represented by said one of said phonetic representations that most closely matches said utterance.
  - 5. The method of claim 1, further comprising:
    - performing a predetermined action associated with one of said speech elements.
  - 6. The method of claim 2, wherein storing said phonetic representations further comprises storing said phonetic representations as a word graph.
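
Claim 6 refers to storing the phonetic representations as a word graph. One simple way to picture this, offered only as a sketch, is a prefix tree in which pronunciation variants that begin with the same phonemes share nodes and each path from the root to a marked node spells out one representation. The claims do not specify the graph layout, so the `Node` class and the helper names below are assumptions.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Phonemes = Tuple[str, ...]


class Node:
    """One node of a hypothetical word graph (here, a prefix tree)."""

    def __init__(self) -> None:
        self.edges: Dict[str, "Node"] = {}
        # Set on nodes where a pronunciation ends. Homophones would
        # overwrite each other in this simplified sketch.
        self.element: Optional[str] = None


def build_word_graph(vocabulary: Dict[str, List[Phonemes]]) -> Node:
    """Insert every pronunciation variant, sharing common prefixes."""
    root = Node()
    for element, variants in vocabulary.items():
        for variant in variants:
            node = root
            for phoneme in variant:
                node = node.edges.setdefault(phoneme, Node())
            node.element = element
    return root


def paths(node: Node, prefix: Phonemes = ()) -> Iterator[Tuple[str, Phonemes]]:
    """Yield (speech element, representation) for every stored path."""
    if node.element is not None:
        yield node.element, prefix
    for phoneme, child in node.edges.items():
        yield from paths(child, prefix + (phoneme,))


def count_nodes(node: Node) -> int:
    """Count the nodes in the graph, including the root."""
    return 1 + sum(count_nodes(child) for child in node.edges.values())


vocabulary = {
    "tomato": [
        ("T", "AH", "M", "EY", "T", "OW"),
        ("T", "AH", "M", "AA", "T", "OW"),
    ],
}
graph = build_word_graph(vocabulary)
for element, representation in paths(graph):
    print(element, " ".join(representation))
```

In this example the two variants share the first three phonemes, so the graph needs fewer nodes than two separately stored sequences would.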

7. An apparatus comprising:
- a processor to select one of a plurality of phonetic representations of speech elements of a predefined vocabulary that most closely matches a portion of an incoming digitized voice signal corresponding to an utterance, wherein said plurality of phonetic representations includes multiple phonetic representations of any of said speech elements having different possible pronunciations, and to synthesize an outgoing digitized voice signal according to said one of said phonetic representations.
- Dependent claims (8, 9, 10, 11, 12, 13):
  - 8. The apparatus of claim 7, further comprising:
    - a memory to store said phonetic representations.
  - 9. The apparatus of claim 8, wherein said memory is to store said phonetic representations as a word graph.
  - 10. The apparatus of claim 7, wherein said processor is to generate said phonetic representations from textual representations of said speech elements.
  - 11. The apparatus of claim 10, further comprising:
    - an input device to allow entry of said textual representations.
  - 12. The apparatus of claim 7, further comprising:
    - a display, wherein said processor is to show on said display information identifying the speech element represented by said one of said phonetic representations that most closely matches said utterance.
  - 13. The apparatus of claim 7, wherein said processor is to initiate a predetermined action associated with one of said speech elements.
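
Claims 10 and 11 describe generating the phonetic representations from text that the user enters. The claims do not say how the conversion is done. The sketch below assumes a small pronunciation lexicon that may list several variants per word, with a deliberately naive letter-by-letter fallback for unknown words. `LEXICON`, `LETTER_TO_SOUND`, and `phonetize` are hypothetical, and the letter mapping is far too crude for real use.

```python
from typing import Dict, List, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical lexicon with multiple variants for one entry.
LEXICON: Dict[str, List[Phonemes]] = {
    "either": [("IY", "DH", "ER"), ("AY", "DH", "ER")],
}

# Naive one-letter-to-one-sound table, for illustration only.
LETTER_TO_SOUND: Dict[str, str] = dict(
    zip(
        "abcdefghijklmnopqrstuvwxyz",
        "AE B K D EH F G HH IH JH K L M N AA P K R S T AH V W K Y Z".split(),
    )
)


def phonetize(text: str) -> List[Phonemes]:
    """Return every known phonetic representation for a textual entry.

    Lexicon entries may yield several variants; unknown words get a
    single letter-by-letter guess, ignoring non-alphabetic characters.
    """
    word = text.strip().lower()
    if word in LEXICON:
        return list(LEXICON[word])
    guess = tuple(LETTER_TO_SOUND[ch] for ch in word if ch in LETTER_TO_SOUND)
    return [guess]


print(phonetize("Either"))
print(phonetize("bob"))
```

Returning a list even for single-variant words keeps the later matching step uniform: it always iterates over candidate representations.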

14. A voice-operated, mobile cellular telephone comprising:
- a transceiver;
- an antenna; and
- a processor to select one of a plurality of phonetic representations of speech elements of a predefined vocabulary that most closely matches a portion of an incoming digitized voice signal corresponding to an utterance, wherein said plurality of phonetic representations includes multiple phonetic representations of any of said speech elements having different possible pronunciations, and to synthesize an outgoing digitized voice signal according to said one of said phonetic representations.
- Dependent claims (15, 16, 17, 18, 19, 20):
  - 15. The voice-operated, mobile cellular telephone of claim 14, further including:
    - a memory to store said phonetic representations.
  - 16. The voice-operated, mobile cellular telephone of claim 15, wherein said memory is to store said phonetic representations as a word graph.
  - 17. The voice-operated, mobile cellular telephone of claim 14, wherein said processor is to generate said phonetic representations from textual representations of said speech elements.
  - 18. The voice-operated, mobile cellular telephone of claim 17, further including:
    - an input device to allow entry of said textual representations.
  - 19. The voice-operated, mobile cellular telephone of claim 14, wherein said processor is to initiate a predetermined action associated with one of said speech elements.
  - 20. The voice-operated, mobile cellular telephone of claim 19, wherein said predetermined action further includes commanding said transceiver to establish a connection with a specified distant party.
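
Claims 19 and 20 tie a recognized speech element to a predetermined action, such as commanding the transceiver to establish a connection. A minimal sketch of that dispatch follows, assuming the recognition step has already produced the matched element. `FakeTransceiver`, the phonebook contents, and the choice to play the confirmation before acting are illustrative assumptions, not details from the claims.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakeTransceiver:
    """Stand-in for phone hardware: records requested connections."""

    calls: List[str] = field(default_factory=list)

    def connect(self, number: str) -> None:
        self.calls.append(number)


def make_actions(
    transceiver: FakeTransceiver, phonebook: Dict[str, str]
) -> Dict[str, Callable[[], None]]:
    """Map each speech element to a predetermined action (dialing its number)."""
    # The default argument binds each number at definition time.
    return {
        name: (lambda number=number: transceiver.connect(number))
        for name, number in phonebook.items()
    }


def handle_recognized(
    element: str,
    actions: Dict[str, Callable[[], None]],
    confirm: Callable[[str], None],
) -> bool:
    """Play a confirmation for the element, then run its action if one exists."""
    confirm(element)
    action = actions.get(element)
    if action is None:
        return False
    action()
    return True


transceiver = FakeTransceiver()
actions = make_actions(transceiver, {"alice": "555-0100"})
spoken: List[str] = []
handled = handle_recognized("alice", actions, spoken.append)
print(handled, spoken, transceiver.calls)
```

Here `confirm` stands in for the synthesized audible response; in a device it would receive the selected phonetic representation rather than the element name.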

21. An article comprising a computer-readable storage medium having stored thereon instructions that, when executed by a processor, result in:
- selecting one of a plurality of phonetic representations of speech elements of a predefined vocabulary that most closely matches an utterance, wherein said plurality of phonetic representations includes multiple phonetic representations of any of said speech elements having different possible pronunciations; and
- synthesizing an audible speech fragment according to said one of said phonetic representations.
- Dependent claims (22, 23, 24):
  - 22. The article of claim 21, wherein said instructions further result in:
    - storing said phonetic representations.
  - 23. The article of claim 21, wherein said instructions further result in:
    - storing said phonetic representations as a word graph.
  - 24. The article of claim 21, wherein said instructions further result in:
    - generating said phonetic representations from textual representations of said speech elements.

Current Assignee: Marvell World Trade Limited (Marvell Technology Group Limited)
Original Assignee: Marvell World Trade Limited (Marvell Technology Group Limited)
Inventors: Erell, Adoram; Melzer, Ezer
Application Number: US10/857,848
Publication Number: US 20050273337A1
US Class Current: 704/260
CPC Class Codes:
G10L 13/08 Text analysis or generation...
G10L 15/07 to the speaker
