
Text-independent speaker recognition system and method based on acoustic segment matching

  • US 4,773,093 A
  • Filed: 12/31/1984
  • Issued: 09/20/1988
  • Est. Priority Date: 12/31/1984
  • Status: Expired due to Term
First Claim

1. A speaker recognition system for automatically recognizing a given speaker from a group of enrolled speakers where said system can select a given enrolled speaker from said other enrolled speakers, comprising:

  • enrollment means including first acoustic analysis means for enabling each speaker to provide an input speech training utterance for converting said input speech utterance into frames of equal duration by providing at an output a parametric representation of each frame,
  • covering analysis means coupled to said output of said acoustic analysis means for dividing said parametric representation to shorter, equal length segments indicative of sub-word units and providing at an output a subset of said segments that represent said training utterance, with said subset of segments representing an initial template set for each enrolled speaker,
  • template storage means for storing said initial template set,
  • aligning means coupled to said template set storing means for aligning each template frame with at least one frame of said input speech utterance to provide a label for each utterance frame as aligned with a template frame,
  • frame averaging means coupled to said aligning means for averaging all input speech utterance frames that were aligned with each template frame,
  • template update means coupled to said frame averaging means and said template set storing means to replace each template set as stored with the corresponding average of said utterance frames to provide a new set of stored templates,
  • recognition means including second acoustic analysis means for enabling a speaker to be recognized to speak and for dividing said speech into said equal duration frames by providing at an output a parametric representation of each frame, and
  • means for matching said new set of stored templates for each enrolled speaker with said parametric representation of each frame to provide at an output a match score for each enrolled speaker, and
  • means responsive to the minimum match score to identify said one of said enrolled speakers who is speaking.
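
The enrollment and recognition steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes frames are already available as plain feature vectors (the acoustic analysis is omitted), uses Euclidean distance, stands in a simple greedy selection for the covering analysis, and aligns each utterance frame to its nearest template frame. All function names, the `seg_len` and `radius` parameters, and the frame-level match score are hypothetical choices made for illustration only.

```python
import math


def euclid(a, b):
    """Euclidean distance between two frames (feature vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def split_segments(frames, seg_len):
    """Divide the frame sequence into equal-length segments (tail is dropped)."""
    return [frames[i:i + seg_len]
            for i in range(0, len(frames) - seg_len + 1, seg_len)]


def seg_distance(a, b):
    """Mean frame-by-frame distance between two equal-length segments."""
    return sum(euclid(x, y) for x, y in zip(a, b)) / len(a)


def cover(segments, radius):
    """Assumed covering step: keep a segment only if no already-kept
    segment lies within `radius` of it."""
    chosen = []
    for seg in segments:
        if all(seg_distance(seg, c) > radius for c in chosen):
            chosen.append(seg)
    return chosen


def update_templates(template_frames, frames):
    """Label each utterance frame with its nearest template frame, then
    replace each template frame by the average of the frames it attracted."""
    buckets = [[] for _ in template_frames]
    for f in frames:
        label = min(range(len(template_frames)),
                    key=lambda k: euclid(f, template_frames[k]))
        buckets[label].append(f)
    updated = []
    for t, bucket in zip(template_frames, buckets):
        if bucket:
            updated.append([sum(col) / len(bucket) for col in zip(*bucket)])
        else:
            updated.append(list(t))  # assumption: keep unmatched templates
    return updated


def enroll(frames, seg_len=2, radius=1.0):
    """Build the initial template set, then refine it once by averaging."""
    segs = cover(split_segments(frames, seg_len), radius)
    template_frames = [f for seg in segs for f in seg]
    return update_templates(template_frames, frames)


def match_score(template_frames, frames):
    """Average distance from each input frame to its nearest template frame."""
    return sum(min(euclid(f, t) for t in template_frames)
               for f in frames) / len(frames)


def identify(models, frames):
    """Return the enrolled speaker with the minimum match score."""
    return min(models, key=lambda name: match_score(models[name], frames))


# Toy data: two "speakers" with well-separated 2-D frames.
models = {
    "alice": enroll([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]),
    "bob": enroll([[5.0, 5.0], [5.1, 5.0], [6.0, 6.0], [6.1, 6.0]]),
}
print(identify(models, [[0.05, 0.0], [1.05, 1.0]]))
```

In this toy setup the recognition input needs no particular text: it is scored only by how close its frames fall to each speaker's stored templates, which is the sense in which the approach is text-independent. A real system would match at the segment level on speech-derived parameters rather than on hand-made 2-D points.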
