Method and System for Masking Speech

US 20060247924A1
Filed: 07/12/2006
Published: 11/02/2006
Est. Priority Date: 07/24/2002
Status: Active Grant

Abstract

A simple and efficient method is disclosed for producing an obfuscated speech signal that may be used to mask a stream of speech. A speech signal representing the speech stream to be masked is obtained. The speech signal is then temporally partitioned into segments, preferably corresponding to phonemes within the speech stream. The segments are then stored in a memory, and some or all of the segments are subsequently selected, retrieved, and assembled into an obfuscated speech signal representing an unintelligible speech stream that, when combined with the speech signal or reproduced and combined with the speech stream, provides a masking effect. While the presently preferred embodiment finds application most readily in an open plan office, embodiments suitable for use in restaurants, classrooms, and telecommunications systems are also disclosed.
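
The select-and-assemble step described in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function name `obfuscate`, the `seed` parameter, and the choice to reuse every segment exactly once are assumptions made for illustration (the abstract allows "some or all" of the segments to be selected), and segment boundaries are taken as given sample indices.

```python
import random

def obfuscate(signal, boundaries, seed=None):
    """Cut `signal` at `boundaries` and reassemble the pieces in a new order.

    `signal` is a list of samples; `boundaries` are sample indices where one
    segment ends and the next begins (hypothetical input for this sketch).
    Returns the reordered samples and the segment order that was used.
    """
    rng = random.Random(seed)
    # Segment edges: start of signal, each interior boundary, end of signal.
    interior = sorted({b for b in boundaries if 0 < b < len(signal)})
    edges = [0] + interior + [len(signal)]
    segments = [signal[a:b] for a, b in zip(edges, edges[1:])]

    # Pick an order different from the initial order when that is possible.
    order = list(range(len(segments)))
    if len(order) > 1:
        initial = list(order)
        while order == initial:
            rng.shuffle(order)

    out = []
    for i in order:
        out.extend(segments[i])
    return out, order
```

Reusing every segment keeps the output the same length as the input; a variant that draws segments with replacement from a stored pool would also fit the abstract's wording.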

14 Claims

1. A method of producing a substantially unintelligible, obfuscated speech signal from intelligible speech, comprising the steps of:
- obtaining a speech signal representing a speech stream;
  
  temporally partitioning said speech signal into a plurality of segments, said segments occurring in an initial order within said speech signal;
  
  selecting a plurality of selected segments from among said segments; and
  
  assembling said selected segments, in an order different than said initial order, to produce said obfuscated speech signal;
  
  wherein said segments represent phonemes within said speech stream;
  
  wherein said temporally partitioning step comprises the steps of:
  
  squaring said speech signal;
  
  calculating a short time average of said speech signal over a short time scale;
  
  calculating a medium time average of said speech signal over a medium time scale;
  
  calculating a difference between said short time average and said medium time average; and
  
  detecting zero crossings in said difference;
  
  wherein said zero crossings delineate said segments.

2. The method of claim 1, wherein said short time scale characterizes a length of a typical phoneme in said speech stream.

3. The method of claim 1, wherein said medium time scale characterizes a length of a typical word in said speech stream.
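
The partitioning steps recited in claims 1 to 3 can be sketched in Python as follows. This is an illustrative reading rather than the patented implementation: it assumes both averages are taken over the squared signal, uses a simple trailing moving average, and treats a difference of exactly zero as non-negative. The names `moving_average`, `segment_boundaries`, `short_window`, and `medium_window` are hypothetical; the windows are given in samples and would be chosen to match a typical phoneme length and a typical word length respectively.

```python
def moving_average(values, window):
    """Trailing moving average; early samples average over what is available."""
    out, total = [], 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]
        out.append(total / min(i + 1, window))
    return out

def segment_boundaries(signal, short_window, medium_window):
    """Return sample indices where (short average - medium average) changes sign."""
    power = [s * s for s in signal]                      # squaring step
    short_avg = moving_average(power, short_window)      # short time average
    medium_avg = moving_average(power, medium_window)    # medium time average
    diff = [a - b for a, b in zip(short_avg, medium_avg)]

    boundaries = []
    for i in range(1, len(diff)):
        # A zero crossing: the sign of the difference flips between samples.
        if (diff[i - 1] < 0) != (diff[i] < 0):
            boundaries.append(i)
    return boundaries
```

The returned indices delineate the segments: each segment runs from one boundary up to, but not including, the next.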

4. A method of producing a substantially unintelligible, obfuscated speech signal from intelligible speech, comprising the steps of:
- obtaining a speech signal representing a speech stream;
  
  temporally partitioning said speech signal into a plurality of segments, said segments occurring in an initial order within said speech signal;
  
  selecting a plurality of selected segments from among said segments; and
  
  assembling said selected segments, in an order different than said initial order, to produce said obfuscated speech signal;
  
  further comprising the step, immediately following said temporally partitioning step, of:
  
  storing said segments in a memory; and
  
  further comprising the step, immediately following said selecting step, of:
  
  retrieving said selected segments from said memory;
  
  wherein said storing step comprises the steps of:
  
  squaring said speech signal;
  
  calculating a long time average of said speech signal over a long time scale;
  
  determining when said long time average is above a first threshold and when said long time average is below a second threshold;
  
  halting said storing of said segments in said memory when said long time average is below said second threshold; and
  
  resuming said storing of said segments in said memory when said long time average is above said first threshold.

5. The method of claim 4, wherein said long time scale characterizes a conversational time scale of said speech stream.
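
The two-threshold gating of claims 4 and 5 amounts to a hysteresis switch driven by a long time average. The Python sketch below is one possible reading, not the patented implementation: it assumes the average is taken over the squared signal with a trailing window, that storing starts in the halted state, and that the first (resume) threshold is at least as large as the second (halt) threshold. The names `storage_gate`, `long_window`, `resume_threshold`, and `halt_threshold` are hypothetical.

```python
def storage_gate(signal, long_window, resume_threshold, halt_threshold):
    """Return one bool per sample: True while segments would be stored."""
    if resume_threshold < halt_threshold:
        raise ValueError("resume_threshold must be >= halt_threshold")

    gate, storing, total = [], False, 0.0
    for i, s in enumerate(signal):
        # Long time average of the squared signal over a trailing window.
        total += s * s
        if i >= long_window:
            total -= signal[i - long_window] ** 2
        long_avg = total / min(i + 1, long_window)

        if storing and long_avg < halt_threshold:
            storing = False          # halt storing: average fell below second threshold
        elif not storing and long_avg > resume_threshold:
            storing = True           # resume storing: average rose above first threshold
        gate.append(storing)
    return gate
```

Keeping the two thresholds apart prevents the gate from chattering when the average hovers near a single level. The same switch could gate retrieval instead of storage, as in claims 6 and 7.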

6. A method of producing a substantially unintelligible, obfuscated speech signal from intelligible speech, comprising the steps of:
- obtaining a speech signal representing a speech stream;
  
  temporally partitioning said speech signal into a plurality of segments, said segments occurring in an initial order within said speech signal;
  
  selecting a plurality of selected segments from among said segments; and
  
  assembling said selected segments, in an order different than said initial order, to produce said obfuscated speech signal;
  
  further comprising the step, immediately following said temporally partitioning step, of:
  
  storing said segments in a memory; and
  
  further comprising the step, immediately following said selecting step, of:
  
  retrieving said selected segments from said memory;
  
  wherein said retrieving step comprises the steps of:
  
  squaring said speech signal;
  
  calculating a long time average of said speech signal over a long time scale;
  
  determining when said long time average is above a first threshold and when said long time average is below a second threshold;
  
  halting said retrieving of said segments from said memory when said long time average is below said second threshold; and
  
  resuming said retrieving of said segments from said memory when said long time average is above said first threshold.

7. The method of claim 6, wherein said long time scale characterizes a conversational time scale of said speech stream.

8. An apparatus for producing a substantially unintelligible, obfuscated speech signal from intelligible speech, comprising:
- a module for obtaining a speech signal representing a speech stream;
  
  a module for temporally partitioning said speech signal into a plurality of segments, said segments occurring in an initial order within said speech signal;
  
  a module for selecting a plurality of selected segments from among said segments; and
  
  a module for assembling said selected segments, in an order different than said initial order, to produce said obfuscated speech signal;
  
  wherein said module for temporally partitioning further comprises:
  
  a module for squaring said speech signal;
  
  a module for calculating a short time average of said speech signal over a short time scale;
  
  a module for calculating a medium time average of said speech signal over a medium time scale;
  
  a module for calculating a difference between said short time average and said medium time average; and
  
  a module for detecting zero crossings in said difference;
  
  wherein said zero crossings delineate said segments.

9. The apparatus of claim 8, wherein said short time scale characterizes a length of a typical phoneme in said speech stream.

10. The apparatus of claim 8, wherein said medium time scale characterizes a length of a typical word in said speech stream.

11. An apparatus for producing a substantially unintelligible, obfuscated speech signal from intelligible speech, comprising:
- a module for obtaining a speech signal representing a speech stream;
  
  a module for temporally partitioning said speech signal into a plurality of segments, said segments occurring in an initial order within said speech signal;
  
  a module for selecting a plurality of selected segments from among said segments;
  
  a module for assembling said selected segments, in an order different than said initial order, to produce said obfuscated speech signal;
  
  a memory for storing said segments; and
  
  a module for retrieving said selected segments from said memory;
  
  wherein said memory comprises:
  
  a module for squaring said speech signal;
  
  a module for calculating a long time average of said speech signal over a long time scale;
  
  a module for determining when said long time average is above a first threshold and when said long time average is below a second threshold;
  
  a module for halting said storing of said segments in said memory when said long time average is below said second threshold; and
  
  a module for resuming said storing of said segments in said memory when said long time average is above said first threshold.

12. The apparatus of claim 11, wherein said long time scale characterizes a conversational time scale of said speech stream.

13. An apparatus for producing a substantially unintelligible, obfuscated speech signal from intelligible speech, comprising:
- a module for obtaining a speech signal representing a speech stream;
  
  a module for temporally partitioning said speech signal into a plurality of segments, said segments occurring in an initial order within said speech signal;
  
  a module for selecting a plurality of selected segments from among said segments;
  
  a module for assembling said selected segments, in an order different than said initial order, to produce said obfuscated speech signal;
  
  a memory for storing said segments; and
  
  a module for retrieving said selected segments from said memory;
  
  wherein said module for retrieving comprises:
  
  a module for squaring said speech signal;
  
  a module for calculating a long time average of said speech signal over a long time scale;
  
  a module for determining when said long time average is above a first threshold and when said long time average is below a second threshold;
  
  a module for halting said retrieving of said segments from said memory when said long time average is below said second threshold; and
  
  a module for resuming said retrieving of said segments from said memory when said long time average is above said first threshold.

14. The apparatus of claim 13, wherein said long time scale characterizes a conversational time scale of said speech stream.

Current Assignee: Applied Invention, LLC
Original Assignee: Applied Minds, Inc.
Inventors: Hillis, W. Daniel; Ferren, Bran; Howe, Russel
Granted Patent: US 7,184,952 B2
US Class Current: 704/213
CPC Class Codes

G10K 11/1754   Speech masking

G10K 15/02   Synthesis of acoustic waves...

G10L 21/00   Speech or voice signal proc...

G10L 21/06   Transformation of speech in...

H04K 1/02   by adding a second signal t...

H04K 1/06   by transmitting the informa...

H04K 2203/12   for acoustic communication

H04K 3/825   by jamming
