System and method for the secure, real-time, high accuracy conversion of general-quality speech into text

US 7,539,086 B2
Filed: 08/03/2004
Issued: 05/26/2009
Est. Priority Date: 10/23/2002
Status: Active Grant

Abstract

The system is designed to interface with external devices and services, to transcribe audio that may be stored elsewhere, such as a wireless phone's voice mail, or that occurs between two or more parties, such as a conference call. An audio stream is separated into many audio shreds, each of which has a duration of only a few seconds and cannot reveal the context of the conversation. A workforce of geographically distributed transcription agents who transcribe the audio shreds is able to generate a transcription in real time, with many agents working in parallel on a single conversation. No one agent (or group of agents) receives a sufficient number of audio shreds to reconstruct the context of any conversation. The use of human transcribers allows the system to overcome limitations typical of computer-based speech recognition and permits accurate transcription of general-quality speech even in acoustically hostile environments.
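The mixing, distribution and reassembly described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Shred` type, the function names and the `max_per_stream` cap are hypothetical, and the cap is only one possible way to keep any single agent from holding enough shreds of one conversation.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Shred:
    stream_id: str  # audio stream the shred was cut from
    index: int      # position of the shred within that stream
    audio: bytes    # a few seconds of audio, treated as opaque here


def mix(shreds, seed=None):
    """Return the shreds of all streams in a randomized order."""
    mixed = list(shreds)
    random.Random(seed).shuffle(mixed)
    return mixed


def assign(mixed, agents, max_per_stream):
    """Map each shred to an agent (hypothetical policy).

    An agent never receives more than max_per_stream shreds of the same
    stream; among eligible agents the least loaded one is chosen.
    """
    held = defaultdict(int)  # (agent, stream_id) -> shreds already given
    load = defaultdict(int)  # agent -> total shreds given
    assignment = {}
    for shred in mixed:
        eligible = [a for a in agents
                    if held[(a, shred.stream_id)] < max_per_stream]
        if not eligible:
            raise RuntimeError("not enough agents for the requested cap")
        agent = min(eligible, key=lambda a: load[a])
        held[(agent, shred.stream_id)] += 1
        load[agent] += 1
        assignment[shred] = agent
    return assignment


def reassemble(transcripts):
    """Combine per-shred text into one text per stream, in original order."""
    by_stream = defaultdict(list)
    for shred, text in transcripts.items():
        by_stream[shred.stream_id].append((shred.index, text))
    return {sid: " ".join(text for _, text in sorted(parts))
            for sid, parts in by_stream.items()}
```

Because each shred carries its stream identifier and index, the randomized order used for distribution does not affect the order of the reassembled text.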

33 Claims

1. A system, comprising:
- a receiving element to receive audio segments which are portions of audio streams, the receiving element creating sub-segments from the audio segments;
  
  a mixing element to receive the sub-segments from the audio streams and randomize the order of the sub-segments;
  
  a transmitting element to send the randomized sub-segments to a plurality of transcribers, each of the randomized sub-segments to be transcribed into text by the transcriber which received the randomized sub-segment; and
  
  a text receiving element to receive the transcribed text from each of the transcribers.
  - 2. The system of claim 1, further comprising:
    - a reassembling element to receive the text corresponding to the sub-segments of each of the audio streams and to combine the text to create a text file corresponding to the audio stream.
  - 3. The system of claim 1, further comprising:
    - an accounting element to maintain subscriber accounts, wherein the audio streams are to be received from subscribers to the system, each subscriber account to be debited when the system transcribes audio streams corresponding to the subscriber account.
  - 4. The system of claim 3, wherein the accounting element is to further maintain transcriber accounts, each transcriber account being credited when the transcriber corresponding to the transcriber account transcribes audio sub-segments.
  - 5. The system of claim 4, wherein the transcriber accounts are to be credited based on one of a number of words transcribed and an amount of time of transcription.
  - 6. The system of claim 1, further comprising:
    - a quality assurance element to monitor one of an accuracy and a speed of each of the transcribers.
  - 7. The system of claim 1, further comprising:
    - a workforce management element to monitor an availability of each of the transcribers, wherein the workforce management element is to distribute randomized sub-segments to each of the transcribers based on their availability.
  - 8. The system of claim 7, wherein the workforce management element is to further schedule a work time for each of the transcribers.
  - 9. The system of claim 7, wherein the workforce management element is to further monitor a skill of each of the transcribers, the distribution of the randomized sub-segments being further based on the skill.
  - 10. The system of claim 9, wherein the skill includes one of a language skill, a professional field skill and a dialect skill.
  - 11. The system of claim 9, wherein the workforce management element is to further invite additional transcribers to the system when the skill is in short supply among the transcribers.
  - 12. The system of claim 11, wherein the invitation is by one of an electronic mail message, a telephone call and a pager message.
  - 13. The system of claim 7, wherein the workforce management element is to further set a price for transcription services based on a supply of transcribers and demand for transcription services.
  - 14. The system of claim 1, wherein the audio streams include one of calls directed to a phone that are redirected to the receiving element, voice communications from a personal computer, voice communications from a handheld dictation device, recordings of a voice mail service connected directly to the system, recordings of an external system and recordings on a server.
  - 15. The system of claim 14, wherein the phone includes one of a POTS phone, a PBX phone and an Internet Protocol phone.
  - 16. The system of claim 14, wherein the redirected calls are through a telephone company switch.
  - 17. The system of claim 16, wherein the telephone company switch is under a direction of Communications Assistance to Law Enforcement Act, CALEA.
  - 18. The system of claim 16, wherein the telephone company switch is under direction of one of call forward on busy, CFB, call forward on no answer, CFNA, and call forwarding, CF, services.
  - 19. The system of claim 14, wherein the audio streams include a telephone call between at least two parties.
  - 20. The system of claim 19, wherein biometrics are used to identify the at least two parties.
  - 21. The system of claim 14, wherein the system accesses the recordings of the external system by directing a phone call to the external system and navigating a menu of the external system using one of DTMF and speech recognition.
  - 22. The system of claim 21, wherein the external system is one of a voice mail system and a general interactive voice response, IVR, system.
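The availability-based and skill-based distribution recited in claims 7 and 9 might look like the following Python sketch. The `Transcriber` fields, the `pick_transcriber` name and the least-queued tie-break are assumptions made for illustration; the claims do not prescribe a selection rule.

```python
from dataclasses import dataclass, field


@dataclass
class Transcriber:
    name: str
    available: bool
    skills: set = field(default_factory=set)  # e.g. language or dialect skills
    queued: int = 0                           # sub-segments currently assigned


def pick_transcriber(pool, required_skill=None):
    """Choose an available transcriber with the required skill.

    Returns None when nobody qualifies; claim 11 suggests that a real
    system would invite additional transcribers at that point.
    """
    candidates = [t for t in pool
                  if t.available
                  and (required_skill is None or required_skill in t.skills)]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda t: t.queued)  # hypothetical tie-break
    chosen.queued += 1
    return chosen
```

A caller would invoke `pick_transcriber` once per randomized sub-segment, passing the skill that sub-segment needs, if any.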

23. A method, comprising:
- receiving audio segments which are portions of audio streams;
  
  creating sub-segments from the audio segments;
  
  randomizing the order of the sub-segments;
  
  sending the randomized sub-segments to a plurality of transcribers, each of the randomized sub-segments being transcribed into text by the transcriber which received the randomized sub-segment; and
  
  receiving the transcribed text from each of the transcribers.
  - 24. The method of claim 23, further comprising the steps of:
    - reassembling text corresponding to the sub-segments to create a text file corresponding to each of the audio streams.
  - 25. The method of claim 23, wherein a duration of the sub-segments is in a range of 1-10 seconds.
  - 26. The method of claim 23, wherein a duration of the sub-segments is based on one of a transcription accuracy, a transcription speed and a security level.
  - 27. The method of claim 23, wherein a foreign language of one of the audio segments is determined by sending the one of the audio segments to a plurality of transcribers with different foreign language skills.
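As a rough illustration of the sub-segment creation step with the 1-10 second duration of claim 25, the Python sketch below cuts a sequence of audio samples into fixed-length pieces. Fixed-length cuts, the function name and its parameters are assumptions; the claims do not state how sub-segment boundaries are chosen.

```python
def shred_samples(samples, sample_rate, shred_seconds):
    """Split samples into consecutive sub-segments of shred_seconds each.

    The last sub-segment may be shorter than the others.
    """
    if not 1 <= shred_seconds <= 10:
        raise ValueError("duration outside the 1-10 second range of claim 25")
    step = int(sample_rate * shred_seconds)
    return [samples[i:i + step] for i in range(0, len(samples), step)]
```

For example, seven seconds of audio sampled at 8000 Hz and cut with `shred_seconds=3` yields two full sub-segments and one shorter final sub-segment.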

28. A system comprising:
- a receiving element to create a plurality of audio shreds;
  
  a mixing element to randomize the order of the audio shreds;
  
  a transmitting element to send the randomized audio shreds to a plurality of transcribers, each of the randomized audio shreds to be transcribed into text by the transcriber which received the randomized audio shred; and
  
  a reassembling element to receive text corresponding to the randomized audio shreds and to combine the text to create a text file.
  - 29. The system of claim 28 further comprising:
    - a work force management element to monitor an availability of each of the transcribers, wherein the workforce management element is to distribute randomized sub-segments to each of the transcribers based on their availability.
  - 30. The system of claim 29, wherein the workforce management element is to further schedule a work time for each of the transcribers.
  - 31. The system of claim 29, wherein the workforce management element is to further monitor a skill of each of the transcribers, the distribution of the randomized sub-segments being further based on the skill.
  - 32. The system of claim 31, wherein the skill includes one of a language skill, a professional field skill and a dialect skill.
  - 33. The system of claim 31, wherein the workforce management element is to further invite additional transcribers to the system when the skill is in short supply among the transcribers.

Current Assignee: Advanced Messaging Technologies, Inc. (J2 Global, Inc.)
Original Assignee: J2 Global, Inc.
Inventors: Jaroker, Jon
Primary Examiner(s): Hudspeth; David R
Assistant Examiner(s): Kovacek; David
Application Number: US10/910,723
Publication Number: US 20050010407A1
Time in Patent Office: 1,757 Days
Field of Search: 704/1-8, 704/200, 704/201, 704/203, 704/206-210, 704/231-236, 704/243-245, 704/258, 704/260, 704/270, 704/275, 704/278, 369/1-5, 369/24.01-30.01, 369/47.1-47.13, 369/68-69, 369/86-92, 709/201-202, 709/204-207, 709/223-226, 709/238-246, 709/249
US Class Current: 369/25.01
CPC Class Codes:

G06Q 10/00   Administration; Management

G06Q 10/10   Office automation; Time man...

G06Q 50/18   Legal services

G10L 15/26   Speech to text systems G10L...

H04M 2201/40   using speech recognition

H04M 2201/60   Medium conversion
