Adaptive telephone relay service systems

US 9,324,324 B2
Filed: 03/18/2015
Issued: 04/26/2016
Est. Priority Date: 05/22/2014
Status: Active Grant

Abstract

Adaptive telephone relay service systems. Embodiments herein provide technical solutions for improving text captioning of Captioned Telephone Service (CTS) calls, including computer systems, computer-implemented methods, and computer program products for automating the text captioning of CTS calls. These technical solutions include, among other things, embodiments for generating text captions from speech data using an adaptive captioning service to provide fully automated text captioning and/or operator-assisted automated text captioning, embodiments for intercepting and modifying a calling sequence for calls to captioned telephone service users, and embodiments for generating progressive text captions from speech data.

89 Citations

20 Claims

1. A computer system, comprising:
- one or more hardware processors; and
  
  one or more non-transitory computer-readable media having stored thereon computer-executable instructions that are structured such that, when the computer-executable instructions are executed by the one or more hardware processors, the computer system generates text captions from speech data, including at least the following:
  
  receiving, from a first communications device, the speech data based on a remote party's voice;
  
  generating, at the one or more hardware processors, first text captions from the speech data using a speech recognition algorithm;
  
  determining, at the one or more hardware processors, whether the generated first text captions meet a first predetermined quality threshold; and
  
  when the first text captions meet the first predetermined quality threshold, sending the first text captions to a second communications device for display at a display device;
  
  or when the first text captions do not meet the first predetermined quality threshold, performing at least the following:
  
  generating, at the one or more hardware processors, second text captions from the speech data based on user input to the speech recognition algorithm from a human user; and
  
  sending the second text captions to the second communications device for display at the display device when the second text captions meet a second predetermined quality threshold.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
- - 2. The computer system as recited in claim 1, wherein the speech data based on the remote party's voice comprises a high-fidelity audio recording of the remote party's voice, and wherein the high-fidelity audio recording is received over a high-fidelity network.
  - 3. The computer system as recited in claim 1, wherein the speech data based on the remote party's voice comprises intermediary speech data elements that were generated at the remote communications device based on a high-fidelity audio recording of the remote party's voice.
  - 4. The computer system as recited in claim 1, wherein when the first text captions do not meet the first predetermined quality threshold, the computer system also performs at least the following:
    - sending the first text captions to the second communications device for display at the display device prior to sending the second text captions to the second communications device; and
      
      wherein sending the second text captions to the second communications device comprises sending one or more updates to the first text captions.
  - 5. The computer system as recited in claim 1, wherein sending the first text captions to the second communications device for display at the display device comprises sending an instruction for the second communications device to annotate at least one text caption having a low confidence score.
  - 6. The computer system as recited in claim 1, wherein generating the second text captions from the speech data based on the user input to the speech recognition algorithm from the human user comprises generating text captions that are annotated with one or more visual cues.
  - 7. The computer system as recited in claim 6, wherein the one or more visual cues convey one or more of humor, emotion, sarcasm, singing, or laughing.
  - 8. The computer system as recited in claim 1, wherein generating the second text captions from the speech data based on the user input to the speech recognition algorithm from the human user comprises receiving, from the human user, at least a first letter for each of a plurality of words contained in the speech data.
  - 9. The computer system as recited in claim 1, wherein generating the second text captions from the speech data based on the user input to the speech recognition algorithm from the human user comprises:
    - presenting, to the human user, a plurality of recognition candidates corresponding to a portion of the speech data;
      
      receiving, from the human user, selection of one of the plurality of recognition candidates; and
      
      generating a text caption based on the selected one of the plurality of recognition candidates.
  - 10. The computer system as recited in claim 1, wherein when the second text captions do not meet the second predetermined quality threshold, the computer system requests conventional operator-assisted captioning techniques.
  - 11. The computer system as recited in claim 1, wherein the second predetermined quality threshold is equal to the first predetermined quality threshold.
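
The branch structure of claims 1 and 10 can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the `Captions` type, the numeric `quality` score, the default thresholds, and the two recognizer callables are all hypothetical stand-ins, since the claims do not say how caption quality is measured.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Captions:
    text: str
    quality: float  # hypothetical score in [0, 1]; the claims do not define the metric


def adaptive_caption(
    speech: bytes,
    recognize: Callable[[bytes], Captions],
    recognize_with_human: Callable[[bytes], Captions],
    first_threshold: float = 0.9,   # assumed value, for illustration only
    second_threshold: float = 0.9,  # claim 11 allows both thresholds to be equal
) -> Tuple[str, Optional[Captions]]:
    """Return (route, captions) following the two-stage check in claim 1."""
    # Stage 1: fully automated speech recognition.
    first = recognize(speech)
    if first.quality >= first_threshold:
        return "automated", first

    # Stage 2: recognition guided by input from a human user.
    second = recognize_with_human(speech)
    if second.quality >= second_threshold:
        return "assisted", second

    # Claim 10: neither stage passed, so request conventional
    # operator-assisted captioning instead of sending captions.
    return "conventional", None
```

With stub recognizers, a low-quality automated result falls through to the assisted path:

```python
poor = lambda speech: Captions("helo wrld", 0.4)
good = lambda speech: Captions("hello world", 0.95)
route, captions = adaptive_caption(b"", poor, good)
```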

12. A computer system comprising a mobile phone, the computer system comprising:
- one or more hardware processors;
  
  one or more audio capture devices; and
  
  one or more non-transitory computer-readable media having stored thereon computer-executable instructions that are structured such that, when the computer-executable instructions are executed by the one or more hardware processors, the computer system intercepts a calling sequence, including at least the following:
  
  detecting that the mobile phone is to participate in a phone call;
  
  determining that the phone call is with a captioned telephone service user;
  
  based on the phone call being with the captioned telephone service user, capturing, at the one or more audio capture devices, a high-fidelity recording of a user's voice, wherein the high-fidelity recording comprises audio of a frequency range greater than 300 Hz to 3.4 kHz;
  
  sending speech data to an adaptive captioning service based on the high-fidelity recording;
  
  capturing, at the one or more audio capture devices, a low-fidelity recording of the user's voice in parallel with capturing the high-fidelity recording of the user's voice; and
  
  sending the low-fidelity recording over a first network connection, while sending the high-fidelity recording over a second network connection.
- View Dependent Claims (13, 14, 15, 16, 17, 18)
- - 13. The computer system as recited in claim 12, further comprising:
    - generating intermediary speech data elements from the high-fidelity recording of the user's voice; and
      
      wherein sending the speech data to the adaptive captioning service comprises sending the intermediary speech data elements to the adaptive captioning service.
  - 14. The computer system as recited in claim 12, wherein sending the speech data to the adaptive captioning service comprises sending high-fidelity audio to the adaptive captioning service.
  - 15. The computer system as recited in claim 12, further comprising, based on the phone call being with the captioned telephone service user:
    - intercepting a native dialing sequence of the mobile phone to prevent the mobile phone from initiating the phone call over a network connection that is incapable of transporting high-fidelity audio; and
      
      initiating a connection with the adaptive captioning service using the network connection capable of transporting high-fidelity audio.
  - 16. The computer system as recited in claim 12, further comprising, based on the phone call being with the captioned telephone service user:
    - influencing a dialing sequence of the mobile phone to give preference for transporting high-fidelity audio as part of the phone call.
  - 17. The computer system as recited in claim 16, wherein influencing the dialing sequence of the mobile phone to give preference for transporting high-fidelity audio as part of the phone call comprises identifying a preferred audio codec.
  - 18. The computer system as recited in claim 12, wherein the high-fidelity recording comprises audio of a second frequency range of at least approximately 50 Hz to approximately 7 kHz.
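
The parallel capture described in claim 12 can be illustrated with a small planning function. This is a rough sketch under stated assumptions: the `Stream` type, the connection labels, and the reading of "greater than 300 Hz to 3.4 kHz" as "covers that band and is strictly wider" are all hypothetical choices made for the example. The default capture band is taken from claim 18.

```python
from dataclasses import dataclass
from typing import List, Tuple

NARROWBAND_HZ = (300, 3400)  # the 300 Hz to 3.4 kHz range named in claim 12


@dataclass
class Stream:
    fidelity: str               # "low" or "high"
    connection: str             # hypothetical label for the network connection
    band_hz: Tuple[int, int]    # (lowest, highest) captured frequency


def exceeds_narrowband(low_hz: int, high_hz: int) -> bool:
    """One reading of 'greater than 300 Hz to 3.4 kHz': covers it and is wider."""
    covers = low_hz <= NARROWBAND_HZ[0] and high_hz >= NARROWBAND_HZ[1]
    return covers and (low_hz, high_hz) != NARROWBAND_HZ


def plan_streams(
    callee_is_cts_user: bool,
    capture_band_hz: Tuple[int, int] = (50, 7000),  # range from claim 18
) -> List[Stream]:
    """Decide which recordings to capture and where to send each one."""
    # The ordinary low-fidelity voice path is always present.
    streams = [Stream("low", "first-connection", NARROWBAND_HZ)]
    # Only calls to a captioned telephone service user add the parallel
    # high-fidelity capture, sent over a second connection.
    if callee_is_cts_user and exceeds_narrowband(*capture_band_hz):
        streams.append(Stream("high", "second-connection", capture_band_hz))
    return streams
```

Calling `plan_streams(True)` yields both streams, while `plan_streams(False)` yields only the low-fidelity one.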

19. A computer system, comprising:
- one or more hardware processors; and
  
  one or more non-transitory computer-readable media having stored thereon computer-executable instructions that are structured such that, when the computer-executable instructions are executed by the one or more hardware processors, the computer system generates progressive text captions from speech data, including at least the following:
  
  receiving, from a first communications device, the speech data based on a remote party's voice;
  
  generating, at the one or more hardware processors, preliminary text captions from the speech data, the preliminary text captions including at least one text caption having a confidence score below a predefined threshold;
  
  sending the preliminary text captions to a second communications device for display at a display device, including sending an instruction to visually annotate the at least one text caption having the confidence score below the predefined threshold;
  
  generating, at the one or more hardware processors, final text captions from the speech data, the final text captions including a different caption result for the at least one text caption; and
  
  sending the different caption result for the at least one text caption to the second communications device for display at the display device, including sending an instruction for the second communications device to dynamically update the at least one text caption with the different caption result.
- View Dependent Claims (20)
- - 20. The computer system as recited in claim 19, wherein sending the different caption result for the at least one text caption to the second communications device includes sending the instruction for a second computer system to visually annotate the at least one text caption with a different annotation mechanism after dynamically updating the at least one text caption with the different caption result.
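
The progressive captioning of claim 19 amounts to two rounds of messages: a preliminary round that flags low-confidence captions, and a later round that replaces any caption whose text changed. The following is a minimal sketch of that idea. The `Caption` type, the message dictionaries, and the 0.8 threshold are hypothetical, as the claims do not specify a wire format or a threshold value.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Caption:
    index: int         # position of the caption in the transcript
    text: str
    confidence: float  # hypothetical score in [0, 1]


def preliminary_messages(captions: List[Caption], threshold: float = 0.8) -> List[Dict]:
    """Build 'show' messages, flagging captions below the confidence threshold."""
    return [
        {
            "op": "show",
            "index": c.index,
            "text": c.text,
            "annotate": c.confidence < threshold,  # ask the display to mark it
        }
        for c in captions
    ]


def update_messages(preliminary: List[Caption], final: List[Caption]) -> List[Dict]:
    """Build 'update' messages for every caption whose final text differs."""
    shown = {c.index: c.text for c in preliminary}
    return [
        {"op": "update", "index": c.index, "text": c.text}
        for c in final
        if shown.get(c.index) != c.text
    ]
```

For example, if the preliminary pass produced "the bored room" with low confidence and the final pass produced "the board room", only that one caption is flagged in the first round and updated in the second.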

Current Assignee: Captel Incorporated
Original Assignee: Nedelco, Inc.
Inventors: Knighton, Jeffery F.
Primary Examiner(s): Bolourchi, Nader
Application Number: US14/662,008
Publication Number: US 20150341486A1
Time in Patent Office: 405 Days
Field of Search: 455/414.1
US Class Current: 1/1

CPC Class Codes

G10L 15/26   Speech to text systems G10L...
H04M 1/2475   for a hearing impaired user
H04M 1/72433   for voice messaging, e.g. d...
H04M 1/72478   for hearing-impaired users
H04M 2250/74   with voice recognition mean...
H04M 3/42   Systems providing special s...
H04M 3/42221   Conversation recording syst...
H04M 3/42391   where the subscribers are h...
H04W 4/16   Communication-related suppl...
H04W 4/18   Information format or conte...