Adaptive telephone relay service systems
First Claim
1. A computer system, comprising:
one or more hardware processors; and
one or more non-transitory computer-readable media having stored thereon computer-executable instructions that are structured such that, when the computer-executable instructions are executed by the one or more hardware processors, the computer system generates text captions from speech data, including at least the following:
receiving, from a first communications device, the speech data based on a remote party's voice;
generating, at the one or more hardware processors, first text captions from the speech data using a speech recognition algorithm;
determining, at the one or more hardware processors, whether the generated first text captions meet a first predetermined quality threshold; and
when the first text captions meet the first predetermined quality threshold, sending the first text captions to a second communications device for display at a display device;
or when the first text captions do not meet the first predetermined quality threshold, performing at least the following:
generating, at the one or more hardware processors, second text captions from the speech data based on user input to the speech recognition algorithm from a human user; and
sending the second text captions to the second communications device for display at the display device when the second text captions meet a second predetermined quality threshold.
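The decision flow recited in claim 1 can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `Caption` type, the numeric `quality` score, the two recognizer callables, and the default threshold values. The claim does not say how quality is measured, so a score in [0, 1] is assumed purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Caption:
    text: str
    quality: float  # assumed score in [0, 1]; the claim does not define the metric


def caption_call(
    speech_data: bytes,
    recognize: Callable[[bytes], Caption],
    recognize_with_operator: Callable[[bytes], Caption],
    first_threshold: float = 0.9,   # assumed value
    second_threshold: float = 0.9,  # assumed value
) -> Optional[Caption]:
    """Return the caption to send to the second device, or None if neither qualifies."""
    # Fully automated pass using the speech recognition algorithm alone.
    first = recognize(speech_data)
    if first.quality >= first_threshold:
        return first

    # Fallback pass: the same speech data, with a human user's input to the recognizer.
    second = recognize_with_operator(speech_data)
    if second.quality >= second_threshold:
        return second

    # The claim does not state what happens when both passes miss their thresholds.
    return None
```

Passing the recognizers in as callables keeps the sketch independent of any particular speech recognition engine; a caller would supply its own implementations and send the returned caption to the display device.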
Abstract
Adaptive telephone relay service systems. Embodiments herein provide technical solutions for improving text captioning of Captioned Telephone Service (CTS) calls, including computer systems, computer-implemented methods, and computer program products for automating the text captioning of CTS calls. These technical solutions include, among other things, embodiments for generating text captions from speech data using an adaptive captioning service to provide fully automated text captioning and/or operator-assisted automated text captioning, embodiments for intercepting and modifying a calling sequence for calls to captioned telephone service users, and embodiments for generating progressive text captions from speech data.
89 Citations
20 Claims
1. A computer system (recited in full under First Claim above). Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
12. A computer system comprising a mobile phone, the computer system comprising:
one or more hardware processors;
one or more audio capture devices; and
one or more non-transitory computer-readable media having stored thereon computer-executable instructions that are structured such that, when the computer-executable instructions are executed by the one or more hardware processors, the computer system intercepts a calling sequence, including at least the following:
detecting that the mobile phone is to participate in a phone call;
determining that the phone call is with a captioned telephone service user;
based on the phone call being with the captioned telephone service user, capturing, at the one or more audio capture devices, a high-fidelity recording of a user's voice, wherein the high-fidelity recording comprises audio of a frequency range greater than 300 Hz to 3.4 kHz;
sending speech data to an adaptive captioning service based on the high-fidelity recording;
capturing, at the one or more audio capture devices, a low-fidelity recording of the user's voice in parallel with capturing the high-fidelity recording of the user's voice; and
sending the low-fidelity recording over a first network connection, while sending the high-fidelity recording over a second network connection.
Dependent claims: 13, 14, 15, 16, 17, 18.
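The dual-stream routing recited in claim 12 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `Recording` type, the `exceeds_telephone_band` check, and the two sender callables are hypothetical. The sketch reads "a frequency range greater than 300 Hz to 3.4 kHz" as a band that covers and extends beyond the 300 Hz to 3.4 kHz range, which is one possible interpretation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

TELEPHONE_BAND_HZ = (300.0, 3400.0)  # the band named in the claim


@dataclass(frozen=True)
class Recording:
    samples: bytes
    low_hz: float   # lowest frequency captured
    high_hz: float  # highest frequency captured


def exceeds_telephone_band(rec: Recording) -> bool:
    """True if the recording covers the telephone band and extends past it (assumed reading)."""
    lo, hi = TELEPHONE_BAND_HZ
    covers = rec.low_hz <= lo and rec.high_hz >= hi
    wider = rec.low_hz < lo or rec.high_hz > hi
    return covers and wider


def route_call_audio(
    frames: Iterable[Tuple[Recording, Recording]],  # (high-fidelity, low-fidelity) pairs captured in parallel
    callee_uses_cts: bool,
    send_voice: Callable[[bytes], None],       # first network connection (hypothetical)
    send_captioning: Callable[[bytes], None],  # second network connection (hypothetical)
) -> None:
    for high_fi, low_fi in frames:
        # The low-fidelity stream always travels over the first connection.
        send_voice(low_fi.samples)
        # The high-fidelity stream goes to the captioning service only for CTS calls.
        if callee_uses_cts and exceeds_telephone_band(high_fi):
            send_captioning(high_fi.samples)
```

Modeling each captured frame as a pair keeps the two recordings in step without committing to a particular audio API or threading model.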
19. A computer system, comprising:
one or more hardware processors; and
one or more non-transitory computer-readable media having stored thereon computer-executable instructions that are structured such that, when the computer-executable instructions are executed by the one or more hardware processors, the computer system generates progressive text captions from speech data, including at least the following:
receiving, from a first communications device, the speech data based on a remote party's voice;
generating, at the one or more hardware processors, preliminary text captions from the speech data, the preliminary text captions including at least one text caption having a confidence score below a predefined threshold;
sending the preliminary text captions to a second communications device for display at a display device, including sending an instruction to visually annotate the at least one text caption having the confidence score below the predefined threshold;
generating, at the one or more hardware processors, final text captions from the speech data, the final text captions including a different caption result for the at least one text caption; and
sending the different caption result for the at least one text caption to the second communications device for display at the display device, including sending an instruction for the second communications device to dynamically update the at least one text caption with the different caption result.
Dependent claims: 20.
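The progressive captioning recited in claim 19 can be sketched in two steps: preliminary captions are sent with an annotation flag on low-confidence segments, and a later pass sends updates only for segments whose text changed. The `Segment` type, the message dictionaries, and the default threshold are hypothetical choices made for illustration; the claim does not specify a message format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    segment_id: int
    text: str
    confidence: float  # assumed score in [0, 1]


def preliminary_messages(segments: List[Segment], threshold: float = 0.8) -> List[Dict]:
    """Build caption messages, flagging segments whose confidence is below the threshold."""
    return [
        {
            "type": "caption",
            "id": s.segment_id,
            "text": s.text,
            "annotate": s.confidence < threshold,  # instruction to visually annotate
        }
        for s in segments
    ]


def update_messages(preliminary: List[Segment], final: List[Segment]) -> List[Dict]:
    """Build update messages for segments whose final text differs from the preliminary text."""
    before = {s.segment_id: s.text for s in preliminary}
    return [
        {"type": "update", "id": s.segment_id, "text": s.text}
        for s in final
        if s.segment_id in before and before[s.segment_id] != s.text
    ]
```

Keying updates on a segment identifier lets the receiving device replace a displayed caption in place instead of appending corrected text after it.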
Specification