System and method for validating natural language content using crowdsourced validation jobs
Abstract
Systems and methods of validating transcriptions of natural language content using crowdsourced validation jobs are provided herein. In various implementations, a transcription pair comprising natural language content and text corresponding to a transcription of the natural language content may be gathered. A first group of validation devices may be selected for reviewing the transcription pair. A first crowdsourced validation job may be created for the first group of validation devices. The first crowdsourced validation job may be provided to the first group of validation devices. A vote representing whether or not the text accurately represents the natural language content may be received from each of the first group of validation devices. A validation score may be assigned to the transcription pair based, at least in part, on the votes from each of the first group of validation devices.
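The flow summarized in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not fix a scoring formula, so the sketch assumes the validation score is the fraction of votes that accept the text. The names `TranscriptionPair`, `validation_score`, and the sample clip name are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptionPair:
    """Hypothetical container for audio content and its candidate transcription."""
    audio_ref: str                 # reference to the natural language (audio) content
    text: str                      # text that is supposed to transcribe the audio
    votes: List[bool] = field(default_factory=list)  # one vote per validation device


def validation_score(votes: List[bool]) -> float:
    """Assumed rule: the score is the fraction of votes that accept the text."""
    if not votes:
        raise ValueError("no votes received")
    return sum(votes) / len(votes)


# Three validation devices vote on one transcription pair.
pair = TranscriptionPair("clip-001.wav", "turn on the lights")
pair.votes.extend([True, True, False])
score = validation_score(pair.votes)
```

Here two of three devices accept the text, so the assumed score is two thirds. Any other aggregation rule that depends at least in part on the votes would fit the abstract equally well.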
24 Claims
1. A computer-implemented method, the method being implemented in a computer system having one or more physical processors programmed with computer program instructions that, when executed by the one or more physical processors, program the computer system to perform the method, the method comprising:
obtaining, by the computer system, a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
creating, by the computer system, a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
causing, by the computer system, the first crowdsourced validation job to be provided to the one or more first validation devices;
receiving, by the computer system, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content;
assigning, by the computer system, a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair;
determining, by the computer system, whether or not the one or more first validation devices agreed the text is an accurate transcription of the natural language content; and
determining, by the computer system, whether to provide the transcription pair to one or more second validation devices.

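The last two steps of claim 1 (checking whether the first validation devices agreed, then deciding whether to involve second validation devices) might look like the sketch below. The claim does not say how the decision is made, so the policy here is an assumption: the pair goes to a second group only when the first group's votes are mixed. The function names are hypothetical.

```python
from typing import List


def agreed_accurate(votes: List[bool]) -> bool:
    """True when every first-round device voted that the text is accurate."""
    if not votes:
        raise ValueError("no votes received")
    return all(votes)


def needs_second_round(votes: List[bool]) -> bool:
    """Assumed policy: escalate only when the first-round votes are mixed."""
    if not votes:
        raise ValueError("no votes received")
    unanimous_reject = not any(votes)
    return not (agreed_accurate(votes) or unanimous_reject)


unanimous = [True, True, True]
mixed = [True, False, True]
```

With this policy, `needs_second_round(unanimous)` is false and `needs_second_round(mixed)` is true. A different system could just as reasonably escalate on any rejection.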
3. A computer-implemented method, the method being implemented in a computer system having one or more physical processors programmed with computer program instructions that, when executed by the one or more physical processors, program the computer system to perform the method, the method comprising:
obtaining, by the computer system, a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
creating, by the computer system, a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
causing, by the computer system, the first crowdsourced validation job to be provided to the one or more first validation devices;
receiving, by the computer system, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content;
assigning, by the computer system, a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair;
identifying, by the computer system, a confidence score of each of the one or more first validation devices, the confidence score representing confidence in the vote from the each of the one or more first validation devices; and
updating, by the computer system, the validation score with respect to the transcription pair using the confidence score.

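One way to read the confidence-score steps of claim 3 is as a confidence-weighted vote. The claim only says the validation score is updated "using the confidence score", so the weighting below is an assumption rather than the claimed formula, and `weighted_validation_score` is a hypothetical name.

```python
from typing import List


def weighted_validation_score(votes: List[bool], confidences: List[float]) -> float:
    """Assumed update rule: confidence-weighted share of accepting votes.

    votes[i] is the vote from device i; confidences[i] is the confidence
    placed in that device's vote (non-negative).
    """
    if len(votes) != len(confidences):
        raise ValueError("one confidence score is required per vote")
    total = sum(confidences)
    if total <= 0:
        raise ValueError("confidence scores must sum to a positive value")
    accepted = sum(c for v, c in zip(votes, confidences) if v)
    return accepted / total


# Two trusted devices accept; one less trusted device rejects.
updated = weighted_validation_score([True, False, True], [0.9, 0.5, 0.6])
```

In this example the accepting devices carry 1.5 of the 2.0 total confidence, so the updated score is 0.75, compared with 2/3 for an unweighted vote.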
4. A computer-implemented method, the method being implemented in a computer system having one or more physical processors programmed with computer program instructions that, when executed by the one or more physical processors, program the computer system to perform the method, the method comprising:
obtaining, by the computer system, a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
creating, by the computer system, a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
causing, by the computer system, the first crowdsourced validation job to be provided to the one or more first validation devices, wherein the one or more first validation devices comprises at least two validation devices;
receiving, by the computer system, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content; and
assigning, by the computer system, a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair.

6. A computer-implemented method, the method being implemented in a computer system having one or more physical processors programmed with computer program instructions that, when executed by the one or more physical processors, program the computer system to perform the method, the method comprising:
obtaining, by the computer system, a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
creating, by the computer system, a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content, wherein the first instructions configure the one or more first validation devices to provide the natural language content and the text in a survey on a mobile application on the one or more first validation devices;
causing, by the computer system, the first crowdsourced validation job to be provided to the one or more first validation devices;
receiving, by the computer system, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content; and
assigning, by the computer system, a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair.

9. A computer-implemented method, the method being implemented in a computer system having one or more physical processors programmed with computer program instructions that, when executed by the one or more physical processors, program the computer system to perform the method, the method comprising:
obtaining, by the computer system, a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
creating, by the computer system, a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
causing, by the computer system, the first crowdsourced validation job to be provided to the one or more first validation devices;
receiving, by the computer system, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content;
assigning, by the computer system, a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair; and
storing the transcription pair in a validated transcription library if the validation score of the transcription pair exceeds a validation threshold.

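The storing step of claim 9 reduces to a threshold check. The sketch below assumes an in-memory dictionary as the validated transcription library and a threshold of 0.8; both are illustrative choices, and the names `store_if_validated` and `VALIDATION_THRESHOLD` are hypothetical. Because the claim says the score "exceeds" the threshold, the comparison is strict.

```python
from typing import Dict

VALIDATION_THRESHOLD = 0.8  # hypothetical value; the claim does not specify one


def store_if_validated(
    pair_id: str,
    text: str,
    score: float,
    library: Dict[str, str],
    threshold: float = VALIDATION_THRESHOLD,
) -> bool:
    """Store the pair only if its validation score exceeds the threshold."""
    if score > threshold:
        library[pair_id] = text
        return True
    return False


validated_library: Dict[str, str] = {}
stored_high = store_if_validated("clip-001", "turn on the lights", 0.9, validated_library)
stored_edge = store_if_validated("clip-002", "play some music", 0.8, validated_library)
```

After these two calls only `clip-001` is in the library, since a score equal to the threshold does not exceed it.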
11. A system comprising:
one or more physical processors programmed with one or more computer program instructions which, when executed, program the one or more physical processors to:
obtain a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
create a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
cause the first crowdsourced validation job to be provided to the one or more first validation devices;
receive, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content;
assign a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair;
determine whether or not the one or more first validation devices agreed the text is an accurate transcription of the natural language content; and
determine whether to provide the transcription pair to one or more second validation devices.

13. A system comprising:
one or more physical processors programmed with one or more computer program instructions which, when executed, program the one or more physical processors to:
obtain a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
create a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
cause the first crowdsourced validation job to be provided to the one or more first validation devices;
receive, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content;
assign a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair;
identify a confidence score of each of the one or more first validation devices, the confidence score representing confidence in the vote from the each of the one or more first validation devices; and
update the validation score with respect to the transcription pair using the confidence score.

14. A system comprising:
one or more physical processors programmed with one or more computer program instructions which, when executed, program the one or more physical processors to:
obtain a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
create a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
cause the first crowdsourced validation job to be provided to the one or more first validation devices, wherein the one or more first validation devices comprises at least two validation devices;
receive, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content; and
assign a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair.

15. A system comprising:
one or more physical processors programmed with one or more computer program instructions which, when executed, program the one or more physical processors to:
obtain a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
create a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content, wherein the first instructions configure the one or more first validation devices to provide the natural language content and the text in a survey on a mobile application on the one or more first validation devices;
cause the first crowdsourced validation job to be provided to the one or more first validation devices;
receive, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content; and
assign a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair.

19. A system comprising:
one or more physical processors programmed with one or more computer program instructions which, when executed, program the one or more physical processors to:
obtain a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
create a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
cause the first crowdsourced validation job to be provided to the one or more first validation devices, wherein to cause the first crowdsourced validation job to be provided to the one or more first validation devices, the one or more physical processors are further programmed to:
    provide, to the one or more first validation devices, a message that includes a direct selectable link to an interface, wherein the interface includes the first crowdsourced validation job;
receive, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content; and
assign a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair.

20. A system comprising:
one or more physical processors programmed with one or more computer program instructions which, when executed, program the one or more physical processors to:
obtain a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
create a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
cause the first crowdsourced validation job to be provided to the one or more first validation devices in a validator application;
receive, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content; and
assign a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair.

21. A system comprising:
one or more physical processors programmed with one or more computer program instructions which, when executed, program the one or more physical processors to:
obtain a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
create a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
cause the first crowdsourced validation job to be provided to the one or more first validation devices;
receive, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content;
assign a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair; and
store the transcription pair in a validated transcription library if the validation score of the transcription pair exceeds a validation threshold.

23. A computer-implemented method, the method being implemented in a computer system having one or more physical processors programmed with computer program instructions that, when executed by the one or more physical processors, program the computer system to perform the method, the method comprising:
obtaining, by the computer system, a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
creating, by the computer system, a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
causing, by the computer system, the first crowdsourced validation job to be provided to the one or more first validation devices, wherein causing the first crowdsourced validation job to be provided to the one or more first validation devices comprises:
    providing, by the computer system, to the one or more first validation devices, a message that includes a direct selectable link to an interface, wherein the interface includes the first crowdsourced validation job;
receiving, by the computer system, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content; and
assigning, by the computer system, a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair.

24. A computer-implemented method, the method being implemented in a computer system having one or more physical processors programmed with computer program instructions that, when executed by the one or more physical processors, program the computer system to perform the method, the method comprising:
obtaining, by the computer system, a transcription pair comprising natural language content and text, wherein the natural language content comprises audio content received from one or more audio input components, and wherein the text corresponds to a transcription of the natural language content;
creating, by the computer system, a first crowdsourced validation job to be performed at one or more first validation devices, the first crowdsourced validation job providing the one or more first validation devices with first instructions for a crowd user to provide a determination of whether or not the text is an accurate transcription of the natural language content;
causing, by the computer system, the first crowdsourced validation job to be provided to the one or more first validation devices in a validator application;
receiving, by the computer system, from each of the one or more first validation devices, a vote representing the determination of whether or not the text is an accurate transcription of the natural language content; and
assigning, by the computer system, a validation score to the transcription pair based, at least in part, on votes received from the one or more first validation devices with respect to the transcription pair.

Specification