SYSTEM AND METHOD FOR CROSS-DOMAIN TRANSFERABLE NEURAL COHERENCE MODEL
Abstract
Systems and methods of automatically generating a coherence score for text data are provided. The approach includes receiving a plurality of string tokens representing decomposed portions of a target text data object. A neural network is provided that has been trained against a plurality of corpuses of training text across a plurality of topics. The string tokens are arranged to extract string tokens representing adjacent sentence pairs of the target text data object. For each adjacent sentence pair, the neural network generates a local coherence score representing a coherence level of that pair. The local coherence scores are then aggregated across all adjacent sentence pairs to generate a global coherence score for the target text data object.
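The scoring flow summarized in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: `adjacent_pairs`, `global_coherence`, and `overlap_scorer` are illustrative names, the token-overlap scorer is only a toy stand-in for the trained neural network, and averaging is one assumed way to aggregate, since the abstract does not fix the aggregation function.

```python
from typing import Callable, List, Sequence, Tuple

Sentence = List[str]  # one sentence, already decomposed into string tokens


def adjacent_pairs(sentences: Sequence[Sentence]) -> List[Tuple[Sentence, Sentence]]:
    """Pair each sentence with the sentence that immediately follows it."""
    return list(zip(sentences, sentences[1:]))


def global_coherence(
    sentences: Sequence[Sentence],
    local_scorer: Callable[[Sentence, Sentence], float],
) -> Tuple[float, List[float]]:
    """Score every adjacent pair, then aggregate the local scores.

    Assumption: the aggregate is the arithmetic mean of the local scores.
    """
    local_scores = [local_scorer(a, b) for a, b in adjacent_pairs(sentences)]
    if not local_scores:  # fewer than two sentences: nothing to score
        return 0.0, []
    return sum(local_scores) / len(local_scores), local_scores


def overlap_scorer(first: Sentence, second: Sentence) -> float:
    """Toy stand-in for the trained network: Jaccard overlap of token sets."""
    a, b = set(first), set(second)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    document = [
        ["the", "cat", "sat"],
        ["the", "cat", "slept"],
        ["dogs", "bark"],
    ]
    score, per_pair = global_coherence(document, overlap_scorer)
    print(score, per_pair)
```

In a real system the `local_scorer` argument would wrap a call to the trained model. Passing it as a parameter keeps pair extraction and aggregation independent of how the local score is produced.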
22 Claims
1. A system of automatically generating a coherence score for a target text data object, the system comprising a processor operating in conjunction with non-transitory computer memory and a data storage, the processor configured to:
receive, at a string token receiver, a plurality of string tokens representing decomposed portions of the target text data object;
maintain, on the data storage, a neural network trained against a plurality of corpuses of training text across a plurality of topics, the neural network trained using string tokens of adjacent sentence pairs of the training text as positive training examples and string tokens of non-adjacent sentence pairs of the training text as negative training examples;
arrange the string tokens to extract string tokens representing adjacent sentence pairs of the target text data object;
for each adjacent sentence pair, determine, using the neural network, a local coherence score representing a coherence level of the adjacent sentence pair of the target text data object;
aggregate the generated local coherence scores for each adjacent sentence pair of the target text data object to generate a global coherence score for the target text data object; and
store the global coherence score or the generated local coherence scores in a data storage.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A method of automatically generating a coherence score for a target text data object, the method comprising:
receiving a plurality of string tokens representing decomposed portions of the target text data object;
providing a neural network trained against a plurality of corpuses of training text across a plurality of topics, the neural network trained using string tokens of adjacent sentence pairs of the training text as positive examples and string tokens of non-adjacent sentence pairs of the training text as negative examples;
arranging the string tokens to extract string tokens representing adjacent sentence pairs of the target text data object;
for each adjacent sentence pair, determining, using the neural network, a local coherence score representing a coherence level of the adjacent sentence pair of the target text data object;
aggregating the generated local coherence scores for each adjacent sentence pair of the target text data object to generate a global coherence score for the target text data object; and
storing the global coherence score or the generated local coherence scores in a data storage.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20.
21. A non-transitory computer readable medium storing machine interpretable instructions, which when executed by a processor, cause the processor to perform a method of automatically generating a coherence score for a target text data object, the method comprising:
receiving a plurality of string tokens representing decomposed portions of the target text data object;
providing a neural network trained against a plurality of corpuses of training text across a plurality of topics, the neural network trained using string tokens of adjacent sentence pairs of the training text as positive examples and string tokens of non-adjacent sentence pairs of the training text as negative examples;
arranging the string tokens to extract string tokens representing adjacent sentence pairs of the target text data object;
for each adjacent sentence pair, determining, using the neural network, a local coherence score representing a coherence level of the adjacent sentence pair of the target text data object;
aggregating the generated local coherence scores for each adjacent sentence pair of the target text data object to generate a global coherence score for the target text data object; and
storing the global coherence score or the generated local coherence scores in a data storage.
22. A non-transitory computer readable medium storing a trained neural network as machine interpretable instructions, the trained neural network which when executed by a processor, causes the processor to perform a method of automatically generating a coherence score for a target text data object, the method comprising:
receiving a plurality of string tokens representing decomposed portions of the target text data object;
providing the trained neural network that was trained against a plurality of corpuses of training text across a plurality of topics, the trained neural network trained using string tokens of adjacent sentence pairs of the training text as positive examples and string tokens of non-adjacent sentence pairs of the training text as negative examples;
arranging the string tokens to extract string tokens representing adjacent sentence pairs of the target text data object;
for each adjacent sentence pair, determining, using the trained neural network, a local coherence score representing a coherence level of the adjacent sentence pair of the target text data object;
aggregating the generated local coherence scores for each adjacent sentence pair of the target text data object to generate a global coherence score for the target text data object; and
storing the global coherence score or the generated local coherence scores in a data storage.
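The independent claims recite training on adjacent sentence pairs as positive examples and non-adjacent sentence pairs as negative examples. The following is a minimal sketch of how such examples might be assembled from one tokenized training text. The function name `build_training_examples`, the `negatives_per_positive` parameter, and the uniform random sampling of negatives are all assumptions made for illustration; the claims do not specify a sampling strategy.

```python
import random
from typing import List, Optional, Sequence, Tuple

Sentence = List[str]                      # one sentence as string tokens
Example = Tuple[Sentence, Sentence, int]  # (first, second, label)


def build_training_examples(
    sentences: Sequence[Sentence],
    negatives_per_positive: int = 1,
    rng: Optional[random.Random] = None,
) -> List[Example]:
    """Label adjacent pairs 1 and sampled non-adjacent pairs 0.

    Assumption: negatives are drawn uniformly from sentences that are
    neither the anchor sentence nor one of its immediate neighbours.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    examples: List[Example] = []
    n = len(sentences)
    for i in range(n - 1):
        # Positive example: sentence i followed by sentence i + 1.
        examples.append((sentences[i], sentences[i + 1], 1))
        # Negative candidates: every index more than one position away.
        candidates = [j for j in range(n) if abs(j - i) > 1]
        k = min(negatives_per_positive, len(candidates))
        for j in rng.sample(candidates, k):
            examples.append((sentences[i], sentences[j], 0))
    return examples


if __name__ == "__main__":
    text = [["a"], ["b"], ["c"], ["d"]]
    for first, second, label in build_training_examples(text):
        print(label, first, second)
```

Training "across a plurality of topics", as the claims put it, would correspond to calling such a function on texts from several corpora and pooling the resulting examples. The pooling step is omitted here.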
Specification