Machined book detection
Abstract
A system and method for determining whether a textual work submitted for publishing is machine generated or non-machine generated by identifying and quantifying various aspects of the textual work and comparing those aspects to known works. For example, the system and method may identify aspects of a textual work, including a relationship between the sentences within the textual work, a writing style of the author of the textual work, a grammatical structure of the sentences within the textual work, a quality of the textual work, and other aspects of the textual work. Upon determining that the textual work is machine generated, the textual work may be rejected for publishing.
31 Citations
18 Claims
1. A computer-implemented method, comprising:
storing, in computer memory, a textual work;
generating an N-gram of N words of the textual work;
generating a 2-dimensional plot based, at least in part, on the N-gram;
storing, in computer memory, data corresponding to at least one 2-dimensional representation of at least one pre-determined machine generated work;
processing the 2-dimensional plot and the at least one 2-dimensional representation to obtain a score indicative of a correlation between characteristics of the textual work and characteristics of the at least one pre-determined machine generated work; and
determining, based at least in part on the score, that the textual work comprises at least a portion of machine generated grammatically unintelligible text.
Dependent claims: 2, 3, 4, 5, 6, 7
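The steps recited in claim 1 (N-gram generation, a 2-dimensional plot, and a correlation score against a known machine generated work) can be illustrated with a minimal Python sketch. The claim does not say what the plot's axes are or how the correlation is computed, so everything specific here is an assumption made for illustration: the rank versus relative-frequency plot, the Pearson correlation, the threshold of 0.9, and the function names `ngrams`, `rank_frequency_plot`, `correlation_score`, and `looks_machine_generated`. It is not the patented implementation.

```python
from collections import Counter
import math


def ngrams(text, n=2):
    """Return every run of n consecutive words in the text as a tuple."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def rank_frequency_plot(text, n=2, points=5):
    """Build a 2-D plot as (rank, relative frequency) pairs for the top N-grams.

    Assumption: the claim does not define the axes; rank vs. frequency is
    one plausible choice used here for illustration only.
    """
    counts = Counter(ngrams(text, n))
    total = sum(counts.values())
    top = sorted(counts.values(), reverse=True)[:points]
    top += [0] * (points - len(top))  # pad so all plots have equal length
    return [(rank + 1, (c / total) if total else 0.0)
            for rank, c in enumerate(top)]


def correlation_score(plot_a, plot_b):
    """Pearson correlation between the y-values of two equal-length plots."""
    ys_a = [y for _, y in plot_a]
    ys_b = [y for _, y in plot_b]
    mean_a = sum(ys_a) / len(ys_a)
    mean_b = sum(ys_b) / len(ys_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ys_a, ys_b))
    var_a = sum((a - mean_a) ** 2 for a in ys_a)
    var_b = sum((b - mean_b) ** 2 for b in ys_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat plot has no shape to correlate against
    return cov / math.sqrt(var_a * var_b)


def looks_machine_generated(text, reference_plots, threshold=0.9):
    """Flag the text if its plot correlates strongly with any stored plot.

    Assumption: the 0.9 threshold is hypothetical, not taken from the patent.
    """
    plot = rank_frequency_plot(text)
    return any(correlation_score(plot, ref) >= threshold
               for ref in reference_plots)
```

In use, `reference_plots` would hold plots computed ahead of time from works already known to be machine generated, and a submitted work would be flagged when `looks_machine_generated` returns `True`.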
8. A computing device, comprising:
at least one processor;
a memory device including instructions operable to be executed by the at least one processor to perform a set of actions, configuring the at least one processor;
to receive a textual work;
store, in computer memory, the textual work;
to generate an N-gram of N words of the textual work;
to generate a 2-dimensional plot based, at least in part, on the N-gram in the N-dimensional space;
to store, in computer memory, data corresponding to at least one 2-dimensional representation of at least one pre-determined machine generated work;
to process the 2-dimensional plot and the at least one 2-dimensional representation to obtain a score indicative of a correlation between characteristics of the textual work and characteristics of the at least one pre-determined machine generated work; and
to determine, based at least in part on the score, that the textual work comprises at least a portion of machine generated grammatically unintelligible text.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer-implemented method of identifying machine generated text, comprising:
receiving a textual work;
storing, in computer memory, the textual work;
parsing a portion of the textual work and identifying at least one of verbs, nouns, pronouns, adjectives, adverbs, prepositions, conjunctions, and interjections in each sentence of the portion;
representing, in the computer memory, the parsed portion of the textual work as a 2-dimensional representation of the identified at least one of verbs, nouns, pronouns, adjectives, adverbs, prepositions, conjunctions, and interjections;
applying one or more grammatical rules to the 2-dimensional representation;
calculating a confidence score corresponding to the application of the one or more grammatical rules; and
determining, based at least in part on the confidence score, that the textual work comprises at least a portion of machine generated grammatically unintelligible text.
Dependent claims: 16, 17, 18
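The parsing and rule-checking steps of claim 15 can be sketched in the same spirit. The claim does not specify the tagger, the layout of the 2-dimensional representation, the grammatical rules, or the scoring formula, so all of these are assumptions chosen for illustration: a toy hand-written `LEXICON` stands in for a real part-of-speech tagger, the representation is a sentence-by-tag count matrix, the two rules (`has_verb`, `has_subject`) are hypothetical, and the confidence score is simply the fraction of sentences that satisfy every rule.

```python
import re

# Column order of the 2-D representation: one column per part of speech.
POS_TAGS = ["verb", "noun", "pronoun", "adjective",
            "adverb", "preposition", "conjunction", "interjection"]

# Hypothetical toy lexicon; a real system would use a trained POS tagger.
LEXICON = {
    "runs": "verb", "reads": "verb",
    "dog": "noun", "book": "noun",
    "she": "pronoun", "blue": "adjective", "quickly": "adverb",
    "under": "preposition", "and": "conjunction", "wow": "interjection",
}


def split_sentences(text):
    """Split on sentence-ending punctuation and drop empty pieces."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def pos_matrix(text):
    """Return one row per sentence, holding a count for each tag in POS_TAGS."""
    rows = []
    for sentence in split_sentences(text):
        row = [0] * len(POS_TAGS)
        for word in re.findall(r"[a-z']+", sentence.lower()):
            tag = LEXICON.get(word)  # words outside the lexicon are skipped
            if tag is not None:
                row[POS_TAGS.index(tag)] += 1
        rows.append(row)
    return rows


def has_verb(row):
    """Hypothetical rule: a sentence needs at least one verb."""
    return row[POS_TAGS.index("verb")] > 0


def has_subject(row):
    """Hypothetical rule: a sentence needs a noun or a pronoun."""
    return row[POS_TAGS.index("noun")] + row[POS_TAGS.index("pronoun")] > 0


RULES = [has_verb, has_subject]


def confidence_score(matrix):
    """Fraction of sentences that satisfy every rule (0.0 for empty input)."""
    if not matrix:
        return 0.0
    passed = sum(all(rule(row) for rule in RULES) for row in matrix)
    return passed / len(matrix)
```

Under these assumptions a low score suggests grammatically unintelligible text. For example, `confidence_score(pos_matrix("Dog book quickly. Under and blue."))` is 0.0 because neither sentence contains a verb.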
Specification