Machined book detection
First Claim
1. A computer-implemented method of identifying machine generated text, comprising:
analyzing a plurality of pre-determined machine generated works;
storing, in a database, pre-determined centers of mass corresponding to the pre-determined machine generated works;
storing, in the database, pre-determined shape descriptors corresponding to the pre-determined machine generated works;
analyzing a plurality of pre-determined non-machine generated works;
storing, in the database, pre-determined centers of mass corresponding to the pre-determined non-machine generated works;
storing, in the database, pre-determined shape descriptors corresponding to the pre-determined non-machine generated works;
receiving a textual work submitted for publishing;
generating an N-gram of N words of the textual work;
plotting the N-gram in an N-dimensional space;
generating a 2-dimensional plot based, at least in part, on the plot of the N-gram in N-dimensional space;
calculating a center of mass of the 2-dimensional plot;
calculating a shape descriptor of the 2-dimensional plot, wherein the shape descriptor includes one or more points defining a shape of the 2-dimensional plot;
comparing the center of mass to the pre-determined centers of mass and the shape descriptor to the pre-determined shape descriptors;
calculating, based at least in part on the comparison, a confidence score indicative of a correlation between the textual work and at least one of the pre-determined machine generated works;
determining, based at least in part on the confidence score, that the textual work is machine generated; and
rejecting the textual work for publishing based on the determination.
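The claim does not say how an N-gram is mapped to coordinates, how the N-dimensional plot is reduced to two dimensions, or what form the shape descriptor takes. As a minimal sketch only, the Python below assumes that each word contributes its length as one coordinate, that the 2-dimensional plot sums the even-indexed and odd-indexed coordinates, and that the shape descriptor is the set of convex hull vertices. These choices, and the helper names `ngrams`, `to_nd_point`, `project_2d`, `center_of_mass`, `convex_hull`, and `describe`, are illustrative assumptions rather than the patented method.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def ngrams(text: str, n: int) -> List[Tuple[str, ...]]:
    """Return every run of n consecutive words in the text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def to_nd_point(gram: Tuple[str, ...]) -> Tuple[float, ...]:
    # Assumption: one coordinate per word, using word length as the value.
    return tuple(float(len(word)) for word in gram)


def project_2d(point: Tuple[float, ...]) -> Point:
    # Assumption: collapse N dimensions to 2 by summing even- and odd-indexed axes.
    return (sum(point[0::2]), sum(point[1::2]))


def center_of_mass(points: List[Point]) -> Point:
    """Arithmetic mean of the 2-D points (every point has equal weight)."""
    count = len(points)
    return (sum(x for x, _ in points) / count, sum(y for _, y in points) / count)


def convex_hull(points: List[Point]) -> List[Point]:
    """Hull vertices via Andrew's monotone chain, in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def describe(text: str, n: int = 3) -> Tuple[Point, List[Point]]:
    """Return (center of mass, shape descriptor) for the 2-D plot of a text."""
    points = [project_2d(to_nd_point(g)) for g in ngrams(text, n)]
    if not points:
        raise ValueError("text is shorter than n words")
    return center_of_mass(points), convex_hull(points)
```

For example, `describe("the cat sat on the mat", 3)` yields one center of mass and a list of hull vertices, which could then be stored or compared against the pre-determined values for known works.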
Abstract
A system and method for determining whether a textual work submitted for publishing is machine generated or non-machine generated by identifying and quantifying various aspects of the textual work and comparing those aspects to known works. For example, the system and method may identify aspects of a textual work, including a relationship between the sentences within the textual work, a writing style of the author of the textual work, a grammatical structure of the sentences within the textual work, a quality of the textual work, and other aspects of the textual work. Upon determining that the textual work is machine generated, the textual work may be rejected for publishing.
22 Claims
1. A computer-implemented method of identifying machine generated text (recited in full under First Claim above). Dependent claims: 2, 3, 4, 5.
6. A computer-implemented method, comprising:
identifying pre-determined shape descriptors corresponding to pre-determined non-machine generated works;
generating an N-gram of N words of a textual work;
plotting the N-gram in an N-dimensional space;
generating a 2-dimensional plot based, at least in part, on the plot of the N-gram in N-dimensional space;
comparing a shape of the 2-dimensional plot of the textual work to the pre-determined shape descriptors; and
determining, based at least in part on the comparison, that the textual work is a desired work.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14
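The comparing and determining steps of claim 6 leave the similarity measure open. One way to picture them, offered only as a sketch, is to treat each shape descriptor as a set of 2-D points, measure the symmetric Hausdorff distance to each pre-determined descriptor, and turn the smallest distance into a score between 0 and 1. The Hausdorff distance, the `scale` and `threshold` parameters, and the function names `hausdorff`, `confidence`, and `is_desired_work` are all assumptions made for illustration; the claim does not name them.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance between two non-empty point sets."""

    def directed(src: Sequence[Point], dst: Sequence[Point]) -> float:
        # Farthest that any point of src lies from its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


def confidence(shape: Sequence[Point],
               references: List[Sequence[Point]],
               scale: float = 1.0) -> float:
    """Score in (0, 1]; 1.0 means the shape matches some reference exactly."""
    best = min(hausdorff(shape, ref) for ref in references)
    return 1.0 / (1.0 + best / scale)


def is_desired_work(shape: Sequence[Point],
                    human_shapes: List[Sequence[Point]],
                    threshold: float = 0.5) -> bool:
    # Assumption: "desired" means close enough to a known non-machine generated shape.
    return confidence(shape, human_shapes) >= threshold
```

The same scoring could be run against descriptors of machine generated works to obtain the confidence score recited in claim 1, with the decision inverted.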
15. A computing device, comprising:
at least one processor;
a memory device including instructions operable to be executed by the at least one processor to perform a set of actions, configuring the processor:
to identify pre-determined shape descriptors corresponding to pre-determined non-machine generated works;
to generate an N-gram of N words of a textual work;
to plot the N-gram in an N-dimensional space;
to generate a 2-dimensional plot based, at least in part, on the plot of the N-gram in N-dimensional space;
to compare a shape of the 2-dimensional plot of the textual work to the pre-determined shape descriptors; and
to determine, based at least in part on the comparison, that the textual work is a desired work.
Dependent claims: 16, 17, 18, 19, 20, 21, 22
Specification