Cloud-based plagiarism detection system
Abstract
Plagiarism may be detected, as disclosed herein, utilizing a database that stores documents for one or more courses. The database may restrict sharing of content between documents. A feature extraction module may receive edits and timestamp the edits to the document. A writing pattern for a particular user or group of users may be discerned from the temporal data and the documents for the particular user or group of users. A feature vector may be generated that represents the writing pattern. A machine learning technique may be applied to the feature vector to determine whether or not a document is plagiarized.
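The step from timestamped edits to a feature vector can be illustrated with a minimal sketch. The `Edit` record, its fields, and the four chosen features are hypothetical and are not taken from the disclosure; they only show one way temporal edit data might be summarized as a fixed-length vector.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Edit:
    # Hypothetical record for one stored edit; field names are assumptions.
    timestamp: float    # seconds since the document was created (assumed unit)
    chars_added: int
    chars_deleted: int


def feature_vector(edits):
    """Summarize an edit history as [edit count, mean gap between edits,
    share of text added by the largest single edit, delete/add ratio]."""
    if not edits:
        return [0.0, 0.0, 0.0, 0.0]
    ordered = sorted(edits, key=lambda e: e.timestamp)
    # Time between consecutive edits; empty when there is only one edit.
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    total_added = sum(e.chars_added for e in ordered)
    total_deleted = sum(e.chars_deleted for e in ordered)
    largest = max(e.chars_added for e in ordered)
    return [
        float(len(ordered)),
        mean(gaps) if gaps else 0.0,
        largest / total_added if total_added else 0.0,
        total_deleted / total_added if total_added else 0.0,
    ]
```

Under these assumptions, a document produced by one large paste would show a largest-edit share near 1.0 and almost no deletions, while a document written incrementally would show many small edits spread over time.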
13 Claims
1. A system, comprising:
- a database for storing a plurality of documents, each of the plurality of documents associated with at least one user, wherein the database is configured to:
  - receive a group of documents related to a course;
  - receive at least one edit to one of the group of documents by at least one user;
  - store the at least one edit and at least one time reference corresponding to the time during which the at least one edit was made;
  - wherein sharing of content between each of the plurality of documents is restricted;
- a feature extraction module configured to:
  - obtain a writing history for the at least one user associated with the one of the group of documents;
  - determine a writing pattern associated with the one of the group of documents based on the writing history for the at least one user, the at least one edit, and at least one time reference; and
  - generate a feature vector for the writing pattern.
Dependent claims: 2, 3.
4. A system, comprising:
- a database for storing a plurality of documents, wherein sharing of the documents is restricted;
- a processor connected to the database and configured to:
  - receive an edit to a document stored in the database;
  - associate a time reference with the edit to the document;
  - store the edit and the time reference to the database as a document history;
  - generate a feature vector based on the document history; and
  - determine a probability that the document is plagiarized based on a classification of the feature vector by a machine learning technique.
Dependent claims: 5, 6.
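The final element of claim 4, a probability obtained by classifying the feature vector, can be sketched as follows. The claim names no particular machine learning technique, so logistic regression trained by stochastic gradient descent is used here purely as an assumed stand-in; the function names and the assumption that features are scaled to roughly [0, 1] are illustrative.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(vectors, labels, lr=0.1, epochs=2000):
    """Fit weights and bias on labelled feature vectors
    (label 1 = plagiarized, 0 = original). Assumes scaled features."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def plagiarism_probability(model, x):
    """Return the model's estimated probability that vector x is plagiarized."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
```

In this sketch the output is a value between 0 and 1 rather than a hard yes/no decision, which matches the claim's wording of determining a probability; any threshold for flagging a document would be a separate policy choice.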
7. A method, comprising:
- receiving a group of documents related to a course;
- receiving at least one edit to one of the group of documents by at least one user;
- storing the at least one edit and at least one time reference corresponding to the time during which the at least one edit was made;
- obtaining a writing history for the at least one user associated with the one of the group of documents;
- determining a writing pattern associated with the one of the group of documents based on the writing history for the at least one user, the at least one edit, and at least one time reference; and
- generating a feature vector for the writing pattern.
Dependent claims: 8, 9, 10.
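Claim 7 ties the writing pattern to the user's writing history. One possible reading, offered only as a sketch, is to express a document's features as deviations from the same user's earlier documents. The z-score approach and the function name `deviation_features` are assumptions and are not drawn from the claims.

```python
from statistics import mean, pstdev


def deviation_features(history, current):
    """Express each feature of `current` as a z-score against the user's
    earlier documents. `history` is a non-empty list of feature vectors
    with the same length as `current`."""
    result = []
    for i, value in enumerate(current):
        past = [vector[i] for vector in history]
        mu = mean(past)
        sigma = pstdev(past)
        # With no variation in the history the z-score is undefined;
        # this sketch reports 0.0 in that case.
        result.append((value - mu) / sigma if sigma > 0 else 0.0)
    return result
```

Under this reading, a large absolute value signals that the document was written in a way that is unusual for that user, which is the kind of signal a downstream classifier could consume.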
11. A method, comprising:
- receiving an edit to a document stored in a database;
- associating a time reference with the edit to the document;
- storing the edit and the time reference to the database as a document history;
- generating a feature vector based on the document history; and
- determining a probability that the document is plagiarized based on a classification of the feature vector by a machine learning technique.
Dependent claims: 12, 13.
Specification