Cloud-based plagiarism detection system performing predicting based on classified feature vectors

  • US 9,514,417 B2
  • Filed: 12/30/2013
  • Issued: 12/06/2016
  • Est. Priority Date: 12/30/2013
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method comprising:

    generating training data for training a predictive model using a machine learning technique to estimate a probability that a given document is plagiarized or is not plagiarized, the training data including, for each of a plurality of training documents, a feature vector that includes (i) data referencing a content of an edit to the training document, (ii) data referencing a type of the edit to the training document, (iii) data referencing a time associated with the edit to the training document, and (iv) a label indicating whether the training document is or is not plagiarized;

    training the predictive model using the training data;

    after training the predictive model, identifying a particular document stored in a database;

    receiving data referencing (i) a content of an edit to the particular document stored in the database, and (ii) a time associated with the edit to the particular document;

    generating a feature vector based at least on the data referencing (i) the content of the edit to the particular document stored in the database, and (ii) the time associated with the edit to the particular document; and

    determining a probability that the particular document is plagiarized or is not plagiarized based on classifying the feature vector by the predictive model that is trained using the machine learning technique.
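
The steps of claim 1 can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the claim does not name a machine learning technique, so logistic regression trained by stochastic gradient descent is swapped in; the `Edit` record, the `EDIT_TYPES` vocabulary, the numeric encoding in `feature_vector`, and the toy training data are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical edit-type vocabulary; the claim does not enumerate edit types.
EDIT_TYPES = ("insert", "delete", "paste")

@dataclass
class Edit:
    content: str     # text added or removed by the edit
    edit_type: str   # one of EDIT_TYPES
    seconds: float   # time of the edit, in seconds since document creation (assumed unit)

def feature_vector(edit: Edit) -> List[float]:
    """Encode content, type, and time of an edit as numbers (illustrative encoding)."""
    one_hot = [1.0 if edit.edit_type == t else 0.0 for t in EDIT_TYPES]
    return [math.log1p(len(edit.content)), *one_hot, math.log1p(edit.seconds)]

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

Model = Tuple[List[float], float]  # (weights, bias)

def train(vectors: List[List[float]], labels: List[int],
          epochs: int = 500, lr: float = 0.1) -> Model:
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def plagiarism_probability(model: Model, edit: Edit) -> float:
    """Classify the edit's feature vector and return P(plagiarized)."""
    weights, bias = model
    x = feature_vector(edit)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Made-up training documents: (edit, label), label 1 = plagiarized, 0 = not.
# The claim places the label inside the feature vector; it is kept alongside here.
training = [
    (Edit("x" * 2000, "paste", 30), 1),
    (Edit("x" * 1500, "paste", 60), 1),
    (Edit("x" * 3000, "paste", 10), 1),
    (Edit("x" * 12, "insert", 600), 0),
    (Edit("x" * 8, "insert", 1800), 0),
    (Edit("x" * 5, "delete", 3600), 0),
]
model = train([feature_vector(e) for e, _ in training],
              [label for _, label in training])

# After training: score edits to particular documents retrieved from storage.
p_large_paste = plagiarism_probability(model, Edit("x" * 2500, "paste", 20))
p_small_insert = plagiarism_probability(model, Edit("x" * 10, "insert", 1200))
print(f"large early paste: {p_large_paste:.3f}")
print(f"small later insert: {p_small_insert:.3f}")
```

The sketch scores one edit per document to stay short; a system handling many edits per document would need some way to aggregate them, which the claim leaves open.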

  • 2 Assignments