Searching the internet for common elements in a document in order to detect plagiarism
Abstract
A method and system for detecting plagiarism of software source code is disclosed. In one embodiment, a database exists of program elements that have previously been found to be matching within the source code for two different programs. This embodiment searches the Internet for occurrences of these matching program elements to determine how many times they appear and thus whether they are commonly used or not. The elements and their associated number of hits are placed in a spreadsheet for further observation and manipulation. One of skill in the art will see that this invention also applies to other kinds of text documents.
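The flow described in the abstract (read matching elements from a database, search for each one, record the hit counts in a spreadsheet) might be sketched as follows. This is a minimal illustration and not the patented implementation: the `matching_elements` table, the function names, and the CSV output format are all assumptions, and `fake_search` is a stand-in with made-up counts, since the source does not name a particular search engine or API.

```python
import csv
import io
import sqlite3
from typing import Callable, Iterable


def read_matching_elements(conn: sqlite3.Connection) -> list[str]:
    """Read elements previously found in both programs (hypothetical schema)."""
    rows = conn.execute("SELECT element FROM matching_elements ORDER BY id")
    return [row[0] for row in rows]


def count_hits(elements: Iterable[str],
               search: Callable[[str], int]) -> list[tuple[str, int]]:
    """Send each element to the search backend and record its hit count."""
    return [(element, search(element)) for element in elements]


def write_spreadsheet(results: list[tuple[str, int]], out) -> None:
    """Write element/hit-count pairs as CSV, which spreadsheet tools can open."""
    writer = csv.writer(out)
    writer.writerow(["element", "hits"])
    writer.writerows(results)


def fake_search(element: str) -> int:
    """Stand-in for a real search engine; the counts are invented for the demo."""
    canned = {"strcpy": 1_000_000, "zqComputeFrobnicate": 0}
    return canned.get(element, 0)


def demo() -> str:
    # Build a throwaway in-memory database with two example elements.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE matching_elements (id INTEGER PRIMARY KEY, element TEXT)"
    )
    conn.executemany(
        "INSERT INTO matching_elements (element) VALUES (?)",
        [("strcpy",), ("zqComputeFrobnicate",)],
    )
    results = count_hits(read_matching_elements(conn), fake_search)
    buffer = io.StringIO()
    write_spreadsheet(results, buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(demo())
```

Passing the search function in as a parameter keeps the sketch independent of any specific search service; a real backend would replace `fake_search` with a call that returns the number of hits for the query.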
15 Claims
1. A computer-implemented method for detecting plagiarism between files, the method comprising:
- reading, by a computer system, an element from a matching element database, wherein the element in the matching element database is text that has been determined to exist in each of first and second files and an indication of a correlation between the first and second files;
- sending, by the computer system, said element that has been determined to exist in each of first and second files to a search engine, wherein the search engine searches a plurality of sources for one or more hits of said element with respect to the plurality of sources;
- receiving, by the computer system, from said search engine a number of the hits;
- displaying, by the computer system, to a user said element and said number of hits for said element as an indication of whether or not the correlation is due to plagiarism between the first and second files.
Dependent claims: 2, 3, 4, 5
6. A non-transitory computer-readable storage medium storing executable instructions to cause a computer system to perform a method for detecting plagiarism between files, the method comprising:
- reading an element from a matching element database, wherein the element in the matching element database is text determined to exist in each of first and second files and an indication of a correlation between the first and second files;
- sending said element that has been determined to exist in each of first and second files to a search engine, wherein the search engine searches a plurality of sources for one or more hits of said element with respect to the plurality of sources;
- receiving from said search engine a number of the hits;
- displaying to a user said element and said number of hits for said element as an indication of whether or not the correlation is due to plagiarism between the first and second files.
Dependent claims: 7, 8, 9, 10
11. An apparatus for detecting plagiarism between files, the apparatus comprising:
- a memory; and
- a processor configured to
  - read an element from a matching element database, wherein the element in the matching element database is text that has been determined to exist in each of first and second files and an indication of a correlation between the first and second files;
  - send said element that has been determined to exist in each of first and second files to a search engine to search a plurality of sources for one or more hits of said element with respect to the plurality of sources;
  - receive from said search engine a number of the hits; and
  - display to a user said element and said number of hits for said element as an indication of whether or not the correlation is due to plagiarism between the first and second files.
Dependent claims: 12, 13, 14, 15
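The final step shared by the three independent claims, displaying each element with its number of hits, could look roughly like the sketch below. The claims do not define a cutoff between "commonly used" and not, so the `common_threshold` parameter, its default of 1000, and the "common"/"rare" labels are purely illustrative assumptions, as are the `HitResult` type and the example counts.

```python
from typing import Iterable, NamedTuple


class HitResult(NamedTuple):
    """One matching element and the hit count a search engine returned for it."""
    element: str
    hits: int


def label(hits: int, common_threshold: int) -> str:
    """Hypothetical reading aid: the source does not specify any threshold."""
    return "common" if hits >= common_threshold else "rare"


def format_report(results: Iterable[HitResult],
                  common_threshold: int = 1000) -> str:
    """Render element/hit pairs as a text table, fewest hits first."""
    ordered = sorted(results, key=lambda r: r.hits)
    # Size the first column to the longest element, or the header if longer.
    width = max([len("element")] + [len(r.element) for r in ordered])
    lines = [f"{'element':<{width}}  {'hits':>10}  note"]
    for r in ordered:
        note = label(r.hits, common_threshold)
        lines.append(f"{r.element:<{width}}  {r.hits:>10}  {note}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        HitResult("strcpy", 1_000_000),        # invented count
        HitResult("zqComputeFrobnicate", 0),   # invented count
    ]
    print(format_report(sample))
```

Sorting by ascending hit count puts the elements that appear least often elsewhere at the top, since those are the ones for which common usage does not explain the match. A low count is only an indication for the user to examine further, not a finding of plagiarism.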
Specification