Automatic detection and application of editing patterns in draft documents
Abstract
An error detection and correction system extracts editing patterns and derives correction rules from them by observing differences between draft documents and corresponding edited documents, and/or by observing editing operations performed on the draft documents to produce the edited documents. The system develops classifiers that partition the space of all possible contexts into equivalence classes and assign one or more correction rules to each such class. Once the system has been trained, it may be used to detect and (optionally) correct errors in new draft documents. When presented with a draft document, the system identifies first content (e.g., text) in the draft document and identifies a context of the first content. The system identifies a correction rule based on the first content and the first context. The system may use a classifier to identify the correction rule. The system applies the correction rule to the first content to produce second content.
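The training and correction flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`extract_patterns`, `derive_rules`, `correct`), the use of the single preceding token as the context, and the majority-vote choice of replacement are all assumptions made for illustration. Patterns are extracted by diffing a draft against its edited counterpart with the standard library's `difflib.SequenceMatcher`.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher

START = "<s>"  # assumed marker for "no preceding token"


def extract_patterns(draft, edited):
    """Return (D, E, C) tuples for each replaced span.

    Assumption: the context C is the token just before the span,
    which lies in an unchanged region and is therefore shared by D and E.
    """
    d_tok, e_tok = draft.split(), edited.split()
    matcher = SequenceMatcher(a=d_tok, b=e_tok, autojunk=False)
    patterns = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            context = d_tok[i1 - 1] if i1 > 0 else START
            patterns.append((" ".join(d_tok[i1:i2]), " ".join(e_tok[j1:j2]), context))
    return patterns


def derive_rules(patterns):
    """Map (D, C) to the replacement E seen most often in training."""
    counts = defaultdict(Counter)
    for d, e, c in patterns:
        counts[(d, c)][e] += 1
    return {key: seen.most_common(1)[0][0] for key, seen in counts.items()}


def correct(draft, rules):
    """Apply rules to a new draft. This sketch only handles single-token D."""
    tokens = draft.split()
    out = []
    for i, tok in enumerate(tokens):
        context = tokens[i - 1] if i > 0 else START
        out.append(rules.get((tok, context), tok))
    return " ".join(out)


# Hypothetical training pair and new draft.
patterns = extract_patterns(
    "the patient has no know allergies",
    "the patient has no known allergies",
)
rules = derive_rules(patterns)
corrected = correct("she has no know drug allergies", rules)
```

In this sketch the rule fires only where both the content and its context match, so "know" after "no" is rewritten while "know" after "I" is left alone.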
20 Claims
1. A method performed by a computer processor executing computer program instructions tangibly stored on at least one non-transitory computer-readable medium, the method comprising:
(A) monitoring, by an editing pattern identifier, an editing operation performed while a user edits content D in an original document corpus to produce content E in an edited document corpus;
(B) tangibly storing, on the at least one computer-readable medium, a data structure representing at least one editing pattern identified based on the monitored editing operation, the at least one editing pattern of the form T=(D,E,C), wherein each of the plurality of editing patterns relates particular content D in an original document corpus to corresponding content E in an edited document corpus in a context C shared by contents D and E, wherein the original document corpus and the edited document corpus are tangibly stored on the at least one computer-readable medium;
(C) deriving a plurality of correction rules, tangibly stored on the at least one computer-readable medium, from the plurality of editing patterns; and
(D) deriving a classifier, tangibly stored on the at least one computer-readable medium, for particular content D based on the data structure representing the plurality of editing patterns, the classifier defining decision criteria for selecting one of the plurality of correction rules to apply to content D based on a context C* of content D.
11. A non-transitory computer readable medium comprising computer program instructions executable by a computer process to perform a method, the method comprising:
(A) monitoring, by an editing pattern identifier, an editing operation performed while a user edits content D in an original document corpus to produce content E in an edited document corpus;
(B) tangibly storing, on the at least one computer-readable medium, a data structure representing at least one editing pattern identified based on the monitored editing operation, the at least one editing pattern of the form T=(D,E,C), wherein each of the plurality of editing patterns relates particular content D in an original document corpus to corresponding content E in an edited document corpus in a context C shared by contents D and E, wherein the original document corpus and the edited document corpus are tangibly stored on the at least one computer-readable medium;
(C) deriving a plurality of correction rules, tangibly stored on the at least one computer-readable medium, from the plurality of editing patterns; and
(D) deriving a classifier, tangibly stored on the at least one computer-readable medium, for particular content D based on the data structure representing the plurality of editing patterns, the classifier defining decision criteria for selecting one of the plurality of correction rules to apply to content D based on a context C* of content D.
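Step (D) can be pictured with a short Python sketch of one possible reading, offered only as an illustration and not as the claimed implementation. The `EditingPattern` record mirrors the form T=(D,E,C); the representation of a context as a (previous token, next token) pair, the `context_class` function that maps a context to an equivalence class, and the majority vote within each class are hypothetical choices. A correction rule is represented here simply as the replacement text E.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, Tuple

Context = Tuple[str, str]  # assumed shape: (previous token, next token)


@dataclass(frozen=True)
class EditingPattern:
    d: str      # content D in the original document
    e: str      # corresponding content E in the edited document
    c: Context  # context C shared by D and E


def derive_classifier(
    patterns: Iterable[EditingPattern],
    content_d: str,
    context_class: Callable[[Context], Hashable],
) -> Callable[[Context], str]:
    """Build a classifier for one particular content D.

    Contexts are grouped into equivalence classes by `context_class`;
    each class is assigned the replacement seen most often in training.
    """
    votes = defaultdict(Counter)
    for p in patterns:
        if p.d == content_d:
            votes[context_class(p.c)][p.e] += 1
    table = {cls: seen.most_common(1)[0][0] for cls, seen in votes.items()}

    def classify(c_star: Context) -> str:
        # Unseen context classes fall back to leaving D unchanged.
        return table.get(context_class(c_star), content_d)

    return classify


# Hypothetical training patterns for D = "to".
patterns = [
    EditingPattern("to", "two", ("take", "tablets")),
    EditingPattern("to", "two", ("of", "tablets")),
    EditingPattern("to", "too", ("is", "much")),
]
# Assumed decision criterion: contexts are equivalent if the next token matches.
classify_to = derive_classifier(patterns, "to", lambda c: c[1])
```

Calling `classify_to(("give", "tablets"))` selects "two", `classify_to(("was", "much"))` selects "too", and a context outside every learned class, such as `("go", "home")`, leaves "to" as it is.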
Specification