Method and apparatus for structured document difference string extraction
Abstract
A document difference extraction method and apparatus extract the difference between structured documents in a way that matches the intent of the document editor, taking the logical meaning and structure of the structured documents into consideration. Structured documents are edited and stored in a memory unit by a document editing program. With reference to a comparison criterion set for the logical structure of each structured document before and after editing, a structured document parsing program analyzes the logical structure of the structured documents before and after editing as read from the memory unit, and a structured document difference extraction program extracts the difference between the structured documents so as to satisfy the comparison criterion in accordance with the result of parsing. The comparison criterion takes the form of a table containing a plurality of tags representing logical structures and the comparison-criterion type of each tag. The tag types include tags whose contents are compared only when the tags themselves coincide, tags whose contents are ignored at the time of comparison, sets of tags having the same logical meaning, and sets of tags whose contents are not compared with each other.
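The comparison-criterion table described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `ComparisonCriterion`, its field names, and the precedence between the four tag types are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ComparisonCriterion:
    """Hypothetical table of tag types; all names here are illustrative."""
    # Tags whose contents are compared only when both sides carry the same tag.
    match_required: set = field(default_factory=set)
    # Tags whose contents are skipped entirely during comparison.
    ignored: set = field(default_factory=set)
    # Sets of tags treated as having the same logical meaning.
    equivalent: list = field(default_factory=list)
    # Sets of tags whose contents are never compared with each other.
    incomparable: list = field(default_factory=list)

    def same_meaning(self, a, b):
        """True if the tags are identical or share an equivalence set."""
        return a == b or any(a in g and b in g for g in self.equivalent)

    def should_compare(self, a, b):
        """Decide whether the contents under tags a and b are compared."""
        if a in self.ignored or b in self.ignored:
            return False
        # Assumption: a tag is still comparable with itself.
        if a != b and any(a in g and b in g for g in self.incomparable):
            return False
        if a in self.match_required or b in self.match_required:
            return self.same_meaning(a, b)
        return True


criterion = ComparisonCriterion(
    match_required={"title"},
    ignored={"date"},
    equivalent=[{"b", "strong"}],
    incomparable=[{"note", "para"}],
)
```

With this table, `criterion.should_compare("title", "para")` is false because the tags differ, while `criterion.same_meaning("b", "strong")` is true because both tags sit in one equivalence set.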
4 Claims
1. An inter-document difference extraction method for extracting a non-coincident character string between two structured documents as a difference comprising the steps of:
inputting a first and a second structured document each including a plurality of elements and structure information thereof;
comparing said input first structured document and said second structured document with each other as to elements representing the structure thereof; and
when a comparison objective element is defined as one for which non-coincidence of occurrence order of the element is not to be taken into consideration, determining the comparison result of said comparing step indicating non-coincidence in the occurrence order as a difference between the structured documents to be excluded from extraction.
2. An inter-document difference extraction method for extracting a non-coincident character string between two structured documents as a difference comprising the steps of:
inputting a first and a second structured document each including a plurality of elements and structure information thereof;
comparing said input first structured document and said second structured document with each other as to elements representing the structure thereof; and
when a comparison objective tag is coincident and an attribute thereof is non-coincident between elements as per the comparison step, extracting the difference in the structured documents by ignoring the non-coincidence of an attribute if the non-coincidence of an attribute is pre-defined to be ignored.
3. An inter-document difference extraction method for extracting a non-coincident character string between two structured documents as a difference comprising the steps of:
inputting a first and a second structured document each including a plurality of elements and structure information thereof;
comparing said input first structured document and said second structured document with each other as to elements representing the structure thereof; and
when a comparison objective tag is coincident and an attribute thereof is non-coincident between the elements in the comparison steps, determining attributes coincident if the attributes under comparison are pre-defined to belong to the same attribute group to extract a difference in the structured documents.
4. An inter-document difference extraction method for extracting a non-coincident character string between two structured documents as a difference comprising the steps of:
inputting a first and a second structured document each including a plurality of elements and structure information thereof;
comparing said input first structured document and said second structured document with each other as to elements representing the structure thereof; and
when a comparison objective tag is coincident and the occurrence order of attribute is non-coincident as to the elements under comparison, determining a difference of the occurrence order to be ignored if the occurrence order of attribute is pre-defined not to be taken into consideration to extract a difference in the structured documents.
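The four claims above can be illustrated together with a short sketch. It is an interpretation under stated assumptions, not the claimed method itself: it uses Python's standard `xml.etree.ElementTree`, and the names `Rules`, `normalized_attrs`, `signature`, and `extract_differences` are hypothetical. Attribute groups (claim 3) are assumed to be groups of attribute names, and unordered comparison (claim 1) is approximated by sorting child elements before pairing them.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Rules:
    unordered_parents: set = field(default_factory=set)  # claim 1: child order ignored
    ignored_attrs: set = field(default_factory=set)      # claim 2: mismatches ignored
    attr_groups: list = field(default_factory=list)      # claim 3: names treated as equal
    ignore_attr_order: bool = True                       # claim 4: attribute order ignored


def normalized_attrs(elem, rules):
    """Return (name, value) pairs after applying the attribute rules."""
    pairs = []
    for name, value in elem.attrib.items():
        if name in rules.ignored_attrs:
            continue
        for group in rules.attr_groups:
            if name in group:
                name = min(group)  # canonical representative of the group
                break
        pairs.append((name, value))
    return sorted(pairs) if rules.ignore_attr_order else pairs


def signature(elem, rules):
    """Sortable, rule-normalized form of an element and its subtree."""
    children = [signature(c, rules) for c in elem]
    if elem.tag in rules.unordered_parents:
        children.sort()
    return (elem.tag, tuple(normalized_attrs(elem, rules)),
            (elem.text or "").strip(), tuple(children))


def extract_differences(a, b, rules, path=""):
    """List the differences that remain after the rules are applied."""
    here = f"{path}/{a.tag}"
    if a.tag != b.tag:
        return [f"{here}: tag {a.tag!r} != {b.tag!r}"]
    diffs = []
    if normalized_attrs(a, rules) != normalized_attrs(b, rules):
        diffs.append(f"{here}: attributes differ")
    if (a.text or "").strip() != (b.text or "").strip():
        diffs.append(f"{here}: text {a.text!r} != {b.text!r}")
    kids_a, kids_b = list(a), list(b)
    if a.tag in rules.unordered_parents:
        # Order differences among these children are excluded from extraction.
        kids_a.sort(key=lambda e: signature(e, rules))
        kids_b.sort(key=lambda e: signature(e, rules))
    if len(kids_a) != len(kids_b):
        diffs.append(f"{here}: child count {len(kids_a)} != {len(kids_b)}")
    for child_a, child_b in zip(kids_a, kids_b):
        diffs.extend(extract_differences(child_a, child_b, rules, here))
    return diffs


old = ET.fromstring(
    '<doc><authors><name>A</name><name>B</name></authors>'
    '<p align="left" id="1">x</p></doc>')
new = ET.fromstring(
    '<doc><authors><name>B</name><name>A</name></authors>'
    '<p id="2" halign="left">y</p></doc>')
rules = Rules(unordered_parents={"authors"}, ignored_attrs={"id"},
              attr_groups=[{"align", "halign"}])
diffs = extract_differences(old, new, rules)
```

In this example the reordered `name` elements, the changed `id`, the `align`/`halign` naming, and the attribute order are all suppressed, so only the changed text of `p` is reported. Pairing sorted children is a simplification; a fuller implementation would need a proper matching step when unordered children also differ in content.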
Specification