Hash compact XML parser
Abstract
A method of parsing a markup language document comprising syntactic elements is disclosed, said method comprising, for one of said syntactic elements, the steps of identifying (310) a type of the element, processing (318) the element by determining a hash representation thereof if said type is a first type, and augmenting (314) an at least partial structural representation of the document using the hash representation if said type is said first type.
58 Claims
-
1. A method of parsing a markup language document comprising syntactic elements, said method comprising, for one of said syntactic elements, the steps of:
-
identifying a type of the element;
processing the element by determining a hash representation thereof if said type is a first type; and
augmenting an at least partial structural representation of the document using the hash representation if said type is said first type.
Dependent claims: 2 to 26, 29, 30, 31, 58.
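The three steps of claim 1 (identify a type, hash the element, augment a structural representation) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the claims: the "first type" is taken to be an element tag, the hash is CRC-32 from `zlib`, the structural representation is a list of `(depth, hash)` pairs, and the names `TOKEN`, `tag_hash` and `parse` are hypothetical.

```python
import re
import zlib

# Hypothetical tokenizer: matches either a tag or a run of character data.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^>]*?(/?)>|([^<]+)")


def tag_hash(name: str) -> int:
    """Hash representation of a tag name (CRC-32 is an assumed choice)."""
    return zlib.crc32(name.encode("utf-8"))


def parse(document: str):
    """Build a partial structural representation as (depth, hash) pairs."""
    structure = []
    depth = 0
    for match in TOKEN.finditer(document):
        closing, name, self_closing, _text = match.groups()
        # Step 1: identify the type of the element.
        if name is None:
            continue  # character data: not the "first type" in this sketch
        if closing:
            depth -= 1
            continue
        # Steps 2 and 3: hash the tag, then augment the representation.
        structure.append((depth, tag_hash(name)))
        if not self_closing:
            depth += 1
    return structure
```

For the input `<a><b>hi</b><c/></a>` the sketch records `a` at depth 0 and both `b` and `c` at depth 1, each stored as a fixed-size integer rather than as a string.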
-
-
27. A method of validating a markup language document against a VRD, said method comprising steps of:
-
(a) processing the markup language document, for each document tag identified therein, if said document tag is not a first document tag in a corresponding markup language document tag hierarchy, said processing comprising steps of;
(i) determining a hierarchy position of said document tag;
(ii) determining an extended hashed representation of said document tag concatenated with a hashed representation of a previous document tag in the document tag hierarchy; and
(iii) storing said extended hashed representation of said document tag if said document tag is more deeply nested than a previous document tag;
(b) processing said VRD, for each tag identified therein, if said tag is not a first tag in a corresponding tag hierarchy, said processing comprising steps of;
(i) determining a hierarchy position of said tag;
(ii) determining an extended hashed representation of said tag concatenated with a hashed representation of a previous tag in the corresponding tag hierarchy; and
(iii) storing said extended hashed representation of said tag in a list; and
(c) validating said markup language document if said extended hashed representation of said document tag is one of found in said list and is a valid subset of a member of said list.
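A rough Python sketch of the extended-hash comparison in claim 27 follows. It rests on several assumptions that the claim does not state: the VRD is supplied as markup with the same tag syntax as the document, an extended hashed representation is modelled as the tuple of CRC-32 tag hashes from the root down to the tag, the root tag is processed like any other tag, and only the "found in said list" branch is modelled (the "valid subset" branch is omitted). The names `TAG`, `h`, `extended_hashes` and `validate` are hypothetical.

```python
import re
import zlib

# Hypothetical tag matcher: (closing slash, tag name, self-closing slash).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^>]*?(/?)>")


def h(name: str) -> int:
    """Hashed representation of one tag name (CRC-32 is an assumed choice)."""
    return zlib.crc32(name.encode("utf-8"))


def extended_hashes(markup: str):
    """Return, for every opening tag, its hash appended to its ancestors' hashes."""
    stack = []   # hashes of the currently open ancestor tags
    paths = []
    for closing, name, self_closing in TAG.findall(markup):
        if closing:
            stack.pop()
            continue
        paths.append(tuple(stack) + (h(name),))
        if not self_closing:
            stack.append(h(name))
    return paths


def validate(document: str, vrd: str) -> bool:
    """Accept the document if each of its extended hashes is in the VRD list."""
    vrd_list = set(extended_hashes(vrd))
    return all(path in vrd_list for path in extended_hashes(document))
```

Because each entry carries the hashes of its ancestors, a tag that appears under the wrong parent is rejected even when its own name is known to the VRD.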
-
-
28. A method of validating a markup language document against a VRD, said method comprising steps of:
-
(a) processing said VRD, for each structural element identified therein, said processing comprising steps of;
(i) determining syntactic attributes of said structural element;
(ii) determining a hashed representation of said structural element; and
(iii) storing said hashed representation and syntactic attributes of said structural element in a structural representation of said VRD; and
(b) processing the markup language document, for each document structural element identified therein, said processing comprising steps of;
(i) determining syntactic attributes of said document structural element;
(ii) determining a hashed representation of said document structural element; and
(iii) storing said hashed representation and syntactic attributes of said document structural element in a structural representation of the document; and
(c) validating said markup language document if syntactic attributes and hashed representations of said each document structural element in the structural representation of the document conforms to corresponding syntactic attributes and hashed representations in said structural representation of said VRD.
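Claim 28 pairs each hashed element with its syntactic attributes and compares the two structural representations. The sketch below is one possible reading, under assumptions that are not in the claim: the VRD is given as a template in ordinary markup, the "syntactic attributes" are the nesting depth and the set of attribute names, and "conforms" means that the same `(hash, depth)` pair exists in the VRD and that the document uses no attribute names beyond those the VRD lists. The function names are hypothetical; parsing uses the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET
import zlib


def h(name: str) -> int:
    """Hashed representation of an element name (CRC-32 is an assumed choice)."""
    return zlib.crc32(name.encode("utf-8"))


def structural_representation(markup: str):
    """List (hash, depth, attribute-name set) for every element."""
    rep = []

    def walk(elem, depth):
        rep.append((h(elem.tag), depth, frozenset(elem.keys())))
        for child in elem:
            walk(child, depth + 1)

    walk(ET.fromstring(markup), 0)
    return rep


def validate(document: str, vrd: str) -> bool:
    """Check every document element against the VRD representation."""
    allowed = {}
    for tag_hash, depth, attrs in structural_representation(vrd):
        allowed[(tag_hash, depth)] = attrs
    return all(
        (tag_hash, depth) in allowed and attrs <= allowed[(tag_hash, depth)]
        for tag_hash, depth, attrs in structural_representation(document)
    )
```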
-
-
32. A method of encoding a markup language document comprising syntactic elements, said method comprising, for one of said syntactic elements, steps of:
-
identifying a type of the syntactic element; and
processing the syntactic element by one of;
(i) determining a hash representation thereof if said type is a first type;
(ii) determining a compressed representation thereof if said type is not a first type; and
(iii) retaining the syntactic element.
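The three encoding options of claim 32 (hash, compress, retain) can be sketched in Python as follows. The mapping of element types to options is an assumption for illustration only: tags are hashed with CRC-32, character data is compressed with `zlib`, and comments are retained unchanged. The lookup table returned alongside the encoded tokens, and the names `TOKEN` and `encode`, are likewise hypothetical.

```python
import re
import zlib

# Hypothetical tokenizer: comments first, then tags, then character data.
TOKEN = re.compile(r"<!--.*?-->|<[^>]+>|[^<]+", re.S)


def encode(document: str):
    """Encode each token by hashing, compressing or retaining it."""
    encoded = []
    table = {}  # hash -> original tag text, kept so the hash can be inverted later
    for token in TOKEN.findall(document):
        if token.startswith("<!--"):
            encoded.append(("raw", token))          # (iii) retain
        elif token.startswith("<"):
            code = zlib.crc32(token.encode("utf-8"))
            table[code] = token
            encoded.append(("hash", code))          # (i) hash representation
        else:
            data = zlib.compress(token.encode("utf-8"))
            encoded.append(("zlib", data))          # (ii) compressed representation
    return encoded, table
```

Keeping a side table is a design choice of this sketch: a hash alone is not reversible, so any later decoding step needs some record of which tag produced which hash.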
-
-
33. A method of decoding a markup language document comprising encoded syntactic elements, said method comprising, for one of said encoded syntactic elements, steps of:
-
identifying a type of the encoded syntactic element;
processing the encoded syntactic element by at least one of;
(i) determining an inverse hash representation thereof if said type is a first type; and
(ii) determining a decompressed representation thereof if said type is not a first type; and
(iii) retaining the encoded syntactic element.
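A matching Python sketch for the decoding steps of claim 33 is given below. It assumes, purely for illustration, that encoded elements arrive as `(kind, payload)` pairs and that the "inverse hash representation" is obtained from a lookup table mapping each hash back to its tag text; neither the token format nor the table is specified in the claim, and the name `decode` is hypothetical.

```python
import zlib


def decode(encoded, table):
    """Rebuild markup text from (kind, payload) pairs.

    kind == "hash": payload is a hash, inverted through the lookup table.
    kind == "zlib": payload is zlib-compressed UTF-8 text.
    anything else:  payload is retained as-is.
    """
    parts = []
    for kind, payload in encoded:
        if kind == "hash":
            parts.append(table[payload])
        elif kind == "zlib":
            parts.append(zlib.decompress(payload).decode("utf-8"))
        else:
            parts.append(payload)
    return "".join(parts)
```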
-
-
34. An apparatus for parsing a markup language document comprising syntactic elements, said apparatus comprising:
-
identifying means for identifying a type of the element;
processing means for processing the element by determining a hash representation thereof if said type is a first type; and
augmenting means for augmenting an at least partial structural representation of the document using the hash representation if said type is said first type.
-
-
35. An apparatus for validating a markup language document against a VRD, said apparatus comprising:
-
(a) means for processing the markup language document, for each document tag identified therein, if said document tag is not a first document tag in a corresponding markup language document tag hierarchy, said means comprising;
(i) means for determining a hierarchy position of said document tag;
(ii) means for determining an extended hashed representation of said document tag concatenated with a hashed representation of a previous document tag in the document tag hierarchy; and
(iii) means for storing said extended hashed representation of said document tag if said document tag is more deeply nested than an extended hashed representation of a previous document tag;
(b) means for processing said VRD, for each tag identified therein, if said tag is not a first tag in a corresponding tag hierarchy, said means comprising;
(i) means for determining a hierarchy position of said tag;
(ii) means for determining an extended hashed representation of said tag concatenated with a hashed representation of a previous tag in the corresponding tag hierarchy; and
(iii) means for storing said extended hashed representation of said tag in a list; and
(c) means for establishing whether said extended hashed representation of said document tag is one of to be found in said list, and is a valid subset of a member of said list, thereby validating said markup language document.
-
-
36. An apparatus for validating a markup language document against a VRD, said apparatus comprising:
-
(a) means for processing said VRD, for each structural element identified therein, said means comprising;
(i) means for determining syntactic attributes of said structural element;
(ii) means for determining a hashed representation of said structural element; and
(iii) means for storing said hashed representation and syntactic attributes of said structural element in a structural representation of said VRD; and
(b) means for processing the markup language document, for each document structural element identified therein, said means comprising;
(i) means for determining syntactic attributes of said document structural element;
(ii) means for determining a hashed representation of said document structural element; and
(iii) means for storing said hashed representation and syntactic attributes of said document structural element in a structural representation of the document; and
(c) means for comparing syntactic attributes and hashed representations of said each document structural element in the structural representation of the document to corresponding syntactic attributes and hashed representations in said structural representation of said VRD to thereby establish validity of the markup language document.
-
-
37. An apparatus for encoding a markup language document comprising syntactic elements, to form an at least partial structural representation of the document, said apparatus comprising:
-
means for identifying a type of the syntactic element; and
means for processing the syntactic element by one of;
(i) determining a hash representation thereof if said type is a first type;
(ii) determining a compressed representation thereof if said type is not a first type; and
(iii) retaining the syntactic element.
-
-
38. An apparatus for decoding a markup language document comprising encoded syntactic elements, said apparatus comprising:
-
means for identifying a type of the encoded syntactic element;
means for processing the encoded syntactic element by at least one of;
(i) determining an inverse hash representation thereof if said type is a first type;
(ii) determining a decompressed representation thereof if said type is not a first type; and
(iii) retaining the encoded syntactic element.
-
-
39. A computer program which is configured to make a computer execute a procedure to parse a markup language document comprising syntactic elements, said program comprising:
-
code for identifying a type of an element;
code for processing the element by determining a hash representation thereof if said type is a first type; and
code for augmenting an at least partial structural representation of the document using the hash representation if said type is said first type.
-
-
40. A computer program which is configured to make a computer execute a procedure to validate a markup language document against a VRD, said program comprising:
-
(a) code for processing the markup language document, for each document tag identified therein, if said document tag is not a first document tag in a corresponding markup language document tag hierarchy, said code comprising;
(i) code for determining a hierarchy position of said document tag;
(ii) code for determining an extended hashed representation of said document tag concatenated with a hashed representation of a previous document tag in the document tag hierarchy; and
(iii) code for storing said extended hashed representation of said document tag if said tag is more deeply nested than a previous document tag;
(b) code for processing said VRD, for each tag identified therein, if said tag is not a first tag in a corresponding tag hierarchy, said code comprising;
(i) code for determining a hierarchy position of said tag;
(ii) code for determining an extended hashed representation of said tag concatenated with a hashed representation of a previous tag in the corresponding tag hierarchy; and
(iii) code for storing said extended hashed representation of said tag in a list; and
(c) code for validating said markup language document if said extended hashed representation of said document tag is one of found in said list, and is a valid subset of a member of said list.
-
-
41. A computer program which is configured to make a computer execute a procedure to validate a markup language document against a VRD, said program comprising:
-
(a) code for processing said VRD, for each structural element identified therein, said code comprising;
(i) code for determining syntactic attributes of said structural element;
(ii) code for determining a hashed representation of said structural element; and
(iii) code for storing said hashed representation and syntactic attributes of said structural element in a structural representation of said VRD; and
(b) code for processing the markup language document, for each document structural element identified therein, said code comprising;
(i) code for determining syntactic attributes of said document structural element;
(ii) code for determining a hashed representation of said document structural element; and
(iii) code for storing said hashed representation and syntactic attributes of said document structural element in a structural representation of the document; and
(c) code for validating said markup language document if syntactic attributes and hashed representations of said each document structural element in the structural representation of the document conforms to corresponding syntactic attributes and hashed representations in said structural representation of said VRD.
-
-
42. A computer program which is configured to make a computer execute a procedure to encode a markup language document comprising syntactic elements, said program comprising:
-
code for identifying a type of the syntactic element; and
code for processing the syntactic element by one of;
(i) determining a hash representation thereof if said type is a first type;
(ii) determining a compressed representation thereof if said type is not a first type; and
(iii) retaining the syntactic element.
-
-
43. A computer program which is configured to make a computer execute a procedure to decode a markup language document comprising encoded syntactic elements, said program comprising:
-
code for identifying a type of the encoded syntactic element;
code for processing the encoded syntactic element by at least one of;
(i) determining an inverse hash representation thereof if said type is a first type; and
(ii) determining a decompressed representation thereof if said type is not a first type; and
(iii) retaining the encoded syntactic element.
-
-
44. A computer program product including a computer readable medium having recorded thereon a computer program which is configured to make a computer execute a procedure to parse a markup language document, said program comprising:
-
code for identifying a type of the element;
code for processing the element by determining a hash representation thereof if said type is a first type; and
code for augmenting an at least partial structural representation of the document using the hash representation if said type is said first type.
-
-
45. A computer program product including a computer readable medium having recorded thereon a computer program which is configured to make a computer execute a procedure to validate a markup language document against a VRD, said program comprising:
-
(a) code for processing the markup language document, for each document tag identified therein, if said document tag is not a first document tag in a corresponding markup language document tag hierarchy, said code comprising;
(i) code for determining a hierarchy position of said document tag;
(ii) code for determining an extended hashed representation of said document tag concatenated with a hashed representation of a previous document tag in the document tag hierarchy; and
(iii) code for storing said extended hashed representation of said document tag if said document tag is more deeply nested than a previous document tag;
(b) code for processing said VRD, for each tag identified therein, if said tag is not a first tag in a corresponding tag hierarchy, said code comprising;
(i) code for determining a hierarchy position of said tag;
(ii) code for determining an extended hashed representation of said tag concatenated with a hashed representation of a previous tag in the corresponding tag hierarchy; and
(iii) code for storing said extended hashed representation of said tag in a list; and
(c) code for validating said markup language document if said extended hashed representation of said document tag is one of found in said list and is a valid subset of a member of said list.
-
-
46. A computer program product including a computer readable medium having recorded thereon a computer program which is configured to make a computer execute a procedure to validate a markup language document against a VRD, said program comprising:
-
(a) code for processing said VRD, for each structural element identified therein, said code comprising;
(i) code for determining syntactic attributes of said structural element;
(ii) code for determining a hashed representation of said structural element; and
(iii) code for storing said hashed representation and syntactic attributes of said structural element in a structural representation of said VRD; and
(b) code for processing the markup language document, for each document structural element identified therein, said code comprising;
(i) code for determining syntactic attributes of said document structural element;
(ii) code for determining a hashed representation of said document structural element; and
(iii) code for storing said hashed representation and syntactic attributes of said document structural element in a structural representation of the document; and
(c) code for validating said markup language document if syntactic attributes and hashed representations of said each document structural element in the structural representation of the document conforms to corresponding syntactic attributes and hashed representations in said structural representation of said VRD.
-
-
47. An at least partial structural representation of a markup language document comprising syntactic elements, said at least partial representation having been produced by a method comprising, for one of said syntactic elements, the steps of:
-
identifying a type of the element;
processing the element by determining a hash representation thereof if said type is a first type; and
augmenting an at least partial structural representation of the document using the hash representation if said type is said first type.
-
-
48. An apparatus for parsing a markup language document comprising syntactic elements, said apparatus comprising:
-
a processor;
a memory for storing (i) the document, and (ii) a program which is configured to make the processor execute a procedure to parse the document;
said program comprising;
(i) code for identifying a type of an element;
(ii) code for processing the element by determining a hash representation thereof if said type is a first type; and
(iii) code for augmenting an at least partial structural representation of the document using the hash representation if said type is said first type.
-
-
49. An apparatus for validating a markup language document comprising syntactic elements against a VRD comprising syntactic elements, said apparatus comprising:
-
(a) a processor;
(b) a memory for storing (i) the document, (ii) said VRD, and (iii) a program which is configured to make the processor execute a procedure to validate the document;
(c) said program comprising;
(ca) code for processing the markup language document, for each document tag identified therein, if said document tag is not a first document tag in a corresponding markup language document tag hierarchy, said code comprising;
(caa) code for determining a hierarchy position of said document tag;
(cab) code for determining an extended hashed representation of said document tag concatenated with a hashed representation of a previous document tag in the document tag hierarchy; and
(cac) code for storing said extended hashed representation of said document tag if said document tag is more deeply nested than a previous document tag;
(cb) code for processing said VRD, for each tag identified therein, if said tag is not a first tag in a corresponding tag hierarchy, said means comprising;
(cba) code for determining a hierarchy position of said tag;
(cbb) code for determining an extended hashed representation of said tag concatenated with a hashed representation of a previous tag in the corresponding tag hierarchy; and
(cbc) code for storing said extended hashed representation of said tag in a list; and
(cc) code for establishing whether said extended hashed representation of said document tag is one of to be found in said list, and is a valid subset of a member of said list, thereby validating said markup language document.
-
-
50. An apparatus for validating a markup language document containing syntactic elements against a VRD containing syntactic elements, said apparatus comprising:
-
(a) a processor;
(b) a memory for storing (i) the document, (ii) said VRD, and (iii) a program which is configured to make the processor execute a procedure to validate the document;
(c) said program comprising;
(ca) code for processing said VRD, for each structural element identified therein, said code comprising;
(caa) code for determining syntactic attributes of said structural element;
(cab) code for determining a hashed representation of said structural element; and
(cac) code for storing said hashed representation and syntactic attributes of said structural element in a structural representation of said VRD; and
(cb) code for processing the markup language document, for each document structural element identified therein, said code comprising;
(cba) code for determining syntactic attributes of said document structural element;
(cbb) code for determining a hashed representation of said document structural element; and
(cbc) code for storing said hashed representation and syntactic attributes of said document structural element in a structural representation of the document; and
(cc) code for comparing syntactic attributes and hashed representations of said each document structural element in the structural representation of the document to corresponding syntactic attributes and hashed representations in said structural representation of said VRD to thereby establish validity of the markup language document.
-
-
51. A method of validating a markup language document against a VRD, said method comprising steps of:
-
determining first extended hashed representation(s) for most deeply nested syntactic element(s) of a first type in the VRD;
storing said first extended hashed representation(s) in a VRD list;
determining a second extended hashed representation for a most deeply nested syntactic element of the first type in the markup language document; and
declaring said markup language document to not be invalid if said second extended hashed representation is present in the VRD list.
Dependent claims: 52, 53, 54.
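Claim 51 compares only the most deeply nested elements. A minimal Python sketch follows, with these assumptions made for illustration: the VRD is supplied as ordinary markup, "most deeply nested" is read as "has no child elements", and an extended hashed representation is a running CRC-32 over the `/`-separated names from the root to the element. The function names are hypothetical. The result is named `not_invalid` to mirror the claim wording, which stops short of declaring the document valid.

```python
import xml.etree.ElementTree as ET
import zlib


def extended_hash(path) -> int:
    """Fold a root-to-element path of names into a single CRC-32 value."""
    value = 0
    for name in path:
        # "/" cannot occur in a tag name, so it keeps ("ab",) and ("a", "b") apart.
        value = zlib.crc32(("/" + name).encode("utf-8"), value)
    return value


def leaf_hashes(markup: str):
    """Extended hashes of all elements that have no child elements."""
    hashes = set()

    def walk(elem, path):
        path = path + (elem.tag,)
        children = list(elem)
        if not children:
            hashes.add(extended_hash(path))
        for child in children:
            walk(child, path)

    walk(ET.fromstring(markup), ())
    return hashes


def not_invalid(document: str, vrd: str) -> bool:
    """True if every leaf path of the document appears in the VRD list."""
    vrd_list = leaf_hashes(vrd)
    return leaf_hashes(document) <= vrd_list
```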
-
-
55. An apparatus for validating a markup language document against a VRD, said apparatus comprising:
-
means for determining first extended hashed representation(s) for most deeply nested syntactic element(s) of a first type in the VRD;
means for storing said first extended hashed representation(s) in a VRD list;
means for determining a second extended hashed representation for a most deeply nested syntactic element of the first type in the markup language document; and
means for declaring said markup language document to not be invalid if said second extended hashed representation is present in the VRD list.
-
-
56. A computer program which is configured to make a computer execute a procedure to validate a markup language document against a VRD, said program comprising:
-
code for determining first extended hashed representation(s) for most deeply nested syntactic element(s) of a first type in the VRD;
code for storing said first extended hashed representation(s) in a VRD list;
code for determining a second extended hashed representation for a most deeply nested syntactic element of the first type in the markup language document; and
code for declaring said markup language document to not be invalid if said second extended hashed representation is present in the VRD list.
-
-
57. A computer program product including a computer readable medium having recorded thereon a computer program which is configured to make a computer execute a procedure to validate a markup language document against a VRD, said program comprising:
-
code for determining first extended hashed representation(s) for most deeply nested syntactic element(s) of a first type in the VRD;
code for storing said first extended hashed representation(s) in a VRD list;
code for determining a second extended hashed representation for a most deeply nested syntactic element of the first type in the markup language document; and
code for declaring said markup language document to not be invalid if said second extended hashed representation is present in the VRD list.
-
Specification