Hash compact XML parser
Abstract
A method of parsing a markup language document comprising syntactic elements is disclosed, said method comprising, for one of said syntactic elements, the steps of identifying (310) a type of the element, processing (318) the element by determining a hash representation thereof if said type is a first type, and augmenting (314) an at least partial structural representation of the document using the hash representation if said type is said first type.
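The abstract's three steps (identify the type, hash the element if it is of the first type, add the hash to the structural representation) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the regular-expression tokenizer, the choice of CRC-32 as the hash function, and the names `TOKEN`, `hash_name` and `encode` are all hypothetical. Tags are treated as the "first type"; text is kept verbatim.

```python
import re
import zlib

# Hypothetical tokenizer: a tag (optionally closing) or a run of text.
# It ignores comments, CDATA and entities, and treats <x/> as a start tag.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*>|([^<]+)")


def hash_name(name: str) -> int:
    """Assumed hash function: CRC-32 of the tag name, giving a numeric code."""
    return zlib.crc32(name.encode("utf-8"))


def encode(document: str) -> list[tuple[str, object]]:
    """Return a flat encoded representation of the document."""
    encoded = []
    for match in TOKEN.finditer(document):
        closing, name, text = match.groups()
        if name is not None:
            # Element identified as a tag (the "first type"): hash it.
            kind = "end" if closing else "start"
            encoded.append((kind, hash_name(name)))
        elif text.strip():
            # Any other type: carried through unhashed.
            encoded.append(("text", text.strip()))
    return encoded


if __name__ == "__main__":
    for item in encode("<a><b>hi</b></a>"):
        print(item)
```

CRC-32 is used here only because it is deterministic across runs; Python's built-in `hash()` is randomized per process for strings, so it would not give a stable encoding.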
26 Claims
1. A method of generating an encoded representation of a markup language document comprising syntactic elements, said method comprising the steps of:
parsing the markup language document to identify at least one syntactic element of that document;
identifying a type of the element;
processing the element by applying a hash function thereto, the hash function generating a numeric code from the element; and
generating the encoded representation including the numeric code;
wherein first and second syntactic elements respectively comprise a start tag and an end tag, being a first pair of tags, and said processing of said start tag and of said end tag generates corresponding hashed start and end tags;
wherein corresponding hashed start and end tags for the first pair of tags are incorporated into the encoded representation of the document;
wherein the document further includes a second pair of tags comprising respective start and end tags, the second pair of tags being nested within the first pair of tags in the document, and said method comprising further steps of;
processing the second pair of tags to form corresponding second hashed start and end tags; and
augmenting the encoded representation of the document using the corresponding second hashed start and end tags so that the second hashed start and end tags indicate a nesting in relation to the hashed start and end tags for the first pair of tags which is equivalent to the nesting of the second pair of tags within the first pair of tags.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
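One way to picture the nesting requirement of claim 1 is a tree in which each node carries its own hashed start and end tags and the hashed tags of an inner pair sit inside the node of the outer pair. The sketch below is only an illustration: it assumes Python's standard `xml.parsers.expat` parser for the parsing step, CRC-32 of the literal tag text as the hash function, and a dictionary layout (`start`, `end`, `children`) that is invented for this example.

```python
import zlib
import xml.parsers.expat


def hash_tag(tag: str) -> int:
    """Assumed hash function: CRC-32 of the literal tag text."""
    return zlib.crc32(tag.encode("utf-8"))


def encode_nested(document: str) -> dict:
    """Build a tree of hashed start/end tags mirroring the document nesting."""
    root = {"children": []}   # artificial holder for the document element
    stack = [root]            # currently open elements, innermost last

    def on_start(name, attrs):
        node = {"start": hash_tag(f"<{name}>"), "children": []}
        stack[-1]["children"].append(node)  # nest inside the enclosing pair
        stack.append(node)

    def on_end(name):
        node = stack.pop()
        node["end"] = hash_tag(f"</{name}>")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(document, True)
    return root["children"][0]


if __name__ == "__main__":
    tree = encode_nested("<outer><inner>text</inner></outer>")
    print(tree)
```

In this layout the second pair's hashed tags appear as a child of the first pair's node, so the nesting in the encoded form matches the nesting in the source. Attributes and character data are ignored to keep the sketch short.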
21. An apparatus for generating an encoded representation of a markup language document comprising syntactic elements, said apparatus comprising:
parsing means for parsing the markup language document to identify at least one syntactic element of that document;
identifying means for identifying a type of the element;
processing means for processing the element by applying a hash function thereto, said hash function generating a numeric code from the element; and
generating means for generating the encoded representation including the numeric code,
wherein first and second syntactic elements respectively comprise a start tag and an end tag, being a first pair of tags, and said processing of said start tag and of said end tag generates corresponding hashed start and end tags;
wherein corresponding hashed start and end tags for the first pair of tags are incorporated into the encoded representation of the document;
wherein the document further includes a second pair of tags comprising respective start and end tags, the second pair of tags being nested within the first pair of tags in the document;
wherein said processing means processes the second pair of tags to form corresponding second hashed start and end tags; and
wherein said generating means augments the encoded representation of the document using the corresponding second hashed start and end tags so that the second hashed start and end tags indicate a nesting in relation to the hashed start and end tags for the first pair of tags which is equivalent to the nesting of the second pair of tags within the first pair of tags.
Dependent claims: 22
23. A computer-executable program which is stored on a computer-readable storage medium and which is configured to make a computer execute a procedure to generate an encoded representation of a markup language document comprising syntactic elements, said program comprising:
code for parsing the markup language document to identify at least one syntactic element of that document;
code for identifying a type of the identified element;
code for processing the identified element by applying a hash function thereto, the hash function generating a numeric code from the identified element; and
code for generating the encoded representation including the numeric code,
wherein first and second syntactic elements respectively comprise a start tag and an end tag, being a first pair of tags, and said code for processing of said start tag and of said end tag generates corresponding hashed start and end tags;
wherein corresponding hashed start and end tags for the first pair of tags are incorporated into the encoded representation of the document;
wherein the document further includes a second pair of tags comprising respective start and end tags, the second pair of tags being nested within the first pair of tags in the document;
wherein said code for processing processes the second pair of tags to form corresponding second hashed start and end tags; and
wherein said code for generating augments the encoded representation of the document using the corresponding second hashed start and end tags so that the second hashed start and end tags indicate a nesting in relation to the hashed start and end tags for the first pair of tags which is equivalent to the nesting of the second pair of tags within the first pair of tags.
24. A computer program product including a computer readable storage medium having encoded thereon a computer program which is configured to make a computer execute a procedure to generate an encoded representation of a markup language document comprising syntactic elements, said program comprising:
code for parsing the markup language document to identify at least one syntactic element of that document;
code for identifying a type of the element;
code for processing the element by applying a hash function thereto, the hash function generating a numeric code from the element; and
code for generating the encoded representation including the numeric code,
wherein first and second syntactic elements respectively comprise a start tag and an end tag, being a first pair of tags, and said code for processing of said start tag and of said end tag generates corresponding hashed start and end tags;
wherein corresponding hashed start and end tags for the first pair of tags are incorporated into the encoded representation of the document;
wherein the document further includes a second pair of tags comprising respective start and end tags, the second pair of tags being nested within the first pair of tags in the document;
wherein said code for processing processes the second pair of tags to form corresponding second hashed start and end tags; and
wherein said code for generating augments the encoded representation of the document using the corresponding second hashed start and end tags so that the second hashed start and end tags indicate a nesting in relation to the hashed start and end tags for the first pair of tags which is equivalent to the nesting of the second pair of tags within the first pair of tags.
25. An encoded representation of a markup language document comprising syntactic elements, the encoded representation having been produced by a method comprising:
parsing the markup language document to identify at least one syntactic element of that document;
identifying a type of the element;
processing the element by applying a hash function thereto, the hash function generating a numeric code from the element; and
generating the encoded representation including said numeric code,
wherein first and second syntactic elements respectively comprise a start tag and an end tag, being a first pair of tags, and said processing of said start tag and of said end tag generates corresponding hashed start and end tags;
wherein corresponding hashed start and end tags for the first pair of tags are incorporated into the encoded representation of the document;
wherein the document further includes a second pair of tags comprising respective start and end tags, the second pair of tags being nested within the first pair of tags in the document, and said method comprising further steps of;
processing the second pair of tags to form corresponding second hashed start and end tags; and
augmenting the encoded representation of the document using the corresponding second hashed start and end tags so that the second hashed start and end tags indicate a nesting in relation to the hashed start and end tags for the first pair of tags which is equivalent to the nesting of the second pair of tags within the first pair of tags.
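Because the encoded representation of claim 25 carries nesting information, a consumer could in principle check that nesting by comparing numeric codes instead of tag strings. The sketch below assumes a flat encoding of `(kind, code)` pairs in which a start tag and its matching end tag share the same code (for example, a hash of the tag name). That layout, and the names `code` and `is_well_nested`, are assumptions made for illustration and are not taken from the patent.

```python
import zlib


def code(name: str) -> int:
    """Assumed numeric code for a tag name (CRC-32, chosen for illustration)."""
    return zlib.crc32(name.encode("utf-8"))


def is_well_nested(encoded) -> bool:
    """Check that hashed end tags close hashed start tags in reverse order."""
    open_codes = []  # codes of the start tags that are still open
    for kind, value in encoded:
        if kind == "start":
            open_codes.append(value)
        elif kind == "end":
            # The end code must match the most recently opened start code.
            if not open_codes or open_codes.pop() != value:
                return False
    return not open_codes  # nothing may remain open at the end


if __name__ == "__main__":
    nested = [("start", code("a")), ("start", code("b")),
              ("end", code("b")), ("end", code("a"))]
    crossed = [("start", code("a")), ("start", code("b")),
               ("end", code("a")), ("end", code("b"))]
    print(is_well_nested(nested), is_well_nested(crossed))
```

Since two different tag names can hash to the same code, a check of this kind can accept a mismatched pair in the rare case of a collision.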
26. An apparatus for generating an encoded representation of a markup language document comprising syntactic elements, said apparatus comprising:
a processor;
a memory for storing (i) the document, and (ii) a program which is configured to make the processor execute a procedure to generate the encoded representation, wherein said program comprises;
code for parsing the markup language document to identify at least one syntactic element of that document;
code for identifying a type of the element;
code for processing the element by applying a hash function thereto, the hash function generating a numeric code from the element; and
code for generating the encoded representation including said numeric code,
wherein first and second syntactic elements respectively comprise a start tag and an end tag, being a first pair of tags, and said code for processing of said start tag and of said end tag generates corresponding hashed start and end tags;
wherein corresponding hashed start and end tags for the first pair of tags are incorporated into the encoded representation of the document;
wherein the document further includes a second pair of tags comprising respective start and end tags, the second pair of tags being nested within the first pair of tags in the document;
wherein said code for processing processes the second pair of tags to form corresponding second hashed start and end tags; and
wherein said code for generating augments the encoded representation of the document using the corresponding second hashed start and end tags so that the second hashed start and end tags indicate a nesting in relation to the hashed start and end tags for the first pair of tags which is equivalent to the nesting of the second pair of tags within the first pair of tags.
Specification