Transformation module for transforming documents from one format to other formats with pipelined processor having dedicated hardware resources
First Claim
1. A transformation module for transforming XML documents from one format to one or more other formats according to one or many transformation functions, the transformation module comprising:
- a stylesheet translation tool for pre-processing XSLT stylesheets that describe how a given transformation is performed on a document, said stylesheet translation tool decomposing said stylesheets into an accelerator specific language consisting of static data structures comprising a set of control units that are atomic transformation operations that can be directly performed on the documents, a constant string table containing string constants for the stylesheets, and a template match information table used to compute which XSLT template to apply to a document at any given time;
- a memory storing said XSLT stylesheets decomposed into said static data structures;
- a document node memory storing tree data structures that represent nodes of the XML documents;
- a document string memory storing string values associated with the nodes of the documents stored in the document node memory, wherein the document node memory stores references that point to memory locations in the document node memory; and
- a processor with a plurality of pipelined stages for executing the control units as atomic operations on a plurality of dedicated hardware resources, said processor being capable of executing several transformations in parallel; and
- wherein said processor comprises an XPath scheduling unit executing control units derived from XPath expressions in said stylesheets and an XSLT scheduling unit executing control units derived from a remaining portion of said stylesheets, and said XSLT scheduling unit comprises a template matching dedicated hardware resource for performing template matching operations, a node set and variable dedicated hardware resource for accessing elements of a node set, where a node set is a list of a document's constituents, and an output generation dedicated hardware resource for building constituents of an output document, and said XPath scheduling unit comprises a tree walker dedicated hardware resource for performing searches on the documents, a string operation dedicated hardware resource for providing string manipulation operations, a math operation dedicated hardware resource for providing various math functions, and a node set dedicated hardware resource for building node sets.
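The memories recited in the claim can be pictured with a small software model. The following is only an illustrative sketch, not the patented hardware layout: the class names, field names, and opcode string (`ControlUnit`, `CompiledStylesheet`, `DocumentMemory`, `EMIT_ELEMENT`) are all hypothetical, and the choice to have node records index into the string memory is an assumption made for the example. The parent and child indices, by contrast, are references from the node memory into the node memory, as the claim words it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ControlUnit:
    """One atomic transformation operation (hypothetical encoding)."""
    opcode: str
    operand: int = 0  # index into a table; meaning depends on the opcode


@dataclass
class CompiledStylesheet:
    """Static data structures produced by the stylesheet translation step."""
    control_units: List[ControlUnit] = field(default_factory=list)
    constant_strings: List[str] = field(default_factory=list)
    # (match pattern, index of the template's first control unit)
    template_matches: List[Tuple[str, int]] = field(default_factory=list)

    def intern(self, text: str) -> int:
        """Store a string constant once and return its table index."""
        if text not in self.constant_strings:
            self.constant_strings.append(text)
        return self.constant_strings.index(text)


@dataclass
class Node:
    name_ref: int                       # index into the string memory (assumed)
    value_ref: Optional[int] = None     # index into the string memory (assumed)
    parent: Optional[int] = None        # index into the node memory
    children: List[int] = field(default_factory=list)  # node memory indices


class DocumentMemory:
    """Separate node and string stores for one parsed document."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self.strings: List[str] = []

    def add_node(self, name: str, value: Optional[str] = None,
                 parent: Optional[int] = None) -> int:
        self.strings.append(name)
        name_ref = len(self.strings) - 1
        value_ref = None
        if value is not None:
            self.strings.append(value)
            value_ref = len(self.strings) - 1
        self.nodes.append(Node(name_ref, value_ref, parent))
        index = len(self.nodes) - 1
        if parent is not None:
            self.nodes[parent].children.append(index)
        return index

    def name_of(self, index: int) -> str:
        return self.strings[self.nodes[index].name_ref]

    def value_of(self, index: int) -> Optional[str]:
        ref = self.nodes[index].value_ref
        return None if ref is None else self.strings[ref]


# Build a two-node document and a one-instruction stylesheet.
doc = DocumentMemory()
root = doc.add_node("order")
item = doc.add_node("item", "widget", parent=root)

sheet = CompiledStylesheet()
sheet.template_matches.append(("item", 0))
sheet.control_units.append(ControlUnit("EMIT_ELEMENT", sheet.intern("product")))
```

Keeping strings out of the node records makes every node a fixed-size entry, which is one plausible reason for splitting the two memories; the claim itself does not state the motivation.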
Abstract
A method and apparatus for converting documents from one format to another at high speed uses a hardware module that implements several pipeline stages operating in parallel. The transformations are supplied to the module and decomposed into sequences of control units. Transforming a document consists of applying control unit sequences to the input document. The control units are themselves executed by a set of dedicated hardware resources. Furthermore, the pipeline is capable of operating on more than one document at a time. Fast document transformation is a key capability of document processing systems. The use of parallel processing techniques and hardware that implements highly specialized transformation resources makes this invention particularly scalable for use in large, high-speed content networks.
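As a rough software analogy for the abstract, the sketch below applies control unit sequences to several documents by dispatching each unit to a named resource and interleaving the documents round-robin. It is a sequential simulation under stated assumptions, not the hardware pipeline: the resource names (`string_op`, `output_gen`), the `(resource, argument)` encoding, and the `run_interleaved` function are hypothetical and do not come from the patent.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: (name of the resource that executes it, argument).
ControlUnit = Tuple[str, str]

# Stand-ins for dedicated resources; each appends its result to the output.
RESOURCES: Dict[str, Callable[[str, List[str]], None]] = {
    "string_op": lambda arg, out: out.append(arg.upper()),
    "output_gen": lambda arg, out: out.append(f"<{arg}/>"),
}


def run_interleaved(jobs: Dict[str, List[ControlUnit]]) -> Dict[str, List[str]]:
    """Execute one control unit per document per turn, round-robin."""
    outputs: Dict[str, List[str]] = {doc: [] for doc in jobs}
    queue = deque((doc, iter(units)) for doc, units in jobs.items())
    while queue:
        doc, units = queue.popleft()
        unit = next(units, None)
        if unit is None:
            continue  # this document's sequence is finished
        resource, arg = unit
        RESOURCES[resource](arg, outputs[doc])
        queue.append((doc, units))  # give the other documents a turn
    return outputs


results = run_interleaved({
    "a": [("output_gen", "x"), ("string_op", "hi")],
    "b": [("string_op", "yo")],
})
```

In real hardware the stages would run concurrently; the round-robin queue here only shows that several transformations can be in flight at once while each control unit remains an atomic step.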
11 Claims
1. A transformation module for transforming XML documents from one format to one or more other formats according to one or many transformation functions, the transformation module comprising:
- a stylesheet translation tool for pre-processing XSLT stylesheets that describe how a given transformation is performed on a document, said stylesheet translation tool decomposing said stylesheets into an accelerator specific language consisting of static data structures comprising a set of control units that are atomic transformation operations that can be directly performed on the documents, a constant string table containing string constants for the stylesheets, and a template match information table used to compute which XSLT template to apply to a document at any given time;
- a memory storing said XSLT stylesheets decomposed into said static data structures;
- a document node memory storing tree data structures that represent nodes of the XML documents;
- a document string memory storing string values associated with the nodes of the documents stored in the document node memory, wherein the document node memory stores references that point to memory locations in the document node memory; and
- a processor with a plurality of pipelined stages for executing the control units as atomic operations on a plurality of dedicated hardware resources, said processor being capable of executing several transformations in parallel; and
- wherein said processor comprises an XPath scheduling unit executing control units derived from XPath expressions in said stylesheets and an XSLT scheduling unit executing control units derived from a remaining portion of said stylesheets, and said XSLT scheduling unit comprises a template matching dedicated hardware resource for performing template matching operations, a node set and variable dedicated hardware resource for accessing elements of a node set, where a node set is a list of a document's constituents, and an output generation dedicated hardware resource for building constituents of an output document, and said XPath scheduling unit comprises a tree walker dedicated hardware resource for performing searches on the documents, a string operation dedicated hardware resource for providing string manipulation operations, a math operation dedicated hardware resource for providing various math functions, and a node set dedicated hardware resource for building node sets.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A content router, comprising:
- a routing module for routing incoming XML documents based on their content; and
- a transformation module for transforming the documents from one format to one or more other formats according to one or many transformation functions, said transformation module comprising:
- a stylesheet translation tool for pre-processing XSLT stylesheets that describe how a given transformation is performed on a document, said stylesheet translation tool decomposing said stylesheets into an accelerator specific language consisting of static data structures comprising a set of control units that are atomic transformation operations that can be directly performed on the documents, a constant string table containing string constants for the stylesheets, and a template match information table used to compute which XSLT template to apply to a document at any given time;
- a memory storing said XSLT stylesheets decomposed into said static data structures;
- a document node memory storing tree data structures that represent nodes of the XML documents;
- a document string memory storing string values associated with the nodes of the documents stored in the document node memory, wherein the document node memory stores references that point to memory locations in the document node memory;
- a processor with a plurality of pipelined stages for executing the control units as atomic operations on a plurality of dedicated hardware resources, said processor being capable of executing several transformations in parallel;
- wherein said processor comprises an XPath scheduling unit executing control units derived from XPath expressions in said stylesheets and an XSLT scheduling unit executing control units derived from a remaining portion of said stylesheets, and said XSLT scheduling unit comprises a template matching dedicated hardware resource for performing template matching operations, a node set and variable dedicated hardware resource for accessing elements of a node set, where a node set is a list of a document's constituents, and an output generation dedicated hardware resource for building constituents of an output document, and said XPath scheduling unit comprises a tree walker dedicated hardware resource for performing searches on the documents, a string operation dedicated hardware resource for providing string manipulation operations, a math operation dedicated hardware resource for providing various math functions, and a node set dedicated hardware resource for building node sets; and
- a bus for transferring documents from the routing module to the transformation module during processing by the router.
Dependent claims: 11
Specification