Methods and apparatus for converting markup language data to an intermediate representation
Abstract
Systems, methods and apparatus provide a character processor for processing markup language data, such as XML data, by receiving a character stream of markup language data and applying sequences of characters of the character stream to a set of state machines. The set of state machines includes a plurality of construct state machines responsible for processing respective markup language constructs identified by the sequences of characters. The character processor produces, from application of the sequences of characters to the set of state machines, an intermediate representation of the markup language constructs identified by the sequences of characters of the character stream of markup language data. The intermediate representation contains encoded items containing type, length, value representations of constructs within the character stream of markup language data.
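The character-by-character flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented design: the names ConstructMachine and process, the three construct kinds, and the item layout are all assumptions made for illustration. A small dispatcher identifies the construct from the leading characters and hands the following characters to one construct state machine at a time. Attributes, comments, empty-element tags and entity references are deliberately left out.

```python
class ConstructMachine:
    """Hypothetical construct state machine: collects the characters of one
    construct until its terminating character arrives."""

    def __init__(self, kind, terminator):
        self.kind = kind
        self.terminator = terminator
        self.buf = []

    def feed(self, ch):
        """Consume one character; return a (kind, value) item when done."""
        if ch == self.terminator:
            return (self.kind, "".join(self.buf))
        self.buf.append(ch)
        return None


def process(stream):
    """Apply a character stream to the machines and return the items."""
    items = []
    machine = None       # construct machine currently consuming characters
    tag_opened = False   # True right after '<', before the construct is known
    for ch in stream:
        if tag_opened:
            # Dispatcher step: the character after '<' selects the machine.
            tag_opened = False
            if ch == "/":
                machine = ConstructMachine("END_ELEMENT", ">")
                continue
            machine = ConstructMachine("START_ELEMENT", ">")
        elif machine is None:
            if ch == "<":
                tag_opened = True
                continue
            machine = ConstructMachine("TEXT", "<")
        item = machine.feed(ch)
        if item is not None:
            items.append(item)
            machine = None
            # The '<' that terminated a text run also opens the next tag.
            tag_opened = item[0] == "TEXT"
    if tag_opened or (machine is not None and machine.kind != "TEXT"):
        raise ValueError("character stream ended inside a tag")
    if machine is not None:
        # Flush trailing character data at end of stream.
        items.append((machine.kind, "".join(machine.buf)))
    return items
```

Calling process("<a>hi</a>") returns a start-element item for a, a text item for hi, and an end-element item for a. Keeping each construct in its own small machine means a new construct type can be added without touching the others, which is one plausible reading of why the abstract speaks of a set of machines.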
55 Claims
1. A method for processing markup language data, the method comprising:
receiving a character stream of markup language data;
applying sequences of characters of the character stream to a set of state machines, the set of state machines including a plurality of construct state machines responsible for processing respective markup language constructs identified by the sequences of characters; and
producing, from application of the sequences of characters to the set of state machines, an intermediate representation of the markup language constructs identified by the sequence of characters of the character stream of markup language data, the intermediate representation containing encoded items representative of the original stream of markup language data.
Dependent claims: 2 to 25.
26. A character processor device comprising:
an input interface for receiving a character stream of markup language data;
logic processing coupled to the input interface and configured to receive and apply sequences of characters of the character stream to a set of state machines encoded within the logic processing, the set of state machines including a plurality of construct state machines responsible for processing respective markup language constructs identified by the sequences of characters; and
the logic processing producing, from an output interface coupled to the logic processing, from application of the sequences of characters to the set of state machines, an intermediate representation of the markup language constructs identified by the sequence of characters of the character stream of markup language data, the intermediate representation containing encoded items representative of the original stream of markup language data.
Dependent claims: 27 to 49.
50. A computer program product having a computer-readable medium including computer program logic encoded thereon that, when executed on a processor within a computerized device, provides a character processor that processes markup language data by performing the operations of:
receiving a character stream of markup language data;
applying sequences of characters of the character stream to a set of state machines, the set of state machines including a plurality of construct state machines responsible for processing respective markup language constructs identified by the sequences of characters; and
producing, from application of the sequences of characters to the set of state machines, an intermediate representation of the markup language constructs identified by the sequence of characters of the character stream of markup language data, the intermediate representation containing encoded items representative of the original stream of markup language data.
Dependent claims: 51, 52.
53. A character processor device comprising:
an input interface for receiving a plurality of character streams of extensible markup language (XML) data, each formatted according to an extensible markup language (XML) specification;
logic processing coupled to the input interface and configured to receive and apply sequences of characters of the character streams to a set of state machines encoded within the logic processing, the set of state machines including a plurality of construct state machines that provide means for processing respective markup language constructs identified by the sequences of characters, the plurality of construct state machines include respective construct state machines to process different types of XML constructs; and
the logic processing including means for producing, from an output interface coupled to the logic processing, from application of the sequences of characters to the set of state machines, respective intermediate representations of the markup language constructs identified by the sequence of characters of each character stream of markup language data, the respective intermediate representations containing respective groups of encoded items representative of the original streams of markup language data, the intermediate representations including encoded items containing type, length, value representations of the XML constructs identified by application of the sequences of characters of the character streams to the set of state machines.
Dependent claims: 54, 55.
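The type, length, value encoding recited in claim 53 can be pictured with a short Python sketch. The byte layout here is an assumption, not something the claims specify: a 1-byte type code, a 2-byte big-endian length, and a UTF-8 value. The type codes and the names TYPE_CODES, encode_tlv and decode_tlv are likewise hypothetical.

```python
import struct

# Hypothetical type codes; the claims do not assign numeric values.
TYPE_CODES = {"START_ELEMENT": 1, "END_ELEMENT": 2, "TEXT": 3}
CODE_NAMES = {code: name for name, code in TYPE_CODES.items()}

HEADER = ">BH"  # assumed layout: 1-byte type, 2-byte big-endian length
HEADER_SIZE = struct.calcsize(HEADER)


def encode_tlv(items):
    """Encode (kind, value) items as consecutive type-length-value records."""
    out = bytearray()
    for kind, value in items:
        payload = value.encode("utf-8")
        # struct.pack raises struct.error if the payload exceeds 65535 bytes.
        out += struct.pack(HEADER, TYPE_CODES[kind], len(payload))
        out += payload
    return bytes(out)


def decode_tlv(data):
    """Walk the records by reading each header, then skipping length bytes."""
    items = []
    offset = 0
    while offset < len(data):
        code, length = struct.unpack_from(HEADER, data, offset)
        offset += HEADER_SIZE
        value = data[offset:offset + length].decode("utf-8")
        offset += length
        items.append((CODE_NAMES[code], value))
    return items
```

Because every record carries its own length, a consumer can skip items it does not care about without rescanning characters. For a plurality of streams, one encoded group per stream could be produced by calling encode_tlv once for each stream's items.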
Specification