Methods and systems for high throughput information refinement
First Claim
1. A method for processing messages, the method comprising the steps of:
- a computer system;
receiving, from a plurality of different data sources in a data network, a plurality of data messages each having a data type associated therewith and comprising a payload, wherein the plurality of data messages comprises different data formats;
for each of the data messages, determining a classification of the data message by parsing out information identifying the data type;
re-formatting each of the data messages, each of the re-formatted messages having a uniform data structure comprising an identifier, the classification, and the payload;
selecting a message service queue for each of the reformatted messages from a plurality of message service queues each dedicated to storing messages of a particular data type according to the classifications of the reformatted messages such that each of the selected service queues store a subset of the reformatted messages of a single data type;
with a parsing processor, monitoring the plurality of message service queues; and
with a parsing processor, selecting one of the message service queues based on the monitoring and then retrieving and parsing a next one of the reformatted messages from the selected one of the message service queues in accordance with a target output data model,
wherein the parsing processor comprises a plurality of parsers each operable to parse data messages having a different data type and wherein the parsing processor parses the next one of the reformatted messages using one of the plurality of parsers that is configured for parsing the data type associated with the selected service queue and is dynamically selected and allocated during the parsing step,
wherein the parsing processor, during the parsing step, selects one of the parsers to use for parsing the next one of the reformatted messages based on the data type and at least one parsing rule applied to one or more characteristics of the next one of the reformatted messages and wherein the plurality of parsers includes at least two parsers adapted for parsing the data type associated with the selected service queue, and
wherein the parsing includes extracting a subset of information in the payload of the next one of the reformatted messages defined in the target output data model, whereby throughput of the parsing processor is enhanced by extracting only select information from each of the data messages.
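The pre-processing steps of the claim (classification, re-formatting into a uniform structure, and routing to a per-type queue) can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Envelope` class, the `classify` heuristics, and the queue names are hypothetical and chosen only for illustration.

```python
import itertools
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Uniform structure: identifier, classification, payload (names are assumptions)."""
    identifier: int
    classification: str
    payload: str


_ids = itertools.count(1)


def classify(raw: str) -> str:
    """Parse out a data-type marker from the raw message (assumed heuristics)."""
    text = raw.lstrip()
    if text.startswith("<"):
        return "xml"
    if text.startswith("{"):
        return "json"
    return "text"


def preprocess(raw: str, queues) -> Envelope:
    """Wrap the raw message and append it to the queue dedicated to its type."""
    env = Envelope(next(_ids), classify(raw), raw)
    queues[env.classification].append(env)
    return env


# One queue per classification, so each queue holds a single data type.
queues = defaultdict(deque)
for raw in ['{"a": 1}', "<order id='7'/>", "plain note", '{"b": 2}']:
    preprocess(raw, queues)
```

In this sketch the payload is carried through unchanged; only the wrapper is uniform, which leaves all format-specific work to the parsers downstream.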
2 Assignments
0 Petitions
Abstract
Methods, systems, and articles of manufacture consistent with the present invention provide a data processing system comprising a business application that receives data messages from a plurality of client data sources. The business application comprises a message pre-processor and a parsing processor. The message pre-processor classifies and identifies the data messages and sends each message, in a structured format, to the message queue corresponding to its data type. The parsing processor receives the data messages from the message queues and selects a parser by applying a set of parsing rules. The parsing rules apply information about the data message and provide a decision as to the best parsing engine to use out of a plurality of parsing engines. The parsing engines are also able to perform information refinement in accordance with selected components defined in a target output data model.
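As a rough illustration of rule-based parser selection and information refinement, the following Python sketch keeps two parsers for one data type and picks between them with a single rule applied to a message characteristic (payload size). The `kv` data type, the size threshold, the field names in `TARGET_MODEL`, and both parser functions are hypothetical; the abstract does not specify any of them.

```python
TARGET_MODEL = ("order_id", "amount")  # hypothetical target output data model
LARGE_PAYLOAD = 64                     # hypothetical threshold, in characters


def parse_kv_full(payload, fields):
    """Build every key=value pair, then keep only the target fields."""
    pairs = dict(line.split("=", 1) for line in payload.splitlines() if "=" in line)
    return {f: pairs[f] for f in fields if f in pairs}


def parse_kv_early_exit(payload, fields):
    """Scan line by line and stop as soon as all target fields are found."""
    wanted, out = set(fields), {}
    for line in payload.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in wanted and key not in out:
            out[key] = value
            if len(out) == len(wanted):
                break
    return out


# At least two parsers are registered for the same data type.
PARSERS = {"kv": [parse_kv_full, parse_kv_early_exit]}


def select_parser(classification, payload):
    """Assumed parsing rule: large payloads go to the early-exit parser."""
    candidates = PARSERS[classification]
    return candidates[1] if len(payload) > LARGE_PAYLOAD else candidates[0]


def refine(classification, payload):
    """Select a parser at parse time and extract only the target-model fields."""
    parser = select_parser(classification, payload)
    return parser(payload, TARGET_MODEL)
```

For example, `refine("kv", "order_id=7\namount=19.99\nnote=x")` returns only `order_id` and `amount`; the `note` field is dropped because it is not part of the assumed target model.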
18 Claims
1. A method for processing messages, the method comprising the steps of:
a computer system;
receiving, from a plurality of different data sources in a data network, a plurality of data messages each having a data type associated therewith and comprising a payload, wherein the plurality of data messages comprises different data formats;
for each of the data messages, determining a classification of the data message by parsing out information identifying the data type;
re-formatting each of the data messages, each of the re-formatted messages having a uniform data structure comprising an identifier, the classification, and the payload;
selecting a message service queue for each of the reformatted messages from a plurality of message service queues each dedicated to storing messages of a particular data type according to the classifications of the reformatted messages such that each of the selected service queues store a subset of the reformatted messages of a single data type;
with a parsing processor, monitoring the plurality of message service queues; and
with a parsing processor, selecting one of the message service queues based on the monitoring and then retrieving and parsing a next one of the reformatted messages from the selected one of the message service queues in accordance with a target output data model,
wherein the parsing processor comprises a plurality of parsers each operable to parse data messages having a different data type and wherein the parsing processor parses the next one of the reformatted messages using one of the plurality of parsers that is configured for parsing the data type associated with the selected service queue and is dynamically selected and allocated during the parsing step,
wherein the parsing processor, during the parsing step, selects one of the parsers to use for parsing the next one of the reformatted messages based on the data type and at least one parsing rule applied to one or more characteristics of the next one of the reformatted messages and wherein the plurality of parsers includes at least two parsers adapted for parsing the data type associated with the selected service queue, and
wherein the parsing includes extracting a subset of information in the payload of the next one of the reformatted messages defined in the target output data model, whereby throughput of the parsing processor is enhanced by extracting only select information from each of the data messages.
Dependent Claims (2, 3, 10, 14, 15)
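The monitoring and queue-selection steps of claim 1 can be sketched as follows. The claim only requires that a queue be selected "based on the monitoring"; the deepest-queue-first policy used here, and the function names `select_queue` and `drain`, are assumptions made for illustration.

```python
from collections import deque


def select_queue(queues):
    """Return the name of the queue with the most waiting messages, or None if all are empty."""
    name = max(queues, key=lambda n: len(queues[n]), default=None)
    if name is None or not queues[name]:
        return None
    return name


def drain(queues):
    """Yield (queue_name, message) pairs until every queue is empty."""
    while (name := select_queue(queues)) is not None:
        # Retrieve the next message from the selected queue (FIFO order).
        yield name, queues[name].popleft()


queues = {
    "json": deque(["j1", "j2", "j3"]),
    "xml": deque(["x1"]),
    "text": deque(),
}
order = list(drain(queues))
```

Other policies (round-robin, priority by data type, age of the oldest message) would fit the same `select_queue` interface; the choice affects fairness between data types rather than the parsing itself.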
4. A non-transitory computer-readable storage device having a program code embodied therein to perform a method for processing messages, the method comprising the steps of:
a computer system;
receiving, from a plurality of different data sources in a data network, a plurality of data messages each having a data type associated therewith and comprising a payload, wherein the plurality of data messages comprises different data formats;
for each of the data messages, determining a classification of the data message by parsing out information identifying the data type;
re-formatting each of the data messages, each of the re-formatted messages having a uniform data structure comprising an identifier, the classification, and the payload;
selecting a message service queue for each of the re-formatted messages from a plurality of message service queues each dedicated to storing messages of a particular type according to the classifications of the reformatted messages such that each of the selected service queues store a subset of the reformatted messages of a single type;
with a parsing processor, monitoring the plurality of message service queues; and
with a parsing processor, selecting a next one of the message service queues to service based on the monitoring, receiving a next one of the reformatted messages from the selected one of the message service queues, and parsing the next one of the reformatted messages in accordance with a target output data model using one of a plurality of parsers selected by the parsing processor, after the receiving, using characteristics of the data message,
wherein the parsing processor comprises a plurality of parsers each operable to parse data messages having a different data type and wherein the parsing processor parses the next one of the reformatted messages using one of the plurality of parsers that is configured for parsing the data type associated with the selected service queue and is dynamically selected and allocated during the parsing step,
wherein the parsing processor selects one of the parsers to use for parsing the next one of the reformatted messages based on the data type and at least one parsing rule applied to one or more characteristics of the next one of the reformatted messages and wherein the plurality of parsers includes at least two parsers adapted for parsing the data type associated with the selected service queue, and
wherein the parsing processor extracts a subset of information in the payload of the next one of the reformatted messages defined in the target output data model, whereby throughput of the parsing processor is enhanced by extracting only select information from each of the data messages.
Dependent Claims (5, 6, 11, 12)
7. A computer system comprising:
a central processing unit and a memory;
a receiving unit for receiving, from a plurality of different data sources in a data network a plurality of data messages each having a data type associated therewith and comprising a payload, wherein the plurality of data messages comprises different data formats;
an identification unit for determining a classification of each of the data messages by parsing out information identifying the data type;
a formatting unit for re-formatting each of the data messages, wherein each of the re-formatted messages comprises an identifier, the classification, and the payload;
a selecting unit for selecting, based on the data type, a message service queue for each of the reformatted messages from a plurality of message service queues from the memory each adapted to process messages having differing data types such that each of the selected service queues store a subset of the reformatted messages of a single data type;
monitoring the plurality of message service queues; and
a parsing processor executed by the central processing unit that retrieves the reformatted message and parses the retrieved message using one of a plurality of parsers operable to parse data messages according to data types, wherein the parsing processor selects the one of the parsers based on the monitoring and based on the classification of the retrieved message and another characteristic of the retrieved message determined from the message payload,
wherein the parsing processor comprises a plurality of parsers each operable to parse data messages having a different data type and wherein the parsing processor parses the next one of the reformatted messages using one of the plurality of parsers that is configured for parsing the data type associated with the selected service queue and is dynamically selected and allocated during the parsing step,
wherein the parsing processor selects one of the parsers to use for parsing the next one of the reformatted messages based on the data type and at least one parsing rule applied to one or more characteristics of the next one of the reformatted messages and wherein the plurality of parsers includes at least two parsers adapted for parsing the data type associated with the selected service queue, and
wherein the parsing processor extracts a subset of information in the payload of the next one of the reformatted messages defined in the target output data model, whereby throughput of the parsing processor is enhanced by extracting only select information from each of the data messages.
Dependent Claims (8, 9, 13, 16, 17, 18)
Specification