Method to collate and extract desired contents from heterogeneous text-data streams
Abstract
A method extracts desired contents from multiple heterogeneous textual streams and provides normalized data representative of the desired contents. The method selects input streams containing text data, wherein the text data of different input streams may differ in format. The method further selects a first set of parse rules corresponding to one input stream and a second set of parse rules, distinct from the first set, which corresponds to a second input stream. The invention extracts desired contents from the input streams and provides normalized data which represents the desired contents. The invention selects an output interface and adapts the normalized data representing the desired contents to the output interface. The invention sends the normalized data to the output interface, and the output interface is instructed to transform and format the normalized data into device-specific data.
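The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the two stream formats (comma-delimited and key=value), the field names `host` and `status`, and the functions `parse_csv_line`, `parse_kv_line`, and `extract` are all assumptions chosen for illustration. Each stream is paired with its own rule set, and the extracted contents are consolidated into one normalized list.

```python
import io
import re

# Hypothetical parse rules: each function maps one line of text to a dict of
# desired fields, or returns None when the line holds nothing of interest.
def parse_csv_line(line):
    # Rule set for a comma-delimited stream: host,status,latency_ms
    parts = line.strip().split(",")
    if len(parts) != 3:
        return None
    return {"host": parts[0], "status": parts[1]}

KEY_VALUE = re.compile(r"(\w+)=(\S+)")

def parse_kv_line(line):
    # Rule set for a key=value stream, e.g. "node=db1 state=up uptime=12"
    pairs = dict(KEY_VALUE.findall(line))
    if "node" not in pairs or "state" not in pairs:
        return None
    return {"host": pairs["node"], "status": pairs["state"]}

def extract(streams_with_rules):
    """Parse each stream with its own rule set and consolidate the results."""
    normalized = []
    for stream, rule in streams_with_rules:
        for line in stream:
            record = rule(line)
            if record is not None:
                normalized.append(record)
    return normalized

# Two heterogeneous input streams, simulated here with in-memory text.
first_stream = io.StringIO("web1,up,35\nweb2,down,0\n")
second_stream = io.StringIO("node=db1 state=up uptime=12\n# comment line\n")

records = extract([(first_stream, parse_csv_line), (second_stream, parse_kv_line)])
for record in records:
    print(record)
```

Both streams end up as records with the same keys, which is what allows a single output interface to consume them without knowing the original formats.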
20 Claims
1. A method for extracting desired contents from multiple heterogeneous textual streams to provide normalized data, the method comprising:
selecting, by a user, a plurality of input streams, the plurality of input streams comprising a first input stream of text data and second input stream of text data heterogeneous with respect to the first input stream;
selecting, by a user, a first set of parse rules corresponding to the first input stream and a second set of parse rules corresponding to the second input stream and distinct from the first set of parse rules;
selecting a first output interface;
extracting desired contents from the plurality of input streams; and
providing normalized data representing the desired contents, and adapted to the output interface.
formatting, by the first output interface, the normalized data into the device specific data, the device-specific data corresponding to a logical output device; and
writing the device-specific data to the logical output device.
4. The method of claim 3, wherein the logical output device is selected from a text stream device, a file device, a database device, a monitor device, and a printer device.
5. The method of claim 2, further comprising sending the normalized data to a plurality of output interfaces, including the first output interface, the plurality of output interfaces being associated with a plurality of logical output devices.
6. The method of claim 5, further comprising:
providing, by each output interface of the plurality of interfaces, device specific data corresponding to a respective, corresponding, logical output device of the plurality of logical output devices; and
sending the device specific data to the respective, corresponding, logical output device.
7. The method of claim 5, wherein the plurality of logical output devices is heterogeneous.
8. The method of claim 5, wherein each logical output device of the plurality of logical output devices is selected from a text stream device, a file device, a database device, a monitor device, and a printer device.
9. The method of claim 1, wherein the desired contents comprise desired fields defined by delimiters.
10. The method of claim 1, wherein extracting comprises parsing each input stream according to the parse rules associated therewith to obtain the desired contents.
11. The method of claim 1, wherein providing the normalized data comprises consolidating and formatting the desired contents.
12. The method of claim 1, wherein the input streams are selected from a file and a data stream.
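The output side recited in claims 5 through 8 (normalized data sent to several output interfaces, each producing device-specific data for its own logical output device) might look like the following sketch. The class names `TextStreamInterface` and `DatabaseInterface`, the `send` method, the `status` table, and the record fields are hypothetical; the claims do not prescribe any of them. An in-memory `io.StringIO` stands in for a text stream device and an in-memory SQLite connection for a database device.

```python
import io
import sqlite3

class TextStreamInterface:
    """Formats normalized records as tab-separated lines for a text stream device."""
    def __init__(self, device):
        self.device = device  # any object with a write() method

    def send(self, records):
        for rec in records:
            self.device.write(f"{rec['host']}\t{rec['status']}\n")

class DatabaseInterface:
    """Formats normalized records as rows for a database device."""
    def __init__(self, connection):
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS status (host TEXT, status TEXT)"
        )

    def send(self, records):
        rows = [(rec["host"], rec["status"]) for rec in records]
        self.connection.executemany("INSERT INTO status VALUES (?, ?)", rows)
        self.connection.commit()

# Normalized data as it might arrive from the extraction step (assumed shape).
normalized = [{"host": "web1", "status": "up"}, {"host": "db1", "status": "down"}]

text_device = io.StringIO()
db_device = sqlite3.connect(":memory:")
interfaces = [TextStreamInterface(text_device), DatabaseInterface(db_device)]

# The same normalized data goes to every interface; each one adapts it
# to its own heterogeneous device.
for interface in interfaces:
    interface.send(normalized)

stored = db_device.execute("SELECT host, status FROM status ORDER BY host").fetchall()
```

Keeping the device-specific formatting inside each interface means the extraction step needs no knowledge of where the data ends up.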
13. A computer system for extracting desired contents from multiple heterogeneous textual streams to provide normalized data to a first output interface, the system comprising:
a processor programmed to open multiple heterogeneous input streams, selectable by a user, and parse the input streams according to parse rules, selectable by a user, the processor further programmed to extract desired contents from the input streams and consolidate the desired contents into normalized data;
memory operably connected to the processor for storing data structures, the data structures including;
an opening module for opening the input streams;
a parsing module for parsing the input streams;
an extraction module for extracting the desired contents from the input streams; and
the first output interface for transforming the normalized data into device specific data.
18. A computer readable medium storing data structures for extracting desired contents from multiple heterogeneous textual streams to provide normalized data representing the desired contents to a first output interface module adapted to format the normalized data, the data structures comprising:
portions of the textual streams;
an opening module for opening the textual streams;
an extraction module for extracting the desired contents from the textual streams;
device configuration data defining a configuration of the first output interface module including identification data for identifying an output device and including format data for formatting the normalized data;
parse rules, selectable by a user and associated with the textual streams defining locations of the desired contents relative to other textual data in the textual streams; and
the first output interface module executable by the processor for processing the device configuration data, receiving the normalized data, and formatting the normalized data to provide device specific data.
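Claim 18 describes device configuration data that carries both identification data for an output device and format data for the normalized data. One way to picture this, purely as an assumed sketch, is a small configuration dictionary that an output interface module reads before formatting. The keys `device_id` and `line_format`, the `make_output_interface` helper, and the `devices` registry are hypothetical names, not terms from the claims.

```python
import io

def make_output_interface(config, devices):
    """Build a send() function from device configuration data.

    config["device_id"]   -- identification data: which output device to use
    config["line_format"] -- format data: how to render one normalized record
    """
    device = devices[config["device_id"]]
    line_format = config["line_format"]

    def send(records):
        for rec in records:
            # Produce device-specific text from the normalized record.
            device.write(line_format.format(**rec) + "\n")

    return send

# Hypothetical registry of available output devices, keyed by identifier.
devices = {"console": io.StringIO()}
config = {"device_id": "console", "line_format": "{host}: {status}"}

send = make_output_interface(config, devices)
send([{"host": "web1", "status": "up"}])
output = devices["console"].getvalue()
```

Under this arrangement, changing the target device or the layout is a configuration change rather than a code change.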
Specification