Systems and methods for processing data
First Claim
1. A method for processing data, the method comprising:
receiving, at a data processing tool, at least one data file including at least partially unstructured data from at least one data source, wherein the at least partially unstructured data includes actual data from a main application;
processing, by a processor, the at least partially unstructured data to generate at least partially structured data that includes tagged data, wherein the tagged data includes a tag inserted to precede at least one identified term of interest, and wherein processing the at least partially unstructured data comprises at least one of:
processing the at least partially unstructured data using an associative memory application that tags the at least one term of interest based on a generated identification score exceeding a predetermined threshold where the score is determined based on the number of matching terms between a segment of unstructured text and a segment of text in the associative memory application; and
processing the at least partially unstructured data using a regular expression processing program;
transmitting the at least one data file including the at least partially structured data to the main application;
incorporating the at least partially structured data into the main application based at least in part on the tagged data, wherein incorporating the at least partially structured data comprises at least one of including and excluding data based on at least one of existence, content and type of a tag;
displaying, at a user interface, the at least partially structured data, wherein at least partially structured data includes at least one segment of misidentified data that is at least one of incorrectly tagged and incorrectly not tagged;
receiving, at the user interface, a user selection of at least one segment of misidentified data;
updating the misidentified data to form re-identified data;
updating the associative memory application to include the re-identified data that includes data that has been correctly tagged or correctly not tagged;
receiving, at the data processing tool, text segments generated by parsing the at least partially unstructured data into discrete text segments;
identifying one or more of the text segments as boilerplate data based on a comparison between the text segments and strings of text in a column incorporated in an associative memory application, wherein the text segments need not exactly match the strings of text in the associative memory application; and
incorporating data including text segments parsed from the at least partially structured data into the main application, wherein the text identified as boilerplate data is excluded from the data incorporated into the main application.
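The scoring and feedback steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the in-memory list standing in for the associative memory application, the `<PART>` tag format, the sample text, and the names `score`, `tag_segment`, and `add_correction` are all assumptions made for illustration. The sketch counts the terms shared between an incoming segment and each stored segment, inserts a tag before the term of interest when that count exceeds a threshold, and lets a user correction extend the stored data.

```python
import re

# Hypothetical stand-in for the associative memory application: stored text
# segments, each associated with a term of interest and a tag.
MEMORY = [
    {"text": "hydraulic pump leaking at fitting", "term": "pump", "tag": "<PART>"},
    {"text": "replaced cracked bracket on door", "term": "bracket", "tag": "<PART>"},
]

def tokenize(text):
    """Split text into a set of lowercase alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(segment, stored_text):
    """Identification score: number of terms the two segments share."""
    return len(tokenize(segment) & tokenize(stored_text))

def tag_segment(segment, memory=MEMORY, threshold=2):
    """Insert a tag before a term when a stored segment scores above threshold."""
    tagged = segment
    for entry in memory:
        if score(segment, entry["text"]) > threshold:
            pattern = r"\b" + re.escape(entry["term"]) + r"\b"
            tagged = re.sub(
                pattern,
                lambda m: entry["tag"] + m.group(0),  # tag precedes the term
                tagged,
                flags=re.IGNORECASE,
            )
    return tagged

def add_correction(memory, text, term, tag):
    """Record re-identified data so later segments can match against it."""
    memory.append({"text": text, "term": term, "tag": tag})

print(tag_segment("Found hydraulic pump leaking near valve"))
```

With the sample data, the first stored segment shares three terms with the input (hydraulic, pump, leaking), which exceeds the assumed threshold of 2, so the tag is inserted before "pump". A segment that shares only one term is left untagged until a correction adds a closer stored segment.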
Abstract
A method for processing at least partially unstructured data is provided. The method includes receiving, at a data processing tool, at least partially unstructured data from at least one data source, and processing the at least partially unstructured data to generate at least partially structured data that includes tagged data, wherein processing the at least partially unstructured data includes at least one of processing the at least partially unstructured data using an associative memory application, and processing the at least partially unstructured data using a regular expression processing program. The method further includes transmitting the at least partially structured data to a main application, and incorporating the at least partially structured data into the main application based at least in part on the tagged data, wherein incorporating the at least partially structured data includes at least one of including and excluding data based on the existence, content and/or type of a tag.
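As a rough illustration of the regular-expression path and the tag-based incorporation step summarized in the abstract, the following sketch tags matches with a preceding marker and then keeps only segments carrying a wanted tag type. The patterns, the `<TYPE>` tag syntax, and the function names `tag_with_regex` and `incorporate` are hypothetical choices for this example, not details taken from the patent.

```python
import re

# Hypothetical patterns: each tag type maps to a regular expression.
PATTERNS = {
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "PARTNO": re.compile(r"\b[A-Z]{2}\d{4}\b"),
}

# Matches a tag such as <DATE> and captures its type.
TAG = re.compile(r"<(\w+)>")

def tag_with_regex(text):
    """Insert a <TYPE> tag immediately before every pattern match."""
    for tag_type, pattern in PATTERNS.items():
        text = pattern.sub(lambda m, t=tag_type: f"<{t}>{m.group(0)}", text)
    return text

def incorporate(segments, include_types):
    """Keep segments that carry at least one tag whose type is wanted."""
    kept = []
    for segment in segments:
        types_present = set(TAG.findall(segment))
        if types_present & include_types:
            kept.append(segment)
    return kept

tagged = tag_with_regex("Replaced AB1234 on 2020-01-15")
print(tagged)
print(incorporate([tagged, "No issues found"], {"PARTNO"}))
```

Here inclusion depends on the type of tag present; filtering on the mere existence of any tag, or on the tagged content, would follow the same shape with a different test inside `incorporate`.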
39 Citations
20 Claims
11. One or more non-transitory computer-readable storage media having computer-executable instructions embodied thereon, wherein when executed by at least one processor, the computer-executable instructions cause the at least one processor to:
receive, at a data processing tool, at least one data file including at least partially unstructured data from at least one data source, wherein the at least partially unstructured data includes actual data from a main application;
process the at least partially unstructured data to generate at least partially structured data that includes tagged data, wherein the tagged data includes a tag inserted to precede at least one identified term of interest, and wherein to process the at least partially unstructured data, the computer-executable instructions cause the processor to:
process the at least partially unstructured data using an associative memory application that tags the at least one term of interest based on a generated identification score exceeding a predetermined threshold where the score is determined based on the number of matching terms between a segment of unstructured text and a segment of text in the associative memory application; and
process the at least partially unstructured data using a regular expression processing program;
transmit the at least one data file including the at least partially structured data to the main application;
incorporate the at least partially structured data into the main application based at least in part on the tagged data, wherein incorporating the at least partially structured data includes at least one of including and excluding data based on existence of a tag;
display, at a user interface, the at least partially structured data, wherein at least partially structured data includes at least one segment of misidentified data that is at least one of incorrectly tagged and incorrectly not tagged;
receive, at the user interface, a user selection of at least one segment of misidentified data;
update the misidentified data to form re-identified data;
update the associative memory application to include the re-identified data that includes data that has been correctly tagged or correctly not tagged;
receive, at the data processing tool, text segments generated by parsing the at least partially unstructured data into discrete text segments;
identify one or more of the text segments as boilerplate data based on a comparison between the text segments and strings of text in a column incorporated in an associative memory application, wherein the text segments need not exactly match the strings of text in the associative memory application; and
incorporate data including text segments parsed from the at least partially structured data into the main application, wherein the text identified as boilerplate data is excluded from the data incorporated into the main application.
16. A system for processing data, the system comprising:
a processing device;
a user interface communicatively coupled to said processing device; and
at least one of a memory communicatively coupled to said processing device and a communications interface communicatively coupled to said processing device, said processing device programmed to:
receive at least one data file including at least partially unstructured data from at least one of said memory and said communications interface, wherein the at least partially unstructured data includes actual data from a main application; and
process the at least partially unstructured data using a data processing tool executing thereon to generate at least partially structured data that includes tagged data including a tag inserted to precede at least one identified term of interest by at least one of:
processing the at least partially unstructured data using an associative memory application executing thereon that tags the at least one term of interest based on a generated identification score exceeding a predetermined threshold where the score is determined based on the number of matching terms between a segment of unstructured text and a segment of text in the associative memory application; and
processing the at least partially unstructured data using a regular expression processing program executing thereon; and
incorporate the at least partially structured data into the main application based on the tagging, wherein incorporating the at least partially structured data includes at least one of including and excluding data based on existence of a tag; and
display, at the user interface, the at least partially structured data, wherein at least partially structured data includes at least one segment of misidentified data that is at least one of incorrectly tagged and incorrectly not tagged;
receive a user selection of at least one segment of misidentified data;
update the misidentified data to form re-identified data;
update the associative memory application to include the re-identified data that includes data that has been correctly tagged or correctly not tagged;
receive, at the data processing tool, text segments generated by parsing the at least partially unstructured data into discrete text segments;
identify one or more of the text segments as boilerplate data based on a comparison between the text segments and strings of text in a column incorporated in an associative memory application, wherein the text segments need not exactly match the strings of text in the associative memory application; and
incorporate data including text segments parsed from the at least partially structured data into the main application, wherein the text identified as boilerplate data is excluded from the data incorporated into the main application.
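The boilerplate limitation shared by the independent claims, in which segments are compared against stored strings without requiring an exact match, might be sketched as follows. This is only an approximation under stated assumptions: Python's `difflib.SequenceMatcher` similarity ratio is swapped in for the associative memory comparison, and the sample strings, the 0.8 cutoff, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical "column" of known boilerplate strings.
BOILERPLATE_COLUMN = [
    "This document contains proprietary information.",
    "Please do not reply to this message.",
]

def is_boilerplate(segment, column=BOILERPLATE_COLUMN, min_ratio=0.8):
    """Return True if the segment is close to any stored boilerplate string.

    An exact match is not required; similarity is measured with
    SequenceMatcher.ratio(), an assumed substitute for the comparison
    performed by the associative memory application.
    """
    candidate = segment.strip().lower()
    return any(
        SequenceMatcher(None, candidate, known.lower()).ratio() >= min_ratio
        for known in column
    )

def incorporate_without_boilerplate(segments):
    """Keep only the segments that are not identified as boilerplate."""
    return [s for s in segments if not is_boilerplate(s)]

segments = [
    "This document contains proprietary info.",
    "Pump replaced at station 4.",
]
print(incorporate_without_boilerplate(segments))
```

The first sample segment differs from the stored string only in its final word, so it is treated as boilerplate and excluded, while the maintenance note is kept.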
Specification