Systems and methods for processing data

US 9,501,455 B2
Filed: 06/30/2011
Issued: 11/22/2016
Est. Priority Date: 06/30/2011
Status: Active Grant

Abstract

A method for processing at least partially unstructured data is provided. The method includes receiving, at a data processing tool, at least partially unstructured data from at least one data source, and processing the at least partially unstructured data to generate at least partially structured data that includes tagged data, wherein processing the at least partially unstructured data includes at least one of processing the at least partially unstructured data using an associative memory application, and processing the at least partially unstructured data using a regular expression processing program. The method further includes transmitting the at least partially structured data to a main application, and incorporating the at least partially structured data into the main application based at least in part on the tagged data, wherein incorporating the at least partially structured data includes at least one of including and excluding data based on the existence, content and/or type of a tag.
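
The regular-expression branch of the method, followed by tag-based inclusion or exclusion, can be illustrated with a minimal sketch. The pattern, the `<PART>` tag format, and the function names `tag_terms` and `incorporate` are assumptions made for illustration; the patent does not specify any of them.

```python
import re

# Hypothetical pattern for a term of interest (two letters, four digits).
PART_NUMBER = re.compile(r"\b[A-Z]{2}\d{4}\b")

def tag_terms(text: str, pattern: re.Pattern, tag: str) -> str:
    """Insert `tag` immediately before every match of `pattern`."""
    return pattern.sub(lambda m: f"{tag}{m.group(0)}", text)

def incorporate(segments: list[str], tag: str, include_tagged: bool = True) -> list[str]:
    """Include or exclude segments based on the existence of `tag`."""
    return [s for s in segments if (tag in s) == include_tagged]

# Illustrative input only; not data from the patent.
segments = ["Replaced valve AB1234 on left engine.", "No defects noted."]
tagged = [tag_terms(s, PART_NUMBER, "<PART>") for s in segments]
kept = incorporate(tagged, "<PART>")
```

Here `kept` holds only the segment that received a tag; passing `include_tagged=False` would instead keep the untagged segment.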

20 Claims

1. A method for processing data, the method comprising:
- receiving, at a data processing tool, at least one data file including at least partially unstructured data from at least one data source, wherein the at least partially unstructured data includes actual data from a main application;
  
  processing, by a processor, the at least partially unstructured data to generate at least partially structured data that includes tagged data, wherein the tagged data includes a tag inserted to precede at least one identified term of interest, and wherein processing the at least partially unstructured data comprises at least one of:
  
  processing the at least partially unstructured data using an associative memory application that tags the at least one term of interest based on a generated identification score exceeding a predetermined threshold where the score is determined based on the number of matching terms between a segment of unstructured text and a segment of text in the associative memory application; and
  
  processing the at least partially unstructured data using a regular expression processing program;
  
  transmitting the at least one data file including the at least partially structured data to the main application;
  
  incorporating the at least partially structured data into the main application based at least in part on the tagged data, wherein incorporating the at least partially structured data comprises at least one of including and excluding data based on at least one of existence, content and type of a tag;
  
  displaying, at a user interface, the at least partially structured data, wherein at least partially structured data includes at least one segment of misidentified data that is at least one of incorrectly tagged and incorrectly not tagged;
  
  receiving, at the user interface, a user selection of at least one segment of misidentified data;
  
  updating the misidentified data to form re-identified data;
  
  updating the associative memory application to include the re-identified data that includes data that has been correctly tagged or correctly not tagged;
  
  receiving, at the data processing tool, text segments generated by parsing the at least partially unstructured data into discrete text segments;
  
  identifying one or more of the text segments as boilerplate data based on a comparison between the text segments and strings of text in a column incorporated in an associative memory application, wherein the text segments need not exactly match the strings of text in the associative memory application; and
  
  incorporating data including text segments parsed from the at least partially structured data into the main application, wherein the text identified as boilerplate data is excluded from the data incorporated into the main application.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The method according to claim 1, further comprising:
    - verifying at least partially structured data is tagged correctly; and
      
      releasing at least partially structured data, such that the at least partially structured data may be incorporated into the main application.
  - 3. The method according to claim 2, wherein verifying at least partially structured data comprises examining one or more identification tags in the at least partially structured data.
  - 4. The method according to claim 1, wherein processing at least partially unstructured data using an associative memory application comprises:
    - parsing at least partially unstructured data into one or more segments of at least partially unstructured data;
      
      querying the associative memory application with at least one segment of the at least partially unstructured data;
      
      generating a score associated with the at least one segment of at least partially unstructured data and at least one segment of data in the associative memory application; and
      
      tagging the at least one segment of at least partially unstructured data based on the score.
  - 5. The method according to claim 4, wherein querying the associative memory application comprises querying an associative memory application that includes at least one segment of data containing boilerplate, and wherein tagging at least one segment of at least partially unstructured data comprises tagging at least one segment of at least partially unstructured data that includes boilerplate.
  - 6. The method according to claim 1, further comprising:
    - updating the data processing tool based on the at least one segment of misidentified data.
  - 7. The method according to claim 1, further comprising outputting at least partially structured data to one of an output table and an output hypertext markup language (HTML) page.
  - 8. The method according to claim 1, wherein processing the at least partially unstructured data using a regular expression processing program comprises:
    - applying at least one source regular expression pattern to at least partially unstructured data;
      
      matching at least one segment of the at least partially unstructured data to the at least one source regular expression pattern; and
      
      tagging at least one matched segment of the at least partially unstructured data.
  - 9. The method according to claim 8, wherein tagging at least one matched segment of the at least partially unstructured data comprises tagging at least one matched segment of at least partially unstructured data with an identification tag.
  - 10. The method according to claim 1, wherein updating the misidentified data comprises:
    - placing the misidentified data back into the processing without correcting the misidentified data; and
      
      manually identifying the misidentified data to form the re-identified data.
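
The scoring step recited in claims 1 and 4 (count matching terms between an input segment and a stored segment, then tag when the score exceeds a threshold) can be sketched as follows. This is an illustrative simplification, not the patented implementation: the in-memory list stands in for the associative memory application, the whole segment is treated as the term of interest, and the names `MemoryEntry`, `score`, and `tag_segment` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MemoryEntry:
    text: str  # stored segment of text
    tag: str   # tag associated with that stored segment

def terms(text: str) -> set[str]:
    """Split text into a set of lowercase whitespace-delimited terms."""
    return set(text.lower().split())

def score(segment: str, entry: MemoryEntry) -> int:
    """Number of distinct terms shared by the segment and a stored entry."""
    return len(terms(segment) & terms(entry.text))

def tag_segment(segment: str, memory: list[MemoryEntry], threshold: int) -> str:
    """Prefix the segment with the best entry's tag if its score exceeds `threshold`."""
    if not memory:
        return segment
    best = max(memory, key=lambda e: score(segment, e))
    if score(segment, best) > threshold:
        return f"{best.tag}{segment}"  # tag is inserted to precede the text
    return segment

# Illustrative memory contents; the tag name is an assumption.
memory = [MemoryEntry("hydraulic pump leak detected", "<DEFECT>")]
hit = tag_segment("Hydraulic pump leak found near gear", memory, threshold=2)
miss = tag_segment("Routine inspection complete", memory, threshold=2)
```

The first call shares three terms with the stored entry, which exceeds the threshold of 2, so the tag is prepended; the second shares none and is returned unchanged.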

11. One or more non-transitory computer-readable storage media having computer-executable instructions embodied thereon, wherein when executed by at least one processor, the computer-executable instructions cause the at least one processor to:
- receive, at a data processing tool, at least one data file including at least partially unstructured data from at least one data source, wherein the at least partially unstructured data includes actual data from a main application;
  
  process the at least partially unstructured data to generate at least partially structured data that includes tagged data, wherein the tagged data includes a tag inserted to precede at least one identified term of interest, and wherein to process the at least partially unstructured data, the computer-executable instructions cause the processor to:
  
  process the at least partially unstructured data using an associative memory application that tags the at least one term of interest based on a generated identification score exceeding a predetermined threshold where the score is determined based on the number of matching terms between a segment of unstructured text and a segment of text in the associative memory application; and
  
  process the at least partially unstructured data using a regular expression processing program;
  
  transmit the at least one data file including the at least partially structured data to the main application;
  
  incorporate the at least partially structured data into the main application based at least in part on the tagged data, wherein incorporating the at least partially structured data includes at least one of including and excluding data based on existence of a tag;
  
  display, at a user interface, the at least partially structured data, wherein at least partially structured data includes at least one segment of misidentified data that is at least one of incorrectly tagged and incorrectly not tagged;
  
  receive, at the user interface, a user selection of at least one segment of misidentified data;
  
  update the misidentified data to form re-identified data;
  
  update the associative memory application to include the re-identified data that includes data that has been correctly tagged or correctly not tagged;
  
  receive, at the data processing tool, text segments generated by parsing the at least partially unstructured data into discrete text segments;
  
  identify one or more of the text segments as boilerplate data based on a comparison between the text segments and strings of text in a column incorporated in an associative memory application, wherein the text segments need not exactly match the strings of text in the associative memory application; and
  
  incorporate data including text segments parsed from the at least partially structured data into the main application, wherein the text identified as boilerplate data is excluded from the data incorporated into the main application.
- Dependent claims (12, 13, 14, 15)
  - 12. One or more non-transitory computer-readable storage media having computer-executable instructions embodied thereon according to claim 11, wherein to process the at least partially unstructured data using an associative memory application, the computer-executable instructions cause the at least one processor to:
    - parse the at least partially unstructured data into one or more segments of the at least partially unstructured data;
      
      query the associative memory application with at least one segment of the at least partially unstructured data;
      
      generate a score associated with the at least one segment of the at least partially unstructured data and at least one segment of data in the associative memory application; and
      
      tag the at least one segment of the at least partially unstructured data based on the score.
  - 13. One or more non-transitory computer-readable storage media having computer-executable instructions embodied thereon according to claim 11, wherein the computer-executable instructions cause the at least one processor to:
    - update the data processing tool based on the at least one segment of misidentified data.
  - 14. One or more non-transitory computer-readable storage media having computer-executable instructions embodied thereon according to claim 11, wherein to process the at least partially unstructured data using a regular expression processing program, the computer-executable instructions cause the at least one processor to:
    - apply at least one source regular expression pattern to the at least partially unstructured data;
      
      match at least one segment of the at least partially unstructured data to the at least one source regular expression pattern; and
      
      tag the at least one matched segment of the at least partially unstructured data.
  - 15. One or more non-transitory computer-readable storage media having computer-executable instructions embodied thereon according to claim 11, wherein the computer-executable instructions cause the at least one processor to output the at least partially structured data to one of an output table and an output hypertext markup language (HTML) page.
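
The boilerplate step in claim 11 compares parsed text segments against stored strings without requiring an exact match, then excludes the matches from what is incorporated. The sketch below substitutes a character-level similarity ratio from Python's standard `difflib` module for the associative memory comparison; that substitution, the 0.8 cutoff, and the function names are assumptions for illustration only.

```python
from difflib import SequenceMatcher

def is_boilerplate(segment: str, known: list[str], min_ratio: float = 0.8) -> bool:
    """True if the segment is similar enough to any known boilerplate string."""
    return any(
        SequenceMatcher(None, segment.lower(), k.lower()).ratio() >= min_ratio
        for k in known
    )

def drop_boilerplate(segments: list[str], known: list[str], min_ratio: float = 0.8) -> list[str]:
    """Return only the segments not identified as boilerplate."""
    return [s for s in segments if not is_boilerplate(s, known, min_ratio)]

# Stand-in for the column of boilerplate strings; illustrative text only.
known = ["This report is confidential and proprietary."]
segments = [
    "This report is confidential and proprietary",  # inexact match (no period)
    "Cracked bracket found on panel 4.",
]
incorporated = drop_boilerplate(segments, known)
```

The first segment differs from the stored string only by a trailing period, so it is treated as boilerplate and excluded; the second is kept.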

16. A system for processing data, the system comprising:
- a processing device;
  
  a user interface communicatively coupled to said processing device; and
  
  at least one of a memory communicatively coupled to said processing device and a communications interface communicatively coupled to said processing device, said processing device programmed to:
  
  receive at least one data file including at least partially unstructured data from at least one of said memory and said communications interface, wherein the at least partially unstructured data includes actual data from a main application; and
  
  process the at least partially unstructured data using a data processing tool executing thereon to generate at least partially structured data that includes tagged data including a tag inserted to precede at least one identified term of interest by at least one of:
  
  processing the at least partially unstructured data using an associative memory application executing thereon that tags the at least one term of interest based on a generated identification score exceeding a predetermined threshold where the score is determined based on the number of matching terms between a segment of unstructured text and a segment of text in the associative memory application; and
  
  processing the at least partially unstructured data using a regular expression processing program executing thereon; and
  
  incorporate the at least partially structured data into the main application based on the tagging, wherein incorporating the at least partially structured data includes at least one of including and excluding data based on existence of a tag; and
  
  display, at the user interface, the at least partially structured data, wherein at least partially structured data includes at least one segment of misidentified data that is at least one of incorrectly tagged and incorrectly not tagged;
  
  receive a user selection of at least one segment of misidentified data;
  
  update the misidentified data to form re-identified data;
  
  update the associative memory application to include the re-identified data that includes data that has been correctly tagged or correctly not tagged;
  
  receive, at the data processing tool, text segments generated by parsing the at least partially unstructured data into discrete text segments;
  
  identify one or more of the text segments as boilerplate data based on a comparison between the text segments and strings of text in a column incorporated in an associative memory application, wherein the text segments need not exactly match the strings of text in the associative memory application; and
  
  incorporate data including text segments parsed from the at least partially structured data into the main application, wherein the text identified as boilerplate data is excluded from the data incorporated into the main application.
- Dependent claims (17, 18, 19, 20)
  - 17. The system according to claim 16, wherein said processing device is further programmed to:
    - update the data processing tool executing thereon based on the at least one segment of misidentified data.
  - 18. The system according to claim 16, wherein to process the at least partially unstructured data using an associative memory application, said processing device further programmed to:
    - parse the at least partially unstructured data into one or more segments of the at least partially unstructured data;
      
      query the associative memory application executing thereon with at least one segment of the at least partially unstructured data;
      
      generate a score associated with the at least one segment of the at least partially unstructured data and at least one segment of data in the associative memory application; and
      
      tag the at least one segment of the at least partially unstructured data based on the score.
  - 19. The system according to claim 16, wherein to process the at least partially unstructured data using a regular expression processing program, said processing device further programmed to:
    - apply at least one source regular expression pattern to the at least partially unstructured data;
      
      match at least one segment of the at least partially unstructured data to the at least one source regular expression pattern; and
      
      tag the at least one matched segment of the at least partially unstructured data.
  - 20. The system according to claim 16, wherein said processing device further programmed to output the at least partially structured data to one of an output table in said memory and an output hypertext markup language (HTML) page for display via said user interface.
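
The correction loop shared by the independent claims (a user selects misidentified data, the data is re-identified, and the associative memory is updated with it) can be sketched as below. A plain dictionary keyed on exact lowercase text is used here as a deliberately simple stand-in for the associative memory application; the function names and the `<DEFECT>` tag are hypothetical.

```python
from typing import Optional

def apply_correction(memory: dict[str, Optional[str]], segment: str,
                     correct_tag: Optional[str]) -> None:
    """Store the user's decision: a tag, or None for 'correctly not tagged'."""
    memory[segment.lower()] = correct_tag

def retag(segment: str, memory: dict[str, Optional[str]]) -> str:
    """Re-identify a segment using corrections already stored in memory."""
    tag = memory.get(segment.lower())
    return f"{tag}{segment}" if tag else segment

memory: dict[str, Optional[str]] = {}
# A segment that had been incorrectly tagged: record that it needs no tag.
apply_correction(memory, "Torque wrench calibrated", None)
# A segment that had been incorrectly left untagged: record its tag.
apply_correction(memory, "Fuel line chafing observed", "<DEFECT>")

fixed_a = retag("Torque wrench calibrated", memory)
fixed_b = retag("Fuel line chafing observed", memory)
```

After both corrections are stored, later passes over the same text reproduce the user's decisions instead of the original misidentification.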

Current Assignee: The Boeing Co.
Original Assignee: The Boeing Co.
Inventors: Quadracci, Leonard Jon; Nakamoto, Kyle M.; Warn, Brian
Primary Examiner(s): Spooner, Lamont
Application Number: US13/173,028
Publication Number: US 20130006610A1
Time in Patent Office: 1,972 Days
Field of Search: 704/1, 704/9, 704/10, 707/706-708
US Class Current: 1/1
CPC Class Codes:
G06F 40/117 Tagging; Marking up details...
G06F 40/289 Phrasal analysis, e.g. fini...
