Method and apparatus for retrieving and processing data
Abstract
Data is captured from a web site or other data source. Data is extracted from the web page using a data harvesting script or other data acquisition routine. The extracted data is then normalized and stored in a database. If data cannot be extracted from the web page, a copy of the captured web page is stored without personal information contained in the web page. The data harvesting script is then edited based on an analysis of the captured web page.
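The flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the regular expression standing in for a "data harvesting script", the redaction patterns, the `Store` class standing in for a database, and the choice to normalize balances to integer cents.

```python
import re
from dataclasses import dataclass, field


class HarvestError(Exception):
    """Raised when the harvesting script cannot find the expected data."""


# Hypothetical harvesting script: a pattern tied to one specific page layout.
BALANCE_PATTERN = re.compile(r'<td class="balance">\s*\$?([\d,]+\.\d{2})\s*</td>')

# Hypothetical patterns for personal information to strip before storage.
PERSONAL_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-like identifiers
    re.compile(r"\b\d{8,16}\b"),           # long digit runs such as account numbers
]


def harvest_balance(page: str) -> str:
    """Extract the raw balance string, or fail if the layout is not as expected."""
    match = BALANCE_PATTERN.search(page)
    if match is None:
        raise HarvestError("balance field not found; layout may have changed")
    return match.group(1)


def normalize_balance(raw: str) -> int:
    """Normalize a string such as '1,234.56' to integer cents."""
    dollars, cents = raw.replace(",", "").split(".")
    return int(dollars) * 100 + int(cents)


def redact(page: str) -> str:
    """Return a copy of the page with personal information replaced."""
    for pattern in PERSONAL_PATTERNS:
        page = pattern.sub("[REDACTED]", page)
    return page


@dataclass
class Store:
    """In-memory stand-in for the database."""
    balances: dict = field(default_factory=dict)
    failed_pages: list = field(default_factory=list)


def process(user_id: str, page: str, store: Store) -> bool:
    """Harvest, normalize and store; on failure keep only a redacted page copy."""
    try:
        raw = harvest_balance(page)
    except HarvestError:
        store.failed_pages.append(redact(page))
        return False
    store.balances[user_id] = normalize_balance(raw)
    return True
```

In this sketch the redacted copies in `failed_pages` are what a script author would inspect when editing the harvesting pattern; the editing step itself is not modeled.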
8 Claims
1. A computer implemented method comprising:
a processor receiving account access information associated with a user via a network;
in response to receiving the account access information, accessing a web page associated with the user's account using the received account access information;
extracting data from the web page associated with the user's account including executing a script;
verifying the user's ability to access the account based on the data extracted from the web page associated with the user's account; and
if the user's ability to access the account is not verified, resulting in a failure to verify the user's ability to access the account, determining a reason associated with the failure;
determining whether a change in a data layout of a web site has occurred;
if the change in the data layout of the web site has been determined, the processor reporting the change of the data layout as a failure to access due to the change in the data layout of the web site such that a process of editing scripts for accessing the site is automatically initiated, wherein the method further comprises storing the web page without the account access information, and detecting errors while a script is accessing the changed web page, and editing the script in response, wherein editing comprises editing the script to properly harvest data from the changed web page, and wherein the web page is associated with a financial institution, and the web page contains information regarding a customer's personal identification information associated with the financial institution.
Dependent claims: 2, 3, 4
5. A computer implemented method comprising:
a processor receiving user information comprising account access information associated with a user via a network;
capturing a web page from a web site based on the account access information, wherein the web page contains user account data and user identification information, wherein the user account data is extracted from the web page using a data harvesting script;
normalizing the user account data, and storing the user account data in a database;
detecting an error while the script is accessing a changed web page;
storing the web page without the user information;
editing the script;
identifying each failure to verify the user's ability to access the account based on information contained in the captured web page;
the processor assigning bugs based on the identified failures wherein identified failures include modification of the captured web page; and
automatically accessing HTML data to repair scripts associated with each failure, wherein repairing comprises determining modification of the captured web page and modification of scripts accordingly, and wherein the captured web page is associated with a financial institution, and web page contains information regarding a customer's personal identification information associated with the financial institution.
6. An apparatus comprising:
a data capture module to receive account information associated with a user, the data capture module further to capture a web page from a web site associated with a financial institution using the received account information;
a data extraction module coupled to the data capture module and configured to extract data from the captured web page using a data harvesting script, the data extraction module further configured to verify the user's ability to access the account based on the data extracted from the captured web page; and
a database control module coupled to the data extraction module and configured to store the captured web page;
a failure analysis module coupled to the data extraction module and configured to,
verify the user's ability to access the account based on the data extracted from the captured web page;
determine whether the verification has failed;
if the verification has failed, identify the failure;
sort the failure; and
report the failure such that a process of repairing a script associated with the failure is automatically initiated, and wherein the process comprises accessing code associated with a web page, and wherein a cause of the failure comprises a change in the web page, and wherein the repair comprises determining the change in the web page,
the data extraction module further configure to,
detect errors while the script is accessing a changed web page;
store the web page without account information associated with the user; and
edit the script to properly harvest data from the changed web page.
Dependent claims: 7, 8
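The failure handling recited in the claims (identify the failure, sort it, and report it so that script repair is initiated) can be sketched in Python as follows. This is only one possible reading under stated assumptions: the structural "fingerprint" built from tag names and class attributes, the "invalid password" marker, the reason labels, and the repair queue are all hypothetical and do not appear in the claims.

```python
from collections import defaultdict
from html.parser import HTMLParser


class _TagCollector(HTMLParser):
    """Collects the sequence of start tags and their class attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class") or ""
        self.tags.append(f"{tag}.{css_class}")


def layout_fingerprint(page: str) -> tuple:
    """Hypothetical layout summary: page structure with all text content ignored."""
    collector = _TagCollector()
    collector.feed(page)
    collector.close()
    return tuple(collector.tags)


def classify_failure(page: str, known_fingerprint: tuple) -> str:
    """Identify a reason for a failed verification (labels are assumptions)."""
    if "invalid password" in page.lower():  # hypothetical rejection marker
        return "bad_credentials"
    if layout_fingerprint(page) != known_fingerprint:
        return "layout_changed"
    return "unknown"


def sort_failures(failures):
    """Group (site, reason) pairs by reason, with sites sorted within each group."""
    buckets = defaultdict(list)
    for site, reason in failures:
        buckets[reason].append(site)
    return {reason: sorted(sites) for reason, sites in buckets.items()}


def report(buckets: dict, repair_queue: list) -> None:
    """Queue every site whose layout changed so that script repair can begin."""
    for site in buckets.get("layout_changed", []):
        repair_queue.append(site)
```

Comparing structure rather than raw HTML means that a changed balance or date does not register as a layout change; only added, removed, or re-classed elements do.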
Specification