Computer system and method of data analysis

US 6,035,295 A
Filed: 11/24/1998
Issued: 03/07/2000
Est. Priority Date: 01/07/1997
Status: Expired due to Term


Abstract

A neural-network-based data comparison system compares data stored within a database against each other to identify duplicative, fraudulent, defective and/or irregular data. The system includes a database storing data and a pattern database storing pattern data. A data pattern build system, responsively connected to both databases, retrieves the data from the database and generates the pattern data formatted in accordance with a predetermined pattern. The predetermined pattern includes an array having array locations corresponding to each character in a defined character set. The data pattern build system increments the value in each array location according to the number of occurrences of the corresponding character in the data, and stores the array as the pattern data in the pattern database. A neural network, responsively connected to the pattern database, retrieves the stored pattern data, compares the pattern data to each other, and determines from the comparison when different pattern data match in accordance with predetermined criteria, indicating that the different pattern data are duplicative, fraudulent, defective and/or irregular.
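
The character-count array described in the abstract can be illustrated with a minimal Python sketch. The abstract only says "a defined character set", so the set used here (lowercase letters and digits), the function name `build_pattern`, and the choice to skip characters outside the set are all assumptions made for illustration, not details taken from the patent.

```python
import string

# Assumed character set: lowercase letters and digits. The source only says
# "a defined character set"; this particular choice is illustrative.
CHARSET = string.ascii_lowercase + string.digits
INDEX = {ch: i for i, ch in enumerate(CHARSET)}


def build_pattern(data: str) -> list[int]:
    """Return an array with one location per character in CHARSET,
    holding how many times that character occurs in `data`."""
    pattern = [0] * len(CHARSET)
    for ch in data:
        i = INDEX.get(ch)
        if i is not None:  # characters outside the set are skipped (assumption)
            pattern[i] += 1
    return pattern


pattern = build_pattern("anna 1991")
print(pattern[INDEX["a"]], pattern[INDEX["n"]], pattern[INDEX["9"]])  # 2 2 2
```

Each record is reduced to a fixed-length vector regardless of its original length, which is what allows patterns to be compared to each other directly.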

16 Claims

1. A data comparison system for comparing data stored within a database against each other to determine at least one of duplicate, fraudulent, defective and irregular data, said data comparison system comprising:
- a database storing data therein;
  
  a pattern database storing pattern data therein;
  
  a data pattern build system, responsively connected to said database and to said pattern database, retrieving the data from said database and generating the pattern data formatted in accordance with a predetermined pattern, the predetermined pattern comprising an array having array locations corresponding to each character in a defined character set, said data pattern build system incrementing a value in each of the array locations responsive to the number of occurrences of each character in the data and storing the array as the pattern data in said pattern database; and
  
  a neural network, responsively connected to said pattern database, retrieving the pattern data stored therein and comparing the pattern data to each other and determining responsive to the comparing when different pattern data match in accordance with predetermined criteria indicating that the different pattern data are at least one of duplicate, fraudulent, defective and irregular.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
- - 2. A data comparison system according to claim 1, wherein said data pattern build system normalizes the value in each of the array locations with respect to a total number of characters in the data producing a normalized array for each data, and stores the normalized array as the pattern data in said pattern database.
  - 3. A data comparison system according to claim 1, wherein said neural network determines when the different pattern data match responsive to a threshold value indicating that the different pattern data are at least one of duplicate, fraudulent, defective and irregular.
  - 4. A data comparison system according to claim 1, wherein said neural network determines when the different pattern data match irrespective of whether the different pattern data include upper and lower case characters or empty spaces.
  - 5. A data comparison system according to claim 1, wherein said neural network comprises one of a Kohonen neural network and a Back Propagation neural network.
  - 6. A data comparison system according to claim 1, wherein said data pattern build system generates the pattern data formatted in accordance with a second predetermined pattern, the second predetermined pattern comprising the array having the array locations corresponding to pairs of characters in the defined character set, said data pattern build system incrementing the value in each of the array locations responsive to the number of occurrences of each of the pairs of the characters.
  - 7. A data comparison system according to claim 6, wherein an array number representing the number of array locations comprises the number of characters in the defined character set to the power of 2.
  - 8. A data comparison system according to claim 6, wherein the pairs of characters comprise adjacent pairs of characters.
  - 9. A data comparison system according to claim 1, wherein said data pattern build system generates the pattern data formatted in accordance with a second predetermined pattern, the second predetermined pattern comprising the array having the array locations corresponding to three character combinations in the defined character set, said data pattern build system incrementing the value in each of the array locations responsive to the number of occurrences of each of the three character combinations.
  - 10. A data comparison system according to claim 9, wherein an array number representing the number of array locations comprises the number of characters in the defined character set to the power of 3.
  - 11. A data comparison system according to claim 1, further comprising defective/duplicative data detection means for determining when the different pattern data are at least one of duplicate, fraudulent, defective and irregular, and for at least one of:
    - displaying whether the different pattern data are at least one of duplicate, fraudulent, defective and irregular;
      
      generating a report confirming that the different pattern data are valid and for processing the validated pattern data in accordance with a predetermined set of processing instructions;
      
      generating a warning report indicating that the different pattern data are potentially at least one of duplicate, fraudulent, defective and irregular; and
      
      generating an electronic mail message reporting to a data originator of the data represented by the different pattern data indicating whether the different pattern data are at least one of duplicate, fraudulent, defective and irregular.
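
Dependent claims 2, 4 and 6 through 10 describe normalized arrays, insensitivity to case and spaces, and arrays indexed by character pairs or triples. The following Python sketch shows one way these variations could fit together. It rests on several assumptions not found in the claims: the character set is the 26 lowercase letters, case and spaces are folded at pattern-build time (claim 4 attributes this behavior to the neural network), triples are taken from adjacent characters, and normalization divides by the total of the counted entries. The names `fold`, `ngram_pattern` and `normalize` are hypothetical.

```python
import string
from itertools import product

CHARSET = string.ascii_lowercase  # assumed character set, for illustration only


def fold(data: str) -> str:
    """Lowercase the data and drop spaces so neither affects the pattern."""
    return "".join(ch for ch in data.lower() if ch != " ")


def ngram_pattern(data: str, n: int = 1) -> list[int]:
    """Count adjacent n-character combinations.

    The array has len(CHARSET) ** n locations: one per single character
    (n=1), per pair (n=2), or per triple (n=3).
    """
    index = {"".join(combo): i
             for i, combo in enumerate(product(CHARSET, repeat=n))}
    pattern = [0] * len(index)
    text = fold(data)
    for start in range(len(text) - n + 1):
        i = index.get(text[start:start + n])
        if i is not None:  # combinations with out-of-set characters are skipped
            pattern[i] += 1
    return pattern


def normalize(pattern: list[int]) -> list[float]:
    """Scale counts so they sum to 1 (assumed reading of claim 2)."""
    total = sum(pattern)
    return [v / total for v in pattern] if total else [0.0] * len(pattern)


a = ngram_pattern("John Smith", n=2)
b = ngram_pattern("JOHNSMITH", n=2)
print(a == b)                                  # True
print(len(a), len(ngram_pattern("abc", n=3)))  # 676 17576
```

The array sizes printed on the last line correspond to the character-set size raised to the power of 2 and 3, as recited in claims 7 and 10.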

12. In a data comparison system for comparing data stored within a repository against each other to determine at least one of duplicate, fraudulent, defective and irregular data, the data comparison system comprising a database storing data therein, a pattern database storing pattern data therein, a data pattern build system and a neural network, a method of comparing the data stored within the database against each other to determine at least one of duplicate, fraudulent, defective and irregular data, said method comprising the steps of:
- (a) retrieving the data from the database;
  
  (b) generating pattern data formatted in accordance with a predetermined pattern from the data, the predetermined pattern comprising an array having array locations corresponding to each character in a defined character set, said generating the pattern data comprising the step of incrementing a value in each of the array locations responsive to the number of occurrences of each character in the data;
  
  (d) storing the array as the pattern data; and
  
  (e) retrieving the pattern data and comparing the pattern data to each other; and
  
  (f) determining responsive to said comparing when different pattern data match in accordance with predetermined criteria indicating that the different pattern data are at least one of duplicate, fraudulent, defective and irregular.
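
The method steps of claim 12 can be traced end to end in a short Python sketch. One substitution is made deliberately: the claim relies on a neural network for the comparison, whereas this sketch uses a plain cosine-similarity threshold as a stand-in so that the example stays self-contained. The character set, the threshold value of 0.95, and the names `build_pattern`, `cosine` and `find_matches` are assumptions for illustration.

```python
import math
import string
from itertools import combinations

CHARSET = string.ascii_lowercase + string.digits  # assumed character set


def build_pattern(data: str) -> list[int]:
    """Step (b): count occurrences of each character in CHARSET."""
    counts = [0] * len(CHARSET)
    for ch in data.lower():
        i = CHARSET.find(ch)
        if i >= 0:
            counts[i] += 1
    return counts


def cosine(p: list[int], q: list[int]) -> float:
    """Cosine similarity; a simple stand-in for the neural network."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def find_matches(database: dict, threshold: float = 0.95) -> list[tuple]:
    # Steps (a), (b), (d): retrieve each record, build and store its pattern.
    pattern_db = {key: build_pattern(value) for key, value in database.items()}
    # Steps (e), (f): compare every pair of patterns and flag pairs whose
    # similarity meets the (assumed) predetermined criterion.
    return [(k1, k2)
            for (k1, p1), (k2, p2) in combinations(pattern_db.items(), 2)
            if cosine(p1, p2) >= threshold]


records = {"r1": "hello world", "r2": "HELLO WORLD", "r3": "xyz 123"}
print(find_matches(records))  # [('r1', 'r2')]
```

Comparing every pair is quadratic in the number of records, which is acceptable for a sketch but would need blocking or clustering for a large database.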

13. A computer based system for examining data stored electronically in a database of a computer, wherein the database stores records each having one or more fields, comprising:
- accessing means for accessing criteria stored in the computer, wherein the criteria includes the number of items stored in the database to be examined;
  
  sample generation means for generating a sample set of data based on the criteria, such that said sample set of data includes at least the number of items to be examined in accordance with the criteria, wherein said sample set of data is selected by applying at least one of a focus group criteria, a filter criteria, a skew criteria, or an empty field indicator, wherein said sample generation means includes at least one of:
    - focus group means, responsive to said focus group criteria, for logically organizing a variety of fields within the database whose combined accuracy are analyzed as one unit,
      
      filter means, responsive to said filter criteria, for determining records and fields for inclusion in said sample set,
      
      empty field means, responsive to said empty field indicator, for not including empty fields in said sample set when said sample set is generated, and
      
      skew means, responsive to said skew criteria, for emphasizing one or more fields within a record such that said sample set is biased towards said emphasized one or more fields, but does not limit said sample set to only emphasized fields; and
  
  analysis means for accepting said errors, and determining results, said results indicating approximate accuracy values of the records stored in said database.
- Dependent claims (14, 15, 16)
- - 14. A computer based system according to claim 13, further comprising:
    - a pattern database storing pattern data therein;
      
      a data pattern build system, responsively connected to said database and to said pattern database, retrieving the data from said database and generating the pattern data formatted in accordance with a predetermined pattern, the predetermined pattern comprising an array having array locations; and
      
      a neural network, responsively connected to said pattern database, retrieving the pattern data stored therein and comparing the pattern data to each other and determining when different pattern data match in accordance with predetermined criteria indicating that the different pattern data are at least one of duplicate, fraudulent, defective and irregular.
  - 15. A computer based system according to claim 14, wherein said data pattern build system receives said results from said analysis means and filters the records stored in said database prior to building the pattern data responsive to said results.
  - 16. A computer based system according to claim 14, wherein said neural network generates a match result signal indicating whether the different pattern data are at least one of duplicate, fraudulent, defective and irregular, and wherein said analysis means receives the match result from said neural network and simultaneously indicates the approximate accuracy values of the records stored in said database and the different pattern data which match.
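
The sample generation of claim 13 (filter, empty-field exclusion and skew) can be sketched in Python as weighted sampling without replacement. This is only an illustration under stated assumptions: records are modeled as nested dictionaries, the skew criteria are modeled as per-field weights, the weighted draw uses the Efraimidis-Spirakis key method (a technique chosen here, not one named in the claims), and focus groups are not modeled. The name `generate_sample` and its parameters are hypothetical.

```python
import random


def generate_sample(records, n_items, keep=lambda rec_id, field, value: True,
                    skew=None, skip_empty=True, seed=None):
    """Draw up to n_items (record id, field, value) items.

    records:    dict mapping record id -> dict of field name -> value
    keep:       filter deciding which records and fields may be included
    skew:       dict of field name -> weight (> 0); unlisted fields weigh 1.0
    skip_empty: when True, empty fields are never sampled
    """
    rng = random.Random(seed)
    skew = skew or {}
    candidates = []
    for rec_id, fields in records.items():
        for field, value in fields.items():
            if skip_empty and value in ("", None):
                continue
            if not keep(rec_id, field, value):
                continue
            weight = skew.get(field, 1.0)
            # Efraimidis-Spirakis key: heavier fields tend to get larger keys,
            # so they are favored without excluding the other fields.
            key = rng.random() ** (1.0 / weight)
            candidates.append((key, rec_id, field, value))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(rec_id, field, value)
            for _, rec_id, field, value in candidates[:n_items]]


records = {
    1: {"name": "Ann", "zip": ""},
    2: {"name": "Bob", "zip": "10001"},
    3: {"name": None, "zip": "94105"},
}
sample = generate_sample(records, 3, skew={"zip": 5.0}, seed=0)
print(len(sample))  # 3
```

Because the skew only reweights the draw, unemphasized fields can still appear in the sample, which matches the claim's requirement that the sample is biased towards, but not limited to, the emphasized fields.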

Current Assignee: Renaissance Group IP Holdings, LLC
Original Assignee: Laurence C. Klein
Inventors: Klein, Laurence C.
Primary Examiner(s): Black, Thomas G.
Assistant Examiner(s): MIZRAHI, DIANE D
Application Number: US09/198,442
Time in Patent Office: 469 Days
Field of Search: 707/1, 707/2, 707/5, 707/6, 707/7, 707/10, 707/101, 707/204, 707/205, 707/206, 707/508, 704/2, 455/410, 382/144, 382/145, 382/149, 382/321, 347/225, 347/260, 84/609, 342/457
US Class Current: 1/1

CPC Class Codes
G06F 11/008   Reliability or availability...
G06F 11/1612   where the redundant compone...
G06F 2221/2101   Auditing as a secondary aspect
Y10S 707/99932   Access augmentation or opti...
Y10S 707/99935   Query augmenting and refini...
Y10S 707/99936   Pattern matching access
Y10S 707/99937   Sorting
Y10S 707/99942   Manipulating data structure...
Y10S 707/99945   Object-oriented database st...
Y10S 707/99953   Recoverability
