Method and apparatus for determining a set of large sequences from an electronic data base
Abstract
The instant invention relates to a method of and an apparatus for determining a set of large sequences from an electronic data base comprising a set D={d1, . . . , dn} of transactions di (1≦i≦n) in a computer system with an implemented query module, each of the large sequences on the set D of transactions di having a support value greater than or equal to a given support value S, each of the transactions di of the set D being a sequence of items of a record E={e1, . . . , em} of items ej (1≦j≦m). A set Lk (k>2) of large sequences is determined from the set D of transactions, the large sequences of set Lk each comprising exactly k items of record E in a respective order RLK, and an associated support value SLK on the sequence D of transactions each being greater than or equal to the given support value S, and nothing but sequences comprising two of the large sequences of set Lk−1, as partly overlapping partial sequences, with the respective order RLK−1, being taken into account in determining set Lk.
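The candidate restriction described in the abstract can be illustrated with a short sketch. It assumes that "partly overlapping" means the last k−2 items of one large (k−1)-sequence equal the first k−2 items of another; the source does not spell this out, and the function name `join_overlapping` is hypothetical.

```python
def join_overlapping(large_prev):
    """Build k-item candidates from pairs of large (k-1)-item sequences.

    Assumption (not stated in the source): two sequences "partly overlap"
    when the first one without its first item equals the second one
    without its last item.
    """
    prev = set(large_prev)
    candidates = set()
    for a in prev:
        for b in prev:
            if a[1:] == b[:-1]:
                # a and b share k-2 items; append b's last item to a
                candidates.add(a + (b[-1],))
    return candidates


# Hypothetical set L2 of large 2-item sequences
l2 = {("a", "b"), ("b", "c"), ("c", "a")}
print(sorted(join_overlapping(l2)))
# [('a', 'b', 'c'), ('b', 'c', 'a'), ('c', 'a', 'b')]
```

Only these joined candidates would then need their support value checked against S, rather than every possible k-item sequence over the record E.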
7 Claims
1. A method of determining a set of large sequences from an electronic data base comprising a set D={d1, . . . , dn} of transactions di (1≦i≦n) in a computer system with an implemented query module, each of the large sequences on the set D of transactions di having a support value greater than or equal to a given support value S, each of the transactions di of the set D being a sequence of items of a record E={e1, . . . , em} of items ej (1≦j≦m) and the method comprising the following steps;
a) determining a set L1 of large sequences from the set D of transactions, the large sequences of set L1 each comprising exactly one item of the record E, and an assigned support value SL1 on the sequence D of transactions each being greater than or equal to the given support value S;
b) determining a set L2 of large sequences from the set D of transactions, the large sequences of set L2 each comprising exactly two items of the record E in a respective order RL2, and an assigned support value SL2 on the set D of transactions each being greater than or equal to the given support value S, and nothing but sequences comprising one of the large sequences of set L1, as a partial sequence, being taken into account in determining set L2;
c) determining a set Lk (k>2) of large sequences from the set D of transactions, the large sequences of set Lk each comprising exactly k items of record E in a respective order RLK, and an assigned support value SLK on the sequence D of transactions each being greater than or equal to the given support value S, and nothing but sequences comprising two of the large sequences of set Lk−1, as partly overlapping partial sequences, with the respective order RLK−1, being taken into account in determining set Lk; and
d) repeating step c) for k=k+1 and terminating the repetition of step c) when a given termination condition is fulfilled.
Dependent claims: 2, 3, 4, 5
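Steps a) to d) can be sketched as a level-wise loop. This is a minimal illustration under stated assumptions, not the patented implementation: the support value is taken to be the number of transactions that contain a sequence in order with gaps allowed, the termination condition is an empty level or an optional maximum length, and all names (`contains`, `support`, `large_sequences`, `max_len`) are hypothetical.

```python
from itertools import product


def contains(transaction, seq):
    """True if the items of seq occur in transaction in the same order.

    Gaps between items are allowed; this is an assumption, since the
    claim text does not define containment.
    """
    it = iter(transaction)
    return all(item in it for item in seq)


def support(transactions, seq):
    """Absolute number of transactions containing seq (an assumption)."""
    return sum(contains(t, seq) for t in transactions)


def large_sequences(transactions, min_support, max_len=None):
    """Return {k: set of large k-item sequences} for k = 1, 2, ..."""
    # Step a): large sequences with exactly one item
    items = {item for t in transactions for item in t}
    level = {(e,) for e in items if support(transactions, (e,)) >= min_support}
    result = {}
    k = 1
    while level:  # assumed termination condition: no large sequences left
        result[k] = level
        if max_len is not None and k >= max_len:
            break
        k += 1
        if k == 2:
            # Step b): ordered pairs built only from large 1-item sequences
            candidates = {a + b for a, b in product(level, repeat=2)}
        else:
            # Step c): join two large (k-1)-sequences that overlap in k-2 items
            candidates = {a + (b[-1],) for a in level for b in level
                          if a[1:] == b[:-1]}
        # Keep only candidates that reach the given support value S
        level = {c for c in candidates if support(transactions, c) >= min_support}
    return result


# Hypothetical data base D with four transactions
D = [("a", "b", "c"), ("a", "c", "b", "c"), ("a", "b", "c", "d"), ("b", "a")]
found = large_sequences(D, min_support=2)
print(found[3])  # {('a', 'b', 'c')}
```

In this example item `d` is dropped in step a), so no candidate containing it is ever counted; that pruning is the point of restricting each level to combinations of the previous level's large sequences.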
6. A computer program product for determining a set of large sequences from an electronic data base comprising a set D={d1, . . . , dn} of transactions di (1≦i≦n) in a computer system with an implemented query module, each of the large sequences on the set D of transactions di having a support value greater than or equal to a given support value S, each of the transactions di of the set D being a sequence of items of a record E={e1, . . . , em} of items ej (1≦j≦m) and the product comprising the following means;
a) means recorded on an electronic storage medium for determining a set L1 of large sequences from the set D of transactions, the large sequences of set L1 each comprising exactly one item of the record E, and an assigned support value SL1 on the sequence D of transactions each being greater than or equal to the given support value S;
b) means recorded on the electronic storage medium for determining a set L2 of large sequences from the set D of transactions, the large sequences of set L2 each comprising exactly two items of the record E in a respective order RL2, and an assigned support value SL2 on the set D of transactions each being greater than or equal to the given support value S, and nothing but sequences comprising one of the large sequences of set L1, as a partial sequence, being taken into account in determining set L2;
c) means recorded on the storage medium for determining a set Lk (k>2) of large sequences from the set D of transactions, the large sequences of set Lk each comprising exactly k items of record E in a respective order RLK, and an assigned support value SLK on the sequence D of transactions each being greater than or equal to the given support value S, and nothing but sequences comprising two of the large sequences of set Lk−1, as partly overlapping partial sequences, with the respective order RLK−1, being taken into account in determining set Lk; and
d) means recorded on the electronic storage medium for repeating step c) for k=k+1 and terminating the repetition of step c) when a given termination condition is fulfilled.
7. An integrated sequential analysis system, comprising:
an electronic data base comprising a set D={d1, . . . , dn} of transactions di (1≦i≦n), each of the large sequences on the set D of transactions di having a support value greater than or equal to a given support value S, each of the transactions di of the set D being a sequence of items of a record E={e1, . . . , em} of items ej (1≦j≦m);
a query module comprising a query means coupled to the data base and a processing means for detecting query parameters and generating queries to the query means;
means for determining a set L1 of large sequences from the set D of transactions, the large sequences of set L1 each comprising exactly one item of the record E, and an assigned support value SL1 on the sequence D of transactions each being greater than or equal to the given support value S;
means for determining a set L2 of large sequences from the set D of transactions, the large sequences of set L2 each comprising exactly two items of the record E in a respective order RL2, and an assigned support value SL2 on the set D of transactions each being greater than or equal to the given support value S, and nothing but sequences comprising one of the large sequences of set L1, as a partial sequence, being taken into account in determining set L2;
means for determining a set Lk (k>2) of large sequences from the set D of transactions, the large sequences of set Lk each comprising exactly k items of record E in a respective order RLK, and an assigned support value SLK on the sequence D of transactions each being greater than or equal to the given support value S, and nothing but sequences comprising two of the large sequences of set Lk−1, as partly overlapping partial sequences, with the respective order RLK−1, being taken into account in determining set Lk; and
means for repeating step c) for k=k+1 and terminating the repetition of step c) when a given termination condition is fulfilled.
Specification