Multi-resolution graph-based clustering
First Claim
1. A method of defining the electrofacies of a geological formation traversed by a bore hole comprising:
- moving a sonde through a plurality of positions in said bore hole and recording a data set having a number d of measurements taken by the sonde at each of the plurality of positions, wherein the d measurements at each position are associable with a point in d-dimensional space; and
calculating a neighboring index I of each measurement point.
1 Assignment
0 Petitions
Accused Products
Abstract
An apparatus and method for obtaining facies of geological formations for identifying mineral deposits is disclosed. Logging instruments are moved in a bore hole to produce log measurements at successive levels of the bore hole. The set of measurements at each such level of the bore hole interval is associated with reference sample points within a multidimensional space. The multidimensional scatter of sample points thus obtained is analyzed to determine a set of characteristic modes. The sample points associated with characteristic modes are grouped to identify clusters. A facies is designated for each of the clusters and a graphic representation of the succession of facies as a function of the depth is thus obtained. To identify the clusters, a “neighboring index” of each log measurement point in the data set is calculated. Next, small natural groups of points are formed based on the use of the neighboring index to determine a K-Nearest-Neighbor (KNN) attraction for each point. Independently of the natural group formation, an optimal number of clusters is calculated based on a Kernel Representative Index (KRI) and based on a user-specified resolution. Lastly, based on the data calculated from the prior steps, final clusters are formed by merging the smaller clusters.
-
Citations
47 Claims
-
1. A method of defining the electrofacies of a geological formation traversed by a bore hole comprising:
-
moving a sonde through a plurality of positions in said bore hole and recording a data set having a number d of measurements taken by the sonde at each of the plurality of positions, wherein the d measurements at each position are associable with a point in d-dimensional space; and
calculating a neighboring index I of each measurement point. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
determining for each given point x said given point'"'"'s nearest neighbor ranks m relative to all other measurement points in the data set;
calculating the weighted ranks, σ
n(x), of said given point using the nearest neighbor ranks m; and
determining for each given point x the summation s(x) of all the weighted ranks σ
n(x) for said given point.
-
-
3. The method of claim 2, wherein said calculating a neighboring index further includes:
-
determining from the summations s(x) for each measurement point x a minimum summation value Smin;
determining from the summations s(x) for each measurement point x a maximum summation value Smax; and
calculating for each point x the neighboring index I(x) corresponding to
-
-
4. The method of claim 2 wherein the weighted rank of the point is σ
- n(x)=exp(−
m/α
).
- n(x)=exp(−
-
5. The method of claim 1, further comprising:
determining attraction sets that are disjoint sets of measurement points.
-
6. The method of claim 5, wherein the attraction sets are determined by:
-
calculating for each measurement point x in the data set a set of attraction values Attrx(y) where y ranges over the set of the nearest neighbors of x;
calculating the maximum Attrx(y) for each point x;
for those points x having a maximum Attrx(y) greater than zero, determining a directed connection from point x to the point y that maximizes Attrx(y);
using the directed connections to categorize each measurement point x in the set as a kernel point of a mode, slope point of a mode, or boundary point of a mode; and
forming attraction sets from the points having a shared kernel point.
-
-
7. The method of claim 6, wherein the attraction value calculations include finding the products I(y) Vx(y) where Vx(y) is an adherence function.
-
8. The method of claim 7 wherein the adherence function Vx(y)=1 if y is one of the K-nearest-neighbors of x, otherwise Vx(y)=0.
-
9. The method of claim 5, further comprising:
determining one or more optimal cluster numbers by calculating a Kernel Representative Index F(x) for each measurement point in the data set.
-
10. The method of claim 9 wherein the Kernel Representative Index F(x) is determined by determining for each point x in the data set, the nearest neighbor y of x satisfying I(y)>
- I(x); and
calculating F(x)=Ia(x)*Mb(x,y)*Dc(x,y) in which M(x,y) is a rank function that equals m when y is the mth neighbor of x, D(x,y) is the distance between x and y, and a, b, c are predetermined constants.
- I(x); and
-
11. The method of claim 9 wherein the optimal cluster number corresponds to a sharp drop in the Kernel Representative Index F(x).
-
12. The method of claim 9, further comprising:
performing mode merging of the attraction sets to form clusters of measurement points, each cluster defining an electrofacies.
-
13. The method of claim 12, wherein the clusters of measurement points are formed by:
-
for each a pair of attraction sets S1 and S2 from all attraction sets, identifying a pair of points p1 and P2 belonging to S1 and S2, respectively, that satisfy the conditions;
(a) p1 and P2 are boundary points;
(b) p1 is in K-nearest-neighbors of P2 or P2 is in K-nearest-neighbors of p1; and
(c) the distance D(p2,p1) between p1 and P2 is minimum among all pairs of points satisfying conditions (a) and (b).
-
-
14. The method of claim 13, further comprising:
-
calculating L=Min(I(p1), I(p2)) for each pair of points satisfying conditions (a), (b) and (c); and
storing the values (p1, p2) wherein into a list in decreasing order with respect to L.
-
-
15. The method of claim 14, further comprising:
traversing the list in decreasing order while merging sets corresponding to points p1 and P2 if the sets do not both contain previously selected cluster kernels.
-
16. The method of claim 12, further comprising:
correlating the electrofacies with the geological formations traversed by the bore hole.
-
17. The method of claim 16, further comprising:
-
prior to said calculating the neighboring index, selecting the recorded measurements points that are stable over consecutive levels; and
after said correlating, comparing a recorded measurement point not selected in said selecting step to the clusters of selected measurement points to predict the facies of the geological formation at the bore hole positions corresponding to the unselected recorded measurement points.
-
-
18. The method of claim 17, further comprising:
producing a graph of the electrofacies of the geological formation as a function of the depth of the bore hole.
-
19. An apparatus for performing automatic clustering, comprising:
-
a memory unit configured to store log measurement points in d-dimensional space; and
a processing unit configured to retrieve the log measurement points from the memory unit, and configured to calculates a neighboring index I of each log measurement point. - View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31)
determining for each given point x said given point'"'"'s nearest neighbor ranks m relative to said given point'"'"'s N-nearest-neighbors;
calculating the weighted ranks, σ
n(x), of said given point using the nearest neighbor ranks m; and
determining for said given point the summation s(x) of the weighted ranks of said given point.
-
-
21. The apparatus of method of claim 20, wherein the neighboring index corresponding to each log measurement point is calculated by:
-
determining for each given point x said given point'"'"'s nearest neighbor ranks m relative to said given point'"'"'s N-nearest-neighbors;
calculating the weighted ranks, σ
n(x), of said given point using the nearest neighbor ranks m; and
determining for said given point the summation s(x) of the weighted ranks of said given point;
determining the minimum value Smax of s (x) over each log measurement point in the set;
determining the maximum value Smax of s(x) over each log measurement point in the set; and
calculating the neighboring index I(x) for each point in the set so that
-
-
22. The apparatus of claim 19 wherein said processing unit is further configured to determine attraction sets, the attraction sets containing log measurement points.
-
23. The apparatus of claim 22, wherein the attraction sets are determined by:
-
calculating for each log measurement point p in the data set a set of attraction values Attrx(y)where y ranges over the set of the nearest neighbors of x;
calculating the maximum Attrx(y) for each point x;
for those points x having a maximum Attrx(y) greater than zero, determining a directed connection from point x to the point y that maximizes Attrx(y);
using the directed connections to identify those which log measurement points serve as a kernel points; and
for each kernel point, forming an attraction set that includes the given kernel point and all those points having directed connections that lead to the given kernel point.
-
-
24. The apparatus of claim 19, wherein said processing unit is further configured to determine an optimal cluster number by calculating a Kernel Representative Index F(x) for each log measurement point.
-
25. The apparatus of claim 24, wherein the Kernel Representative Index F(x) is determined by:
-
calculating for each point x in the data set the nearest neighbor y of x satisfying I(y)>
I(x); and
calculating F(x)=Ia(x)*Mb(x, y)*DC(x, y) in which M(x,y) is a rank function that equals m when y is the mth neighbor of x, D(x,y) is the distance between x and y, and a, b, c are predetermined constants.
-
-
26. The apparatus of claim 19 wherein the optimal cluster number corresponds to a sharp drop in the Kernel Representative Index F(x).
-
27. The apparatus of claim 19, wherein said processing unit is further configured to perform mode merging of the attraction sets to form clusters of log measurement points, each cluster characterizing an electrofacies.
-
28. The apparatus of claim 27, wherein the clusters of log measurement points are formed by:
-
for each a pair of attraction sets S1 and S2 from all attraction sets, identifying a pair of points p1 and P2 belonging to S1 and S2, respectively, that satisfy the conditions;
(a) p1 and P2 are boundary points;
(b) p1 is in K-nearest-neighbors of P2 or P2 is in K-nearest-neighbors of p1; and
(c) the distance D(p2,p1) between p1 and P2 is minimum among all pairs of points satisfying conditions (a) and (b).
-
-
29. The apparatus of claim 28, wherein the clusters of log measurement points are further formed by:
-
calculating L=Min(I(p1), I(p2)) for each pair of points satisfying conditions (a), (b) and (c); and
storing the values (p1,p2) wherein into a list in decreasing order with respect to L.
-
-
30. The apparatus of claim 29, wherein the clusters of log measurement points are further formed by:
traversing the list in decreasing order while merging sets corresponding to points p1 and P2 if the sets do not both contain a previously selected kernel points.
-
31. The apparatus of claim 27, wherein the processing unit is further configured to correlate the electrofacies with the facies traversed by the bore hole.
-
32. A method of defining the electrofacies of a geological formation traversed by a bore hole comprising:
-
moving a sonde through a plurality of positions in said bore hole and recording a data set having a number d of log measurements taken by the sonde at each of the predetermined levels, wherein the d log measurements at each position are associable with a point in d-dimensional space;
calculating a neighboring index I of each log measurement points; and
determining an optimal cluster number by calculating a Kernel Representative Index F(x) for each log measurement point. - View Dependent Claims (33, 34)
determining for each point x in the data set, the nearest neighbor y of x that satisfies I(y)>
I(x); and
calculating F(x)=Ia(x)*Mb(x, y)*DC(x, y) in which M(x,y) is a rank function that equals m when y is the mth neighbor of x, D(x,y) is the distance between x and y, and a, b, c are predetermined constants.
-
-
34. The method of claim 32 wherein the optimal cluster number corresponds to a sharp drop in Kernel Representative Index F(x).
-
35. An apparatus for performing automatic clustering, comprising:
-
a memory unit configured to store measurement points in d dimensional space;
a processing unit configured to retrieve the measurement points from the memory unit; and
wherein said processing unit calculates a neighboring index I of each measurement point. - View Dependent Claims (36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47)
determining for each given point x said given point'"'"'s nearest neighbor ranks m relative to all other points;
calculating the weighted ranks, σ
n(x), of said given point using the nearest neighbor ranks m; and
determining for said given point the summation s(x) of the weighted ranks for said given point.
-
-
37. The apparatus of method of claim 36, wherein the neighboring index corresponding to each measurement point is further calculated by:
-
determining the minimum value Smin of s(x) over each measurement point in the set;
determining the maximum value Smax of s(x) over each measurement point in the set; and
calculating the neighboring index I(x) for each point in the set by
-
-
38. The apparatus of claim 35, wherein said processing unit is further configured to determine attraction sets that are disjoint sets of measurement points.
-
39. The apparatus of claim 38, wherein the attraction sets are determined by:
-
calculating for each measurement point x in the data set a set of attraction values Attrx(y) where y ranges over the set of the nearest neighbors of x;
calculating the maximum Attrx(y) for each point x;
for those points x having a maximum Attrx(y) greater than zero, directing point x to the point y that maximizes Attrx(y);
using the directions to identify points that serve as a kernel point of a mode; and
for each kernel point, forming an attraction set that includes the given kernel point and all those points directed to the kernel point.
-
-
40. The apparatus of claim 38, wherein the processing unit is further configured to determine one or more optimal cluster numbers by calculating a Kernel Representative Index F(x) for each measurement point in the data set.
-
41. The apparatus of claim 40, wherein the Kernel Representative Index F(x) is determined by:
-
calculating for each point x in the data set, the nearest neighbor y of x satisfying I(y)>
I(x); and
calculating F(x)=Ia(x)*Mb(x,y)*Dc(x,y) in which M(x,y) is a rank function that equals m when y is the mth neighbor of x, D(x,y) is the distance between x and y, and a, b, c are predetermined constants.
-
-
42. The apparatus of claim 40 wherein the optimal cluster numbers correspond to sharp drops in the Kernel Representative Index F(x).
-
43. The apparatus of claim 40, wherein the processing unit is further configured to perform mode merging of the attraction sets to form clusters of measurement points, each cluster characterizing a prototype.
-
44. The apparatus of claim 43, wherein the clusters of measurement points are formed by:
-
for each a pair of attraction sets S1 and S2 from all attraction sets, identifying a pair of points p1 and P2 belonging to S1 and S2, respectively, that satisfy the conditions;
(a) p1 and P2 are boundary points;
(b) p1 is in K-nearest-neighbors of P2 or P2 is in K-nearest-neighbors of p1; and
(c) the distance D(p2,p1) between p1 and P2 is minimum among all pairs of points satisfying conditions (a) and (b).
-
-
45. The apparatus of claim 44, further comprising:
-
calculating L=Min(I(p1), I(p2) for each pair of points satisfying conditions (a), (b) and (c); and
storing the values (p1,p2) wherein into a list in decreasing order with respect to L.
-
-
46. The apparatus of claim 45, further comprising:
traversing the list in decreasing order while merging sets corresponding to points p1 and P2 if the sets do not both contain a previously selected kernel points.
-
47. The apparatus of claim 43, wherein the processing unit is further configured to correlate the prototype with new measurement points.
Specification