Classical cluster algorithms work best for:
DBSCAN groups data points based on density, making it effective for identifying clusters of various shapes and sizes.
Clusters are formed by connecting core points and their reachable neighbors, separating high-density regions from low-density areas.
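The idea of connecting core points with their reachable neighbors can be illustrated with a minimal sketch in plain Python. This is not a reference implementation. The names `dbscan`, `region_query`, `eps` (neighborhood radius) and `min_pts`, as well as the toy data, are assumptions made for illustration. The sketch also assumes that a point counts as its own neighbor.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points in low-density areas


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: cluster id 0, 1, ... or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later turn out to be a border point
            continue
        # points[i] is a core point: start a new cluster and grow it
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable, but not core
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is core, keep expanding
    return labels


# Hypothetical toy data: two dense squares and one isolated point
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=4)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two squares end up as separate clusters, while the isolated point is labeled as noise because its neighborhood is too sparse.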
Reachability Distance
Reachability minimum number of points
\(MinPts\)?
\(MinPts\) is selected based on domain knowledge.
If you lack domain knowledge, a rule of thumb is to derive \(MinPts\) from the number of dimensions \(D\) of the data set.
For 2D data, take \(MinPts = 4\).
For larger data sets with much noise, it is suggested to use \(MinPts = 2 \cdot D\).
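As a small worked example, the rule of thumb can be written as a helper function. The name `suggest_min_pts` is hypothetical, and the function only encodes the heuristic stated above. It is a starting value, not a substitute for domain knowledge.

```python
def suggest_min_pts(n_dims):
    """Rule-of-thumb starting value for MinPts: 2 * D.

    n_dims is the number of dimensions D of the data set.
    For 2D data this gives 4, matching the value recommended above.
    """
    if n_dims < 1:
        raise ValueError("n_dims must be at least 1")
    return 2 * n_dims


print(suggest_min_pts(2))  # 4
print(suggest_min_pts(5))  # 10
```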
… knowing what you do helps!
To judge the process, we need repeated measurements FROM the actual process.
… so we need to find the same particles again!
Copyright Prof. Dr. Tim Weber, 2024