DENsity-based CLUstEring (DENCLUE) is a density-based clustering algorithm that is built on density-distribution functions.
Before the DENCLUE algorithm is explained in detail, four concepts need to be introduced: the influence function, the density function, the gradient, and the density attractor.
The influence function of a specific data object describes the impact of that object on its neighborhood. It can be an arbitrary function, but a Gaussian kernel centered at the data point is the usual choice.
The density function at a point, x, is defined as the sum of the influence functions of all the data objects at this data point.
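The two definitions above can be sketched in a few lines of R. This is a minimal illustration rather than the code from the book's bundle: the function names gaussian_influence and density_at, the bandwidth parameter sigma, and the toy data are all assumptions made for this example.

```r
# Gaussian influence of data object y on point x.
# sigma is an assumed bandwidth parameter (not named in the text).
gaussian_influence <- function(x, y, sigma = 1) {
  exp(-sum((x - y)^2) / (2 * sigma^2))
}

# Density at x: the sum of the influences of all data objects (rows of data).
density_at <- function(x, data, sigma = 1) {
  sum(apply(data, 1, gaussian_influence, x = x, sigma = sigma))
}

# Toy data (illustrative only): two tight groups in two dimensions.
pts <- rbind(c(0, 0), c(0.2, 0.1), c(0.1, -0.1),
             c(5, 5), c(5.1, 4.9))

inside  <- density_at(c(0.1, 0), pts)    # evaluated inside the first group
between <- density_at(c(2.5, 2.5), pts)  # evaluated between the two groups
inside > between
```

The density is large where many data objects are close together and falls off quickly between the groups, which is what the local maxima used later rely on.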
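A point x* is defined as a density attractor if it is a local maximum of the density function. It is computed by a hill-climbing procedure that starts at a data point x and moves along the gradient with a step size \delta:

$$x^{0} = x, \qquad x^{i+1} = x^{i} + \delta \, \frac{\nabla f^{D}(x^{i})}{\lVert \nabla f^{D}(x^{i}) \rVert}$$

The gradient of the density function is defined in the following equation, given the density function $f^{D}(x) = \sum_{i=1}^{N} f(x_i, x)$ over the N data objects $x_1, \dots, x_N$ with influence function f:

$$\nabla f^{D}(x) = \sum_{i=1}^{N} (x_i - x) \, f(x_i, x)$$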
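As a rough sketch of how a density attractor might be located, the following R code climbs the density surface from a starting point. It assumes a Gaussian influence function, the gradient form sum((x_i - x) * f(x_i, x)) commonly used for DENCLUE, and normalized steps of an assumed size delta. The names density_value, density_gradient, and find_attractor, and the parameters sigma, delta, and max_iter, are hypothetical.

```r
# Density at x under a Gaussian influence function with assumed bandwidth sigma.
density_value <- function(x, data, sigma = 1) {
  diffs <- sweep(data, 2, x)                   # each row is x_i - x
  sum(exp(-rowSums(diffs^2) / (2 * sigma^2)))
}

# Gradient of the density at x: sum over i of (x_i - x) * f(x_i, x).
density_gradient <- function(x, data, sigma = 1) {
  diffs <- sweep(data, 2, x)
  w <- exp(-rowSums(diffs^2) / (2 * sigma^2))  # Gaussian influences
  colSums(diffs * w)                           # row i is weighted by w[i]
}

# Hill climbing: step along the normalized gradient and stop as soon as
# the density no longer increases (or the gradient vanishes).
find_attractor <- function(x, data, sigma = 1, delta = 0.05, max_iter = 500) {
  dens <- density_value(x, data, sigma)
  for (i in seq_len(max_iter)) {
    g <- density_gradient(x, data, sigma)
    g_norm <- sqrt(sum(g^2))
    if (g_norm < 1e-12) break
    x_next <- x + delta * g / g_norm
    dens_next <- density_value(x_next, data, sigma)
    if (dens_next <= dens) break
    x <- x_next
    dens <- dens_next
  }
  x
}

# Toy data (illustrative only): two tight groups in two dimensions.
pts <- rbind(c(0, 0), c(0.2, 0.1), c(0.1, -0.1),
             c(5, 5), c(5.1, 4.9))
attractor <- find_attractor(c(1, 1), pts)
```

Starting from c(1, 1), the climb should end close to the center of the first group. Because the step size is fixed, the result is only accurate to roughly delta.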
DENCLUE first defines a density function over the data space. It then searches for all the local maxima of this function, which are the density attractors. Each data point is assigned to the local maximum it reaches by climbing the density function from that point, and each group of data points bound to the same local maximum is defined as a cluster. As a postprocessing step, a cluster is discarded if the density at its local maximum is lower than a user-predefined threshold, and clusters are merged if there exists a path between their local maxima such that every point on the path has a density higher than that threshold.
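The steps above can be sketched in R as follows. This is a simplified illustration, not the implementation in ch_06_denclue.R. It assumes a Gaussian influence function and data with at least two columns. In place of the path-based merge, it groups attractors that lie within an assumed distance merge_dist of each other, using single-linkage clustering. The function name denclue_sketch and the parameters sigma, delta, xi, merge_dist, and max_iter are hypothetical.

```r
denclue_sketch <- function(data, sigma = 1, delta = 0.05, xi = 1.5,
                           merge_dist = 4 * delta, max_iter = 500) {
  # Density and gradient under a Gaussian influence function.
  dens <- function(x) sum(exp(-rowSums(sweep(data, 2, x)^2) / (2 * sigma^2)))
  grad <- function(x) {
    d <- sweep(data, 2, x)
    colSums(d * exp(-rowSums(d^2) / (2 * sigma^2)))
  }
  # Climb from x along the normalized gradient until the density stops rising.
  climb <- function(x) {
    for (i in seq_len(max_iter)) {
      g <- grad(x)
      g_norm <- sqrt(sum(g^2))
      if (g_norm < 1e-12) break
      x_next <- x + delta * g / g_norm
      if (dens(x_next) <= dens(x)) break
      x <- x_next
    }
    x
  }
  # One attractor per data point (rows), and the density at each attractor.
  attractors <- t(apply(data, 1, climb))
  att_density <- apply(attractors, 1, dens)
  # Simplified merge (assumption): attractors closer than merge_dist are
  # treated as the same local maximum.
  labels <- cutree(hclust(dist(attractors), method = "single"), h = merge_dist)
  # Points whose attractor density is below the threshold xi are labeled 0.
  labels[att_density < xi] <- 0
  labels
}

# Toy data (illustrative only): two tight groups and one isolated point.
pts <- rbind(c(0, 0), c(0.2, 0.1), c(0.1, -0.1),
             c(5, 5), c(5.1, 4.9),
             c(10, 0))
labels <- denclue_sketch(pts)
```

On this toy data, the first three points and the next two should form two separate clusters, and the isolated point should receive the label 0 because the density at its attractor stays below xi. The results depend strongly on sigma and xi, so both need tuning on real data.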
Please take a look at the R code file ch_06_denclue.R from the bundle of R code for the previously mentioned algorithm. The code can be tested with the following command:
> source("ch_06_denclue.R")
Browser-cache analysis lets a website owner show each visitor the content that best matches their interests; at the same time, it raises privacy-protection concerns for those visitors. The data instances in this context are browser caches, sessions, cookies, various logs, and so on.
The possible factors included in a data instance are the Web address, the IP address (which indicates where the visitor comes from), the time the visitor stayed on a specific page, the pages the visitor viewed, the sequence of the visited pages, the date and time of every visit, and so on. The log can be specific to a certain website or cover various websites. A more detailed description is given in the following table:
The analysis of a visitor is basically history sniffing, which is used for user-behavior analysis.
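To make the shape of such data concrete, the following R sketch builds a small visit log and derives two of the factors mentioned above: the sequence of visited pages and the total time spent per visitor. Everything in it is hypothetical: the column names (ip, page, timestamp, duration_sec) and the records themselves are invented for illustration and use documentation-reserved IP addresses.

```r
# Hypothetical log records; column names and values are illustrative only.
visits <- data.frame(
  ip = c("192.0.2.10", "192.0.2.10", "198.51.100.7", "192.0.2.10"),
  page = c("/home", "/products", "/home", "/checkout"),
  timestamp = as.POSIXct(c("2015-01-05 10:00:00", "2015-01-05 10:02:30",
                           "2015-01-05 10:05:00", "2015-01-05 10:04:00"),
                         tz = "UTC"),
  duration_sec = c(150, 90, 40, 60),
  stringsAsFactors = FALSE
)

# Order each visitor's records by time so the page sequence is meaningful.
visits <- visits[order(visits$ip, visits$timestamp), ]

# Sequence of visited pages per visitor (keyed by IP address).
page_sequences <- tapply(visits$page, visits$ip, paste, collapse = " > ")

# Total time spent on the site per visitor, in seconds.
time_on_site <- tapply(visits$duration_sec, visits$ip, sum)
```

Per-visitor summaries of this kind are a typical starting point for clustering visitors by behavior. In practice, an IP address alone does not reliably identify a single visitor.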