Cluster Analysis and Its Significance to Business
A statistical tool, cluster analysis is used to classify objects into groups such that objects in one group are more similar to each other than to objects in other groups. It is normally used for exploratory data analysis and as a method of discovery for solving classification problems.
In business applications and decision-making, cluster analysis can be a key process for identifying the distinguishing attributes of a large population. Cluster analysis methods help segregate the population into different marketing buckets or groups based on the campaign objective, which can be highly effective for targeted marketing initiatives. This can save much of the time, effort, and money otherwise spent on untargeted guesswork, and it empowers the leadership team either to run separate initiatives for each audience group or to focus on just one.
Cluster Analysis Overview
One of the most popular techniques in data science, clustering is the method of identifying similar groups of data in a dataset. One of the most common uses of clustering is segmenting a customer base by transaction behavior, demographics, or other behavioral attributes.
Types of Clustering
In broad terms, clustering can be divided into two subgroups.
- Hard Clustering
- Soft Clustering
In the case of hard clustering, each data point completely belongs to a cluster, or it doesn’t. However, in soft clustering, instead of each data point being assigned to a cluster, the probability of that data point being in a certain cluster is assigned.
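The difference between the two can be illustrated with a minimal sketch. It assumes two hypothetical cluster centers and, for the soft case, assumes membership probabilities proportional to exp(-squared distance); real soft-clustering methods may define these probabilities differently.

```python
import math

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def hard_assign(point, centroids):
    """Hard clustering: return the index of the single nearest centroid."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances))

def soft_assign(point, centroids):
    """Soft clustering: return one membership probability per centroid."""
    distances = [squared_distance(point, c) for c in centroids]
    # Assumption: weight each cluster by exp(-squared distance).
    # Subtracting the smallest distance first avoids numeric underflow.
    nearest = min(distances)
    weights = [math.exp(-(d - nearest)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical cluster centers and data point, for illustration only.
centroids = [(0.0, 0.0), (2.0, 0.0)]
point = (0.5, 0.0)

hard_label = hard_assign(point, centroids)   # a single cluster index
soft_probs = soft_assign(point, centroids)   # probabilities that sum to 1
print(hard_label, [round(p, 3) for p in soft_probs])
```

The hard assignment commits the point to one cluster, while the soft assignment keeps a degree of membership in both.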
Different Types of Clustering Algorithms and Their Applicability in Real-World Scenarios
Clustering is subjective, and there are multiple means for achieving it. Each methodology has a different set of rules for defining similarity, and there are more than 100 clustering algorithms. There are a few algorithms, though, which are popularly used.
Connectivity Models
Connectivity models are based on the idea that data points closer together in data space are more similar to each other than data points farther apart. These models can follow two approaches. The first starts with every data point in its own cluster and progressively merges the closest clusters. The second starts with all data points in a single cluster and progressively splits it. The choice of distance function is also subjective. Connectivity models are easy to interpret, but they do not scale well to large datasets. Hierarchical clustering algorithms and their variants are examples of connectivity models.
Centroid Model
Centroid models are iterative clustering algorithms in which similarity is derived from the closeness of a data point to the centroid of a cluster; K-Means clustering is a popular example. In centroid models, the number of clusters must be specified at the beginning, which makes prior knowledge of the dataset important. These models run iteratively and converge to a local optimum.
Distribution Model
This type of clustering model is based on the probability that all data points in a cluster belong to the same distribution (for example, a normal, also called Gaussian, distribution). Distribution models often suffer from overfitting. The expectation-maximization algorithm, which uses multivariate normal distributions, is a popular example of a distribution model.
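As a rough illustration of expectation-maximization, the sketch below fits a mixture of two normal distributions to one-dimensional data. It is a simplification, not a production implementation: the data, the initialization at the data extremes, the fixed iteration count, and the variance floor are all assumptions made for this example.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, iterations=50):
    """Fit a 1-D mixture of two normal distributions with EM."""
    # Assumed initialization: start the two means at the data extremes.
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: probability that each point belongs to each component.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate each component from its weighted points.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            # A variance floor keeps a component from collapsing onto one point,
            # which is one way overfitting shows up in distribution models.
            variances[k] = max(var, 1e-6)
            weights[k] = nk / len(data)
    return means, variances, weights, resp

# Hypothetical data: two groups, centered near 1 and near 5.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
means, variances, weights, resp = em_two_gaussians(data)
print([round(m, 2) for m in means])
```

Each row of `resp` holds the soft membership probabilities of one data point, so the result is a soft clustering.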
Density Model
Density models search the data space for regions of differing density, isolate the dense regions, and assign the data points within each region to the same cluster. DBSCAN and OPTICS are popular examples of density models.
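To make the idea concrete, here is a compact sketch in the spirit of DBSCAN. It is a simplified, brute-force version written for illustration; the parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a dense region) and the sample points are assumptions of this example.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    labels = [None] * len(points)      # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE          # may be relabelled later as a border point
            continue
        labels[i] = cluster            # i is a core point: start a new cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)   # j is also a core point: keep expanding
        cluster += 1
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (4, 20)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike centroid models, this approach does not need the number of clusters in advance, and isolated points are labelled as noise rather than forced into a cluster.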
K Means Clustering
An iterative clustering algorithm, K means improves the clustering in each iteration until it converges to a local optimum. It works through the following steps:
- Specify the desired number of clusters K. For example, we’ll choose k=2 for these 5 data points in 2-D space.
- Randomly assign each data point to one of the K clusters.
- Compute the cluster centroids. The centroid of the data points in the grey cluster is shown using a grey cross, and that of the data points in the red cluster using a red cross.
- Reassign each point to the closest cluster centroid. Note that only the data point at the bottom is assigned to the red cluster even though it is closer to the centroid of the grey cluster. Therefore, that data point must be reassigned to the grey cluster.
- Recompute the cluster centroids for both clusters.
- Repeat the previous two steps (reassignment and recomputation) until no further improvement is possible. Unless a maximum number of iterations is specified explicitly, the algorithm terminates when no data points switch between clusters for two successive iterations.
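The procedure above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the five sample points are made up, the initial clusters come from a random assignment of points, and an empty cluster is re-seeded with a random data point (a choice made here, not something the steps prescribe).

```python
import random

def mean(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def compute_centroids(points, labels, k, rng):
    centroids = []
    for c in range(k):
        members = [p for p, label in zip(points, labels) if label == c]
        # Assumption: an empty cluster is re-seeded with a random data point.
        centroids.append(mean(members) if members else rng.choice(points))
    return centroids

def k_means(points, k, seed=0, max_iterations=100):
    rng = random.Random(seed)
    # Start from a random assignment of points to the k clusters.
    labels = [rng.randrange(k) for _ in points]
    for _ in range(max_iterations):
        # Compute (or recompute) the centroid of each cluster.
        centroids = compute_centroids(points, labels, k, rng)
        # Reassign each point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Stop once no point switches cluster.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

# Hypothetical data: 5 points in 2-D space, clustered with k=2.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5)]
labels, centroids = k_means(points, k=2)
print(labels, centroids)
```

Because the starting assignment is random, different seeds can label the clusters differently, and on less well-separated data they can end in different local optima.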
Hierarchical Clustering
As the name suggests, hierarchical clustering is an algorithm that builds a hierarchy of clusters. Starting with every data point assigned to a cluster of its own, the algorithm repeatedly merges the two nearest clusters into one. It terminates only when a single cluster is left.
The results of hierarchical clustering can be shown using a dendrogram, which can be interpreted as follows:
At the bottom, we begin with 25 data points, each assigned to a separate cluster. The two closest clusters are then merged repeatedly until one cluster remains at the top. The height at which two clusters are merged in the dendrogram represents the distance between those two clusters in the data space. The number of clusters that best depicts the different groups can be chosen by observing the dendrogram. The best choice is the number of vertical lines in the dendrogram cut by a horizontal line that can traverse the maximum vertical distance without intersecting a cluster.
In this example, the best number of clusters is four, as the horizontal red line in the dendrogram covers the maximum vertical distance AB.
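The bottom-up merging and the dendrogram rule can be sketched as follows. This is an illustrative sketch, not the procedure behind the example dendrogram: it assumes single linkage (distance between the closest members of two clusters), uses made-up points, and reads the "maximum vertical distance" rule as cutting inside the largest gap between successive merge heights. It needs at least three points.

```python
import math

def single_linkage(cluster_a, cluster_b):
    """Distance between clusters = distance between their closest members."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)

def agglomerate(points):
    """Merge the two nearest clusters until one remains; return merge heights."""
    clusters = [[p] for p in points]      # every point starts in its own cluster
    heights = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        d, i, j = min(
            (single_linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        heights.append(d)                 # height of this merge in the dendrogram
    return heights

def suggest_cluster_count(heights):
    """Cut the dendrogram inside the largest gap between successive merges."""
    gaps = [b - a for a, b in zip(heights, heights[1:])]
    widest = gaps.index(max(gaps))
    n_points = len(heights) + 1
    # The widest gap starts after (widest + 1) merges have happened.
    return n_points - (widest + 1)

# Hypothetical data: three well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0)]
heights = agglomerate(points)
n_clusters = suggest_cluster_count(heights)
print(heights, n_clusters)
```

The brute-force search over all cluster pairs also shows why connectivity models scale poorly: every merge re-examines every pair of clusters.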
There are two important things to know about hierarchical clustering:
- The algorithm has been implemented in the above examples using a bottom-up approach, though it is possible to follow a top-down approach, beginning with all data points assigned to the same cluster and recursively performing splits until each data point is assigned a separate cluster.
- The decision to merge two clusters is made based on the closeness of the clusters:
- Squared Euclidean distance: $\|a-b\|_2^2 = \sum_i (a_i - b_i)^2$
- Mahalanobis distance: $\sqrt{(a-b)^T S^{-1} (a-b)}$, where $S$ is the covariance matrix
- Euclidean distance: $\|a-b\|_2 = \sqrt{\sum_i (a_i - b_i)^2}$
- Manhattan distance: $\|a-b\|_1 = \sum_i |a_i - b_i|$
- Maximum distance: $\|a-b\|_\infty = \max_i |a_i - b_i|$
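The distance measures listed above translate directly into code. In this sketch the function names are chosen for illustration, and the Mahalanobis function assumes the caller supplies the inverse covariance matrix already computed, as a list of rows.

```python
import math

def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(squared_euclidean(a, b))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def maximum(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def mahalanobis(a, b, inv_cov):
    """inv_cov is the inverse covariance matrix, given as a list of rows."""
    diff = [x - y for x, y in zip(a, b)]
    n = len(diff)
    # Quadratic form: diff^T * inv_cov * diff.
    quad = sum(diff[i] * inv_cov[i][j] * diff[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)

# Hypothetical points; with an identity covariance matrix the
# Mahalanobis distance reduces to the Euclidean distance.
a, b = (1.0, 2.0), (4.0, 6.0)
identity = [[1.0, 0.0], [0.0, 1.0]]

print(squared_euclidean(a, b))       # 25.0
print(euclidean(a, b))               # 5.0
print(manhattan(a, b))               # 7.0
print(maximum(a, b))                 # 4.0
print(mahalanobis(a, b, identity))   # 5.0
```

The same pair of points can look closer or farther apart depending on the measure, which is one reason the choice of distance function affects which clusters get merged.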
Applications of Cluster Analysis
There are many applications for cluster analysis across various domains. Some of the popular applications include the following:
- Market segmentation
- Social network analysis
- Recommendation engines
- Anomaly detection
- Medical imaging
- Image segmentation
Scenario 1: Segmentation for Customized Marketing Strategies
A grocer used clustering to segment their 1.3 million loyalty card customers into five different groups based on their buying behavior. Customized marketing strategies were then adopted for each of these segments, to target them more effectively.
One group was called ‘fresh food lovers,’ comprising customers who purchased a high proportion of organic food, fresh vegetables, salads, and the like. A marketing campaign that emphasized the freshness of the grocer’s produce, as well as its year-round availability, appealed to this group.
Another cluster was ‘convenience junkies,’ comprising people who shopped for cooked or semi-cooked, easy-to-prepare meals. The marketing campaign aimed at them focused on the retailer’s in-house line of frozen meals and the speed of the checkout counters.
Using cluster analysis, the grocer was able to deliver the right message to the right customer, maximizing the effectiveness of their marketing.
Scenario 2: Grouping for Single Initiatives
A well-known manufacturer of equipment used in power plants conducted a customer satisfaction survey, with the goal of grouping respondents into segments which could be targeted with unique marketing messages. To create these segments, respondents were grouped according to their attitudes towards the company and their receptiveness to various marketing approaches. The results of the cluster analysis allowed the company to group respondents into four segments:
- Never Again: those who were unlikely to ever do business with the company again. These were not to be targeted with future marketing efforts.
- Hostages: reluctant customers who purchased a product only because it met their specific needs and were now locked into long-term contracts, but were not satisfied overall.
- Leery: customers labeled leery because they shared a discomfort in doing business with the company.
- Acolytes: the company’s core customers, who had been targeted by its traditional marketing messages. This group showed the highest satisfaction and brand loyalty.
Based on the results of the cluster analysis, the company was able to realize that their marketing message was not being favorably received by about a third of their customer base. To address the concerns of their ‘hostage’ segment, they overhauled the messaging for the product line that segment members had in common, de-emphasizing long-term contracts.
For the ‘leery’ segment, the company restructured its sales force and product offers to cater to the needs of smaller companies.
Though the benefits of these reforms will take time to materialize, the company can be confident that through its research, it has better addressed the needs of its customers.
Need Help? – Consult the Cluster Analysis Experts at Research Optimus
Though clustering is easy to implement, some aspects need care, such as treating outliers in your data and making sure each cluster is populated well enough for the inferences drawn from it to be reliable. Clustering has many uses, including planning and strategizing marketing campaigns, identifying test markets for new product development, and applications in biology and medical science such as human genetic clustering.
Research Optimus (ROP), a knowledge process outsourcing service provider, has been assisting businesses in achieving their business objectives. With a pool of experienced analysts, the ROP team can add value to your decision-making process by utilizing the power of statistics in the form of cluster analysis, PESTEL analysis, data modeling, performance analysis, and more.