- Unsupervised Learning: apply unsupervised learning techniques to a wholesale dataset. The project has four main parts: exploratory data analysis and pre-processing, KMeans clustering, hierarchical clustering, and PCA.
The dataset for this project is the "Wholesale Data" dataset, which contains information about various products sold by a wholesale grocery store.
The project involves the following tasks:
- Exploratory data analysis and pre-processing
- Unsupervised learning: I used the Wholesale Data dataset to perform k-means clustering, hierarchical clustering, and principal component analysis (PCA) to identify patterns and group similar data points together. I determined the optimal number of clusters and communicated the insights gained through data visualization in the presentation PDF.
- Customers in cluster 2 stand out from other customers in their annual spending across all the products.
- Customers who buy groceries are very likely to also buy detergents and paper products, and there is a good chance that they will purchase milk.
- There are twice as many customers in the Hotel/Restaurant/Cafe channel as in the Retail channel.
- Customers in clusters 1 and 3 spend mostly on "Fresh" and "Frozen" products and very little on other products. The only major difference between the two clusters is their region.
- Customers in cluster 0 are the moderate spenders.
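
The clustering and PCA steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it runs on synthetic data standing in for six spending columns (the names in the comment are assumed), and the log transform, the silhouette score for choosing the number of clusters, and Ward linkage are all assumptions rather than choices confirmed by the project.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for six annual-spending columns (assumed names:
# Fresh, Milk, Grocery, Frozen, Detergents_Paper, Delicassen).
rng = np.random.default_rng(0)
spending = rng.lognormal(mean=8.0, sigma=1.0, size=(200, 6))

# The synthetic data is right-skewed, so log-transform it, then standardize
# each column so no single product dominates the distance computations.
X = StandardScaler().fit_transform(np.log1p(spending))

# Score candidate cluster counts with the silhouette coefficient
# (an assumed selection criterion; the elbow method is another option).
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)

# Final k-means fit with the selected number of clusters.
kmeans_labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)

# Hierarchical (agglomerative) clustering with Ward linkage, for comparison.
hier_labels = AgglomerativeClustering(n_clusters=best_k, linkage="ward").fit_predict(X)

# Project onto the first two principal components for 2-D visualization.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("chosen k:", best_k)
print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
```

With the real dataset, `spending` would be replaced by the product columns loaded from the data file, and `X_2d` could be scatter-plotted with points colored by `kmeans_labels` to inspect how well the clusters separate.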