Customer Segmentation using KNN

Python Machine Learning

K Nearest Neighbor

Objective: To build a KNN classifier that predicts the class (customer segment) of previously unseen cases in a telecommunications provider's customer base.


Data Description


import itertools
import numpy as np
import matplotlib.pyplot as plt
# from matplotlib.ticker import NullFormatter
import pandas as pd
# import matplotlib.ticker as ticker
from sklearn import preprocessing
%matplotlib inline

Methodology

  1. Data Preparation:
    • Import relevant libraries for data manipulation (pandas, numpy) and visualization (matplotlib).
    • Load the customer data from a CSV file into a pandas DataFrame.
    • Perform initial data exploration to understand the dataset's structure.
  2. Exploratory Data Analysis:
    • Use statistical summaries and visualizations to explore the customer data and understand how customers are distributed across the segments.
  3. Model Development:
    • Implement the KNN algorithm to create a predictive model.
    • Configure the model to identify the nearest neighbors and classify the customers accordingly.
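
The neighbor search and voting described in step 3 can be sketched without scikit-learn. This is a minimal illustration, not the project's implementation: the function names `standardize` and `knn_predict`, the two features, and the six toy customers are all assumptions made up for the example. It scales the features using training statistics only, measures Euclidean distance from each new case to every training case, and assigns the majority class among the k closest.

```python
from collections import Counter

import numpy as np


def standardize(train, test):
    """Scale features to zero mean and unit variance using training statistics only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return (train - mean) / std, (test - mean) / std


def knn_predict(X_train, y_train, X_query, k=3):
    """Label each query row by majority vote among its k nearest training rows."""
    predictions = []
    for row in X_query:
        # Euclidean distance from this query row to every training row
        distances = np.sqrt(((X_train - row) ** 2).sum(axis=1))
        # Indices of the k closest training rows, nearest first
        nearest = np.argsort(distances)[:k]
        # Majority vote; on a tie, the label seen first (closest) wins
        votes = Counter(y_train[nearest].tolist())
        predictions.append(votes.most_common(1)[0][0])
    return np.array(predictions)


# Toy data (hypothetical): two made-up features for six customers in two segments.
X_demo = np.array([[25, 20.0], [27, 22.0], [30, 25.0],
                   [55, 90.0], [58, 95.0], [60, 100.0]])
y_demo = np.array([1, 1, 1, 2, 2, 2])
X_new = np.array([[28, 24.0], [57, 92.0]])

X_demo_scaled, X_new_scaled = standardize(X_demo, X_new)
demo_pred = knn_predict(X_demo_scaled, y_demo, X_new_scaled, k=3)
print(demo_pred)
```

Scaling matters here because KNN relies on raw distances: without it, a feature measured in large units would dominate the neighbor search.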

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)
print('Train set:', X_train.shape, y_train.shape)
print('Test set:', X_test.shape, y_test.shape)

# Calculate the test accuracy for different values of k
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

accuracies = []
for k in range(1, 11):
    # Fit a KNN classifier with k neighbors on the training set
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    # Compute the accuracy on the test set and store it
    accuracies.append(accuracy_score(y_test, y_pred))
print(accuracies)
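
Once the accuracies are collected, a natural next step is to pick the k that scored highest. The sketch below is one way to do that. The helper `best_k` is a hypothetical name, and the accuracy values are placeholders invented for illustration, not results from this project.

```python
def best_k(accuracies, k_values):
    """Return (k, accuracy) for the highest accuracy; ties go to the earliest k listed."""
    if len(accuracies) != len(k_values):
        raise ValueError("accuracies and k_values must have the same length")
    # max() returns the first index that reaches the maximum accuracy
    best_index = max(range(len(accuracies)), key=lambda i: accuracies[i])
    return k_values[best_index], accuracies[best_index]


# Placeholder values only (assumed for the example, NOT measured results)
placeholder_accuracies = [0.30, 0.29, 0.32, 0.32, 0.31]
k_found, acc_found = best_k(placeholder_accuracies, list(range(1, 6)))
print("Best k:", k_found, "with accuracy", acc_found)
```

With the loop above, the call would be `best_k(accuracies, list(range(1, 11)))`. Because k is chosen on the same test set that is used to report accuracy, the reported figure tends to be optimistic; a separate validation split or cross-validation gives a fairer estimate.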
