Objective: Build a k-nearest neighbors (KNN) classifier that predicts the customer group of unknown cases in a telecommunications provider's customer base.

Data Description

Segmentation: The customer base is segmented into four groups based on service usage patterns.

Target Variable: The custcat field, which includes four values corresponding to the customer groups:

Basic Service

E-Service

Plus Service

Total Service

import itertools
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import preprocessing
%matplotlib inline

Methodology

Data Preparation:

Import relevant libraries for data manipulation (pandas, numpy) and visualization (matplotlib).

Load the customer data from a CSV file into a pandas DataFrame.

Perform initial data exploration to understand the dataset's structure.
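The loading and first-look steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual cell: the CSV content is an invented in-memory sample, and every column name except custcat is hypothetical.

```python
import io
import pandas as pd

# Hypothetical stand-in for the customer CSV. Only the custcat column name
# comes from the project description; the other columns and all values are invented.
csv_text = """tenure,age,income,custcat
13,44,64,1
11,33,136,4
68,52,116,3
33,33,33,1
23,30,30,3
41,39,78,2
"""

# With a real file, pass its path to pd.read_csv instead of a StringIO object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)    # (rows, columns)
print(df.head())   # first few records
print(df.dtypes)   # data type of each column

# Total number of missing values across the whole table
missing = int(df.isna().sum().sum())
print("Missing values:", missing)
```

Checking the shape, types, and missing values first makes it easier to decide which columns can be used as features without further cleaning.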

Exploratory Data Analysis:

Use statistical summaries and visualizations to explore the customer data and see how customers are distributed across the four segments.
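One possible way to produce such summaries is shown below. It is a sketch on an invented sample: the tenure and income columns and all values are assumptions, and only custcat is taken from the description above.

```python
import pandas as pd

# Invented sample; only the custcat column name comes from the text above.
df = pd.DataFrame({
    "tenure": [13, 11, 68, 33, 23, 41, 45, 9],
    "income": [64, 136, 116, 33, 30, 78, 19, 76],
    "custcat": [1, 4, 3, 1, 3, 2, 2, 3],
})

# Share of customers in each segment, ordered by segment code
segment_share = df["custcat"].value_counts(normalize=True).sort_index()
print(segment_share)

# Summary statistics (count, mean, std, quartiles) for every numeric column
print(df.describe())

# Mean of each feature within each segment
segment_means = df.groupby("custcat").mean()
print(segment_means)
```

A strongly unbalanced segment_share would suggest that plain accuracy may be a misleading metric for the classifier.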

Model Development:

Implement the KNN algorithm to create a predictive model.

Configure the model to identify the nearest neighbors and classify the customers accordingly.
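To make the nearest-neighbor idea concrete, here is a small from-scratch sketch of the classification rule: measure the distance from a query point to every training point, keep the k closest, and take a majority vote. The function name knn_predict and the toy data are hypothetical. scikit-learn's KNeighborsClassifier applies the same rule by default (Euclidean distance, uniform votes), though its internals and tie handling may differ.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those k points
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (invented): two features per customer, two well-separated groups
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = [1, 1, 1, 2, 2, 2]

pred_a = knn_predict(train_points, train_labels, (1.1, 1.0), k=3)
pred_b = knn_predict(train_points, train_labels, (5.0, 5.0), k=3)
print(pred_a, pred_b)
```

Because the vote depends on raw distances, features on very different scales can dominate the result, which is why the features are usually standardized before fitting KNN.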

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# X (feature matrix) and y (custcat labels) are assumed to be defined earlier
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)
print('Train set:', X_train.shape, y_train.shape)
print('Test set:', X_test.shape, y_test.shape)

# Calculate the test accuracy for each k from 1 to 10
accuracies = []
for k in range(1, 11):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    # Compute the accuracy and store it
    accuracies.append(accuracy_score(y_test, y_pred))
print(accuracies)
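Once an accuracy has been recorded for each k, the best-performing k can be read off the list. The sketch below assumes a list with one entry per k from 1 to 10; the numbers are placeholders for illustration, not results from the telecom data.

```python
# Placeholder accuracies for k = 1..10 (invented values, not real results)
accuracies = [0.30, 0.29, 0.32, 0.32, 0.31, 0.31, 0.34, 0.33, 0.34, 0.33]
ks = list(range(1, 11))

# Index of the highest accuracy; max() returns the first one when there is a tie,
# so the smallest k among equally good candidates is chosen
best_index = max(range(len(accuracies)), key=lambda i: accuracies[i])
best_k = ks[best_index]

print("Best k:", best_k, "with accuracy", accuracies[best_index])
```

Selecting k on the same test set that is used to report the final accuracy tends to give an optimistic estimate, so a separate validation split or cross-validation is a common alternative.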