The Science of today is the technology of tomorrow ML/AI/Block Chain/IoT/DevOps/Data Science

K-Nearest Neighbors for Machine Learning


KNN

Features of dataset:

sepal_length - sepal length in cm

sepal_width - sepal width in cm 

petal_length - petal length in cm

petal_width - petal width in cm

Class:

1 - Iris Setosa
2 - Iris Versicolour
3 - Iris Virginica

Given these four measurements of a flower, we need to predict its class.
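Before reaching for a library, it may help to see the idea behind KNN in a few lines. The sketch below is only an illustration, not what scikit-learn runs internally: it measures the Euclidean distance from a new flower to every training row, keeps the k closest rows, and returns the most common class among them. The function name knn_predict and the nine training rows are made up for this example and are not taken from data.csv.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=5):
    """Return the majority class among the k training rows nearest to query."""
    # Euclidean distance from the query to every training row
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_X, train_y)
    ]
    # Keep the k closest rows and take a majority vote over their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up measurements (sepal_length, sepal_width, petal_length, petal_width),
# not rows from data.csv
train_X = [
    [5.0, 3.4, 1.5, 0.2], [4.8, 3.1, 1.4, 0.2], [5.2, 3.6, 1.4, 0.3],
    [6.0, 2.8, 4.4, 1.4], [5.8, 2.7, 4.1, 1.3], [6.2, 2.9, 4.6, 1.5],
    [6.8, 3.1, 5.8, 2.2], [6.5, 3.0, 5.6, 2.0], [7.0, 3.2, 6.0, 2.3],
]
train_y = [1, 1, 1, 2, 2, 2, 3, 3, 3]

print(knn_predict(train_X, train_y, [5.1, 3.5, 1.4, 0.2], k=3))
print(knn_predict(train_X, train_y, [6.6, 3.0, 5.7, 2.1], k=3))
```

KNeighborsClassifier follows the same idea with its default settings (5 neighbors, Euclidean distance, every neighbor's vote counted equally), but it is faster on large datasets and may break ties differently from this sketch.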

Import required libraries

    # For handling datasets
    import pandas as pd
    # For plotting graphs
    from matplotlib import pyplot as plt
    # Import the sklearn library for KNN
    from sklearn.neighbors import KNeighborsClassifier

Import dataset

    # Import the csv file
    df = pd.read_csv('data.csv')
    print(df.head())
    '''
    Output:
       sepal_length  sepal_width  petal_length  petal_width  class
    0           5.1          3.5           1.4          0.2      1
    1           4.9          3.0           1.4          0.2      1
    2           4.7          3.2           1.3          0.2      1
    3           4.6          3.1           1.5          0.2      1
    4           5.0          3.6           1.4          0.2      1
    '''

Prepare data for training

    # Prepare the training set: four feature columns and the class label
    X = df.loc[:, 'sepal_length':'petal_width']
    Y = df.loc[:, 'class']

Train the model

    knn = KNeighborsClassifier()
    # Train the model
    knn.fit(X, Y)

Test the model

    # Prepare the test data
    X_test = [[4.9, 7.0, 1.2, 0.2],
              [6.0, 2.9, 4.5, 1.5],
              [6.1, 2.6, 5.6, 1.2]]
    # Test the model (returns the class)
    prediction = knn.predict(X_test)
    print(prediction)
    '''
    Output:
    [1 2 3]
    '''
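Three hand-typed samples show that the model runs, but they do not say how often it is right. One common way to check is to hold back part of the data and score the model on rows it never saw during training. The sketch below is a rough illustration under one loud assumption: it uses the iris data bundled with scikit-learn as a stand-in for data.csv, so the labels are 0, 1, 2 rather than 1, 2, 3. The 30% split and random_state=0 are arbitrary choices for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's bundled iris data stands in for data.csv.
# Its labels are 0, 1, 2 rather than the 1, 2, 3 used in this post.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows, keeping the class proportions equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# score() returns the fraction of held-out rows classified correctly
accuracy = knn.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

With your own data.csv, the same split could be applied to the X and Y prepared earlier instead of load_iris.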
Plotting

    # Plot the relation of each feature with each class
    plt.xlabel('Feature')
    plt.ylabel('Class')
    colors = {'sepal_length': 'blue', 'sepal_width': 'green',
              'petal_length': 'red', 'petal_width': 'black'}
    # One scatter series per feature, all against the class label
    for feature, color in colors.items():
        plt.scatter(df[feature], df['class'], color=color, label=feature)
    plt.legend(loc=4, prop={'size': 5})
    plt.show()
