# New Technology

Trending Technology: Machine Learning, Artificial Intelligence, Blockchain, IoT, DevOps, Data Science

## Sunday, 3 March 2019

Features of the dataset:

• eruptions - eruption time in minutes
• waiting - waiting time until the next eruption in minutes

Goal: given the eruption data, group the eruptions into clusters and assign a particular eruption to one of them.

## Import required libraries

``````
# For mathematical calculation
import numpy as np

# For handling datasets
import pandas as pd

# For plotting graphs
from matplotlib import pyplot as plt

# Import the sklearn library for KMeans Clustering
from sklearn.cluster import KMeans
``````

## Import dataset

``````
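# Import the csv file
# Placeholder path: the original file path is not given in this post
csv_path = 'path/to/dataset.csv'
df = pd.read_csv(csv_path)

# Show the first five rows
print(df.head())

'''
Output:
   eruptions  waiting
0      3.600       79
1      1.800       54
2      3.333       74
3      2.283       62
4      4.533       85
'''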
``````

## Train the model

``````
# Assign the number of clusters
k = 2

kmeans = KMeans(n_clusters=k)

# Train the model
kmeans = kmeans.fit(df)

# array that contains cluster number
labels = kmeans.labels_

# array of size k with co-ordinates of
# centroids
centroids = kmeans.cluster_centers_
``````
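To see what the fitted model holds, it can help to count the points in each cluster and print the centroids with their column names. The snippet below is only a small sketch: it uses the five rows from the output above as stand-in data instead of the full csv file, and `n_init=10` and `random_state=0` are additions of mine to make the run repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Stand-in data: only the five rows shown in the head() output above
df = pd.DataFrame({
    "eruptions": [3.600, 1.800, 3.333, 2.283, 4.533],
    "waiting": [79, 54, 74, 62, 85],
})

# n_init and random_state are assumptions added for reproducibility
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(df)

# Number of points assigned to each cluster
cluster_sizes = np.bincount(kmeans.labels_, minlength=2)

# Centroids as a table with the original column names
centroid_table = pd.DataFrame(kmeans.cluster_centers_, columns=df.columns)

print(cluster_sizes)
print(centroid_table)
```

The cluster numbers themselves are arbitrary, so which group ends up as 0 and which as 1 can differ between runs and settings.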

## Test the model

``````
# Prepare the test data
x_test = [[4.671,67],[2.885,61],[1.666,90],
          [5.623,54],[2.678,80],[1.875,60]]

# Test the model (returns the cluster number for each point)
prediction = kmeans.predict(x_test)

print(prediction)
'''
Output:
[0 0 1 0 1 0]

As the value of k is 2,
there are only two clusters, 0 and 1.
'''
``````
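The cluster number returned by `predict` is the index of the centroid closest to each test point (Euclidean distance). The sketch below checks this by computing the distances by hand with NumPy. It again uses the five rows shown earlier as stand-in training data, and `n_init` and `random_state` are my own additions, so the labels it prints need not match the output above.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in training data: the five rows shown in the head() output
X = np.array([[3.600, 79], [1.800, 54], [3.333, 74],
              [2.283, 62], [4.533, 85]])

# n_init and random_state are assumptions added for reproducibility
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Test points from the post
x_test = np.array([[4.671, 67], [2.885, 61], [1.666, 90],
                   [5.623, 54], [2.678, 80], [1.875, 60]])

# Distance from every test point to every centroid, shape (6, 2)
distances = np.linalg.norm(
    x_test[:, None, :] - kmeans.cluster_centers_[None, :, :], axis=2)

# Index of the nearest centroid for each test point
manual = distances.argmin(axis=1)

# True when the hand-computed labels agree with kmeans.predict
matches = np.array_equal(manual, kmeans.predict(x_test))
print(matches)
```

Since the waiting values are numerically much larger than the eruptions values, the waiting column contributes most of the distance here.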

## Plot the clusters

``````
# Plot each point in the color of its cluster
colors = ['blue','red','green','black']
y = 0
for x in labels:
    # plot the points according to their clusters
    # and assign different colors
    plt.scatter(df.iloc[y,0], df.iloc[y,1],
                color=colors[x])
    y+=1

for x in range(k):
    # plot the centroids
    lines = plt.plot(centroids[x,0],
                     centroids[x,1],'kx')
    # make the centroid markers larger
    plt.setp(lines,ms=15.0)
    plt.setp(lines,mew=2.0)

title = ('No of clusters (k) = {}').format(k)
plt.title(title)
plt.xlabel('eruptions (mins)')
plt.ylabel('waiting (mins)')
plt.show()
``````
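The loop above calls `plt.scatter` once per point. As an alternative sketch (not the method used in this post), all points can be drawn in a single call by passing the labels as the color argument. The stand-in data, `n_init`, `random_state` and the `coolwarm` colormap are my own choices for illustration.

```python
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.cluster import KMeans

# Stand-in data: the five rows shown in the head() output
df = pd.DataFrame({
    "eruptions": [3.600, 1.800, 3.333, 2.283, 4.533],
    "waiting": [79, 54, 74, 62, 85],
})

# n_init and random_state are assumptions added for reproducibility
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(df)

fig, ax = plt.subplots()

# One call draws every point; c=labels picks a color per cluster
ax.scatter(df["eruptions"], df["waiting"],
           c=kmeans.labels_, cmap="coolwarm")

# Centroids as large black crosses
ax.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
           marker="x", color="black", s=200, linewidths=2)

ax.set_title("No of clusters (k) = {}".format(kmeans.n_clusters))
ax.set_xlabel("eruptions (mins)")
ax.set_ylabel("waiting (mins)")
```

Call `plt.show()` afterwards to display the figure.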
