1
kmeans = KMeans(n_clusters=4)

model = kmeans.fit(europe_july)
pred = model.labels_
europe_july['cluster'] = pred

pca = PCA(n_components=2)
pca_model = pca.fit_transform(europe_july)
data_transform = pd.DataFrame(data = pca_model, columns = ['PCA1', 'PCA2'])
data_transform['Cluster'] = pred

plt.figure(figsize=(8,8))
g = sns.scatterplot(data=data_transform, x='PCA1', y='PCA2',\
                    palette=sns.color_palette()[:4], hue='Cluster')
title = plt.title('World countries clusters with PCA')

PCA

But when I run this code it does not seem to take into account this model.

europe_july['country'] = countries
europe_july['iso_alpha'] = iso_alpha

fig = px.choropleth(data_frame = europe_july, locations= "iso_alpha", scope= 'world', title='2020-11-07 (World)', color= "cluster", hover_name= "country", color_continuous_scale= 'earth', )

fig.show()

Since this is the output that I get, as you can see there is clearly a cluster with only three countries, when there is no such cluster predicted by the model.

This is the output of the predictions for clusters and it matches the visualizations by PCA:

array([2, 3, 1, 0, 2, 0, 0, 3, 1, 3, 3, 3, 1, 3, 3, 1, 1, 3, 3, 1, 2, 0,
       1, 1, 0, 1, 2, 3, 0, 2, 3, 2, 2, 1, 2, 3, 2, 0, 2, 2, 3, 3, 1, 2,
       2, 1, 2, 3, 1, 3, 3, 3, 2, 3, 2, 0, 1, 1, 1, 1, 2, 2, 3, 2, 0, 0,
       2, 3, 3, 0, 2, 2, 3, 3, 2, 0, 3, 0, 2, 3, 1, 0, 2, 2, 1, 2, 1, 3,
       3, 3, 1, 1, 3, 1, 3, 0, 3, 3, 1, 3, 0, 2, 1, 2, 0, 3, 1, 2, 3, 3,
       2, 2, 2, 0, 3, 3, 3, 2, 2, 3, 1, 2, 3, 2, 3, 1, 1, 0, 1, 3, 0, 2,
       2, 1, 2, 1, 3, 0, 3, 0, 2, 2, 0, 3, 1, 1, 2, 3, 2, 1, 3, 1, 3, 3,
       3, 3, 3, 3, 2, 0, 1, 0, 0, 2, 3, 2, 1, 3, 2, 3, 0, 3, 3, 2, 1, 3,
       2, 3, 3, 2, 1, 2, 3, 3, 2, 3, 1, 2, 1, 2, 2, 1, 1, 3, 0, 2, 3, 3,
       3, 3, 3, 3, 2, 0, 2, 1, 0, 2, 2, 2, 1, 0], dtype=int32)

vis

Could someone please guide on why my visualisation is wrong?

Multivac
  • 3,199
  • 2
  • 10
  • 26
vojtak
  • 241
  • 1
  • 2
  • 6

1 Answers1

0

The visualisation is in fact correct, the only issue is that the Plotly map simply does not have those countries marked on its map at all.

vojtak
  • 241
  • 1
  • 2
  • 6