Data Visualization-II (Python)

# Data-visualization-2

In this second part of the data visualization series, almost all of the plots in this repo are drawn with the “Seaborn” library:

>A boxplot is drawn from randomly generated data
>The iris dataset is visualized
>A regplot is drawn
>An lmplot is plotted
>You will learn how to draw a network from data
>You will learn how to plot an area chart
>Lastly, we will plot a FacetGrid

Look at the code in ploting.ipynb

In [1]:

#import all the modules
import matplotlib.pyplot as plt
import numpy as np 
import seaborn as sns
import pandas as pd
In [ ]:
#create a random data frame for the box plot
a=pd.DataFrame({'group': np.repeat('A', 500), 'value': np.random.normal(10,5,500)})
b=pd.DataFrame({'group': np.repeat('B', 500), 'value': np.random.normal(13,1.2,500)})
c=pd.DataFrame({'group': np.repeat('B', 500), 'value': np.random.normal(18,1.2,500)})
d=pd.DataFrame({'group': np.repeat('C', 20), 'value': np.random.normal(25,4,20)})
e=pd.DataFrame({'group': np.repeat('D', 100), 'value': np.random.uniform(12, size=100)})
#DataFrame.append was removed in pandas 2.0; pd.concat builds the same frame
df=pd.concat([a, b, c, d, e])
In [ ]:
sns.boxplot(x='group', y='value', data=df)
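To make the plot easier to read, here is a minimal sketch of the numbers a box plot summarizes: the quartiles, the whisker ends, and the outliers. It assumes the common rule of whiskers reaching at most 1.5 times the interquartile range and linear interpolation for the quartiles. `box_stats` and `sample` are illustrative names, not part of Seaborn, and the sketch uses only the standard library.

```python
import statistics

def box_stats(values, whis=1.5):
    """Return the numbers a box plot draws for one group (illustrative helper)."""
    data = sorted(values)
    # "inclusive" quartiles interpolate linearly between the sorted data points
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    # whiskers stop at the most extreme points that are still inside the fences
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

# hypothetical sample with one obvious outlier
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = box_stats(sample)
print(stats)
```

A summary like this cannot show the shape of the distribution inside the box, which is worth keeping in mind for group B above, since it combines two samples centred on 13 and 18.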
In [2]:
import seaborn as sns
import numpy as np
df=sns.load_dataset('iris')
In [12]:
ax=sns.boxplot(x='species', y='sepal_length', data=df)
#median and number of observations per species, both in groupby (sorted) order
median=df.groupby('species')['sepal_length'].median().values
nobs=df.groupby('species')['sepal_length'].size().values
nobs=["n: " + str(x) for x in nobs]
#write the count just above each median line
for tick in range(len(nobs)):
    ax.text(tick, median[tick]+0.03, nobs[tick], horizontalalignment='center', size='x-small', color='yellow', weight='semibold')
In [15]:
sns.regplot(x=df["sepal_length"], y=df["sepal_width"], fit_reg=False)
Out[15]:
<matplotlib.axes._subplots.AxesSubplot at 0x27303bdbf28>
In [24]:
sns.regplot(x=df["sepal_length"], y=df["sepal_width"], fit_reg=False, marker="+", scatter_kws={"color":"darkred"})
Out[24]:
<matplotlib.axes._subplots.AxesSubplot at 0x27303e24b38>
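Both regplot calls above pass `fit_reg=False`, so only the scatter points are drawn. Leaving `fit_reg` at its default overlays a fitted regression line. The sketch below computes an ordinary least-squares line by hand to illustrate that kind of fit; it is not Seaborn's own code, and `fit_line` and the sample points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # covariance-like and variance-like sums (the 1/n factors cancel in the ratio)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# made-up points that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```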
In [29]:
sns.lmplot(x="sepal_length", y="sepal_width", data=df, fit_reg=False, hue='species', legend=False)
plt.legend(loc="lower right")
Out[29]:
<matplotlib.legend.Legend at 0x27303f23668>
In [30]:
sns.lmplot(x="sepal_length", y="sepal_width", data=df, fit_reg=False, hue='species', legend=False, markers=["o", "x", "1"])
plt.legend(loc="lower right")
Out[30]:
<matplotlib.legend.Legend at 0x27303e340b8>
In [37]:
sns.lmplot(x="sepal_length", y="sepal_width", data=df, fit_reg=False, hue='species', legend=False, palette=dict(setosa="red", virginica='skyblue', versicolor="blue"))
plt.legend(loc="lower right")
Out[37]:
<matplotlib.legend.Legend at 0x27303ff9278>
In [39]:
import networkx as nx
In [40]:
df= pd.DataFrame({ 'from':['A', 'B','C', 'A'], 'to':['D','A','E','C']})
df
Out[40]:
from to
0 A D
1 B A
2 C E
3 A C
In [60]:
G=nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph())
nx.draw(G, with_labels=True, node_size=200, node_color="skyblue", pos=nx.random_layout(G))
plt.show()
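Each row of the `from`/`to` data frame becomes one undirected edge in the graph. As a rough sketch of the structure that results, the same edge list can be turned into an adjacency mapping with plain dictionaries. The names `edges`, `adjacency`, and `degree` are illustrative, and this is not how networkx stores a graph internally.

```python
from collections import defaultdict

# the same four edges as the 'from' and 'to' columns above
edges = [("A", "D"), ("B", "A"), ("C", "E"), ("A", "C")]

# undirected graph: record each edge in both directions
adjacency = defaultdict(set)
for source, target in edges:
    adjacency[source].add(target)
    adjacency[target].add(source)

# degree = number of neighbours of each node
degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
print(degree)
```

Node A has the most connections, so it tends to sit at the centre of the drawing when a layout other than `random_layout` is used.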
In [3]:
x = range(1,6)
y = [1,4,6,8,4]
In [5]:
plt.fill_between(x, y, color = "green")
plt.title("area chart")
plt.xlabel("xval")
plt.ylabel('y_label')
Out[5]:
Text(0, 0.5, 'y_label')
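`plt.fill_between(x, y)` shades the region between the line through the points and the baseline y = 0. As a small worked example, the size of that shaded region can be computed with the trapezoid rule; the variable `area` is an illustrative name.

```python
# same data as the area chart above
x = list(range(1, 6))
y = [1, 4, 6, 8, 4]

# trapezoid rule: width of each interval times the average of its two heights
area = sum(
    (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2
    for i in range(len(x) - 1)
)
print(area)  # 20.5
```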
In [15]:
country = ['india', 'usa', 'canada', 'russia', 'brazil']
In [16]:
df = pd.DataFrame({"Country": np.repeat(country, 10), 'years': range(2000, 2050), 'value' : np.random.rand(50)})
In [17]:
df
Out[17]:
Country years value
0 india 2000 0.575791
1 india 2001 0.685109
2 india 2002 0.702861
3 india 2003 0.100856
4 india 2004 0.296796
5 india 2005 0.638772
6 india 2006 0.621124
7 india 2007 0.754949
8 india 2008 0.126422
9 india 2009 0.908114
10 usa 2010 0.755647
11 usa 2011 0.308911
12 usa 2012 0.004106
13 usa 2013 0.971685
14 usa 2014 0.585563
15 usa 2015 0.975938
16 usa 2016 0.350514
17 usa 2017 0.830893
18 usa 2018 0.699058
19 usa 2019 0.058461
20 canada 2020 0.893327
21 canada 2021 0.416444
22 canada 2022 0.903531
23 canada 2023 0.951550
24 canada 2024 0.134168
25 canada 2025 0.297796
26 canada 2026 0.378167
27 canada 2027 0.290011
28 canada 2028 0.935085
29 canada 2029 0.236448
30 russia 2030 0.773072
31 russia 2031 0.446389
32 russia 2032 0.362055
33 russia 2033 0.836870
34 russia 2034 0.424735
35 russia 2035 0.054878
36 russia 2036 0.682037
37 russia 2037 0.756833
38 russia 2038 0.505820
39 russia 2039 0.748621
40 brazil 2040 0.916116
41 brazil 2041 0.620386
42 brazil 2042 0.181319
43 brazil 2043 0.442546
44 brazil 2044 0.467849
45 brazil 2045 0.372263
46 brazil 2046 0.707299
47 brazil 2047 0.861410
48 brazil 2048 0.590995
49 brazil 2049 0.515779
In [22]:
g = sns.FacetGrid(df, col = "Country", hue = "Country", col_wrap = 4)
g = g.map(plt.plot, 'years','value')
g = g.map(plt.fill_between, 'years', 'value')
In [23]:
g
Out[23]:
<seaborn.axisgrid.FacetGrid at 0x24d44a28d30>
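A FacetGrid splits the data by the `col` variable and draws one small plot per value, starting a new row after `col_wrap` plots. The sketch below reproduces only that bookkeeping in plain Python, assuming the facets are ordered by first appearance of each value; `facet_layout` is a hypothetical helper, not a Seaborn function.

```python
import math
from collections import defaultdict

def facet_layout(rows, col, col_wrap):
    """Group rows by the facet variable and assign each group a grid cell."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    n_rows = math.ceil(len(groups) / col_wrap)
    # (grid row, grid column) for each facet, in order of first appearance
    positions = {level: divmod(i, col_wrap) for i, level in enumerate(groups)}
    return dict(groups), n_rows, positions

# same shape as the data frame above: 5 countries with 10 years each
country = ['india', 'usa', 'canada', 'russia', 'brazil']
rows = [
    {"Country": c, "years": 2000 + 10 * i + j}
    for i, c in enumerate(country)
    for j in range(10)
]

groups, n_rows, positions = facet_layout(rows, "Country", col_wrap=4)
print(n_rows, positions)
```

With five countries and `col_wrap = 4`, the fifth facet wraps onto a second row.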

Also check: https://github.com/theone9807/Data-visualization-2
