Data Visualization-II (Python)

theone9807 | Python Tutorial | January 27, 2019 (January 28, 2019) | 3 Minutes

# Data-visualization-2

In this second part of the data visualization series, almost all of the plots in this repo are drawn using the “Seaborn” library:

>A box plot is drawn from randomly generated data
>The iris data set is visualized
>A regplot is drawn
>An lmplot is plotted
>You will learn how to draw a network graph from data
>You will learn how to plot an area chart
>Lastly, we will learn to plot a FacetGrid

Look at the code in ploting.ipynb

In [1]:

#import all the modules
import matplotlib.pyplot as plt
import numpy as np 
import seaborn as sns
import pandas as pd

In [ ]:

#creating a random data frame for box plot
a=pd.DataFrame({'group': np.repeat('A', 500), 'value': np.random.normal(10,5,500)})
b=pd.DataFrame({'group': np.repeat('B', 500), 'value': np.random.normal(13,1.2,500)})
c=pd.DataFrame({'group': np.repeat('B', 500), 'value': np.random.normal(18,1.2,500)})
d=pd.DataFrame({'group': np.repeat('C', 20), 'value': np.random.normal(25,4,20)})
e=pd.DataFrame({'group': np.repeat('D', 100), 'value': np.random.uniform(12, size=100)})
#DataFrame.append was removed in pandas 2.0, so combine the frames with pd.concat
df=pd.concat([a, b, c, d, e])

In [ ]:

sns.boxplot(x='group', y='value', data=df)
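
Before reading the box plot, it can help to look at the numbers behind it. The sketch below is not part of the original notebook: it rebuilds a similar data frame with a seeded generator (the seed and the name `df_box` are assumptions made for reproducibility) and prints the count and median per group. Note that `np.random.uniform(12, size=100)` passes 12 as `low` with the default `high=1.0`, so the sketch spells out the bounds 1 and 12 explicitly.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption, used only so the numbers are reproducible.
rng = np.random.default_rng(0)

# Same groups and sizes as the notebook cell above; group 'B' is built from
# two normal samples with different means, so it is bimodal.
frames = [
    pd.DataFrame({"group": "A", "value": rng.normal(10, 5, 500)}),
    pd.DataFrame({"group": "B", "value": rng.normal(13, 1.2, 500)}),
    pd.DataFrame({"group": "B", "value": rng.normal(18, 1.2, 500)}),
    pd.DataFrame({"group": "C", "value": rng.normal(25, 4, 20)}),
    pd.DataFrame({"group": "D", "value": rng.uniform(1, 12, 100)}),
]
df_box = pd.concat(frames, ignore_index=True)

# Count and median per group: the median is the line inside each box.
summary = df_box.groupby("group")["value"].agg(["count", "median"])
print(summary)
```

The counts show that the groups differ a lot in size (20 observations for C, 1000 for B), which a plain box plot does not reveal on its own.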

In [2]:

import seaborn as sns
import numpy as np
df=sns.load_dataset('iris')

In [12]:

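#box plot of sepal length per species, annotated with the number of observations
ax=sns.boxplot(x='species', y='sepal_length', data=df)
median=df.groupby(['species'])['sepal_length'].median().values
#count per species, taken from the same groupby so it lines up with the medians
nobs=df.groupby(['species'])['sepal_length'].count().values
nobs=["n: " + str(x) for x in nobs.tolist()]
pos=range(len(nobs))
#write each label just above the median line of its box
for tick, label in zip(pos, ax.get_xticklabels()):
    ax.text(pos[tick], median[tick]+0.03,nobs[tick],horizontalalignment='center', size='x-small', color='yellow', weight='semibold')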
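
The annotation above needs a median (where to place the text) and a count (what to write) for every box. A minimal sketch of that bookkeeping, assuming a hypothetical helper `count_labels` and a small made-up data frame instead of the iris data, could look like this:

```python
import pandas as pd

def count_labels(frame, group_col, value_col):
    """Return one row per group with its median, count and an 'n: <count>' label.

    Hypothetical helper for illustration; it is not part of seaborn or pandas.
    """
    stats = frame.groupby(group_col)[value_col].agg(["median", "count"])
    stats["label"] = "n: " + stats["count"].astype(str)
    return stats

# Tiny made-up data set so the result is easy to check by hand.
toy = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "sepal_length": [1.0, 3.0, 2.0, 4.0, 6.0],
})
stats = count_labels(toy, "species", "sepal_length")
print(stats)
```

Computing the median and the count in one `groupby` keeps both in the same group order, so each label ends up on the box it belongs to.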

In [15]:

sns.regplot(x=df["sepal_length"], y=df["sepal_width"], fit_reg=False)

Out[15]:

<matplotlib.axes._subplots.AxesSubplot at 0x27303bdbf28>

In [24]:

sns.regplot(x=df["sepal_length"], y=df["sepal_width"], fit_reg=False, marker="+", scatter_kws={"color":"darkred"})

Out[24]:

<matplotlib.axes._subplots.AxesSubplot at 0x27303e24b38>

In [29]:

sns.lmplot(x="sepal_length", y="sepal_width", data=df, fit_reg=False, hue='species', legend=False)
plt.legend(loc="lower right")

Out[29]:

<matplotlib.legend.Legend at 0x27303f23668>

In [30]:

sns.lmplot(x="sepal_length", y="sepal_width", data=df, fit_reg=False, hue='species', legend=False, markers=["o", "x", "1"])
plt.legend(loc="lower right")

Out[30]:

<matplotlib.legend.Legend at 0x27303e340b8>

In [37]:

sns.lmplot(x="sepal_length", y="sepal_width", data=df, fit_reg=False, hue='species', legend=False, palette=dict(setosa="red", virginica='skyblue', versicolor="blue"))
plt.legend(loc="lower right")

Out[37]:

<matplotlib.legend.Legend at 0x27303ff9278>

In [39]:

import networkx as nx

In [40]:

df= pd.DataFrame({ 'from':['A', 'B','C', 'A'], 'to':['D','A','E','C']})
df

Out[40]:

	from	to
0	A	D
1	B	A
2	C	E
3	A	C

In [60]:

G=nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph())
nx.draw(G, with_labels=True, node_size=200, node_color="skyblue", pos=nx.random_layout(G))
plt.show()
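
Because `nx.random_layout` places the nodes at random, the picture changes on every run. The sketch below is an optional variation, not the notebook's own code: it fixes the layout with a seed (the value 1 is arbitrary) and scales each node by its degree. The names `edges`, `degrees` and `node_sizes` are introduced here for illustration.

```python
import networkx as nx
import pandas as pd

# Same edge list as in the cell above.
edges = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "E", "C"]})
G = nx.from_pandas_edgelist(edges, "from", "to")

# Degree = number of edges touching each node.
degrees = dict(G.degree())

# One size per node, in the order of G.nodes(), proportional to the degree.
node_sizes = [200 * degrees[n] for n in G.nodes()]

# A seeded random layout gives the same positions on every run.
pos = nx.random_layout(G, seed=1)

print(degrees)
```

Passing `pos=pos` and `node_size=node_sizes` to `nx.draw` would then draw the best-connected node, A, largest.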

In [3]:

x = range(1,6)
y = [1,4,6,8,4]

In [5]:

plt.fill_between(x, y, color = "green")
plt.title("area chart")
plt.xlabel("xval")
plt.ylabel('y_label')

Out[5]:

Text(0, 0.5, 'y_label')
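
A common variation on the area chart is to make the fill semi-transparent and draw the line itself on top. The sketch below shows one way to do that with the object-oriented Matplotlib interface; the `alpha` value and the `Agg` backend line are assumptions (the backend line only lets the code run without a display and can be dropped inside a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, not needed in a notebook
import matplotlib.pyplot as plt

x = list(range(1, 6))
y = [1, 4, 6, 8, 4]

fig, ax = plt.subplots()
# Fill the area between the curve and y=0, slightly transparent.
ax.fill_between(x, y, color="green", alpha=0.4)
# Draw the curve itself on top of the filled area.
ax.plot(x, y, color="green")
ax.set_title("area chart")
ax.set_xlabel("xval")
ax.set_ylabel("y_label")
```

The same pairing of a line plot and `fill_between` is what the FacetGrid example at the end of this post maps onto every panel.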

In [15]:

country = ['india', 'usa', 'canada', 'russia', 'brazil']

In [16]:

df = pd.DataFrame({"Country": np.repeat(country, 10), 'years': range(2000, 2050), 'value' : np.random.rand(50)})

In [17]:

df

Out[17]:

	Country	years	value
0	india	2000	0.575791
1	india	2001	0.685109
2	india	2002	0.702861
3	india	2003	0.100856
4	india	2004	0.296796
5	india	2005	0.638772
6	india	2006	0.621124
7	india	2007	0.754949
8	india	2008	0.126422
9	india	2009	0.908114
10	usa	2010	0.755647
11	usa	2011	0.308911
12	usa	2012	0.004106
13	usa	2013	0.971685
14	usa	2014	0.585563
15	usa	2015	0.975938
16	usa	2016	0.350514
17	usa	2017	0.830893
18	usa	2018	0.699058
19	usa	2019	0.058461
20	canada	2020	0.893327
21	canada	2021	0.416444
22	canada	2022	0.903531
23	canada	2023	0.951550
24	canada	2024	0.134168
25	canada	2025	0.297796
26	canada	2026	0.378167
27	canada	2027	0.290011
28	canada	2028	0.935085
29	canada	2029	0.236448
30	russia	2030	0.773072
31	russia	2031	0.446389
32	russia	2032	0.362055
33	russia	2033	0.836870
34	russia	2034	0.424735
35	russia	2035	0.054878
36	russia	2036	0.682037
37	russia	2037	0.756833
38	russia	2038	0.505820
39	russia	2039	0.748621
40	brazil	2040	0.916116
41	brazil	2041	0.620386
42	brazil	2042	0.181319
43	brazil	2043	0.442546
44	brazil	2044	0.467849
45	brazil	2045	0.372263
46	brazil	2046	0.707299
47	brazil	2047	0.861410
48	brazil	2048	0.590995
49	brazil	2049	0.515779
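
In the table above every country covers a different decade, because `range(2000, 2050)` runs once across all 50 rows. If the intent was for every country to share the same years (that is an assumption, the original may be deliberate), the years can be repeated per country with `np.tile`. A minimal sketch, with `df_shared` as a hypothetical name:

```python
import numpy as np
import pandas as pd

country = ["india", "usa", "canada", "russia", "brazil"]
years = np.arange(2000, 2010)  # ten years shared by every country

df_shared = pd.DataFrame({
    # np.repeat: india x10, usa x10, ...
    "Country": np.repeat(country, len(years)),
    # np.tile: 2000..2009, 2000..2009, ... once per country
    "years": np.tile(years, len(country)),
    "value": np.random.rand(len(country) * len(years)),
})

print(df_shared.groupby("Country")["years"].agg(["min", "max"]))
```

With shared years, panels that use a common x axis can be compared directly.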

In [22]:

g = sns.FacetGrid(df, col = "Country", hue = "Country", col_wrap = 4)
g = g.map(plt.plot, 'years','value')
g = g.map(plt.fill_between, 'years', 'value')

In [23]:

Out[23]:

<seaborn.axisgrid.FacetGrid at 0x24d44a28d30>

Also check: https://github.com/theone9807/Data-visualization-2
