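seaborn Study Notes (3): Histograms, Bar Plots, and Strip Plots
1 Histograms and bar plots
For a long time I could not tell histograms and bar plots apart, or rather I assumed they were the same thing, until I came across the two plotting functions histplot() and barplot(): the former draws a histogram, the latter a bar plot. After comparing them closely, my conclusion is that the two are similar but differ in what they summarize. A histogram shows how data is distributed along a numeric axis; the data is usually continuous and is grouped into intervals by value. For example, given the heights of 100 students, a histogram counts how many students fall into each height range. A bar plot instead groups by discrete or categorical data, for example comparing the average height of boys with the average height of girls. I am not sure this understanding is entirely right; comments are welcome.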
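To make the distinction concrete, here is a minimal NumPy sketch. The heights and the male/female split are made-up data, not taken from any dataset used in these notes; the point is only that a histogram bins a continuous variable, while a bar plot summarizes one value per category.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: heights (cm) of 100 students, 50 male and 50 female.
sex = np.array(["male"] * 50 + ["female"] * 50)
height = np.where(sex == "male",
                  rng.normal(175, 6, 100),
                  rng.normal(163, 6, 100))

# Histogram: cut the continuous height axis into intervals and count
# how many students fall into each interval.
counts, edges = np.histogram(height, bins=8)

# Bar plot: one bar per category, whose height is a summary statistic
# of that category (here the mean height).
group_means = {s: float(height[sex == s].mean()) for s in ("male", "female")}

print(counts, edges.round(1))
print(group_means)
```

The histogram needs only the numeric column, whereas the bar plot needs a categorical column to group by.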
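2 histplot(): histograms
The main parameters are:
data: the data to plot; can be a pandas.DataFrame, a numpy.ndarray, a dict, etc.
x, y: the data for the x axis and y axis; each can be a vector or a string, and a string must be a key in data
hue: a vector (a pandas column or a list) or a string (a key in data); seaborn assigns colors according to this column
weights: weights applied to the data points
stat: the statistic computed for each bar
1) count: the number of observations in each bin
2) frequency: the number of observations in each bin divided by the bin width
3) probability or proportion: normalized so that the bar heights sum to 1
4) percent: normalized so that the bar heights sum to 100
5) density: normalized so that the total area of the bars is 1
bins: a string, an integer, or a vector, giving the name of a binning rule, the number of bins, or the bin edges; the rule names appear in the examples below
binwidth: the width of each bin
binrange: the lowest and highest values for the bin edges
discrete: if True, binwidth defaults to 1 and each bar is centered on its data point, which avoids the "gaps" that can otherwise appear with discrete (integer) data
cumulative: boolean; whether each bar shows the running total of the bar heights up to that bin
multiple: how the bars of different hue levels are arranged ("layer", "dodge", "stack", or "fill"); see the examples below
element: how the histogram is drawn ("bars", "step", or "poly"); see the examples below
fill: whether to fill the inside of the bars with color
shrink: scales down the width of the bars
kde: whether to draw a kernel density estimate curve
color: the bar color
legend: whether to show the legend
ax: the matplotlib axes to draw on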
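The five stat options listed above differ only in how the raw bin counts are normalized. The following is a minimal NumPy sketch of those normalizations on a small made-up array; it illustrates the definitions and is not seaborn's own code.

```python
import numpy as np

# Made-up data spanning 0..10, so 5 equal bins each have width 2.
data = np.array([0, 1, 2.5, 3, 3.5, 5, 7, 10])
counts, edges = np.histogram(data, bins=5)
widths = np.diff(edges)
n = counts.sum()

stats = {
    "count": counts,                      # raw number of observations per bin
    "frequency": counts / widths,         # count divided by bin width
    "probability": counts / n,            # heights sum to 1
    "percent": counts / n * 100,          # heights sum to 100
    "density": counts / (n * widths),     # total bar area equals 1
}

for name, values in stats.items():
    print(f"{name:12s}", values)
```

Because every option is a rescaling of the same counts, the shape of the histogram is identical in all five cases when the bins have equal width; only the y-axis scale changes.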
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
penguins = sns.load_dataset('penguins', data_home='.')
In [3]:
penguins.head(2)
2.1 Setting a title
histplot() returns a matplotlib.axes._subplots.AxesSubplot instance (a matplotlib axes object); calling its set_title() method adds a title to the plot:
In [4]:
pic = sns.histplot(penguins, x="flipper_length_mm")
pic.set_title('flipper_length_mm')
2.2 ax: drawing on custom axes
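When ax is not passed, seaborn picks the axes itself (the current axes, creating one if none exists). We can also create the axes ourselves and pass them through ax, which makes it easy to lay out several subplots:
In [5]:
fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(8, 3))
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[0])
pic = sns.histplot(penguins, x="body_mass_g", ax=ax[1])
In addition, histplot() offers few parameters for configuring the axes, so customizing them through histplot() alone is difficult. If the axes are created before plotting, they can be configured directly (for how to create and configure axes, see …):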
In [6]:
ax = plt.axes((0.1, 0.1, 0.8, 0.7), facecolor='green')
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax)
2.3 x, y: passing data and controlling orientation
Out[3]:
  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
0  Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
1  Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
Out[4]:
Text(0.5, 1.0, 'flipper_length_mm')
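The choice between x and y controls the orientation of the bars: when only x is passed, the x axis shows the data values and the y axis shows the counts; when only y is passed, the y axis shows the data values and the x axis shows the counts. Passing both at once does not give bars; histplot() then draws a bivariate histogram instead.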
In [7]:
fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(8, 3))
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[0])
pic.set_title('x')
pic = sns.histplot(penguins, y="flipper_length_mm", ax=ax[1])
pic.set_title('y')
2.4 stat: what the y axis shows
In [8]:
fig, ax = plt.subplots(1, 5, constrained_layout=True, figsize=(15, 3))
_ = sns.histplot(penguins, x="flipper_length_mm", stat="count", ax=ax[0])  # count (the default)
_ = sns.histplot(penguins, x="flipper_length_mm", stat="frequency", ax=ax[1])  # frequency
_ = sns.histplot(penguins, x="flipper_length_mm", stat="probability", ax=ax[2])  # probability
_ = sns.histplot(penguins, x="flipper_length_mm", stat="percent", ax=ax[3])  # percent
_ = sns.histplot(penguins, x="flipper_length_mm", stat="density", ax=ax[4])  # density
2.5 bins: specifying how to bin
Plot with a given number of bins or with explicit bin edges:
In [9]:
fig, ax = plt.subplots(1, 3, constrained_layout=True, figsize=(12, 3))
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[0], bins=5)
pic.set_title('bins=5')
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[1], bins=10)
pic.set_title('bins=10')
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[2], bins=[150, 175, 200, 225, 250])
pic.set_title('bins=[150, 175, 200, 225, 250]')
A binning rule can also be given by name:
In [10]:
fig, ax = plt.subplots(2, 4, constrained_layout=True, figsize=(15, 6))
pic = sns.histplot(penguins, x="flipper_length_mm", bins="auto", ax=ax[0][0])
pic.set_title('auto')
pic = sns.histplot(penguins, x="flipper_length_mm", bins="fd", ax=ax[0][1])
pic.set_title('fd')
pic = sns.histplot(penguins, x="flipper_length_mm", bins="doane", ax=ax[0][2])
pic.set_title('doane')
pic = sns.histplot(penguins, x="flipper_length_mm", bins="scott", ax=ax[0][3])
pic.set_title('scott')
pic = sns.histplot(penguins, x="flipper_length_mm", bins="stone", ax=ax[1][0])
pic.set_title('stone')
pic = sns.histplot(penguins, x="flipper_length_mm", bins="rice", ax=ax[1][1])
pic.set_title('rice')
pic = sns.histplot(penguins, x="flipper_length_mm", bins="sturges", ax=ax[1][2])
pic.set_title('sturges')
pic = sns.histplot(penguins, x="flipper_length_mm", bins="sqrt", ax=ax[1][3])
pic.set_title('sqrt')
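As far as I can tell from the seaborn documentation, a string passed as bins is handed to numpy.histogram_bin_edges, so the rule names used in the cell above can be tried directly in NumPy. The sketch below counts how many bins each rule produces; it uses synthetic data (342 normally distributed values) as a stand-in, so the numbers will not match the penguins plots.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for a numeric column; not the penguins data.
x = rng.normal(200, 14, 342)

rules = ["auto", "fd", "doane", "scott", "stone", "rice", "sturges", "sqrt"]

# histogram_bin_edges returns the edges, so the bin count is len(edges) - 1.
n_bins = {rule: len(np.histogram_bin_edges(x, bins=rule)) - 1 for rule in rules}

for rule, n in n_bins.items():
    print(f"{rule:8s} {n}")
```

Some rules depend only on the sample size (sturges, rice, sqrt), while others also depend on the spread of the data (fd, scott, doane, stone), which is why the panels above differ.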
Out[7]:
Text(0.5, 1.0, 'y')
Out[9]:
Text(0.5, 1.0, 'bins=[150, 175, 200, 225, 250]')
2.6 binwidth: setting the bin width
In [11]:
fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(8, 3))
_ = sns.histplot(penguins, x="flipper_length_mm", ax=ax[0], binwidth=1)
_ = sns.histplot(penguins, x="flipper_length_mm", ax=ax[1], binwidth=3)
2.7 cumulative: plotting the running total of the bar heights
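In cumulative mode each bar shows the sum of all bars up to and including its own bin, so the last bar reaches the total number of observations. A minimal NumPy sketch on a small made-up array, covering the default count statistic only:

```python
import numpy as np

# Made-up data; 5 equal-width bins over the range 0..10.
data = np.array([0, 1, 2.5, 3, 3.5, 5, 7, 10])
counts, edges = np.histogram(data, bins=5)

# Running total of the bar heights, left to right.
cumulative = np.cumsum(counts)

print(counts)      # per-bin counts
print(cumulative)  # heights drawn when cumulative=True
```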
In [12]:
sns.histplot(penguins, x="flipper_length_mm", cumulative=True)
In [13]:
sns.histplot(penguins, x="flipper_length_mm", multiple="layer")
2.8 hue: distinguishing groups within the bars by color
In [14]:
_ = sns.histplot(penguins, x="flipper_length_mm", hue="species")
2.9 element
In [15]:
fig, ax = plt.subplots(1, 3, constrained_layout=True, figsize=(12, 3))
pic = sns.histplot(penguins, x="flipper_length_mm", hue="species", ax=ax[0], element="bars")
pic.set_title('element="bars"')
pic = sns.histplot(penguins, x="flipper_length_mm", hue="species", ax=ax[1], element="step")
pic.set_title('element="step"')
pic = sns.histplot(penguins, x="flipper_length_mm", hue="species", ax=ax[2], element="poly")
pic.set_title('element="poly"')
D:\ProgramData\Anaconda3\envs\machine_learning\lib\site-packages\numpy\lib\histograms.py:669: RuntimeWarning: The number of bins estimated may be suboptimal.
  bin_edges, _ = _get_bin_edges(a, bins, range, weights)
Out[10]:
Text(0.5, 1.0, 'sqrt')
Out[12]:
<AxesSubplot:xlabel='flipper_length_mm', ylabel='Count'>
Out[13]:
<AxesSubplot:xlabel='flipper_length_mm', ylabel='Count'>
Out[15]:
Text(0.5, 1.0, 'element="poly"')
2.10 multiple
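The values of multiple differ in how the per-group counts within each bin are combined. The following NumPy sketch shows the arithmetic behind "layer", "stack", and "fill" on synthetic groups (the group names a, b, c and their distributions are made up); "dodge" uses the same counts as "layer" but draws narrower bars side by side. This is an illustration of the idea, not seaborn's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Three synthetic groups standing in for the hue levels.
groups = {
    "a": rng.normal(190, 7, 150),
    "b": rng.normal(196, 7, 70),
    "c": rng.normal(217, 7, 120),
}

# All groups share one set of bin edges computed from the pooled data.
pooled = np.concatenate(list(groups.values()))
edges = np.histogram_bin_edges(pooled, bins=10)
counts = np.array([np.histogram(v, bins=edges)[0] for v in groups.values()])

layer = counts                      # each group starts at zero; bars overlap
stack_tops = counts.cumsum(axis=0)  # top edge of each group when stacked
totals = counts.sum(axis=0)         # full height of each stacked bar
# "fill": share of each group within a bin, so every bin sums to 1.
fill = counts / np.where(totals == 0, 1, totals)

print(stack_tops[-1])
print(fill.round(2))
```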
In [16]:
fig, ax = plt.subplots(1, 4, constrained_layout=True, figsize=(16, 3))
pic = sns.histplot(penguins, x="flipper_length_mm", hue="species", ax=ax[0], multiple="layer")
pic.set_title('multiple="layer"')
pic = sns.histplot(penguins, x="flipper_length_mm", hue="species", ax=ax[1], multiple="dodge")
pic.set_title('multiple="dodge"')
pic = sns.histplot(penguins, x="flipper_length_mm", hue="species", ax=ax[2], multiple="stack")
pic.set_title('multiple="stack"')
pic = sns.histplot(penguins, x="flipper_length_mm", hue="species", ax=ax[3], multiple="fill")
pic.set_title('multiple="fill"')
2.11 kde: whether to draw a kernel density estimate curve
In [17]:
fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(8, 3))
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[0], kde=False)  # default: no kernel density curve
pic.set_title('kde=False')
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[1], kde=True)  # True: draw the kernel density curve
pic.set_title('kde=True')
2.12 color: setting the bar color
In [18]:
fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(8, 3))
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[0], color="#FFC0CB")  # a hexadecimal color code
pic.set_title('color="#FFC0CB"')
pic = sns.histplot(penguins, x="flipper_length_mm", ax=ax[1], color="orange")  # or a named color string
pic.set_title('color="orange"')
2.13 fill: whether to fill the bars with color
In [20]:
fig, ax = plt.subplots(1, 2, constrained_layout=True, figsize=(8, 3))
pic = sns.histplot(penguins, x="flipper_length_mm", hue='sex', ax=ax[0], fill=False)
pic.set_title('fill=False')
pic = sns.histplot(penguins, x="flipper_length_mm", hue='sex', ax=ax[1], fill=True)
pic.set_title('fill=True')
Out[16]:
Text(0.5, 1.0, 'multiple="fill"')
Out[17]:
Text(0.5, 1.0, 'kde=True')
Out[18]:
Text(0.5, 1.0, 'color="orange"')