Python lollipop chart code: Python code for 25 commonly used Matplotlib plots, worth bookmarking! -- 688IT编程网


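Author: zsx_yiyiyi

Editor: python大本营

This article is adapted from:

Hello everyone. Today I am sharing a roundup of 25 Matplotlib plots that are very useful in data analysis and visualization. The article is fairly long, so bookmark it and work through it at your own pace.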

# !pip install brewer2mpl
import warnings

import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

warnings.filterwarnings(action='once')

# Shared font sizes for titles, labels and ticks
large = 22; med = 16; small = 12
params = {'legend.fontsize': med,
          'figure.figsize': (16, 10),
          'axes.labelsize': med,
          'axes.titlesize': med,
          'xtick.labelsize': med,
          'ytick.labelsize': med,
          'figure.titlesize': large}
plt.rcParams.update(params)  # apply the settings; defining the dict alone has no effect

# Matplotlib 3.6 and later name this style 'seaborn-v0_8-whitegrid'
plt.style.use('seaborn-whitegrid')
sns.set_style("white")

# Jupyter/IPython only: render figures inline
%matplotlib inline

# Version
print(mpl.__version__)  #> 3.0.0
print(sns.__version__)  #> 0.9.0
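
The setup above was written for Matplotlib 3.0.0, and the seaborn style sheets were renamed in Matplotlib 3.6 (for example 'seaborn-whitegrid' became 'seaborn-v0_8-whitegrid'). A minimal sketch of picking whichever name the installed version provides; the helper `pick_style` is a hypothetical name introduced here for illustration, not part of Matplotlib:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def pick_style(candidates):
    """Return the first style name that this Matplotlib install provides."""
    for name in candidates:
        if name in plt.style.available:
            return name
    return "default"  # Matplotlib's built-in default style


# Try the newer name first, then the older one used in this article
style = pick_style(["seaborn-v0_8-whitegrid", "seaborn-whitegrid"])
plt.style.use(style)
print(style)
```

Checking `plt.style.available` avoids an error when a style name does not exist in the installed version.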

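1. Scatter plot

A scatter plot is the classic, fundamental plot for studying the relationship between two variables. If the data contains several groups, you may want to draw each group in a different color. In Matplotlib, you can do this conveniently with plt.scatter().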

# Import dataset
midwest = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/midwest_filter.csv")

# Prepare Data
# Create as many colors as there are unique midwest['category']
categories = np.unique(midwest['category'])
colors = [plt.cm.tab10(i/float(len(categories)-1)) for i in range(len(categories))]

# Draw Plot for Each Category
plt.figure(figsize=(16, 10), dpi=80, facecolor='w', edgecolor='k')
for i, category in enumerate(categories):
    # color= (not c=) so the RGBA tuple is read as one color, not as four values
    plt.scatter('area', 'poptotal',
                data=midwest.loc[midwest.category==category, :],
                s=20, color=colors[i], label=str(category))

# Decorations
plt.gca().set(xlabel='Area', ylabel='Population')
plt.title("Scatterplot of Midwest Area vs Population", fontsize=22)
plt.legend(fontsize=12)
plt.show()
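
The per-group coloring can be tried without downloading the dataset. The following is a minimal sketch on synthetic data: the group labels and random values are made up for illustration, and it assumes at least two groups, since the color positions are computed by dividing by the number of groups minus one.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Synthetic stand-in for the midwest data (values are invented for illustration)
rng = np.random.default_rng(0)
groups = np.repeat(["A", "B", "C"], 30)
x = rng.random(90)
y = rng.random(90)

categories = np.unique(groups)
# Sample the tab10 colormap at evenly spaced positions in [0, 1];
# each call returns one RGBA tuple
colors = [plt.cm.tab10(i / float(len(categories) - 1)) for i in range(len(categories))]

fig, ax = plt.subplots(figsize=(8, 5))
for color, category in zip(colors, categories):
    mask = groups == category          # rows belonging to this group
    ax.scatter(x[mask], y[mask], s=20, color=color, label=str(category))
ax.set(xlabel="x", ylabel="y")
ax.legend()
```

Each call to `ax.scatter` adds one collection to the axes, which is what lets the legend show one entry per group.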

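2. Bubble plot with encircling

Sometimes you want to show a group of points inside a boundary to emphasize their importance. In this example, you select the records that should be encircled from the dataframe and pass them to the encircle() function defined in the code below.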

from matplotlib import patches
from scipy.spatial import ConvexHull
import warnings; warnings.simplefilter('ignore')
sns.set_style("white")

# Step 1: Prepare Data
midwest = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/midwest_filter.csv")

# As many colors as there are unique midwest['category']
categories = np.unique(midwest['category'])
colors = [plt.cm.tab10(i/float(len(categories)-1)) for i in range(len(categories))]

# Step 2: Draw Scatterplot with unique color for each category
fig = plt.figure(figsize=(16, 10), dpi=80, facecolor='w', edgecolor='k')
for i, category in enumerate(categories):
    # s='dot_size' reads the marker sizes from that column of the data
    plt.scatter('area', 'poptotal', data=midwest.loc[midwest.category==category, :],
                s='dot_size', color=colors[i], label=str(category),
                edgecolors='black', linewidths=.5)

# Step 3: Encircling
def encircle(x, y, ax=None, **kw):
    """Draw the convex hull of the points (x, y) as a polygon on ax."""
    if ax is None:
        ax = plt.gca()
    p = np.c_[x, y]
    hull = ConvexHull(p)
    poly = plt.Polygon(p[hull.vertices, :], **kw)
    ax.add_patch(poly)

# Select data to be encircled
midwest_encircle_data = midwest.loc[midwest.state=='IN', :]

# Draw polygon surrounding vertices: a faint fill first, then a solid outline
encircle(midwest_encircle_data.area, midwest_encircle_data.poptotal, ec="k", fc="gold", alpha=0.1)
encircle(midwest_encircle_data.area, midwest_encircle_data.poptotal, ec="firebrick", fc="none", linewidth=1.5)

# Step 4: Decorations
plt.gca().set(xlabel='Area', ylabel='Population')
plt.title("Bubble Plot with Encircling", fontsize=22)
plt.legend(fontsize=12)
plt.show()
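
The encircling step relies on the convex hull of the selected points. A minimal sketch of what `ConvexHull` returns, using five made-up points (a unit square plus one interior point):

```python
import numpy as np
from scipy.spatial import ConvexHull

# Four corners of a unit square plus one point strictly inside it
points = np.array([[0.0, 0.0],
                   [1.0, 0.0],
                   [1.0, 1.0],
                   [0.0, 1.0],
                   [0.5, 0.5]])

hull = ConvexHull(points)

# hull.vertices holds the indices of the points on the hull boundary;
# the interior point (index 4) is not among them
outline = points[hull.vertices, :]
print(sorted(hull.vertices.tolist()))
```

Indexing the point array with `hull.vertices` gives the ordered outline that is handed to `plt.Polygon` in `encircle()`.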

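3. Scatter plot with linear regression line of best fit

If you want to understand how two variables change with respect to each other, the line of best fit is the way to go. The plot below shows how the line of best fit differs between the groups in the data. To disable the grouping and draw a single line of best fit for the whole dataset, remove the hue='cyl' parameter from the sns.lmplot() call below.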

# Import Data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")
df_select = df.loc[df.cyl.isin([4,8]), :]  # keep only 4- and 8-cylinder cars

# Plot
sns.set_style("white")
# robust=True fits a robust regression, which requires statsmodels to be installed
gridobj = sns.lmplot(x="displ", y="hwy", hue="cyl", data=df_select,
                     height=7, aspect=1.6, robust=True, palette='tab10',
                     scatter_kws=dict(s=60, linewidths=.7, edgecolors='black'))

# Decorations
gridobj.set(xlim=(0.5, 7.5), ylim=(0, 50))
plt.title("Scatterplot with line of best fit grouped by number of cylinders", fontsize=20)
plt.show()
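
To make the idea of one best-fit line per group concrete, here is a minimal sketch that fits each group separately with `np.polyfit`. Note the swap: this is an ordinary least-squares fit, not the robust regression that `sns.lmplot(..., robust=True)` performs, and the small data arrays are made up for illustration.

```python
import numpy as np

# Made-up data: group 4 follows y = 2x + 1, group 8 follows y = -x + 11
x = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 10.0, 9.0, 8.0, 7.0])
group = np.array([4, 4, 4, 4, 8, 8, 8, 8])

fits = {}
for g in np.unique(group):
    mask = group == g
    # Degree-1 polynomial fit returns (slope, intercept)
    slope, intercept = np.polyfit(x[mask], y[mask], deg=1)
    fits[int(g)] = (slope, intercept)

for g, (slope, intercept) in fits.items():
    print(f"group {g}: slope={slope:.2f}, intercept={intercept:.2f}")
```

Fitting on the whole array without the mask corresponds to dropping the grouping and drawing a single line for the full dataset.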

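Each regression line in its own column

Alternatively, you can show the line of best fit for each group in its own column. You can do this by setting the col parameter (here col="cyl") inside sns.lmplot().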

# Import Data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")
df_select = df.loc[df.cyl.isin([4,8]), :]  # keep only 4- and 8-cylinder cars

# Each line in its own column
sns.set_style("white")
gridobj = sns.lmplot(x="displ", y="hwy",
                     data=df_select,
                     height=7,
                     robust=True,
                     palette='Set1',
                     col="cyl",
                     scatter_kws=dict(s=60, linewidths=.7, edgecolors='black'))

# Decorations
gridobj.set(xlim=(0.5, 7.5), ylim=(0, 50))
plt.show()

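4. Jittering with stripplot

Often, several data points have exactly the same X and Y values. As a result, the points are drawn on top of each other and hide one another. To avoid this, jitter the points slightly so that you can see each of them. This is conveniently done with seaborn's stripplot().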

# Import Data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")

# Draw Stripplot
fig, ax = plt.subplots(figsize=(16,10), dpi=80)
sns.stripplot(x=df.cty, y=df.hwy, jitter=0.25, size=8, ax=ax, linewidth=.5)

# Decorations
plt.title('Use jittered plots to avoid overlapping of points', fontsize=22)
plt.show()
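
The effect of jittering can be sketched with NumPy alone: each point is shifted by a small random offset so that identical values no longer coincide. This is an illustration of the idea under simple assumptions (uniform noise, a fixed seed, made-up values); seaborn's internal implementation may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(42)

# Several points share the same x value, so they would overlap when plotted
x = np.array([1, 1, 1, 2, 2, 3], dtype=float)

jitter = 0.25  # maximum horizontal displacement, as in jitter=0.25 above
x_jittered = x + rng.uniform(-jitter, jitter, size=x.shape)

print(x_jittered)
```

The displacement is bounded by the jitter amount, so every point stays visually attached to its original category.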

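5. Counts plot

Another way to avoid overlapping points is to scale the size of each dot by the number of data points that fall on that spot. The larger the dot, the more points are concentrated there.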

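# Import Data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")
# Number of cars at each distinct (hwy, cty) combination
df_counts = df.groupby(['hwy', 'cty']).size().reset_index(name='counts')

# Draw counts plot
fig, ax = plt.subplots(figsize=(16,10), dpi=80)
# seaborn documents stripplot's size as a single number, so the per-point sizes
# are drawn with ax.scatter; s is the marker area, hence the squared diameter counts*2
ax.scatter(df_counts.cty, df_counts.hwy, s=(df_counts.counts*2)**2)
ax.set(xlabel='cty', ylabel='hwy')

# Decorations
plt.title('Counts Plot - Size of circle is bigger as more points overlap', fontsize=22)
plt.show()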
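
The counting step is the core of this plot. A minimal sketch on a tiny made-up dataframe shows what `groupby(...).size().reset_index(name='counts')` produces:

```python
import pandas as pd

# Made-up values: the pair (hwy=20, cty=15) occurs twice
df_small = pd.DataFrame({"hwy": [20, 20, 25, 20],
                         "cty": [15, 15, 18, 16]})

# One row per distinct (hwy, cty) pair, with the number of occurrences
df_small_counts = (df_small.groupby(["hwy", "cty"])
                           .size()
                           .reset_index(name="counts"))
print(df_small_counts)
```

The resulting `counts` column is what drives the marker size, so the pair that occurs twice is drawn larger than the pairs that occur once.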

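6. Marginal histogram

A marginal histogram has histograms along the X and Y axis variables. It is used to visualize the relationship between X and Y together with the univariate distributions of X and Y individually. This plot is often used in exploratory data analysis (EDA).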

# Import Data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")

# Create Fig and gridspec
fig = plt.figure(figsize=(16, 10), dpi=80)
grid = plt.GridSpec(4, 4, hspace=.5, wspace=.2)

# Define the axes
ax_main = fig.add_subplot(grid[:-1, :-1])
ax_right = fig.add_subplot(grid[:-1, -1], xticklabels=[], yticklabels=[])
ax_bottom = fig.add_subplot(grid[-1, 0:-1], xticklabels=[], yticklabels=[])

# Scatterplot on main ax, colored by manufacturer
ax_main.scatter('displ', 'hwy', s=df.cty*4, c=df.manufacturer.astype('category').cat.codes,
                alpha=.9, data=df, cmap="tab10", edgecolors='gray', linewidths=.5)

# histogram in the bottom
ax_bottom.hist(df.displ, 40, histtype='stepfilled', orientation='vertical', color='deeppink')
ax_bottom.invert_yaxis()

# histogram on the right
ax_right.hist(df.hwy, 40, histtype='stepfilled', orientation='horizontal', color='deeppink')

# Decorations
ax_main.set(title='Scatterplot with Histograms\ndispl vs hwy', xlabel='displ', ylabel='hwy')
ax_main.title.set_fontsize(20)
for item in ([ax_main.xaxis.label, ax_main.yaxis.label] + ax_main.get_xticklabels() + ax_main.get_yticklabels()):
    item.set_fontsize(14)

xlabels = ax_main.get_xticks().tolist()
ax_main.set_xticklabels(xlabels)
plt.show()
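
The layout is the part of this plot that is easiest to get wrong, so here is a minimal sketch of the same 4x4 grid on synthetic data. The random values are made up for illustration, and the shared axes (`sharex`, `sharey`) are an optional addition that is not in the code above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Synthetic, correlated data (invented for illustration)
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = x + rng.normal(scale=0.5, size=200)

fig = plt.figure(figsize=(8, 6))
grid = plt.GridSpec(4, 4, hspace=0.5, wspace=0.2)

# Main plot takes the top-left 3x3 cells; the margins take the last column and last row
ax_main = fig.add_subplot(grid[:-1, :-1])
ax_right = fig.add_subplot(grid[:-1, -1], sharey=ax_main)
ax_bottom = fig.add_subplot(grid[-1, :-1], sharex=ax_main)

ax_main.scatter(x, y, s=10, alpha=0.7)
ax_right.hist(y, bins=30, orientation="horizontal")  # distribution of y
ax_bottom.hist(x, bins=30)                           # distribution of x
ax_bottom.invert_yaxis()                             # bars hang down from the main plot
```

Slicing the `GridSpec` like a 2D array is what assigns cells to each axes: `grid[:-1, :-1]` is every row and column except the last.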

7. Marginal boxplot
