Combining Series into a DataFrame: [S01E04] Merging Datasets in pandas - 688IT编程网

I. Database-style merging: merge

i) The simplest merge

pd.merge(df1, df2, on='key')    # key is the overlapping column name

ii) Join keys with different column names

pd.merge(left, right, left_on='lkey', right_on='rkey')

iii) Join type (default is inner)

pd.merge(left, right, on='key', how='outer')

iv) Joining on multiple columns

pd.merge(left, right, on=['key1','key2'])
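A minimal sketch of a multi-column join. The frames and the lval/rval column names are made up for illustration; a row matches only when both key1 and key2 agree.

```python
import pandas as pd

# Hypothetical sample data; only key1/key2 come from the outline above.
left = pd.DataFrame({
    "key1": ["foo", "foo", "bar"],
    "key2": ["one", "two", "one"],
    "lval": [1, 2, 3],
})
right = pd.DataFrame({
    "key1": ["foo", "foo", "bar", "bar"],
    "key2": ["one", "one", "one", "two"],
    "rval": [4, 5, 6, 7],
})

# Inner join on the (key1, key2) pair: (foo, two) and (bar, two) have no
# partner in the other frame, so they drop out of the result.
merged = pd.merge(left, right, on=["key1", "key2"])
print(merged)
```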

v) Handling duplicate column names

pd.merge(left, right, on='key', suffixes=['_left','_right'])
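As a small illustration, assuming two hypothetical frames that both carry a non-key column named value, the suffixes are appended to the overlapping names so that neither column is lost:

```python
import pandas as pd

# Hypothetical frames: "value" exists on both sides but is not a join key.
left = pd.DataFrame({"key": ["a", "b"], "value": [1, 2]})
right = pd.DataFrame({"key": ["a", "b"], "value": [10, 20]})

# The first suffix goes to the left frame's column, the second to the right's.
merged = pd.merge(left, right, on="key", suffixes=("_left", "_right"))
print(merged.columns.tolist())
```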

vi) Merging on the index (index as the join key)

pd.merge(left, right, left_on='key', right_index=True)
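A sketch of the index-as-key case, with invented data: the key column of the left frame is matched against the index of the right frame.

```python
import pandas as pd

# Hypothetical data: `right` is keyed by its index rather than by a column.
left = pd.DataFrame({"key": ["a", "b", "a", "c"], "value": range(4)})
right = pd.DataFrame({"group_val": [3.5, 7.0]}, index=["a", "b"])

# left_on names the column on the left; right_index=True uses right's index.
merged = pd.merge(left, right, left_on="key", right_index=True)
print(merged)
```

Because the default join type is inner, the row with key c has no match in the index of right and does not appear in the result.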

II. Merging on the index: join

i) Using the join instance method to merge on the index

left.join(right, how='outer')

ii) Joining the index of the DataFrame passed as an argument against a column of the calling DataFrame

left.join(right, on='key')

iii) Joining multiple DataFrames with join

df1.join([df2,df3], how='outer', sort=True)
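The three join variants above can be sketched as follows. All frames, column names and values here are invented for illustration.

```python
import pandas as pd

left = pd.DataFrame({"A": [1, 2, 3]}, index=["a", "c", "e"])
right = pd.DataFrame({"B": [7, 8, 9]}, index=["b", "c", "d"])

# i) Index-on-index join; how="outer" keeps the union of both indexes.
outer = left.join(right, how="outer")

# ii) Match a column of the caller against the index of the argument.
people = pd.DataFrame({"key": ["a", "b", "a"], "value": [1, 2, 3]})
lookup = pd.DataFrame({"label": ["alpha", "beta"]}, index=["a", "b"])
on_key = people.join(lookup, on="key")

# iii) Pass a list to join several frames on their indexes at once.
third = pd.DataFrame({"C": [0, 1]}, index=["a", "f"])
multi = left.join([right, third], how="outer", sort=True)

print(outer, on_key, multi, sep="\n\n")
```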

III. Concatenating along an axis: concat

i) Concatenating Series (axis=0)

ii) Concatenating Series (axis=1)
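A short sketch of both cases, using three made-up Series with non-overlapping indexes: along axis=0 the values are stacked into one longer Series, while along axis=1 each Series becomes a column of a DataFrame.

```python
import pandas as pd

# Hypothetical Series with disjoint indexes.
s1 = pd.Series([0, 1], index=["a", "b"])
s2 = pd.Series([2, 3, 4], index=["c", "d", "e"])
s3 = pd.Series([5, 6], index=["f", "g"])

# axis=0 (the default): stack the values end to end into one Series.
stacked = pd.concat([s1, s2, s3])

# axis=1: one column per input; labels missing from a Series become NaN.
side_by_side = pd.concat([s1, s2, s3], axis=1)

print(stacked)
print(side_by_side)
```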

iii) Join type (default join='outer')
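To illustrate the difference, assume two hypothetical Series whose indexes only partly overlap. The default join='outer' keeps the union of the labels; join='inner' keeps only the labels present in every input.

```python
import pandas as pd

# Hypothetical Series: labels a and b are shared, f and g are not.
s1 = pd.Series([0, 1], index=["a", "b"])
s4 = pd.Series([0, 1, 5, 6], index=["a", "b", "f", "g"])

outer = pd.concat([s1, s4], axis=1)                # union of labels
inner = pd.concat([s1, s4], axis=1, join="inner")  # intersection of labels

print(outer)
print(inner)
```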

iv) Specifying the index to use on the non-concatenation axis

v) Distinguishing the concatenated pieces

names=['level0', 'level1'])
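The call above is cut off, so the following is only a guess at its general shape, not a reconstruction. It assumes the pieces are labelled with the keys argument, which builds a hierarchical index; the Series and the key labels one and two are invented.

```python
import pandas as pd

# Hypothetical inputs.
s1 = pd.Series([0, 1], index=["a", "b"])
s2 = pd.Series([2, 3, 4], index=["c", "d", "e"])

# keys labels each piece on a new outer index level;
# names assigns a name to each level of the resulting MultiIndex.
result = pd.concat([s1, s2], keys=["one", "two"], names=["level0", "level1"])

print(result)
print(result.loc["two"])  # recover the second piece by its key
```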

vi) Discarding irrelevant row indexes
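When the row labels carry no meaning (for example, default integer positions), ignore_index=True drops them and renumbers the result. A minimal sketch with made-up frames:

```python
import pandas as pd

# Hypothetical frames that both use the default integer index.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"a": [5], "b": [6]})

kept = pd.concat([df1, df2])                           # labels repeat
renumbered = pd.concat([df1, df2], ignore_index=True)  # fresh 0..n-1 labels

print(kept.index.tolist())
print(renumbered.index.tolist())
```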

IV. Combining overlapping data: combine_first()

df1.combine_first(df2)
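A small sketch with invented values: wherever df1 holds NaN, the value at the same position in df2 is used instead; values already present in df1 are kept.

```python
import numpy as np
import pandas as pd

# Hypothetical frames with gaps in different places.
df1 = pd.DataFrame({"a": [1.0, np.nan, 5.0], "b": [np.nan, 2.0, np.nan]})
df2 = pd.DataFrame({"a": [5.0, 4.0, np.nan], "b": [np.nan, 3.0, 4.0]})

# df1 takes priority; df2 only patches df1's missing entries.
combined = df1.combine_first(df2)
print(combined)
```

The entry at row 0 of column b stays NaN because it is missing in both frames.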

V. Appending data to the end of a DataFrame: append
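DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the sketch below swaps in pd.concat to get the same effect on current versions. The frames are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": ["a", "b"]})
new_rows = pd.DataFrame({"x": [3], "y": ["c"]})

# Equivalent of the old df.append(new_rows, ignore_index=True):
# returns a new frame and leaves df itself unchanged.
appended = pd.concat([df, new_rows], ignore_index=True)
print(appended)
```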

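pd.merge() joins rows from different DataFrames on one or more keys. (This resembles a database join: merge performs an "inner" join by default, whereas join performs a "left" join by default.)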

The instance method combine_first() splices overlapping data together, filling missing values in one object with the values from another.

I. Database-style merging: merge

Merge DataFrame objects by performing a database-style join operation by columns or indexes.

merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)

left : DataFrame

right : DataFrame

how : {'left', 'right', 'outer', 'inner'}, default 'inner'

* left: use only keys from left frame, similar to a SQL left outer join; preserve key order

* right: use only keys from right frame, similar to a SQL right outer join; preserve key order

* outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically

* inner: use intersection of keys from both frames, similar to a SQL inner join; preserve the order of the left keys

on : label or list

Column or index level names to join on. These must be found in both DataFrames. If `on` is None and not merging on indexes then this defaults to the intersection of the columns in both DataFrames.

left_on : label or list, or array-like

Column or index level names to join on in the left DataFrame. Can also be an array or list of arrays of the length of the left DataFrame. These arrays are treated as if they are columns.

right_on : label or list, or array-like

Column or index level names to join on in the right DataFrame. Can also be an array or list of arrays of the length of the right DataFrame. These arrays are treated as if they are columns.

left_index : boolean, default False

Use the index from the left DataFrame as the join key(s). If it is a MultiIndex, the number of keys in the other DataFrame (either the index or a number of columns) must match the number of levels.

right_index : boolean, default False

Use the index from the right DataFrame as the join key. Same caveats as left_index.

sort : boolean, default False

Sort the join keys lexicographically in the result DataFrame. If False, the order of the join keys depends on the join type (how keyword).

suffixes : 2-length sequence (tuple, list, ...)

Suffix to apply to overlapping column names in the left and right side, respectively.

copy : boolean, default True

If False, do not copy data unnecessarily

Key parameters

left

right

how

left_on

right_on

left_index

right_index

In other words: the data, the join type, and the join keys.

Merge and join operations combine datasets by linking rows on one or more keys. These operations are at the core of relational databases.

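i) The simplest merge

The simplest join: if no join key is specified explicitly, merge uses the overlapping column names as the keys by default.

Even so, it is better to specify the join key explicitly: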
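A minimal sketch of both forms. The frames df1 and df2 and their data1/data2 columns are made up for illustration, with key as the only column the two frames share.

```python
import pandas as pd

# Hypothetical sample data: "c" appears only in df1, "d" only in df2.
df1 = pd.DataFrame({"key": ["b", "b", "a", "c", "a", "a", "b"], "data1": range(7)})
df2 = pd.DataFrame({"key": ["a", "b", "d"], "data2": range(3)})

# Without `on`, merge joins on the columns the two frames have in common.
implicit = pd.merge(df1, df2)

# Naming the key gives the same result here and documents the intent.
explicit = pd.merge(df1, df2, on="key")

print(explicit)
```

Spelling out on='key' also guards against a surprise if another shared column is later added to both frames, since the implicit form would then silently join on it too.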

ii) Join keys with different column names

If the join key columns have different names in the left and right DataFrames, specify them explicitly with the left_on and right_on keywords, respectively.
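For illustration, assume two hypothetical frames whose key columns are named lkey and rkey (the names used in the outline at the top):

```python
import pandas as pd

# Hypothetical sample data with differently named key columns.
left = pd.DataFrame({"lkey": ["b", "b", "a", "c", "a", "a", "b"], "data1": range(7)})
right = pd.DataFrame({"rkey": ["a", "b", "d"], "data2": range(3)})

# Rows match where left.lkey equals right.rkey.
merged = pd.merge(left, right, left_on="lkey", right_on="rkey")
print(merged)
```

Both key columns survive in the result, so lkey and rkey carry the same value in every row.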

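iii) Join type (default is inner)

The results above contain no rows for the keys c and d, because merge performs an "inner" join by default, so the keys in the result are the intersection of the keys in both frames. For any other join type, specify it explicitly with the how keyword.

how='outer'

how='left'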
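A sketch comparing the two join types on made-up data, where key c exists only in df1 and key d only in df2:

```python
import pandas as pd

# Hypothetical sample data.
df1 = pd.DataFrame({"key": ["b", "b", "a", "c", "a", "a", "b"], "data1": range(7)})
df2 = pd.DataFrame({"key": ["a", "b", "d"], "data2": range(3)})

# outer: union of keys; unmatched cells are filled with NaN.
outer = pd.merge(df1, df2, on="key", how="outer")

# left: every row of df1 is kept; keys found only in df2 are dropped.
left_join = pd.merge(df1, df2, on="key", how="left")

print(outer)
print(left_join)
```

In the outer result both c and d appear, each with NaN on the side that lacks them. In the left result c is kept with NaN in data2, and d is absent.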
