Array stacking methods in Python NumPy (np.hstack and np.vstack), and using lists ...
1. Environment
Anaconda 3
Python 3.6
NumPy 1.14.3
2. Purpose and official documentation
3. Examples
Example 1: Merging data and labels with np.hstack
>>> import numpy as np
# Prepare the data
>>> data = [i for i in range(18)]
>>> data_array = np.asarray(data).reshape([6, 3])
>>> data_array.shape
(6, 3)
>>> data_array
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14],
       [15, 16, 17]])
# Prepare the labels
>>> label = [0, 1] * 3
>>> label_array = np.asarray(label)
>>> label_array.shape
(6,)
>>> label_array
array([0, 1, 0, 1, 0, 1])
# Try to append the labels as a column on the right of the data
>>> data_label = np.hstack((data_array, label_array))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/shape_base.py", line 288, in hstack
    return _nx.concatenate(arrs, 1)
ValueError: all the input arrays must have same number of dimensions
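Intuitively, it seems that np.hstack should work as long as the two NumPy arrays have the same number of rows, so that their columns can be joined side by side. This is a common pitfall for beginners. A closer look at the error message shows the real problem: the two arrays have a different number of dimensions. data_array is two-dimensional with shape (6, 3), while label_array is one-dimensional with shape (6,). So even though both arrays have length 6 along their first axis, they cannot be stacked column-wise as they are.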
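The mismatch can be confirmed before stacking by comparing the ndim and shape attributes of both arrays. The following is a minimal sketch that rebuilds the two arrays from this example and catches the error. The exact wording of the error message differs between NumPy versions, so only the exception type is relied on here; the flag name stacked_ok is made up for illustration.

```python
import numpy as np

data_array = np.arange(18).reshape(6, 3)   # 2-D, shape (6, 3)
label_array = np.asarray([0, 1] * 3)       # 1-D, shape (6,)

# Compare the number of dimensions, not just the length of the first axis
print(data_array.ndim, data_array.shape)
print(label_array.ndim, label_array.shape)

try:
    np.hstack((data_array, label_array))
    stacked_ok = True
except ValueError as exc:
    # Raised because a 2-D array and a 1-D array cannot be joined along axis 1
    stacked_ok = False
    print("hstack failed:", exc)
```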
The correct approach:
# Reshape the one-dimensional labels into a two-dimensional array with 6 rows and 1 column
>>> label_array = label_array.reshape(-1, 1)
>>> data_label = np.hstack((data_array, label_array))
>>> data_label.shape
(6, 4)
>>> data_label
array([[ 0,  1,  2,  0],
       [ 3,  4,  5,  1],
       [ 6,  7,  8,  0],
       [ 9, 10, 11,  1],
       [12, 13, 14,  0],
       [15, 16, 17,  1]])
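Reshaping is not the only way to turn the labels into a column. As a sketch of two alternatives that are not part of the original example: indexing with np.newaxis adds the missing axis without a separate reshape call, and np.column_stack accepts a one-dimensional array and treats it as a single column. The variable names below are chosen for illustration only.

```python
import numpy as np

data_array = np.arange(18).reshape(6, 3)
label_array = np.asarray([0, 1] * 3)        # still 1-D, shape (6,)

# Option A: add a trailing axis so the labels become shape (6, 1)
via_newaxis = np.hstack((data_array, label_array[:, np.newaxis]))

# Option B: column_stack turns 1-D inputs into columns by itself
via_column_stack = np.column_stack((data_array, label_array))

print(via_newaxis.shape)                                # (6, 4)
print(np.array_equal(via_newaxis, via_column_stack))    # True
```

Both options leave label_array itself one-dimensional, which can be convenient if the labels are used elsewhere in that form.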
Example 2: Merging two datasets with np.vstack
>>> import numpy as np
# Prepare the first dataset
>>> data1 = np.random.normal(0, 1, (2, 5))
>>> data1.shape
(2, 5)
>>> data1
array([[-1.49100993,  0.03782522,  0.33961941, -0.64073217,  0.84000297],
       [-1.02662855, -0.91858614, -0.27410549, -0.86956142, -0.44147313]])
# Prepare the second dataset
>>> data2 = np.arange(0, 30, 2).reshape([3, 5])
>>> data2.shape
(3, 5)
>>> data2
array([[ 0,  2,  4,  6,  8],
       [10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28]])
# Stack the two datasets vertically
>>> data = np.vstack((data1, data2))
>>> data.shape
(5, 5)
>>> data
array([[-1.49100993,  0.03782522,  0.33961941, -0.64073217,  0.84000297],
       [-1.02662855, -0.91858614, -0.27410549, -0.86956142, -0.44147313],
       [ 0.        ,  2.        ,  4.        ,  6.        ,  8.        ],
       [10.        , 12.        , 14.        , 16.        , 18.        ],
       [20.        , 22.        , 24.        , 26.        , 28.        ]])
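Note that the integer rows of data2 show up as floats in the stacked result: when the inputs have different dtypes, the output uses a common dtype able to hold all of them. The sketch below illustrates this under one assumption, namely that np.zeros stands in for the random floats so the result is reproducible. It also checks that, for two-dimensional inputs, np.vstack gives the same result as np.concatenate along axis 0.

```python
import numpy as np

float_block = np.zeros((2, 5))                    # float64 (stand-in for data1)
int_block = np.arange(0, 30, 2).reshape(3, 5)     # integer dtype, like data2

stacked = np.vstack((float_block, int_block))

# The integer block is promoted to float in the combined array
print(float_block.dtype, int_block.dtype, stacked.dtype)

# For 2-D inputs, vstack matches concatenate along the first axis
same = np.array_equal(
    stacked, np.concatenate((float_block, int_block), axis=0)
)
print(same)
```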
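Example 3: Stacking multiple datasets in one call with the help of a list
This is useful when datasets are read in a for / while loop: append each dataset to a list as it is read, then stack all of them together at the end, instead of tediously stacking them two at a time.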
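The loop pattern described above can be sketched as follows. The load_chunk function is hypothetical and only stands in for whatever code actually reads a dataset; here it returns a small block filled with the loop index so the result is easy to inspect.

```python
import numpy as np

def load_chunk(index):
    """Hypothetical loader: returns a (2, 5) block filled with `index`."""
    return np.full((2, 5), index)

chunks = []
for index in range(4):
    # Appending to a list only stores a reference to the array
    chunks.append(load_chunk(index))

# A single vstack call builds the combined array once, after the loop
combined = np.vstack(chunks)
print(combined.shape)   # (8, 5)
```

Calling np.vstack inside the loop on a growing array would copy all previously stacked rows on every iteration, which is why collecting the pieces in a list first tends to be the cheaper option.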
# Prepare the first dataset
>>> data_v1 = np.random.randint(0, 10, (2, 5))
>>> data_v1.shape
(2, 5)
>>> data_v1
array([[4, 4, 0, 7, 3],
       [3, 9, 0, 3, 0]])
# Prepare the second dataset
>>> data_v2 = np.ones((3, 5))
>>> data_v2.shape
(3, 5)
>>> data_v2
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
# Prepare the third dataset
>>> data_v3 = np.full((2, 5), 0)
>>> data_v3.shape
(2, 5)
>>> data_v3
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
# Define a temporary list to hold the datasets, and append every dataset to it
>>> data_vlist = []
>>> data_vlist.append(data_v1)
>>> data_vlist.append(data_v2)
>>> data_vlist.append(data_v3)
>>> len(data_vlist)
3
>>> data_vlist
[array([[4, 4, 0, 7, 3],
       [3, 9, 0, 3, 0]]), array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]]), array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])]
# Pass the list holding all the datasets to np.vstack() to merge them in one call
>>> data_vstack = np.vstack(data_vlist)
>>> data_vstack
array([[4., 4., 0., 7., 3.],
       [3., 9., 0., 3., 0.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
>>> data_vstack.shape
(7, 5)
Similarly, np.hstack can take a list and merge multiple blocks of columns horizontally in one call.
# Prepare the first block of columns
>>> import numpy as np
>>> data_h1 = np.random.randint(0, 10, (3, 3))
>>> data_h1.shape
(3, 3)
>>> data_h1
array([[6, 4, 5],
       [4, 5, 0],
       [7, 1, 9]])
# Prepare the second block of columns
>>> data_h2 = np.zeros((3, 2))
>>> data_h2.shape
(3, 2)
>>> data_h2
array([[0., 0.],
       [0., 0.],
       [0., 0.]])
# Prepare the third block of columns
>>> data_h3 = np.ones((3, 1), dtype=int)
>>> data_h3.shape
(3, 1)
>>> data_h3
array([[1],
       [1],
       [1]])
# Define a temporary list to hold the column blocks, and append every block to it
>>> data_hlist = []
>>> data_hlist.append(data_h1)
>>> data_hlist.append(data_h2)
>>> data_hlist.append(data_h3)
>>> len(data_hlist)
3
>>> data_hlist
[array([[6, 4, 5],
       [4, 5, 0],
       [7, 1, 9]]), array([[0., 0.],
       [0., 0.],
       [0., 0.]]), array([[1],
       [1],
       [1]])]
# Pass the list holding all the column blocks to np.hstack() to merge them in one call
>>> data_hstack = np.hstack(data_hlist)
>>> data_hstack.shape
(3, 6)
>>> data_hstack
array([[6., 4., 5., 0., 0., 1.],
       [4., 5., 0., 0., 0., 1.],
       [7., 1., 9., 0., 0., 1.]])
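When the list is filled from several sources, a block with the wrong number of rows only surfaces as an error inside np.hstack. One possible way to fail earlier with a clearer message is a small wrapper that checks the row counts first. The helper name hstack_blocks is hypothetical and not part of NumPy; this is a sketch that assumes every block is two-dimensional.

```python
import numpy as np

def hstack_blocks(blocks):
    """Hypothetical helper: hstack 2-D blocks after checking their row counts."""
    row_counts = {block.shape[0] for block in blocks}
    if len(row_counts) != 1:
        raise ValueError(f"row counts differ: {sorted(row_counts)}")
    return np.hstack(blocks)

blocks = [
    np.arange(9).reshape(3, 3),       # integer block, 3 columns
    np.zeros((3, 2)),                 # float block, 2 columns
    np.ones((3, 1), dtype=int),       # integer block, 1 column
]
merged = hstack_blocks(blocks)
print(merged.shape, merged.dtype)     # (3, 6) float64
```

As in the example above, mixing integer and float blocks yields a float result.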