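Size-incremental Numpy array in Python

I just came across the need for an incremental Numpy array in Python, since I have not found anything that implements it. I am wondering whether my approach is the best one or whether you can suggest other ideas.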
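So, the problem is that I have a 2D array (the program handles nD arrays) whose size is not known in advance, and a variable amount of data needs to be concatenated to the array in one direction (let's say I have to call np.vstack many times). Every time I concatenate data, I need to take the array, sort it along axis 0 and perform other operations, so I cannot build a long list of arrays and then np.vstack the list in one go.
Since memory allocation is expensive, I turned to incremental arrays, where I grow the array by more than the amount I currently need (I use 50% increments) in order to minimize the number of allocations.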
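To make the effect of the growth factor concrete, here is a small sketch that only counts reallocations; it does not touch NumPy at all. The function name `count_reallocations` and all of its parameters are hypothetical, chosen for illustration, and appending one row at a time is an assumed worst case rather than the workload described above.

```python
def count_reallocations(total_rows, growth=0.5, initial_capacity=10, max_increment=None):
    """Count the reallocations needed to append total_rows rows one at a time.

    All names and defaults here are illustrative assumptions.
    """
    capacity = initial_capacity
    reallocations = 0
    for size in range(1, total_rows + 1):
        if size > capacity:
            # Grow by a fraction of the current capacity, but by at least 1 row.
            increment = max(1, int(capacity * growth))
            if max_increment is not None:
                # Optional cap on the size of each growth step.
                increment = min(increment, max_increment)
            capacity += increment
            reallocations += 1
    return reallocations


uncapped = count_reallocations(100_000)
capped = count_reallocations(100_000, max_increment=1000)
print(uncapped, capped)
```

With proportional growth the number of reallocations grows roughly logarithmically with the final size. Once a fixed cap on the increment kicks in, growth becomes linear again, so the capped count comes out noticeably larger for big arrays.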
I coded this up, and you can see it in the following code:
import numpy as np

class ExpandingArray:
    __DEFAULT_ALLOC_INIT_DIM = 10  # default initial dimension for every axis if nothing is given by the user
    __DEFAULT_MAX_INCREMENT = 10   # default value in order to limit the increment of memory allocation

    __MAX_INCREMENT = []  # Max increment
    __ALLOC_DIMS = []     # Dimensions of the allocated np.array
    __DIMS = []           # Dimensions of the view with data on the allocated np.array (__DIMS <= __ALLOC_DIMS)
    __ARRAY = []          # Allocated array

    def __init__(self, initData, allocInitDim=None, dtype=np.float64, maxIncrement=None):
        self.__DIMS = np.array(initData.shape)

        self.__MAX_INCREMENT = maxIncrement
        if self.__MAX_INCREMENT is None:
            self.__MAX_INCREMENT = self.__DEFAULT_MAX_INCREMENT

        # Compute the allocation dimensions based on user's input
        if allocInitDim is None:
            allocInitDim = self.__DIMS.copy()
        else:
            allocInitDim = np.array(allocInitDim, dtype=int)

        while np.any(allocInitDim < self.__DIMS) or np.any(allocInitDim == 0):
            for i in range(len(self.__DIMS)):
                if allocInitDim[i] == 0:
                    allocInitDim[i] = self.__DEFAULT_ALLOC_INIT_DIM
                if allocInitDim[i] < self.__DIMS[i]:
                    # Grow by 50%, capped at __MAX_INCREMENT and never by less than 1
                    allocInitDim[i] += max(1, min(allocInitDim[i] // 2, self.__MAX_INCREMENT))

        # Allocate memory
        self.__ALLOC_DIMS = allocInitDim
        self.__ARRAY = np.zeros(self.__ALLOC_DIMS, dtype=dtype)

        # Set initData (index with a tuple of slices, one per axis)
        sliceIdxs = tuple(slice(d) for d in self.__DIMS)
        self.__ARRAY[sliceIdxs] = initData

    def shape(self):
        return tuple(self.__DIMS)

    def getAllocArray(self):
        return self.__ARRAY

    def getDataArray(self):
        """
        Get the view of the array with data
        """
        sliceIdxs = tuple(slice(d) for d in self.__DIMS)
        return self.__ARRAY[sliceIdxs]

    def concatenate(self, X, axis=0):
        if axis >= len(self.__DIMS):
            print("Error: axis number exceed the number of dimensions")
            return

        # Check dimensions for remaining axis
        for i in range(len(self.__DIMS)):
            if i != axis:
                if X.shape[i] != self.shape()[i]:
                    print("Error: Dimensions of the input array are not consistent in the axis %d" % i)
                    return

        # Check whether allocated memory is enough
        needAlloc = False
        while self.__ALLOC_DIMS[axis] < self.__DIMS[axis] + X.shape[axis]:
            needAlloc = True
            # Increase the __ALLOC_DIMS (50% growth, capped, never by less than 1)
            self.__ALLOC_DIMS[axis] += max(1, min(self.__ALLOC_DIMS[axis] // 2, self.__MAX_INCREMENT))

        # Reallocate memory and copy old data
        if needAlloc:
            # Allocate (keep the dtype of the existing array)
            newArray = np.zeros(self.__ALLOC_DIMS, dtype=self.__ARRAY.dtype)
            # Copy
            sliceIdxs = tuple(slice(d) for d in self.__DIMS)
            newArray[sliceIdxs] = self.__ARRAY[sliceIdxs]
            self.__ARRAY = newArray

        # Concatenate new data
        sliceIdxs = []
        for i in range(len(self.__DIMS)):
            if i != axis:
                sliceIdxs.append(slice(self.__DIMS[i]))
            else:
                sliceIdxs.append(slice(self.__DIMS[i], self.__DIMS[i] + X.shape[i]))

        self.__ARRAY[tuple(sliceIdxs)] = X
        self.__DIMS[axis] += X.shape[axis]
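The code shows considerably better performance than vstack/hstack for several concatenations of randomly sized chunks.
What I am wondering is: is this the best way? Is there anything in numpy that already does this?
Furthermore, it would be nice to be able to overload the slice assignment operator of np.array, so that as soon as the user assigns anything outside the actual dimensions, a concatenate() is performed. How can such overloading be done?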
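One possible shape for such an overload is a wrapper class that defines `__setitem__`, sketched below. This is only an illustration under several assumptions: `GrowableRows`, `data` and `_reserve` are hypothetical names, growth is handled along axis 0 only, and only slices with a non-negative start and an explicit stop trigger growth. A wrapper is used here instead of an `np.ndarray` subclass because swapping the underlying buffer is straightforward on a plain attribute.

```python
import numpy as np


class GrowableRows:
    """Hypothetical wrapper that grows along axis 0 on out-of-range slice assignment."""

    def __init__(self, n_cols, capacity=10, dtype=np.float64):
        self._buf = np.zeros((capacity, n_cols), dtype=dtype)
        self._n = 0  # number of rows that currently hold data

    @property
    def data(self):
        # View of the rows that hold data (no copy).
        return self._buf[:self._n]

    def _reserve(self, n_rows):
        # Make sure the buffer can hold n_rows rows, growing by at least 50%.
        capacity = self._buf.shape[0]
        if n_rows <= capacity:
            return
        new_capacity = max(n_rows, capacity + capacity // 2)
        new_buf = np.zeros((new_capacity,) + self._buf.shape[1:], dtype=self._buf.dtype)
        new_buf[:self._n] = self._buf[:self._n]
        self._buf = new_buf

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        # Only a row slice with an explicit stop past the current end triggers growth.
        grows = (isinstance(key, slice) and key.stop is not None
                 and key.stop > self._n
                 and (key.start is None or key.start >= 0))
        if grows:
            self._reserve(key.stop)
            self._n = key.stop
        # Assign through a view of the valid rows; other keys behave as on a plain array.
        self._buf[:self._n][key] = value


g = GrowableRows(n_cols=2, capacity=4)
g[0:3] = 1.0                      # fits in the initial buffer
g[3:9] = np.full((6, 2), 2.0)     # runs past the buffer, so it is reallocated
print(g.data.shape)
```

Rows skipped by an assignment that starts beyond the current end are left as zeros in this sketch; whether that is acceptable depends on the use case.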
Testing code: I am also posting the code I used to compare vstack with my method. I add random chunks of data with a maximum length of 100.
import time

N = 10000

def performEA(N):
    EA = ExpandingArray(np.zeros((0, 2)), maxIncrement=1000)
    for i in range(N):
        nNew = np.random.randint(low=1, high=101)  # chunk length in 1..100
        X = np.random.rand(nNew, 2)
        EA.concatenate(X, axis=0)
        # Perform operations on EA.getDataArray()
    return EA

def performVStack(N):
    A = np.zeros((0, 2))
    for i in range(N):
        nNew = np.random.randint(low=1, high=101)  # chunk length in 1..100
        X = np.random.rand(nNew, 2)
        A = np.vstack((A, X))
        # Perform operations on A
    return A

start_EA = time.perf_counter()
EA = performEA(N)
stop_EA = time.perf_counter()

start_VS = time.perf_counter()
VS = performVStack(N)
stop_VS = time.perf_counter()

print("Elapsed Time EA: %.2f" % (stop_EA - start_EA))
print("Elapsed Time VS: %.2f" % (stop_VS - start_VS))
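Answer:
I think the most common design pattern for this kind of thing is to just use a list of small arrays. Of course you can do dynamic resizing (if you want to do crazy things, you can also try the array's resize method). I think a typical approach is to always double the size when you really don't know how large things will get. And if you do know how large the array will grow, simply preallocating the whole thing up front is the simplest option.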
def performVStack_fromlist(N):
    l = []
    for i in range(N):
        nNew = np.random.randint(low=1, high=101)  # chunk length in 1..100
        X = np.random.rand(nNew, 2)
        l.append(X)
    # Single allocation and copy at the end
    return np.vstack(l)
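For the case where the data must be available after every append, the doubling approach mentioned above could look roughly like the following. This is a minimal sketch rather than a library API: `append_rows` is a hypothetical helper, it handles growth along axis 0 only, and it copies into a fresh buffer instead of using the in-place resize method.

```python
import numpy as np


def append_rows(buf, n_used, new_rows):
    """Append new_rows after buf[:n_used], doubling the capacity of buf when needed.

    Returns the (possibly reallocated) buffer and the new number of used rows.
    """
    needed = n_used + new_rows.shape[0]
    capacity = buf.shape[0]
    if needed > capacity:
        # Double until the new rows fit; max(1, ...) handles an empty buffer.
        while capacity < needed:
            capacity = max(1, capacity * 2)
        grown = np.empty((capacity,) + buf.shape[1:], dtype=buf.dtype)
        grown[:n_used] = buf[:n_used]
        buf = grown
    buf[n_used:needed] = new_rows
    return buf, needed


rng = np.random.default_rng(0)
buf, n = np.empty((0, 2)), 0
for _ in range(1000):
    X = rng.random((int(rng.integers(1, 101)), 2))  # chunk length in 1..100
    buf, n = append_rows(buf, n, X)
    data = buf[:n]  # view of the valid rows, available after every append
```

Because the capacity doubles, the total amount of copying stays proportional to the final size, at the price of up to twice the memory in the worst case.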
