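Flux Data Processing (Python): Converting NC Format to CSV
1. Data structure
NetCDF (Network Common Data Form) is a general-purpose data format built from variables, dimensions, and attributes. The flux data file RDMF_ can be visualized with the Panoply software, as shown in the figure below:
2. Reading the data
Most current approaches to reading nc data in Python open the file with the Dataset class of the netCDF4 package. For this flux data in nc format, however, Dataset could not read out the variable values.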
import pandas as pd
import os
from netCDF4 import Dataset
# Open the nc file to inspect its contents
nc = Dataset(r'D:\NC_files\fifth_page\Red Dirt Melon Farm OzFlux tower site\RDMF_')
# Get the names of all variables in RDMF_
var_names = nc.variables.keys()
print(var_names)
Output: odict_keys(['Ah', 'Ah_QCFlag', 'Cc', 'Cc_QCFlag', 'Day', 'Day_QCFlag', 'Fa', 'Fa_QCFlag',
'Fc', 'Fc_QCFlag', 'Fe', 'Fe_QCFlag', 'Fg', 'Fg_QCFlag', 'Fh', 'Fh_QCFlag', 'Fld', 'Fld_QCFlag',
'Flu', 'Flu_QCFlag', 'Fm', 'Fm_QCFlag', 'Fn', 'Fn_QCFlag', 'Fsd', 'Fsd_QCFlag', 'Fsu',
'Fsu_QCFlag', 'Hdh', 'Hdh_QCFlag', 'Hour', 'Hour_QCFlag', 'Minute', 'Minute_QCFlag', 'Month',
'Month_QCFlag', 'Precip', 'Precip_QCFlag', 'Second', 'Second_QCFlag', 'Sws', 'Sws_QCFlag',
'Sws_50cm', 'Sws_50cm_QCFlag', 'Sws_5cm', 'Sws_5cm_QCFlag', 'Ta', 'Ta_QCFlag', 'Ts',
'Ts_QCFlag', 'Wd_CSAT', 'Wd_CSAT_QCFlag', 'Ws_CSAT', 'Ws_CSAT_QCFlag', 'Year', 'Year_QCFlag',
'eta', 'eta_QCFlag', 'ps', 'ps_QCFlag', 'theta', 'theta_QCFlag', 'ustar', 'ustar_QCFlag', 'xlDateTime', 'xlDateTime_QCFlag'])
import pandas as pd
import os
from netCDF4 import Dataset
# Open the nc file to inspect its contents
nc = Dataset(r'D:\NC_files\fifth_page\Red Dirt Melon Farm OzFlux tower site\RDMF_')
# Get the names of all variables in RDMF_
var_names = nc.variables.keys()
for var in var_names:
    # Read the values of each variable
    var_data = nc.variables[var][:].data
    print(var, var_data)
Output: Traceback (most recent call last):
  File "E:/05study/Pycode/Python/TSTLProject/test.py", line 52, in <module>
    var_data=nc.variables[var][:].data
ValueError: could not convert string to float: '0,50'
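The code above cannot read the variable values; it fails with ValueError: could not convert string to float: '0,50'. My guess is that the cause is the entry valid_range = "0,50" in the variable's attribute description, as shown in the figure below. If anyone knows the real cause, please share it in the comments.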
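To make the guess above concrete, here is a minimal sketch in plain Python showing that converting the string "0,50" to a float produces exactly the same error message. It does not show what netCDF4 does internally; it only shows that a two-number range stored as one comma-separated string cannot be converted in a single step. The helpers `try_float` and `parse_valid_range` are hypothetical names introduced for illustration.

```python
def try_float(text):
    """Return the ValueError message raised by float(text), or None on success."""
    try:
        float(text)
    except ValueError as exc:
        return str(exc)
    return None


def parse_valid_range(text):
    """Hypothetical helper: split a string such as "0,50" into two floats."""
    low, high = (float(part) for part in text.split(","))
    return low, high


# A single float() call rejects the comma-separated string.
print(try_float("0,50"))
# Splitting on the comma first gives the two bounds.
print(parse_valid_range("0,50"))
```

If the string-valued valid_range really is the trigger, one thing that might be worth trying is turning off netCDF4's automatic masking with nc.set_auto_mask(False) before reading the variables. This has not been verified against the RDMF_ file.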
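After several attempts, we brought in the gdal package to read the flux data. The code below reads the variable Ah as an example; all other variables are read the same way: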
import pandas as pd
import os
from osgeo import gdal
# Path to the RDMF_ file
nc_path = r'D:\NC_files\fifth_page\Red Dirt Melon Farm OzFlux tower site\RDMF_'
# Open the Ah variable in RDMF_ as a GDAL subdataset (NETCDF:"file":variable)
variable = gdal.Open('NETCDF:"' + nc_path + '":Ah')
# Read the values and flatten the multi-dimensional array to 1-D in row-major order
variable_value = variable.ReadAsArray().flatten('C')
print(variable_value)
Output: [ 9.884751e+00 -9.999000e+03 -9.999000e+03 ... 1.499470e+01 1.622357e+01 1.678135e+01]. The result is a NumPy array.
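The printed array contains several entries equal to -9.999000e+03. The sketch below assumes that -9999 is a missing-value sentinel (the post does not state this) and replaces it with NaN so that statistics are not distorted. The function name `mask_missing` and the sample values are illustrative only.

```python
import numpy as np

MISSING_VALUE = -9999.0  # assumption: sentinel used for missing samples


def mask_missing(values, missing=MISSING_VALUE):
    """Return a float64 copy of `values` with the sentinel replaced by NaN."""
    arr = np.array(values, dtype=float)  # np.array always copies here
    arr[np.isclose(arr, missing)] = np.nan
    return arr


# Small sample in the style of the printed output above
sample = np.array([9.884751, -9999.0, -9999.0, 14.9947], dtype=np.float32)
cleaned = mask_missing(sample)
print(cleaned)
print(np.nanmean(cleaned))  # mean over the valid samples only
```

Working on a copy keeps the raw array intact, which helps if the sentinel assumption later turns out to be wrong.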
3. Writing the variable values to CSV
import pandas as pd
import os
from osgeo import gdal
from netCDF4 import Dataset
# Path to the RDMF_ file
nc_path = r'D:\NC_files\fifth_page\Red Dirt Melon Farm OzFlux tower site\RDMF_'
# Open the nc file to list its variables
nc = Dataset(nc_path)
# DataFrame that will hold the variable values
df = pd.DataFrame()
# Loop over the variables in the nc file and read the values of each one
for var in nc.variables.keys():
    variable = gdal.Open('NETCDF:"' + nc_path + '":' + var)
    # Read the values and flatten the multi-dimensional array to 1-D in row-major order
    variable_value = variable.ReadAsArray().flatten('C')
    # Store the values as a DataFrame column named after the variable
    df[var] = pd.Series(variable_value)
# Write the DataFrame to test.csv
df.to_csv('test.csv', encoding='utf-8', index=False)
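The variable list includes Year, Month, Day, Hour, Minute, and Second, so the resulting table can be given a single timestamp column before or after export. The following is a rough sketch, assuming those six columns hold valid whole-number values with no missing-value sentinels; `add_timestamp` is a hypothetical helper and the demo rows are made up for illustration.

```python
import pandas as pd

TIME_COLUMNS = ["Year", "Month", "Day", "Hour", "Minute", "Second"]


def add_timestamp(df):
    """Return a copy of df with a 'timestamp' column built from the time columns."""
    parts = df[TIME_COLUMNS].astype(int)
    # pandas.to_datetime assembles datetimes from year/month/day/... columns
    parts.columns = [name.lower() for name in TIME_COLUMNS]
    out = df.copy()
    out["timestamp"] = pd.to_datetime(parts)
    return out


# Made-up rows, only to show the shape of the result
demo = pd.DataFrame({
    "Year": [2011.0, 2011.0], "Month": [9.0, 9.0], "Day": [1.0, 1.0],
    "Hour": [0.0, 0.0], "Minute": [0.0, 30.0], "Second": [0.0, 0.0],
    "Ah": [9.88, 10.1],
})
result = add_timestamp(demo)
print(result[["timestamp", "Ah"]])
```

Rows where any time component is missing would need to be filtered or masked first, since the integer cast and the datetime assembly both expect valid calendar values.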
4. Results