Python temperature symbol: methods for handling missing meteorological data, using temperature as an example (Python version) -- 688IT编程网

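Meteorological observation data often contain missing values. Using temperature as an example, this document introduces several methods for handling them. All of the methods rely on functionality already available in Pandas, the Python library, which makes handling ordinary gaps in meteorological data simple and fast. Of course, missing data in meteorology come in many forms, and different meteorological elements call for different treatments; complex, composite cases usually need a tailored solution implemented in custom code. This document covers only the general case. Temperature, the most commonly used element, was chosen because it is continuously distributed in both space and time. Precipitation, by contrast, is strongly discrete in space, so the methods here are not suitable for filling gaps in daily precipitation, although they can be used for accumulated precipitation over longer time scales.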

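This document covers the following topics:

1) Processing time series data

2) Four approaches to filling missing values: ① fill with the observation of the previous or the next day; ② fill with a statistic of the observed series; ③ fill by interpolation; ④ fill with data for the same period from a reference year or station

3) Descriptive statistics

4) Plotting, including the display of special symbols such as the temperature unit ℃

5) How to get the row indices of missing values

6) Selecting a subset of the data using dates as the index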

# Load the commonly used Python libraries; these were covered in earlier articles and are not described again here
import numpy as np
import pandas as pd
import datetime as dt
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Load the data
meteo_min = pd.read_csv('./data/meteo_min.csv', sep=',', header=0)

# Preview the first 5 rows, then rename the columns
meteo_min.head()
column_names = ['code', 'snumb', 'year', 'month', 'day', 'mint', 'durc', 'quality']
meteo_min.columns = column_names

# Extract the four columns year, month, day and mint from the original data, where mint is the minimum temperature; their column indices are 2, 3, 4 and 5
meteo_min = meteo_min.iloc[:, [2,3,4,5]]
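
The loading, renaming and column-selection steps above can be tried without the original file. The following is a minimal sketch that reads a tiny in-memory CSV instead of `./data/meteo_min.csv`; the two data rows and the placeholder header names `c0` to `c7` are invented for illustration only.

```python
import io

import pandas as pd

# Toy CSV text standing in for ./data/meteo_min.csv (assumption: the values are made up)
csv_text = """c0,c1,c2,c3,c4,c5,c6,c7
A,1,1968,7,1,12.3,0,1
A,1,1968,7,2,,0,1
"""

meteo_min = pd.read_csv(io.StringIO(csv_text), sep=",", header=0)

# Rename all eight columns, then keep year, month, day and mint (positions 2 to 5)
meteo_min.columns = ["code", "snumb", "year", "month", "day", "mint", "durc", "quality"]
meteo_min = meteo_min.iloc[:, [2, 3, 4, 5]]

print(meteo_min)
```

An empty field in the CSV is read as NaN, which is how a missing observation ends up as a null value in the `mint` column.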

# Add a new column, date, holding the date in YYYY-MM-DD format
meteo_min['date'] = [np.datetime64(dt.datetime(meteo_min.iloc[i,0], meteo_min.iloc[i, 1], meteo_min.iloc[i, 2])) ...

# Extract the data for the analysis period, defined as 1968-7-1 to 2018-6-30. The extracted data are stored in the variable temp_min
# define the start date as a datetime variable
start_date = dt.datetime(year = 1968, month = 7, day = 1)
# define the end date as a datetime variable
end_date ...

# Plot the minimum temperature over the extracted period. plt.ylabel uses LaTeX syntax to display "℃" in the y-axis label
# plot the temporal change of the minimum temperature
fg = plt.figure()
plt.plot(temp_min.mint, lw = 0.5, color = 'gray')
plt.xlabel('Date')
plt.ylabel("Minimum te...
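
Because the code for these three steps is cut off in the source, here is a minimal sketch of one way to carry them out. It is not the author's exact code: the toy data, the use of `pd.to_datetime` on the year/month/day columns, and the shortened end date are assumptions made for illustration.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for meteo_min (assumption: the real data come from the CSV file)
days = pd.date_range("1968-06-25", "1968-07-10", freq="D")
meteo_min = pd.DataFrame({
    "year": days.year,
    "month": days.month,
    "day": days.day,
    "mint": np.linspace(10.0, 17.5, len(days)),
})

# Build the date column from the year/month/day columns and use it as the index
meteo_min["date"] = pd.to_datetime(meteo_min[["year", "month", "day"]])
meteo_min = meteo_min.set_index("date")

# Select the analysis period by label slicing on the DatetimeIndex (both ends inclusive)
start_date = dt.datetime(year=1968, month=7, day=1)
end_date = dt.datetime(year=1968, month=7, day=5)  # shortened for the toy data
temp_min = meteo_min.loc[start_date:end_date]

# Plot the series; the y label uses matplotlib mathtext (LaTeX-like) for the degree sign
fig, ax = plt.subplots()
ax.plot(temp_min.index, temp_min["mint"], lw=0.5, color="gray")
ax.set_xlabel("Date")
ax.set_ylabel(r"Minimum temperature ($^\circ$C)")
```

With the date as the index, later steps can select rows directly by date, which is what the remaining examples rely on.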

# Check whether the data contain missing values and, if so, print the total number of rows with missing values
# check how many null values in this data
print('The number of null values is {0}'.format(sum(temp_min.mint.isna())))

Output: The number of null values is 20

# Get the row indices of these 20 missing rows
# get the indices of the null values in the data
na_index = temp_min.index[temp_min.mint.isna()]
print(na_index)

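Output: DatetimeIndex(['1983-09-23', '1994-02-27', '1994-11-23', '1995-05-09', '1999-12-12', '2000-01-21', '2000-09-01', '2000-11-09', '2000...

# Count which years contain missing values and the number of missing rows in each year
na_index.year.value_counts()

Output:
2000    6
2011    4
2003    2
1994    2
1983    1
2002    1
2001    1
1999    1
1995    1
2017    1
Name: date, dtype: int64

# Count which months contain missing values and the number of missing rows in each month
na_index.month.value_counts()

Output:
2     6
12    5
11    2
9     2
3     2
6     1
5     1
1     1
Name: date, dtype: int64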
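
The counting and locating steps above can be reproduced on a small series. This is a minimal sketch on invented data; the series `mint` stands in for `temp_min.mint` and contains three gaps spread over two years.

```python
import numpy as np
import pandas as pd

# Toy daily series with three gaps (assumption: stands in for temp_min.mint)
idx = pd.date_range("1999-12-30", periods=8, freq="D", name="date")
mint = pd.Series([1.0, np.nan, 2.0, 3.0, np.nan, np.nan, 4.0, 5.0], index=idx)

# Total number of missing values
n_missing = int(mint.isna().sum())

# Dates (index labels) of the missing rows
na_index = mint.index[mint.isna()]

# Number of missing rows per year and per month
per_year = na_index.year.value_counts()
per_month = na_index.month.value_counts()

print(n_missing)
print(na_index)
print(per_year)
print(per_month)
```

Tallies like these show quickly whether gaps are isolated or clustered in particular years or seasons, which matters when choosing a filling method.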

# Select two example periods from the missing data; the result shows the start dates of the two periods
na_index_sub = na_index[[0, 16]]
print(na_index_sub)

Output: DatetimeIndex(['1983-09-23', '2011-02-06'], dtype='datetime64[ns]', name='date', freq=None)

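# Starting from the first day of each missing period, extract a window of days before and after it; here the window is 5 days before and 5 days after, 11 days in total including the missing day
# get a block of data that contains null values
mint_na_subs = []
for ech in na_index_sub:
    # start index of the block, changed to timestamp datatype
    vs_e...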
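
Since the loop body is cut off in the source, the following minimal sketch shows one way to extract such a window. The helper `extract_block` and the toy series are hypothetical and are not taken from the original code.

```python
import numpy as np
import pandas as pd


def extract_block(series, center, days_before=5, days_after=5):
    """Return the part of a daily series around `center` (hypothetical helper)."""
    center = pd.Timestamp(center)
    start = center - pd.Timedelta(days=days_before)
    end = center + pd.Timedelta(days=days_after)
    # Label slicing on a DatetimeIndex includes both end points
    return series.loc[start:end]


# Toy daily series with one missing day (assumption: stands in for temp_min.mint)
idx = pd.date_range("1983-09-01", "1983-10-15", freq="D", name="date")
mint = pd.Series(np.arange(len(idx), dtype=float), index=idx)
mint.loc["1983-09-23"] = np.nan

block = extract_block(mint, "1983-09-23")
print(block)
```

Slicing by date rather than by row position keeps the window correct even if some calendar days are absent from the index.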

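# Replace each missing value with the previous day's observation. Block #0 is the first example (a single missing row); block #1 is the second example (several consecutive missing rows)
# fill the missing values with the temperature of previous day
for i in range(0, len(mint_na_subs)):
    res = pd.DataFrame()
    data = mint_na_subs[i]
    filled_...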

# Replace each missing value with the next day's observation
# fill the missing values with the temperature of next day
for i in range(0, len(mint_na_subs)):
    res = pd.DataFrame()
    data = mint_na_subs[i]
    filled_data...
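
The previous-day and next-day fills described above correspond to forward and backward filling in Pandas. This is a minimal sketch on invented data; it uses `Series.ffill()` and `Series.bfill()`, which may differ from the calls in the truncated original.

```python
import numpy as np
import pandas as pd

# Toy block with three consecutive missing days (assumption: values are made up)
idx = pd.date_range("2011-02-04", periods=7, freq="D", name="date")
data = pd.Series([12.0, 13.0, np.nan, np.nan, np.nan, 16.0, 17.0], index=idx)

res = pd.DataFrame({
    "original": data,
    "prev_day": data.ffill(),  # carry the last valid observation forward
    "next_day": data.bfill(),  # carry the next valid observation backward
})
print(res)
```

For a run of consecutive gaps, every missing day receives the same copied value, which is why these fills suit isolated gaps better than long ones.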

# Replace missing values with the mean of the 5 rows before and the 5 rows after the missing row
# fill the missing values with the mean of the block data
for i in range(0, len(mint_na_subs)):
    res = pd.DataFrame()
    data = mint_na_subs[i]
    filled_data...

# Replace missing values with the median of the 5 rows before and the 5 rows after the missing row
# fill the missing values with the median of the block data
for i in range(0, len(mint_na_subs)):
    res = pd.DataFrame()
    data = mint_na_subs[i]
    filled_dat...
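
Filling with a block statistic can be sketched as follows. The data are invented, and the block is assumed to be the window around the gap; `Series.mean()` and `Series.median()` skip NaN by default, so the statistic is computed from the observed values only.

```python
import numpy as np
import pandas as pd

# Toy block with one missing day and one unusually warm day (assumption: made-up values)
idx = pd.date_range("1983-09-20", periods=6, freq="D", name="date")
data = pd.Series([10.0, 11.0, 12.0, np.nan, 20.0, 12.0], index=idx)

filled_mean = data.fillna(data.mean())      # mean of the observed values: 13.0
filled_median = data.fillna(data.median())  # median of the observed values: 12.0

res = pd.DataFrame({"original": data, "mean": filled_mean, "median": filled_median})
print(res)
```

The median is less affected by a single extreme value in the window, as the difference between the two filled values shows.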

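# Estimate the missing values with different interpolation methods: m0, linear interpolation; m1, time-based interpolation; m2, second-order spline interpolation; m3, second-order polynomial interpolation; m4, bidirectional interpolation. In the output, "od" denotes the original data, "fd" denotes the gap-filled data, and m0-m4 denote the five interpolation methods above
# filling the missing values with interpolation function
for i in range(0, len(mint_na_subs)):
    res = pd.DataFrame()
    data = mint_na_subs[i]
    filled_data_0...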
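
The interpolation step can be sketched with `Series.interpolate`. The example below is a minimal sketch on invented data and covers only three of the five variants. Treating "bidirectional" as linear interpolation with `limit_direction='both'` is an assumption, since the original call is cut off. The spline and polynomial variants (`method='spline'` or `method='polynomial'` with `order=2`) need SciPy and are left out here.

```python
import numpy as np
import pandas as pd

# Toy daily block with a leading gap and a two-day interior gap (assumption: made-up values)
idx = pd.date_range("2011-02-01", periods=6, freq="D", name="date")
od = pd.Series([np.nan, 10.0, np.nan, np.nan, 16.0, 18.0], index=idx)

# m0: linear interpolation on row position; gaps before the first valid value stay NaN
m0 = od.interpolate(method="linear")

# m1: interpolation weighted by the time stamps (same result as m0 for evenly spaced days)
m1 = od.interpolate(method="time")

# m4 (assumed meaning): linear interpolation that also fills gaps at the start of the block
m4 = od.interpolate(method="linear", limit_direction="both")

res = pd.DataFrame({"od": od, "fd_m0": m0, "fd_m1": m1, "fd_m4": m4})
print(res)
```

Unlike copying a neighbouring day, interpolation gives each day of a consecutive gap its own value along the trend between the surrounding observations.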

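# Fill the gaps with meteorological data from the same period of a reference year; this method also works with data for the same period from a reference or neighbouring station. The block #1 example is used: the data containing the gap are stored in the variable data and cover 15 days before and 15 days after the first missing row, 31 days in total
# Extract the 31 rows containing the missing data and store them in the variable data
# fill missing values based on a reference year
na_index_sub_exp = na_index_sub[1]
# get a block of data that contains null values
# start index of the block, ...

# Extract the 31 rows for the same period of the reference year and store them in the variable ref_data
# get a block of data from a reference year with the same time periods
syear = 2010
eyear = 2010
smonth = data.index[0].month
sday = data.index[0].day
em...

# Plot the two data sets for comparison: same months and days, but different years
# have a look at the data from different source
fg = plt.figure(figsize = (12,6))
plt.subplot(2,1,1)
plt.plot(ref_data)
plt.xlabel('Date')
plt.ylabel("Minimum tempera...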

# Combine the two data sets into a new pandas.DataFrame for statistical analysis; "ref" is the reference-year data and "cur" is the current-year data (containing the gap). Only the first 10 rows are shown here
# combine the data from both years
comb_data = pd.DataFrame({"ref":ref_data.values, "cur":data.values})
# a short view of the data
comb_data.head(10)

# Descriptive statistics of the combined data
comb_data.describe()

Output:
             ref        cur
count  31.000000  27.000000
mean   16.480645  15.174074
std     2.839181   3.283243
min    10.300000   9.40000...
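
The reference-year workflow above (extract the block, extract the same calendar period from another year, combine, describe) can be sketched end to end. The series below is synthetic, the gap dates are chosen for illustration, and shifting the dates with `Timestamp.replace(year=...)` is an assumption rather than the original approach; it would fail for a block that starts or ends on 29 February.

```python
import numpy as np
import pandas as pd

# Synthetic two-year daily series (assumption: a sine curve stands in for real temperatures)
idx = pd.date_range("2010-01-01", "2011-12-31", freq="D", name="date")
day_of_year = idx.dayofyear.to_numpy()
mint = pd.Series(10.0 + 8.0 * np.sin(2 * np.pi * day_of_year / 365.0), index=idx)
mint.loc["2011-02-06":"2011-02-09"] = np.nan  # four consecutive missing days (toy gap)

# Block in the current year: 15 days before and 15 days after the first missing day
gap_start = pd.Timestamp("2011-02-06")
data = mint.loc[gap_start - pd.Timedelta(days=15):gap_start + pd.Timedelta(days=15)]

# Same calendar period in the reference year
ref_year = 2010
ref_start = data.index[0].replace(year=ref_year)
ref_end = data.index[-1].replace(year=ref_year)
ref_data = mint.loc[ref_start:ref_end]

# Combine by position (the two indices differ by one year) and summarise
comb_data = pd.DataFrame({"ref": ref_data.values, "cur": data.values})
stats = comb_data.describe()
print(stats)
```

The `count` row of `describe()` excludes NaN, so the difference between the two counts equals the number of missing days in the current-year block.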

# Fill the gaps with the ratio method. The idea is to compute the ratio between the two observed series, defined here as R = ref/cur; a missing value in cur is then estimated as ref/R
# filling the missing values based on the mean ratio of ref and cur
# get the ratios
ratios = comb_data['ref']/comb_data['cur']
# plot the ratios
fg = plt.figure()
plt.p...

# Combine the original data with the gap-filled data, then print and compare them
new_data = pd.DataFrame({"original": data, "filled":filled_data})
print(new_data.iloc[10:20,:])

Output:
            original  filled
date
2011-02-01...
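
The ratio method can be sketched as follows, assuming (as the comment in the code suggests) that a single mean ratio over the days observed in both series is used. The numbers are invented so that the result is easy to check by hand.

```python
import numpy as np
import pandas as pd

# Toy reference-year and current-year values for the same calendar days (made-up data)
idx = pd.date_range("2011-02-05", periods=5, freq="D", name="date")
ref = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0], index=idx)
cur = pd.Series([5.0, 6.0, np.nan, np.nan, 9.0], index=idx)

# Ratio R = ref / cur on the days where both values exist; NaN days are skipped by mean()
ratios = ref / cur
mean_ratio = ratios.mean()

# Estimate cur as ref / R and use the estimate only where cur is missing
estimate = ref / mean_ratio
filled_data = cur.fillna(estimate)

new_data = pd.DataFrame({"original": cur, "filled": filled_data})
print(new_data)
```

Here the mean ratio is 2.0, so the two missing days are filled with 14.0/2.0 = 7.0 and 16.0/2.0 = 8.0. A ratio becomes unstable when values in `cur` are close to zero, which can happen with temperatures in ℃.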

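Comparing the different approaches, we conclude that no gap-filling method is perfect. For isolated missing values, a mean or median fill, or copying the observation of the previous or next day, can be used; for consecutive missing values, data for the same period from a reference or neighbouring station (or year) can be used with the ratio method or the difference method. For more complex cases of missing meteorological data, however, the method should be chosen according to the actual situation. The results above are for reference only.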

If you have additions or suggestions for this document, please leave a comment. Thank you!
