Implementing a String-Decoding Function as a Hive UDF
Implementing a Hive UDF is actually fairly straightforward: extend the UDF class and implement an evaluate() method. The code is shown below.
import java.net.URLDecoder;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;

@Description(name = "decoder_url",
        value = "_FUNC_(url [,code][,count]) - decode a URL string count times, using code as the encoding scheme. "
                + "If count is not given, the url will be decoded 2 times; "
                + "if code is not given, GBK is used.")
public class UDFDecoderUrl extends UDF {

    private String url = null;
    private int times = 2;       // default: decode twice
    private String code = "GBK"; // default encoding

    public UDFDecoderUrl() {
    }

    // Full form: url, encoding and decode count are all given.
    public String evaluate(String urlStr, String srcCode, int count) {
        if (urlStr == null) {
            return null;
        }
        if (count <= 0) {
            return urlStr;
        }
        if (srcCode != null) {
            code = srcCode;
        }
        url = urlStr;
        times = count;
        for (int i = 0; i < times; i++) {
            url = decoder(url, code);
        }
        return url;
    }

    // url and encoding given; decode the default number of times.
    public String evaluate(String urlStr, String srcCode) {
        if (urlStr == null) {
            return null;
        }
        url = urlStr;
        code = srcCode;
        return evaluate(url, code, times);
    }

    // url and decode count given; use the default encoding.
    public String evaluate(String urlStr, int count) {
        if (urlStr == null) {
            return null;
        }
        if (count <= 0) {
            return urlStr;
        }
        url = urlStr;
        times = count;
        return evaluate(url, code, times);
    }

    // Only the url given; use the default encoding and decode count.
    public String evaluate(String urlStr) {
        if (urlStr == null) {
            return null;
        }
        url = urlStr;
        return evaluate(url, code, times);
    }

    private String decoder(String urlStr, String code) {
        if (urlStr == null || code == null) {
            return null;
        }
        try {
            urlStr = URLDecoder.decode(urlStr, code);
        } catch (Exception e) {
            return null;
        }
        return urlStr;
    }
}
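Before loading the jar into Hive, the overload resolution and the double-decode default can be sanity-checked locally with a small driver like the one below (a minimal sketch; the test class name and sample inputs are only illustrative and not part of the original code):

public class UDFDecoderUrlTest {
    public static void main(String[] args) {
        UDFDecoderUrl udf = new UDFDecoderUrl();
        // Doubly-encoded input: the default two passes turn "%2520" into "%20" and then into a space.
        System.out.println(udf.evaluate("a%2520b"));            // -> "a b" (GBK, 2 passes)
        // Explicit encoding and a single pass.
        System.out.println(udf.evaluate("a%20b", "UTF-8", 1));  // -> "a b"
        // Null input is returned as null.
        System.out.println(udf.evaluate(null));                 // -> null
    }
}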
Then add the following to the org.apache.hadoop.hive.ql.exec.FunctionRegistry class:
registerUDF("decoder_url", UDFDecoderUrl.class, false);
Rebuild Hive, or have Hive pick the function up through a configuration file, so that functions added later only need to be declared in the configuration file once and for all.
The UDFDecoderUrl class above has to be packaged into a jar and loaded into Hive; add the following to the configuration file (e.g. hive-site.xml) so the jar is loaded:
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///opt/hive/sohu/hive-udf-0.0.1.jar</value>
  <description>These JAR files are available to all users for all jobs</description>
</property>
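Once the jar is on the aux path and the function is registered (a restart of the Hive session is typically needed for the setting to take effect), decoder_url can be called in HiveQL like any built-in function, for example select decoder_url(url, 'UTF-8', 1) from access_log (the table and column names here are only placeholders).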
