Hive UDF Programming

2015-08-16 17:25
  • Write a class that extends org.apache.hadoop.hive.ql.exec.UDF

Add an evaluate method to the class.

"evaluate" should never be a void method. However, it can return "null" if needed.

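The UDF base class does not declare evaluate, so the compiler will not complain if the method is missing or returns void. The following is a minimal sketch of a check you could run before packaging. It uses plain reflection and has no Hive dependency. EvaluateCheck, GoodUdf and BadUdf are hypothetical names used only for illustration, and the check only approximates the rule quoted above, not Hive's own method resolution.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class EvaluateCheck {

    /** Returns the public methods named "evaluate" that do not return void. */
    static List<Method> usableEvaluateMethods(Class<?> udfClass) {
        List<Method> found = new ArrayList<>();
        for (Method m : udfClass.getMethods()) {
            if (m.getName().equals("evaluate") && m.getReturnType() != void.class) {
                found.add(m);
            }
        }
        return found;
    }

    // Hypothetical stand-ins for UDF classes (they do not extend Hive's UDF).
    static class GoodUdf {
        public String evaluate(String s) {
            return s == null ? null : s.trim();
        }
    }

    static class BadUdf {
        public void evaluate(String s) {
            // void return type: violates the rule above
        }
    }

    public static void main(String[] args) {
        System.out.println(usableEvaluateMethods(GoodUdf.class).size()); // prints 1
        System.out.println(usableEvaluateMethods(BadUdf.class).size());  // prints 0
    }
}
```

A class with no usable evaluate method would show up here as an empty list.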
 

 

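package com.cloudyhadoop.bigdata.udf;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;

// StringUtils is assumed to be the commons-lang 2.x class; the original listing omits its imports
import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class UDFLastDay extends UDF {
    private final SimpleDateFormat inputFormatter = new SimpleDateFormat("yyyy-MM-dd");
    private final SimpleDateFormat outFormatter = new SimpleDateFormat("yyyy-MM-dd");

    private final Calendar calendar = Calendar.getInstance();

    // Reused across calls so that no new Text object is allocated per row
    private final Text result = new Text();

    // 2015-03-01 ==> 2015-03-31
    public Text evaluate(Text input) {
        if (null == input || StringUtils.isBlank(input.toString())) {
            return null;
        }

        try {
            calendar.setTime(inputFormatter.parse(input.toString()));
            // Largest day-of-month value for the parsed month
            int lastDate = calendar.getActualMaximum(Calendar.DATE);
            calendar.set(Calendar.DATE, lastDate);

            result.set(outFormatter.format(calendar.getTime()));
            return result;
        } catch (ParseException e) {
            // Unparseable input yields NULL instead of failing the query
            e.printStackTrace();
            return null;
        }
    }
}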
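The date arithmetic inside evaluate can be tried without a Hive installation. The sketch below is an alternative, not the code above: it assumes Java 8 or later and uses java.time instead of SimpleDateFormat and Calendar. LastDayOfMonth and lastDay are hypothetical names. Note one behavioral difference: LocalDate.parse is strict, so inputs such as "2015-3-1" or a date followed by a time return null here, while the lenient SimpleDateFormat would accept them.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.time.temporal.TemporalAdjusters;

final class LastDayOfMonth {

    /** Maps a yyyy-MM-dd string to the last day of its month, or null if the input is blank or invalid. */
    static String lastDay(String input) {
        if (input == null || input.trim().isEmpty()) {
            return null;
        }
        try {
            // LocalDate.parse expects the ISO format yyyy-MM-dd
            LocalDate date = LocalDate.parse(input.trim());
            return date.with(TemporalAdjusters.lastDayOfMonth()).toString();
        } catch (DateTimeParseException e) {
            // Mirror the UDF: bad input becomes null instead of an exception
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(lastDay("2015-03-01")); // prints 2015-03-31
        System.out.println(lastDay("2016-02-10")); // prints 2016-02-29
        System.out.println(lastDay("not a date")); // prints null
    }
}
```

To use this inside the UDF, evaluate would call lastDay(input.toString()) and wrap a non-null result in the reused Text object.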

 

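  • Package the class into a jar and copy it to a directory on the Linux host, for example: /home/hadoop/software/lib/udf.jar
  • How do you add the UDF to Hive so that it can be used?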

Option 1 (valid for the current session only):

add jar /home/hadoop/software/lib/udf.jar ;

create temporary function getLastDay as 'com.cloudyhadoop.bigdata.udf.UDFLastDay';

 

show functions;

select empno, ename, hiredate, getLastDay(hiredate) last_day from emp;

Option 2 (the jar is loaded globally):

Add the following configuration to hive-site.xml:

<property>

<name>hive.aux.jars.path</name>

<value>file:///home/hadoop/software/lib/udf.jar</value>

</property>

 

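Once Hive has been started with this setting, you no longer need to run add jar /home/hadoop/software/lib/udf.jar ; but the function itself must still be created in each session: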

create temporary function getLastDay as 'com.cloudyhadoop.bigdata.udf.UDFLastDay';

 

temporary: the function exists only in the current session and is lost after you exit or restart Hive.

 

How can the function itself be made globally available?

1. Create a permanent function with CREATE FUNCTION; see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/DropFunction

CREATE FUNCTION [db_name.]function_name AS class_name

  [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];

2. Modify the Hive source code and register the function in FunctionRegistry:

https://github.com/cloudera/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java

registerUDF("getLastDay", UDFLastDay.class, false);

Then recompile and redeploy Hive.

 

 
