Writing a custom Hive UDTF (one row in, many rows out)
2022/8/4 23:24:40
This article walks through a custom Hive UDTF, a function that takes one input row and emits several output rows, using a small worked example.
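Example requirements
Split a string column on a given delimiter and output each piece as its own row (one row in, many rows out).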
Java implementation
package udtf;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

import java.util.ArrayList;
import java.util.List;

public class MyExplode extends GenericUDTF {

    // Declares the output schema: a single string column named "user".
    @Override
    public StructObjectInspector initialize(StructObjectInspector argOIs) throws UDFArgumentException {
        List<String> columnNames = new ArrayList<String>();
        columnNames.add("user");

        List<ObjectInspector> objectInspectors = new ArrayList<ObjectInspector>();
        objectInspectors.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);

        return ObjectInspectorFactory.getStandardStructObjectInspector(columnNames, objectInspectors);
    }

    // Called once per input row: args[0] is the string to split, args[1] the delimiter.
    @Override
    public void process(Object[] args) throws HiveException {
        // Emit nothing for NULL arguments instead of failing with a NullPointerException.
        if (args[0] == null || args[1] == null) {
            return;
        }
        String str = args[0].toString();
        String split = args[1].toString();

        // String.split treats the delimiter as a regular expression.
        String[] strings = str.split(split);
        for (String s : strings) {
            // Each forward() call emits one output row with a single column.
            ArrayList<String> list = new ArrayList<String>();
            list.add(s);
            forward(list);
        }
    }

    // No resources to release.
    @Override
    public void close() throws HiveException {
    }
}
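The heart of process() is a split-and-emit loop, and it can be tried out without a Hive installation. The following is a minimal sketch under stated assumptions: the class name SplitSketch and the helper explode are hypothetical and do not appear in the original code or in Hive. It also differs from the UDTF in one respect: it wraps the delimiter in Pattern.quote so that it is matched literally, whereas the UDTF passes the delimiter straight to String.split, where it is interpreted as a regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of Hive or of the original article.
class SplitSketch {

    // Returns one element per output row, mirroring the loop in process().
    // Pattern.quote makes the delimiter literal, so "|" or "." behave as plain characters.
    static List<String> explode(String value, String delimiter) {
        List<String> rows = new ArrayList<>();
        if (value == null || delimiter == null) {
            return rows; // no rows for NULL input
        }
        for (String piece : value.split(Pattern.quote(delimiter))) {
            rows.add(piece);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(explode("zs_ls_ww", "_")); // prints [zs, ls, ww]
        System.out.println(explode("a|b|c", "|"));    // prints [a, b, c]
    }
}
```

With an unquoted "|", String.split would treat the delimiter as a regex alternation rather than a literal pipe, so quoting is worth considering if the UDTF needs to accept arbitrary delimiters.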
Hive shell session
hive (default)> create temporary function myexplode as "udtf.MyExplode" using jar "hdfs://node1:9000/hive_function-1.0-SNAPSHOT.jar";
Added [/tmp/10de4466-6601-49b1-b749-8b5c8c2809b2_resources/hive_function-1.0-SNAPSHOT.jar] to class path
Added resources: [hdfs://node1:9000/hive_function-1.0-SNAPSHOT.jar]
OK
Time taken: 5.442 seconds
hive (default)> create table a(name string);
OK
Time taken: 1.046 seconds
hive (default)> insert into table a values("zs_ls_ww"),("ww_ml_wb");
hive (default)> select myexplode(name, "_") from a;
OK
user
zs
ls
ww
ww
ml
wb
Time taken: 1.138 seconds, Fetched: 6 row(s)