hive常⽤sql语句整理
1.查看建表语句
show create table table_name;
2.刷新表
refresh table_name;
3.表重命名
alter table table_name rename to new_table_name;
4.增加列 可以增加多列,⽤,隔开
alter table table_name add columns (col_de string comment 'department name');
5.修改字段
alter table table_name change col_old_name col_new_name string comment 'the new name is string'
6.调整字段位置
alter table table_name change col_old_name col_new_name string comment 'the new name is string' after col_1 7.删除字段 表中有a b c 字段,删除 c
alter table table_name replace columns ( a int, b int);
⼀般新增⼀列于调整字段位置⼀起配合使⽤
7 查看hive表信息
desc formatted provision_digital.provision_digital_store_rank
8.array<string> 类型的数据查询问题
只取第⼀条,⽹上搜到的例⼦这样查询可以
select id,id[0] from table;
实际查询没有作⽤,不⽣效,查出来的结果还是⼀个数组
select id,id[0],size(id) from table;
-- 或者
select id,id[0] from table where size(id) > 0;
可以实现想要的效果。
9.建表语句
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY(col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...)
[SORTED BY(col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION hdfs_path]
10.查看hive表hdfs⽂件
hdfs dfs -ls hdfs://xxx/xxx/xxxx/xxxx/xxxx/xxxx
11.分组排序 row_number()
select name,count,row_number() over (partition by name order by count) rn from xx.t_xxx_id -- 分组排序后取第⼀条
select * from (
select name,count,row_number() over (partition by name order by count) rn from xx.t_xxx_id ) t where rn = 1;
12.跳序排序 rank()
select name,count, rank() over (order by count desc) as rank from xxx.t_table_xxx;
-- 如果两个排名第2,位于第三排名的序号为4
13.⼀⾏转多⾏
-- array_id 为array<string> 字段类型,或者字符串⽤逗号拼接
select name,array_id from xxx.t_table_xxx;
-- array<string>
select select name,id from (
select name,array_id from xxx.t_table_xxx
) t LATERAL VIEW explode(t.array_id) array_id as id
;
-- 字符串,先⽤ split将字段分隔为数组,再⽤上⾯的⽅法
select name,split(array_id,',') from xxx.t_table_xxx;
14,字段类型转 cast
cast(count1 as int )
15.多列转多⾏
源数据
select id,mobile1,mobile2,mobile3 x
转换
select a.id,b.idList,b.value from (
select id,mobile1,mobile2,mobile3 x
) a lateral view explode (map('mobile1',mobile1
,'mobile2',mobile2
,'mobile3',mobile3)) b as idList, value
16.多⾏转多列
select brand,
kv1['mobile1'] mobile1,
kv1['mobile2'] mobile2,
kv1['mobile3'] mobile3
from (
select id, str_to_map(concat_ws(',', collect_set(concat(idList, '-', value))),',','-') kv1 from (
select id, idList, value,brand
x
) tmp
group by brand
)
t
17. jsonarray jsonobject
不使⽤⾃定义udf函数,⽤hive collect_set、concat_ws,concat函数collect_set 和collect_list 两个都可以⽤,set 会去重
增加字段的sql语句concat 作为字符串拼接函数,如
concat("[",xxxxx,"]") xxxxx
输出的结果就是 [xxxxxx]
concat_ws 指定字符串直接拼接的连接符
concat_ws("-",xx)
输出结果,xx-xx-xx
⽰例代码:
-- 输出格式 ["xxxx","xxxx"]
select id
, concat("[",concat_ws(",",collect_set(xxxx)),"]") xxxx from (
select id
, concat("\"",xxxx,"\"") xxxx
x
) t group by id
;
-- 输出格式 ["{xxxx}","{xxxx}"]
select id
, concat("[",concat_ws(",",collect_set(xxxx)),"]") xxxx from (
select id
,
concat("{",xxxx,"}") xxxx
x
) t group by id
;
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。
发表评论