⾯试题:如何⽤sqoop将hive中分区表的分区字段导⼊到MySQL中问题分析:
1. hive中分区表其底层就是HDFS中的多个⽬录下的单个⽂件,hive导出数据本质是将HDFS中的⽂件导出
mysql面试题sql2. hive中的分区表,因为分区字段(静态分区)不在⽂件中,所以在sqoop导出的时候,⽆法将分区字段进⾏直接导出
思路:在hive中创建⼀个临时表,将分区表复制过去后分区字段转换为普通字段,然后再⽤sqoop将tmp表导出即实现需求
步凑如下:
⽂章⽬录
1.创建⽬标表(分区表)
hive>CREATE TABLE`dept_partition`(
`deptno`int,
`dname` string,
`loc` string)
PARTITIONED BY(`month` string)row format delimited fields terminated by'\t';
1.1查看表结构
hive>show create table dept_partition;
+----------------------------------------------------+--+
|                  createtab_stmt                  |
+----------------------------------------------------+--+
|CREATE TABLE`dept_partition`(|
|`deptno`int,|
hive datediff函数
asp例子|`dname` string,|
|`loc` string)|
| PARTITIONED BY(|
|`month` string)
2.导⼊数据
hive>load data inpath '/user/hive/hive_db/'into table dept_partition;
电机未来发展趋势
10 ACCOUNTING 1700
20 RESEARCH 1800
30 SALES 1900
40 OPERATIONS 1700
3.查询表dept_partition
hive>select*from dept_partition;
结构体sizeof
+------------------------+-----------------------+---------------------+-----------------------+--+
| dept_partition.deptno  | dept_partition.dname  | dept_partition.loc  | h|
+------------------------+-----------------------+---------------------+-----------------------+--+
|10| ACCOUNTING            |1700|2019-10-19|
|20| RESEARCH              |1800|2019-10-19|
|30| SALES                |1900|2019-10-19|
|40| OPERATIONS            |1700|2019-10-19|
|10| ACCOUNTING            |1700|2019-10-20|
|20| RESEARCH              |1800|2019-10-20|
|30| SALES                |1900|2019-10-20|
|40| OPERATIONS            |1700|2019-10-20|
+------------------------+-----------------------+---------------------+-----------------------+--+
4.创建临时表 tmp_dept_partition
hive>create table tmp_dept_partition as select*from dept_partition;
5.查询临时表
hive>select*from tmp_dept_partition;
+----------------------------+---------------------------+-------------------------+---------------------------+--+
| tmp_dept_partition.deptno  | tmp_dept_partition.dname  | tmp_dept_partition.loc  | tmp_h| +----------------------------+---------------------------+-------------------------+---------------------------+--+
|10| ACCOUNTING                |1700|2019-10-19|
|20| RESEARCH                  |1800|2019-10-19|
|30| SALES                    |1900|2019-10-19|
|40| OPERATIONS                |1700|2019-10-19|
|10| ACCOUNTING                |1700|2019-10-20|
|20| RESEARCH                  |1800|2019-10-20|
|30| SALES                    |1900|2019-10-20|
|40| OPERATIONS                |1700|2019-10-20|
+----------------------------+---------------------------+-------------------------+---------------------------+--+
6.查看表结构(这个时候分区表已经转换为⾮分区表了)
hive>show create table tmp_dept_partition;
+----------------------------------------------------+--+
|                  createtab_stmt                  |
+----------------------------------------------------+--+
|CREATE TABLE`tmp_dept_partition`(|
|`deptno`int,|
|`dname` string,|
表单制作怎么居中
|`loc` string,|
|`month` string)
7.MySQL中建表 dept_partition
mysql>drop table if exists dept_partition;
create table dept_partition(
`deptno`int,
`dname`varchar(20),
`loc`varchar(20),
`month`varchar(50))
8.使⽤sqoop导⼊到MySQL
bin/sqoop export \
--connect jdbc:mysql://hadoop01:3306/partitionTb \
--username root \
--password 123456 \
--table dept_partition \
--num-mappers 1 \
--export-dir /user/hive/warehouse/hive_db.db/tmp_dept_partition \ --input-fields-terminated-by "\001"
8.Mysql查询验证是否成功导出
mysql>select*from dept_partition;
+--------+------------+------+------------+
| deptno | dname      | loc  |month|
+--------+------------+------+------------+
|10| ACCOUNTING |1700|2019-10-19|
|20| RESEARCH  |1800|2019-10-19|
|30| SALES      |1900|2019-10-19|
|40| OPERATIONS |1700|2019-10-19|
|10| ACCOUNTING |1700|2019-10-20|
|20| RESEARCH  |1800|2019-10-20|
|30| SALES      |1900|2019-10-20|
|40| OPERATIONS |1700|2019-10-20|
+--------+------------+------+------------+

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。