DataX Data Transfer: O2O
Installing and Using the DataX Tool
I. Preparing the Environment
1. Create the user, group, and directory
groupadd -g 1400 datax
useradd -g datax -u 1400 datax
mkdir /datax
chown datax:datax /datax
2. System requirements
Linux
JDK (1.8 or later; 1.8 recommended)
Python (Python 2.6.x recommended)
Apache Maven 3.x (to compile DataX)
(1) Upgrade the JDK to 1.8.0_251
Install:
rpm -ivh jdk-8u251-linux-x64.rpm
Check the version:
java -version
(2) Install Apache Maven
Extract the archive to /datax:
unzip apache-maven-3.5.2-bin.zip -d /datax
Add an mvn alias to the datax user's profile:
vi /home/datax/.bash_profile
alias mvn='/datax/apache-maven-3.5.2/bin/mvn'
Check the version:
mvn -version
alias gives a command another name; it acts on commands.
export places a variable in the environment passed to child processes; it acts on variables.
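To illustrate the export half of that distinction, here is a minimal sketch in Python 3 (an assumption; DataX itself is run with Python 2.6 in this guide). The variable name DEMO_DATAX_HOME is hypothetical. The sketch shows that a variable placed in the environment is visible to a child process, which is what export arranges in a shell. An alias, by contrast, exists only inside the shell that defined it.

```python
import os
import subprocess
import sys


def child_sees(name, value):
    """Start a child Python process with `name` set in its environment
    and return the value the child reads back for that variable."""
    env = dict(os.environ)
    env[name] = value  # roughly what `export name=value` does in a shell
    result = subprocess.run(
        [sys.executable, "-c",
         "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))",
         name],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # DEMO_DATAX_HOME is a made-up name used only for this demonstration.
    print(child_sees("DEMO_DATAX_HOME", "/datax/datax"))
```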
(3) Check the Python version: python -V
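The version checks above can be gathered into one script. The following is a rough sketch in Python 3 (an assumption, since DataX itself is run with Python 2.6 here); the function names are hypothetical. Because mvn is exposed as a shell alias rather than placed on PATH, the sketch checks it by file path instead of a PATH lookup.

```python
import os
import re
import shutil


def locate_tools(on_path=("java", "python"), explicit=None):
    """Return {tool: path or None}.

    Tools in `on_path` are looked up on PATH. `explicit` maps a tool name
    to a file path, which is reported only if that file exists; this suits
    mvn, which is defined as an alias and may not be on PATH.
    """
    found = {name: shutil.which(name) for name in on_path}
    for name, path in (explicit or {}).items():
        found[name] = path if os.path.isfile(path) else None
    return found


def parse_java_version(text):
    """Extract the quoted version string from `java -version` output.

    Note that `java -version` writes to stderr, not stdout.
    """
    match = re.search(r'version "([^"]+)"', text)
    return match.group(1) if match else None


def java_major(version):
    """Return the major release: 8 for "1.8.0_251", 11 for "11.0.2"."""
    parts = version.split(".")
    return int(parts[1]) if parts[0] == "1" else int(parts[0])


if __name__ == "__main__":
    tools = locate_tools(explicit={"mvn": "/datax/apache-maven-3.5.2/bin/mvn"})
    for name, path in sorted(tools.items()):
        print("{0}: {1}".format(name, path or "NOT FOUND"))
```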
(4) Install and configure DataX
Upload the DataX package to the /datax directory.
Extract it:
tar -zxvf
Run the self-check job:
python {YOUR_DATAX_HOME}/bin/datax.py {YOUR_DATAX_HOME}/job/job.json
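If the self-check needs to be scripted, the command line above can be assembled and run from Python. This is a sketch only: the helper names are hypothetical, and it assumes datax.py exits with a non-zero status when a job fails.

```python
import os
import subprocess


def datax_command(datax_home, job_file=None, python="python"):
    """Build the argv list for running a DataX job.

    When job_file is None, the bundled self-check job (job/job.json)
    under datax_home is used.
    """
    launcher = os.path.join(datax_home, "bin", "datax.py")
    if job_file is None:
        job_file = os.path.join(datax_home, "job", "job.json")
    return [python, launcher, job_file]


def run_datax(datax_home, job_file=None):
    """Run the job; return True if the process exits with status 0.

    Treating status 0 as success is an assumption of this sketch.
    """
    return subprocess.call(datax_command(datax_home, job_file)) == 0


if __name__ == "__main__":
    print(" ".join(datax_command("/datax/datax")))
```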
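II. DataX O2O (Oracle to Oracle) Configuration Checklist
This example uses DataX to migrate data between two different databases.
To print a configuration template, run:
python datax.py -r {Sourcedb_READER} -w {Targetdb_WRITER}
For example:
python /datax/datax/bin/datax.py -r oraclereader -w oraclewriter
View the JSON template for an Oracle-to-Oracle transfer: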
[datax@ceshi1 ~]$ python /datax/datax/bin/datax.py -r oraclereader -w oraclewriter
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
Please refer to the oraclereader document:
https://github.com/alibaba/DataX/blob/master/oraclereader/doc/oraclereader.md
Please refer to the oraclewriter document:
https://github.com/alibaba/DataX/blob/master/oraclewriter/doc/oraclewriter.md
Please save the following configuration as a json file and  use
python {DATAX_HOME}/bin/datax.py {JSON_FILE_NAME}.json
to run the job.
{
    "job":{
        "content":[
            {
                "reader":{
                    "name":"oraclereader",
                    "parameter":{
                        "column":[],
                        "connection":[
                            {
                                "jdbcUrl":[],
                                "table":[]
                            }
                        ],
                        "password":"",
                        "username":""
                    }
                },
                "writer":{
                    "name":"oraclewriter",
                    "parameter":{
                        "column":[],
                        "connection":[
                            {
                                "jdbcUrl":"",
                                "table":[]
                            }
                        ],
                        "password":"",
                        "preSql":[],
                        "username":""
                    }
                }
            }
        ],
        "setting":{
            "speed":{
                "channel":""
            }
        }
    }
}
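The oraclereader and oraclewriter settings used in the DataX job configuration file are as follows:
Host          | Source         | Target
IP            | 192.168.48.201 | 192.168.48.130
Port          | 1521           | 1521
Instance name | orcl           | orcl1
User/password | nice/nice      | hr/hr
Table name    | STUDY          | DATAX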
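The IP, port, and instance name in this list combine into the JDBC URLs used in the job file. As a small sketch (the helper name is hypothetical), the host:port:SID form of the Oracle thin-driver URL can be built like this:

```python
def oracle_thin_url(host, port, sid):
    """Return an Oracle thin-driver JDBC URL in host:port:SID form."""
    return "jdbc:oracle:thin:@{0}:{1}:{2}".format(host, port, sid)


# Values taken from the settings listed above.
SOURCE_URL = oracle_thin_url("192.168.48.201", 1521, "orcl")
TARGET_URL = oracle_thin_url("192.168.48.130", 1521, "orcl1")

if __name__ == "__main__":
    print(SOURCE_URL)
    print(TARGET_URL)
```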
Both tables have the following structure:
SQL> desc datax
 Name                              Null?    Type
 --------------------------------- -------- ------------
 STUID                             NOT NULL NUMBER(10)
 STUNAME                           NOT NULL VARCHAR2(20)
On the source host, create the JSON job file in the /datax/datax/job directory from the oraclereader-to-oraclewriter template:
vi test.json
{
    "job":{
        "content":[
            {
                "reader":{
                    "name":"oraclereader",
                    "parameter":{
                        "column":[
                            "STUID",
                            "STUNAME"
                        ],
                        "connection":[
                            {
                                "jdbcUrl":["jdbc:oracle:thin:@192.168.48.201:1521:orcl"],
                                "table":["STUDY"]
                            }
                        ],
                        "password":"nice",
                        "username":"nice"
                    }
                },
                "writer":{
                    "name":"oraclewriter",
                    "parameter":{
                        "column":[
                            "STUID",
                            "STUNAME"
                        ],
                        "connection":[
                            {
                                "jdbcUrl":"jdbc:oracle:thin:@192.168.48.130:1521:orcl1",
                                "table":["DATAX"]
                            }
                        ],
                        "password":"hr",
                        "preSql":["delete from DATAX"],
                        "username":"hr"
                    }
                }
            }
        ],
        "setting":{
            "speed":{
                "channel":"4"
            }
        }
    }
}
Notes: preSql runs before the sync and clears the DATAX table. channel sets the degree of parallelism; the job reports an error if it is omitted. JSON has no comment syntax, so keep notes like these out of the file itself.
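The job file can also be generated instead of typed by hand, which guarantees that the result is valid JSON. The following is a minimal sketch in Python 3 (an assumption); the function names and the layout of the source and target dictionaries are choices made for this example, not DataX conventions. Only keys that appear in the template are emitted.

```python
import json


def build_o2o_job(source, target, columns, channel="4", pre_sql=()):
    """Return a DataX job dict for an oraclereader -> oraclewriter transfer.

    `source` and `target` are dicts with the keys jdbc_url, table,
    username, and password (a layout invented for this sketch).
    """
    reader = {
        "name": "oraclereader",
        "parameter": {
            "column": list(columns),
            # In the template, the reader's jdbcUrl is a list of URLs.
            "connection": [{"jdbcUrl": [source["jdbc_url"]],
                            "table": [source["table"]]}],
            "username": source["username"],
            "password": source["password"],
        },
    }
    writer = {
        "name": "oraclewriter",
        "parameter": {
            "column": list(columns),
            # In the template, the writer's jdbcUrl is a single string.
            "connection": [{"jdbcUrl": target["jdbc_url"],
                            "table": [target["table"]]}],
            "username": target["username"],
            "password": target["password"],
            "preSql": list(pre_sql),
        },
    }
    return {"job": {"content": [{"reader": reader, "writer": writer}],
                    "setting": {"speed": {"channel": channel}}}}


def write_job(job, path):
    """Serialize the job dict to `path` as indented JSON."""
    with open(path, "w") as handle:
        json.dump(job, handle, indent=4)


if __name__ == "__main__":
    job = build_o2o_job(
        {"jdbc_url": "jdbc:oracle:thin:@192.168.48.201:1521:orcl",
         "table": "STUDY", "username": "nice", "password": "nice"},
        {"jdbc_url": "jdbc:oracle:thin:@192.168.48.130:1521:orcl1",
         "table": "DATAX", "username": "hr", "password": "hr"},
        ["STUID", "STUNAME"],
        pre_sql=["delete from DATAX"],
    )
    print(json.dumps(job, indent=4))
```

Calling write_job(job, "/datax/datax/job/test.json") would then save the file to the job directory used in this guide.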
Running the JSON job file with Python produces the following output:
[datax@shuaige job]$ python /datax/datax/bin/datax.py /datax/datax/job/test.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2021-03-11 11:11:15.356 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2021-03-11 11:11:15.364 [main] INFO  Engine - the machine info  =>
osInfo: Oracle Corporation 1.8 25.251-b08
jvmInfo: Linux amd64 2.6.32-358.el6.x86_64
cpu num: 2
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME                    | allocation_size                | init_size
PS Eden Space                  | 256.00MB                       | 256.00MB
Code Cache                     | 240.00MB                       | 2.44MB
Compressed Class Space         | 1,024.00MB                     | 0.00MB
PS Survivor Space              | 42.50MB                        | 42.50MB
PS Old Gen                     | 683.00MB                       | 683.00MB
Metaspace                      | -0.00MB                        | 0.00MB
2021-03-11 11:11:15.387 [main] INFO  Engine -
{
"content":[
{
"reader":{
"name":"oraclereader",
"parameter":{
"column":[
"STUID",
"STUNAME"
],
"connection":[
{
"jdbcUrl":[
"jdbc:oracle:thin:@192.168.48.201:1521:orcl"
],
"table":[
"STUDY"
]
}
],
"password":"****",
"username":"nice"
}
},
"writer":{
"name":"oraclewriter",
"parameter":{
"column":[
"STUID",
"STUNAME"
],
"connection":[
{
"jdbcUrl":"jdbc:oracle:thin:@192.168.48.130:1521:orcl1",
"table":[
"DATAX"
]
}
],
"password":"**",
"preSql":[
"delete from DATAX"
],
"username":"hr"
}
}
}
