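Building CDH 6.3.2 on an Intranet over HTTP, with Tuning for Selected Components
Cloudera_Manager_6.3.2 Installation and Configuration Guide
1. Preparation
Cloudera Manager (CM) manages a CDH 6 cluster: it installs and configures nodes and services, and its web UI makes the Hadoop configuration more visible and cluster parameters less complicated to set. The CM installation in this guide is planned as follows:
CM installation plan
Machines: 192.168.1.170 master01, 192.168.1.171 master02, 192.168.1.172 master03
OS: CentOS 7.6
Kernel: Linux 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
JDK: 1.8.0_231
ntpd server: 192.168.1.170
cloudera-scm-server node: 192.168.1.170
cloudera-scm-agent nodes: 192.168.1.[170-172]
Cloudera package HTTP directory:
Parcel file: CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel
Parcel download URL: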
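1.1 Set up the HTTP service for the new cluster
The HTTP service is used to distribute parcels once CM is installed.
1. Install the httpd service
yum install httpd -y
2. Start the httpd service
systemctl start httpd.service
3. Enable start on boot
systemctl enable httpd.service
4. Test
Copy the CDH files into the /var/www/html directory.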
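Before pointing CM at the HTTP directory, it can help to check that the copied files look complete. The sketch below is only an illustration: the function name check_parcel_dir is made up here, and it assumes the directory should contain at least one *.parcel file together with a manifest.json, which is what Cloudera Manager reads from a remote parcel repository.

```shell
#!/bin/bash
# check_parcel_dir DIR: report whether DIR looks like a usable parcel directory.
# Assumption: a usable directory holds at least one *.parcel file plus manifest.json.
check_parcel_dir() {
    local dir=$1 missing=0
    if ! ls "$dir"/*.parcel >/dev/null 2>&1; then
        echo "missing: *.parcel"
        missing=1
    fi
    if [ ! -f "$dir/manifest.json" ]; then
        echo "missing: manifest.json"
        missing=1
    fi
    # Exit status 0 means nothing is missing
    return $missing
}
```

Call it with the directory the files were copied to under /var/www/html; it prints nothing and returns 0 when both kinds of file are present.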
2. System setup
CM is installed as root throughout. Use a tool such as Xshell to log in to every machine as root.
2.1 File sync script hosts_sync.sh
#!/bin/bash
# Usage: this script needs expect. It is installed through yum below, or by hand with
#   [rpm -ivh tcl-8.5.13-8.el7.x86_64.rpm] [rpm -ivh expect-5.45-14.el7_1.x86_64.rpm]
#
# Reference: sh hosts_sync.sh [hosts_file_path] [the same passwd] [synchronized_file] [-target_path]
#   hosts_file_path         : suggested value is '/etc/hosts'
#   the same passwd         : the password shared by every machine
#   synchronized_file       : suggested value is '/home/'
#   -target_path (optional) : suggested value is '/home/targetDir'

# Install expect if it is missing
if ! rpm -q expect &>/dev/null; then
    if yum install -y expect >/dev/null; then
        echo "expect install success!"
    else
        echo "expect install failure!"
        exit 1
    fi
fi

# Check the arguments
if [ $# -lt 3 ]; then
    echo "Not enough arguments!"
    exit 1
elif [ ! -e "$1" ] || [ ! -e "$3" ]; then
    echo "File does not exist!"
    exit 1
fi

# Read the host IP addresses from the file
n=0
for host in $(awk '{print $1}' "$1")
do
    ((n++))
    # Skip the first two entries (in /etc/hosts these are normally the localhost lines)
    if [ $n -gt 2 ]; then
        count=0
        # Ping the host up to four times before giving up
        while [ $count -le 3 ]
        do
            if ping -c1 -w5 "$host" >/dev/null 2>&1; then
                # Parent directory of the file to sync
                pdir=$(cd -P "$(dirname "$3")"; pwd)
                if [ -z "$4" ]; then
                    targetDir=$pdir
                else
                    targetDir=$4
                fi
                # Name of the file to sync
                fname=$(basename "$3")
                # Create the target directory on the host, then copy the file
                expect -c "
                    spawn ssh $host
                    expect {
                        \"yes/no\" {send \"yes\r\"; exp_continue}
                        \"*assword\" {send \"$2\r\"; exp_continue}
                        \"root@*\" {send \"test -d $targetDir || mkdir -p $targetDir\r exit\r\"; exp_continue}
                    }
                    spawn scp -r $pdir/$fname $host:$targetDir
                    expect {
                        \"*assword\" {send \"$2\r\"; exp_continue}
                    }"
                # Report a failed sync
                if [ $? -ne 0 ]; then
                    echo "$host sync failure!"
                fi
                break
            else
                ((count++))
            fi
        done
        if [ $count -gt 3 ]; then
            echo "$host ping is failure!"
        fi
    fi
done
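2.1.1 Sync list ip.txt
192.168.1.170
192.168.1.171
192.168.1.172
2.1.2 Script argument 2: the shared password
2.1.3 Script argument 3: the file to sync
2.1.4 Script argument 4 [optional]: the target directory. When empty, the parent directory of the synced file is used; otherwise the given directory is used
sh hosts_sync.sh ip.txt [shared password] [file to sync] [target directory]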
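To see which hosts a given list file would actually reach, a dry run can be useful. The following is a sketch only, with a made-up function name plan_sync: it repeats the host selection of hosts_sync.sh (first column of the list, first two entries ignored, target defaulting to the file's parent directory) and prints the scp command for each host instead of connecting.

```shell
#!/bin/bash
# plan_sync HOSTS_FILE FILE [TARGET_DIR]
# Print one "scp" line per host that hosts_sync.sh would copy to, without connecting.
plan_sync() {
    local hosts_file=$1 file=$2 target=$3
    local pdir fname n=0 host
    pdir=$(cd -P "$(dirname "$file")" && pwd) || return 1
    fname=$(basename "$file")
    # Same default as the script: fall back to the file's parent directory
    [ -n "$target" ] || target=$pdir
    for host in $(awk '{print $1}' "$hosts_file"); do
        n=$((n + 1))
        # The script ignores the first two entries of the list
        [ "$n" -gt 2 ] || continue
        echo "scp -r $pdir/$fname $host:$target"
    done
}
```

Running plan_sync against the list file first shows whether every intended machine appears in the output. A list without two leading entries to skip (for example one holding only the three cluster IPs) would reach just the last host.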
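2.2 Install JDK 1.8+ system-wide
CM depends on JDK 1.8+, so every machine must have a system-wide JDK 1.8+ with its environment variables configured.
2.2.1 Run rpm -qa|grep java to check whether a JDK is already installed
2.2.2 Run echo $PATH to check whether the JDK install path is in the environment variables
2.2.3 If a JDK of the required version is already installed and its path is on PATH, skip the steps below. Otherwise uninstall the current JDK and install the required one.
Run rpm -e --nodeps [JDK package name] (the package name found in step 2.2.1)
2.2.4 Upload jdk-8u231-linux-x64.rpm to the /opt directory.
2.2.5 Run rpm -ivh jdk-8u231-linux-x64.rpm to install the package
rpm -ivh jdk-8u231-linux-x64.rpm
2.2.6 The rpm install places the JDK under /usr/java/.
2.2.7 Run vim /etc/profile to edit the global environment variables
export JAVA_HOME=/usr/java/jdk1.8.0_231-amd64
export PATH=$PATH:$JAVA_HOME/bin
2.2.8 Run source /etc/profile to apply the environment variables
source /etc/profile
2.2.9 Run echo $PATH to check the environment variables, and java -version to check the JDK version
2.2.10 Sync the rpm package to the other machines
sh hosts_sync.sh ip.txt [shared password] /opt/jdk-8u231-linux-x64.rpm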
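The version check in step 2.2.9 can be scripted so that it is easy to repeat on every machine. This is a minimal sketch with illustrative function names (java_version, jdk_matches); it assumes the usual first line of java -version output, which looks like java version "1.8.0_231".

```shell
#!/bin/bash
# java_version: read `java -version` output on stdin and print the version string.
java_version() {
    sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# jdk_matches EXPECTED: succeed when the java on PATH reports exactly EXPECTED.
jdk_matches() {
    local expected=$1 actual
    # `java -version` writes to stderr, so merge it into stdout first
    actual=$(java -version 2>&1 | java_version)
    [ "$actual" = "$expected" ]
}
```

For example, jdk_matches 1.8.0_231 || echo "unexpected JDK" flags a machine where the install step was missed or a different JDK comes first on PATH.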
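2.3 Use the same root password on every machine
To keep the CM installation simple, give root the same password on every machine to be installed. In this guide all machines share the root password "pwd". If the passwords differ, change them with the passwd command and reboot once the change succeeds.
2.4 Configure the hostname and static DNS names (sync to all machines with the sync script)
2.4.1 Configure hosts on every machine in the cluster. Command: vim /etc/hosts
Note: do not use special characters such as _ (underscore) in a hostname
2.4.2 Run vi /etc/hosts to configure the static DNS names, appending the following to the end of the hosts file:
192.168.1.170 master01
192.168.1.171 master02
192.168.1.172 master03
2.4.3 Sync hosts
sh hosts_sync.sh ip.txt [shared password] /etc/hosts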
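The rule about special characters in hostnames can be checked mechanically before the file is synced. The sketch below uses a made-up function name, bad_hostnames, and treats anything other than letters, digits, hyphens, and dots as a special character, which is an assumption about what the note is meant to exclude.

```shell
#!/bin/bash
# bad_hostnames HOSTS_FILE: print every host name in the file that contains
# a character other than letters, digits, hyphens, or dots.
bad_hostnames() {
    awk '
        /^[[:space:]]*#/ || NF < 2 { next }     # skip comments and lines without a name
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break             # rest of the line is a comment
                if ($i !~ /^[A-Za-z0-9.-]+$/) print $i
            }
        }
    ' "$1"
}
```

Running bad_hostnames /etc/hosts prints nothing when every name passes.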
2.5 Disable SELinux and the firewall (sync to all machines with the sync script)
Every machine in a CM installation must have SELinux and the firewall turned off, so run the steps below on each machine.
2.5.1 Run vi /etc/selinux/config and set SELINUX= to disabled in the config file; this turns SELinux off permanently
2.5.2 Check the status: /usr/sbin/sestatus -v
2.5.3 Turn SELinux off temporarily: setenforce 0
2.5.4 Stop the firewall: systemctl stop firewalld
2.5.5 Disable the firewall on boot: systemctl disable firewalld
2.5.6 Check the firewall status: service firewalld status
vi /etc/selinux/config
SELINUX=disabled
/usr/sbin/sestatus -v
setenforce 0
systemctl stop firewalld
systemctl disable firewalld
service firewalld status
2.5.7 Sync with the script
sh hosts_sync.sh ip.txt [shared password] /etc/selinux/config
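The manual edit of /etc/selinux/config can also be done non-interactively. This is a small sketch, assuming GNU sed (as shipped with CentOS 7); the function name disable_selinux is illustrative, and the optional file argument exists only so the change can be tried on a copy first.

```shell
#!/bin/bash
# disable_selinux [FILE]: rewrite the SELINUX= line of FILE (default /etc/selinux/config).
disable_selinux() {
    local conf=${1:-/etc/selinux/config}
    # Anchor on "SELINUX=" so the SELINUXTYPE= line is left alone
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
}
```

The file change only applies from the next boot, so setenforce 0 is still needed for the running system.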
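2.6 Configure passwordless SSH from the master node
2.6.1 Generate a key pair on the master node
ssh-keygen -t rsa
2.6.2 Copy the public key to the agent nodes with the sync script
cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
sh hosts_sync.sh ip.txt [shared password] /root/.ssh/authorized_keys
2.6.3 Verify that passwordless SSH login works (check every machine once)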
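One way to run the verification in 2.6.3 against all machines at once is sketched below. The function name check_ssh and the SSH_BIN override are illustrative; the ssh options BatchMode and ConnectTimeout are standard OpenSSH client options, and BatchMode=yes is what makes a missing key show up as a failure rather than a password prompt.

```shell
#!/bin/bash
# check_ssh HOST...: try a non-interactive login to each host and report the result.
# SSH_BIN can be overridden (for example with a stub when testing); it defaults to ssh.
check_ssh() {
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password
        if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
                </dev/null >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
            failed=1
        fi
    done
    return $failed
}
```

For this cluster the call would be check_ssh master01 master02 master03, run from the master node.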
2.7 Install the MySQL database
2.7.1 Install MariaDB
yum install mariadb-server
2.7.2 Configure MariaDB
systemctl start mariadb  # start the service
systemctl enable mariadb  # start the service on boot
2.7.3 Initialize
whereis mysql_secure_installation
/usr/bin/mysql_secure_installation
2.7.4 Configure the database
mysql -uroot -pxxxx
mysql> show databases; # list the databases
mysql> use mysql; # switch to the mysql database
mysql> select User, Host from user;
mysql> update user set Host='%' where User='root'; # allow remote login
mysql> flush privileges; # reload privileges
mysql> create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci; # Hive metadata database
mysql> create database amon DEFAULT CHARSET utf8 COLLATE utf8_general_ci; # monitor metadata database
mysql> grant all on hive.* to root@"%" Identified by "xxxx"; # grant all tables in the hive database to root
mysql> grant all on amon.* to root@"%" Identified by "xxxx"; # grant all tables in the monitor database to root
mysql> flush privileges; # reload privileges
mysql> quit; # exit
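When more service databases are needed later, typing the same create/grant pair by hand becomes repetitive. The sketch below generates the statements instead; the function name cm_db_sql is made up, and the statements simply mirror the ones used above (utf8 charset, grant to root@'%'), with IF NOT EXISTS added so a rerun does not fail.

```shell
#!/bin/bash
# cm_db_sql PASSWORD DB...: print the CREATE DATABASE / GRANT statements for each DB.
cm_db_sql() {
    local password=$1 db
    shift
    for db in "$@"; do
        echo "CREATE DATABASE IF NOT EXISTS $db DEFAULT CHARSET utf8 COLLATE utf8_general_ci;"
        echo "GRANT ALL ON $db.* TO 'root'@'%' IDENTIFIED BY '$password';"
    done
    echo "FLUSH PRIVILEGES;"
}
```

The output can be reviewed first and then piped into the client, for example cm_db_sql xxxx hive amon | mysql -uroot -p.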
2.8 Configure NTP time synchronization
2.8.1 Install the ntp service
yum install -y ntp
2.8.2 Configure the ntp service
service ntpd start  # start the service
chkconfig ntpd on  # start the service on boot
2.8.3 Configure ntp.conf on the server: vim /etc/ntp.conf
# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).
driftfile /var/lib/ntp/drift
# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default nomodify notrap nopeer noquery
# Permit all access over the loopback interface.  This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict ::1
# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
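# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst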
server 192.168.1.170
#broadcast 192.168.1.255 autokey # broadcast server
#broadcastclient  # broadcast client
#broadcast 224.0.1.1 autokey  # multicast server
#multicastclient 224.0.1.1  # multicast client
#manycastserver 239.255.254.254  # manycast server
#manycastclient 239.255.254.254 autokey # manycast client
# Enable public key cryptography.
#crypto
includefile /etc/ntp/crypto/pw
# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys
# Specify the key identifiers which are trusted.
#trustedkey 4 8 42
# Specify the key identifier to use with the ntpdc utility.
#requestkey 8
# Specify the key identifier to use with the ntpq utility.
#controlkey 8
# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats
# Disable the monitoring facility to prevent amplification attacks using ntpdc
# monlist command when default restrict does not include the noquery flag. See
# CVE-2013-5211 for more details.
# Note: Monitoring will not be disabled with the limited restriction flag.
disable monitor
2.8.4 Configure ntp.conf on the clients: vim /etc/ntp.conf
# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).
driftfile /var/lib/ntp/drift
# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default nomodify notrap nopeer noquery
# Permit all access over the loopback interface.  This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict ::1
# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
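# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 192.168.1.170 # the NTP server to synchronize with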
#broadcast 192.168.1.255 autokey # broadcast server
#broadcastclient  # broadcast client
#broadcast 224.0.1.1 autokey  # multicast server
#multicastclient 224.0.1.1  # multicast client
#manycastserver 239.255.254.254  # manycast server
#manycastclient 239.255.254.254 autokey # manycast client
# Enable public key cryptography.
#crypto
includefile /etc/ntp/crypto/pw
# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys
# Specify the key identifiers which are trusted.
#trustedkey 4 8 42
# Specify the key identifier to use with the ntpdc utility.
#requestkey 8
# Specify the key identifier to use with the ntpq utility.
#controlkey 8
# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats
# Disable the monitoring facility to prevent amplification attacks using ntpdc
# monlist command when default restrict does not include the noquery flag. See
# CVE-2013-5211 for more details.
# Note: Monitoring will not be disabled with the limited restriction flag.
disable monitor
2.8.5 Distribute the client configuration
sh hosts_sync.sh ip.txt [shared password] /etc/ntp.conf
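After ntpd has been restarted on the clients, it is worth confirming that each one has actually locked onto 192.168.1.170. The sketch below parses ntpq -p output, where ntpd marks the peer it is currently synchronized to with a leading "*"; the function name ntp_sync_peer is illustrative.

```shell
#!/bin/bash
# ntp_sync_peer: read `ntpq -p` output on stdin and print the peer that ntpd
# has selected as its sync source (the line whose first character is "*").
# Exits non-zero when no peer has been selected yet.
ntp_sync_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 } END { exit !found }'
}
```

Typical use is ntpq -p | ntp_sync_peer. Selection can take several minutes after ntpd starts, so an empty result right after a restart is not necessarily a fault.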
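2.9 Configure the repo
2.9.1 Create the .repo file
[CDH-5.8.0]
name=CDH Version - CDH-5.8.0
# IP of the HTTP service from section 1.1
baseurl=http://192.168.1.170/CDH/5.8.0/
gpgcheck=0
gpgkey=http://192.168.1.170/CDH/5.8.0/RPM-GPG-KEY-cloudera
enabled=1
priority=1
2.9.2 Distribute the .repo file
sh hosts_sync.sh ip.txt [shared password] [path to the .repo file]
2.9.3 Reload the yum sources
yum clean all  # clear the current yum cache
yum update  # update through yum
yum makecache  # cache the current yum configuration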
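yum expects baseurl and gpgkey values to be full URLs including the scheme, and a value that is only a host and path is an easy mistake to make when pointing at an intranet server. The sketch below lists such lines in a .repo file; the function name repo_urls_without_scheme is made up for this example.

```shell
#!/bin/bash
# repo_urls_without_scheme FILE: print baseurl/gpgkey lines whose value does not
# start with a scheme yum understands (http://, https://, ftp://, file://).
repo_urls_without_scheme() {
    grep -E '^[[:space:]]*(baseurl|gpgkey)[[:space:]]*=' "$1" |
        grep -Ev '=[[:space:]]*(https?|ftp|file)://'
}
```

No output means every URL in the file carries a scheme. Any line that is printed should be corrected before running yum makecache.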
3 Install Cloudera
3.1 Install the CM server ...
yum install cloudera*  # install the rpm packages (server, daemons, ...) through yum from the configured repo
3.2 Put the MySQL driver in the shared directories
cp /opt/mysql-connector-java-5.1.47.jar /opt/mysql-connector-java.jar
cp /opt/mysql-connector-java.jar /usr/share/java/
cp /opt/mysql-connector-java.jar /opt/cloudera/cm/lib/
3.3 Create the CM server database
/opt/cloudera/cm/schema/scm_prepare_database.sh -h mysqlIP -P 3306 mysql scm scm scm
# Parameter notes
# -h: Database host
# --scm-host: SCM server's hostname
3.4 Start the server
service cloudera-scm-server start  # start the server
systemctl enable cloudera-scm-server  # start the server on boot
netstat -anp | grep 7180  # check whether the server web port is in use (that is, whether the web service has started)
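The CM server can take a while to open port 7180 after the service starts, so polling the port saves rerunning netstat by hand. This is a sketch that relies on bash's built-in /dev/tcp redirection (it will not work under plain sh); the function name wait_for_port and the default timeout are assumptions made for the example.

```shell
#!/bin/bash
# wait_for_port HOST PORT [TIMEOUT_SECONDS]
# Poll once per second until a TCP connection to HOST:PORT succeeds.
# Returns 0 on success, 1 when the timeout (default 300 s) is reached.
wait_for_port() {
    local host=$1 port=$2 timeout=${3:-300} waited=0
    # The subshell opens the connection on fd 3 and closes it again on exit
    while ! (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

For example: wait_for_port 192.168.1.170 7180 600 && echo "CM web UI is reachable".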
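3.5 Log in to the CM admin page. The default account/password is admin/admin
3.6 Register the jar of a third-party service package
cd /opt/cloudera/csd  # go to the csd directory on the server
wget 192.168.1.170/cdh/es/ELASTICSEARCH-1.0.jar  # download the prepackaged ES registration jar from the HTTP service
service cloudera-scm-server restart  # restart the server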
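4 Cluster installation
1. Open the CM admin UI
2. Choose the free edition of CDH and continue
3. CDH introduction page
4. Specify the CDH hosts
5. Select the parcels for the cluster installation: delete the unneeded remote parcel repository URLs, then save
6. From the previous step, click Continue to distribute the agents
7. From the previous step, click Continue to start installing the selected parcel
Wait for the installation to finish, then click Continue to enter the host inspection
The host inspection reports two warnings here
Warning 1: Cloudera recommends setting /proc/sys/vm/swappiness to 10. The current setting is 30. Use the sysctl command to change the setting at runtime, and edit /etc/sysctl.conf so that it persists after a reboot. You can continue with the installation, but Cloudera Manager may report that your hosts are unhealthy because they are swapping. Running echo 10 > /proc/sys/vm/swappiness resolves this. (All hosts)
Warning 2: Transparent hugepage compaction is enabled and can cause significant performance problems.
Run the following (on all hosts):
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
to disable this setting, then add the same commands to an init script such as /etc/rc.local so that they are applied when the system reboots.
Run the host inspection again: the warnings are gone. Click Finish to move on to the component installation.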
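Both warnings can be cleared with one small function run on every host. The sketch below is an illustration only: the function name apply_host_tuning and its optional ROOT prefix are made up, the prefix exists purely so the function can be tried against a scratch directory, and persisting the hugepage settings (for example through /etc/rc.local, as the warning suggests) is left out. It assumes GNU sed.

```shell
#!/bin/bash
# apply_host_tuning [ROOT]: apply the two settings from the host inspection warnings.
# ROOT is a hypothetical path prefix used only for dry runs; leave it empty on a real host.
apply_host_tuning() {
    local root=${1:-}
    local thp=$root/sys/kernel/mm/transparent_hugepage
    local sysctl_conf=$root/etc/sysctl.conf

    # Runtime settings
    echo 10 > "$root/proc/sys/vm/swappiness"
    echo never > "$thp/defrag"
    echo never > "$thp/enabled"

    # Persist swappiness across reboots without adding a duplicate line
    if grep -q '^vm\.swappiness' "$sysctl_conf" 2>/dev/null; then
        sed -i 's/^vm\.swappiness.*/vm.swappiness = 10/' "$sysctl_conf"
    else
        echo 'vm.swappiness = 10' >> "$sysctl_conf"
    fi
}
```

On a real host it is called with no argument, as root. Rerunning it is harmless because the sysctl.conf line is replaced rather than appended a second time.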
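8. Component installation: ALL
Assign roles to the components according to the cluster's resources. ZooKeeper needs three hosts to coordinate; every other component can use the default configuration.
Fill in the connection details and run the connectivity test
Cluster settings: the defaults are fine
Installation complete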
5 Cluster tuning
5.1 Kafka access by IP
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://192.168.1.179:9092
## Set this on every broker. Config key: Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties
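Because the advertised address differs per broker, the snippet has to be adapted for each one. A small generator avoids copy-and-paste slips; the function name kafka_listener_snippet is made up here, and port 9092 is taken from the lines above as the default.

```shell
#!/bin/bash
# kafka_listener_snippet IP [PORT]: print the kafka.properties safety-valve
# lines for one broker, advertising the given IP.
kafka_listener_snippet() {
    local ip=$1 port=${2:-9092}
    echo "listeners=PLAINTEXT://0.0.0.0:$port"
    echo "advertised.listeners=PLAINTEXT://$ip:$port"
}
```

Calling kafka_listener_snippet with a broker's own IP prints the two lines to paste into that broker's safety valve.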
5.2 Kafka JMX access
-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 -XX:+DisableExplicitGC -Djava.awt.headless=true -D
## Add this on every broker. Config key: Additional Broker Java Options
5.3 HDFS data directory
/home/data/dfs/dn
## Change this before enabling HDFS HA. Config key: DataNode Data Directory, dfs.datanode.data.dir
5.4 HDFS erasure coding
No Default Erasure Coding Policy.
## Select the option above. Config key: Fallback Erasure Coding Policy, dfs.namenode.ec.system.default.policy
5.5 Spark log4j log level
log4j.logger.tfinfo=INFO, tzapp
log4j.additivity.tfinfo=false
log4j.appender.tzapp=org.apache.log4j.ConsoleAppender
log4j.appender.tzapp.target=System.out
log4j.appender.tzapp.layout=org.apache.log4j.PatternLayout
log4j.appender.tzapp.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: [codeLogMark] %m%n
log4j.appender.tzapp.Encoding=UTF-8
### Config key: Gateway Logging Advanced Configuration Snippet (Safety Valve)
5.6 Disable Spark dynamic allocation
Uncheck the option
### Config key: Enable Dynamic Allocation, spark.dynamicAllocation.enabled
5.7 Impala load balancing
1. Pick a host
yum install haproxy -y
2. vim /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
