Installing and Configuring Hadoop (Fully Distributed Mode)
Fully distributed mode:
The previous posts covered local (standalone) mode and pseudo-distributed mode. Neither is used in real deployments, because almost nobody runs an entire Hadoop cluster on a single server (Hadoop is built around distributed computation and distributed storage; cramming it onto one machine defeats that core design). Put simply, local mode is just the Hadoop installation itself, and pseudo-distributed mode is a single-machine simulation of a cluster. (Strictly speaking it's a bit more subtle than that; I'll go into the details another time!)
The mode actually used in production is fully distributed mode:
Outline
Hostname resolution
Passwordless SSH login
Java and Hadoop environment
Edit the Hadoop configuration files
Copy the master's setup to the other nodes
Format the master node
Build process and background
Before building a fully distributed cluster, it helps to understand the following points about the Hadoop environment:
1. Hadoop's core: distributed storage and distributed computation (officially, HDFS and MapReduce).
2. Cluster structure: 1+1+n (master node + standby master + n worker nodes).
3. Hostname resolution: for convenience we edit /etc/hosts. Hadoop lists its worker nodes in .../etc/hadoop/slaves, and the names there must resolve (you can also just use IP addresses, which is even simpler).
4. Hadoop dispatches commands to the other servers over SSH, so passwordless SSH login must be configured.
5. This post uses a 1+1+3 layout with hostnames s100 (master), s10 (standby master), and s1, s2, s3 (workers).
Step 1: Configure hostname resolution
On the master, s100:
[root@localhost ~]# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.68 s100
192.168.1.108 s1
192.168.1.104 s2
192.168.1.198 s3
192.168.1.197 s10
Copy /etc/hosts from s100 to every other server in the cluster. For example, copying it from s100 to s1:
[root@localhost ~]# scp /etc/hosts root@192.168.1.108:/etc/hosts
The authenticity of host '192.168.1.108 (192.168.1.108)' can't be established.
RSA key fingerprint is dd:64:75:5f:96:11:07:39:a3:fb:aa:3c:30:ae:59:82.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.1.108' (RSA) to the list of known hosts.
root@192.168.1.108's password:
hosts                       100%  246     0.2KB/s   00:00
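The same entries and their distribution can be sketched in one short script. This is a dry run: it only prints the scp commands, since at this stage each real copy would still prompt for a password (passwordless login comes in step 2). The name/IP list is the one used above.

```shell
# Build the shared /etc/hosts block from this walkthrough's node list,
# then print the scp commands that would push it to the other nodes.
nodes="s100:192.168.1.68 s10:192.168.1.197 s1:192.168.1.108 s2:192.168.1.104 s3:192.168.1.198"

hosts_block=""
for entry in $nodes; do
    name=${entry%%:*}              # part before the colon: hostname
    ip=${entry##*:}                # part after the colon: IP address
    hosts_block="${hosts_block}${ip} ${name}
"
done
printf '%s' "$hosts_block"         # append this to /etc/hosts on s100

# Dry run: drop the "echo" to actually copy the file out.
for entry in $nodes; do
    name=${entry%%:*}
    [ "$name" = "s100" ] && continue   # skip the local machine
    echo scp /etc/hosts "root@${name}:/etc/hosts"
done
```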
Once name resolution is configured on all servers, move on to the next step.
Step 2: Configure passwordless SSH login — on the master, s100:
Generate an SSH key pair: the private key id_rsa and the public key id_rsa.pub.
[root@localhost ~]# ssh-keygen -t rsa -P ''
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a4:6e:8d:31:66:e1:92:04:37:8e:1c:a5:83:5e:39:c5 root@localhost.localdomain
The key's randomart image is:
+--[ RSA 2048]----+
| o.=. |
| o BoE |
|. =+o . . |
|. .o.o + |
| . o B S |
| = = |
| + . |
| . |
| |
+-----------------+
[root@localhost ~]# cd /root/.ssh/
[root@localhost .ssh]# ls
id_rsa id_rsa.pub known_hosts
By default the keys are stored under the current user's .ssh directory (/root/.ssh, or /home/<user>/.ssh for a regular user).
With the key pair in place, add id_rsa.pub to the authorized keys (run this in /root/.ssh):
[root@localhost .ssh]# cat id_rsa.pub >> authorized_keys
[root@localhost .ssh]# ls
authorized_keys id_rsa id_rsa.pub known_hosts
Test whether passwordless login to the local machine works:
[root@localhost .ssh]# ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 9e:e0:91:0f:1f:98:af:1a:83:5d:33:06:03:8a:39:93.
Are you sure you want to continue connecting (yes/no)? yes  (the first login asks for confirmation)
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Last login: Tue Dec 26 19:09:23 2017 from 192.168.1.156
[root@localhost ~]# exit
logout
Connection to localhost closed.
OK, no problems. To configure the other servers, all we need to do is copy s100's id_rsa.pub to each of them; ssh-copy-id is the easiest way:
[root@localhost .ssh]# ssh-copy-id root@s1   (s1 is the target hostname — worth spelling out, since people have asked me about this)
The authenticity of host 's1 (192.168.1.108)' can't be established.
RSA key fingerprint is dd:64:75:5f:96:11:07:39:a3:fb:aa:3c:30:ae:59:82.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 's1' (RSA) to the list of known hosts.
root@s1's password:
Now try logging into the machine, with "ssh 'root@s1'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
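Pushing the key to the remaining nodes can be scripted the same way. Again a dry run that only prints the commands: each real invocation asks for that node's root password once, after which logins are password-free.

```shell
# Dry run: print the ssh-copy-id invocation for every other node.
# Remove the "echo" inside the loop to run them for real.
cmds=$(for host in s10 s1 s2 s3; do
    echo ssh-copy-id "root@${host}"
done)
printf '%s\n' "$cmds"
```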
Step 3: Configure the Java environment and install Hadoop
Note: whether a node is the master, a worker, or even the standby master, its Java and Hadoop environments are identical — so configuring one server effectively configures them all.
Fully distributed mode is built on top of local mode, so set up local mode first:
fully distributed mode = local mode + configuration files
Installing Java and Hadoop is exactly the local-mode setup covered earlier, so I won't repeat it here.
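As a reminder of what the local-mode setup leaves in /etc/profile, here is a sketch of the environment variables. /data/hadoop matches the install path used later in this post; the JDK path is an assumption — point JAVA_HOME at wherever your JDK actually lives.

```shell
# Sketch of the /etc/profile additions (paths are this post's layout;
# /usr/local/jdk is a placeholder for your real JDK location).
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/data/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

After editing, run `source /etc/profile` so the current shell picks the variables up.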
Step 4: Edit the configuration files
Note: I'll write a separate post on the configuration files when I have time.
Five files need to be modified.
The Hadoop configuration files live in /data/hadoop/etc/hadoop:
[root@localhost hadoop]# cd /data/hadoop/etc/hadoop
[root@localhost hadoop]# ls
core-site.xml  hdfs-site.xml  mapred-site.xml  slaves  yarn-site.xml  (listing abridged to the relevant files)
Configure core-site.xml:
Purpose: point the default filesystem at the master.
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://s100/</value>
</property>
<!-- temporary working directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/root/hadoop</value>
</property>
</property>
</configuration>
Configure hdfs-site.xml:
Purpose: set the HDFS replication factor.
<configuration>
<!-- the default replication factor is 3; if you keep the default, slaves must list at least that many DataNodes -->
<property>
<name>dfs.replication</name>
<value>2</value> <!-- replication factor -->
</property>
<!-- standby master (secondary NameNode) -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>s10:50000</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///${hadoop.tmp.dir}/dfs/name</value>
</property>
</configuration>
Configure mapred-site.xml:
Purpose: run MapReduce on the YARN resource manager.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Configure yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>s100</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Configure slaves:
s1
s2
s3
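A quick sanity check ties the two files above together: the replication factor in hdfs-site.xml must not exceed the number of DataNodes listed in slaves, or blocks can never reach full replication. A minimal sketch (the inline list stands in for reading the real slaves file):

```shell
# dfs.replication is 2 in hdfs-site.xml above; slaves lists s1 s2 s3.
replication=2
slaves="s1
s2
s3"                                    # in practice: slaves=$(cat slaves)
slave_count=$(printf '%s\n' "$slaves" | wc -l)

if [ "$replication" -le "$slave_count" ]; then
    status=ok
else
    status=too_high
fi
echo "$status: replication=$replication datanodes=$slave_count"
```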
Step 5: Copy the Hadoop environment (configuration and all) to the other nodes
With that, the master is basically done; now push its entire setup to the other nodes (Java itself is installed directly on each node).
Copy the hadoop tree:
[root@localhost ~]# scp -r /data/hadoop root@s1:/data/
Copy the environment variables:
[root@localhost ~]# scp /etc/profile root@s1:/etc/profile
Log in to s1 and source the profile:
[root@localhost ~]# ssh s1
Last login: Wed Dec 27 23:18:48 2017 from s100
[root@localhost ~]# source /etc/profile
s1 is done; repeat the same steps for the other servers.
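The same two copies have to reach every remaining node, not just s1, so a loop helps. As before, this is a dry run that only prints the commands; drop the "echo"s to execute them (with step 2 done, no passwords are needed):

```shell
# Dry run: push the hadoop tree and /etc/profile to each remaining node.
cmds=$(for host in s10 s2 s3; do
    echo scp -r /data/hadoop "root@${host}:/data/"
    echo scp /etc/profile "root@${host}:/etc/profile"
done)
printf '%s\n' "$cmds"
```

Remember to `source /etc/profile` on each node afterwards.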
Step 6: Format the master NameNode (on Hadoop 2.x, `hdfs namenode -format` is the non-deprecated equivalent):
[root@localhost ~]# hadoop namenode -format
Start Hadoop:
start-all.sh
Stop Hadoop:
stop-all.sh
Check the running processes with jps.
Master node:
[root@localhost ~]# jps
30996 Jps
30645 NameNode
30917 ResourceManager
Standby master:
[root@localhost ~]# jps
33571 Jps
33533 SecondaryNameNode
Worker nodes:
[root@localhost ~]# jps
33720 Jps
33691 NodeManager
33630 DataNode
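If a node is missing a daemon, this small check (a hypothetical helper, shown with the master's jps output inlined) flags it — swap in `jps_out=$(jps)` and the expected daemon list for the node you are checking:

```shell
# Verify the master runs the daemons this layout expects:
# NameNode (HDFS) and ResourceManager (YARN).
jps_out="30996 Jps
30645 NameNode
30917 ResourceManager"             # in practice: jps_out=$(jps)

missing=""
for daemon in NameNode ResourceManager; do
    printf '%s\n' "$jps_out" | grep -q " ${daemon}$" || missing="$missing $daemon"
done
[ -z "$missing" ] && echo "master ok" || echo "missing:$missing"
```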