Hadoop 2.7.2 + HBase 1.2.0 + ZooKeeper 3.4.8 HA (High Availability) Cluster Installation and Configuration
IP              Hostname  Software                        Processes
192.168.128.11  h1        JDK, Hadoop, HBase              NameNode, DFSZKFailoverController, HMaster
192.168.128.12  h2        JDK, Hadoop, HBase              NameNode, DFSZKFailoverController, HMaster
192.168.128.13  h3        JDK, Hadoop                     ResourceManager
192.168.128.14  h4        JDK, Hadoop                     ResourceManager
192.168.128.15  h5        JDK, Hadoop, ZooKeeper, HBase   DataNode, NodeManager, JournalNode, QuorumPeerMain, HRegionServer
192.168.128.16  h6        JDK, Hadoop, ZooKeeper, HBase   DataNode, NodeManager, JournalNode, QuorumPeerMain, HRegionServer
192.168.128.17  h7        JDK, Hadoop, ZooKeeper, HBase   DataNode, NodeManager, JournalNode, QuorumPeerMain, HRegionServer
I will not walk through the preparation work one item at a time. In short, it covers hostnames, IPs, hostname-to-IP mappings, firewall settings, passwordless SSH, and the JDK installation with its environment variables.
Install ZooKeeper on h5, h6, and h7
In /home/zookeeper-3.4.8/conf, copy zoo_sample.cfg to zoo.cfg and edit it:
cp zoo_sample.cfg zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/zookeeper-3.4.8/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=h5:2888:3888
server.2=h6:2888:3888
server.3=h7:2888:3888
Create the data directory (the dataDir from zoo.cfg) and a myid file inside it containing the number 1:
mkdir data
touch data/myid
echo 1 > data/myid
Copy the whole ZooKeeper directory to the other two nodes:
scp -r /home/zookeeper-3.4.8 h6:/home/
scp -r /home/zookeeper-3.4.8 h7:/home/
On the other two nodes, change myid to 2 and 3 respectively, for example:
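A minimal sketch, assuming the same install path on every node and working passwordless SSH:
ssh h6 "echo 2 > /home/zookeeper-3.4.8/data/myid"
ssh h7 "echo 3 > /home/zookeeper-3.4.8/data/myid"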
Install Hadoop
Edit the configuration files under /home/hadoop-2.7.2/etc/hadoop:
hadoop-env.sh:
export JAVA_HOME=/home/jdk
core-site.xml:
<configuration>
<!-- Set the HDFS nameservice to "masters" -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://masters</value>
</property>
<!-- Hadoop temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop-2.7.2/tmp</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>h5:2181,h6:2181,h7:2181</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<!-- The HDFS nameservice is "masters"; must match core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>masters</value>
</property>
<!-- The nameservice "masters" has two NameNodes: h1 and h2 -->
<property>
<name>dfs.ha.namenodes.masters</name>
<value>h1,h2</value>
</property>
<!-- RPC address of NameNode h1 -->
<property>
<name>dfs.namenode.rpc-address.masters.h1</name>
<value>h1:9000</value>
</property>
<!-- HTTP address of NameNode h1 -->
<property>
<name>dfs.namenode.http-address.masters.h1</name>
<value>h1:50070</value>
</property>
<!-- RPC address of NameNode h2 -->
<property>
<name>dfs.namenode.rpc-address.masters.h2</name>
<value>h2:9000</value>
</property>
<!-- HTTP address of NameNode h2 -->
<property>
<name>dfs.namenode.http-address.masters.h2</name>
<value>h2:50070</value>
</property>
<!-- Where the NameNode metadata (edits) is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://h5:8485;h6:8485;h7:8485/masters</value>
</property>
<!-- Local directory where each JournalNode stores its data -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop-2.7.2/journal</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Failover proxy provider: how clients locate the active NameNode -->
<property>
<name>dfs.client.failover.proxy.provider.masters</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; list multiple methods on separate lines, one per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- sshfence requires passwordless SSH -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- Timeout for the sshfence method -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
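Because sshfence logs in as root with /root/.ssh/id_rsa, h1 and h2 must be able to SSH to each other as root without a password before automatic failover can fence the old active NameNode. A quick sanity check (a sketch, assuming the key path configured above):
[root@h1 ~]# ssh -i /root/.ssh/id_rsa root@h2 "hostname"
[root@h2 ~]# ssh -i /root/.ssh/id_rsa root@h1 "hostname"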
mapred-site.xml:
<configuration>
<!-- Run MapReduce on the YARN framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- ResourceManager cluster id -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>RM_HA_ID</value>
</property>
<!-- Logical IDs for the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Hostname of each ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>h3</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>h4</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<!-- ZooKeeper quorum for RM state storage and leader election -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>h5:2181,h6:2181,h7:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
slaves:
h5
h6
h7
Then copy the Hadoop directory to the other nodes:
scp -r hadoop-2.7.2 h2:/home/   (and likewise for the other nodes)
A note here: YARN HA runs on h3 and h4, so every node from h2 through h7 needs the Hadoop directory; a loop like the sketch below can do the copy.
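A minimal sketch, assuming passwordless SSH from h1 and the same /home layout on every node:
for host in h2 h3 h4 h5 h6 h7; do
  scp -r /home/hadoop-2.7.2 $host:/home/
done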
Startup order
### Note: follow the steps below strictly in order
1. Start the ZooKeeper cluster
[root@h6 ~]# cd /home/zookeeper-3.4.8/bin/
[root@h6 bin]# ./zkServer.sh start
Do the same on h5, h6, and h7.
[root@h6 bin]# ./zkServer.sh status
Check the status.
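With all three nodes started, ./zkServer.sh status should report one leader and two followers, roughly like this (exact wording may differ slightly between versions):
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower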
2. Start the JournalNodes
[root@h5 bin]# cd /home/hadoop-2.7.2/sbin/
[root@h5 sbin]# ./hadoop-daemons.sh start journalnode
h5: starting journalnode, logging to /home/hadoop-2.7.2/logs/hadoop-root-journalnode-h5.out
h7: starting journalnode, logging to /home/hadoop-2.7.2/logs/hadoop-root-journalnode-h7.out
h6: starting journalnode, logging to /home/hadoop-2.7.2/logs/hadoop-root-journalnode-h6.out
[root@h5 sbin]# jps
2420 JournalNode
2309 QuorumPeerMain
2461 Jps
[root@h5 sbin]# ^C
3. Format HDFS
Run the following on h1:
hdfs namenode -format
Formatting generates files under the directory configured by hadoop.tmp.dir in core-site.xml.
Copy the tmp directory to h2:
[root@h1 hadoop-2.7.2]# scp -r tmp/ h2:/home/hadoop-2.7.2/
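Copying the whole tmp directory works. An alternative worth knowing (not what the author did here) is to initialize the standby's metadata on h2 with hdfs namenode -bootstrapStandby; that command pulls the fsimage from the running active NameNode over RPC, so it is normally run after the NameNode on h1 has been started:
[root@h2 hadoop-2.7.2]# hdfs namenode -bootstrapStandby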
4. Format ZK (run on h1 only)
[root@h1 hadoop-2.7.2]# hdfs zkfc -formatZK
5. Start HDFS (on h1)
[root@h1 hadoop-2.7.2]# sbin/start-dfs.sh
16/02/25 05:01:14 WARN hdfs.DFSUtil: Namenode for ns1 remains unresolved for ID null. Check your hdfs-site.xml file to ensure namenodes are configured properly.
16/02/25 05:01:14 WARN hdfs.DFSUtil: Namenode for ns2 remains unresolved for ID null. Check your hdfs-site.xml file to ensure namenodes are configured properly.
16/02/25 05:01:14 WARN hdfs.DFSUtil: Namenode for ns3 remains unresolved for ID null. Check your hdfs-site.xml file to ensure namenodes are configured properly.
Starting namenodes on [h1 h2 masters masters masters]
masters: ssh: Could not resolve hostname masters: Name or service not known
masters: ssh: Could not resolve hostname masters: Name or service not known
masters: ssh: Could not resolve hostname masters: Name or service not known
h2: starting namenode, logging to /home/hadoop-2.7.2/logs/hadoop-root-namenode-h2.out
h1: starting namenode, logging to /home/hadoop-2.7.2/logs/hadoop-root-namenode-h1.out
h5: starting datanode, logging to /home/hadoop-2.7.2/logs/hadoop-root-datanode-h5.out
h7: starting datanode, logging to /home/hadoop-2.7.2/logs/hadoop-root-datanode-h7.out
h6: starting datanode, logging to /home/hadoop-2.7.2/logs/hadoop-root-datanode-h6.out
Starting journal nodes [h5 h6 h7]
h5: journalnode running as process 2420. Stop it first.
h6: journalnode running as process 2885. Stop it first.
h7: journalnode running as process 2896. Stop it first.
Starting ZK Failover Controllers on NN hosts [h1 h2 masters masters masters]
masters: ssh: Could not resolve hostname masters: Name or service not known
masters: ssh: Could not resolve hostname masters: Name or service not known
masters: ssh: Could not resolve hostname masters: Name or service not known
h2: starting zkfc, logging to /home/hadoop-2.7.2/logs/hadoop-root-zkfc-h2.out
h1: starting zkfc, logging to /home/hadoop-2.7.2/logs/hadoop-root-zkfc-h1.out
[root@h1 hadoop-2.7.2]#
6. Start YARN (run start-yarn.sh on h3. The NameNodes and ResourceManagers are kept on separate machines for performance reasons, since both consume a lot of resources; because they are separated, each has to be started on its own machine.)
[root@h3 sbin]# ./start-yarn.sh
[root@h4 sbin]# ./yarn-daemons.sh start resourcemanager
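To confirm that ResourceManager HA came up, each RM's state can be queried with the rm-ids configured in yarn-site.xml (a quick check; which one is active depends on startup order, typically rm1 on h3 here):
[root@h3 sbin]# yarn rmadmin -getServiceState rm1
active
[root@h3 sbin]# yarn rmadmin -getServiceState rm2
standby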
Verification:
http://192.168.128.11:50070
Overview 'h1:9000' (active)
http://192.168.128.12:50070
Overview 'h2:9000' (standby)
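The same check can be done from the command line using the NameNode IDs defined in hdfs-site.xml (a quick sketch):
[root@h1 hadoop-2.7.2]# hdfs haadmin -getServiceState h1
active
[root@h1 hadoop-2.7.2]# hdfs haadmin -getServiceState h2
standby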
Upload a file:
[root@h4 bin]# hadoop fs -put /etc/profile /profile
[root@h4 bin]# hadoop fs -ls
ls: `.': No such file or directory
[root@h4 bin]# hadoop fs -ls /
Found 1 items
-rw-r--r-- 3 root supergroup 1814 2016-02-26 19:08 /profile
[root@h4 bin]#
Kill the NameNode on h1:
[root@h1 sbin]# jps
2480 NameNode
2868 Jps
2775 DFSZKFailoverController
[root@h1 sbin]# kill -9 2480
[root@h1 sbin]# jps
2880 Jps
2775 DFSZKFailoverController
[root@h1 sbin]# hadoop fs -ls /
Found 1 items
-rw-r--r-- 3 root supergroup 1814 2016-02-26 19:08 /profile
At this point h2 becomes active.
Manually start the NameNode on h1 again:
[root@h1 sbin]# ./hadoop-daemon.sh start namenode
starting namenode, logging to /home/hadoop-2.7.2/logs/hadoop-root-namenode-h1.out
Observe that h1's state is now standby.
Verify YARN:
[root@h1 sbin]# hadoop jar /home/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /profile /out
16/02/26 19:14:23 INFO input.FileInputFormat: Total input paths to process : 1
16/02/26 19:14:23 INFO mapreduce.JobSubmitter: number of splits:1
16/02/26 19:14:23 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1456484773347_0001
16/02/26 19:14:24 INFO impl.YarnClientImpl: Submitted application application_1456484773347_0001
16/02/26 19:14:24 INFO mapreduce.Job: The url to track the job: http://h3:8088/proxy/application_1456484773347_0001/
16/02/26 19:14:24 INFO mapreduce.Job: Running job: job_1456484773347_0001
16/02/26 19:14:49 INFO mapreduce.Job: Job job_1456484773347_0001 running in uber mode : false
16/02/26 19:14:49 INFO mapreduce.Job: map 0% reduce 0%
16/02/26 19:15:05 INFO mapreduce.Job: map 100% reduce 0%
16/02/26 19:15:22 INFO mapreduce.Job: map 100% reduce 100%
16/02/26 19:15:23 INFO mapreduce.Job: Job job_1456484773347_0001 completed successfully
16/02/26 19:15:23 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=2099
FILE: Number of bytes written=243781
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1901
HDFS: Number of bytes written=1470
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=13014
Total time spent by all reduces in occupied slots (ms)=13470
Total time spent by all map tasks (ms)=13014
Total time spent by all reduce tasks (ms)=13470
Total vcore-milliseconds taken by all map tasks=13014
Total vcore-milliseconds taken by all reduce tasks=13470
Total megabyte-milliseconds taken by all map tasks=13326336
Total megabyte-milliseconds taken by all reduce tasks=13793280
Map-Reduce Framework
Map input records=80
Map output records=256
Map output bytes=2588
Map output materialized bytes=2099
Input split bytes=87
Combine input records=256
Combine output records=156
Reduce input groups=156
Reduce shuffle bytes=2099
Reduce input records=156
Reduce output records=156
Spilled Records=312
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=395
CPU time spent (ms)=4100
Physical memory (bytes) snapshot=298807296
Virtual memory (bytes) snapshot=4201771008
Total committed heap usage (bytes)=138964992
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1814
File Output Format Counters
Bytes Written=1470
[root@h1 sbin]# hadoop fs -ls /
Found 3 items
drwxr-xr-x - root supergroup 0 2016-02-26 19:15 /out
-rw-r--r-- 3 root supergroup 1814 2016-02-26 19:08 /profile
drwx------ - root supergroup 0 2016-02-26 19:14 /tmp
[root@h1 sbin]#
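To inspect the wordcount result itself, the reducer output under /out can be printed; the part-r-00000 file name below is the MapReduce default for a single reducer:
[root@h1 sbin]# hadoop fs -cat /out/part-r-00000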
The Hadoop HA cluster is now set up.
Install HBase
hbase-env.sh:
export JAVA_HOME=/home/jdk
export HBASE_MANAGES_ZK=false
hbase-site.xml:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://h1:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master</name>
<value>h1:60000</value>
</property>
<property>
<name>hbase.master.port</name>
<value>60000</value>
<description>The port master should bind to.</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>h5,h6,h7</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
Note: the host and port in hbase.rootdir in $HBASE_HOME/conf/hbase-site.xml must match the host and port of fs.defaultFS in $HADOOP_HOME/conf/core-site.xml.
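Note, however, that hdfs://masters in core-site.xml is an HA nameservice with no port, while the hbase.rootdir above is pinned to h1:9000, so the HBase root does not follow a NameNode failover. A common alternative (a sketch, not what the author configured here) is to point hbase.rootdir at the nameservice and make Hadoop's core-site.xml and hdfs-site.xml visible to HBase, for example by copying them into $HBASE_HOME/conf:
<property>
<name>hbase.rootdir</name>
<value>hdfs://masters/hbase</value>
</property>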
regionservers file contents:
h5
h6
h7
Copy the HBase directory to h2, h5, h6, and h7.
Overall startup order
First bring up Hadoop HA following the startup order above,
then start HBase on h1 and h2:
./start-hbase.sh
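Running start-hbase.sh on h1 starts the HMaster there plus the HRegionServers listed in the regionservers file. For the second HMaster on h2, one common approach (an assumption about the intended setup, not spelled out by the author) is to start it explicitly, or to list h2 in $HBASE_HOME/conf/backup-masters so that a single start-hbase.sh also brings it up as a backup master:
[root@h2 bin]# ./hbase-daemon.sh start master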
Test by entering the HBase shell:
[root@h1 bin]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hbase-1.2.0/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.0, r25b281972df2f5b15c426c8963cbf77dd853a5ad, Thu Feb 18 23:01:49 CST 2016
hbase(main):001:0> esit
NameError: undefined local variable or method `esit' for #<Object:0x7ad1caa2>
hbase(main):002:0> exit
That completes the whole setup.