Install Hadoop 2.8.2 HA on CentOS
Assume all servers have CentOS 7 installed and all DNS records have been created on the DNS server.
Hosts
hdp-master1 - namenode zookeeper resourcemanager mapreduce.jobhistory
hdp-master2 - namenode zookeeper resourcemanager
hdp-node01 - datanode journalnode
hdp-node02 - datanode journalnode
hdp-node03 - datanode journalnode zookeeper
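If no DNS server is available, equivalent /etc/hosts entries on every machine work too. A minimal sketch (the IP addresses below are placeholders; substitute your own):
192.168.1.11 hdp-master1
192.168.1.12 hdp-master2
192.168.1.21 hdp-node01
192.168.1.22 hdp-node02
192.168.1.23 hdp-node03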
Install Java on all servers
sudo yum install java-1.8.0-openjdk-devel
JAVA_HOME is /usr/lib/jvm/java
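Optionally verify the installation; on CentOS the alternatives system keeps /usr/lib/jvm/java pointing at the installed JDK:
java -version
readlink -f /usr/lib/jvm/java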
- Create user hadoop on all servers
sudo useradd -s /bin/bash hadoop
sudo passwd hadoop
Log on as hadoop
- Create ssh key on one server
ssh-keygen -t rsa
cd ~/.ssh
cat id_rsa.pub > authorized_keys
chmod 700 *
Copy all key files to other servers
- Make sure the folder ~/.ssh exists on each server and its mode is 700.
- Use the scp command to copy the files between servers:
scp ~/.ssh/* <target server ip>:~/.ssh/
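As an optional sanity check, confirm that passwordless login now works from each server, for example:
ssh hdp-node01 hostname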
Download Hadoop and Zookeeper from Apache website on one server
http://hadoop.apache.org/releases.html
http://zookeeper.apache.org/releases.html
Unpack the Hadoop and Zookeeper files and rename the folders
tar -xzf hadoop-2.8.2.tar.gz
mv hadoop-2.8.2 hadoop
tar -xzf zookeeper-3.4.11.tar.gz
mv zookeeper-3.4.11 zookeeper
Configure Hadoop environment
Create env.sh with the following contents
#!/bin/bash
# Set Java environment
export JAVA_HOME=/usr/lib/jvm/java
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
# Set Hadoop environment
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_PREFIX=$HADOOP_HOME
export CLASSPATH=.:$HADOOP_HOME/lib:$CLASSPATH
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_ROOT_LOGGER=INFO,console
# Solve problem "WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform..."
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
# Zookeeper
export ZOOKEEPER_HOME=/home/hadoop/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Edit .bashrc and add the following line
. env.sh
Source env.sh
. env.sh
Use the scp command to copy .bashrc and env.sh to the other machines
scp ~/.bashrc <server ip>:~/
scp ~/env.sh <server ip>:~/
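On each machine, log in again (or source .bashrc) and confirm the environment is in effect; hadoop version should report 2.8.2:
. ~/.bashrc
hadoop version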
Create DFS and Zookeeper data folders
mkdir -p /home/hadoop/data/dfs
mkdir -p /home/hadoop/data/zookeeper
Zookeeper configuration files
$ cd ~/zookeeper/conf
$ cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg as below
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/hadoop/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=hdp-master1:2888:3888
server.2=hdp-master2:2888:3888
server.3=hdp-node03:2888:3888
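Each Zookeeper node also needs a myid file in the data directory whose content matches its server.N id in zoo.cfg. Create it on each of the three Zookeeper hosts (if you later copy the data folder between machines, make sure every host keeps its own myid):
echo 1 > /home/hadoop/data/zookeeper/myid   # on hdp-master1
echo 2 > /home/hadoop/data/zookeeper/myid   # on hdp-master2
echo 3 > /home/hadoop/data/zookeeper/myid   # on hdp-node03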
Modify Hadoop configuration files
Configure etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hdpcluster</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hdp-master1:2181,hdp-master2:2181,hdp-node03:2181</value>
</property>
</configuration>
Configure etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/data/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/data/dfs/data</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<!-- QJM cluster - hdpcluster -->
<property>
<name>dfs.nameservices</name>
<value>hdpcluster</value>
</property>
<property>
<name>dfs.ha.namenodes.hdpcluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdpcluster.nn1</name>
<value>hdp-master1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdpcluster.nn2</name>
<value>hdp-master2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hdpcluster.nn1</name>
<value>hdp-master1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.hdpcluster.nn2</name>
<value>hdp-master2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hdp-node01:8485;hdp-node02:8485;hdp-node03:8485/hdpcluster</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.hdpcluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
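<!-- Fencing: the first method tries to (re)start the namenode on the old
     active host, presumably so it rejoins as standby; shell(/bin/true)
     then reports success so automatic failover always proceeds. -->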
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(ssh -t $target_host '/home/hadoop/hadoop/sbin/hadoop-daemon.sh start namenode')
shell(/bin/true)</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/data/dfs/journals</value>
</property>
</configuration>
Configure etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hdp-master1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hdp-master1:19888</value>
</property>
</configuration>
Configure etc/hadoop/yarn-site.xml
<configuration>
<property>
<description>Enable RM to recover state after starting. If true, then
yarn.resourcemanager.store.class must be specified</description>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<description>The class to use as the persistent store.</description>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hdp-master1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hdp-master2</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>hdp-master1:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>hdp-master2:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hdp-master1:2181,hdp-master2:2181,hdp-node03:2181</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>hdp-master1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>hdp-master2:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>hdp-master1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>hdp-master2:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>hdp-master1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>hdp-master2:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>hdp-master1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>hdp-master2:8033</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
Assign slaves
- etc/hadoop/slaves
hdp-node01
hdp-node02
hdp-node03
Run Hadoop
Copy the Hadoop, Zookeeper, and data folders to the other machines
scp -r ~/hadoop <server-ip>:~/
scp -r ~/zookeeper <server-ip>:~/
scp -r ~/data <server-ip>:~/
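Start Zookeeper on hdp-master1, hdp-master2, and hdp-node03 before anything else; the failover controllers and resourcemanagers depend on it:
$ zkServer.sh start
$ zkServer.sh status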
Start journalnode on hdp-node01, hdp-node02, and hdp-node03.
$ hadoop-daemon.sh start journalnode
Format the file system on the first namenode (hdp-master1)
bin/hdfs namenode -format
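For an HA cluster two more one-time steps are standard before start-dfs.sh: copy the formatted metadata to the second namenode and initialize the failover state in Zookeeper:
# on hdp-master1: start the freshly formatted namenode
$ hadoop-daemon.sh start namenode
# on hdp-master2: pull the metadata over
$ hdfs namenode -bootstrapStandby
# on hdp-master1: initialize the HA znode
$ hdfs zkfc -formatZK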
Start HDFS. This starts a namenode and a DFSZKFailoverController on both hdp-master1 and hdp-master2, and a datanode on each slave node.
sbin/start-dfs.sh
Start Yarn on hdp-master1 and hdp-master2.
$ start-yarn.sh
This command starts the ResourceManager only on the current machine (plus the NodeManagers on the slaves), so run it on the standby machine as well. There is no fencing between the ResourceManagers; if one dies, it must be restarted manually.
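The host list above also places the MapReduce job history server on hdp-master1; start it there, and optionally check which resourcemanager is active:
$ mr-jobhistory-daemon.sh start historyserver
$ yarn rmadmin -getServiceState rm1
$ yarn rmadmin -getServiceState rm2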
Check services
- On master namenode
$ jps
55014 QuorumPeerMain
81693 ResourceManager
80748 DFSZKFailoverController
89391 Jps
80254 NameNode
- On datanodes
$ jps
2384 Jps
141939 DataNode
142628 NodeManager
142072 JournalNode
- Check YARN resource manager web GUI http://hdp-master1:8088
- Check the NameNode web UI (including DataNode information): http://hdp-master1:50070
To verify the HA function, kill the active namenode and resourcemanager and check the standby namenode and resourcemanager. The standby nodes become active automatically.
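A convenient way to watch the failover is to query the service state before and after killing the active node:
$ hdfs haadmin -getServiceState nn1
$ hdfs haadmin -getServiceState nn2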