
          This article describes how to set up a Hadoop cluster on Linux (Red Hat 9). The cluster consists of three machines, with the following hostnames and IP addresses:
          linux     192.168.35.101
          linux02   192.168.35.102
          linux03   192.168.35.103

          From the MapReduce perspective, linux acts as the master node, and linux02 and linux03 act as slave nodes.
          From the HDFS storage perspective, linux acts as the namenode, and linux02 and linux03 act as datanodes.


          One namenode machine, hostname linux, whose /etc/hosts file contains:
          127.0.0.1        linux      localhost.localdomain    localhost
          192.168.35.101   linux      linux.localdomain        linux
          192.168.35.102   linux02
          192.168.35.103   linux03

          Two datanode machines, hostnames linux02 and linux03
          > linux02's hosts file:
          127.0.0.1        linux02    localhost.localdomain    localhost
          192.168.35.102   linux02    linux02.localdomain      linux02
          192.168.35.101   linux
          192.168.35.103   linux03
          > linux03's hosts file:
          127.0.0.1        linux03    localhost.localdomain    localhost
          192.168.35.103   linux03    linux03.localdomain      linux03
          192.168.35.101   linux
          192.168.35.102   linux02

          1. Install the JDK
          > Download jdk-6u18-linux-i586-rpm.bin from java.sun.com

          > Upload the JDK installer via FTP to the /root directory on the linux machine

          > Enter the /root directory and run the following commands in turn:
          chmod 755 jdk-6u18-linux-i586-rpm.bin
          ./jdk-6u18-linux-i586-rpm.bin

          Follow the prompts through to the end and the installation will complete successfully.

          > Configure environment variables
          cd into the /etc directory, open the profile file with vi, and append the following lines to the end of the file:
          export JAVA_HOME=/usr/java/jdk1.6.0_18
          export PATH=$JAVA_HOME/bin:$PATH
          export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

          Note: the JDK must be installed on all three machines.
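          To confirm the setup on each machine, a quick sanity check is to reload the profile and print the Java version:
          source /etc/profile
          java -version
          The second command should report java version "1.6.0_18".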

          2. Set up passwordless SSH access between the master and slave machines
          It is best for all three machines to use the same account name; I simply used the root account.

          On the namenode machine linux:
          Log in to linux as root and run the following command in the /root directory:
          ssh-keygen -t rsa
          Press Enter at each prompt; this creates two files, id_rsa.pub and id_rsa, in the directory /root/.ssh/.

          Next, enter the /root/.ssh directory:
          cd .ssh

          Then copy the id_rsa.pub file over to linux02 and linux03:
          scp id_rsa.pub root@192.168.35.102:/root/.ssh/authorized_keys_01
          scp id_rsa.pub root@192.168.35.103:/root/.ssh/authorized_keys_01

          On the datanode machine linux02:
          Log in to linux02 as root and run the following command in the /root directory:
          ssh-keygen -t rsa
          Press Enter at each prompt; this creates two files, id_rsa.pub and id_rsa, in the directory /root/.ssh/.

          Next, enter the /root/.ssh directory:
          cd .ssh

          Then copy the id_rsa.pub file over to the namenode machine linux:
          scp id_rsa.pub root@192.168.35.101:/root/.ssh/authorized_keys_02

          On the datanode machine linux03:
          Log in to linux03 as root and run the following command in the /root directory:
          ssh-keygen -t rsa
          Press Enter at each prompt; this creates two files, id_rsa.pub and id_rsa, in the directory /root/.ssh/.

          Next, enter the /root/.ssh directory:
          cd .ssh

          Then copy the id_rsa.pub file over to the namenode machine linux:
          scp id_rsa.pub root@192.168.35.101:/root/.ssh/authorized_keys_03

          *******************************************************************************

          The steps above generated an RSA key pair on each of linux, linux02, and linux03, copied linux's id_rsa.pub over to linux02 and linux03, and copied linux02's and linux03's id_rsa.pub files over to linux.

          The following steps remain to be done:

          On the linux machine:
          Log in to linux as root, enter the directory /root/.ssh, and run the following commands:
          cat id_rsa.pub >> authorized_keys
          cat authorized_keys_02 >> authorized_keys
          cat authorized_keys_03 >> authorized_keys
          chmod 644 authorized_keys

          On the linux02 machine:
          Log in to linux02 as root, enter the directory /root/.ssh, and run the following commands:
          cat id_rsa.pub >> authorized_keys
          cat authorized_keys_01 >> authorized_keys
          chmod 644 authorized_keys

          On the linux03 machine:
          Log in to linux03 as root, enter the directory /root/.ssh, and run the following commands:
          cat id_rsa.pub >> authorized_keys
          cat authorized_keys_01 >> authorized_keys
          chmod 644 authorized_keys

          With this configuration in place, root on linux can now reach linux02 and linux03 over SSH without password authentication, and likewise linux02 and linux03 can connect back to linux with ssh linux.
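          A quick way to verify this is to run a remote command from linux (the very first connection may ask you to confirm the host key; after that no password prompt should appear):
          ssh linux02 hostname
          ssh linux03 hostname
          Each command should print the remote machine's hostname without asking for a password.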

          3. Install and configure Hadoop
          > Install Hadoop on the namenode machine, i.e. linux
          I downloaded hadoop-0.20.2.tar.gz, uploaded it via FTP to /root on the linux machine, and extracted it into the installation directory /usr/hadoop, so the Hadoop root directory ends up being /usr/hadoop/hadoop-0.20.2/.
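          As a minimal sketch of that step (assuming the tarball sits in /root):
          mkdir -p /usr/hadoop
          tar -xzf /root/hadoop-0.20.2.tar.gz -C /usr/hadoop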

          Edit the /etc/profile file and append the following to the end of the file:
          export HADOOP_HOME=/usr/hadoop/hadoop-0.20.2
          export PATH=$HADOOP_HOME/bin:$PATH
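          After reloading the profile, the hadoop launcher should be on the PATH; a quick check:
          source /etc/profile
          hadoop version
          The second command should report version 0.20.2.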

          > Configure Hadoop
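          The files below all live under $HADOOP_HOME/conf/. One detail that is easy to miss: the ssh sessions that Hadoop's start scripts open do not necessarily source /etc/profile, so it is safest to also set JAVA_HOME explicitly in conf/hadoop-env.sh (the distribution ships this line commented out), along these lines:
          # in /usr/hadoop/hadoop-0.20.2/conf/hadoop-env.sh
          export JAVA_HOME=/usr/java/jdk1.6.0_18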
          core-site.xml:
          <?xml version="1.0"?>
          <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

          <!-- Put site-specific property overrides in this file. -->
          <configuration>
                  <property>
                          <name>fs.default.name</name>
                          <value>hdfs://192.168.35.101:9000</value>
                  </property>
                  <property>
                          <name>hadoop.tmp.dir</name>
                          <value>/tmp/hadoop/hadoop-${user.name}</value>
                  </property>
          </configuration>

          hdfs-site.xml:
          <?xml version="1.0"?>
          <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

          <!-- Put site-specific property overrides in this file. -->
          <configuration>
                  <property>
                          <name>dfs.name.dir</name>
                          <value>/home/hadoop/name</value>
                  </property>
                  <property>
                          <name>dfs.data.dir</name>
                          <value>/home/hadoop/data</value>
                  </property>
                  <property>
                          <name>dfs.replication</name>
                          <value>2</value>
                  </property>
          </configuration>
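          Hadoop creates dfs.name.dir when the namenode is formatted and dfs.data.dir when a datanode first starts, but since these paths must be writable it does no harm to pre-create them:
          mkdir -p /home/hadoop/name    # on the namenode machine linux
          mkdir -p /home/hadoop/data    # on each of linux02 and linux03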

          mapred-site.xml:
          <?xml version="1.0"?>
          <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

          <!-- Put site-specific property overrides in this file. -->
          <configuration>
                  <property>
                          <name>mapred.job.tracker</name>
                          <value>192.168.35.101:9001</value>
                  </property>
          </configuration>

          masters (in 0.20.2 this file lists the host that runs the secondary namenode):
          192.168.35.101

          slaves (this file lists the hosts that run the datanode and tasktracker daemons):
          192.168.35.102
          192.168.35.103

          At this point, the basic Hadoop configuration is complete.

          > Deploy the Hadoop tree configured on the namenode machine to the datanode machines
          This uses the scp command for the remote copy; run the following commands in turn:
          scp -r /usr/hadoop/hadoop-0.20.2 root@192.168.35.102:/usr/hadoop/
          scp -r /usr/hadoop/hadoop-0.20.2 root@192.168.35.103:/usr/hadoop/
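          For the scp commands above to succeed, the target directory /usr/hadoop must already exist on the datanodes; thanks to the passwordless SSH configured earlier, it can be created remotely from linux:
          ssh root@192.168.35.102 "mkdir -p /usr/hadoop"
          ssh root@192.168.35.103 "mkdir -p /usr/hadoop"
          Remember to append the same JAVA_HOME/HADOOP_HOME/PATH exports to /etc/profile on linux02 and linux03 as well.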

          4. Testing
          Log in to the namenode machine linux as root and change into the directory /usr/hadoop/hadoop-0.20.2/:
          cd /usr/hadoop/hadoop-0.20.2

          > Run the namenode format
          [root@linux hadoop-0.20.2]# bin/hadoop namenode -format
          11/07/26 21:16:03 INFO namenode.NameNode: STARTUP_MSG:
          /************************************************************
          STARTUP_MSG: Starting NameNode
          STARTUP_MSG:   host = linux/127.0.0.1
          STARTUP_MSG:   args = [-format]
          STARTUP_MSG:   version = 0.20.2
          STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
          ************************************************************/
          Re-format filesystem in /home/hadoop/name ? (Y or N) Y
          11/07/26 21:16:07 INFO namenode.FSNamesystem: fsOwner=root,root,bin,daemon,sys,adm,disk,wheel
          11/07/26 21:16:07 INFO namenode.FSNamesystem: supergroup=supergroup
          11/07/26 21:16:07 INFO namenode.FSNamesystem: isPermissionEnabled=true
          11/07/26 21:16:07 INFO common.Storage: Image file of size 94 saved in 0 seconds.
          11/07/26 21:16:07 INFO common.Storage: Storage directory /home/hadoop/name has been successfully formatted.
          11/07/26 21:16:07 INFO namenode.NameNode: SHUTDOWN_MSG:
          /************************************************************
          SHUTDOWN_MSG: Shutting down NameNode at linux/127.0.0.1
          ************************************************************/

          > Start Hadoop
          [root@linux hadoop-0.20.2]# bin/start-all.sh
          starting namenode, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-namenode-linux.out
          192.168.35.102: starting datanode, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-linux02.out
          192.168.35.103: starting datanode, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-linux03.out
          192.168.35.101: starting secondarynamenode, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-secondarynamenode-linux.out
          starting jobtracker, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-jobtracker-linux.out
          192.168.35.103: starting tasktracker, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-linux03.out
          192.168.35.102: starting tasktracker, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-linux02.out
          [root@linux hadoop-0.20.2]#

          > Check the running processes with the jps command
          [root@linux hadoop-0.20.2]# jps
          7118 SecondaryNameNode
          7343 Jps
          6955 NameNode
          7204 JobTracker
          [root@linux hadoop-0.20.2]#
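          On linux02 and linux03, jps should correspondingly show a DataNode and a TaskTracker process. The cluster state can also be checked from the namenode, for example:
          [root@linux hadoop-0.20.2]# bin/hadoop dfsadmin -report
          which lists the live datanodes; the web interfaces at http://192.168.35.101:50070 (HDFS) and http://192.168.35.101:50030 (JobTracker) show the same information in a browser.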
