paulwong

          Install Hadoop in the AWS cloud

          1. get the Whirr tar file
            wget http://www.eu.apache.org/dist/whirr/stable/whirr-0.8.2.tar.gz
          2. untar the Whirr tar file and change into the extracted directory
            tar -vxf whirr-0.8.2.tar.gz
            cd whirr-0.8.2
          3. create the credentials file
            mkdir ~/.whirr
            cp conf/credentials.sample ~/.whirr/credentials
          4. add the following content to the credentials file
            # Set cloud provider connection details
            PROVIDER=aws-ec2
            IDENTITY=<AWS Access Key ID>
            CREDENTIAL=<AWS Secret Access Key>
          5. generate an RSA key pair
            ssh-keygen -t rsa -P ''
          6. create a hadoop.properties file and add the following content
            whirr.cluster-name=whirrhadoopcluster
            whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,2 hadoop-datanode+hadoop-tasktracker
            whirr.provider=aws-ec2
            whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
            whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
            whirr.hadoop.version=1.0.2
            whirr.aws-ec2-spot-price=0.08
          7. launch the Hadoop cluster
            bin/whirr launch-cluster --config hadoop.properties
          8. launch the proxy
            cd ~/.whirr/whirrhadoopcluster/
            ./hadoop-proxy.sh
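          9. add iptables rules that allow inbound TCP traffic from any address on the web UI ports
            0.0.0.0/0 50030   (JobTracker web UI)
            0.0.0.0/0 50070   (NameNode web UI)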
          10. check the web UI in the browser
            http://<aws-public-dns>:50030
          11. add the following line to /etc/profile
            export HADOOP_CONF_DIR=~/.whirr/whirrhadoopcluster/
          12. check that Hadoop works
            hadoop fs -ls /
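
The properties file from step 6 can also be generated and sanity-checked by a small script before launching. This is only a minimal sketch: the helper names write_hadoop_properties and check_properties, the temporary output path, and the choice of "required" keys are assumptions made for illustration and are not part of Whirr.

```shell
#!/bin/sh
# Sketch: write the step-6 hadoop.properties file and check it before launch.
# write_hadoop_properties and check_properties are hypothetical helper names.

write_hadoop_properties() {
    # $1 = output path, $2 = cluster name, $3 = number of worker nodes
    # \$ keeps ${sys:user.home} literal so that Whirr, not the shell, expands it.
    cat > "$1" <<EOF
whirr.cluster-name=$2
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,$3 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.private-key-file=\${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=\${sys:user.home}/.ssh/id_rsa.pub
whirr.hadoop.version=1.0.2
whirr.aws-ec2-spot-price=0.08
EOF
}

check_properties() {
    # $1 = path to the properties file; returns non-zero if a key is missing.
    # The list of keys treated as required here is an assumption of this sketch.
    for key in whirr.cluster-name whirr.instance-templates whirr.provider \
               whirr.private-key-file whirr.public-key-file; do
        if ! grep -q "^${key}=" "$1"; then
            echo "missing: ${key}" >&2
            return 1
        fi
    done
}

# Example run: write to a temporary location so that nothing is overwritten.
props="${TMPDIR:-/tmp}/hadoop.properties.example"
write_hadoop_properties "$props" whirrhadoopcluster 2
check_properties "$props" && echo "ok: $props"
```

To use it for real, point the output path at hadoop.properties inside the Whirr directory and then run the launch command from step 7.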

          posted on 2013-09-08 13:45 paulwong  Categories: HADOOP, AWS
