paulwong

          Install Hadoop in the AWS cloud

          1. download the Whirr tar file
            wget http://www.eu.apache.org/dist/whirr/stable/whirr-0.8.2.tar.gz
          2. untar the Whirr tar file and change into the extracted directory
            tar -vxf whirr-0.8.2.tar.gz
            cd whirr-0.8.2
          3. create the credentials file
            mkdir ~/.whirr
            cp conf/credentials.sample ~/.whirr/credentials
          4. add the following content to the credentials file
            # Set cloud provider connection details
            PROVIDER=aws-ec2
            IDENTITY=<AWS Access Key ID>
            CREDENTIAL=<AWS Secret Access Key>
          5. generate an RSA key pair
            ssh-keygen -t rsa -P ''
          6. create a hadoop.properties file and add the following content
            whirr.cluster-name=whirrhadoopcluster
            whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,2 hadoop-datanode+hadoop-tasktracker
            whirr.provider=aws-ec2
            whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
            whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
            whirr.hadoop.version=1.0.2
            whirr.aws-ec2-spot-price=0.08
          7. launch the Hadoop cluster
            bin/whirr launch-cluster --config hadoop.properties
          8. launch the proxy
            cd ~/.whirr/whirrhadoopcluster/
            ./hadoop-proxy.sh
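          9. add rules to iptables that allow access from any address (0.0.0.0/0) to the JobTracker (50030) and NameNode (50070) web UI ports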
            0.0.0.0/0 50030
            0.0.0.0/0 50070
          10. check the web UI in the browser
            http://<aws-public-dns>:50030
          11. add to /etc/profile
            export HADOOP_CONF_DIR=~/.whirr/whirrhadoopcluster/
          12. check that Hadoop works
            hadoop fs -ls /
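
          The hadoop.properties file from step 6 can also be generated by a small script instead of being typed by hand. The following is a minimal sketch, not part of Whirr itself: the function name write_hadoop_properties and its two arguments (target directory, spot price) are hypothetical names introduced here for illustration. The property values are the ones listed in step 6.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Whirr): writes the hadoop.properties
# file from step 6 into the directory given as $1.
# $2 is an optional spot price and defaults to 0.08, the value used above.
write_hadoop_properties() {
    target_dir="$1"
    spot_price="${2:-0.08}"
    mkdir -p "$target_dir"

    # The quoted 'EOF' delimiter disables shell expansion inside the heredoc,
    # so ${sys:user.home} reaches the file literally and is left for Whirr
    # to resolve rather than the shell.
    cat > "$target_dir/hadoop.properties" <<'EOF'
whirr.cluster-name=whirrhadoopcluster
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,2 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
whirr.hadoop.version=1.0.2
EOF

    # The spot price is the only value parameterised in this sketch.
    printf 'whirr.aws-ec2-spot-price=%s\n' "$spot_price" >> "$target_dir/hadoop.properties"
}
```

          Calling write_hadoop_properties . from the Whirr directory would produce the same file as step 6, and a second argument such as 0.10 would change only the spot price. An unquoted heredoc would not work here, because the shell would try to expand ${sys:user.home} itself.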

          posted on 2013-09-08 13:45 paulwong  Categories: HADOOP, AWS
