paulwong

          Install Hadoop in the AWS cloud

          1. get the Whirr tar file
            wget http://www.eu.apache.org/dist/whirr/stable/whirr-0.8.2.tar.gz
          2. untar the Whirr tar file and change into the extracted directory
            tar -vxf whirr-0.8.2.tar.gz
            cd whirr-0.8.2
          3. create the credentials file
            mkdir ~/.whirr
            cp conf/credentials.sample ~/.whirr/credentials
          4. add the following content to the credentials file
            # Set cloud provider connection details
            PROVIDER=aws-ec2
            IDENTITY=<AWS Access Key ID>
            CREDENTIAL=<AWS Secret Access Key>
          5. generate an RSA key pair with an empty passphrase
            ssh-keygen -t rsa -P ''
          6. create a hadoop.properties file and add the following content
            whirr.cluster-name=whirrhadoopcluster
            whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,2 hadoop-datanode+hadoop-tasktracker
            whirr.provider=aws-ec2
            whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
            whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
            whirr.hadoop.version=1.0.2
            whirr.aws-ec2-spot-price=0.08
          7. launch the Hadoop cluster
            bin/whirr launch-cluster --config hadoop.properties
          8. launch the proxy
            cd ~/.whirr/whirrhadoopcluster/
            ./hadoop-proxy.sh
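          9. add rules to iptables allowing inbound TCP traffic from any address (0.0.0.0/0) on the Hadoop web UI ports
            0.0.0.0/0 50030   (JobTracker web UI)
            0.0.0.0/0 50070   (NameNode web UI)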
          10. check the web UI in the browser
            http://<aws-public-dns>:50030
          11. add the following line to /etc/profile
            export HADOOP_CONF_DIR=~/.whirr/whirrhadoopcluster/
          12. check that Hadoop works
            hadoop fs -ls /
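
Steps 3, 4 and 6 can be scripted so the two files are written the same way every time. The following is only a minimal sketch under stated assumptions: the function names `write_credentials` and `write_hadoop_properties` and their arguments are made up for illustration and are not part of Whirr; the file contents are copied from the steps above.

```shell
#!/bin/sh
# Sketch: generate the Whirr credentials file (step 4) and hadoop.properties (step 6).
# The function names and arguments are illustrative, not part of Whirr.

# write_credentials DIR IDENTITY CREDENTIAL
# Writes DIR/credentials with the provider settings from step 4.
write_credentials() {
    mkdir -p "$1"
    cat > "$1/credentials" <<EOF
# Set cloud provider connection details
PROVIDER=aws-ec2
IDENTITY=$2
CREDENTIAL=$3
EOF
    # The file holds a secret key, so restrict it to the current user.
    chmod 600 "$1/credentials"
}

# write_hadoop_properties FILE
# Writes the cluster definition from step 6 to FILE.
write_hadoop_properties() {
    # The quoted 'EOF' stops the shell from touching ${sys:user.home};
    # that placeholder has to reach the properties file unchanged.
    cat > "$1" <<'EOF'
whirr.cluster-name=whirrhadoopcluster
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,2 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
whirr.hadoop.version=1.0.2
whirr.aws-ec2-spot-price=0.08
EOF
}
```

A possible call, assuming the two keys are already held in shell variables of your own choosing: `write_credentials ~/.whirr "$MY_ACCESS_KEY_ID" "$MY_SECRET_KEY"` followed by `write_hadoop_properties hadoop.properties`.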
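
Step 9 gives only a source range and two ports. Assuming it is meant as host-level iptables rules that accept inbound TCP on those ports, the commands might look like the sketch below. The helper name `print_ui_rules` is hypothetical, and it only prints the commands so they can be reviewed before being run as root. On EC2 the same ports may also have to be opened in the instance's security group.

```shell
#!/bin/sh
# Sketch: print one iptables command per Hadoop web UI port from step 9.
# print_ui_rules is an illustrative helper name; nothing is changed on the
# system until the printed commands are executed as root.
print_ui_rules() {
    for port in 50030 50070; do
        echo "iptables -A INPUT -p tcp -s 0.0.0.0/0 --dport $port -j ACCEPT"
    done
}
```

After checking the output, the rules could be applied with `print_ui_rules | sudo sh`.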
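
Before running the check in step 12, it can help to confirm that HADOOP_CONF_DIR points at the directory Whirr generated. This is a rough sketch, not something Whirr provides: `check_conf_dir` is a hypothetical helper, and it only looks for the `hadoop-proxy.sh` script that step 8 runs from that directory.

```shell
#!/bin/sh
# Sketch: sanity-check the Whirr cluster directory used as HADOOP_CONF_DIR.
# check_conf_dir is an illustrative helper name, not a Whirr or Hadoop command.
# Usage: check_conf_dir [DIR]   (DIR defaults to $HADOOP_CONF_DIR)
check_conf_dir() {
    dir=${1:-${HADOOP_CONF_DIR:-}}
    if [ -z "$dir" ]; then
        echo "HADOOP_CONF_DIR is not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Step 8 runs hadoop-proxy.sh from this directory, so it should exist here.
    if [ ! -f "$dir/hadoop-proxy.sh" ]; then
        echo "missing $dir/hadoop-proxy.sh" >&2
        return 1
    fi
    echo "ok: $dir"
}
```

One way to use it is `check_conf_dir && hadoop fs -ls /`, so the listing is attempted only when the directory looks right.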

          posted on 2013-09-08 13:45 paulwong  Views (411)  Comments (0)  Categories: HADOOP, AWS
