paulwong

          Install Hadoop in the AWS cloud

          1. get the Whirr tar file
            wget http://www.eu.apache.org/dist/whirr/stable/whirr-0.8.2.tar.gz
          2. untar the Whirr tar file and change into the extracted directory
            tar -xzvf whirr-0.8.2.tar.gz
            cd whirr-0.8.2
          3. create credentials file
            mkdir ~/.whirr
            cp conf/credentials.sample ~/.whirr/credentials
          4. add the following content to the credentials file
            # Set cloud provider connection details
            PROVIDER=aws-ec2
            IDENTITY=<AWS Access Key ID>
            CREDENTIAL=<AWS Secret Access Key>
          5. generate an RSA key pair with an empty passphrase
            ssh-keygen -t rsa -P ''
          6. create a hadoop.properties file and add the following content
            whirr.cluster-name=whirrhadoopcluster
            whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,2 hadoop-datanode+hadoop-tasktracker
            whirr.provider=aws-ec2
            whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
            whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
            whirr.hadoop.version=1.0.2
            whirr.aws-ec2-spot-price=0.08
          7. launch the Hadoop cluster
            bin/whirr launch-cluster --config hadoop.properties
          8. launch proxy
            cd ~/.whirr/whirrhadoopcluster/
            ./hadoop-proxy.sh
          9. add rules to iptables allowing inbound TCP from any address on the web UI ports
            0.0.0.0/0 50030 (JobTracker web UI)
            0.0.0.0/0 50070 (NameNode web UI)
          10. check the web ui in the browser
            http://<aws-public-dns>:50030
          11. add to /etc/profile
            export HADOOP_CONF_DIR=~/.whirr/whirrhadoopcluster/
          12. check that Hadoop works
            hadoop fs -ls /
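
Step 6 can be scripted so the properties file is reproducible. The following is only a minimal sketch, not part of Whirr itself: WORK_DIR, PROPS_FILE, NODE_COUNT and write_hadoop_properties are hypothetical names chosen for illustration, and the file is written to a scratch directory rather than the Whirr directory. It writes the same content as step 6 and counts how many instances the template asks for, which is a cheap sanity check before paying for a launch.

```shell
#!/bin/sh
# Sketch: generate the hadoop.properties file from step 6 and sanity-check it.
# WORK_DIR and write_hadoop_properties are illustrative names, not Whirr features.

WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
PROPS_FILE="$WORK_DIR/hadoop.properties"

write_hadoop_properties() {
    # The quoted 'EOF' delimiter stops the shell from expanding ${sys:user.home},
    # so the placeholder reaches the file unchanged for Whirr to resolve.
    cat > "$1" <<'EOF'
whirr.cluster-name=whirrhadoopcluster
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,2 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
whirr.hadoop.version=1.0.2
whirr.aws-ec2-spot-price=0.08
EOF
}

write_hadoop_properties "$PROPS_FILE"

# Sum the leading counts in whirr.instance-templates
# (one "N role+role" entry per comma-separated group).
NODE_COUNT=$(grep '^whirr.instance-templates=' "$PROPS_FILE" \
    | cut -d= -f2 \
    | tr ',' '\n' \
    | awk '{ total += $1 } END { print total }')

echo "properties file: $PROPS_FILE"
echo "nodes requested: $NODE_COUNT"
```

With the template from step 6 the script reports 3 nodes (1 master plus 2 workers). To use the generated file for a real launch, copy it into the Whirr directory and run step 7 as written.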

          posted on 2013-09-08 13:45 paulwong  Views (413)  Comments (0)  Categories: HADOOP, AWS
