<?xml version="1.0" encoding="utf-8" standalone="yes"?>http://www.aygfsteel.com/ivanwan/archive/2015/04/25/424664.htmlivaneeoivaneeoSat, 25 Apr 2015 06:08:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/04/25/424664.htmlhttp://www.aygfsteel.com/ivanwan/comments/424664.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/04/25/424664.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/424664.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/424664.htmlhttp://www.csdn.net/article/2014-01-02/2817984-13-tools-let-hadoop-fly
Useful data tools<br />
http://blog.itpub.net/7816530/viewspace-1119924/


ivaneeo 2015-04-25 14:08 Post a comment
]]>
Mesos scheduling frameworkhttp://www.aygfsteel.com/ivanwan/archive/2015/04/15/424426.htmlivaneeoivaneeoTue, 14 Apr 2015 20:49:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/04/15/424426.htmlhttp://www.aygfsteel.com/ivanwan/comments/424426.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/04/15/424426.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/424426.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/424426.html
http://m.blog.csdn.net/blog/ebay/43529401


ivaneeo 2015-04-15 04:49 Post a comment
]]>
centos6.5 docker installhttp://www.aygfsteel.com/ivanwan/archive/2015/04/02/424049.htmlivaneeoivaneeoThu, 02 Apr 2015 04:41:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/04/02/424049.htmlhttp://www.aygfsteel.com/ivanwan/comments/424049.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/04/02/424049.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/424049.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/424049.html

Run yum makecache to build the cache.

EPEL repo:

rpm -Uvh http://ftp.sjtu.edu.cn/fedora/epel/6/i386/epel-release-6-8.noarch.rpm

Docker install:</p>

You will need RHEL 6.5 or higher, with a RHEL 6 kernel version 2.6.32-431 or higher as this has specific kernel fixes to allow Docker to work.

CentOS 6.5 already ships the <span style="padding: 0px; margin: 0px; font-family: Cabin, 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 20px;">2.6.32-431 kernel, so it is best to install on that version.</span>

yum -y install docker-io
Upgrade:</span>
yum -y update docker-io

Manual upgrade:</p>

wget https://get.docker.io/builds/Linux/x86_64/docker-latest -O docker
mv -f docker /usr/bin/docker

Upgrade complete.

Start the service:</p>

service docker start

Enable at boot:

chkconfig docker on
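Before installing, the kernel requirement mentioned above (2.6.32-431 or newer on RHEL/CentOS 6) can be checked with a small sketch. The helper name `kernel_at_least` is our own, not part of any official tooling; it relies on GNU coreutils' `sort -V`.

```shell
# Sketch: succeed when the running kernel is at least the required version,
# using version-aware sorting. kernel_at_least is a hypothetical helper.
kernel_at_least() {
  # succeeds when $1 (running kernel) sorts >= $2 (required) in version order
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)
  [ "$lowest" = "$2" ]
}

if kernel_at_least "$(uname -r)" "2.6.32-431"; then
  echo "kernel is new enough for docker-io"
else
  echo "kernel too old; upgrade before installing docker-io" >&2
fi
```

On a stock CentOS 6.5 box `uname -r` reports something like `2.6.32-431.el6.x86_64`, which sorts after `2.6.32-431` and passes the check.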


ivaneeo 2015-04-02 12:41 Post a comment
]]>
docker run restarthttp://www.aygfsteel.com/ivanwan/archive/2015/03/28/423906.htmlivaneeoivaneeoSat, 28 Mar 2015 02:31:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/28/423906.htmlhttp://www.aygfsteel.com/ivanwan/comments/423906.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/28/423906.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423906.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423906.htmlhttp://docs.docker.com/articles/host_integration/

ivaneeo 2015-03-28 10:31 Post a comment
]]>
mincloud install loghttp://www.aygfsteel.com/ivanwan/archive/2015/03/27/423895.htmlivaneeoivaneeoFri, 27 Mar 2015 10:48:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/27/423895.htmlhttp://www.aygfsteel.com/ivanwan/comments/423895.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/27/423895.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423895.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423895.html172.20.20.8 mysql-mm1
172.20.20.11 mysql-mm2
172.20.20.10 mysql-data1
172.20.20.9 mysql-data2
172.20.20.10 mysql-sql1
172.20.20.9 mysql-sql2
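Every `docker run` below rewrites `/etc/hosts` with the same long `echo -e` payload. As a sketch, a helper of our own invention could generate that payload once; the IPs here follow the container commands below (which differ slightly from the list above), so treat the addresses as assumptions.

```shell
# Sketch: emit the /etc/hosts payload each container writes at startup,
# so the node list lives in one place instead of being duplicated per command.
make_hosts() {
  printf '%s\n' \
    "172.20.20.7 mysql-mm1" \
    "172.20.20.10 mysql-mm2" \
    "172.20.20.8 mysql-data1" \
    "172.20.20.9 mysql-data2" \
    "172.20.20.8 mysql-sql1" \
    "172.20.20.9 mysql-sql2" \
    "127.0.0.1 localhost"
}

# Inside a container command this would replace the long echo -e string:
#   make_hosts > /etc/hosts
```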


mysql-mm1:
  docker run -d --name="mysql_mm1" --net=host -v /opt/mysql:/usr/local/mysql mysql_mm/ubuntu /bin/bash -exec 'echo -e "172.20.20.7 mysql-mm1\n172.20.20.10 mysql-mm2\n172.20.20.8 mysql-data1\n172.20.20.9 mysql-data2\n172.20.20.8 mysql-sql1\n172.20.20.9 mysql-sql2\n127.0.0.1 localhost" > /etc/hosts && ndb_mgmd -f /usr/local/mysql/data/mysql-cluster/config.ini && /usr/sbin/sshd -D'
mysql-mm2:
  docker run -d --name="mysql_mm2" --net=host -v /opt/mysql:/usr/local/mysql mysql_mm/ubuntu /bin/bash -exec 'echo -e "172.20.20.7 mysql-mm1\n172.20.20.10 mysql-mm2\n172.20.20.8 mysql-data1\n172.20.20.9 mysql-data2\n172.20.20.8 mysql-sql1\n172.20.20.9 mysql-sql2\n127.0.0.1 localhost" > /etc/hosts && ndb_mgmd -f /usr/local/mysql/data/mysql-cluster/config.ini && zabbix_agentd && /usr/sbin/sshd -D'
mysql-data1:
  docker run -d --name="mysql_data1" --net=host -v /opt/mysql:/usr/local/mysql mysql_data/ubuntu /bin/bash -exec 'echo -e "172.20.20.7 mysql-mm1\n172.20.20.10 mysql-mm2\n172.20.20.8 mysql-data1\n172.20.20.9 mysql-data2\n172.20.20.8 mysql-sql1\n172.20.20.9 mysql-sql2\n127.0.0.1 localhost" > /etc/hosts && /usr/local/mysql/bin/ndbd && zabbix_agentd && /usr/sbin/sshd -D'
mysql-data2:
  docker run -d --name="mysql_data2" --net=host -v /opt/mysql:/usr/local/mysql mysql_data/ubuntu /bin/bash -exec 'echo -e "172.20.20.7 mysql-mm1\n172.20.20.10 mysql-mm2\n172.20.20.8 mysql-data1\n172.20.20.9 mysql-data2\n172.20.20.8 mysql-sql1\n172.20.20.9 mysql-sql2\n127.0.0.1 localhost" > /etc/hosts && /usr/local/mysql/bin/ndbd && zabbix_agentd && /usr/sbin/sshd -D'
mysql-sql1:
  docker run -d --name="mysql_sql1" --net=host -v /opt/mysql:/usr/local/mysql mysql_sql/ubuntu /bin/bash -exec 'echo -e "172.20.20.7 mysql-mm1\n172.20.20.10 mysql-mm2\n172.20.20.8 mysql-data1\n172.20.20.9 mysql-data2\n172.20.20.8 mysql-sql1\n172.20.20.9 mysql-sql2\n127.0.0.1 localhost" > /etc/hosts && /usr/local/mysql/bin/mysqld_safe --user=mysql'
mysql-sql2:
  docker run -d --name="mysql_sql2" --net=host -v /opt/mysql:/usr/local/mysql mysql_sql/ubuntu /bin/bash -exec 'echo -e "172.20.20.7 mysql-mm1\n172.20.20.10 mysql-mm2\n172.20.20.8 mysql-data1\n172.20.20.9 mysql-data2\n172.20.20.8 mysql-sql1\n172.20.20.9 mysql-sql2\n127.0.0.1 localhost" > /etc/hosts && /usr/local/mysql/bin/mysqld_safe --user=mysql'
haproxy && nginx: 
  docker run -d --name="loadbalancer_master" -p 8888:8888 -p 6080:6080 -p 8089:8089 -p 8774:8774 -p 9696:9696 -p 9292:9292 -p 8776:8776 -p 5000:5000 -p 8777:8777 -p 11211:11211 -p 11222:11222 -p 5672:5672 -p 35357:35357 -p 8181:2181 -p 10389:10389 -p 2222:22 -p 80:80 -p 1936:1936 -p 3306:3306 -p 10052:10052 -p 10051:10051 -p 8080:8080 -v /opt/etc/nginx/conf:/usr/local/nginx-1.0.6/conf -v /opt/etc/haproxy:/etc/haproxy loadbalancer/ubuntu /bin/bash -exec 'echo -e "127.0.0.1 localhost" > /etc/hosts && service haproxy start && /usr/local/nginx-1.0.6/sbin/nginx && zabbix_agentd && /usr/sbin/sshd -D'
redis_master:  
  docker run -d --name="redis_master" -p 18:22 -p 6379:6379 -p 6380:6380 redis_master/ubuntu /bin/bash -exec '/usr/local/webserver/redis/start.sh && /usr/sbin/sshd -D'
redis_slave: 
  docker run -d --name="redis_slave1" -p 18:22 -p 6379:6379 -p 6380:6380 redis_slave/ubuntu /bin/bash -exec 'echo -e "172.20.20.10 redis-master\n127.0.0.1 localhost" > /etc/hosts && /usr/local/webserver/redis/start.sh && /usr/sbin/sshd -D' 

rabbitmq:        
  docker run -d --name="rabbitmq_master" -p 2222:22 -p 25672:25672 -p 15672:15672 -p 5672:5672 -p 4369:4369 -p 10051:10050 rabbitmq/ubuntu /bin/bash -exec 'echo -e "172.20.20.10 rabbitmq-master\n127.0.0.1 localhost" > /etc/hosts && /etc/init.d/rabbitmq-server start && /usr/sbin/sshd -D'
 
mule:
  docker run -d --name="mule1" -p 5005:5005 -p 2222:22 -p 9999:9999 -p 9003:9003 -p 9000:9000 -p 9001:9001 -p 9004:9004 -v /opt/mule:/opt/mule-standalone-3.5.0_cloud mule/ubuntu /bin/bash -exec 'echo -e "192.168.1.180 lb-master\n192.168.1.180 controller-node\n127.0.0.1 localhost" >> /etc/hosts && /usr/sbin/sshd && export JAVA_HOME=/opt/jdk1.7.0_51 && export PATH=$JAVA_HOME/bin:$PATH && /opt/mule-standalone-3.5.0_cloud/bin/mule'


zentao:

  docker run -d --name="zentao" -p 22222:22 -p 10008:80 -v /opt/www/html/zentaopms:/opt/zentao --privileged=true zentao/ubuntu /bin/bash -exec 'service apache2 start && /usr/sbin/sshd -D'

websocket-tomcat:
  docker run -d --name="websocket_tomcat1" -p 8888:8080 -p 2222:22 -v /opt/apache-tomcat-8.0.15:/opt/apache-tomcat websocket-tomcat/ubuntu /bin/bash -exec 'echo -e "192.168.1.180 lb-master\n127.0.0.1 localhost" > /etc/hosts && export JAVA_HOME=/opt/jdk1.7.0_51 && /opt/apache-tomcat/bin/startup.sh && /usr/sbin/sshd -D'

 docker run -d --name="guacamole1" -p 8088:8088 -p 38:22 -v /opt/apache-tomcat-7.0.53:/opt/apache-tomcat guacamole/ubuntu /bin/bash -exec 'echo -e "192.168.1.150 lb-master\n127.0.0.1 localhost" > /etc/hosts && /etc/init.d/guacd start && /opt/apache-tomcat/bin/start-tomcat.sh && /usr/sbin/sshd -D'


ivaneeo 2015-03-27 18:48 Post a comment
]]>
mysql cluster install faqhttp://www.aygfsteel.com/ivanwan/archive/2015/03/27/423893.htmlivaneeoivaneeoFri, 27 Mar 2015 08:43:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/27/423893.htmlhttp://www.aygfsteel.com/ivanwan/comments/423893.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/27/423893.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423893.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423893.htmlhttp://www.docin.com/p-558099649.html

ivaneeo 2015-03-27 16:43 Post a comment
]]>
centos7 testing yumhttp://www.aygfsteel.com/ivanwan/archive/2015/03/26/423873.htmlivaneeoivaneeoThu, 26 Mar 2015 15:32:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/26/423873.htmlhttp://www.aygfsteel.com/ivanwan/comments/423873.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/26/423873.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423873.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423873.html

Step 1: create a file named /etc/yum.repos.d/virt7-testing.repo.</h1>
/etc/yum.repos.d/virt7-testing.repo
[virt7-testing]
name=virt7-testing
baseurl=http://cbs.centos.org/repos/virt7-testing/x86_64/os/
enabled=0
gpgcheck=0
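The repo file can be written in one shot with a heredoc. The function name `write_virt7_repo` is our own; the repo contents match the post.

```shell
# Sketch: write the virt7-testing repo file to the path given as $1.
write_virt7_repo() {
  cat > "$1" <<'EOF'
[virt7-testing]
name=virt7-testing
baseurl=http://cbs.centos.org/repos/virt7-testing/x86_64/os/
enabled=0
gpgcheck=0
EOF
}

# Usage (as root):
#   write_virt7_repo /etc/yum.repos.d/virt7-testing.repo
#   yum --enablerepo=virt7-testing install docker
```

`enabled=0` keeps the repo off by default, which is why the install step passes `--enablerepo=virt7-testing` for a single yum transaction.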

Step 2: install.</h1>
sudo yum --enablerepo=virt7-testing install docker 

Verify the install:</p>

$ docker --version
Docker version 1.5.0, build a8a31ef/1.5.0

It works!</p>

Note: use at your own risk.</p>

http://billpaxtonwasright.com/installing-docker-1-5-0-on-centos-7/



ivaneeo 2015-03-26 23:32 Post a comment
]]>
Fixing mouse desync in KVMhttp://www.aygfsteel.com/ivanwan/archive/2015/03/23/423760.htmlivaneeoivaneeoMon, 23 Mar 2015 12:49:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/23/423760.htmlhttp://www.aygfsteel.com/ivanwan/comments/423760.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/23/423760.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423760.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423760.htmlAdd the following to the virtual machine's configuration file:</span>

<input type='tablet' bus='usb'/>
(This line goes inside the <devices> section.)


Linux:


Run in a terminal:</p>

xset -m 0

 

Windows:

Open Control Panel -> Mouse -> Pointer Options and uncheck "Enhance pointer precision".</p>

ivaneeo 2015-03-23 20:49 Post a comment
]]>
openstack virt vnc porthttp://www.aygfsteel.com/ivanwan/archive/2015/03/22/423729.htmlivaneeoivaneeoSun, 22 Mar 2015 15:16:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/22/423729.htmlhttp://www.aygfsteel.com/ivanwan/comments/423729.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/22/423729.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423729.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423729.htmlhttp://docs.openstack.org/image-guide/content/virt-install.html

ivaneeo 2015-03-22 23:16 Post a comment
]]>
ceilometer alarm examplehttp://www.aygfsteel.com/ivanwan/archive/2015/03/17/423541.htmlivaneeoivaneeoTue, 17 Mar 2015 10:13:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/17/423541.htmlhttp://www.aygfsteel.com/ivanwan/comments/423541.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/17/423541.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423541.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423541.htmlhttp://blog.csdn.net/hackerain/article/details/38172941

ivaneeo 2015-03-17 18:13 Post a comment
]]>
curl openstackhttp://www.aygfsteel.com/ivanwan/archive/2015/03/13/423445.htmlivaneeoivaneeoFri, 13 Mar 2015 11:32:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/13/423445.htmlhttp://www.aygfsteel.com/ivanwan/comments/423445.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/13/423445.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423445.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423445.htmlhttp://blog.csdn.net/anhuidelinger/article/details/9818693

ivaneeo 2015-03-13 19:32 Post a comment
]]>
ubuntu docker1.5 installhttp://www.aygfsteel.com/ivanwan/archive/2015/03/02/423137.htmlivaneeoivaneeoMon, 02 Mar 2015 08:21:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/03/02/423137.htmlhttp://www.aygfsteel.com/ivanwan/comments/423137.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/03/02/423137.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/423137.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/423137.htmlhttps://docs.docker.com/installation/ubuntulinux/#ubuntu-trusty-1404-lts-64-bit

ivaneeo 2015-03-02 16:21 Post a comment
]]>
docker api demohttp://www.aygfsteel.com/ivanwan/archive/2015/02/14/422927.htmlivaneeoivaneeoSat, 14 Feb 2015 06:29:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2015/02/14/422927.htmlhttp://www.aygfsteel.com/ivanwan/comments/422927.htmlhttp://www.aygfsteel.com/ivanwan/archive/2015/02/14/422927.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/422927.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/422927.htmlhttp://my.oschina.net/guol/blog/271416

ivaneeo 2015-02-14 14:29 Post a comment
]]>
ndb manage showhttp://www.aygfsteel.com/ivanwan/archive/2014/12/26/421868.htmlivaneeoivaneeoFri, 26 Dec 2014 10:41:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2014/12/26/421868.htmlhttp://www.aygfsteel.com/ivanwan/comments/421868.htmlhttp://www.aygfsteel.com/ivanwan/archive/2014/12/26/421868.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/421868.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/421868.htmlroot@proxzone-project-4:/usr/local/mysql/bin# ndb_mgm -e show

Connected to Management Server at: localhost:1186

Cluster Configuration

---------------------

[ndbd(NDB)] 2 node(s)

id=3 @172.21.21.108  (mysql-5.6.21 ndb-7.3.7, Nodegroup: 0)

id=4 @172.21.21.109  (mysql-5.6.21 ndb-7.3.7, Nodegroup: 0, *)


[ndb_mgmd(MGM)] 2 node(s)

id=1 @172.21.21.107  (mysql-5.6.21 ndb-7.3.7)

id=2 @172.21.21.110  (mysql-5.6.21 ndb-7.3.7)


[mysqld(API)] 2 node(s)

id=5 @172.21.21.108  (mysql-5.6.21 ndb-7.3.7)

id=6 @172.21.21.109  (mysql-5.6.21 ndb-7.3.7)
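A quick way to pull the data-node count out of `ndb_mgm -e show` output like the capture above. The helper name `count_data_nodes` is ours, not part of the MySQL Cluster tools.

```shell
# Sketch: the "[ndbd(NDB)] N node(s)" header line carries the count directly;
# extract N with sed.
count_data_nodes() {
  sed -n 's/^\[ndbd(NDB)\] \([0-9][0-9]*\) node(s)$/\1/p'
}

# Against a live cluster (not run here):
#   ndb_mgm -e show | count_data_nodes
```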



ivaneeo 2014-12-26 18:41 Post a comment
]]>
docker!http://www.aygfsteel.com/ivanwan/archive/2014/12/19/421553.htmlivaneeoivaneeoThu, 18 Dec 2014 16:57:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2014/12/19/421553.htmlhttp://www.aygfsteel.com/ivanwan/comments/421553.htmlhttp://www.aygfsteel.com/ivanwan/archive/2014/12/19/421553.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/421553.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/421553.htmlhttp://www.aygfsteel.com/yongboy/archive/2013/12/12/407498.html

docker-registry:

http://www.cnblogs.com/xguo/p/3829329.html


ubuntu 14.04
http://www.tuicool.com/articles/b63uei

centos 6.5
http://blog.yourtion.com/ubuntu-install-docker.html


ivaneeo 2014-12-19 00:57 Post a comment
]]>
cloudstack xenserver agenthttp://www.aygfsteel.com/ivanwan/archive/2014/12/17/421501.htmlivaneeoivaneeoWed, 17 Dec 2014 06:54:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2014/12/17/421501.htmlhttp://www.aygfsteel.com/ivanwan/comments/421501.htmlhttp://www.aygfsteel.com/ivanwan/archive/2014/12/17/421501.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/421501.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/421501.html/etc/sysctl.conf

net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-arptables = 1

xe-switch-network-backend bridge

REBOOT
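The steps above can be sketched as a small script. The function name `append_bridge_nf` is ours; on the XenServer host the target file would be /etc/sysctl.conf, as in the post.

```shell
# Sketch: append the bridge-netfilter settings to a sysctl file given as $1.
append_bridge_nf() {
  cat >> "$1" <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-arptables = 1
EOF
}

# As root on the host:
#   append_bridge_nf /etc/sysctl.conf && sysctl -p
#   xe-switch-network-backend bridge   # then reboot
```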


ivaneeo 2014-12-17 14:54 Post a comment
]]>
Hazelcast River Plugin for ElasticSearchhttp://www.aygfsteel.com/ivanwan/archive/2013/10/08/404716.htmlivaneeoivaneeoMon, 07 Oct 2013 16:57:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2013/10/08/404716.htmlhttp://www.aygfsteel.com/ivanwan/comments/404716.htmlhttp://www.aygfsteel.com/ivanwan/archive/2013/10/08/404716.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/404716.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/404716.htmlhttps://github.com/sksamuel/elasticsearch-river-hazelcast

ivaneeo 2013-10-08 00:57 Post a comment
]]>
elasticsearch安装配置及中文分?/title><link>http://www.aygfsteel.com/ivanwan/archive/2013/10/04/404680.html</link><dc:creator>ivaneeo</dc:creator><author>ivaneeo</author><pubDate>Thu, 03 Oct 2013 18:09:00 GMT</pubDate><guid>http://www.aygfsteel.com/ivanwan/archive/2013/10/04/404680.html</guid><wfw:comment>http://www.aygfsteel.com/ivanwan/comments/404680.html</wfw:comment><comments>http://www.aygfsteel.com/ivanwan/archive/2013/10/04/404680.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.aygfsteel.com/ivanwan/comments/commentRss/404680.html</wfw:commentRss><trackback:ping>http://www.aygfsteel.com/ivanwan/services/trackbacks/404680.html</trackback:ping><description><![CDATA[<div>ElasticSearch是一个基于Lucene构徏的开源,分布式,RESTful搜烦引擎。设计用于云计算中,能够辑ֈ实时搜烦Q稳定,可靠Q快速,安装使用方便。支持通过HTTP使用JSONq行数据索引?nbsp; <p><span>  我们建立一个网站或应用E序Qƈ要添加搜索功能,令我们受打击的是Q搜索工作是很难的。我们希望我们的搜烦解决Ҏ要快Q我们希?有一个零配置和一个完全免费的搜烦模式Q我们希望能够简单地使用JSON通过HTTP的烦引数据,我们希望我们的搜索服务器始终可用Q我们希望能够一台开 始ƈ扩展到数百,我们要实时搜索,我们要简单的多租P我们希望建立一个云的解x案。Elasticsearch旨在解决所有这些问题和更多的?/span></p> <h2>安装</h2> <p><span>  以windows操作pȝ和ES0.19.7版本ZQ?/span></p> <div> </div> <p><span>  ①下蝲elasticsearch-0.19.7.zip</span></p> <div> </div> <p><span>  ②直接解压x目录Q设|该目录为ES_HOME环境变量</span></p> <div> </div> <p><span>  ③安装JDKQƈ讄JAVA_HOME环境变量</span></p> <div> </div> <p><span>  ④在windows下,q行 %ES_HOME%\bin\elasticsearch.bat卛_q行<br /></span></p> <p><strong>分布式搜索elasticsearch单机与服务器环境搭徏</strong></p> <div class="wmqeeuq" id="article_content"> <p>      先到<a >http://www.elasticsearch.org/download/</a><span>?载最新版的elasticsearchq行包,本文写时最新的?.19.1Q作者是个很勤快的hQes的更新很频繁Qbug修复得很快。下载完解开有三 个包:bin是运行的脚本Qconfig是设|文Ӟlib是放依赖的包。如果你要装插g的话p多新Z个plugins的文件夹Q把插g攑ֈq个文g 夹中?br /></span></p> <p>1.单机环境Q?/p> <p>单机版的elasticsearchq行很简单,linux下直?nbsp;bin/elasticsearchp行了Qwindowsq行bin/elasticsearch.bat。如果是在局域网中运行elasticsearch集群也是很简单的Q只要cluster.name讄一_q且机器在同一|段下,启动的es会自动发现对方,l成集群?/p> <p>2.服务器环境:</p> 
<p>如果是在服务器上可以用elasticsearch-servicewrapperq个es插gQ它支持通过参数Q指定是在后台或前台q行esQƈ且支持启动,停止Q重启es服务Q默认es脚本只能通过ctrl+c关闭esQ。用方法是?a >https://github.com/elasticsearch/elasticsearch-servicewrapper</a>下蝲service文g夹,攑ֈes的bin目录下。下面是命o集合Q?br />bin/service/elasticsearch +<br />console 在前台运行es<br />start 在后台运行es<br />stop 停止es<br />install 使es作ؓ服务在服务器启动时自动启?br />remove 取消启动时自动启?/p> <p>在service目录下有个elasticsearch.conf配置文gQ主要是讄一些javaq行环境参数Q其中比较重要的是下面的</p> <p>参数Q?/p> <p>#es的home路径Q不用用默认值就可以<br />set.default.ES_HOME=<Path to ElasticSearch Home></p> <p>#分配les的最内?br />set.default.ES_MIN_MEM=256</p> <p>#分配les的最大内?br />set.default.ES_MAX_MEM=1024</p> <p><br /># 启动{待时旉Q以Uؓ单位Q?br />wrapper.startup.timeout=300</p> <p># 关闭{待时旉Q以Uؓ单位Q?/p> <p>wrapper.shutdown.timeout=300</p> <p># ping时旉(以秒为单?</p> <p>wrapper.ping.timeout=300</p> </div> <h2>安装插g</h2> <p><span>  以head插gZQ?/span></p> <div> </div> <p><span>  联网Ӟ直接q行%ES_HOME%\bin\plugin -install mobz/elasticsearch-head</span></p> <div> </div> <p><span>  不联|时Q下载elasticsearch-head的zipball的master包,把内容解压到%ES_HOME%\plugin\head\_site目录下,[该插件ؓsitecd插g]</span></p> <div> </div> <p><span>  安装完成Q重启服务,在浏览器打开 http://localhost:9200/_plugin/head/ 卛_<br /></span></p> <h2>ES概念</h2> <p><span>  cluster</span></p> <div> </div> <p><span><span>  代表一个集,集群中有多个节点Q其中有一个ؓ主节点,q个主节Ҏ可以通过选D产生的,M节点是对于集内部来?的。es的一个概念就是去中心化,字面上理解就是无中心节点Q这是对于集外部来说的Q因Z外部来看es集群Q在逻辑上是个整体,你与M一个节点的?信和与整个es集群通信是等L?/span></span></p> <div> </div> <p><span>  shards</span></p> <div> </div> <p><span>  代表索引分片Qes可以把一个完整的索引分成多个分片Q这L好处是可以把一个大的烦引拆分成多个Q分布到不同的节点上。构成分布式搜烦。分片的数量只能在烦引创建前指定Qƈ且烦引创建后不能更改?/span></p> <div> </div> <p><span>  replicas</span></p> <div> </div> <p><span>  代表索引副本Qes可以讄多个索引的副本,副本的作用一是提高系l的定w性,当个某个节点某个分片损坏或丢失时可以从副本中恢复。二是提高es的查询效率,es会自动对搜烦hq行负蝲均衡?/span></p> <div> </div> <p><span>  recovery</span></p> <div> </div> <p><span>  代表数据恢复或叫数据重新分布Qes在有节点加入或退出时会根据机器的负蝲对烦引分片进行重新分配,挂掉的节炚w新启动时也会q行数据恢复?/span></p> <div> </div> <p><span>  river</span></p> <div> </div> <p><span><span>  
代表es的一个数据源Q也是其它存储方式(如:数据库)同步数据到es的一个方法。它是以插g方式存在的一个es?务,通过driver中的数据q把它烦引到es中,官方的river有couchDB的,RabbitMQ的,Twitter的,Wikipedia 的?/span></span></p> <div> </div> <p><span>  gateway</span></p> <div> </div> <p><span><span>  代表es索引的持久化存储方式Qes默认是先把烦引存攑ֈ内存中,当内存满了时再持久化到硬盘。当q个es集群关闭?重新启动时就会从gateway中读取烦引数据。es支持多种cd的gatewayQ有本地文gpȝQ默认)Q分布式文gpȝQHadoop的HDFS?amazon的s3云存储服务?/span></span></p> <div> </div> <p><span>  discovery.zen</span></p> <div> </div> <p><span>  代表es的自动发现节ҎӞes是一个基于p2p的系l,它先通过q播L存在的节点,再通过多播协议来进行节点之间的通信Q同时也支持点对点的交互?/span></p> <div> </div> <p><span>  Transport</span></p> <div> </div> <p><span>  代表es内部节点或集与客户端的交互方式Q默认内部是使用tcp协议q行交互Q同时它支持http协议Qjson格式Q、thrift、servlet、memcached、zeroMQ{的传输协议Q通过插g方式集成Q?br /></span></p> <p><strong>分布式搜索elasticsearch中文分词集成</strong></p> <div class="wmqeeuq" id="article_content"> <p>elasticsearch官方只提供smartcnq个中文分词插gQ效果不是很好,好在国内有medcl大神Q国内最早研Ies的h之一Q写的两个中文分词插Ӟ一个是ik的,一个是mmseg的,下面分别介绍下两者的用法Q其实都差不多的Q先安装插gQ命令行Q?br />安装ik插gQ?/p> <p>plugin -install medcl/elasticsearch-analysis-ik/1.1.0  </p> <p>下蝲ik相关配置词典文g到config目录</p> <div bg_plain"=""><ol start="1"><li><span>cd config  </span></li><li>wget http://github.com/downloads/medcl/elasticsearch-analysis-ik/ik.zip --no-check-certificate  </li><li>unzip ik.zip  </li><li>rm ik.zip  </li></ol></div> <p>安装mmseg插gQ?/p> <div bg_plain"=""><ol start="1"><li><span>bin/plugin -install medcl/elasticsearch-analysis-mmseg/1.1.0  </span></li></ol></div> <p>下蝲相关配置词典文g到config目录</p> <div bg_plain"=""><ol start="1"><li><span>cd config  </span></li><li>wget http://github.com/downloads/medcl/elasticsearch-analysis-mmseg/mmseg.zip --no-check-certificate  </li><li>unzip mmseg.zip  </li><li>rm mmseg.zip  </li></ol></div> <p>分词配置</p> <p>ik分词配置Q在elasticsearch.yml文g中加?/p> <div bg_html"=""><ol start="1"><li><span>index:  </span></li><li>  analysis:                     </li><li>    analyzer:        </li><li>      ik:  </li><li>          alias: [ik_analyzer]  </li><li>          type: 
org.elasticsearch.index.analysis.IkAnalyzerProvider  </li></ol></div> <p>?/p> <div bg_html"=""><ol start="1"><li><span>index.analysis.analyzer.ik.type : “ik”  </span></li></ol></div> <p>q两句的意义相同<br />mmseg分词配置Q也是在在elasticsearch.yml文g?/p> <div bg_html"=""><ol start="1"><li><span>index:  </span></li><li>  analysis:  </li><li>    analyzer:  </li><li>      mmseg:  </li><li>          alias: [news_analyzer, mmseg_analyzer]  </li><li>          type: org.elasticsearch.index.analysis.MMsegAnalyzerProvider  </li></ol></div> <p>?/p> <div bg_html"=""><ol start="1"><li><span>index.analysis.analyzer.default.type : "mmseg"  </span></li></ol></div> <p>mmseg分词q有些更加个性化的参数设|如?/p> <div bg_html"=""><ol start="1"><li><span>index:  </span></li><li>  analysis:  </li><li>    tokenizer:  </li><li>      mmseg_maxword:  </li><li>          type: mmseg  </li><li>          seg_type: "max_word"  </li><li>      mmseg_complex:  </li><li>          type: mmseg  </li><li>          seg_type: "complex"  </li><li>      mmseg_simple:  </li><li>          type: mmseg  </li><li>          seg_type: "simple"  </li></ol></div> <p>q样配置完后插g安装完成Q启动es׃加蝲插g?/p> <p>定义mapping</p> <p>在添加烦引的mapping时就可以q样定义分词?/p> <div bg_plain"=""><ol start="1"><li><span>{  </span></li><li>   "page":{  </li><li>      "properties":{  </li><li>         "title":{  </li><li>            "type":"string",  </li><li>            "indexAnalyzer":"ik",  </li><li>            "searchAnalyzer":"ik"  </li><li>         },  </li><li>         "content":{  </li><li>            "type":"string",  </li><li>            "indexAnalyzer":"ik",  </li><li>            "searchAnalyzer":"ik"  </li><li>         }  </li><li>      }  </li><li>   }  </li><li>}  </li></ol></div> <p>indexAnalyzer为烦引时使用的分词器QsearchAnalyzer为搜索时使用的分词器?/p> <p>java mapping代码如下Q?/p> <div bg_java"=""><ol start="1"><li><span>XContentBuilder content = XContentFactory.jsonBuilder().startObject()  </span></li><li>        .startObject(<span>"page")  </span></li><li>          
.startObject(<span>"properties")         </span></li><li>            .startObject(<span>"title")  </span></li><li>              .field(<span>"type", "string")             </span></li><li>              .field(<span>"indexAnalyzer", "ik")  </span></li><li>              .field(<span>"searchAnalyzer", "ik")  </span></li><li>            .endObject()   </li><li>            .startObject(<span>"code")  </span></li><li>              .field(<span>"type", "string")           </span></li><li>              .field(<span>"indexAnalyzer", "ik")  </span></li><li>              .field(<span>"searchAnalyzer", "ik")  </span></li><li>            .endObject()       </li><li>          .endObject()  </li><li>         .endObject()  </li><li>       .endObject()  </li></ol></div> <p>定义完后操作索引׃以指定的分词器来q行分词?/p> <p> 附:</p> <p>ik分词插g目地址Q?a >https://github.com/medcl/elasticsearch-analysis-ik</a></p> <p>mmseg分词插g目地址Q?a >https://github.com/medcl/elasticsearch-analysis-mmseg</a></p> <p>如果觉得配置ȝQ也可以下蝲个配|好的es版本Q地址如下Q?a >https://github.com/medcl/elasticsearch-rtf</a></p> </div> <p> </p> <div> <h3><strong>elasticsearch的基本用?/strong></h3> </div> <div class="wmqeeuq" id="blog_content"><br />最大的特点Q?nbsp;<br />1. 数据库的 database, 是  index <br />2. 数据库的 table,  是 tag <br />3. 不要使用browserQ?使用curl来进行客L操作.  否则会出?java heap ooxx... <br /><br />curl:  -X 后面?RESTful Q?nbsp; GET, POST ... <br />-d 后面跟数据?(d = data to send) <br /><br />1. create:  <br /><br />指定 ID 来徏立新记录?Q貌似PUTQ?POST都可以) <br />$ curl -XPOST localhost:9200/films/md/2 -d ' <br />{ "name":"hei yi ren", "tag": "good"}' <br /><br />使用自动生成?ID 建立新纪录: <br />$ curl -XPOST localhost:9200/films/md -d ' <br />{ "name":"ma da jia si jia3", "tag": "good"}' <br /><br />2. 
查询Q?nbsp;<br />2.1 查询所有的 index, type: <br />$ curl localhost:9200/_search?pretty=true <br /><br />2.2 查询某个index下所有的type: <br />$ curl localhost:9200/films/_search <br /><br />2.3 查询某个index 下, 某个 type下所有的记录Q?nbsp;<br />$ curl localhost:9200/films/md/_search?pretty=true <br /><br />2.4 带有参数的查询:  <br />$ curl localhost:9200/films/md/_search?q=tag:good <br />{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":1.0,"hits":[{"_index":"film","_type":"md","_id":"2","_score":1.0, "_source" : <br />{ "name":"hei yi ren", "tag": "good"}},{"_index":"film","_type":"md","_id":"1","_score":0.30685282, "_source" : <br />{ "name":"ma da jia si jia", "tag": "good"}}]}} <br /><br />2.5 使用JSON参数的查询: Q注?query ?term 关键字) <br />$ curl localhost:9200/film/_search -d ' <br />{"query" : { "term": { "tag":"bad"}}}' <br /><br />3. update  <br />$ curl -XPUT localhost:9200/films/md/1 -d { ...(data)... } <br /><br />4. 删除?删除所有的Q?nbsp;<br />$ curl -XDELETE localhost:9200/films</div></div><img src ="http://www.aygfsteel.com/ivanwan/aggbug/404680.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.aygfsteel.com/ivanwan/" target="_blank">ivaneeo</a> 2013-10-04 02:09 <a href="http://www.aygfsteel.com/ivanwan/archive/2013/10/04/404680.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Cloudera Impala TarBall ~译、安装与配置http://www.aygfsteel.com/ivanwan/archive/2013/06/29/401074.htmlivaneeoivaneeoSat, 29 Jun 2013 09:12:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2013/06/29/401074.htmlhttp://www.aygfsteel.com/ivanwan/comments/401074.htmlhttp://www.aygfsteel.com/ivanwan/archive/2013/06/29/401074.html#Feedback1http://www.aygfsteel.com/ivanwan/comments/commentRss/401074.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/401074.html

Impala is a high-performance real-time query engine developed by Cloudera; compared with Hive it can be tens or even hundreds of times faster. The basic idea is to push computation down to the node hosting each DataNode and rely on in-memory caching for fast processing; Berkeley's Shark is a similar system. In practice Impala's performance is indeed good, but since Impala is largely written in C++, compiling and installing it yourself instead of using the CDH image takes some effort. This post records the installation and configuration process and the problems encountered; I tested on CentOS 6.2.<br /> The basic installation steps are documented elsewhere, but I ran into some problems, so the process is described in more detail here.</p>

1. Install the required dependency libraries; nothing unusual in this step.</strong>

sudo yum install boost-test boost-program-options libevent-devel automake libtool flex bison gcc-c++ openssl-devel make cmake doxygen.x86_64 glib-devel boost-devel python-devel bzip2-devel svn libevent-devel cyrus-sasl-devel wget git unzip

2. Install LLVM. Follow the steps below; note that if you are building Impala on several machines, you only need to run the checkout steps (the blue part in the original post) on one machine, then copy the llvm tree to the others and run the remaining (red) build commands there. There is no need to pull the source over svn on every machine, which is very slow.</p>

wget http://llvm.org/releases/3.2/llvm-3.2.src.tar.gz
tar xvzf llvm-3.2.src.tar.gz
cd llvm-3.2.src/tools
svn co http://llvm.org/svn/llvm-project/cfe/tags/RELEASE_32/final/ clang
cd ../projects
svn co http://llvm.org/svn/llvm-project/compiler-rt/tags/RELEASE_32/final/ compiler-rt
cd ..
./configure --with-pic
make -j4 REQUIRES_RTTI=1
sudo make install

3. Install Maven; nothing special here, just follow the steps and set the environment variables. Maven is needed later to build the Impala source.</p>

wget http://www.fightrice.com/mirrors/apache/maven/maven-3/3.0.4/binaries/apache-maven-3.0.4-bin.tar.gz
tar xvf apache-maven-3.0.4.tar.gz && sudo mv apache-maven-3.0.4 /usr/local

Edit ~/.bashrc to add the Maven environment variables:

export M2_HOME=/usr/local/apache-maven-3.0.4
export M2=$M2_HOME/bin
export PATH=$M2:$PATH

Reload the environment variables and check that the mvn version is correct:

source ~/.bashrc
mvn -version

4. Download the Impala source</strong>

git clone https://github.com/cloudera/impala.git

5. Set the Impala environment variables (needed for the build)</strong>

cd impala
./bin/impala-config.sh

6. Download the third-party packages Impala depends on

cd thirdparty
./download_thirdparty.sh

Note that one of the packages, cyrus-sasl-2.1.23, may fail to download; you can fetch it yourself (it can be found on CSDN, among other places) and extract it into the thirdparty folder. It is best to do this after running download_thirdparty.sh, because that script deletes all the downloaded tar.gz files in the directory.</p>

7. In theory you can now build Impala,</strong> but in practice the build may fail. The problem I hit was Boost-related (I don't remember the exact error); it turned out the Boost version was too old. The default yum repos on CentOS 6.2 ship boost and boost-devel 1.41, but Impala needs 1.44 or later, so you have to build Boost yourself; I used Boost 1.46.</p>

# Remove the already-installed boost and boost-devel
yum remove boost
yum remove boost-devel
# Download boost
# (boost can be downloaded from http://www.boost.org/users/history/)
# Extract after downloading
tar xvzf boost_1_46_0.tar.gz
mv boost_1_46_0 /usr/local/
cd /usr/local/boost_1_46_0
./bootstrap.sh
./bjam
# If the following is printed, the build succeeded
# The Boost C++ Libraries were successfully built!
# The following directory should be added to compiler include paths:
# /usr/local/boost_1_46_0
# The following directory should be added to linker library paths:
# /usr/local/boost_1_46_0/stage/lib
# Now set the Boost and Impala environment variables

export BOOST_ROOT='/usr/local/boost_1_46_0'
export IMPALA_HOME='/home/extend/impala'

# Note: even with boost installed this way, my build still failed with an error about a missing libboost_filesystem-mt.so. That library is provided by boost-devel, so I reinstalled boost-devel.<br /> # I have not tested whether skipping the earlier boost-devel removal causes problems; what I can confirm is that the flow written here works.</p>

yum install boost-devel

8. Now you can finally build Impala</strong>

cd $IMPALA_HOME
./build_public.sh -build_thirdparty
# The build compiles the C++ part first, then builds the Java part with mvn; the whole process is slow (about 1-2 hours in my VM).<br /> # The build output ends up in be/build/debug

9. Python packages needed by impala-shell</strong>

# The first run of impala-shell may fail; two Python packages are needed, thrift and prettytable, installed with easy_install:
easy_install prettytable
easy_install thrift

10. If you think you are done at this point, think again; configuring, starting, and using Impala still brings plenty of strange problems.

Problem 1: which Hive and Hadoop versions to use</strong>
CDH is strict about version dependencies. To make sure Impala runs correctly, it is strongly recommended to use the Hadoop (with native libs already built) and Hive versions bundled in Impala's thirdparty directory.<br /> Hadoop's configuration files live in $HADOOP_HOME/etc/hadoop; note that the native libraries must be enabled

# Edit Hadoop's core-site.xml; apart from this option, the configuration matches the core-site.xml shown in Problem 2<br /> <property>
  <name>hadoop.native.lib</name>
  <value>true</value>
  <description>Should native hadoop libraries, if present, be used.</description>
</property>

Problem 2: where Impala's configuration files live</strong>
The configuration path Impala uses by default is set in bin/set-classpath.sh; it is recommended to change the CLASSPATH section to:

CLASSPATH=\
$IMPALA_HOME/conf:\
$IMPALA_HOME/fe/target/classes:\
$IMPALA_HOME/fe/target/dependency:\
$IMPALA_HOME/fe/target/test-classes:\
${HIVE_HOME}/lib/datanucleus-core-2.0.3.jar:\
${HIVE_HOME}/lib/datanucleus-enhancer-2.0.3.jar:\
${HIVE_HOME}/lib/datanucleus-rdbms-2.0.3.jar:\
${HIVE_HOME}/lib/datanucleus-connectionpool-2.0.3.jar:
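As an aside, the trailing backslash-colon continuation style above is easy to break when editing set-classpath.sh. A sketch of assembling the same CLASSPATH from a list, using a helper of our own (`build_classpath`, not part of Impala); the entries mirror the fragment above and assume IMPALA_HOME/HIVE_HOME are set.

```shell
# Sketch: join classpath entries with ':' without worrying about trailing
# separators or line continuations.
build_classpath() {
  cp=""
  for entry in "$@"; do
    cp="${cp:+$cp:}$entry"   # prepend ':' only when cp is non-empty
  done
  printf '%s\n' "$cp"
}

# CLASSPATH=$(build_classpath \
#   "$IMPALA_HOME/conf" \
#   "$IMPALA_HOME/fe/target/classes" \
#   "$HIVE_HOME/lib/datanucleus-core-2.0.3.jar")
```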

That makes Impala use the conf folder under its own directory for configuration. Create the conf directory and copy three files into it: core-site.xml, hdfs-site.xml, and hive-site.xml.<br /> In core-site.xml, the following options must be configured:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://10.200.4.11:9000</value>
</property>
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.client.use.legacy.blockreader.local</name>
<value>false</value>
</property>
<property>
<name>dfs.client.read.shortcircuit.skip.checksum</name>
<value>false</value>
</property>
</configuration>

The hdfs-site.xml configuration:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.block.local-path-access.user</name>
<value>${your user}</value>
</property>
<property>
<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>${yourdatadir}</value>
</property>
<property>
   <name>dfs.client.use.legacy.blockreader.local</name>
   <value>false</value>
</property>
<property>
   <name>dfs.datanode.data.dir.perm</name>
   <value>750</value>
</property>
<property>
    <name>dfs.client.file-block-storage-locations.timeout</name>
    <value>5000</value>
</property>
<property>
    <name>dfs.domain.socket.path</name>
    <value>/home/extend/cdhhadoop/dn.8075</value>
</property>
</configuration>

Finally, hive-site.xml. This one is simple: just point the metastore at a DBMS (Impala must share its metadata with Hive, because Impala cannot create tables by itself). How to configure MySQL as the Hive metastore is documented in many places; the configuration is as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://10.28.0.190:3306/impala?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>root</value>
  <description>password to use against metastore database</description>
</property>
</configuration>

Remember to copy the mysql-connector jar into Hive's lib directory, and likewise into Impala (copy it to $IMPALA_HOME/fe/target/dependency).

11. Start Impala. At this point Impala can start normally. Note that the official documentation is not very clear about how the Impala services coordinate with each other; following the official steps, you end up starting the Impala service on one machine like this:

# start a single-node impala service
${IMPALA_HOME}/bin/start-impalad.sh -use_statestore=false
# start the impala shell
${IMPALA_HOME}/bin/impala-shell.sh

Then impala-shell can connect to localhost and run queries. Note this is single-node only, useful for verifying that your Impala works at all; to start an Impala cluster, skip to step 12. One particularly odd problem I ran into: show tables and count(1) worked fine, but select * from table crashed the Impala service while reading the data (sometimes with the error: could not find method close from class org/apache/hadoop/fs/FSDataInputStream with signature ()V). I changed two things to fix it:

a. Modify Impala's set-classpath.sh and move out of the $IMPALA_HOME/fe/target/dependency directory every jar starting with hadoop-* except hadoop-auth-2.0.0-*.jar.

# move the hadoop-related jars out of impala's dependency directory, keeping only the auth jar
mv $IMPALA_HOME/fe/target/dependency/hadoo* $IMPALA_HOME
mv $IMPALA_HOME/hadoop-auth*.jar $IMPALA_HOME/fe/target/dependency
# modify bin/set-classpath.sh: add the jars from $HADOOP_HOME by inserting the following
# before the final 'export CLASSPATH' line of set-classpath.sh
for jar in `ls $HADOOP_HOME/share/hadoop/common/*.jar`; do
CLASSPATH=${CLASSPATH}:$jar
done
for jar in `ls $HADOOP_HOME/share/hadoop/yarn/*.jar`; do
CLASSPATH=${CLASSPATH}:$jar
done
for jar in `ls $HADOOP_HOME/share/hadoop/hdfs/*.jar`; do
CLASSPATH=${CLASSPATH}:$jar
done
for jar in `ls $HADOOP_HOME/share/hadoop/mapreduce/*.jar`; do
CLASSPATH=${CLASSPATH}:$jar
done
for jar in `ls $HADOOP_HOME/share/hadoop/tools/lib/*.jar`; do
CLASSPATH=${CLASSPATH}:$jar
done
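The five near-identical loops can also be collapsed into one helper; a sketch (the function name is mine; the directory list mirrors the loops above and assumes $HADOOP_HOME is set):

```shell
# append_jars: append every *.jar found in the given directories to CLASSPATH
append_jars() {
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      # when a directory holds no jars, the glob stays literal; skip it
      [ -e "$jar" ] || continue
      CLASSPATH=${CLASSPATH}:$jar
    done
  done
}

append_jars \
  "$HADOOP_HOME/share/hadoop/common" \
  "$HADOOP_HOME/share/hadoop/yarn" \
  "$HADOOP_HOME/share/hadoop/hdfs" \
  "$HADOOP_HOME/share/hadoop/mapreduce" \
  "$HADOOP_HOME/share/hadoop/tools/lib"
```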

b. Note that Impala can only handle Hive's default column delimiter; if a table was created in Hive with a custom delimiter, the Impala service crashes inexplicably while reading its data.

12. Start an Impala cluster
Impala actually consists of two parts: the StateStore, which coordinates the machines (like a master), and impalad (like a slave). Start them as follows:

# start the statestore
# Option 1: use the python script under impala/bin directly;
# this script starts a StateStore and also starts -s Impala services on the local machine
$IMPALA_HOME/bin/start-impala-cluster.py -s 1 --log_dir /home/extend/impala/impalaLogs
# Option 2: start the StateStore manually
$IMPALA_HOME/be/build/debug/statestore/statestored -state_store_port=24000

# start the impala service
# run this command on every node where impala was compiled and installed
# -state_store_host names the machine running the StateStore
# -nn (namenode) names the hadoop namenode
# -nn_port is the namenode's HDFS port
$IMPALA_HOME/bin/start-impalad.sh -state_store_host=m11 -nn=m11 -nn_port=9000

Once everything is up, visit http://${stateStore_Server}:25010/ to see the StateStore's status; its subscribers page shows the impala service nodes that have connected.

13. Use the Impala client
This last step is the simplest: pick any machine and run

$IMPALA_HOME/bin/impala-shell.sh
# once the shell is up, connect to any impala service
connect m12
# after connecting you can run show tables and similar statements
# note: if a table is created or its schema changed through Hive, the impala nodes do not know about it;
# you must connect to each impala service with the client and run refresh to reload the metadata,
# or restart all impala services
#或者重启所有impala service

ivaneeo 2013-06-29 17:12 发表评论
]]>
Virtual Desktophttp://www.aygfsteel.com/ivanwan/archive/2012/10/20/389916.htmlivaneeoivaneeoSat, 20 Oct 2012 05:18:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2012/10/20/389916.htmlhttp://www.aygfsteel.com/ivanwan/comments/389916.htmlhttp://www.aygfsteel.com/ivanwan/archive/2012/10/20/389916.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/389916.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/389916.html
8 Virtual Desktop programs: Ulteo, NX Enterprise Server, FOSS-Cloud, Oracle VirtualBox, Thinstuff, JetClouding, GoGrid, 2X Cloud Computing


ivaneeo 2012-10-20 13:18 发表评论
]]>
kvm创徏http://www.aygfsteel.com/ivanwan/archive/2012/06/08/380368.htmlivaneeoivaneeoFri, 08 Jun 2012 09:55:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2012/06/08/380368.htmlhttp://www.aygfsteel.com/ivanwan/comments/380368.htmlhttp://www.aygfsteel.com/ivanwan/archive/2012/06/08/380368.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/380368.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/380368.html
sudo qemu-img create -f qcow2 -o size=30240M,preallocation=metadata win2003_hda.img
http://blog.kreyolys.com/2011/09/27/kvm-virtual-machines-disk-format-file-basedqcow2-or-block-devicelvm2/ --- a comparison of file-based qcow2 vs LVM block devices
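The point of `preallocation=metadata` above is that the qcow2 image is created sparse: it advertises its full virtual size but only consumes disk blocks as the guest writes. The effect can be illustrated with an ordinary sparse file, no qemu required (the file name here is just an example):

```shell
# create a 100 MB sparse file: large apparent size, almost no blocks allocated
truncate -s 100M sparse_demo.img

apparent=$(stat -c %s sparse_demo.img)     # logical size in bytes
actual=$(du -k sparse_demo.img | cut -f1)  # blocks actually allocated, in KB

echo "apparent: $apparent bytes, actual: ${actual} KB"
rm -f sparse_demo.img
```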
sudo virt-install \
--name win2003_test \
--ram=1024 \
--vcpus=2 \
--disk /kvm/win2003_hda.img,bus=virtio \
--network bridge:br0,model=virtio \
--vnc \
--accelerate \
-c /share/os/win2003-i386.iso \
--disk /home/kvm/virtio-win-1.1.16.vfd,device=floppy \
-c /home/kvm/virtio-win-0.1-22.iso \
--os-type=windows \
--os-variant=win2k3 \
--noapic \
--connect \
qemu:///system \
--hvm

http://www.howtoforge.com/installing-kvm-guests-with-virt-install-on-ubuntu-12.04-lts-server


Paravirtualization reference:
#!/bin/sh
WINISO=/path/to/win7.iso    #Windows ISO
INSTALLDISK=win7virtio.img  #Disk location. Can be LVM LV
VFD=http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/bin/virtio-win-1.1.16.vfd
DRVRISO=http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/bin/virtio-win-0.1-22.iso

[ -e $(basename $VFD) ]     || wget $VFD
[ -e $(basename $DRVRISO) ] || wget $DRVRISO
[ -e $INSTALLDISK ]         || qemu-img create $INSTALLDISK 30G

sudo virt-install -c qemu:///system --virt-type kvm --name win7virtio --ram 1024 --disk path="$INSTALLDISK",bus=virtio \
--disk $(basename $VFD),device=floppy --os-variant win7 --cdrom $(basename $DRVRISO) --cdrom "$WINISO" --vcpus 2

Other references:

 

In my previous article KVM Guests: Using Virt-Install to Import an Existing Disk Image we discussed how to use virt-install to import an existing disk image, which already has an OS installed into it.  Additionally in KVM Guests: Using Virt-Install to Install Debian and Ubuntu Guests I documented how to initiate an install directly off of the apt mirror of your choice for Debian and Ubuntu Guests using virt-install.  In this article we will use virt-install to create a guest and begin the installation using a CD or ISO image for installation media.

Assumptions I Have Made

  • My KVM host is Ubuntu 10.10 and I am assuming that yours is as well.  If it is not then the syntax might be slightly different or may not include the same features.
  • That you have kvm installed on the host and you can manually create VMs using virt-manager and they work perfectly.
  • That you have a bridge configured and working on other guests.
  • That you have virt-install and libvirt-bin installed as well as virt-manager or virt-viewer so that you can complete the install after the virt-install command has completed.
  • That you are trying to import disk images that support VirtIO devices (most recent Linux distributions, Windows does not natively support the VirtIO interface, so you will had to have manually installed the VirtIO drivers into your disk image).

The Basic Command

# virt-install -n vmname -r 2048 --os-type=linux --os-variant=ubuntu --disk /kvm/images/disk/vmname_boot.img,device=disk,bus=virtio,size=40,sparse=true,format=raw -w bridge=br0,model=virtio --vnc --noautoconsole -c /kvm/images/iso/ubuntu.iso

Parameters Detailed

  • -n vmname [the name of your VM]
  • -r 2048 [the amount of RAM in MB for your VM]
  • –os-type=linux [the type of OS linux or windows]
  • –os-variant=ubuntu [the distribution or version of Windows for a full list see man virt-install]
  • –disk /kvm/images/disk/vmname_boot.img,device=disk,bus=virtio,size=40,sparse=true,format=raw [this is a long one you define the path, then comma delimited options, device is the type of storage cdrom, disk, floppy, bus is the interface ide, scsi, usb, virtio - virtio is the fastest but you need to install the drivers for Windows and older versions of Linux don't have support]
  • -w bridge=br0,model=virtio [the network configuration, in this case we are connecting to a bridge named br0, and using the virtio drivers which perform much better if you are using an OS which doesn't support virtio you can use e1000 or rtl8139.  You could alternatively use --nonetworks if you do not need networking]
  • –vnc [configures the graphics card to use VNC allowing you to use virt-viewer or virt-manager to see the desktop as if you were at the a monitor of a physical machine]
  • –noautoconsole [configures the installer to NOT automatically try to open virt-viewer to view the console to complete the installation - this is helpful if you are working on a remote system through SSH]
  • -c /kvm/images/iso/ubuntu.iso [this option specifies the cdrom device or iso image with which to boot off of.  You could additionally specify the cdrom device as a disk device, and not use the -c option, it will then boot off of the cdrom if you don't specify another installation method]

LVM Disk Variation

# virt-install -n vmname -r 2048 --os-type=linux --os-variant=ubuntulucid  --disk  /dev/vg_name/lv_name,device=disk,bus=virtio  -w bridge=br0,model=virtio --vnc --noautoconsole -c  /kvm/images/iso/ubuntu.iso

No VirtIO Variation (Uses IDE and e1000 NIC Emulation)

# virt-install -n vmname -r 2048 --os-type=linux  --os-variant=ubuntulucid --disk  /kvm/images/disk/vmname_boot.img,device=disk,bus=ide,size=40,sparse=true,format=raw  -w bridge=br0,model=e1000 --vnc --noautoconsole -c  /kvm/images/iso/ubuntu.iso

Define VM Without Installation Method

# virt-install -n vmname -r 2048 --os-type=linux --os-variant=ubuntulucid --disk /kvm/images/disk/vmname_boot.img,device=disk,bus=virtio,size=40,sparse=true,format=raw --disk /kvm/images/iso/ubuntu.iso,device=cdrom -w bridge=br0,model=virtio --vnc --noautoconsole

 



ivaneeo 2012-06-08 17:55 发表评论
]]>
Cassandra、MongoDB、CouchDB、Redis、Riak、HBase比较http://www.aygfsteel.com/ivanwan/archive/2011/07/05/353713.htmlivaneeoivaneeoTue, 05 Jul 2011 07:11:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2011/07/05/353713.htmlhttp://www.aygfsteel.com/ivanwan/comments/353713.htmlhttp://www.aygfsteel.com/ivanwan/archive/2011/07/05/353713.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/353713.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/353713.html

This title is a bit of clickbait. With NoSQL riding high today, NoSQL products are blooming everywhere, but each product has its own characteristics: strengths as well as scenarios it does not suit. This article analyzes the features of Cassandra, MongoDB, CouchDB, Redis, Riak, and HBase from several angles; I hope that after reading it you will have a sense of what each of these NoSQL products offers.

CouchDB

  • Written in: Erlang
  • Main point: DB consistency, ease of use
  • License: Apache
  • Protocol: HTTP/REST
  • Bi-directional (!) replication,
  • continuous or ad-hoc,
  • with conflict detection,
  • thus, master-master replication. (!)
  • MVCC – write operations do not block reads
  • Previous versions of documents are available
  • Crash-only (reliable) design
  • Needs compacting from time to time
  • Views: embedded map/reduce
  • Formatting views: lists & shows
  • Server-side document validation possible
  • Authentication possible
  • Real-time updates via _changes (!)
  • Attachment handling
  • thus, CouchApps (standalone js apps)
  • jQuery library included

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.

Redis

  • Written in: C/C++
  • Main point: Blazing fast
  • License: BSD
  • Protocol: Telnet-like
  • Disk-backed in-memory database,
  • but since 2.0, it can swap to disk.
  • Master-slave replication
  • Simple keys and values,
  • but complex operations like ZREVRANGEBYSCORE
  • INCR & co (good for rate limiting or statistics)
  • Has sets (also union/diff/inter)
  • Has lists (also a queue; blocking pop)
  • Has hashes (objects of multiple fields)
  • Of all these databases, only Redis does transactions (!)
  • Values can be set to expire (as in a cache)
  • Sorted sets (high score table, good for range queries)
  • Pub/Sub and WATCH on data changes (!)

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: Stock prices. Analytics. Real-time data collection. Real-time communication.

MongoDB

  • Written in: C++
  • Main point: Retains some friendly properties of SQL. (Query, index)
  • License: AGPL (Drivers: Apache)
  • Protocol: Custom, binary (BSON)
  • Master/slave replication
  • Queries are javascript expressions
  • Run arbitrary javascript functions server-side
  • Better update-in-place than CouchDB
  • Sharding built-in
  • Uses memory mapped files for data storage
  • Performance over features
  • After crash, it needs to repair tables
  • Better durability coming in V1.8

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: For all things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.

Cassandra

  • Written in: Java
  • Main point: Best of BigTable and Dynamo
  • License: Apache
  • Protocol: Custom, binary (Thrift)
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Querying by column, range of keys
  • BigTable-like features: columns, column families
  • Writes are much faster than reads (!)
  • Map/reduce possible with Apache Hadoop
  • I admit being a bit biased against it, because of the bloat and complexity it has partly because of Java (configuration, seeing exceptions, etc)

Best used: When you write more than you read (logging). If every component of the system must be in Java. (“No one gets fired for choosing Apache’s stuff.”)

For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is real time data analysis.

Riak

  • Written in: Erlang & C, some Javascript
  • Main point: Fault tolerance
  • License: Apache
  • Protocol: HTTP/REST
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Pre- and post-commit hooks,
  • for validation and security.
  • Built-in full-text search
  • Map/reduce in javascript or Erlang
  • Comes in “open source” and “enterprise” editions

Best used: If you want something Cassandra-like (Dynamo-like), but no way you’re gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you’re ready to pay for multi-site replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt.

HBase

  • Written in: Java
  • Main point: Billions of rows X millions of columns
  • License: Apache
  • Protocol: HTTP/REST (also Thrift)
  • Modeled after BigTable
  • Map/reduce with Hadoop
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A high performance Thrift gateway
  • HTTP supports XML, Protobuf, and binary
  • Cascading, hive, and pig source and sink modules
  • Jruby-based (JIRB) shell
  • No single point of failure
  • Rolling restart for configuration changes and minor upgrades
  • Random access performance is like MySQL

Best used: If you’re in love with BigTable. :) And when you need random, realtime read/write access to your Big Data.

For example: Facebook Messaging Database (more general example coming soon)

Original article: Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison



ivaneeo 2011-07-05 15:11 发表评论
]]>
Java virtual machine type unloading and type updating analysishttp://www.aygfsteel.com/ivanwan/archive/2011/06/16/352458.htmlivaneeoivaneeoThu, 16 Jun 2011 12:05:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2011/06/16/352458.htmlhttp://www.aygfsteel.com/ivanwan/comments/352458.htmlhttp://www.aygfsteel.com/ivanwan/archive/2011/06/16/352458.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/352458.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/352458.html Having systematically discussed Java type loading in earlier posts, this article analyzes Java type unloading, and briefly looks at how to load a newly compiled version of a class at runtime.
[Excerpts from the relevant specifications]
    First, here is what the Java virtual machine specification says about type unloading:
    A class or interface may be unloaded if and only if its class loader is unreachable. The bootstrap class loader is always reachable; as a result, system classes may never be unloaded.
    That is all the JVM specification says about type unloading. Roughly: a loaded type is unloaded only when the instance of the class loader that loaded it (the instance, not the class loader class) becomes unreachable; the bootstrap class loader instance is always reachable, so types loaded by the bootstrap class loader may never be unloaded.

    Now let us look at the more detailed information on type unloading provided by the Java Language Specification (partial excerpt):
    //from JLS 12.7 Unloading of Classes and Interfaces
    1. An implementation of the Java programming language may unload classes.
    2. Class unloading is an optimization that helps reduce memory use. Obviously, the semantics of a program should not depend on whether and how a system chooses to implement an optimization such as class unloading.
    3. Consequently, whether a class or interface has been unloaded or not should be transparent to a program.

    From the above we can conclude: type unloading exists purely as a performance optimization to reduce memory use; the specifics depend on the VM implementation and are transparent to developers.

    Searching the Java language specification and its related API specifications, you will find no explicit interface for type unloading. In other words:
    1. The chance that an already-loaded type gets unloaded is very small, and in any case the moment of unloading is nondeterministic.
    2. A type loaded by a particular class loader instance can be considered impossible to update at runtime.
[Further analysis of type unloading]
    As mentioned above, to unload a type, the class loader instance that loaded it must be in the unreachable state. Here is how the specification describes reachability:
    1. A reachable object is any object that can be accessed in any potential continuing computation from any live thread.
    2. finalizer-reachable: A finalizer-reachable object can be reached from some finalizable object through some chain of references, but not from any live thread. An unreachable object cannot be reached by either means.

    To some extent, in even a moderately complex Java application it is hard to judge precisely whether an instance is unreachable, so to drive objects into this so-called unreachable state more reliably, the test code below is kept as simple as possible.
    
    [Test scenario 1] Load with a custom class loader, then try to drive it into the unreachable state
    Notes:
    1. A custom class loader (for simplicity, assume it loads classes from a folder on the D: drive outside the current project)
    2. Assume the bytecode of a simple custom class MyClass exists under D:/classes
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class MyURLClassLoader extends URLClassLoader {
   public MyURLClassLoader() {
      super(getMyURLs());
   }

   private static URL[] getMyURLs() {
    try {
       return new URL[]{new File("D:/classes/").toURL()};
    } catch (Exception e) {
       e.printStackTrace();
       return null;
    }
  }
}

public class Main {
    public static void main(String[] args) {
      try {
         MyURLClassLoader classLoader = new MyURLClassLoader();
         Class classLoaded = classLoader.loadClass("MyClass");
         System.out.println(classLoaded.getName());

         classLoaded = null;
         classLoader = null;

         System.out.println("开始GC");
         System.gc();
         System.out.println("GC完成");
       } catch (Exception e) {
           e.printStackTrace();
       }
    }
}

        Adding the VM argument -verbose:gc to observe garbage collection, the corresponding output is:
MyClass
开始GC
[Full GC[Unloading class MyClass]
207K->131K(1984K), 0.0126452 secs]
GC完成

    [Test scenario 2] Load with the system class loader; it cannot be driven into the unreachable state
     Note: place the MyClass bytecode from scenario 1 into the project's output directory so that the system class loader can load it.
public class Main {
    public static void main(String[] args) {
     try {
      Class classLoaded = ClassLoader.getSystemClassLoader().loadClass(
"MyClass");

      System.out.println(sun.misc.Launcher.getLauncher().getClassLoader());
      System.out.println(classLoaded.getClassLoader());
      System.out.println(Main.class.getClassLoader());

      classLoaded = null;

      System.out.println("开始GC");
      System.gc();
      System.out.println("GC完成");

      // check whether the system class loader is still referenced (i.e. whether it is unreachable)
      System.out.println(Main.class.getClassLoader());
     } catch (Exception e) {
         e.printStackTrace();
     }
   }
}
        
        Adding the VM argument -verbose:gc to observe garbage collection, the corresponding output is:
sun.misc.Launcher$AppClassLoader@197d257
sun.misc.Launcher$AppClassLoader@197d257
sun.misc.Launcher$AppClassLoader@197d257
开始GC
[Full GC 196K->131K(1984K), 0.0130748 secs]
GC完成
sun.misc.Launcher$AppClassLoader@197d257

        Because the system ClassLoader instance (sun.misc.Launcher$AppClassLoader@197d257) has loaded many types, and there is no explicit interface for setting it to null, we cannot drive the system class loader instance that loaded MyClass into the unreachable state; the test result therefore shows that MyClass was not unloaded. (Note: rather special objects such as class loader instances are generally referenced in many places and stay in the VM for a fairly long time.)

    [Test scenario 3] Load with the standard extension class loader; it cannot be driven into the unreachable state

        Note: package the MyClass bytecode from test scenario 2 into a jar and place it in the JRE extension directory so that the extension class loader can load it. Because the standard extension ClassLoader instance (sun.misc.Launcher$ExtClassLoader@7259da) has loaded many types, and there is no explicit interface for setting it to null, we cannot drive the extension class loader instance that loaded MyClass into the unreachable state; the test result therefore shows that MyClass was not unloaded.
        
public class Main {
     public static void main(String[] args) {
       try {
         Class classLoaded = ClassLoader.getSystemClassLoader().getParent()
.loadClass("MyClass");

         System.out.println(classLoaded.getClassLoader());

         classLoaded = null;

         System.out.println("开始GC");
         System.gc();
         System.out.println("GC完成");
         // check whether the standard extension class loader is still referenced (i.e. whether it is unreachable)
         System.out.println(Main.class.getClassLoader().getParent());
      } catch (Exception e) {
         e.printStackTrace();
      }
   }
}

        Adding the VM argument -verbose:gc to observe garbage collection, the corresponding output is:
sun.misc.Launcher$ExtClassLoader@7259da
开始GC
[Full GC 199K->133K(1984K), 0.0139811 secs]
GC完成
sun.misc.Launcher$ExtClassLoader@7259da


    We need no test for the bootstrap class loader: the JVM specification and the JLS already state its behavior explicitly.


    [Summary of type unloading]
    From the related tests above (admittedly, the test scenarios are fairly simple), we can roughly summarize:
    1. Types loaded by the bootstrap class loader can never be unloaded during the whole run (per the JVM specification and the JLS).
    2. Types loaded by the system class loader or the standard extension class loader are unlikely to be unloaded at runtime, because the system class loader instance and the standard extension class loader instance can almost always be reached, directly or indirectly, throughout the run; the chance of them becoming unreachable is tiny. (It can of course happen as the VM is about to exit, since ClassLoader instances and Class (java.lang.Class) instances live on the heap like any other object and follow the same garbage-collection rules.)
    3. Types loaded by a developer's custom class loader instance can only be unloaded in very simple contexts, and usually only by forcing the VM's garbage collection. One can imagine that in any slightly more complex scenario (especially since developers often cache class loader instances to improve performance), loaded types are almost never unloaded at runtime either (or at least the moment of unloading is nondeterministic).

      Combining the three points above, we can accept conclusion 1 from earlier: the chance that an already-loaded type gets unloaded is very small, and in any case the moment of unloading is nondeterministic. At the same time, we can see that developers should not implement any feature of their system on assumptions about the VM's type unloading.
    
      [Further analysis of type updating]
    It was clearly stated above that a particular type loaded by a particular class loader instance cannot be updated at runtime. Note that this means a particular class loader instance, not a particular class loader class.
    
        [Test scenario 4]
        Note: now delete the MyClass bytecode previously placed in the project output directory and in the extension directory.
public class Main {
     public static void main(String[] args) {
       try {
         MyURLClassLoader classLoader = new MyURLClassLoader();
         Class classLoaded1 = classLoader.loadClass("MyClass");
         Class classLoaded2 = classLoader.loadClass("MyClass");
         // check whether the two loads used the same classloader instance
         System.out.println(classLoaded1.getClassLoader() == classLoaded2.getClassLoader());

         // check whether the two Class instances are the same
         System.out.println(classLoaded1 == classLoaded2);
       } catch (Exception e) {
          e.printStackTrace();
       }
    }
}
        The output:
        true
        true

        From the result we can see that the two loads return the same Class instance. So did our custom class loader really load the class twice, i.e. go through the whole process from fetching the class bytes to defining the class type?
        Debugging java.lang.ClassLoader's loadClass(String name, boolean resolve) method shows that the second load is not a real load at all; it simply returns the result of the previous load.

       Note: for convenient debugging, set a breakpoint on the line Class classLoaded2 = classLoader.loadClass("MyClass"); and then step into it; you can see that the second load request directly returns the Class instance from the previous load. (A screenshot of the debugging session was here; it is best to debug it yourself.)
      
     
        [Test scenario 5] The same class loader instance repeatedly loads the same type
        Note: first the existing custom class loader must be modified to override the inherited loading logic. Modify MyURLClassLoader.java as follows, then rerun the test code from scenario 4.
public class MyURLClassLoader extends URLClassLoader {
    // the omitted code is the same as before; only the following overriding method is added
    /*
     * Override the default loading logic: types under D:/classes/ are forced
     * through a full reload every time
     *
     * @see java.lang.ClassLoader#loadClass(java.lang.String)
     */
    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
     try {
       // first let the system class loader try to load the class
       Class c = ClassLoader.getSystemClassLoader().loadClass(name);
       return c;
     } catch (ClassNotFoundException e) {
      // if the system class loader and its parents cannot load the class,
      // fall back to our own logic for loading the types under D:/classes/
         return this.findClass(name);
     }
  }
}
Note: this.findClass(name) goes on to call the corresponding method of the parent class URLClassLoader, which involves a call to defineClass; so the class loader MyURLClassLoader now performs a genuine forced load of the types under the D:/classes/ directory and defines the corresponding type information.

        The test output:
       Exception in thread "main" java.lang.LinkageError: duplicate class definition: MyClass
       at java.lang.ClassLoader.defineClass1(Native Method)
       at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
       at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
       at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
       at java.net.URLClassLoader.access$100(URLClassLoader.java:56)
       at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
       at MyURLClassLoader.loadClass(MyURLClassLoader.java:21)
       at Main.main(Main.java:7)

       Conclusion: if the same class loader instance repeatedly force-loads (including the class-defining defineClass step) the same type, it raises java.lang.LinkageError: duplicate class definition.
    
       [Test scenario 6] Different instances of the same class loader class repeatedly load the same type
       
public class Main {
    public static void main(String[] args) {
      try {
        MyURLClassLoader classLoader1 = new MyURLClassLoader();
        Class classLoaded1 = classLoader1.loadClass("MyClass");
        MyURLClassLoader classLoader2 = new MyURLClassLoader();
        Class classLoaded2 = classLoader2.loadClass("MyClass");

        // check whether the two Class instances are the same
        System.out.println(classLoaded1 == classLoaded2);
      } catch (Exception e) {
         e.printStackTrace();
      }
   }
}

      The corresponding test output:
      false
     
    
        [Summary of type updating]
     Repeatedly force-loading (including the class-defining defineClass step) the same type with different class loader instances does not raise java.lang.LinkageError, but the resulting Class instances are different, i.e. they are actually different types (even though package name + class name are identical). A forced cast between them raises ClassCastException. (Note: as explained in an article a while back, types of the same name loaded by different class loaders really are different types; in the JVM a class is uniquely identified by its full name together with the instance of the ClassLoader that loaded it, and types loaded by different class loaders are placed in different namespaces.)


        Application scenario: during development we may need to dynamically load different versions of a given class file so that the corresponding feature can be updated on the fly. Two notes:
        1. Do not pin your hopes on waiting for the previous version of the type to be unloaded; unloading behavior is opaque to Java developers.
        2. The reliable approach is to create a new instance of the particular class loader each time a different version of the type is loaded. In this usage you generally give up the performance optimization of caching class loader instances. Versions of the type that were loaded earlier will become unreachable at an appropriate time and be unloaded and garbage collected. Each time you finish with a particular class loader instance (and are sure it is no longer needed), explicitly assign it null; this may bring it to the unreachable state described in the JVM specification sooner, raising the chance that no-longer-used type versions are unloaded quickly.
        3. It must be said that loading each version of a type with a fresh class loader instance does cost some memory; class loader instances typically stay in memory for quite a while. There is a related article on BEA's developer site (with a section specifically analyzing ClassLoader): http://dev2dev.bea.com/pub/a/2005/06/memory_leaks.html

           While writing this I consulted the JVM specification and the JLS, as well as some bug-analysis documents on Sun's official site.

           Criticism and corrections are welcome!

All articles and notes on this blog are original unless the title contains the words quoted or reposted. Please credit the source when reposting; thank you!

ivaneeo 2011-06-16 20:05 发表评论
]]>
Starting an HBase region server standalonehttp://www.aygfsteel.com/ivanwan/archive/2011/06/16/352414.htmlivaneeoivaneeoThu, 16 Jun 2011 04:10:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2011/06/16/352414.htmlhttp://www.aygfsteel.com/ivanwan/comments/352414.htmlhttp://www.aygfsteel.com/ivanwan/archive/2011/06/16/352414.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/352414.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/352414.htmlStart all the regionservers in the cluster
./hbase-daemons.sh start regionserver
Start a single regionserver
./hbase-daemon.sh start regionserver
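After issuing a start command the JVM can take a few seconds to come up; a small helper (a sketch; `wait_for_proc` and the example process name are my own, not part of the HBase scripts) that polls until a matching process appears:

```shell
# wait_for_proc NAME [TIMEOUT_SECONDS] - poll pgrep until a process whose
# command line matches NAME exists, or give up after TIMEOUT_SECONDS (default 30)
wait_for_proc() {
  name=$1
  timeout=${2:-30}
  i=0
  while [ "$i" -lt "$timeout" ]; do
    if pgrep -f "$name" >/dev/null 2>&1; then
      echo "$name is up"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "$name did not start within ${timeout}s" >&2
  return 1
}

# e.g. after ./hbase-daemon.sh start regionserver:
# wait_for_proc HRegionServer 60
```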


ivaneeo 2011-06-16 12:10 发表评论
]]>
Accessing HTable data in HBase</title><link>http://www.aygfsteel.com/ivanwan/archive/2011/06/15/352369.html</link><dc:creator>ivaneeo</dc:creator><author>ivaneeo</author><pubDate>Wed, 15 Jun 2011 09:17:00 GMT</pubDate><guid>http://www.aygfsteel.com/ivanwan/archive/2011/06/15/352369.html</guid><wfw:comment>http://www.aygfsteel.com/ivanwan/comments/352369.html</wfw:comment><comments>http://www.aygfsteel.com/ivanwan/archive/2011/06/15/352369.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.aygfsteel.com/ivanwan/comments/commentRss/352369.html</wfw:commentRss><trackback:ping>http://www.aygfsteel.com/ivanwan/services/trackbacks/352369.html</trackback:ping><description><![CDATA[<div>
<p>After a few days on this project I have become familiar with table operations in HBase. Below is a summary of the common operations and the spots where mistakes are easy to make. Most of it comes from articles by others; I have added a little explanation on top.</p>
<p>1. Connect to the HBase table testtable (username: root, password: root):</p>
<pre>public void ConnectHBaseTable()
{
    Configuration conf = new Configuration();
    conf.set("hadoop.job.ugi", "root,root");
    HBaseConfiguration config = new HBaseConfiguration();
    try
    {
        table = new HTable(config, "testtable");
    } catch (Exception e) { e.printStackTrace(); }
}</pre>
<p>2. Get one row of data by row key into a Result. Note that HBase stores table data as bytes. The example below fetches the famA:col1 cell of the row whose key is "name":</p>
<pre>Get get = new Get(Bytes.toBytes("name"));
Result result = hTable.get(get);
byte[] value = result.getValue(famA, col1);
System.out.println(Bytes.toString(value));</pre>
<p>3. Put data into the table. The example below writes one row: row key "abcd", famA:col1 = "hello world!":</p>
<pre>byte[] rowId = Bytes.toBytes("abcd");
byte[] famA = Bytes.toBytes("famA");
byte[] col1 = Bytes.toBytes("col1");
Put put = new Put(rowId).add(famA, col1, Bytes.toBytes("hello world!"));
hTable.put(put);</pre>
<p>4. Scans: a convenient way to fetch exactly the data you need, comparable to an SQL query:</p>
<pre>byte[] famA = Bytes.toBytes("famA");
byte[] col1 = Bytes.toBytes("col1");

HTable hTable = new HTable("test");

// query rows whose keys range from "a" to "z"
Scan scan = new Scan(Bytes.toBytes("a"), Bytes.toBytes("z"));
// scan.setStartRow(Bytes.toBytes("")) sets the start row
// scan.setStopRow(Bytes.toBytes("")) sets the stop row

// query the famA:col1 column
scan.addColumn(famA, col1);

// note the filter idiom below; filters play the role of SQL's WHERE clause
// famA:col1 equals "hello world!"
SingleColumnValueFilter singleColumnValueFilterA = new SingleColumnValueFilter(
        famA, col1, CompareOp.EQUAL, Bytes.toBytes("hello world!"));
singleColumnValueFilterA.setFilterIfMissing(true);

// famA:col1 equals "hello hbase!"
SingleColumnValueFilter singleColumnValueFilterB = new SingleColumnValueFilter(
        famA, col1, CompareOp.EQUAL, Bytes.toBytes("hello hbase!"));
singleColumnValueFilterB.setFilterIfMissing(true);

// famA:col1 matches either of the two
FilterList filter = new FilterList(Operator.MUST_PASS_ONE, Arrays
        .asList((Filter) singleColumnValueFilterA,
                singleColumnValueFilterB));

scan.setFilter(filter);

ResultScanner scanner = hTable.getScanner(scan);
// iterate over the results
for (Result result : scanner) {
    System.out.println(Bytes.toString(result.getValue(famA, col1)));
}</pre>
<p>5. A common source of errors in the code above is importing the classes from the wrong packages: a class name may appear in several sub-packages, so choose carefully and prefer HBase's own packages. The common imports are:</p>
<pre>import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FilterList.Operator;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;</pre>
<p>6. Frequently used helper operations:</p>
<p>(1) Timestamp to time string. A raw timestamp alone gives no intuitive reading. (Note the pattern must use mm for minutes; MM means months.)</p>
<pre>public String GetTimeByStamp(String timestamp)
{
    long datatime = Long.parseLong(timestamp);
    Date date = new Date(datatime);
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    String timeresult = format.format(date);
    System.out.println("Time : " + timeresult);
    return timeresult;
}</pre>
<p>(2) Time string to timestamp. Note the time is in string form; string/Date conversion itself is not elaborated here.</p>
<pre>public String GetStampByTime(String time)
{
    String Stamp = "";
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    Date date;
    try
    {
        date = sdf.parse(time);
        Stamp = date.getTime() + "000";
        System.out.println(Stamp);
    } catch (Exception e) { e.printStackTrace(); }
    return Stamp;
}</pre>
<p>Those are my notes so far. As I run into new problems I will come back and work through them.</p>
<p>Reference: <a>http://www.nearinfinity.com/blogs/aaron_mccurry/using_hbase-dsl.html</a></p>
</div><img src="http://www.aygfsteel.com/ivanwan/aggbug/352369.html" width="1" height="1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.aygfsteel.com/ivanwan/" target="_blank">ivaneeo</a> 2011-06-15 17:17 <a href="http://www.aygfsteel.com/ivanwan/archive/2011/06/15/352369.html#Feedback" target="_blank" style="text-decoration:none;">Post Comment</a></div>]]></description></item><item><title>HBase performance tuning</title><link>http://www.aygfsteel.com/ivanwan/archive/2011/06/15/352350.html</link><dc:creator>ivaneeo</dc:creator><author>ivaneeo</author><pubDate>Wed, 15 Jun 2011 05:39:00 GMT</pubDate><guid>http://www.aygfsteel.com/ivanwan/archive/2011/06/15/352350.html</guid><wfw:comment>http://www.aygfsteel.com/ivanwan/comments/352350.html</wfw:comment><comments>http://www.aygfsteel.com/ivanwan/archive/2011/06/15/352350.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.aygfsteel.com/ivanwan/comments/commentRss/352350.html</wfw:commentRss><trackback:ping>http://www.aygfsteel.com/ivanwan/services/trackbacks/352350.html</trackback:ping><description><![CDATA[

The official HBase Book's Performance Tuning chapters are not indexed by configuration option, so they cannot be consulted quickly. I have therefore reorganized the original text, driven by configuration option, and supplemented it with some understanding of my own. Corrections are welcome if you spot mistakes.

<h3>Configuration tuning</h3>

<strong>zookeeper.session.timeout</strong>
<strong>Default</strong>: 3 minutes (180000 ms)
Description: the session timeout between a RegionServer and ZooKeeper. When the timeout expires, the RegionServer is removed from the RS cluster list by ZooKeeper; on receiving the removal notice, the HMaster rebalances the regions that server was responsible for, letting other surviving RegionServers take them over.
Tuning:
This timeout determines whether a RegionServer can fail over promptly. Setting it to 1 minute or lower shortens the failover delay that comes from waiting out the timeout.
Note, though, that for some online applications the time from a RegionServer going down to recovering is itself very short (network blips, crashes and other faults that ops can step into quickly). Lowering the timeout can then do more harm than good: once the RegionServer is formally removed from the RS cluster, the HMaster starts rebalancing, and if the failed RS recovers quickly that rebalance is pointless; it just leaves the load uneven and puts extra burden on the RSes.

<strong>hbase.regionserver.handler.count</strong>
<strong>Default</strong>: 10
Description: the number of request-handling IO threads on a RegionServer.
Tuning:
Tuning this parameter is closely tied to memory.
Fewer IO threads suit Big PUT scenarios, where a single request costs a lot of memory (large single puts, or scans configured with a big cache, both count as Big PUTs), or RegionServers whose memory is already tight.
More IO threads suit scenarios with low per-request memory cost and very high required TPS. Note that if the server hosts few regions and a flood of requests lands on one region, the memstore filling up quickly and triggering flushes causes read/write locking that hurts global TPS, so a higher thread count is not always better.
When load testing, turn on <a title="Enabling RPC-level logging">Enabling RPC-level logging</a> so you can watch each request's memory consumption and GC behavior at the same time, then set the IO thread count based on the combined results of several runs.
For reference, Hadoop and HBase Optimization for Read Intensive Search Applications sets the IO thread count to 100 on SSD machines.

<strong>hbase.hregion.max.filesize</strong>
<strong>Default</strong>: 256M
Description: the maximum size of a single region on the current RegionServer; once a region exceeds this value, it is automatically split into smaller regions.
Tuning:
Small regions are friendly to split and compaction, because splitting a region or compacting the storefiles inside one is fast and memory-light. The drawback is that splits and compactions become very frequent.
In particular, a large number of small regions endlessly splitting and compacting makes response times swing wildly, and too many regions is not only a management headache but can even trigger HBase bugs. Anything under 512M generally counts as a small region.

Large regions are less suited to frequent splits and compactions, since a single compact or split produces a long pause that hits the application's read/write performance hard. Besides, a large region implies a large storefile, which makes compaction a challenge for memory as well.
Large regions still have their place, though: if you run a unified compact and split only at some low-traffic hour, they show their strength, guaranteeing smooth read/write performance the vast majority of the time.

Since splits and compactions affect performance so much, can they be avoided?
Compaction cannot be avoided, but split can be turned from automatic into manual.
Just raise this parameter to a value that is hard to reach, say 100G, and automatic splits are effectively disabled (the RegionServer will not split a region before it reaches 100G).
Pair that with the <a title="class in org.apache.hadoop.hbase.util">RegionSplitter</a> tool and split manually whenever a split is needed.
Manual splits are far more flexible and stable than automatic ones, while the management cost barely grows; this is quite recommendable for online real-time systems.

Memory-wise, small regions give you flexibility in sizing the memstore; a large region must be neither too big nor too small: too big and the application's IO wait rises during flushes, too small and too many store files drag read performance down.
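The size arithmetic behind splitting can be made concrete with a trivial calculation. This is a sketch with my own method name, not HBase code; 256M and 64M are the default max.filesize and memstore flush size quoted in this post, and the "one extra flush on top" worst case is the split mechanic the post describes:

```java
public class RegionSplitMath {

    // Largest a region file can grow before the split fires:
    // hbase.hregion.max.filesize plus one more memstore flush on top.
    static long maxBeforeSplitMb(long maxFileSizeMb, long memstoreFlushMb) {
        return maxFileSizeMb + memstoreFlushMb;
    }

    public static void main(String[] args) {
        // with the defaults: 256 + 64 = 320 MB worst case before splitting
        System.out.println(maxBeforeSplitMb(256, 64));
    }
}
```

With a 100G max.filesize the same arithmetic shows why splits effectively never trigger automatically.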

<strong>hbase.regionserver.global.memstore.upperLimit/lowerLimit</strong>

<strong>Default</strong>: 0.4/0.35
upperLimit: hbase.hregion.memstore.flush.size flushes a single memstore once it reaches the given size. But one RegionServer may carry hundreds or thousands of memstores, and the JVM heap can run out before any single memstore reaches flush.size. This parameter caps the total memory all memstores together may occupy.
When all the memstores on a RegionServer reach 40% of the heap combined, HBase blocks all updates and force-flushes them to release the memory they hold.
lowerLimit: same as upperLimit, except that when global memstore memory reaches 35% it does not flush every memstore; it picks some with large memory footprints and flushes those individually, with updates still blocked. lowerLimit is a remedial step taken before a global flush. Imagine every memstore having to flush within some window during which no writes can be accepted: the impact on the HBase cluster's performance would be severe.
Tuning: this is a heap-protection parameter, and the default already fits most scenarios. It is usually adjusted only to match some dedicated optimization, e.g. for a read-heavy application, lower it and enlarge the read cache, freeing memory for other modules.
What does this parameter mean for users in practice?
For example: 10G of heap, 100 regions, 64M memstores (assume one memstore per region). When the 100 memstores average about 50% full, the lowerLimit is reached. Suppose the other memstores are still taking heavy write requests at that moment: before the big regions finish flushing, upperLimit may be exceeded too, all regions get blocked, and a global flush is triggered.

<strong>hfile.block.cache.size</strong>

<strong>Default</strong>: 0.2
Description: the fraction of the heap used by the storefile read cache; 0.2 means 20%. This value directly affects read performance.
Tuning: bigger is of course better; if reads outnumber writes, 0.4 to 0.5 is fine. If reads and writes are balanced, around 0.3. If writes outnumber reads, stick with the default. When setting this value, also check hbase.regionserver.global.memstore.upperLimit, the maximum share of heap the memstores may take: one of the two affects reads, the other writes. If the two add up to more than 80-90%, there is a risk of OOM, so set them with care.
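The 80-90% warning above amounts to a simple budget check; a small sketch (the helper name and the 0.8 threshold are mine, taken from the rule of thumb in the text):

```java
public class HeapBudget {

    // True when the storefile read cache plus the global memstore cap
    // together stay within a safe share of the heap.
    static boolean isSafe(double blockCacheSize, double memstoreUpperLimit) {
        return blockCacheSize + memstoreUpperLimit <= 0.8;
    }

    public static void main(String[] args) {
        // the defaults (0.2 read cache, 0.4 memstore cap) leave plenty of room
        System.out.println(isSafe(0.2, 0.4));
        // a 0.5 read cache next to the default 0.4 cap crosses the line
        System.out.println(isSafe(0.5, 0.4));
    }
}
```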

<strong>hbase.hstore.blockingStoreFiles</strong>

<strong>Default</strong>: 7
Description: during compaction, if a store (column family) has more than 7 storefiles waiting to be merged, all write requests are blocked and a flush is forced, to keep the storefile count from growing too fast.
Tuning: blocked requests hurt the affected region's read/write performance. Setting this to the largest number of store files a single region can sustain is a good choice; that maximum can be computed as region size / memstore size. If you have made the region size unbounded, estimate the largest number of storefiles one region might produce.
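The region size / memstore size computation suggested above, written out (method name is mine):

```java
public class StoreFileEstimate {

    // Maximum store files one region can accumulate: each memstore flush
    // adds one file, so divide the region size by the flush size.
    static long maxStoreFiles(long regionSizeMb, long memstoreFlushMb) {
        return regionSizeMb / memstoreFlushMb;
    }

    public static void main(String[] args) {
        // e.g. a 1 GB region with the default 64 MB flush size
        System.out.println(maxStoreFiles(1024, 64)); // 16
    }
}
```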

<strong>hbase.hregion.memstore.block.multiplier</strong>

<strong>Default</strong>: 2
Description: when a region's memstore exceeds twice the single-memstore size, block all requests to that region and flush to release memory. We may have set the memstore total to 64M, but imagine that at 63.9M a 100M put arrives, or that write requests surge to tens of thousands of puts in the final second: the memstore instantly balloons far beyond the expected memstore.size. This parameter's job is to block all requests once the memstore grows to a multiple of memstore.size, containing the risk before it spreads.
Tuning: the default for this parameter is fairly dependable. If you expect your normal workload (anomalies aside) to have no write bursts, or bursts whose volume is controllable, keep the default. If write volume regularly spikes under normal conditions, raise this multiplier and adjust the other memory settings (such as hfile.block.cache.size and hbase.regionserver.global.memstore.upperLimit/lowerLimit) to reserve more memory and keep the HBase server from OOMing.
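All of the options above live in hbase-site.xml. A sketch of what a tuned file might look like; the values are the illustrative ones from this post, not universal recommendations:

```xml
<configuration>
  <!-- fail over faster, if your ops situation allows it -->
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>100</value>
  </property>
  <!-- 100 GB: effectively turns automatic splits off (split manually instead) -->
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>107374182400</value>
  </property>
  <!-- balanced read/write workload -->
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.3</value>
  </property>
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>15</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>2</value>
  </property>
</configuration>
```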

<h3>Other</h3>

<strong>Enable LZO compression</strong>
Compared with HBase's default GZip, LZO performs better while GZip compresses better; see Using LZO Compression. For developers focused on HBase read/write performance, LZO is the better choice; for those who care most about storage space, keeping the default is advisable.

<strong>Don't define too many column families in one table</strong>

HBase currently cannot handle tables with more than two or three CFs well. When one CF flushes, its neighboring CFs are flushed too by the association effect, ultimately generating a lot of system IO.

<strong>Batch import</strong>

Before bulk-importing data into HBase, you can pre-create regions to balance the data load. See Table Creation: Pre-Creating Regions.

<h3>HBase client-side optimization</h3>

<strong>AutoFlush</strong>

Set HTable's setAutoFlush to false to let the client batch updates: puts are sent to the server only when the client-side flush buffer fills up.
The default is true.

<strong>Scan Caching</strong>

How many rows the scanner caches per round trip, i.e. how much data is fetched back from the server at a time during a scan.
The default is 1: one row per fetch.

<strong>Scan Attribute Selection</strong>

When scanning, specify the column families you need to cut down traffic; otherwise a scan returns all of each row's data (every column family) by default.

<strong>Close ResultScanners</strong>

After fetching the data from a scan, remember to close the ResultScanner, or the RegionServer may run into problems.

<strong>Optimal Loading of Row Keys</strong>

When you scan a table and the results only need the row key (no CF, qualifiers, values or timestamps), add a FilterList with the MUST_PASS_ALL operation to the scan instance, and add a FirstKeyOnlyFilter or KeyOnlyFilter to the FilterList. This reduces network traffic.

<strong>Turn off WAL on Puts</strong>

When putting data that is not critical, you can set writeToWAL(false) to push write performance further. writeToWAL(false) skips writing the WAL log on put. The risk: if the RegionServer crashes, the data you just put may be lost and unrecoverable.

<strong>Enable Bloom Filter</strong>

Bloom filters trade space for time and improve read performance.

Reprinted; please credit the original: http://kenwublog.com/hbase-performance-tuning



ivaneeo 2011-06-15 13:39 Post Comment
]]></description></item><item><title>HBase Compound Indexes</title><link>http://www.aygfsteel.com/ivanwan/archive/2011/06/11/352094.html</link><dc:creator>ivaneeo</dc:creator><author>ivaneeo</author><pubDate>Sat, 11 Jun 2011 08:21:00 GMT</pubDate><guid>http://www.aygfsteel.com/ivanwan/archive/2011/06/11/352094.html</guid><wfw:comment>http://www.aygfsteel.com/ivanwan/comments/352094.html</wfw:comment><comments>http://www.aygfsteel.com/ivanwan/archive/2011/06/11/352094.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.aygfsteel.com/ivanwan/comments/commentRss/352094.html</wfw:commentRss><trackback:ping>http://www.aygfsteel.com/ivanwan/services/trackbacks/352094.html</trackback:ping><description><![CDATA[

We recently set up HBase and HBase-trx (from https://github.com/hbase-trx) to use multiple-column indexes with this code.  After you compile it, just copy the jar and the hbase-trx jar into your hbase's lib folder and you should be good to go!

When you create a composite index, you can see the metadata for the index by looking at the table description.  One of the properties will read “INDEXES =>” followed by index names and ‘family:qualifier’ style column names in the index.

KeyGeneratorFactory:

package com.ir.store.hbase.indexes;

import java.util.List;

import org.apache.hadoop.hbase.client.tableindexed.IndexKeyGenerator;

public class KeyGeneratorFactory {

    public static IndexKeyGenerator getInstance(List<byte[]> columns) {
        return new HBaseIndexKeyGenerator(columns);
    }
}

HBaseIndexKeyGenerator:

package com.ir.store.hbase.indexes;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.hbase.client.tableindexed.IndexKeyGenerator;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseIndexKeyGenerator implements IndexKeyGenerator {
    public static final byte[] KEYSEPERATOR = "~;?".getBytes();

    private int columnCount;
    private List<byte[]> columnNames = new ArrayList<byte[]>();

    public HBaseIndexKeyGenerator(List<byte[]> memberColumns) {
        // For new key generators
        columnNames = memberColumns;
        columnCount = memberColumns.size();
    }

    public HBaseIndexKeyGenerator() {
        // Hollow constructor for deserializing -- should call readFields shortly
        columnCount = 0;
    }

    public void readFields(DataInput binaryInput) throws IOException {
        columnCount = binaryInput.readInt();
        for (int currentColumn = 0; currentColumn < columnCount; currentColumn++)
            columnNames.add(Bytes.readByteArray(binaryInput));
    }

    public void write(DataOutput binaryOutput) throws IOException {
        binaryOutput.writeInt(columnCount);
        for (byte[] columnName : columnNames)
            Bytes.writeByteArray(binaryOutput, columnName);
    }

    public byte[] createIndexKey(byte[] baseRowIdentifier, Map<byte[], byte[]> baseRowData) {
        byte[] indexRowIdentifier = null;
        for (byte[] columnName : columnNames) {
            if (indexRowIdentifier == null)
                indexRowIdentifier = baseRowData.get(columnName);
            else indexRowIdentifier = Bytes.add(indexRowIdentifier,
                    HBaseIndexKeyGenerator.KEYSEPERATOR, baseRowData.get(columnName));
        }
        if (baseRowIdentifier != null)
            return Bytes.add(indexRowIdentifier, HBaseIndexKeyGenerator.KEYSEPERATOR, baseRowIdentifier);
        return indexRowIdentifier;
    }
}


ivaneeo 2011-06-11 16:21 Post Comment
]]></description></item><item><title>In-depth analysis of HBase performance</title><link>http://www.aygfsteel.com/ivanwan/archive/2011/06/10/352071.html</link><dc:creator>ivaneeo</dc:creator><author>ivaneeo</author><pubDate>Fri, 10 Jun 2011 15:33:00 GMT</pubDate><guid>http://www.aygfsteel.com/ivanwan/archive/2011/06/10/352071.html</guid><wfw:comment>http://www.aygfsteel.com/ivanwan/comments/352071.html</wfw:comment><comments>http://www.aygfsteel.com/ivanwan/archive/2011/06/10/352071.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.aygfsteel.com/ivanwan/comments/commentRss/352071.html</wfw:commentRss><trackback:ping>http://www.aygfsteel.com/ivanwan/services/trackbacks/352071.html</trackback:ping><description><![CDATA[

For Bigtable-style distributed database applications, users usually take a strong interest in performance, and above all in real-time data-insert performance. How does HBase, as a Bigtable implementation, fare in this respect? Only test data can answer that.

The insert-performance test scenario is as follows: row keys are random values 1000 bytes long, and values are fixed at 1000 bytes. Because a single-row insert is too fast for the system's timing precision, elapsed time is recorded once per batch of 500 rows.

A note on HBase's characteristics is needed here. First, why random row keys? Because HBase sorts by row key, and random row keys get distributed across different regions, which is what lets a distributed database show its performance advantage. As for the value, HBase never parses it in any way, so whether it varies should have no effect on performance at all. For simplicity, all data is inserted into the same column of the same table.

Before testing, the cluster is tuned: services that can consume lots of memory, bandwidth or CPU, such as Apache's HTTP service, are shut down to keep the cluster quiet. In addition, to keep the test undisturbed, the HBase cluster is isolated so that it does not overlap with the Hadoop cluster hosting HDFS.

With everything ready, data loading begins: the client looks up the RegionServer addresses from ZooKeeper and then feeds rows to HBase's RegionServers in a steady stream.

Here I wrote a program that renders live charts with JFreeChart: every 3 minutes, the feeding client plots the elapsed-time statistics it has collected onto a Cartesian chart, and the images are saved to a designated web site and served over HTTP. After a long, uninterrupted run I obtained the following kind of graph:

The graph has a very distinctive shape: along what is almost a straight line, a wave crest rises at intervals, and between every two tall crests there is always one shorter one. The gaps between tall crests show a steadily widening trend, and the shorter crest sits exactly halfway between two tall ones.

To explain this phenomenon, I monitored, in real time, the files under HBase's root directory on HDFS as well as the region state of the table being loaded, hoping to discover what was happening at those waves.

Rewind to the start of loading. Creating the table creates a directory of the same name on HDFS; under it the first region appears; the region creates a directory per family name, and only under that directory do the files recording the actual data live. Also created under the table-name directory is a "compaction.dir" directory, used to merge region files when the number of files under a family directory exceeds the configured count.

When the first region directory appears, the earliest data written into memory is saved into a file there. The interval is governed by the option "hbase.hregion.memstore.flush.size", default 64MB: once the RegionServer hosting that region holds more than 64MB of data in memory, it is written into the region file. That file keeps growing until it exceeds the size set by "hbase.hregion.max.filesize", default 256MB (with the in-memory flush added, the real maximum can reach 256+64MB), at which point the region is split in two immediately. The process creates a directory named ".splits" under the region directory as a marker, then the RegionServer reads the file's contents in and writes them into two new region directories, and finally deletes the old region. The ".splits" marker keeps other operations out while the split is in progress, acting much like a thread-safety lock. In each new region, the file carved out of the old region stands alone and accepts no new data; it exceeds 64MB (reaching at most (256+64)/2 = 160MB). New in-memory data is saved to a freshly created file whose size will be 64MB. Every memory flush adds one 64MB file to the region's directory, until the total file count exceeds the number set by "hbase.hstore.compactionThreshold" (default 3) and the compaction process is triggered. With the value at 3, there are actually only two files in the region directory at that moment, with one more in memory about to be flushed to disk. Compaction is a heavyweight move for HBase: not only must it transfer the files into the "compaction.dir" directory for compaction, but if the compacted file exceeds 256MB it must immediately split again. This flurry of activity is quite a spectacle on HDFS and has a considerable impact. After compaction finishes, splits continue for a while longer, until all regions have been divided and allocated, and only then does HBase quiet down to await the next flush of data from memory to HDFS.

With the above process understood, the reason HBase's insert-performance curve looks as shown becomes obvious. The stretch nearly parallel to the X axis means data is being written into the memory of the machines hosting HBase's RegionServers. A lower crest means the RegionServers are flushing memory to HDFS; a taller crest means they are not only flushing memory to HDFS but also performing compaction and split. If the value of "hbase.hstore.compactionThreshold" is raised to something large, say 5, one can predict that a lower crest will still appear at equal intervals between every two tall crests, and that the tall crests will tower far above the tall crests seen at a threshold of 3 (since the compaction work is much bigger). Because the number of regions grows from few to many, and the inserted rows' keys are random, every region's data grows evenly; data inserted within the same period is distributed over more and more regions, so the intervals between crests also grow longer and longer.

Understanding all this again, we can infer that HBase's data-insert performance should really be divided into three states: the straight-line state, the low-crest state and the high-crest state. Only performance numbers obtained in all three states truly describe HBase's insert performance. So which one should be quoted to users? I believe the straight-line state, since it occupies the most time, especially given that users may not actually be writing data all that fast.

ivaneeo 2011-06-10 23:33 Post Comment
]]>
HBase的性能优化和相x?/title><link>http://www.aygfsteel.com/ivanwan/archive/2011/06/10/352069.html</link><dc:creator>ivaneeo</dc:creator><author>ivaneeo</author><pubDate>Fri, 10 Jun 2011 15:14:00 GMT</pubDate><guid>http://www.aygfsteel.com/ivanwan/archive/2011/06/10/352069.html</guid><wfw:comment>http://www.aygfsteel.com/ivanwan/comments/352069.html</wfw:comment><comments>http://www.aygfsteel.com/ivanwan/archive/2011/06/10/352069.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.aygfsteel.com/ivanwan/comments/commentRss/352069.html</wfw:commentRss><trackback:ping>http://www.aygfsteel.com/ivanwan/services/trackbacks/352069.html</trackback:ping><description><![CDATA[<div><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">HBase的写效率q是很高的,但其随机d效率q不?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">可以采取一些优化措施来提高其性能Q如Q?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">1. 启用lzo压羃Q见<a target="_blank" style="color: #006699; text-decoration: none; ">q里</a></p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">2. 
增大hbase.regionserver.handler.countCؓ100</p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">3. 增大hfile.block.cache.size?.4Q提高cache大小</p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">4. 增大hbase.hstore.blockingStoreFiles?5</p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">5. 启用BloomFilterQ在HBase0,89中可以设|?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">6.Put时可以设|setAutoFlush为falseQ到一定数目后再flushCommits</p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; "> </p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">?4个Region Server的集上Q新建立一个lzo压羃?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: 
#333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">试的Put和Get的性能如下Q?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">1. Put数据Q?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">单线E灌?.4亿数据,p?0分钟Q每U能辑ֈ4万个Q这个性能实很好了,不过插入的value比较,只有不到几十个字?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">多线EputQ没有测试,因ؓ单线E的效率已经相当高了</p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">2. 
Get数据Q?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">在没有Q何Block CacheQ而且是Random Read的情况:</p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 80px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">单线E^均每U只能到250个左?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 80px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">6个线E^均每U能辑ֈ1100个左?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 80px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">16个线E^均每U能辑ֈ2500个左?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 40px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">有BlockCacheQ曾lgetq对应的rowQ而且q在cache中)的情况:</p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 80px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">单线E^均每U能?600个左?/p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 80px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 
19px; ">6个线E^均每U能辑ֈ1.2万个左右</p><p style="padding-top: 0px; padding-right: 0px; padding-bottom: 15px; padding-left: 0px; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 80px; color: #333333; font-family: 'Trebuchet MS', Tahoma, Arial; font-size: 13px; line-height: 19px; ">16个线E^均每U能辑ֈ2.5万个左右</p></div><img src ="http://www.aygfsteel.com/ivanwan/aggbug/352069.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.aygfsteel.com/ivanwan/" target="_blank">ivaneeo</a> 2011-06-10 23:14 <a href="http://www.aygfsteel.com/ivanwan/archive/2011/06/10/352069.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>HADOOP报错Incompatible namespaceIDshttp://www.aygfsteel.com/ivanwan/archive/2011/06/09/351981.htmlivaneeoivaneeoThu, 09 Jun 2011 06:20:00 GMThttp://www.aygfsteel.com/ivanwan/archive/2011/06/09/351981.htmlhttp://www.aygfsteel.com/ivanwan/comments/351981.htmlhttp://www.aygfsteel.com/ivanwan/archive/2011/06/09/351981.html#Feedback0http://www.aygfsteel.com/ivanwan/comments/commentRss/351981.htmlhttp://www.aygfsteel.com/ivanwan/services/trackbacks/351981.html

This morning I suddenly found that -put uploads to HDFS were failing with a pile of errors, so I used bin/hadoop dfsadmin -report to check the system state:

admin@adw1:/home/admin/joe.wangh/hadoop-0.19.2>bin/hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: ?%

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

Shut Hadoop down with bin/stop-all.sh:

admin@adw1:/home/admin/joe.wangh/hadoop-0.19.2>bin/stop-all.sh
stopping jobtracker
172.16.197.192: stopping tasktracker
172.16.197.193: stopping tasktracker
stopping namenode
172.16.197.193: no datanode to stop
172.16.197.192: no datanode to stop

172.16.197.191: stopping secondarynamenode

See that? The datanodes never started at all. So go to a DATANODE and check its log:

admin@adw2:/home/admin/joe.wangh/hadoop-0.19.2/logs>vi hadoop-admin-datanode-adw2.hst.ali.dw.alidc.net.log

************************************************************/
2010-07-21 10:12:11,987 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/admin/joe.wangh/hadoop/data/dfs.data.dir: namenode namespaceID = 898136669; datanode namespaceID = 2127444065
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:288)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:206)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1239)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1194)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1202)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1324)
......

The error says the namespaceIDs are inconsistent.

Two workarounds are given below; I used the second one.

Workaround 1: Start from scratch

I can testify that the following steps solve this error, but the side effects won't make you happy (me neither). The crude workaround I have found is to:

1.     stop the cluster

2.     delete the data directory on the problematic datanode: the directory is specified by dfs.data.dir in conf/hdfs-site.xml; if you followed this tutorial, the relevant directory is /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data

3.     reformat the namenode (NOTE: all HDFS data is lost during this process!)

4.     restart the cluster

When deleting all the HDFS data and starting from scratch does not sound like a good idea (it might be ok during the initial setup/testing), you might give the second approach a try.

Workaround 2: Updating namespaceID of problematic datanodes

Big thanks to Jared Stehler for the following suggestion. I have not tested it myself yet, but feel free to try it out and send me your feedback. This workaround is "minimally invasive" as you only have to edit one file on the problematic datanodes:

1.     stop the datanode

2.     edit the value of namespaceID in <dfs.data.dir>/current/VERSION to match the value of the current namenode

3.     restart the datanode

If you followed the instructions in my tutorials, the full path of the relevant file is /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data/current/VERSION (background: dfs.data.dir is by default set to ${hadoop.tmp.dir}/dfs/data, and we set hadoop.tmp.dir to /usr/local/hadoop-datastore/hadoop-hadoop).

If you wonder how the contents of VERSION look like, here's one of mine:

#contents of <dfs.data.dir>/current/VERSION

namespaceID=393514426

storageID=DS-1706792599-10.10.10.1-50010-1204306713481

cTime=1215607609074

storageType=DATA_NODE

layoutVersion=-13

 

Cause: every namenode format creates a new namespaceID, but tmp/dfs/data still holds the ID from the previous format. Formatting empties the namenode's data without clearing the datanodes' data, which causes the startup failure. What you need to do is clear all the directories under tmp once before every format.
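Workaround 2 boils down to a one-line edit of the datanode's VERSION file. Here is a sketch, exercised against a mock VERSION file in a temp directory; on a real cluster the file is <dfs.data.dir>/current/VERSION, and the two namespaceID values below are the ones from the exception message above:

```shell
# stand-in for <dfs.data.dir>/current/VERSION on the problematic datanode
dir=$(mktemp -d)
cat > "$dir/VERSION" <<'EOF'
namespaceID=2127444065
storageID=DS-1706792599-10.10.10.1-50010-1204306713481
cTime=1215607609074
storageType=DATA_NODE
layoutVersion=-13
EOF

# 1. stop the datanode (bin/hadoop-daemon.sh stop datanode) -- skipped in this mock
# 2. rewrite namespaceID to the namenode's value from the exception message
sed -i 's/^namespaceID=.*/namespaceID=898136669/' "$dir/VERSION"
# 3. restart the datanode (bin/hadoop-daemon.sh start datanode) -- skipped in this mock
grep '^namespaceID=' "$dir/VERSION"
```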



ivaneeo 2011-06-09 14:20 Post Comment
]]>