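Configuring Hadoop M/R to use the Fair Scheduler instead of FIFO
This setup uses the Cloudera distribution of hadoop/hbase:
hadoop-0.20.2-cdh3u0
hbase-0.90.1-cdh3u0
zookeeper-3.3.3-cdh3u0
The FairScheduler scheduling algorithm is already supported out of the box.
You only need to change the configuration so that FairScheduler is used instead of the default JobQueueTaskScheduler.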
Configure fair-scheduler.xml (in $HADOOP_HOME/conf/):
<?xml version="1.0"?>
<!-- The location of this file is set by mapred.fairscheduler.allocation.file in mapred-site.xml. -->
<allocations>
  <pool name="qiji-task-pool">
    <minMaps>5</minMaps>
    <minReduces>5</minReduces>
    <maxRunningJobs>5</maxRunningJobs>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
    <weight>1.0</weight>
  </pool>
  <user name="ecap">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <poolMaxJobsDefault>10</poolMaxJobsDefault>
  <userMaxJobsDefault>8</userMaxJobsDefault>
  <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
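A single stray or unclosed tag is enough to make the allocation file unparseable as XML, so it can be worth checking the file before restarting the cluster. The following is only a minimal sketch in Python, not part of the original setup: the function name summarize_allocations is made up for illustration, and it checks nothing beyond well-formedness and the root element. It does not verify that the scheduler accepts every setting.

```python
import xml.etree.ElementTree as ET


def summarize_allocations(xml_data):
    """Parse allocation-file content (str or bytes) and summarize it.

    Returns a (pools, users) pair of dicts mapping each pool/user name to
    its child settings. Raises ET.ParseError if the content is not
    well-formed XML, and ValueError if the root is not <allocations>.
    """
    root = ET.fromstring(xml_data)
    if root.tag != "allocations":
        raise ValueError("root element must be <allocations>, got <%s>" % root.tag)

    def settings(element):
        # Collect each child tag and its trimmed text, e.g. {"minMaps": "5"}.
        return {child.tag: (child.text or "").strip() for child in element}

    pools = {pool.get("name"): settings(pool) for pool in root.findall("pool")}
    users = {user.get("name"): settings(user) for user in root.findall("user")}
    return pools, users
```

Passing it the bytes read from $HADOOP_HOME/conf/fair-scheduler.xml should report the qiji-task-pool pool and the ecap user with their settings. Top-level defaults such as poolMaxJobsDefault are deliberately not included in the summary.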
Edit $HADOOP_HOME/conf/mapred-site.xml and add the following at the end (inside the <configuration> element):
<property>
<name>mapred.jobtracker.taskScheduler</name>
<value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
<name>mapred.fairscheduler.allocation.file</name>
<value>/opt/hadoop/conf/fair-scheduler.xml</value>
</property>
<property>
<name>mapred.fairscheduler.assignmultiple</name>
<value>true</value>
</property>
<property>
<name>mapred.fairscheduler.sizebasedweight</name>
<value>true</value>
</property>
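Before restarting, the two settings this article depends on can be checked with a short script. This is a hedged sketch in Python, not something the Hadoop distribution provides: the function names read_properties and fair_scheduler_problems are hypothetical, and the check assumes the usual layout of a Hadoop site file, with <property> elements under a <configuration> root.

```python
import xml.etree.ElementTree as ET

FAIR_SCHEDULER_CLASS = "org.apache.hadoop.mapred.FairScheduler"


def read_properties(xml_data):
    """Return {name: value} for every <property> in a Hadoop site file."""
    root = ET.fromstring(xml_data)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def fair_scheduler_problems(props):
    """List what is missing for the setup described in this article."""
    problems = []
    if props.get("mapred.jobtracker.taskScheduler") != FAIR_SCHEDULER_CLASS:
        problems.append(
            "mapred.jobtracker.taskScheduler is not set to " + FAIR_SCHEDULER_CLASS
        )
    if not props.get("mapred.fairscheduler.allocation.file"):
        problems.append("mapred.fairscheduler.allocation.file is not set")
    return problems
```

An empty list from fair_scheduler_problems(read_properties(data)) means both properties are present in the file. It says nothing about whether the JobTracker has actually picked them up, which still requires the restart.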
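Then restart the cluster. With this in place, when several jobs run at the same time (the configuration above allows 5 in parallel), a single job can no longer occupy all the Map/Reduce slots and leave the other jobs stuck in the Pending state.
The status of the concurrently running jobs can be viewed at: http://<masterip>:50030/scheduler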
Posted on 2013-01-31 17:30 by paulwong. Categories: HADOOP, Cloud Computing