While running a Hadoop job, I killed it partway through; afterwards, adding or deleting files on HDFS produced a "Name node is in safe mode" error:
rmr: org.apache.hadoop.dfs.SafeModeException: Cannot delete /user/hadoop/input. Name node is in safe mode
The command that resolves it:
bin/hadoop dfsadmin -safemode leave    # turn off safe mode
Source: http://shutiao2008.iteye.com/blog/318950
Appendix: notes on safe mode
The NameNode enters safe mode when it starts up. If the proportion of blocks missing from DataNodes reaches a certain ratio (1 - dfs.safemode.threshold.pct), the system stays in safe mode, i.e. read-only.
dfs.safemode.threshold.pct (default 0.999f) means that at HDFS startup, the NameNode may leave safe mode only once the number of blocks reported by DataNodes reaches 0.999 of the block count recorded in the metadata; until then it stays in this read-only mode. If it is set to 1, HDFS remains in SafeMode forever.
The following line is taken from the NameNode's startup log (the reported-block ratio of 1 reached the threshold of 0.9990):
The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe mode will be turned off automatically in 18 seconds.
There are two ways to leave safe mode:
1. Change dfs.safemode.threshold.pct to a smaller value (the default is 0.999).
2. Force the NameNode out with the hadoop dfsadmin -safemode leave command.
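For method 1, the threshold is set in hdfs-site.xml. A sketch (0.95 is just an illustrative value):

<property>
  <name>dfs.safemode.threshold.pct</name>
  <value>0.95</value>
</property>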
http://bbs.hadoopor.com/viewthread.php?tid=61&extra=page%3D1
----------------
Safe mode is exited when the minimal replication condition is reached, plus an extension
time of 30 seconds. The minimal replication condition is when 99.9% of the blocks in
the whole filesystem meet their minimum replication level (which defaults to one, and
is set by dfs.replication.min).
Precondition for leaving safe mode: 99.9% of the blocks in the whole filesystem (the default of 99.9% can be changed via dfs.safemode.threshold.pct) have reached their minimum replication level (default 1, set via dfs.replication.min).
dfs.safemode.threshold.pct    float    0.999
The proportion of blocks in the system that must meet the minimum replication level defined by dfs.replication.min before the namenode will exit safe mode. Setting this value to 0 or less forces the name-node not to start in safe mode. Setting this value to more than 1 means the namenode never exits safe mode.
----------------
Safe mode can be controlled with dfsadmin -safemode value, where the value argument means:
enter - enter safe mode
leave - force the NameNode to leave safe mode
get - report whether safe mode is on
wait - wait until safe mode has ended
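Safe mode can also be toggled programmatically through the HDFS client API. A minimal Java sketch against Hadoop 1.x (SafeModeCheck is a hypothetical class name; it assumes fs.default.name in the configuration on the classpath points at the NameNode):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction;

public class SafeModeCheck {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
        // SAFEMODE_GET only queries the state; SAFEMODE_LEAVE is the
        // programmatic counterpart of "hadoop dfsadmin -safemode leave".
        if (dfs.setSafeMode(SafeModeAction.SAFEMODE_GET)) {
            System.out.println("Name node is in safe mode, leaving...");
            dfs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
        }
    }
}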
Next, the process for building the Hadoop Eclipse plugin:
1. Download hadoop 1.0.3 (http://hadoop.apache.org/releases.html#Download) and unpack it into a directory of your choice (preferably an all-English path; I ran into problems with a path containing Chinese characters).
2. In Eclipse, import the ..\hadoop-1.0.3\src\contrib\eclipse-plugin project; by default the project is named MapReduceTools.
3. Create a lib directory in the MapReduceTools project and copy in hadoop-core (obtained by renaming hadoop-*.jar from the hadoop root directory), commons-cli-1.2.jar, commons-lang-2.4.jar, commons-configuration-1.6.jar, jackson-mapper-asl-1.8.8.jar, jackson-core-asl-1.8.8.jar, and commons-httpclient-3.0.1.jar.
4. Modify build-contrib.xml in the parent directory:
Find <property name="hadoop.root" location="${root}/../../../"/> and change location to the directory where hadoop 1.0.3 was actually unpacked, then add below it:
<property name="eclipse.home" location="D:/Program Files/eclipse"/>
<property name="version" value="1.0.3"/>
5. Modify build.xml in the project directory:
<target name="jar" depends="compile" unless="skip.contrib">
<mkdir dir="${build.dir}/lib"/>
<copy file="${hadoop.root}/hadoop-core-${version}.jar" tofile="${build.dir}/lib/hadoop-core.jar" verbose="true"/>
<copy file="${hadoop.root}/lib/commons-cli-1.2.jar" todir="${build.dir}/lib" verbose="true"/>
<copy file="${hadoop.root}/lib/commons-lang-2.4.jar" todir="${build.dir}/lib" verbose="true"/>
<copy file="${hadoop.root}/lib/commons-configuration-1.6.jar" todir="${build.dir}/lib" verbose="true"/>
<copy file="${hadoop.root}/lib/jackson-mapper-asl-1.8.8.jar" todir="${build.dir}/lib" verbose="true"/>
<copy file="${hadoop.root}/lib/jackson-core-asl-1.8.8.jar" todir="${build.dir}/lib" verbose="true"/>
<copy file="${hadoop.root}/lib/commons-httpclient-3.0.1.jar" todir="${build.dir}/lib" verbose="true"/>
<jar
jarfile="${build.dir}/hadoop-${name}-${version}.jar"
manifest="${root}/META-INF/MANIFEST.MF">
<fileset dir="${build.dir}" includes="classes/ lib/"/>
<fileset dir="${root}" includes="resources/ plugin.xml"/>
</jar>
</target>
6. Right-click build.xml in Eclipse and choose Run As -> Ant Build.
If the error "package org.apache.hadoop.fs does not exist" appears, modify build.xml:
<path id="hadoop-jars">
<fileset dir="${hadoop.root}/">
<include name="hadoop-*.jar"/>
</fileset>
</path>
Then add <path refid="hadoop-jars"/> inside <path id="classpath">.
7. Wait for the Ant build to finish. The built plugin is hadoop-eclipse-plugin-1.0.3.jar under \build\contrib.
8. Check whether the configuration attributes in the built jar's META-INF/MANIFEST.MF are complete; if not, fill them in.
9. Drop the jar into eclipse/plugins, restart Eclipse, and check that the plugin installed successfully.
The plugin
As an aside, Hadoop 1.0.2/src/contrib/eclipse-plugin contains only the plugin's source code, so here is a pre-built Eclipse plugin for that version:
Download link
After downloading, just drop it into eclipse/dropins (eclipse/plugins works too, but the former is more convenient and is recommended). Restart Eclipse and Map/Reduce appears among the perspectives.
Configuration
Click the little blue elephant icon and create a new Hadoop connection:
Note: everything must be filled in correctly, including any ports you have changed, the default user name to run as, and so on.
The specific settings are shown below.
Under normal circumstances, you can then see the filesystem in the project area.
This lets you manage the HDFS distributed filesystem normally: uploads, deletions, and so on.
To prepare for the test below, first create a directory /user/root/input2, then upload two txt files into it:
input1.txt contains: Hello Hadoop Goodbye Hadoop
input2.txt contains: Hello World Bye World
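If you prefer the command line to the plugin UI, the same preparation can be done with the FsShell (assuming the paths above):

bin/hadoop fs -mkdir /user/root/input2
bin/hadoop fs -put input1.txt /user/root/input2
bin/hadoop fs -put input2.txt /user/root/input2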
HDFS preparation is done; now the testing can begin.
The Hadoop project
Create a new Map/Reduce Project and point it at your local hadoop directory.
Create a new test class, WordCountTest:
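A sketch of WordCountTest, assuming the standard WordCount structure (a TokenizerMapper, an IntSumReducer used as both combiner and reducer, and a main method) that the rest of this post builds on:

package com.hadoop.learn.test;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCountTest {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit (word, 1) for every token in the input line.
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum up all counts for the word.
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCountTest.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}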
Right-click and choose "Run Configurations"; in the dialog that pops up, click the "Arguments" tab and enter the parameters under "Program arguments" in advance:
hdfs://master:9000/user/root/input2 hdfs://master:9000/user/root/output2
Note: these arguments are for local debugging, not a production environment.
Then click "Apply", then "Close". Now you can right-click and choose "Run on Hadoop" to run it.
At this point, however, an exception like this appears:
12/04/24 15:32:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/04/24 15:32:44 ERROR security.UserGroupInformation: PriviledgedActionException as:Administrator cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator-519341271\.staging to 0700
Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator-519341271\.staging to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:682)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:655)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
at com.hadoop.learn.test.WordCountTest.main(WordCountTest.java:85)
This is a file-permission problem on Windows; on Linux the program runs fine and the problem does not exist.
The workaround is to modify checkReturnValue in /hadoop-1.0.2/src/core/org/apache/hadoop/fs/FileUtil.java and simply comment it out (a bit crude, but on Windows the check can be skipped):
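A sketch of the patched method, assuming the stock Hadoop 1.0.x signature and that its body, which threw the "Failed to set permissions of path" IOException seen above, is simply commented out:

private static void checkReturnValue(boolean rv, File p, FsPermission permission)
        throws IOException {
    // Commented out so the permission check is skipped on Windows.
    // if (!rv) {
    //     throw new IOException("Failed to set permissions of path: " + p
    //             + " to " + String.format("%04o", permission.toShort()));
    // }
}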
Rebuild and repackage hadoop-core-1.0.2.jar, then replace the hadoop-core-1.0.2.jar in the hadoop-1.0.2 root directory with it.
A modified hadoop-core-1.0.2-modified.jar is provided here; just swap it in for the original hadoop-core-1.0.2.jar.
After replacing it, refresh the project, set up the correct jar dependencies, and run WordCountTest again.
Once it succeeds, refresh the HDFS directory in Eclipse and you can see the generated output2 directory:
Click the part-r-00000 file to see the sorted result:
Bye 1
Goodbye 1
Hadoop 2
Hello 2
World 2
You can debug the program the same way: set a breakpoint, then right-click -> Debug As -> Java Application. (Before each run, the output directory has to be deleted manually.)
Also, the plugin automatically generates a jar file, plus other files including some concrete Hadoop configuration, under workspace\.metadata\.plugins\org.apache.hadoop.eclipse in the corresponding Eclipse workspace.
Well, there are more details; explore them at your own pace.
Exceptions encountered
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/root/output2/_temporary. Name node is in safe mode.
The ratio of reported blocks 0.5000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2055)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2029)
at org.apache.hadoop.hdfs.server.namenode.NameNode.mkdirs(NameNode.java:817)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
On the master node, turn off safe mode:
# bin/hadoop dfsadmin -safemode leave
How to package
Packaging the Map/Reduce project you created into a jar is straightforward and needs little explanation. Just make sure the jar's META-INF/MANIFEST.MF contains a Main-Class mapping:
Main-Class: com.hadoop.learn.test.TestDriver
If third-party jars are used, just add a Class-Path entry to MANIFEST.MF, as in the fragment below.
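For example, a fragment with hypothetical lib/ entries (substitute the jars your project actually ships):

Main-Class: com.hadoop.learn.test.TestDriver
Class-Path: lib/commons-cli-1.2.jar lib/commons-lang-2.4.jar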
The plugin also provides a MapReduce Driver wizard that helps us run jobs on Hadoop by a directly specified alias; this is especially useful when the jar contains several Map/Reduce jobs.
A MapReduce Driver only needs a main function that registers the aliases:
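A sketch of such a driver, assuming Hadoop's org.apache.hadoop.util.ProgramDriver and the testcount alias used in the run command below:

package com.hadoop.learn.test;

import org.apache.hadoop.util.ProgramDriver;

public class TestDriver {
    public static void main(String[] args) {
        int exitCode = -1;
        ProgramDriver pgd = new ProgramDriver();
        try {
            // Register each job class under an alias; the first
            // command-line argument selects which one to run.
            pgd.addClass("testcount", WordCountTest.class,
                    "A map/reduce program that counts the words in the input files.");
            pgd.driver(args);
            exitCode = 0;
        } catch (Throwable e) {
            e.printStackTrace();
        }
        System.exit(exitCode);
    }
}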
A small trick here: right-click inside the MapReduce Driver class and choose Run on Hadoop, and a jar is generated automatically under Eclipse's workspace\.metadata\.plugins\org.apache.hadoop.eclipse directory. Upload it to HDFS, or to the remote hadoop root directory, and run it:
# bin/hadoop jar LearnHadoop_TestDriver.java-460881982912511899.jar testcount input2 output3
OK, that is the end of this article.
From the analysis in the previous post, we know that whether a Hadoop job is submitted to the Cluster or run Locally is closely tied to the configuration parameters in the conf folder; what's more, many other classes depend on conf as well, so when submitting a job, remember to put conf on your classpath.
Because Configuration loads resources and files through the current thread's context class loader, we use dynamic loading here: first register the required dependency libraries and resources, then build a URLClassLoader and install it as the current thread's context class loader.
// classPath is a static List<URL> on EJob, populated by addClasspath()
// (sketched after the usage example below).
public static ClassLoader getClassLoader() {
    // Prefer the current context class loader as the parent, falling back
    // to this class's own loader and finally the system class loader.
    ClassLoader parent = Thread.currentThread().getContextClassLoader();
    if (parent == null) {
        parent = EJob.class.getClassLoader();
    }
    if (parent == null) {
        parent = ClassLoader.getSystemClassLoader();
    }
    return new URLClassLoader(classPath.toArray(new URL[0]), parent);
}
The code is simple enough that it needs no further commentary. A usage example:
EJob.addClasspath("/usr/lib/hadoop-0.20/conf");
ClassLoader classLoader = EJob.getClassLoader();
Thread.currentThread().setContextClassLoader(classLoader);
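The addClasspath helper itself is not shown above; a minimal sketch, assuming classPath is a static List<URL> field on the EJob class:

private static List<URL> classPath = new ArrayList<URL>();

public static void addClasspath(String component) {
    if (component == null || component.length() == 0) {
        return;
    }
    try {
        File f = new File(component);
        if (f.exists()) {
            // Directories and jars alike become URLs for the URLClassLoader.
            classPath.add(f.toURI().toURL());
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    }
}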
With the class loader in place, one step remains: packaging the jar, i.e. having the project package its own classes into a jar. I will use the standard Eclipse project layout as the example, packaging the classes in the bin folder.
public static File createTempJar(String root) throws IOException {
    if (!new File(root).exists()) {
        return null;
    }
    Manifest manifest = new Manifest();
    manifest.getMainAttributes().putValue("Manifest-Version", "1.0");
    // Create the jar in the system temp directory...
    final File jarFile = File.createTempFile("EJob-", ".jar",
            new File(System.getProperty("java.io.tmpdir")));
    // ...and remove it automatically when the JVM exits.
    Runtime.getRuntime().addShutdownHook(new Thread() {
        public void run() {
            jarFile.delete();
        }
    });
    JarOutputStream out = new JarOutputStream(new FileOutputStream(jarFile),
            manifest);
    createTempJarInner(out, new File(root), "");
    out.flush();
    out.close();
    return jarFile;
}
private static void createTempJarInner(JarOutputStream out, File f,
        String base) throws IOException {
    if (f.isDirectory()) {
        // Recurse into subdirectories, extending the jar entry path.
        File[] fl = f.listFiles();
        if (base.length() > 0) {
            base = base + "/";
        }
        for (int i = 0; i < fl.length; i++) {
            createTempJarInner(out, fl[i], base + fl[i].getName());
        }
    } else {
        // Copy a single file into the jar under its relative path.
        out.putNextEntry(new JarEntry(base));
        FileInputStream in = new FileInputStream(f);
        byte[] buffer = new byte[1024];
        int n = in.read(buffer);
        while (n != -1) {
            out.write(buffer, 0, n);
            n = in.read(buffer);
        }
        in.close();
    }
}
The public entry point here is createTempJar, which takes the root path of the folder to package; subfolders are supported. It recursively packs the folder's structure and files into the jar. It is all basic file-stream work; the only slightly unfamiliar parts are Manifest and JarOutputStream, and a quick look at the API docs clears those up.
Good: everything is in place, so let's put it into practice. Again taking WordCount as the example:
// Add these statements. XXX
File jarFile = EJob.createTempJar("bin");
EJob.addClasspath("/usr/lib/hadoop-0.20/conf");
ClassLoader classLoader = EJob.getClassLoader();
Thread.currentThread().setContextClassLoader(classLoader);
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args)
.getRemainingArgs();
if (otherArgs.length != 2) {
System.err.println("Usage: wordcount <in> <out>");
System.exit(2);
}
Job job = new Job(conf, "word count");
job.setJarByClass(WordCountTest.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
Run as Java Application... and: a "No job jar file set..." exception! It seems job.setJarByClass(WordCountTest.class) failed to set the job jar. Why is that?

Because that method uses WordCountTest.class's class loader to find the jar containing the class, and then sets that jar as the job's jar. But our job jar is only packaged at runtime, and WordCountTest.class is loaded by the AppClassLoader, whose search path we cannot change once the program is running, so setJarByClass cannot set the job jar. We must set it directly with JobConf's setJar, like this:

// And add this statement. XXX
((JobConf) job.getConfiguration()).setJar(jarFile.toString());

OK, modify the example above by adding this statement right after the Job is created, then Run as Java Application again: finally it works!

This way of doing Run on Hadoop is simple to use and compatible across setups; give it a try. :)

For lack of time, this example was only tested in pseudo-distributed mode on Ubuntu, but in principle it can be applied to a real distributed cluster as well.

The end.