
Notes on Problems Encountered Using Hadoop

2013-08-30 12:21
Port 50030 already in use:

  2011-05-1 14:30:43,931 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50030
  2011-05-1 14:30:43,933 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)

  Change the port in mapred-default.xml (in practice the override is usually placed in mapred-site.xml):

<property>
  <name>mapred.job.tracker.http.address</name>
  <value>0.0.0.0:50030</value>
  <description>The job tracker http server address and port the server will listen on. If the port is 0 then the server will start on a free port.</description>
</property>
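
Before (or instead of) changing the port, it is worth confirming which process is actually holding 50030. A minimal check on Linux, assuming lsof or netstat is available:

  # find the process listening on port 50030 (either command works)
  lsof -i :50030
  netstat -tlnp | grep 50030

If the listener turns out to be a stale JobTracker left over from an earlier run, killing that process is usually preferable to moving the port.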
Another error, seen while writing to HDFS:

java.io.IOException: All datanodes xxx.xxx.xxx.xxx:xxx are bad. Aborting…
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2158)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)

java.io.IOException: Could not get block locations. Aborting…
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)

After investigation, the cause turned out to be too many open file handles on the Linux machines. Running ulimit -n shows that the default open-file limit is 1024; edit /etc/security/limits.conf and raise the limit for the hadoop user (e.g. hadoop soft nofile 65535).

Rerun the job (ideally after making the change on every datanode) and the problem goes away.
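
A minimal sketch of the change, assuming the Hadoop daemons run as the user hadoop and that 65535 is an acceptable limit in your environment:

  # check the current open-file limit (often 1024 by default)
  ulimit -n

  # add these two lines to /etc/security/limits.conf (hadoop = the daemon user):
  #   hadoop soft nofile 65535
  #   hadoop hard nofile 65535
  # then log the user out and back in (or restart the datanode) for them to take effect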

P.S.: HDFS reportedly cannot manage more than 100M files in total; this has yet to be verified.

Starting Hadoop fails with the following error:

java.io.IOException: File /exapp/hadoop/hadooptmp/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

The namenode side did not reveal the cause, so I checked the datanode logs:
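
Before reading individual datanode logs, a quick way to see how the namenode views the cluster (a sketch; the report format varies across Hadoop versions):

  # list live and dead datanodes as seen by the namenode
  hadoop dfsadmin -report

  # on each slave, confirm the DataNode JVM is actually running
  jps | grep DataNode

"could only be replicated to 0 nodes" generally means no datanode was registered and alive at the time of the write.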


ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /exapp/hadoop/data: namenode namespaceID = 472560000; datanode namespaceID = 719491592
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)

This revealed the problem: because I had formatted the namenode again at some point, the namenode and datanode namespaceIDs no longer matched.

Fix: on each datanode, find the file /exapp/hadoop/data/current/VERSION and change namespaceID = 719491592 to namespaceID = 472560000.

The exact path differs depending on the installation directory.
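
A sketch of the fix on one datanode; the path and the two IDs are the ones from this example and will differ on other clusters (in the VERSION file the property is written without spaces, namespaceID=...):

  # check the datanode's current namespaceID
  cat /exapp/hadoop/data/current/VERSION

  # replace it with the namenode's namespaceID taken from the error above
  sed -i 's/namespaceID=719491592/namespaceID=472560000/' /exapp/hadoop/data/current/VERSION

If the blocks on that datanode are disposable, an alternative is simply to delete the data directory and let the datanode re-register on restart.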

Then start Hadoop again.