File***could only be replicated to 0 nodes instead of minReplication (=1)

1. After the cluster deployment was complete, a test upload of a file to HDFS failed with the exception: File /user/hdfs_test.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).

[hadoop@abcd08 chx]$ hadoop fs -put hdfs_test.txt /user

14/11/30 21:43:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

14/11/30 21:43:06 WARN hdfs.DFSClient: DataStreamer Exception

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hdfs_test.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.

        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1331)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198)

        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:480)

        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)

        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)

        at org.apache.hadoop.ipc.Client.call(Client.java:1225)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)

        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

        at java.lang.reflect.Method.invoke(Method.java:597)

        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)

        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)

        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)

        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291)

        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1176)

        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1029)

        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)

put: File /user/hdfs_test.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.

14/11/30 21:43:06 ERROR hdfs.DFSClient: Failed to close file /user/hdfs_test.txt._COPYING_

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hdfs_test.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.

        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1331)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198)

        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:480)

        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)

        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)

        at org.apache.hadoop.ipc.Client.call(Client.java:1225)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)

        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

        at java.lang.reflect.Method.invoke(Method.java:597)

        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)

        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)

        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)

        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291)

        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1176)

        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1029)

        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)

[hadoop@abcd08 chx]$ 

2. Run hadoop dfsadmin -report to check the cluster status:
[hadoop@abcd08 ~]$ hadoop dfsadmin -report

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

14/11/30 22:09:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Configured Capacity: 0 (0 B)

Present Capacity: 0 (0 B)

DFS Remaining: 0 (0 B)

DFS Used: 0 (0 B)

DFS Used%: NaN%

Under replicated blocks: 0

Blocks with corrupt replicas: 0

Missing blocks: 0

------------------------------------------------

Datanodes available: 0 (0 total, 0 dead)

[hadoop@abcd08 ~]$    
Clearly, no DataNodes are running.
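
As a quick cross-check, running jps on the hosts shows whether the DataNode JVMs are alive at all; a minimal sketch, assuming the same hadoop user and hostnames as above:

[hadoop@abcd08 ~]$ jps    # on the NameNode host; expect a NameNode process
[hadoop@abcd07 ~]$ jps    # on a worker host; a healthy node would also list a DataNode process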

3. Re-run sh start-dfs.sh (it was executed twice in succession, and both runs produced the output below):
[hadoop@abcd08 sbin]$ sh start-dfs.sh 

14/11/30 23:30:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting namenodes on [abcd08]

abcd08: starting namenode, logging to /home/hadoop/cdh4/hadoop-2.0.0-cdh4.3.0/logs/hadoop-hadoop-namenode-abcd08.out

abcd05: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd05.out

abcd06: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd06.out

abcd02: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd02.out

abcd01: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd01.out

abcd07: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd07.out

abcd04: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd04.out

abcd03: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd03.out

Starting secondary namenodes [abcd01]

abcd01: starting secondarynamenode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-secondarynamenode-abcd01.out

14/11/30 23:30:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@abcd08 sbin]$ 
Since the second run still prints "starting datanode" for every worker host (rather than reporting an already-running DataNode), each DataNode process must be exiting right after it starts.
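
To see why they exit, look for FATAL or ERROR entries near the end of the DataNode log on any worker; a minimal sketch, assuming the log path reported by start-dfs.sh above:

[hadoop@abcd07 ~]$ grep -E 'FATAL|ERROR' /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd07.log | tail -5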

4. Check the log file on one of the DataNodes:
[hadoop@abcd07 logs]$ pwd

/home/hadoop/cdh4/hadoop/logs

[hadoop@abcd07 logs]$ tail -50 hadoop-hadoop-datanode-abcd07.log

2014-11-30 22:51:36,896 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-434621394-100.101.138.60-1417342220181 (storage id DS-139576203-100.101.138.59-50010-1416757117263) service to abcd08/100.101.138.60:8020

java.io.IOException: Incompatible clusterIDs in /data1/hadoop: namenode clusterID = CID-2d09836b-d546-4066-9deb-b28cc55de11a; datanode clusterID = CID-b664bda9-3f22-412f-86bd-372f98a73a52

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)

        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:911)

        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:882)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:308)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)

        at java.lang.Thread.run(Thread.java:662)

2014-11-30 22:51:36,898 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-434621394-100.101.138.60-1417342220181 (storage id DS-139576203-100.101.138.59-50010-1416757117263) service to abcd08/100.101.138.60:8020

2014-11-30 22:51:36,999 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-434621394-100.101.138.60-1417342220181 (storage id DS-139576203-100.101.138.59-50010-1416757117263)

2014-11-30 22:51:39,000 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode

2014-11-30 22:51:39,002 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0

2014-11-30 22:51:39,007 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 

/************************************************************

SHUTDOWN_MSG: Shutting down DataNode at abcd07/100.101.138.59

************************************************************/

The error reported is: java.io.IOException: Incompatible clusterIDs in /data1/hadoop: namenode clusterID = CID-2d09836b-d546-4066-9deb-b28cc55de11a; datanode clusterID = CID-b664bda9-3f22-412f-86bd-372f98a73a52.

5. Check the hdfs-site.xml configuration file on each node to find the configured NameNode and DataNode storage directories:
[hadoop@abcd07 hadoop]$ more hdfs-site.xml 

<!-- Put site-specific property overrides in this file. -->

<configuration>

        <property>

                <name>dfs.replication</name>

                <value>3</value>

        </property>

        <property>

                <name>dfs.namenode.name.dir</name>

                <value>/home/hadoop/cdh4/hadoop/dfs/name</value>

        </property>

        <property>

                <name>dfs.datanode.data.dir</name>

                <value>/data1/hadoop</value>

        </property>

        <property>

                <name>dfs.namenode.secondary.http-address</name>

                <value>abcd01:50090</value>

                <description></description>

        </property>

        <property>

                <name>dfs.webhdfs.enabled</name>

                <value>true</value>

        </property>

</configuration>

[hadoop@abcd07 hadoop]$

On each DataNode, open the VERSION file under the data directory configured above (dfs.datanode.data.dir, i.e. /data1/hadoop/current) and compare it with the VERSION file under /home/hadoop/cdh4/hadoop/dfs/name/current on the NameNode. The clusterIDs are indeed different.

As shown below:
On the NameNode:
[hadoop@abcd08 current]$ more VERSION 
#Sun Nov 30 18:10:20 CST 2014
namespaceID=1651811630
clusterID=CID-2d09836b-d546-4066-9deb-b28cc55de11a
cTime=0
storageType=NAME_NODE
blockpoolID=BP-434621394-100.101.138.60-1417342220181
layoutVersion=-40
[hadoop@abcd08 current]$

On a DataNode:
[hadoop@abcd07 current]$ more VERSION 

#Mon Nov 24 00:31:56 CST 2014

namespaceID=1422355035

clusterID=CID-597dcc33-c77e-4d18-a8b9-6e940b63d3e6

cTime=0

storageType=NAME_NODE

blockpoolID=BP-178246317-100.101.138.60-1416760316740

layoutVersion=-40

[hadoop@abcd07 current]$  
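
To compare the clusterIDs of every node in one pass, the VERSION files can be grepped over SSH; a minimal sketch, assuming passwordless SSH for the hadoop user and the directories from hdfs-site.xml above:

[hadoop@abcd08 ~]$ grep clusterID /home/hadoop/cdh4/hadoop/dfs/name/current/VERSION    # NameNode side
[hadoop@abcd08 ~]$ for h in abcd01 abcd02 abcd03 abcd04 abcd05 abcd06 abcd07; do
>     echo -n "$h: "; ssh $h "grep clusterID /data1/hadoop/current/VERSION"
> done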

Why the clusterIDs differ: after HDFS was formatted the first time, the cluster was started and used; later the format command (hdfs namenode -format) was run again. Re-formatting generates a new clusterID on the NameNode, while the clusterID recorded on each DataNode stays unchanged.
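
If a re-format is ever done on purpose, Hadoop 2.x appears to accept an explicit cluster ID so that the existing DataNodes remain compatible; a hedged example reusing the clusterID recorded on the DataNodes above:

[hadoop@abcd08 ~]$ hdfs namenode -format -clusterid CID-b664bda9-3f22-412f-86bd-372f98a73a52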

6. On each DataNode, edit the VERSION file under the current directory of the configured data directory (/data1/hadoop/current) and change its clusterID to match the NameNode's clusterID, then start HDFS again. This resolves the problem.
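
A minimal sketch of that fix, assuming passwordless SSH, the paths from hdfs-site.xml, and the NameNode clusterID shown above (back the VERSION files up first if in doubt):

[hadoop@abcd08 ~]$ NN_CID=CID-2d09836b-d546-4066-9deb-b28cc55de11a
[hadoop@abcd08 ~]$ for h in abcd01 abcd02 abcd03 abcd04 abcd05 abcd06 abcd07; do
>     ssh $h "sed -i 's/^clusterID=.*/clusterID=$NN_CID/' /data1/hadoop/current/VERSION"
> done
[hadoop@abcd08 ~]$ sh /home/hadoop/cdh4/hadoop-2.0.0-cdh4.3.0/sbin/start-dfs.sh
[hadoop@abcd08 ~]$ hdfs dfsadmin -report          # expect 7 live datanodes now
[hadoop@abcd08 ~]$ hadoop fs -put hdfs_test.txt /user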