【发布时间】:2023-04-01 08:34:01
【问题描述】:
我已经在 8 节点集群上部署了 hadoop (0.20.203.0rc1)。将文件上传到 hdfs 后,我只在其中一个节点上得到了这个文件,而不是均匀分布在所有节点上。可能是什么问题?
$HADOOP_HOME/bin/hadoop dfs -copyFromLocal ../data/rmat-20.0 /user/frolo/input/rmat-20.0
$HADOOP_HOME/bin/hadoop dfs -stat "%b %o %r %n" /user/frolo/input/rmat-*
1220222968 67108864 1 rmat-20.0
$HADOOP_HOME/bin/hadoop dfsadmin -report
Configured Capacity: 2536563998720 (2.31 TB)
Present Capacity: 1642543419392 (1.49 TB)
DFS Remaining: 1641312030720 (1.49 TB)
DFS Used: 1231388672 (1.15 GB)
DFS Used%: 0.07%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 8 (8 total, 0 dead)
Name: 10.10.1.15:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131536928768 (122.5 GB)
DFS Remaining: 185533546496(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.13:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131533377536 (122.5 GB)
DFS Remaining: 185537097728(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.52%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.17:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 120023924736 (111.78 GB)
DFS Remaining: 197046550528(183.51 GB)
DFS Used%: 0%
DFS Remaining%: 62.15%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.18:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 78510628864 (73.12 GB)
DFS Remaining: 238559846400(222.18 GB)
DFS Used%: 0%
DFS Remaining%: 75.24%
Last contact: Fri Feb 07 12:10:24 MSK 2014
Name: 10.10.1.14:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131537530880 (122.5 GB)
DFS Remaining: 185532944384(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.11:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 1231216640 (1.15 GB)
Non DFS Used: 84698116096 (78.88 GB)
DFS Remaining: 231141167104(215.27 GB)
DFS Used%: 0.39%
DFS Remaining%: 72.9%
Last contact: Fri Feb 07 12:10:24 MSK 2014
Name: 10.10.1.16:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 131537494016 (122.5 GB)
DFS Remaining: 185532981248(172.79 GB)
DFS Used%: 0%
DFS Remaining%: 58.51%
Last contact: Fri Feb 07 12:10:27 MSK 2014
Name: 10.10.1.12:50010
Decommission Status : Normal
Configured Capacity: 317070499840 (295.29 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 84642578432 (78.83 GB)
DFS Remaining: 232427896832(216.47 GB)
DFS Used%: 0%
DFS Remaining%: 73.3%
Last contact: Fri Feb 07 12:10:27 MSK 2014
【问题讨论】:
-
我刚刚使用相同的命令上传了另一个文件: $HADOOP_HOME/bin/hadoop dfs -copyFromLocal ../data/rmat-20.0 /user/frolo/input/rmat-20.0-2 和它也已加载到 10.10.1.11 节点,顺便说一下,这是我运行命令的节点(主节点)。
-
你的复制因子是多少? HDFS 数据可能并不总是均匀地放置在 DataNode 中。如果您主要关心的是单个节点上的所有数据,并且如果您正在寻找跨节点强制平衡数据的方法(以任何值复制),一个简单的选项是 $HADOOP_HOME/bin/start-balancer.sh 它将运行一个平衡过程自动在集群中移动块