[Question Title]: Hadoop cluster setup - java.net.ConnectException: Connection refused
[Posted]: 2015-02-22 18:04:24
[Question]:

I want to set up a hadoop cluster in pseudo-distributed mode. I managed to perform all the setup steps, including starting a Namenode, Datanode, Jobtracker and Tasktracker on my machine.

Then I tried to run some exemplary programs and faced the java.net.ConnectException: Connection refused error. I stepped back to the very first steps of running some operations in standalone mode and faced the same problem.

I even triple-checked all the installation steps and have no idea how to fix it. (I am new to Hadoop and a beginner Ubuntu user, so I kindly ask you to take that into account when providing any guide or tip.)

This is the error output I keep receiving:

hduser@marta-komputer:/usr/local/hadoop$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
15/02/22 18:23:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/22 18:23:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
java.net.ConnectException: Call From marta-komputer/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.delete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:521)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.delete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1929)
    at org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:638)
    at org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:634)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:634)
    at org.apache.hadoop.examples.Grep.run(Grep.java:95)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.Grep.main(Grep.java:101)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 32 more

The etc/hadoop/hadoop-env.sh file:

# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol.  Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol.  This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored.  $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by 
#       the user that will run the hadoop daemons.  Otherwise there is the
#       potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

The Hadoop-related fragment of the .bashrc file:

# -- HADOOP ENVIRONMENT VARIABLES START -- #
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
# -- HADOOP ENVIRONMENT VARIABLES END -- #

The /usr/local/hadoop/etc/hadoop/core-site.xml file:

<configuration>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop_tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>

</configuration>

The /usr/local/hadoop/etc/hadoop/hdfs-site.xml file:

<configuration>
<property>
      <name>dfs.replication</name>
      <value>1</value>
 </property>
 <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
 </property>
 <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
 </property>
</configuration>

The /usr/local/hadoop/etc/hadoop/yarn-site.xml file:

<configuration> 
<property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
</property>
<property>
      <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>

The /usr/local/hadoop/etc/hadoop/mapred-site.xml file:

<configuration>
<property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
</property>
</configuration>

Running hduser@marta-komputer:/usr/local/hadoop$ bin/hdfs namenode -format results in the following output (I replaced part of it with (...)):

hduser@marta-komputer:/usr/local/hadoop$ bin/hdfs namenode -format
15/02/22 18:50:47 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = marta-komputer/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   classpath = /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/htrace-core-3.0.4.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-cli (...)2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.6.0.jar:/usr/local/hadoop/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.8.0_31
************************************************************/
15/02/22 18:50:47 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/02/22 18:50:47 INFO namenode.NameNode: createNameNode [-format]
15/02/22 18:50:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-0b65621a-eab3-47a4-bfd0-62b5596a940c
15/02/22 18:50:48 INFO namenode.FSNamesystem: No KeyProvider found.
15/02/22 18:50:48 INFO namenode.FSNamesystem: fsLock is fair:true
15/02/22 18:50:48 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
15/02/22 18:50:48 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
15/02/22 18:50:48 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
15/02/22 18:50:48 INFO blockmanagement.BlockManager: The block deletion will start around 2015 Feb 22 18:50:48
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map BlocksMap
15/02/22 18:50:48 INFO util.GSet: VM type       = 64-bit
15/02/22 18:50:48 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
15/02/22 18:50:48 INFO util.GSet: capacity      = 2^21 = 2097152 entries
15/02/22 18:50:48 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: defaultReplication         = 1
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxReplication             = 512
15/02/22 18:50:48 INFO blockmanagement.BlockManager: minReplication             = 1
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
15/02/22 18:50:48 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
15/02/22 18:50:48 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
15/02/22 18:50:48 INFO namenode.FSNamesystem: fsOwner             = hduser (auth:SIMPLE)
15/02/22 18:50:48 INFO namenode.FSNamesystem: supergroup          = supergroup
15/02/22 18:50:48 INFO namenode.FSNamesystem: isPermissionEnabled = true
15/02/22 18:50:48 INFO namenode.FSNamesystem: HA Enabled: false
15/02/22 18:50:48 INFO namenode.FSNamesystem: Append Enabled: true
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map INodeMap
15/02/22 18:50:48 INFO util.GSet: VM type       = 64-bit
15/02/22 18:50:48 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
15/02/22 18:50:48 INFO util.GSet: capacity      = 2^20 = 1048576 entries
15/02/22 18:50:48 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map cachedBlocks
15/02/22 18:50:48 INFO util.GSet: VM type       = 64-bit
15/02/22 18:50:48 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
15/02/22 18:50:48 INFO util.GSet: capacity      = 2^18 = 262144 entries
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
15/02/22 18:50:48 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
15/02/22 18:50:48 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map NameNodeRetryCache
15/02/22 18:50:48 INFO util.GSet: VM type       = 64-bit
15/02/22 18:50:48 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
15/02/22 18:50:48 INFO util.GSet: capacity      = 2^15 = 32768 entries
15/02/22 18:50:48 INFO namenode.NNConf: ACLs enabled? false
15/02/22 18:50:48 INFO namenode.NNConf: XAttrs enabled? true
15/02/22 18:50:48 INFO namenode.NNConf: Maximum size of an xattr: 16384
Re-format filesystem in Storage Directory /usr/local/hadoop_tmp/hdfs/namenode ? (Y or N) Y
15/02/22 18:50:50 INFO namenode.FSImage: Allocated new BlockPoolId: BP-948369552-127.0.1.1-1424627450316
15/02/22 18:50:50 INFO common.Storage: Storage directory /usr/local/hadoop_tmp/hdfs/namenode has been successfully formatted.
15/02/22 18:50:50 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/02/22 18:50:50 INFO util.ExitUtil: Exiting with status 0
15/02/22 18:50:50 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at marta-komputer/127.0.1.1
************************************************************/

Starting dfs and yarn results in the following output:

hduser@marta-komputer:/usr/local/hadoop$ start-dfs.sh
15/02/22 18:53:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-marta-komputer.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-marta-komputer.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-secondarynamenode-marta-komputer.out
15/02/22 18:53:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hduser@marta-komputer:/usr/local/hadoop$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduser-resourcemanager-marta-komputer.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduser-nodemanager-marta-komputer.out

Calling jps shortly after that gives:

hduser@marta-komputer:/usr/local/hadoop$ jps
11696 ResourceManager
11842 NodeManager
11171 NameNode
11523 SecondaryNameNode
12167 Jps

The netstat output:

hduser@marta-komputer:/usr/local/hadoop$ sudo netstat -lpten | grep java
tcp        0      0 0.0.0.0:8088            0.0.0.0:*               LISTEN      1001       690283      11696/java      
tcp        0      0 0.0.0.0:42745           0.0.0.0:*               LISTEN      1001       684574      11842/java      
tcp        0      0 0.0.0.0:13562           0.0.0.0:*               LISTEN      1001       680955      11842/java      
tcp        0      0 0.0.0.0:8030            0.0.0.0:*               LISTEN      1001       684531      11696/java      
tcp        0      0 0.0.0.0:8031            0.0.0.0:*               LISTEN      1001       684524      11696/java      
tcp        0      0 0.0.0.0:8032            0.0.0.0:*               LISTEN      1001       680879      11696/java      
tcp        0      0 0.0.0.0:8033            0.0.0.0:*               LISTEN      1001       687392      11696/java      
tcp        0      0 0.0.0.0:8040            0.0.0.0:*               LISTEN      1001       680951      11842/java      
tcp        0      0 127.0.0.1:9000          0.0.0.0:*               LISTEN      1001       687242      11171/java      
tcp        0      0 0.0.0.0:8042            0.0.0.0:*               LISTEN      1001       680956      11842/java      
tcp        0      0 0.0.0.0:50090           0.0.0.0:*               LISTEN      1001       690252      11523/java      
tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      1001       687239      11171/java  

The /etc/hosts file:

127.0.0.1       localhost
127.0.1.1       marta-komputer

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

=====================================================

Update 1.

I updated core-site.xml and now I have:

<property>
<name>fs.default.name</name>
<value>hdfs://marta-komputer:9000</value>
</property>

but I keep receiving the error - now starting as:

15/03/01 00:59:34 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
java.net.ConnectException: Call From marta-komputer.home/192.168.1.8 to marta-komputer:9000 failed on connection exception:     java.net.ConnectException: Connection refused; For more details see:    http://wiki.apache.org/hadoop/ConnectionRefused

I also noticed that telnet localhost 9000 is not working:

hduser@marta-komputer:~$ telnet localhost 9000
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

[Comments]:

  • Share your log files.
  • When I execute programs from the Standalone Operation section (see the docs: hadoop.apache.org/docs/current/hadoop-project-dist/…), none of the files in hadoop/logs get updated (I checked), so as far as I can tell no logs are generated.
  • You could try nmap localhost and nmap marta-komputer to find out which ports are actually open.
  • Hi @AndreySozykin, thanks for the suggestion! I ran both nmap localhost and nmap marta-komputer and received the following results: pic / txt. Could you help me out and offer some ideas on interpreting these results? Thanks in advance!
  • nmap lists the open ports on your computer. There is no port 9000 in the nmap output. Therefore, the port is closed. Your firewall may still be up, or the java process is not running.

Tags: java hadoop configuration connectexception


[Solution 1]:

For me, these steps worked:

  1. stop-all.sh
  2. hadoop namenode -format
  3. start-all.sh
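
For reference, a minimal sketch of this sequence on Hadoop 2.x, assuming $HADOOP_HOME/sbin is on the PATH (as in the OP's .bashrc). Note that stop-all.sh/start-all.sh are deprecated wrappers around the per-service scripts, and that formatting destroys all HDFS metadata, so this is only acceptable on a cluster whose data you can afford to lose:

stop-dfs.sh && stop-yarn.sh      # or the deprecated stop-all.sh
hdfs namenode -format            # re-initializes NameNode metadata; wipes the existing HDFS namespace
start-dfs.sh && start-yarn.sh    # or the deprecated start-all.sh
jps                              # NameNode, DataNode, ResourceManager, NodeManager should all appear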

[Discussion]:

  • It worked. What a garbage message Connection refused is. Misleading.
  • After these steps, I can no longer copy local files to dfs. Error: copyFromLocal: File /user/rovkp/trip_data_small.csv._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 0 datanode(s) running and no node(s) are excluded in this operation.
  • In my case this was not enough. Everything worked only after removing the 'hadoop.tmp.dir' directory (the value can be found in core-site.xml).
  • But why do you have to format the namenode every time you start DFS?
  • It worked. But what exactly does 'hadoop namenode -format' do? What does the format flag do?
[Solution 2]:

Hi, edit your conf/core-site.xml and change localhost to 0.0.0.0. Use the conf below. It should work.

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9000</value>
  </property>
</configuration>
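
To confirm the change took effect, one quick check (reusing the netstat flags from the question; the expected output line is an assumption) is to restart HDFS and look at the bind address of port 9000 - it should now show 0.0.0.0 instead of 127.0.0.1:

stop-dfs.sh && start-dfs.sh
sudo netstat -lpten | grep ':9000'    # expect something like: tcp ... 0.0.0.0:9000 ... LISTEN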

[Discussion]:

  • Thanks, I think this issue is the same as my other one, stackoverflow.com/questions/34410181/…. I am a bit confused: isn't this value the assigned address rather than who can access it?
  • This worked for me. I believe this is a common solution to similar problems with other servers, i.e. a server listening only on localhost or only on one IP.
  • The address "0.0.0.0" means, on a server, "start your server on all the network interfaces you have". On a client, it does not tell you where the host is. A client cannot talk to a service at 0.0.0.0, because that carries no information about where the service is running.
[Solution 3]:

From the netstat output you can see the process is listening on address 127.0.0.1:

tcp        0      0 127.0.0.1:9000          0.0.0.0:*  ...

From the exception message you can see that it tries to connect to address 127.0.1.1:

java.net.ConnectException: Call From marta-komputer/127.0.1.1 to localhost:9000 failed ...

Further on, towards the end, the exception mentions:

For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

On that page you can find:

Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this)

So the conclusion is to remove this line from your /etc/hosts:

127.0.1.1       marta-komputer
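
A non-destructive way to apply this is to comment the line out rather than delete it; the sed invocation below is just one illustrative way to do that (back the file up first), and getent shows what the hostname resolves to afterwards:

sudo cp /etc/hosts /etc/hosts.bak
sudo sed -i 's/^127\.0\.1\.1/# 127.0.1.1/' /etc/hosts
getent hosts marta-komputer    # check what the hostname resolves to now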

[Discussion]:

  • Thank you for your interest in my problem! I finally managed to check this, and commenting out the line you pointed to did not help - the error now starts with: java.net.ConnectException: Call From marta-komputer.home/192.168.1.8 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[Solution 4]:

I had a similar problem to the OP's. As the terminal output suggested, I went to http://wiki.apache.org/hadoop/ConnectionRefused

I tried to change my /etc/hosts file as suggested there, i.e., removing 127.0.1.1 as the OP suggested, but that produced another error.

So in the end, I kept it as-is. The following is my /etc/hosts:

127.0.0.1       localhost.localdomain   localhost
127.0.1.1       linux
# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Eventually, I found that my namenode was not started correctly, i.e., when you type sudo netstat -lpten | grep java in the terminal, there is no JVM process running (listening) on port 9000.

So I made two directories, for the namenode and the datanode respectively (if you have not done so already). You don't have to put them where I put mine; please adjust the paths to your hadoop directory. I.e.:

mkdir -p /home/hadoopuser/hadoop-2.6.2/hdfs/namenode
mkdir -p /home/hadoopuser/hadoop-2.6.2/hdfs/datanode
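
One assumption worth checking after creating these directories: they must be owned and writable by the user that runs the hadoop daemons, otherwise the NameNode will again fail to come up. For the paths used in this answer, that would look roughly like:

sudo chown -R hadoopuser:hadoopuser /home/hadoopuser/hadoop-2.6.2/hdfs    # hadoopuser is this answer's daemon user
chmod -R 755 /home/hadoopuser/hadoop-2.6.2/hdfs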

I reconfigured my hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
   <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoopuser/hadoop-2.6.2/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoopuser/hadoop-2.6.2/hdfs/datanode</value>
    </property>
</configuration>

In the terminal, stop hdfs and yarn with the scripts stop-dfs.sh and stop-yarn.sh. They are located in your hadoop directory under sbin. In my case, that is /home/hadoopuser/hadoop-2.6.2/sbin/.

Then start hdfs and yarn with the scripts start-dfs.sh and start-yarn.sh. After they start, type jps in the terminal to see whether your JVM processes are running correctly. It should show the following:

15678 NodeManager
14982 NameNode
15347 SecondaryNameNode
23814 Jps
15119 DataNode
15548 ResourceManager

Then try to use netstat again to see whether your namenode is listening on port 9000:

sudo netstat -lpten | grep java

If you set up the namenode successfully, you should see the following in the terminal output:

tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 1001 175157 14982/java

Then try the command hdfs dfs -mkdir /user/hadoopuser. If this command executes successfully, you can list the directories in your HDFS user directory with hdfs dfs -ls /user.
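
Spelled out as a concrete end-to-end smoke test (the uploaded file is just an example), the sequence might look like:

hdfs dfs -mkdir -p /user/hadoopuser
hdfs dfs -ls /user
hdfs dfs -put /etc/hostname /user/hadoopuser/    # upload a small file
hdfs dfs -cat /user/hadoopuser/hostname          # read it back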

[Discussion]:

  • Thanks, this helped me.
  • I am using an Ubuntu VM on VirtualBox and was trying to get the namenode to show up after typing jps, with no luck. I spent several days looking for a solution. I followed the instructions above. At first it did not work, and when I checked them, the namenode and datanode folders were empty. So I had to stop-dfs.sh and stop-yarn.sh, then run hdfs namenode -format. After I did this, I got new files in both folders. I used start-dfs.sh and start-yarn.sh again. When I typed jps, I got namenode, datanode, resourcemanager, etc., and the connection error message disappeared. Thanks for the solution.
[Solution 5]:

For me, the problem was that my zookeeper cluster was not set up correctly - both NameNodes reported as active:

hdfs haadmin -getServiceState 1
active

hdfs haadmin -getServiceState 2
active

My hadoop-hdfs-zkfc-[hostname].log showed:

2017-04-14 11:46:55,351 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at HOST/192.168.1.55:9000: java.net.ConnectException: Connection refused Call From HOST/192.168.1.55 to HOST:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

Solution:

In hdfs-site.xml:
  <property>
    <name>dfs.namenode.rpc-bind-host</name>
      <value>0.0.0.0</value>
  </property>
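
The bind address is only read at startup, so the NameNode must be restarted for the new property to take effect; a sketch using the Hadoop 2.x daemon script:

$HADOOP_HOME/sbin/hadoop-daemon.sh stop namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode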

Before:

netstat -plunt

tcp        0      0 192.168.1.55:9000        0.0.0.0:*               LISTEN      13133/java

nmap localhost -p 9000

Starting Nmap 6.40 ( http://nmap.org ) at 2017-04-14 12:15 EDT
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000047s latency).
Other addresses for localhost (not scanned): 127.0.0.1
PORT     STATE  SERVICE
9000/tcp closed cslistener

After:

netstat -plunt
tcp        0      0 0.0.0.0:9000            0.0.0.0:*               LISTEN      14372/java

nmap localhost -p 9000

Starting Nmap 6.40 ( http://nmap.org ) at 2017-04-14 12:28 EDT
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000039s latency).
Other addresses for localhost (not scanned): 127.0.0.1
PORT     STATE SERVICE
9000/tcp open  cslistener

[Discussion]:

[Solution 6]:

In /etc/hosts:

1. Add this line:

   your-ip-address your-hostname

   Example: 192.168.1.8 master

2. Delete the line with 127.0.1.1 (it will cause a loopback)

3. In your core-site.xml, change localhost to your-ip or your-hostname

Now, restart the cluster.
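
Putting it together, an illustrative /etc/hosts for the OP's machine might look as follows (192.168.1.8 is the address that appears in the OP's later error message; substitute your own interface address):

192.168.1.8     marta-komputer
127.0.0.1       localhost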

[Discussion]:

[Solution 7]:

Make sure HDFS is online. Start it with $HADOOP_HOME/sbin/start-dfs.sh. Once you have done that, your test with telnet localhost 9001 should work.
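
As a minimal check sequence (note that this answer says port 9001, while the OP's core-site.xml uses 9000, so adjust to whatever your fs.default.name says):

$HADOOP_HOME/sbin/start-dfs.sh
telnet localhost 9000    # should connect instead of printing 'Connection refused'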

[Discussion]:

[Solution 8]:

Check your firewall settings, and set:

          <property>
          <name>fs.default.name</name>
          <value>hdfs://MachineName:9000</value>
          </property>
        

Replace localhost with your machine name.
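
On Ubuntu (which the OP is using), a quick way to inspect the firewall is ufw, with iptables showing the underlying rules; this is a diagnostic sketch, not part of the original answer:

sudo ufw status verbose          # 'Status: inactive' means ufw is not blocking anything
sudo iptables -L -n | grep 9000  # look for REJECT/DROP rules on the NameNode port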

[Discussion]:

  • A firewall can block connections to port 9000; disable the firewall.
  • Thanks for the suggestion! Unfortunately, it does not work. I changed the <value> tag as you suggested (to: <value>hdfs://marta-komputer:9000</value>) and made sure the firewall is disabled (under Ubuntu 14.04: sudo ufw disable → Firewall stopped and disabled on system startup). Now the error I receive starts with: Connecting to ResourceManager at /0.0.0.0:8032 java.net.ConnectException: Call From marta-komputer/127.0.1.1 to marta-komputer:9000 failed on connection exception: java.net.ConnectException: Connection refused;
  • I recently faced this issue while installing a hadoop cluster. In my case, I could not start the hadoop cluster correctly, so hadoop could not open the specific port. The problem in my case was heap memory. Can you check whether you have enough heap memory available, or whether it overflowed? Which Hadoop distribution are you using? Please provide details. Based on the situation, my guess is that your cluster did not start properly.
  • Hi! Thanks for the comment. Please note that I realized I keep getting the connection refused error even when running Standalone Operation (see the example I tried to run from the official documentation). I believe this should be considered and solved separately from (to be precise: before) starting the hadoop cluster :)
[Solution 9]:

hduser@marta-komputer:/usr/local/hadoop$ jps

11696 ResourceManager

11842 NodeManager

11171 NameNode

11523 SecondaryNameNode

12167 Jps

Where is your DataNode? A Connection refused problem can also be caused by there being no active DataNode. Check the datanode logs for problems.

UPDATE:

For this error:

15/03/01 00:59:34 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 java.net.ConnectException: Call From marta-komputer.home/192.168.1.8 to marta-komputer:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

        yarn-site.xml 中添加这些行:

        <property>
        <name>yarn.resourcemanager.address</name>
        <value>192.168.1.8:8032</value>
        </property>
        <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>192.168.1.8:8030</value>
        </property>
        <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>192.168.1.8:8031</value>
        </property>
        

Restart the hadoop processes.
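
A restart-and-verify sketch for just the YARN side (192.168.1.8 is the address used in this answer; the netstat flags are the ones used earlier in the question):

stop-yarn.sh && start-yarn.sh
sudo netstat -lpten | grep ':8032'    # ResourceManager should now listen on 192.168.1.8:8032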

[Discussion]:

  • Hi! Thanks for your answer. Please note that I have already realized that I keep getting the connection refused error even when running Standalone Operation (see the example I tried to run from the official documentation). I believe this should be considered and solved separately from (to be precise: before) starting the nodes :)
[Solution 10]:

Your problem is a very interesting one. A Hadoop setup can be frustrating for some time due to the complexity of the system and the many moving parts involved. I think the problem you are facing is definitely a firewall one. My hadoop cluster has a similar setup. After adding a firewall rule with the command:

         sudo iptables -A INPUT -p tcp --dport 9000 -j REJECT
        

I can see the exact problem:

        15/03/02 23:46:10 INFO client.RMProxy: Connecting to ResourceManager at  /0.0.0.0:8032
        java.net.ConnectException: Call From mybox/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
             at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        

You can verify your firewall settings with the command:

        /usr/local/hadoop/etc$ sudo iptables -L
        Chain INPUT (policy ACCEPT)
        target     prot opt source               destination         
        REJECT     tcp  --  anywhere             anywhere             tcp dpt:9000 reject-with icmp-port-unreachable
        
        Chain FORWARD (policy ACCEPT)
        target     prot opt source               destination         
        
        Chain OUTPUT (policy ACCEPT)
        target     prot opt source               destination   
        

Once the suspect rule has been identified, it can be deleted with a command like:

         sudo iptables -D INPUT -p tcp --dport 9000 -j REJECT 
        

Now, the connection should go through.
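
If the rule was added with slightly different options, deleting it by exact specification can fail; deleting by rule number is a more forgiving variant (this addition is not part of the original answer):

sudo iptables -L INPUT --line-numbers -n | grep 9000   # find the rule number
sudo iptables -D INPUT 3                               # delete by number (3 is just an example)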

[Discussion]:

  • Hi! Thanks for your answer! I ran sudo iptables -A INPUT -p tcp --dport 9000 -j REJECT and it changed nothing in my case. Nevertheless, I then ran sudo iptables -D INPUT -p tcp --dport 9000 -j REJECT and still face the Connection refused problem.
[Solution 11]:

In my experience:

        15/02/22 18:23:04 WARN util.NativeCodeLoader: Unable to load native-hadoop
        library for your platform... using builtin-java classes where applicable
        

You may have a 64-bit version of the OS with a 32-bit hadoop installation. Refer to this.

        java.net.ConnectException: Call From marta-komputer/127.0.1.1 to
        localhost:9000 failed on connection exception: java.net.ConnectException: 
        connection refused; For more details see:   
        http://wiki.apache.org/hadoop/ConnectionRefused
        

This problem refers to your ssh public key authorization. Please provide details about your ssh setup.

Please refer to this link to check the complete steps.

Also provide information on whether

cat $HOME/.ssh/authorized_keys

returns any result or not.
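
For completeness, the usual way to set up passwordless ssh for a single-node Hadoop install is sketched below (run ssh-keygen only if you do not already have a key pair):

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa         # generate a key with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys  # authorize it for local logins
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                    # should log in without prompting for a password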

[Discussion]:

  • Hi! Thanks for your answer! Let me focus only on the connection refused issue. I think I performed the ssh public key authorization correctly (I followed this post, section "Configuring SSH"). I ran ssh localhost and received Welcome to Ubuntu 14.04.1 LTS (...) Last login: Wed Apr 22 22:40:11 2015 from localhost (see full output). The output of cat $HOME/.ssh/authorized_keys also seems correct: ssh-rsa AAAAB3NzaC1y (...) Xrtegbh7 hduser@marta-komputer
  • From the ssh localhost result it seems your ssh is working fine, and so are hadoop and java. Did you try checking all the steps in the link - wiki.apache.org/hadoop/ConnectionRefused? Try setting everything up again. It looks like a configuration problem, since ssh, java and hadoop are working fine, but I cannot figure it out from the given information. Please let us know if you do solve your problem.
[Solution 12]:

I resolved the same issue by adding this property to hdfs-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
        

[Discussion]:

[Solution 13]:

• Stop: stop-all.sh

• Format the namenode: hadoop namenode -format

• Start again: start-all.sh

[Discussion]:

[Solution 14]:

I faced the same issue with Hortonworks as well.

The issue was resolved when I restarted the Ambari agent and server.

             systemctl stop ambari-agent 
            systemctl stop ambari-server
            

Source: Full Article With Resolution

             systemctl start ambari-agent
            systemctl start ambari-server
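
As a hedged aside (not from the original answer): on many HDP installs, the status subcommands below are a quick way to verify that the restart actually brought both components back up:

systemctl status ambari-agent
ambari-server status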
            

[Discussion]:

[Solution 15]:

I ran into the same issue and found that the OpenSSH service was not running, which caused the problem. After starting the SSH service, it worked.

To check whether the SSH service is running:

              ssh localhost
              

To start the service, if OpenSSH is already installed:

              sudo /etc/init.d/ssh start
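
On systemd-based distributions, an equivalent check-and-start (assuming the service is named ssh, as on Ubuntu) would be:

sudo systemctl status ssh
sudo systemctl start ssh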
              

[Discussion]:

[Solution 16]:

Go to $SPARK_HOME/conf, then open the spark-env.sh file and add:

SPARK_MASTER_HOST=your-IP
                
                SPARK_LOCAL_IP=127.0.0.1
                

[Discussion]:
