【Question title】: HBase RegionServer and ZooKeeper start, but HMaster does not start (regionserver.HRegionServer: Failed construction RegionServer)
【Posted】: 2018-09-12 04:32:10
【Question】:

On a multi-node cluster, HBase's ZooKeeper and the RegionServer start, but HMaster does not start and writes the log shown further below.

Contents of hbase-site.xml:

<configuration>

        <property>
                <name>hbase.master</name>
                <value>namenode:60000</value>
        </property>

        <property>
                <name>hbase.rootdir</name>
                <value>hdfs://namenode:9001</value>
        </property>

        <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
        </property>

        <property>
                <name>hbase.zookeeper.quorum</name>
                <value>datanode</value>
        </property>

        <property>
                <name>hbase.zookeeper.property.dataDir</name>
                <value>/hadoop2/zookeeper</value>
        </property>

        <property>
                <name>hbase.zookeeper.property.clientPort</name>
                <value>2181</value>
        </property>

</configuration>

There are two machines, datanode and namenode.

On datanode, jps shows:

10977 HRegionServer
10810 HQuorumPeer
1675 DataNode

On namenode, jps shows:

12017 ResourceManager
2353 NameNode
14904 Jps
11326 Jps

Below is an excerpt from hbase-root-master-namenode.log:

2018-09-12 09:52:23,430 ERROR [main] regionserver.HRegionServer: Failed construction RegionServer
java.lang.NoClassDefFoundError: org/apache/htrace/SamplerBuilder
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:635)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
        at org.apache.hadoop.hbase.util.CommonFSUtils.getRootDir(CommonFSUtils.java:358)
        at org.apache.hadoop.hbase.util.CommonFSUtils.isValidWALRootDir(CommonFSUtils.java:407)
        at org.apache.hadoop.hbase.util.CommonFSUtils.getWALRootDir(CommonFSUtils.java:383)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeFileSystem(HRegionServer.java:691)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:600)
        at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:484)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2965)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:236)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2983)
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.SamplerBuilder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 25 more
2018-09-12 09:52:23,432 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster.
        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2972)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:236

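I have been trying to get this working for the last 4 days and have reinstalled HBase several times. Please help. Everything is set up on Ubuntu 16.

Below is the region server log file: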

hbase-root-regionserver-datanode.log

2018-09-12 09:52:25,989 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=datanode:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@1c12f3ee
2018-09-12 09:52:26,006 INFO  [main-SendThread(datanode:2181)] zookeeper.ClientCnxn: Opening socket connection to server datanode/192.168.1.134:2181. Will not attempt to authenticate using SASL (unknown error)
2018-09-12 09:52:26,023 INFO  [main-SendThread(datanode:2181)] zookeeper.ClientCnxn: Socket connection established to datanode/192.168.1.134:2181, initiating session
2018-09-12 09:52:26,079 INFO  [main-SendThread(datanode:2181)] zookeeper.ClientCnxn: Session establishment complete on server datanode/192.168.1.134:2181, sessionid = 0x165cc0408850000, negotiated timeout = 90000
2018-09-12 09:52:26,149 INFO  [main] util.log: Logging initialized @3383ms
2018-09-12 09:52:26,233 INFO  [main] http.HttpRequestLog: Http request log for http.requests.regionserver is not defined
2018-09-12 09:52:26,252 INFO  [main] http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
2018-09-12 09:52:26,252 INFO  [main] http.HttpServer: Added global filter 'clickjackingprevention' (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter)
2018-09-12 09:52:26,255 INFO  [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context regionserver
2018-09-12 09:52:26,255 INFO  [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2018-09-12 09:52:26,255 INFO  [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2018-09-12 09:52:26,284 INFO  [main] http.HttpServer: Jetty bound to port 16030
2018-09-12 09:52:26,286 INFO  [main] server.Server: jetty-9.3.19.v20170502
2018-09-12 09:52:26,327 INFO  [main] handler.ContextHandler: Started o.e.j.s.ServletContextHandler@675ffd1d{/logs,file:///hbase/logs/,AVAILABLE}
2018-09-12 09:52:26,328 INFO  [main] handler.ContextHandler: Started o.e.j.s.ServletContextHandler@30506c0d{/static,file:///hbase/hbase-webapps/static/,AVAILABLE}
2018-09-12 09:52:26,488 INFO  [main] handler.ContextHandler: Started o.e.j.w.WebAppContext@6a0ac48e{/,file:///hbase/hbase-webapps/regionserver/,AVAILABLE}{file:/hbase/hbase-webapps/regionserver}
2018-09-12 09:52:26,498 INFO  [main] server.AbstractConnector: Started ServerConnector@f84967f{HTTP/1.1,[http/1.1]}{0.0.0.0:16030}
2018-09-12 09:52:26,498 INFO  [main] server.Server: Started @3733ms

Below is the ZooKeeper log file, hbase-root-zookeeper-datanode.log:

2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:host.name=datanode
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:java.version=1.8.0_181
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:java.vendor=Oracle Corporation
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:java.home=/usr/local/jdk1.8.0_181/jre
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: l-4.0.23.Final.jar:/hbase/bin/../lib/org.eclipse.jdt.core-3.8.2.v20130121.jar:/hbase/bin/../lib/osgi-resource-locator-1.0.1.jar:/hbase/bin/../lib/paranamer-2.3.jar:/hbase/bin/../lib/protobuf-java-2.5.0.jar:/hbase/bin/../lib/snappy-java-1.0.5.jar:/hbase/bin/../lib/spymemcached-2.12.2.jar:/hbase/bin/../lib/validation-api-1.1.0.Final.jar:/hbase/bin/../lib/xmlenc-0.52.jar:/hbase/bin/../lib/xz-1.0.jar:/hbase/bin/../lib/zookeeper-3.4.10.jar:/hbase/bin/../lib/client-facing-thirdparty/audience-annotations-0.5.0.jar:/hbase/bin/../lib/client-facing-thirdparty/commons-logging-1.2.jar:/hbase/bin/../lib/client-facing-thirdparty/findbugs-annotations-1.3.9-1.jar:/hbase/bin/../lib/client-facing-thirdparty/htrace-core4-4.2.0-incubating.jar:/hbase/bin/../lib/client-facing-thirdparty/log4j-1.2.17.jar:/hbase/bin/../lib/client-facing-thirdparty/slf4j-api-1.7.25.jar:/hbase/bin/../lib/client-facing-thirdparty/htrace-core-3.1.0-incubating.jar:/hbase/bin/../lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:java.io.tmpdir=/tmp
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:java.compiler=<NA>
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:os.name=Linux
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:os.arch=amd64
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:os.version=4.13.0-46-generic
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:user.name=root
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:user.home=/root
2018-09-12 09:52:21,017 INFO  [main] server.ZooKeeperServer: Server environment:user.dir=/root
2018-09-12 09:52:21,026 INFO  [main] server.ZooKeeperServer: tickTime set to 3000
2018-09-12 09:52:21,026 INFO  [main] server.ZooKeeperServer: minSessionTimeout set to -1
2018-09-12 09:52:21,026 INFO  [main] server.ZooKeeperServer: maxSessionTimeout set to 90000
2018-09-12 09:52:21,037 INFO  [main] server.NIOServerCnxnFactory: binding to port 0.0.0.0/0.0.0.0:2181
2018-09-12 09:52:26,016 INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxnFactory: Accepted socket connection from /192.168.1.134:44004
2018-09-12 09:52:26,028 INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.ZooKeeperServer: Client attempting to establish new session at /192.168.1.134:44004
2018-09-12 09:52:26,028 INFO  [SyncThread:0] persistence.FileTxnLog: Creating new log file: log.3
2018-09-12 09:52:26,077 INFO  [SyncThread:0] server.ZooKeeperServer: Established session 0x165cc0408850000 with negotiated timeout 90000 for client /192.168.1.134:44004

【Comments】:

  • I have seen this error here before; it went away when they downgraded from 2.1 to 2.0.2.

Tags: java hadoop hdfs hbase apache-zookeeper


【Answer 1】:

My installation has:

  • Ubuntu 18.04
  • Hadoop 3.1.1
  • HBase 2.1.0

The same error is discussed in this question.

I hit the same error when starting the HBase Master daemon:

2018-10-17 17:53:17,058 ERROR [main] regionserver.HRegionServer: Failed construction RegionServer
java.lang.NoClassDefFoundError: org/apache/htrace/SamplerBuilder
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:635)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
        ...

My guess is that this error occurs because the HBase master uses the jars from HADOOP_HOME and cannot find the htrace-core library there.
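One way to probe this guess is to look at which htrace-core jars each tool reports on its classpath. The following is only a sketch and not part of the original answer: `list_htrace_jars` is a hypothetical helper that splits a colon-separated classpath and prints the entries whose file name starts with `htrace-core`. It assumes the `hbase classpath` and `hadoop classpath` commands are on the PATH, and it does not expand wildcard entries such as `lib/*`.

```shell
# Hypothetical helper: print the classpath entries that are htrace-core jars.
# $1 is a colon-separated classpath string.
# Returns non-zero (grep's exit status) when no entry matches.
list_htrace_jars() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -E '(^|/)htrace-core[^/]*\.jar$'
}

# Usage (assumes both commands are available):
#   list_htrace_jars "$(hbase classpath)"
#   list_htrace_jars "$(hadoop classpath)"
```

If the 3.1.0 jar shows up in neither list, that would be consistent with the `ClassNotFoundException` for `org.apache.htrace.SamplerBuilder` in the log above.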

I checked for htrace jars under both HBASE_HOME and HADOOP_HOME:

$ find $HBASE_HOME/ -type f -name 'htrace-core*' -ls
   689258   1472 -rw-r--r--   1 hadoop   hadoop    1506370 июл  6 18:28 /opt/hbase-2.1.0/lib/client-facing-thirdparty/htrace-core4-4.2.0-incubating.jar
   689260   1444 -rw-r--r--   1 hadoop   hadoop    1475955 июл  6 18:33 /opt/hbase-2.1.0/lib/client-facing-thirdparty/htrace-core-3.1.0-incubating.jar

$ find $HADOOP_HOME/ -type f -name 'htrace-core*' -ls
   672815   1444 -rw-r--r--   1 hadoop   hadoop    1475955 авг  2 09:50 /opt/hadoop-3.1.1/share/hadoop/yarn/timelineservice/lib/htrace-core-3.1.0-incubating.jar
   673317   1468 -rw-r--r--   1 hadoop   hadoop    1502280 авг  2 09:28 /opt/hadoop-3.1.1/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar
   656456   1444 -rw-r--r--   1 hadoop   hadoop    1475955 июл  6 18:33 31 /opt/hadoop-3.1.1/share/hadoop/hdfs/lib/htrace-core4-4.1.0-incubating.jar

So I put the htrace-core-3.1.0-incubating.jar from HBASE_HOME into $HADOOP_HOME/share/hadoop/common/lib.
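The step above can be sketched as a small shell function. This is an illustration under assumptions, not the answerer's exact command: `copy_jar_if_missing` is a hypothetical helper, and the paths in the usage comment combine the `find` output above with the target directory named in the text.

```shell
# Hypothetical helper: copy a jar into a lib directory unless a file
# with the same name already exists there.
# $1 = path to the source jar, $2 = destination directory.
copy_jar_if_missing() {
  local src="$1" dest_dir="$2"
  local name
  name="$(basename "$src")"
  if [ -e "$dest_dir/$name" ]; then
    echo "already present: $dest_dir/$name"
  else
    cp -v "$src" "$dest_dir/"
  fi
}

# Usage (paths assumed from the listing above):
#   copy_jar_if_missing \
#     "$HBASE_HOME/lib/client-facing-thirdparty/htrace-core-3.1.0-incubating.jar" \
#     "$HADOOP_HOME/share/hadoop/common/lib"
```

Copying rather than moving leaves the HBase copy in place, so the change is easy to undo by deleting the new file.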

That led to the next error:

2018-10-17 18:13:18,380 ERROR [Thread-14] master.HMaster: Failed to become active master
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
        at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1044)
        at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:383)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:545)
        at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1325)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:871)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2109)
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:566)
        at java.lang.Thread.run(Thread.java:748)

As explained here, all we need to do is either recompile HBase ourselves or add the following property to hbase-site.xml:

<property>
  <name>hbase.unsafe.stream.capability.enforce</name>
  <value>false</value>
</property>

I chose the second option, and HBase started.

【Comments】:

    【Answer 2】:

    This may be because you have two versions of htrace-core (3.1.0 and 4.2.0).

    You should remove 4.2.0:

    cd /usr/local/hbase/lib/client-facing-thirdparty/
    
    rm -rfv htrace-core4-4.2.0-incubating.jar
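A more cautious variant of the removal above is to move the jar aside instead of deleting it, so it can be restored if the master then fails with a different error. This is a sketch under assumptions: `stash_jar` is a hypothetical helper and the backup directory in the usage comment is an arbitrary choice, picked to be outside the HBase lib tree.

```shell
# Hypothetical helper: move a jar into a backup directory instead of deleting it.
# $1 = path to the jar, $2 = backup directory (created if missing).
stash_jar() {
  local jar="$1" backup_dir="$2"
  mkdir -p "$backup_dir" && mv -v "$jar" "$backup_dir/"
}

# Usage (backup location is an assumption):
#   stash_jar /usr/local/hbase/lib/client-facing-thirdparty/htrace-core4-4.2.0-incubating.jar \
#             "$HOME/hbase-jar-backup"
# To undo, move the file back into client-facing-thirdparty/.
```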
    

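    【Comments】:

    • I removed 3.1.0, but it still does not work, and this time HRegionServer does not start either. Removing 4.2.0 instead causes a ClassNotFoundException.
    • I deployed the same setup on Red Hat and it works fine, but on Ubuntu it does not.
    • @NikhilKumawat Can you make sure the htrace jar is also on your Hadoop classpath? Please give that a try.
    • Please run bin/hbase-daemon.sh start master and paste the log (don't forget to tag me so that I get notified).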
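When pasting the log as requested in the last comment, it can help to pull out only the error lines first. The sketch below is an illustration, not something from the thread: `show_log_errors` is a hypothetical helper, and the log location in the usage comment assumes the default `logs` directory under the HBase install together with the file name shown in the question.

```shell
# Hypothetical helper: print ERROR/FATAL lines and "Caused by:" lines
# from a log file, prefixed with their line numbers.
# $1 = path to the log file.
show_log_errors() {
  grep -nE ' (ERROR|FATAL) |^Caused by:' "$1"
}

# Usage (log path is an assumption):
#   bin/hbase-daemon.sh start master
#   show_log_errors logs/hbase-root-master-namenode.log
```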