【Title】: HBase MasterNotRunningException though HMaster, regionserver, and ZooKeeper are up
【Posted】: 2013-05-18 07:25:17
【Question】:

I have started HBase and all of the daemons are running.

 $ jps
8482 HQuorumPeer
25105 RemoteMavenServer
9133 SecondaryNameNode
11883 HRegionServer
13793 Jps
8545 NameNode
8572 HMaster
11519 Main
25029 Main
8851 DataNode
9435 RunJar

Now let's try to list the tables:

hbase(main):004:0* list
        TABLE                                                                                                                                                   

ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times

Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

Tail of the master log:

2013-05-17 22:48:35,609 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=localhost,60020,1368856115352

Tail of the ZooKeeper log:

$ tail *zoo*.log
2013-05-18 00:14:27,651 INFO org.apache.zookeeper.server.NIOServerCnxnFactory: Accepted socket connection from /127.0.0.1:49826
2013-05-18 00:14:27,652 INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /127.0.0.1:49826
2013-05-18 00:14:27,666 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x13eb59ceb22001e with negotiated timeout 180000 for client /127.0.0.1:49826

Tail of the regionserver log:

2013-05-18 00:08:35,416 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.03 MB, free=244.85 MB, max=246.88 MB, blocks=0, accesses=0, hits=0, hitRatio=0cachingAccesses=0, cachingHits=0, cachingHitsRatio=0evictions=0, evicted=0, evictedPerRun=NaN
2013-05-18 00:13:35,416 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.03 MB, free=244.85 MB, max=246.88 MB, blocks=0, accesses=0, hits=0, hitRatio=0cachingAccesses=0, cachingHits=0, cachingHitsRatio=0evictions=0, evicted=0, evictedPerRun=NaN
2013-05-18 00:18:35,416 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.03 MB, free=244.85 MB, max=246.88 MB, blocks=0, accesses=0, hits=0, hitRatio=0cachingAccesses=0, cachingHits=0, cachingHitsRatio=0evictions=0, evicted=0, evictedPerRun=NaN

More details (in reply to @roman below). Safe mode is off.

fsck gives:

hadoop fsck /

.Status: HEALTHY
 Total size:    321466989 B
 Total dirs:    412
 Total files:   446
 Total blocks (validated):  355 (avg. block size 905540 B)
 Minimally replicated blocks:   355 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:   334 (94.08451 %)
 Mis-replicated blocks:     0 (0.0 %)
 Default replication factor:    3
 Average block replication: 1.0
 Corrupt blocks:        0
 Missing replicas:      1109 (312.39438 %)
 Number of data-nodes:      1
 Number of racks:       1
FSCK ended at Sun May 19 13:09:14 PDT 2013 in 147 milliseconds

However, as you suspected, the HBase GUI is not running on 60030. I see no errors in the HBase logs that would explain why.

More info @roman: hbase hbck simply times out with MasterNotRunningException

stephenb@gondolin:/shared$ hbase hbck 
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:host.name=gondolin
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_37
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.home=/shared/jdk1.6.0_37/jre
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/shared/hadoop-1.0.3/libexec/../lib/native/Linux-amd64-64:/shared/hbase/lib/native/Linux-amd64-64
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:os.version=3.2.0-39-generic
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:user.name=stephenb
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/stephenb
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Client environment:user.dir=/shared
  13/05/19 13:16:16 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
  13/05/19 13:16:16 INFO zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
  13/05/19 13:16:16 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 24642@gondolin
  13/05/19 13:16:16 WARN client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
  13/05/19 13:16:16 INFO client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
  13/05/19 13:16:16 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
  13/05/19 13:16:16 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13eb59ceb22002f, negotiated timeout = 180000
  13/05/19 13:17:27 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13eb59ceb22002f
  13/05/19 13:17:27 INFO zookeeper.ZooKeeper: Session: 0x13eb59ceb22002f closed
  13/05/19 13:17:27 INFO zookeeper.ClientCnxn: EventThread shut down
  13/05/19 13:17:27 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
  13/05/19 13:17:27 INFO zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
  13/05/19 13:17:27 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 24642@gondolin
  13/05/19 13:17:27 WARN client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
  13/05/19 13:17:27 INFO client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
  13/05/19 13:17:27 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
  13/05/19 13:17:27 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13eb59ceb220030, negotiated timeout = 180000
  13/05/19 13:18:39 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13eb59ceb220030
  13/05/19 13:18:39 INFO zookeeper.ZooKeeper: Session: 0x13eb59ceb220030 closed
  13/05/19 13:18:39 INFO zookeeper.ClientCnxn: EventThread shut down
  13/05/19 13:18:39 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
  13/05/19 13:18:39 INFO zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
  13/05/19 13:18:39 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 24642@gondolin
  13/05/19 13:18:39 WARN client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
  13/05/19 13:18:39 INFO client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
  13/05/19 13:18:39 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
  13/05/19 13:18:39 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13eb59ceb220031, negotiated timeout = 180000
  13/05/19 13:18:51 DEBUG client.HConnectionManager$HConnectionImplementation: The connection to null was closed by the finalize method.
  13/05/19 13:18:51 DEBUG client.HConnectionManager$HConnectionImplementation: 
  13/05/19 13:29:18 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13eb59ceb220039
    13/05/19 13:29:18 INFO zookeeper.ZooKeeper: Session: 0x13eb59ceb220039 closed
    13/05/19 13:29:18 INFO zookeeper.ClientCnxn: EventThread shut down
    Exception in thread "main" org.apache.hadoop.hbase.MasterNotRunningException: Retried 10 times
        at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:130)
        at org.apache.hadoop.hbase.util.HBaseFsck.connect(HBaseFsck.java:264)
        at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3331)
        at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3192)

【Question comments】:

    Tags: java hadoop hbase hdfs apache-zookeeper


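    【Answer 1】:

    The HBase web UI isn't running, is it? I ran into a similar situation after a single-node pseudo-distributed cluster crashed completely. HDFS was unable to leave safe mode.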

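    1. Check whether HDFS is in safe mode with hadoop dfsadmin -safemode get.
    2. If it is, force it to leave safe mode manually with hadoop dfsadmin -safemode leave.
    3. You should see progress; at the very least, the HBase web UI should become visible.
    4. Run an HDFS fsck: hadoop fsck / -move
    5. If everything is OK, it is a good idea to run an hbase hbck check.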
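The steps above can be sketched as a small shell script. This is only an illustration, not the answerer's own script: the function names `safemode_state` and `check_hdfs_then_hbase` are hypothetical, and the parsing assumes that `hadoop dfsadmin -safemode get` prints a line of the form "Safe mode is ON" or "Safe mode is OFF".

```shell
#!/bin/sh
# Hypothetical helper: reads the output of `hadoop dfsadmin -safemode get`
# on stdin and prints ON or OFF (assumes a "Safe mode is ON/OFF" line).
safemode_state() {
    sed -n -e 's/^Safe mode is ON.*/ON/p' \
           -e 's/^Safe mode is OFF.*/OFF/p' | head -n 1
}

# Hypothetical wrapper around steps 1-5 of the answer.
# Requires the hadoop and hbase commands on PATH.
check_hdfs_then_hbase() {
    state=$(hadoop dfsadmin -safemode get | safemode_state)
    echo "HDFS safe mode: ${state:-unknown}"

    if [ "$state" = "ON" ]; then
        # Step 2: force HDFS out of safe mode.
        hadoop dfsadmin -safemode leave
    fi

    # Step 4: file system check; -move relocates corrupted files
    # to /lost+found, as in the answer.
    hadoop fsck / -move

    # Step 5: HBase consistency check (needs a reachable master).
    hbase hbck
}
```

Calling `check_hdfs_then_hbase` on the node that hosts the NameNode runs the checks in order; step 3 (confirming that the web UI comes up) is left as a manual check.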

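    Other tips you may need:

    • Check where the regionserver is bound with netstat -n -a (check the ports in your configuration). It can happen that it is bound to the wrong interface. Also search the forums: Hadoop has problems with binding and IPv6 (check this for example).
    • Check that Hadoop has really left safe mode with hadoop dfsadmin -safemode get. HBase will not fully start until it has.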
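One way to apply the netstat tip is to filter the listening sockets by port. The sketch below assumes the Linux `netstat -n -a` column layout (local address in the fourth column, `LISTEN` as the last one) and uses a hypothetical helper name, `listeners_on_port`. Port 60020 is the regionserver port that appears in the master log above; 60000 is the default master port in HBase releases of that era, so substitute whatever your hbase-site.xml configures.

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -n -a` output on stdin and prints the
# protocol and local address of every socket listening on the given port.
listeners_on_port() {
    port=$1
    awk -v port="$port" '
        $NF == "LISTEN" {
            # The port is whatever follows the last ":" of the local address,
            # which also covers IPv6 forms such as :::60000.
            n = split($4, parts, ":")
            if (parts[n] == port) print $1, $4
        }'
}

# Example usage on a live node (commented out so the helper can be sourced):
# netstat -n -a | listeners_on_port 60000
# netstat -n -a | listeners_on_port 60020
```

A result bound to an unexpected address (for example an IPv6-only `tcp6` entry, or an interface other than the one the client resolves) points at the binding problem described in the tip.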

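    【Comments】:

    • Thanks for the answer; I had almost given up on any activity here. I will add details to the OP above.
    • It is very hard to find any help on HBase here. I am trying to improve that by posting questions and adding answers as soon as I find a solution. Hopefully it will change the situation: too many puzzles, too new a technology.
    • Thanks. By the way, I am trying hbase hbck now .. it seems to time out. I have added the stderr/stdout messages from hbase hbck to the OP.
    • This still is not working, but your answer was helpful anyway, and given that I am unlikely to get any further help, it will be accepted as a sort of "best answer available".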