【Question Title】: HBase to Use HDFS HA
【Posted】: 2020-05-02 08:33:04
【Question】:
  1. I am trying to set up HBase HA on top of Hadoop HA.
  2. I have already set up Hadoop HA and tested it.
  3. But the HBase setup fails on startup with the following error:
2020-05-02 16:11:09,336 INFO  [main] ipc.RpcServer: regionserver/cluster-hadoop-01/172.18.20.3:16020: started 10 reader(s) listening on port=16020
2020-05-02 16:11:09,473 INFO  [main] metrics.MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
2020-05-02 16:11:09,840 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: Failed construction of Regionserver: class org.apache.hadoop.hbase.regionserver.HRegionServer
    at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2896)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:64)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2911)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2894)
    ... 5 more
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfscluster
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:417)
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:132)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:351)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:285)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:160)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2812)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2849)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2831)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
    at org.apache.hadoop.hbase.util.CommonFSUtils.getRootDir(CommonFSUtils.java:309)
    at org.apache.hadoop.hbase.util.CommonFSUtils.isValidWALRootDir(CommonFSUtils.java:358)
    at org.apache.hadoop.hbase.util.CommonFSUtils.getWALRootDir(CommonFSUtils.java:334)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeFileSystem(HRegionServer.java:683)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:626)
    ... 10 more
Caused by: java.net.UnknownHostException: hdfscluster
    ... 26 more
  • I suspect my HBase setup cannot resolve my nameservice hdfscluster.
  • I tried both Hadoop 2.X and Hadoop 3.X:
    • Hadoop 2.X: Hadoop 2.10.0 & HBase 1.6.0 & JDK 1.8.0_251 & ZooKeeper 3.6.0.
    • Hadoop 3.X: Hadoop 3.2.1 & HBase 2.2.4 & JDK 1.8.0_251 & ZooKeeper 3.6.0.
    • OS version: Ubuntu 16.04.6
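
Before looking at HBase, it can help to confirm that a plain HDFS client on the same host resolves the logical nameservice. A quick sanity sketch (assumes the Hadoop binaries are on PATH and the commands are run on a cluster node):

```shell
# Which nameservices does the client configuration know about?
hdfs getconf -confKey dfs.nameservices

# Which NameNode is currently active? (nn-01 / nn-02 per hdfs-site.xml)
hdfs haadmin -getServiceState nn-01
hdfs haadmin -getServiceState nn-02

# Access HDFS through the logical URI instead of a concrete host:port
hdfs dfs -ls hdfs://hdfscluster/
```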

My core-site.xml has:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hdfscluster</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/data/hadoop/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>cluster-hadoop-01:2181,cluster-hadoop-02:2181,cluster-hadoop-03:2181</value>
    </property>
</configuration>

My hdfs-site.xml has:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/data/hadoop/data/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/data/hadoop/data/hdfs/data</value>
    </property>

    <property>
        <name>dfs.nameservices</name>
        <value>hdfscluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.hdfscluster</name>
        <value>nn-01,nn-02</value>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hdfscluster.nn-01</name>
        <value>cluster-hadoop-01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.hdfscluster.nn-02</name>
        <value>cluster-hadoop-02:8020</value>
    </property>

    <property>
        <name>dfs.namenode.http-address.hdfscluster.nn-01</name>
        <value>cluster-hadoop-01:9870</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.hdfscluster.nn-02</name>
        <value>cluster-hadoop-02:9870</value>
    </property>

    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://cluster-hadoop-01:8485;cluster-hadoop-02:8485;cluster-hadoop-03:8485/hdfscluster</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/data/hadoop/tmp/journalnode</value>
    </property>

    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence(hadoop:22)</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>
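
To double-check that the HA mapping in this file is what the client actually sees, the resolved keys can be dumped with getconf (a sketch, assuming the same configuration directory is in effect):

```shell
# The logical name should map to the two NameNode IDs ...
hdfs getconf -confKey dfs.ha.namenodes.hdfscluster

# ... and each ID to a concrete RPC address
hdfs getconf -confKey dfs.namenode.rpc-address.hdfscluster.nn-01
hdfs getconf -confKey dfs.namenode.rpc-address.hdfscluster.nn-02
```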

My hbase-site.xml has:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hdfscluster/hbase</value>
    </property>

    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>

    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>cluster-hadoop-01,cluster-hadoop-02,cluster-hadoop-03</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/data/zookeeper/data</value>
    </property>

    <property>
        <name>hbase.tmp.dir</name>
        <value>/data/hbase/tmp</value>
    </property>
</configuration>
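
Whether HBase actually resolves these values (and the HDFS HA keys) from its own classpath can be probed with the bundled HBaseConfTool; a sketch, assuming the hbase launcher is on PATH:

```shell
# Print the value HBase resolves for hbase.rootdir from its effective config
hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.rootdir

# If this prints null, HBase is not picking up hdfs-site.xml at all
hbase org.apache.hadoop.hbase.util.HBaseConfTool dfs.nameservices
```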

My hbase-env.sh has:

export JAVA_HOME="/opt/jdk"
export HBASE_MANAGES_ZK=false
export HADOOP_HOME="/opt/hadoop"
export HBASE_CLASSPATH=".:${HADOOP_HOME}/etc/hadoop"
export HBASE_LOG_DIR="/data/hbase/log"
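
With HBASE_CLASSPATH pointing at the Hadoop configuration directory, that directory should show up on HBase's effective classpath; a quick check (again assuming the hbase launcher is on PATH):

```shell
# The Hadoop conf dir should appear among the classpath entries
hbase classpath | tr ':' '\n' | grep 'etc/hadoop'
```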

My HBase configuration directory:

root@cluster-hadoop-01:~# ll /opt/hbase/conf/
total 56
drwxr-xr-x 2 root root 4096 May  2 16:31 ./
drwxr-xr-x 7 root root 4096 May  2 01:18 ../
-rw-r--r-- 1 root root   18 May  2 10:36 backup-masters
lrwxrwxrwx 1 root root   36 May  2 12:04 core-site.xml -> /opt/hadoop/etc/hadoop/core-site.xml
-rw-r--r-- 1 root root 1811 Jan  6 01:24 hadoop-metrics2-hbase.properties
-rw-r--r-- 1 root root 4616 Jan  6 01:24 hbase-env.cmd
-rw-r--r-- 1 root root 7898 May  2 10:36 hbase-env.sh
-rw-r--r-- 1 root root 2257 Jan  6 01:24 hbase-policy.xml
-rw-r--r-- 1 root root  841 May  2 16:10 hbase-site.xml
lrwxrwxrwx 1 root root   36 May  2 12:04 hdfs-site.xml -> /opt/hadoop/etc/hadoop/hdfs-site.xml
-rw-r--r-- 1 root root 1169 Jan  6 01:24 log4j-hbtop.properties
-rw-r--r-- 1 root root 4949 Jan  6 01:24 log4j.properties
-rw-r--r-- 1 root root   54 May  2 10:33 regionservers

【Comments】:

Tags: hadoop hbase


【Solution 1】:
  • Through repeated trial and error I found a fix, though I still do not know the root cause. Modify the hdfs-site.xml configuration file to add:
    <property>
        <name>dfs.client.failover.proxy.provider.hdfscluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
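
A likely explanation (my reading of the Hadoop HDFS HA documentation, not stated by the answerer): the DFS client looks up the failover proxy provider under the per-nameservice key dfs.client.failover.proxy.provider.&lt;nameservice ID&gt;. The un-suffixed key in the question's hdfs-site.xml is never consulted, so the client falls back to treating hdfscluster as a plain hostname and fails with UnknownHostException. Once the suffixed key is in place, it can be checked from the command line:

```shell
# Should print org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
hdfs getconf -confKey dfs.client.failover.proxy.provider.hdfscluster
```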

【Comments】:

    【Solution 2】:

    When I faced the same issue, I learned that we have to co-locate the HBase and HDFS roles on the same machines. For example:
    Node-1 -> Active NameNode & HBase Master
    Node-2 -> Standby NameNode, DataNode & HBase Backup Master, RegionServer
    Node-3 -> DataNode & RegionServer

    Note: the NameNode and HBase Master machines should be the same, and the DataNode and RegionServer machines should be the same.

    Alternatively, if you need to keep them on different nodes, simply copy hdfs-site.xml into the $HBASE_HOME/conf directory on every node of the HBase cluster. Also make sure the hostnames of the HDFS cluster are present in the /etc/hosts file on each node.
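
    Copying hdfs-site.xml to every HBase node, as suggested above, can be scripted; a sketch, assuming the paths from the question and passwordless SSH between the nodes:

```shell
# Distribute the HA-aware hdfs-site.xml into HBase's conf dir on each node
for node in cluster-hadoop-01 cluster-hadoop-02 cluster-hadoop-03; do
  scp /opt/hadoop/etc/hadoop/hdfs-site.xml "${node}:/opt/hbase/conf/"
done

# Every node should also resolve the cluster hostnames (via /etc/hosts or DNS)
getent hosts cluster-hadoop-01 cluster-hadoop-02 cluster-hadoop-03
```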

    Any further suggestions are welcome!

    【Comments】:
