【Question title】: Is there any other configuration left to be done for the Livy server (livy.conf)?
【Posted】: 2019-08-21 13:48:36
【Description】:

I have set up Docker for Hadoop YARN, and I am trying to set up an Apache Livy server so that I can submit jobs through API calls.

The log below shows that livy-server stops on its own shortly after starting:

19/08/17 07:09:35 INFO utils.LineBufferedStream: Welcome to
19/08/17 07:09:35 INFO utils.LineBufferedStream:       ____              __
19/08/17 07:09:35 INFO utils.LineBufferedStream:      / __/__  ___ _____/ /__
19/08/17 07:09:35 INFO utils.LineBufferedStream:     _\ \/ _ \/ _ `/ __/  '_/
19/08/17 07:09:35 INFO utils.LineBufferedStream:    /___/ .__/\_,_/_/ /_/\_\   version 2.2.1
19/08/17 07:09:35 INFO utils.LineBufferedStream:       /_/
19/08/17 07:09:35 INFO utils.LineBufferedStream:
19/08/17 07:09:35 INFO utils.LineBufferedStream: Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_222
19/08/17 07:09:35 INFO utils.LineBufferedStream: Branch
19/08/17 07:09:35 INFO utils.LineBufferedStream: Compiled by user felixcheung on 2017-11-24T23:19:45Z
19/08/17 07:09:35 INFO utils.LineBufferedStream: Revision
19/08/17 07:09:35 INFO utils.LineBufferedStream: Url
19/08/17 07:09:35 INFO utils.LineBufferedStream: Type --help for more information.
19/08/17 07:09:35 INFO recovery.StateStore$: Using BlackholeStateStore for recovery.
19/08/17 07:09:35 INFO sessions.BatchSessionManager: Recovered 0 batch sessions. Next session id: 0
19/08/17 07:09:35 INFO sessions.InteractiveSessionManager: Recovered 0 interactive sessions. Next session id: 0
19/08/17 07:09:35 INFO sessions.InteractiveSessionManager: Heartbeat watchdog thread started.
19/08/17 07:09:35 INFO util.log: Logging initialized @1944ms
19/08/17 07:09:36 INFO server.Server: jetty-9.3.24.v20180605, build timestamp: 2018-06-05T17:11:56Z, git hash: xxx0x0x0xx00xxxx0x0x0x0x0x0x0x0xxxx
19/08/17 07:09:36 INFO handler.ContextHandler: Started o.e.j.s.ServletContextHandler@3543df7d{/,file:///livy/apache-livy-0.6.0-incubating-bin/bin/src/main/org/apache/livy/server,AVAILABLE}
19/08/17 07:09:36 INFO server.AbstractNCSARequestLog: Opened /livy/apache-livy-0.6.0-incubating-bin/logs/2019_08_17.request.log
19/08/17 07:09:36 INFO server.AbstractConnector: Started ServerConnector@686449f9{HTTP/1.1,[http/1.1]}{x.x.x.x:8080}
19/08/17 07:09:36 INFO server.Server: Started @2304ms
19/08/17 07:09:36 INFO server.WebServer: Starting server on http://x.x.x.x:8080
19/08/17 07:10:01 INFO server.LivyServer: Shutting down Livy server.
19/08/17 07:10:01 INFO handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@3543df7d{/,file:///livy/apache-livy-0.6.0-incubating-bin/bin/src/main/org/apache/livy/server,UNAVAILABLE}
19/08/17 07:10:01 INFO server.AbstractConnector: Stopped ServerConnector@686449f9{HTTP/1.1,[http/1.1]}{x.x.x.x:8080}

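I have provided a livy.conf that specifies the IP and port the Livy server runs on. I have also completed the settings needed for spark-submit on YARN, and I have attached the files below.

Docker Compose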


version: "2"

services:
 livy:
  image: namenode/hadoopspark:2.2.1
  command: /livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start
  network_mode: "host"
  ports:
   - 8080:8080


#####################BASE DOCKERFILE#################

FROM ubuntu:14.04

ENV DAEMON_RUN=true
ENV SPARK_VERSION=2.2.1
ENV HADOOP_VERSION=2.7
ENV SPARK_HOME=/spark
ENV HADOOP_HOME=/hadoop

RUN apt-get update \
 && apt-get install -y software-properties-common openssh-server net-tools curl nano vim wget ca-certificates jq gnupg unzip

RUN add-apt-repository ppa:openjdk-r/ppa
RUN apt-get update
RUN apt-get install -y openjdk-8-jdk \
 supervisor

RUN ssh-keygen -q -N "" -t rsa -f /root/.ssh/id_rsa
RUN cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys

RUN wget https://www-eu.apache.org/dist/incubator/livy/0.6.0-incubating/apache-livy-0.6.0-incubating-bin.zip \
 && unzip apache-livy-0.6.0-incubating-bin.zip \
 && mkdir -p livy \
 && mv apache-livy-0.6.0-incubating-bin /livy

RUN wget https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz \
 &&  tar -xzf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz \
 &&  mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION} /spark

RUN wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz \
 && tar -xzvf hadoop-2.7.3.tar.gz \
 && mv hadoop-2.7.3 /hadoop

ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
ENV HADOOP_CONF_DIR=/hadoop/etc/hadoop

RUN echo "export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ \
export HADOOP_HOME=/hadoop \
export HADOOP_CONF_DIR=/hadoop/etc/hadoop \
export HADOOP_SSH_OPTS='"-p 22"' \
" >> /hadoop/etc/hadoop/hadoop-env.sh

ENV PATH=$SPARK_HOME/bin:$PATH
ENV PATH=$PATH:/hadoop/bin:/hadoop/sbin


################NAMENODE DOCKERFILE####################

FROM base/hadoopspark:2.2.1

COPY conf/* /tmp/

RUN cp /tmp/hdfs-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml && \
    cp /tmp/core-site.xml $HADOOP_HOME/etc/hadoop/core-site.xml && \
    cp /tmp/mapred-site.xml $HADOOP_HOME/etc/hadoop/mapred-site.xml && \
    cp /tmp/yarn-site.xml $HADOOP_HOME/etc/hadoop/yarn-site.xml && \
    cp /tmp/hdfs-site.xml $SPARK_HOME/conf/ && \
    cp /tmp/core-site.xml $SPARK_HOME/conf/ && \
    cp /tmp/mapred-site.xml $SPARK_HOME/conf/ && \
    cp /tmp/yarn-site.xml $SPARK_HOME/conf/ && \
    cp /tmp/spark-defaults.conf $SPARK_HOME/conf/ && \
    cp /tmp/livy.conf /livy/apache-livy-0.6.0-incubating-bin/conf

COPY Docker_WordCount_Spark-1.0.jar /opt/Docker_WordCount_Spark-1.0.jar
COPY sample.txt /opt/sample.txt

#RUN hdfs dfs -put /opt/Docker_WordCount_Spark-1.0.jar Docker_WordCount_Spark-1.0.jar
#RUN hdfs dfs -put /opt/sample.txt sample.txt

ENV LD_LIBRARY_PATH=/hadoop/lib/native:$LD_LIBRARY_PATH

RUN sudo service ssh restart
RUN sudo /hadoop/bin/hadoop namenode -format

EXPOSE 8998 8080

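Is anything else needed to get the Livy server to start? Thanks!

【Comments】:

    Tags: docker apache-spark docker-compose dockerfile livy


    【Answer 1】:

    Docker needs the container's command to keep running in the foreground. Otherwise it considers the application stopped and shuts the container down. The livy-server start script launches the server as a background process and nothing else runs in the foreground afterwards, so the container exits as soon as the script finishes. There are several ways to solve this; a simple one is to start the Livy server with the following command in the Dockerfile (and remove the command from docker-compose.yml):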

    CMD /livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start && /bin/bash
    

    The Dockerfile for the Livy server image is:

    FROM base/hadoopspark:2.2.1
    
    COPY conf/* /tmp/
    
    ENV SPARK_HOME=/spark
    ENV HADOOP_HOME=/hadoop
    
    RUN cp /tmp/hdfs-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml && \
        cp /tmp/core-site.xml $HADOOP_HOME/etc/hadoop/core-site.xml && \
        cp /tmp/mapred-site.xml $HADOOP_HOME/etc/hadoop/mapred-site.xml && \
        cp /tmp/yarn-site.xml $HADOOP_HOME/etc/hadoop/yarn-site.xml && \
        cp /tmp/hdfs-site.xml $SPARK_HOME/conf/ && \
        cp /tmp/core-site.xml $SPARK_HOME/conf/ && \
        cp /tmp/mapred-site.xml $SPARK_HOME/conf/ && \
        cp /tmp/yarn-site.xml $SPARK_HOME/conf/ && \
        cp /tmp/spark-defaults.conf $SPARK_HOME/conf/ && \
        cp /tmp/livy.conf /livy/apache-livy-0.6.0-incubating-bin/conf
    
    COPY Docker_WordCount_Spark-1.0.jar /opt/Docker_WordCount_Spark-1.0.jar
    COPY sample.txt /opt/sample.txt
    
    ENV LD_LIBRARY_PATH=/hadoop/lib/native:$LD_LIBRARY_PATH
    
    RUN sudo service ssh restart
    RUN sudo /hadoop/bin/hadoop namenode -format
    
    ENV PATH=$SPARK_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
    
    EXPOSE 8998 8080
    
    CMD /livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start && /bin/bash
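
The foreground requirement described in this answer can be sketched as a small entrypoint script. This is only an illustration under stated assumptions: `LIVY_HOME` is the install path used in this thread, the log path is the one livy-server prints in the discussion below, and the function names and the `run` argument are hypothetical. The script starts the daemonized server and then follows its log, so the container's main process never returns. Note that the `&& /bin/bash` variant only stays alive when a TTY or an open stdin is attached, because bash exits as soon as it reads end-of-file.

```shell
#!/bin/sh
# Hypothetical entrypoint sketch: keep a foreground process alive after
# "livy-server start" has put the server in the background.
LIVY_HOME="${LIVY_HOME:-/livy/apache-livy-0.6.0-incubating-bin}"
LIVY_LOG="${LIVY_LOG:-$LIVY_HOME/logs/livy--server.out}"

# Poll once per second until the file exists.
# $1 = path, $2 = timeout in seconds (default 30).
# Returns 0 if the file appeared, 1 on timeout.
wait_for_file() {
    file=$1
    remaining=${2:-30}
    while [ ! -e "$file" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Start the daemonized server, then replace this shell with a tail
# process that follows the server log and never exits on its own.
run_livy_in_foreground() {
    "$LIVY_HOME/bin/livy-server" start || return 1
    wait_for_file "$LIVY_LOG" 30 || return 1
    exec tail -F "$LIVY_LOG"
}

# Launch only when called with "run", so the functions above can also
# be sourced and exercised on their own.
if [ "${1:-}" = "run" ]; then
    run_livy_in_foreground
fi
```

Assuming the script is copied into the image as `/entrypoint.sh` (a hypothetical name) and made executable, the Dockerfile could end with `CMD ["/entrypoint.sh", "run"]`. The `command:` line in docker-compose.yml would then have to be removed, since it overrides the image's CMD.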
    

    【Discussion】:

    • I tried that. It still exits, with the following output in the log: Creating namenode_livy_1 Attaching to namenode_livy_1 livy_1 | starting /usr/lib/jvm/java-8-openjdk-amd64/bin/java -cp /livy/apache-livy-0.6.0-incubating-bin/jars/*:/livy/apache-livy-0.6.0-incubating-bin/conf:/hadoop/etc/hadoop: org.apache.livy.server.LivyServer, logging to /livy/apache-livy-0.6.0-incubating-bin/logs/livy--server.out namenode_livy_1 exited with code 0
    • Please use the tty option. I have posted the docker-compose file, properly formatted, in another answer.
    【Answer 2】:

    Try adding the tty parameter to the docker-compose file and running the container in detached mode.

    version: "2"
    
    services:
     livy:
      image: namenode/hadoopspark:2.2.1
      command: /livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start
      network_mode: "host"
      ports:
       - 8080:8080
      tty:  true
    

    Start the Docker container with docker-compose up -d
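
After starting in detached mode, it helps to check whether the container stayed up and, if not, how it exited. The following is a minimal sketch rather than a definitive procedure: the container name `namenode_livy_1` is taken from the compose output quoted in the discussion under Answer 1, and the function names and the `check` argument are hypothetical. It reads the container state with `docker inspect` and prints the last log lines.

```shell
#!/bin/sh
# Container name as printed by docker-compose earlier in this thread.
CONTAINER="${CONTAINER:-namenode_livy_1}"

# Turn a status and an exit code (the two fields produced by the
# docker inspect format string below) into a one-line diagnosis.
diagnose_state() {
    status=$1
    code=$2
    case "$status" in
        running)
            echo "container is running"
            ;;
        exited)
            if [ "$code" -eq 0 ]; then
                echo "main process finished normally; nothing kept it in the foreground"
            else
                echo "main process failed with exit code $code"
            fi
            ;;
        *)
            echo "unexpected state: $status"
            ;;
    esac
}

# Query Docker for the container state and show the tail of its output.
check_container() {
    state=$(docker inspect --format '{{.State.Status}} {{.State.ExitCode}}' "$CONTAINER") || return 1
    # Intentional word splitting: $state holds "<status> <code>".
    diagnose_state $state
    docker logs --tail 50 "$CONTAINER"
}

# Talk to Docker only when called with "check".
if [ "${1:-}" = "check" ]; then
    check_container
fi
```

An exit code of 0, as reported in the earlier discussion, is consistent with the start script returning normally rather than with the server crashing.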

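    【Discussion】:

    • I am still facing the same problem. It seems to be an issue with the Livy server itself, though I am not sure. I have tried both detached and non-detached mode, and the container still exits. The Livy server shuts itself down a second after it is up and running, which is really suspicious!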
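    • I suggest tailing the livy-server log: that gives the container a foreground process, and you can see why it is failing. CMD ["/bin/bash", "-c", "/livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start; tail -f /livy/apache-livy-0.6.0-incubating-bin/logs/livy--server.out; /bin/bash"]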
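    • Yes, when I go into the exited container and tail the logs -
    • Since I cannot attach logs in a comment, please look at the logs already in the question. They are similar, and the same thing is happening now.
    • If you have time, could you try this locally?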