[Posted]: 2021-08-18 09:07:33
[Question]:
I am trying to submit PySpark code that uses a pandas UDF (with fbprophet, ...). It runs fine when submitted locally, but a cluster submit fails with an error like:
Job aborted due to stage failure: Task 2 in stage 2.0 failed 4 times, most recent failure: Lost task 2.3 in stage 2.0 (TID 41, ip-172-31-11-94.ap-northeast-2.compute.internal, executor 2): java.io.IOException: Cannot run program
"/mnt/yarn/usercache/hadoop/appcache/application_1620263926111_0229/container_1620263926111_0229_01_000001/environment/bin/python": error=2, No such file or directory
My spark-submit command:
PYSPARK_PYTHON=./environment/bin/python \
spark-submit \
--master yarn \
--deploy-mode cluster \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./environment/bin/python \
--conf spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON=./environment/bin/python \
--jars jars/org.elasticsearch_elasticsearch-spark-20_2.11-7.10.2.jar \
--py-files dependencies.zip \
--archives ./environment.tar.gz#environment \
--files config.ini \
$1
I built environment.tar.gz with conda-pack; dependencies.zip contains my local packages, and config.ini is loaded for settings.
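For reference, a minimal sketch of the conda-pack packaging step described above (the environment name and exact package list are assumptions, not taken from the question):

```shell
# Build a relocatable conda environment to ship to YARN via --archives
# (environment name "prophet_env" and package versions are assumptions)
conda create -y -n prophet_env python=3.7 pandas pyarrow
conda activate prophet_env
pip install fbprophet
# Pack the environment into the archive referenced by --archives ./environment.tar.gz#environment
conda pack -n prophet_env -o environment.tar.gz
```

Note that in YARN cluster mode, `spark.yarn.appMasterEnv.*` only affects the driver/application master; the executors do not see the shell's `PYSPARK_PYTHON`, so their Python interpreter is typically set separately (e.g. via `--conf spark.executorEnv.PYSPARK_PYTHON=./environment/bin/python`).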
Is there any way to solve this?
[Discussion]:
Tags: apache-spark pyspark cluster-computing hadoop-yarn