【Posted】:2021-03-11 17:52:52
【Problem description】:
I managed to set up Spark locally on Mac v10.15.7 for one of my PyCharm projects (call it Project A). However, I cannot start Spark in another PyCharm project (Project B), which I have just set up with the same interpreter as Project A.
In the Project B environment, I do appear to be able to create a Spark session: when I go to http://localhost:4040/, a Spark session has been established. But as soon as I start executing commands, I get a message like
Exception: Python in worker has different version 2.7 than that in driver 3.7, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
When I invoke pyspark from the Project B PyCharm terminal, I get the error message below, although the same command does launch Spark from both the Project A PyCharm terminal and the Macbook Terminal.
macbook:projectB byc$ pyspark
Could not find valid SPARK_HOME while searching ['/Users/byc/PycharmProjects', '/Library/Frameworks/Python.framework/Versions/3.7/bin']
Did you install PySpark via a package manager such as pip or Conda? If so,
PySpark was not found in your Python environment. It is possible your
Python environment does not properly bind with your package manager.
Please check your default 'python' and if you set PYSPARK_PYTHON and/or
PYSPARK_DRIVER_PYTHON environment variables, and see if you can import
PySpark, for example, 'python -c 'import pyspark'.
If you cannot import, you can install by using the Python executable directly,
for example, 'python -m pip install pyspark [--user]'. Otherwise, you can also
explicitly set the Python executable, that has PySpark installed, to
PYSPARK_PYTHON or PYSPARK_DRIVER_PYTHON environment variables, for example,
'PYSPARK_PYTHON=python3 pyspark'.
/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 24: /bin/load-spark-env.sh: No such file or directory
/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 68: /bin/spark-submit: No such file or directory
/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 68: exec: /bin/spark-submit: cannot execute: No such file or directory
After looking through various posts here, I added these environment variables:
PYTHONUNBUFFERED=1
PYSPARK_PYTHON=/Download/spark-3.0.1-bin-hadoop2.7
PYSPARK_DRIVER_PYTHON=/Download/spark-3.0.1-bin-hadoop2.7
SPARK_HOME=/usr/local/Cellar/apache-spark/3.0.1/libexec
PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH
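(For context, and purely as a sketch with hypothetical paths: PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are expected to name Python interpreter executables, whereas the values above point at a Spark download directory. A small shell check can distinguish the two:)

```shell
# Sketch: test whether a path looks like a Python interpreter, which is what
# PYSPARK_PYTHON / PYSPARK_DRIVER_PYTHON are expected to contain.
looks_like_python() {
  # An interpreter is an executable *file* that reports a version;
  # a Spark download directory fails these checks.
  [ -f "$1" ] && [ -x "$1" ] && "$1" --version >/dev/null 2>&1
}

# Hypothetical value for illustration (matches the setting above):
if looks_like_python "/Download/spark-3.0.1-bin-hadoop2.7"; then
  echo "looks like an interpreter"
else
  echo "not an interpreter"   # a directory, so this branch is taken
fi
```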
I then closed Project B in PyCharm again, reopened it, and reran the commands. Still no luck.
I'm sure I'm missing some obvious piece here, but I just can't tell what it is! Any pointers are much appreciated!
【Discussion】:
Tags: python-3.x apache-spark pyspark pycharm environment-variables