【Question title】: WindowsError: [Error 6] The handle is invalid when creating a Spark session
【Posted】: 2017-05-09 01:53:50
【Question】:

I am trying to create a Spark session with the following code:

#Initialize SparkSession and SparkContext
from pyspark.sql import SparkSession
from pyspark import SparkContext

#Create a Spark Session
SpSession = SparkSession \
    .builder \
    .master("local[2]") \
    .appName("V2 Maestros") \
    .config("spark.executor.memory", "1g") \
    .config("spark.cores.max","2") \
    .config("spark.sql.warehouse.dir", "file:///c:/temp/spark-warehouse") \
    .getOrCreate() 


#Get the Spark Context from Spark Session    
SpContext = SpSession.sparkContext

I get the following error:

SpSession = SparkSession \
    .builder \
    .master("local[2]") \
    .appName("V2 Maestros") \
    .config("spark.executor.memory", "1g") \
    .config("spark.cores.max","2") \
    .config("spark.sql.warehouse.dir", "file:///c:/temp/spark-warehouse") \
    .getOrCreate() 

Traceback (most recent call last):

  File "<ipython-input-17-caf81cda545e>", line 1, in <module>
    SpSession = SparkSession     .builder     .master("local[2]")     .appName("V2 Maestros")     .config("spark.executor.memory", "1g")     .config("spark.cores.max","2")     .config("spark.sql.warehouse.dir", "file:///c:/temp/spark-warehouse")     .getOrCreate()

  File "D:\Udemy - Spark\Module\spark-2.0.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\sql\session.py", line 166, in getOrCreate
    sparkConf = SparkConf()

  File "D:\Udemy - Spark\Module\spark-2.0.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\conf.py", line 104, in __init__
    SparkContext._ensure_initialized()

  File "D:\Udemy - Spark\Module\spark-2.0.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()

  File "D:\Udemy - Spark\Module\spark-2.0.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\java_gateway.py", line 79, in launch_gateway
    proc = Popen(command, stdin=PIPE, env=env)

  File "C:\Users\Rahul\Anaconda2\lib\subprocess.py", line 382, in __init__
    errread, errwrite), to_close = self._get_handles(stdin, stdout, stderr)

  File "C:\Users\Rahul\Anaconda2\lib\subprocess.py", line 532, in _get_handles
    c2pwrite = self._make_inheritable(c2pwrite)

  File "C:\Users\Rahul\Anaconda2\lib\subprocess.py", line 566, in _make_inheritable
    _subprocess.DUPLICATE_SAME_ACCESS)

WindowsError: [Error 6] The handle is invalid

Here is the relevant part of subprocess.py:

if mswindows:
        #
        # Windows methods
        #
        def _get_handles(self, stdin, stdout, stderr):
            """Construct and return tuple with IO objects:
            p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite
            """
            to_close = set()
            if stdin is None and stdout is None and stderr is None:
                return (None, None, None, None, None, None), to_close

            p2cread, p2cwrite = None, None
            c2pread, c2pwrite = None, None
            errread, errwrite = None, None

            if stdin is None:
                p2cread = _subprocess.GetStdHandle(_subprocess.STD_INPUT_HANDLE)
                if p2cread is None:
                    p2cread, _ = _subprocess.CreatePipe(None, 0)
            elif stdin == PIPE:
                p2cread, p2cwrite = _subprocess.CreatePipe(None, 0)
            elif isinstance(stdin, int):
                p2cread = msvcrt.get_osfhandle(stdin)
            else:
                # Assuming file-like object
                p2cread = msvcrt.get_osfhandle(stdin.fileno())
            p2cread = self._make_inheritable(p2cread)
            # We just duplicated the handle, it has to be closed at the end
            to_close.add(p2cread)
            if stdin == PIPE:
                to_close.add(p2cwrite)

            if stdout is None:
                c2pwrite = _subprocess.GetStdHandle(_subprocess.STD_OUTPUT_HANDLE)
                if c2pwrite is None:
                    _, c2pwrite = _subprocess.CreatePipe(None, 0)
            elif stdout == PIPE:
                c2pread, c2pwrite = _subprocess.CreatePipe(None, 0)
            elif isinstance(stdout, int):
                c2pwrite = msvcrt.get_osfhandle(stdout)
            else:
                # Assuming file-like object
                c2pwrite = msvcrt.get_osfhandle(stdout.fileno())
            c2pwrite = self._make_inheritable(c2pwrite)
            # We just duplicated the handle, it has to be closed at the end
            to_close.add(c2pwrite)
            if stdout == PIPE:
                to_close.add(c2pread)

            if stderr is None:
                errwrite = _subprocess.GetStdHandle(_subprocess.STD_ERROR_HANDLE)
                if errwrite is None:
                    _, errwrite = _subprocess.CreatePipe(None, 0)
            elif stderr == PIPE:
                errread, errwrite = _subprocess.CreatePipe(None, 0)
            elif stderr == STDOUT:
                errwrite = c2pwrite
            elif isinstance(stderr, int):
                errwrite = msvcrt.get_osfhandle(stderr)
            else:
                # Assuming file-like object
                errwrite = msvcrt.get_osfhandle(stderr.fileno())
            errwrite = self._make_inheritable(errwrite)
            # We just duplicated the handle, it has to be closed at the end
            to_close.add(errwrite)
            if stderr == PIPE:
                to_close.add(errread)

            return (p2cread, p2cwrite,
                    c2pread, c2pwrite,
                    errread, errwrite), to_close


        def _make_inheritable(self, handle):
            """Return a duplicate of handle, which is inheritable"""
            return _subprocess.DuplicateHandle(_subprocess.GetCurrentProcess(),
                                handle, _subprocess.GetCurrentProcess(), 0, 1,
                                _subprocess.DUPLICATE_SAME_ACCESS)


        def _find_w9xpopen(self):
            """Find and return absolut path to w9xpopen.exe"""
            w9xpopen = os.path.join(
                            os.path.dirname(_subprocess.GetModuleFileName(0)),
                                    "w9xpopen.exe")
            if not os.path.exists(w9xpopen):
                # Eeek - file-not-found - possibly an embedding
                # situation - see if we can locate it in sys.exec_prefix
                w9xpopen = os.path.join(os.path.dirname(sys.exec_prefix),
                                        "w9xpopen.exe")
                if not os.path.exists(w9xpopen):
                    raise RuntimeError("Cannot locate w9xpopen.exe, which is "
                                       "needed for Popen to work with your "
                                       "shell or platform.")
            return w9xpopen

Please help. I am new to Spark. I am on a 64-bit system running Python 2.7.

【Comments】:

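  • Evidently the StandardOutput handle is invalid, and possibly StandardError as well. You don't control the Spark code, so you will have to reset the invalid standard handles beforehand using ctypes.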
  • import ctypes; kernel32 = ctypes.WinDLL('kernel32', use_last_error=True); kernel32.SetStdHandle(-11, None); kernel32.SetStdHandle(-12, None).
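  • Thanks @eryksun. Where in subprocess.py do I put this code?
  • I tried putting this code right after if mswindows:, but I keep getting the same error.
  • No, don't modify standard library modules such as subprocess.py, or other modules such as pyspark, unless you really have to. Make this change in your own code, before SpSession is assigned.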
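
As a rough illustration of the suggestion in the comments above, the ctypes calls could be wrapped in a small helper that lives in your own script. This is only a sketch: the helper name reset_std_handles and the was_reset variable are hypothetical, the SetStdHandle calls are taken from the comment, and it assumes that clearing the standard output and standard error handles is acceptable for your session.

```python
import ctypes
import sys

# Win32 identifiers for the standard output and standard error handles,
# the same values (-11 and -12) used in the comment above.
STD_OUTPUT_HANDLE = -11
STD_ERROR_HANDLE = -12


def reset_std_handles():
    """Clear the process's stdout/stderr handles on Windows.

    Hypothetical helper, not part of pyspark or subprocess.
    Returns True if the handles were reset, False on other platforms.
    """
    if sys.platform != "win32":
        return False  # nothing to reset outside Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    for std_id in (STD_OUTPUT_HANDLE, STD_ERROR_HANDLE):
        # SetStdHandle returns zero on failure.
        if not kernel32.SetStdHandle(std_id, None):
            raise ctypes.WinError(ctypes.get_last_error())
    return True


# Run this in your own code, before the SparkSession builder chain.
was_reset = reset_std_handles()
```

With the handles cleared, GetStdHandle returns None inside _get_handles, so the subprocess.py excerpt in the question would take its CreatePipe branch rather than trying to duplicate an invalid handle. The SpSession = SparkSession.builder ... .getOrCreate() assignment then follows this call unchanged.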

Tags: python windows python-2.7 apache-spark pyspark


【Solution 1】:

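I ran into the same problem. For anyone reading this in the future, I was able to fix it by launching the Spyder IDE from its desktop shortcut instead of running spyder from the Anaconda Command Prompt (Windows 7, 64-bit).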
