【Question title】: pandas-profiling in Databricks
【Posted】: 2020-06-09 18:25:53
【Question description】:

I am trying to run a basic DataFrame profile on my dataset. I am using a Databricks Python notebook.

pip install --upgrade pip
pip install --upgrade setuptools
pip install pandas-profiling

import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport
df = sql("select * from table").cache()
prof = ProfileReport(df)
prof.to_file(output_file='output.html')
Output:
Successfully installed pip-20.1.1

Successfully installed setuptools-47.1.1

Successfully installed MarkupSafe-1.1.1 Pillow-7.1.2 PyWavelets-1.1.1 Send2Trash-1.5.0 astropy-4.0.1.post1 attrs-19.3.0 bleach-3.1.5 confuse-1.1.0 defusedxml-0.6.0 entrypoints-0.3 htmlmin-0.1.12 imagehash-4.1.0 importlib-metadata-1.6.1 ipywidgets-7.5.1 jinja2-2.11.2 joblib-0.15.1 jsonschema-3.2.0 llvmlite-0.32.1 matplotlib-3.2.1 missingno-0.4.2 mistune-0.8.4 nbconvert-5.6.1 nbformat-5.0.6 networkx-2.4 notebook-6.0.3 numba-0.49.1 packaging-20.4 pandas-1.0.4 pandas-profiling-2.8.0 pandocfilters-1.4.2 phik-0.10.0 prometheus-client-0.8.0 pyrsistent-0.16.0 pyyaml-5.3.1 requests-2.23.0 scipy-1.4.1 tangled-up-in-unicode-0.0.6 terminado-0.8.3 testpath-0.4.4 tqdm-4.46.1 visions-0.4.4 webencodings-0.5.1 widgetsnbextension-3.5.1 zipp-3.1.0


I get the following error:

ImportError: cannot import name 'PY2' from 'scipy._lib.six' (/databricks/python/lib/python3.7/site-packages/scipy/_lib/six.py)

How can I resolve this error?

【Comments】:

    Tags: python pyspark profiling databricks


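    【Solution 1】:

    The problem is with the scipy package. Reinstalling it with the cluster's own pip and then restarting the Python process worked for me.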

    %sh
    /databricks/python/bin/pip install --upgrade pip
    /databricks/python/bin/pip install scipy
    /databricks/python/bin/pip install pandas_profiling
    
    dbutils.library.restartPython()
    
    import pandas_profiling
    

    !pip install --upgrade pip
    !pip install --upgrade setuptools
    !pip install scipy
    !pip install pandas-profiling
    dbutils.library.restartPython()
    import pandas_profiling
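
If the import still fails after the restart, it can help to confirm which scipy the notebook is actually loading. The following is a minimal diagnostic sketch, not part of the original answer: `describe_package` is a hypothetical helper name, and it assumes either Python 3.8+ (`importlib.metadata`) or the `importlib_metadata` backport, which the install log above lists.

```python
import importlib

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for Python 3.7


def describe_package(dist_name, module_name=None, attr=None):
    """Return (installed_version, module_file, has_attr).

    Hypothetical helper for illustration. Each element is None
    when it cannot be determined.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None

    module_file = None
    has_attr = None
    if module_name is not None:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            module = None
        if module is not None:
            # The file path shows which site-packages directory was used.
            module_file = getattr(module, "__file__", None)
            if attr is not None:
                has_attr = hasattr(module, attr)
    return version, module_file, has_attr


# Check the distribution and the module/attribute named in the traceback.
print(describe_package("scipy", "scipy._lib.six", "PY2"))
```

A reported path outside `/databricks/python/lib/...`, or a version that differs from the one pip just installed, would suggest that the pip used for the install and the notebook's interpreter point at different environments.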
    

    【Discussion】:
