Reading Excel files that are saved in Jupyter folder: answers

【Question title】: Reading Excel files that are saved in Jupyter folder
【Posted】: 2019-04-06 09:56:39
【Question】:

I am trying to use R to read an Excel file that I dragged into my JupyterLab folder (in this case ...Tabs.xlsx). How can I read this file using R or Python?

【Comments】:

Tags: python r jupyter-notebook

【Answer 1】:

In Python, you can use pandas, which has a built-in function that makes this easy:

import pandas as pd
pd.read_excel("my_excel.xlsx", sheet_name="my_sheet_name")
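Because the file was dragged into the JupyterLab folder, a relative path like the one above normally resolves against the notebook's working directory. The following is a minimal sketch, not part of the original answer: `list_workbooks` and `read_all_sheets` are hypothetical helper names. It assumes pandas and an .xlsx engine such as openpyxl are installed. Passing `sheet_name=None` asks `pd.read_excel` to return every sheet as a dict keyed by sheet title.

```python
from pathlib import Path


def list_workbooks(folder="."):
    """Return the .xlsx files found directly in `folder`, sorted by path."""
    return sorted(p for p in Path(folder).glob("*.xlsx") if p.is_file())


def read_all_sheets(xlsx_path):
    """Read every sheet of a workbook into a dict of DataFrames."""
    path = Path(xlsx_path)
    if not path.is_file():
        # Report the working directory, the usual culprit in notebooks.
        raise FileNotFoundError(
            f"{path} not found; working directory is {Path.cwd()}"
        )
    # Imported here so list_workbooks can be used without pandas installed.
    import pandas as pd

    # sheet_name=None returns {sheet title: DataFrame} for all sheets.
    return pd.read_excel(path, sheet_name=None)
```

In a notebook, `list_workbooks()` shows which workbooks sit next to the notebook, and `read_all_sheets("my_excel.xlsx")["my_sheet_name"]` would then pick out a single sheet by its title.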

【Comments】:

【Answer 2】:

library(openxlsx)  # unlike require(), stops with an error if the package is missing

# I wrote a function to read in all sheets of an Excel file,
# assuming each sheet holds one simple data frame.
# If your sheets are not that simple (rows to skip, areas to leave
# out, etc.), modify this function or use plain `read.xlsx`
# from `openxlsx` directly.
# The function returns a list of data frames (one per sheet);
# the names of the list elements are the sheet titles.

#############################
# read xlsx files to dfs list
#############################

xlsx2df.list <- function(xlsx.path, rowNames = TRUE, colNames = TRUE, ...) {
  wb <- loadWorkbook(xlsx.path)
  sheetNames <- names(wb)
  res <- lapply(sheetNames, function(sheetName) {
    read.xlsx(wb, sheet = sheetName, rowNames = rowNames, colNames = colNames, ...)
  })
  names(res) <- sheetNames
  res
}

dfs <- xlsx2df.list("path/to/my_excel.xlsx")

first.sheet.df <- dfs[[1]] # or dfs[["sheet1-title"]]
second.sheet.df <- dfs[[2]] # ...

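I wrote this so that I don't have to check what the sheet names are, and therefore which sheet I have to read. It is one of the functions I use most at work, because the biologists I do analyses for love Excel sheets.

This function saves you time by calling the openxlsx functions for you. (So you don't have to learn them, as long as your sheets are simple and regular enough...)

Note: openxlsx is less error-prone than xlsx because it avoids Java. I ran into Java's memory limits: xlsx-dependent functions got memory errors when the Excel files were huge (GBs). So: use `openxlsx`, avoid `xlsx` (Java-dependent)!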

【Comments】: