How do I find the last nonblank cell in Openpyxl? Answers

【Question title】: How do I find the last nonblank cell in Openpyxl?
【Posted】: 2018-12-19 13:21:44
【Question】:

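Openpyxl can tell me max_row and max_column, the "used range" of an Excel sheet. However, this range can include cells with no content if they were previously selected or changed.

I want to know the last column and the last row that actually have content.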

(Discussion for VBA, here.)

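For example, if - here denotes a blank inside the used range and _ a blank outside it, I want to pick out the column marked b and the row marked c, even though Openpyxl, when computing max_row and max_column, would include the rows and columns containing dashes.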

aaaaa---__
aaaaa-b-__
aaaaa---__
--------__
--c-----__
--------__
__________
__________
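
To make the expected result concrete, the diagram above can be scanned as plain text. This is only an illustrative pure-Python sketch, not Openpyxl code: the names GRID, BLANKS, last_content_edges and used_range_edges are made up for this example, and - and _ are treated as blank exactly as described above.

```python
# Illustrative only: the question's diagram as text, not a real worksheet.
GRID = """\
aaaaa---__
aaaaa-b-__
aaaaa---__
--------__
--c-----__
--------__
__________
__________
""".splitlines()

BLANKS = {"-", "_"}  # '-' is blank inside the used range, '_' blank outside it


def last_content_edges(grid):
    """Return 1-based (last_row, last_column) that hold content, or (0, 0)."""
    last_row = last_col = 0
    for r, line in enumerate(grid, start=1):
        for c, ch in enumerate(line, start=1):
            if ch not in BLANKS:
                last_row = max(last_row, r)
                last_col = max(last_col, c)
    return last_row, last_col


def used_range_edges(grid):
    """Return 1-based (max_row, max_column) of everything that is not '_'."""
    rows = [r for r, line in enumerate(grid, start=1) if line.strip("_")]
    cols = [c for line in grid for c, ch in enumerate(line, start=1) if ch != "_"]
    return max(rows, default=0), max(cols, default=0)


print(used_range_edges(GRID))    # (6, 8)  what the used range covers
print(last_content_edges(GRID))  # (5, 7)  row of c, column of b
```

So for this layout the used range ends at row 6, column 8, while the answer I am after is row 5, column 7.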

【Comments】:

Possible duplicate of Openpyxl max_row and max_column wrongly reports a larger figure

Tags: python excel openpyxl

【Solution 1】:

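I found that openpyxl does report correct max_row and max_column values for a saved file, but the problem remains if you manipulate the contents of a worksheet and need the values before saving.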
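
A minimal way to see the in-memory behaviour, assuming a fresh Workbook and arbitrarily chosen cells (A1 and H6 are just examples): a cell that was written and then cleared still counts towards the reported bounds.

```python
from openpyxl import Workbook

wb = Workbook()          # in-memory workbook, nothing is read from disk
ws = wb.active

ws["A1"] = "a"           # real content
ws["H6"] = "temporary"   # a value far away from the real data...
ws["H6"] = None          # ...which is cleared again before saving

# The cleared cell object still exists in the worksheet, so it still counts.
bounds = (ws.max_row, ws.max_column)
print(bounds)            # (6, 8)
print(ws["H6"].value)    # None
```

The only cell with content is A1, yet the reported bounds reach out to H6.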

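There is no built-in way to do this, so your best option is to search the rows and columns yourself, preferably limiting the search by starting from the reported values and working up and to the left.

The worksheet object lets you access rows individually, but individual columns are reached here through .iter_cols(). Whether scanning all columns in a single loop is faster depends on how empty you expect the worksheet to be.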

from openpyxl import load_workbook
wb = load_workbook('test.xlsx')
wb.worksheets[0]['h6'] = None

print((wb.worksheets[0].max_row, wb.worksheets[0].max_column))

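def find_edges(sheet):
    """Return (row, column) of the last row and column that hold a value."""
    # Walk upwards from the reported last row until a row has content.
    row = sheet.max_row
    while row > 0:
        cells = sheet[row]
        if all(cell.value is None for cell in cells):
            row -= 1
        else:
            break
    if row == 0:
        # The whole sheet is empty.
        return 0, 0

    # Walk leftwards from the reported last column, looking only at rows 1..row.
    column = sheet.max_column
    while column > 0:
        cells = next(sheet.iter_cols(min_col=column, max_col=column, max_row=row))
        if all(cell.value is None for cell in cells):
            column -= 1
        else:
            break
    return row, column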

print(find_edges(wb.worksheets[0]))

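In this example, I load an Excel sheet that contains your suggested data plus a value still in H6, which is cleared on the third line of the code.

It first prints the max_row and max_column reported by openpyxl, then calls find_edges with the worksheet to find the actual values you want.

For a very large worksheet with very little data, you may want to test whether it is faster, once the last row has been determined (to limit the size), to replace the column scan with a simple iteration over all columns, like this: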

# Visit every column once (rows 1..row only) and remember the last one with content.
column = 1
for ci, cells in enumerate(sheet.iter_cols(max_row=row), start=1):
    if not all(cell.value is None for cell in cells):
        column = ci

But I expect the first approach to be fastest for most useful use cases.
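
If you want to measure rather than guess, a rough timing sketch along these lines may help. Everything here is an assumption for illustration: the helper names last_col_backwards and last_col_full_pass, the 3x3 block of data, and the cleared cell in column 200 are invented, and the numbers will depend on your sheet and machine.

```python
import timeit

from openpyxl import Workbook


def last_col_backwards(sheet, row):
    """Scan single columns from max_column leftwards (the first approach)."""
    for c in range(sheet.max_column, 0, -1):
        cells = next(sheet.iter_cols(min_col=c, max_col=c, max_row=row))
        if any(cell.value is not None for cell in cells):
            return c
    return 0


def last_col_full_pass(sheet, row):
    """Walk every column once, left to right, remembering the last hit."""
    column = 0
    for ci, cells in enumerate(sheet.iter_cols(max_row=row), start=1):
        if any(cell.value is not None for cell in cells):
            column = ci
    return column


# Hypothetical test sheet: content in A1:C3, one cleared cell far to the right.
wb = Workbook()
ws = wb.active
for r in range(1, 4):
    for c in range(1, 4):
        ws.cell(row=r, column=c, value="a")
ws.cell(row=3, column=200).value = None  # creating the cell inflates max_column

last_row = 3  # assumed to be known already from the row scan

for fn in (last_col_backwards, last_col_full_pass):
    seconds = timeit.timeit(lambda: fn(ws, last_row), number=20)
    print(f"{fn.__name__}: column {fn(ws, last_row)}, {seconds:.4f}s for 20 runs")
```

Both helpers should agree on the column; only the timings differ, and which one wins depends on where the data sits relative to the reported max_column.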

If you prefer short over readable:

def find_edges2(sheet):
    def row():
        for r in range(sheet.max_row, 0, -1):
            if not all([cell.value is None for cell in sheet[r]]):
                return r

    row = row()
    if not row:
        return 0, 0

    def column():
        for c in range(sheet.max_column, 0, -1):
            if not all([cell.value is None for cell in next(sheet.iter_cols(min_col=c, max_col=c, max_row=row))]):
                return c

    return row, column()

【Comments】: