Python pandas: writing the resulting DataFrame to an xlsm file without losing the macros

【Question title】: Python pandas write resulted dataframe to xlsm without losing the macro
【Posted】: 2017-05-17 06:53:01
【Question description】:

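I have a bunch of Excel files that need to be compiled into one Excel file. The compiled result then has to be copied into a particular sheet of an existing macro-enabled Excel file (.xlsm).

I have solved the first part (compiling multiple Excel files into one). The resulting DataFrame is saved in .csv format. The resulting file looks like this: [image not included]

Up to this point there is no problem. It is the next step that I am struggling to work out.

I want to "copy and paste" the resulting DataFrame into the existing macro-enabled Excel file (.xlsm), into the "Source" sheet under the corresponding headers. The existing Excel file looks like this: [image not included]

As the picture above shows, I want to skip column A entirely, because the cells in that column are filled with formulas. I want to write the resulting DataFrame into columns B through Q of the existing Excel file. Before writing the data, however, I want to delete all existing data in all cells (except the cells in column A).

So basically I want to do the following:

1. Delete all values in the cells of columns B through Q of the existing xlsm file (in the "Source" worksheet)
2. Write the new values from the resulting DataFrame into columns B through Q
3. Save the Excel file under the same name without losing the macros

Any feedback would be much appreciated! Thanks!

Regards,

Arnold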

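【Question comments】:

Start recording a macro while you perform the actions you listed. Then take the generated code and work from it.
Basically, what I understand from your question is that you want to replace the values in columns B through Q of the DataFrame, right? If that is the case, you can use df.drop(<column name>) and add the new columns with df[<column name>] = <your values>
@user3598756 Thank you for your comment. However, I am not proficient in Excel, so I do not fully understand your suggestion. It seems you are suggesting that I copy and paste the data manually, whereas I am trying to automate my compilation work. Thanks for the suggestion, though!
@ShubhamNamdeo Yes, you are right. I need to replace the values in columns B through Q with the new values I have generated. So I need to know how to "write" the generated (new) values into the existing xlsm file without losing the macros in Excel.

Tags: python excel vba pandas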

【Solution 1】:

I found the following solution based on openpyxl. What I learned is that xlsxwriter cannot open existing Excel files, so my approach is based on openpyxl.

import pandas as pd
import openpyxl  # Excel reader/writer engine that pandas can use

# df_write is the DataFrame to be written; it is assumed to exist already.

book = openpyxl.load_workbook('input.xlsm', keep_vba=True)  # load the existing .xlsm file, keeping its VBA project

# Note: assigning writer.book / writer.sheets works on the older pandas releases this
# answer was written for; recent pandas releases no longer allow these assignments.
with pd.ExcelWriter('output.xlsm', engine='openpyxl') as writer:  # open a writer for the output file
    writer.book = book  # hand over the input workbook
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)  # hand over the worksheets
    writer.vba_archive = book.vba_archive  # hand over the VBA information

    # Write column 'A' of the DataFrame into sheet 'Sheet1' (which may already exist
    # in input.xlsm), starting at startrow=1, startcol=0, i.e. cell A2
    df_write.to_excel(writer, sheet_name='Sheet1', columns=['A'],
                      header=False, index=False,
                      startrow=1, startcol=0)

# Leaving the with block saves output.xlsm, so no explicit writer.save() is needed.
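
The snippet above relies on assigning to writer.book, which recent pandas releases reject. A minimal sketch of an alternative follows, assuming pandas 1.4 or later (for if_sheet_exists='overlay') with openpyxl installed. The function name write_frame_keep_macros and the file names in the usage note are hypothetical. It copies the macro workbook first, then opens the copy in append mode and asks openpyxl to keep the VBA project through engine_kwargs.

```python
import shutil

import pandas as pd


def write_frame_keep_macros(df, src, dst, sheet_name, startrow=1, startcol=1):
    """Copy the macro workbook src to dst and write df into an existing sheet.

    startrow/startcol are zero-based, so the defaults (1, 1) start at cell B2.
    """
    # Work on a copy so the original workbook is left untouched.
    shutil.copyfile(src, dst)

    # mode="a" loads the existing workbook; engine_kwargs is forwarded to
    # openpyxl.load_workbook, so keep_vba=True preserves the VBA project.
    # if_sheet_exists="overlay" writes into the existing sheet instead of
    # replacing it, which leaves the cells outside the written block as they are.
    with pd.ExcelWriter(dst, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay",
                        engine_kwargs={"keep_vba": True}) as writer:
        df.to_excel(writer, sheet_name=sheet_name, header=False, index=False,
                    startrow=startrow, startcol=startcol)
```

A call such as write_frame_keep_macros(df_write, 'input.xlsm', 'output.xlsm', 'Source') would write the frame starting at B2. Note that overlay mode only overwrites the cells it writes to; it does not clear older values that lie outside the new block.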

【Comments】:

【Solution 2】:

Sorry for coming back a bit late to update my question. In the end I solved my problem with the openpyxl package.

Here is my final code:

import openpyxl
import os
import string
import pandas as pd
import numpy as np

path = "<folder directory>"       # folder that contains the .xlsm file
target_file = "<excel filename>"  # name of the .xlsm file
sheetname = "<sheet name>"        # worksheet you wish to work on

filename = os.path.join(path, target_file)

wb = openpyxl.load_workbook(filename, keep_vba=True)
sheet = wb[sheetname]  # get_sheet_by_name() is deprecated in openpyxl

# To Erase All Values within Selected Columns
# Map column numbers 1..26 to the letters A..Z
d = dict()
for x, y in zip(range(1, 27), string.ascii_lowercase):
    d[x] = y.upper()

max_row = sheet.max_row
max_col = sheet.max_column

for row in range(2, max_row + 1):      # start at row 2 to keep the header row
    for col in range(2, max_col + 1):  # start at column 2 (B) to keep the formulas in column A
        sheet['{}{}'.format(d[col], row)] = None

# To Write Values to the Blank Worksheet
path_compiled = "<folder directory of the csv file>"  # folder that contains the compiled csv file
target_compiled = "<csv filename>"                    # name of the compiled csv file
filename_compiled = os.path.join(path_compiled, target_compiled)

compiled = pd.read_csv(filename_compiled, low_memory=False, encoding = "ISO-8859-1")

# DataFrame row i goes to worksheet row i + 2, because row 1 of the Excel file holds the headers.
# The upper bound is len + 2 so that the last DataFrame row is written as well.
for row in range(2, len(compiled.index) + 2):
    # Skip column 1 (A) because it contains formulas; this assumes the csv has one column
    # for each worksheet column from B up to max_col
    for col in range(2, max_col + 1):
        value = compiled.iat[row - 2, col - 2]
        # openpyxl expects plain Python values, so convert NumPy scalars
        if isinstance(value, np.floating): value = float(value)
        elif isinstance(value, np.integer): value = int(value)
        sheet['{}{}'.format(d[col], row)] = value

wb.save(filename)
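
The letter lookup in the code above only covers columns A to Z, and empty csv cells arrive as NaN. The following is a sketch of the same clear-then-write idea that addresses worksheet cells by number through sheet.cell(row=..., column=...). The helper names to_cell_value, replace_block and update_workbook are hypothetical, and the default column range B to Q (2 to 17) is taken from the question.

```python
def to_cell_value(value):
    """Convert a pandas/NumPy scalar into a plain Python value for openpyxl."""
    if value is None or value != value:  # NaN is not equal to itself
        return None
    if hasattr(value, "item"):           # NumPy scalars expose .item()
        return value.item()
    return value


def replace_block(sheet, rows, first_row=2, first_col=2, last_col=17):
    """Clear columns first_col..last_col from first_row down, then write rows."""
    # Step 1: erase the old values, leaving row 1 and column A untouched.
    for row in range(first_row, sheet.max_row + 1):
        for col in range(first_col, last_col + 1):
            sheet.cell(row=row, column=col).value = None
    # Step 2: write the new values, ignoring anything beyond last_col.
    for r, values in enumerate(rows, start=first_row):
        for c, value in enumerate(values, start=first_col):
            if c > last_col:
                break
            sheet.cell(row=r, column=c).value = to_cell_value(value)


def update_workbook(xlsm_path, csv_path, sheet_name):
    """Hypothetical driver: load the csv, rewrite the sheet, save in place."""
    # Imported here so the two helpers above stay free of third-party imports.
    import openpyxl
    import pandas as pd

    compiled = pd.read_csv(csv_path)
    wb = openpyxl.load_workbook(xlsm_path, keep_vba=True)  # keep the macros
    replace_block(wb[sheet_name], compiled.itertuples(index=False, name=None))
    wb.save(xlsm_path)
```

Passing rows as plain tuples keeps replace_block independent of pandas, so it works equally well with rows read by the csv module.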

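【Comments】:

【Solution 3】:

Since an Excel VBA macro can import a csv into a spreadsheet using QueryTables, consider having Python replicate the VBA through a COM interface to the Excel object library. All existing macro code remains intact, because nothing is overwritten except cell data. Note: the following assumes you are using Excel for Windows.

Using the win32com library, Python can replicate almost anything VBA does. In fact, you will find that VBA is an add-on reference in Office applications rather than a native, built-in object, and that it works through the same COM interface! See the first selected item under Tools\References in your IDE.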

import pandas as pd
import win32com.client as win32

# ...same pandas code...    
macrofile = "C:\\Path\\To\\Macro\\Workbook.xlsm"
strfile = "C:\\Path\\To\\CSV\\Output.csv"
df.to_csv(strfile, index=False)  # omit the index so the first data column lands in column B

try:
    xl = win32.gencache.EnsureDispatch('Excel.Application')
    wb = xl.Workbooks.Open(macrofile)

    # DELETE PREVIOUS DATA
    wb.Sheets("Source").Range("B:Q").EntireColumn.Delete()

    # ADD QUERYTABLE (SPECIFYING DESTINATION CELL START)
    qt = wb.Sheets("Source").QueryTables.Add(Connection="TEXT;" + strfile, 
                                             Destination=wb.Sheets("Source").Cells(2, 2))
    qt.TextFileParseType = 1
    qt.TextFileConsecutiveDelimiter = False
    qt.TextFileTabDelimiter = False
    qt.TextFileSemicolonDelimiter = False
    qt.TextFileCommaDelimiter = True
    qt.TextFileSpaceDelimiter = False
    qt.Refresh(BackgroundQuery=False)

    # REMOVE QUERYTABLE
    for qt in wb.Sheets("Source").QueryTables:
        qt.Delete()

    # CLOSES WORKBOOK AND SAVES CHANGES
    wb.Close(True)

except Exception as e:
    print(e)

finally:    
    qt = None
    wb = None
    xl = None
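
One detail worth checking in the script above: Excel runs as a separate process, so a relative csv path is resolved against Excel's own working directory rather than against the directory the Python script runs in. A small sketch, with the hypothetical helper name text_connection, that builds the "TEXT;" connection string from an absolute path:

```python
from pathlib import Path


def text_connection(csv_path):
    """Return a QueryTable connection string that uses an absolute csv path."""
    # resolve() turns a relative path into an absolute one based on the
    # current working directory of the Python process.
    return "TEXT;" + str(Path(csv_path).resolve())
```

The result can be passed as the Connection argument of QueryTables.Add in place of "TEXT;" + strfile.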

Alternatively, create a new macro in VBA (placed in a standalone module) and have Python call it, passing the csv file path as a parameter:

VBA

Public Sub ImportCSV(strfile As String)
    Dim qt As QueryTable

    ThisWorkbook.Sheets("Source").Range("B:Q").EntireColumn.Delete

    ' ADD QUERYTABLE
    With ThisWorkbook.Sheets("Source").QueryTables.Add(Connection:="TEXT;" & strfile, _
        Destination:=ThisWorkbook.Sheets("Source").Cells(2, 2))
            .TextFileParseType = xlDelimited
            .TextFileConsecutiveDelimiter = False
            .TextFileTabDelimiter = False
            .TextFileSemicolonDelimiter = False
            .TextFileCommaDelimiter = True
            .TextFileSpaceDelimiter = False

            .Refresh BackgroundQuery:=False
    End With

    ' REMOVE QUERYTABLE
    For Each qt In ThisWorkbook.Sheets("Source").QueryTables
        qt.Delete
    Next qt

    Set qt = Nothing
End Sub

Python

import pandas as pd
import win32com.client as win32

# ...same pandas code...    
macrofile = "C:\\Path\\To\\Macro\\Workbook.xlsm"
strfile = "C:\\Path\\To\\CSV\\Output.csv"
df.to_csv(strfile, index=False)  # omit the index so the first data column lands in column B

try:
    xl = win32.gencache.EnsureDispatch('Excel.Application')

    wb = xl.Workbooks.Open(macrofile)
    xl.Application.Run('ImportCSV', strfile)

    wb.Close(True)
    xl.Quit()  # the parentheses are required; a bare xl.Quit does not call the method

except Exception as e:
    print(e)

finally:    
    wb = None
    xl = None

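【Comments】:

Thank you for the explanation! I will try to come back once I get results! Thank you for the time and effort you spent on my problem!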