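【Question title】: How to copy worksheet from one workbook to another one using openpyxl?
【Posted】: 2017-07-09 16:50:17
【Question】:

I have a large number of EXCEL files (i.e. 200), and I would like to copy one specific worksheet from one workbook to another. I have done some investigation, but I could not find a way to do it with openpyxl.

This is the code I have developed so far: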

import os

import openpyxl


def copy_sheet_to_different_EXCEL(path_EXCEL_read, Sheet_name_to_copy, path_EXCEL_Save, Sheet_new_name):
    ''' Function used to copy one EXCEL sheet into another file.

    Input data:
        1.) path_EXCEL_read: the location, along with the name, of the EXCEL file that contains the sheet to copy
        2.) Sheet_name_to_copy: the name of the EXCEL sheet to copy
        3.) path_EXCEL_Save: the path of the EXCEL file where the sheet is going to be copied
        4.) Sheet_new_name: the name of the new EXCEL sheet

    Output data:
        1.) status: if 0, everything went OK. If 1, an error occurred.

    Version History:
    1.0 (2017-02-20): Initial version.
    '''
    status = 0

    if path_EXCEL_read.endswith('.xls'):
        print('ERROR - EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
        status = 1
        return status

    try:
        wb = openpyxl.load_workbook(path_EXCEL_read, read_only=True)
    except Exception:
        print('ERROR - EXCEL file does not exist in the following location:\n  {0}'.format(path_EXCEL_read))
        status = 1
        return status

    Sheet_names = wb.sheetnames    # We compare against the name of the sheet we would like to copy

    if Sheet_name_to_copy not in Sheet_names:
        print('ERROR - EXCEL sheet "{0}" does not exist'.format(Sheet_name_to_copy))
        status = 1
        return status

    # We check whether the destination file exists

    if os.path.exists(path_EXCEL_Save):
        # If true, the file exists, so we open it

        if path_EXCEL_Save.endswith('.xls'):
            print('ERROR - Destination EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
            status = 1
            return status

        try:
            wdestiny = openpyxl.load_workbook(path_EXCEL_Save)
        except Exception:
            print('ERROR - Destination EXCEL file could not be opened at the following location:\n  {0}'.format(path_EXCEL_Save))
            status = 1
            return status

        # We check whether the destination sheet exists. If so, we delete it

        destination_list_sheets = wdestiny.sheetnames

        if Sheet_new_name in destination_list_sheets:
            print('WARNING - Sheet "{0}" exists in: {1}. It will be deleted!'.format(Sheet_new_name, path_EXCEL_Save))
            wdestiny.remove(wdestiny[Sheet_new_name])  # remove() expects a worksheet object, not a name

    else:
        wdestiny = openpyxl.Workbook()

    # We copy the Excel sheet

    try:
        sheet_to_copy = wb[Sheet_name_to_copy]
        target = wdestiny.copy_worksheet(sheet_to_copy)
        target.title = Sheet_new_name
    except Exception:
        print('ERROR - Could not copy the EXCEL sheet. Check the file')
        status = 1
        return status

    try:
        wdestiny.save(path_EXCEL_Save)
    except Exception:
        print('ERROR - Could not save the EXCEL sheet. Check the file permissions')
        status = 1
        return status

    # Program finishes
    return status

【Comments】:

Tags: python excel copy openpyxl worksheet


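【Solution 1】:

I had the same problem. For me, style, format, and layout were very important. Moreover, I did not want to copy formulas, only the values (of the formulas). After a lot of trial, error, and Stack Overflow, I came up with the following functions. They may look a bit intimidating, but the code copies a sheet from one Excel file to another (possibly existing) file while preserving:

  1. Font and color of the text
  2. Fill color of the cells
  3. Merged cells
  4. Comments and hyperlinks
  5. Format of the cell values
  6. Row heights and column widths
  7. Whether rows and columns are hidden
  8. Frozen rows

It is useful when you want to gather sheets from many workbooks and bind them into one workbook. I copied most attributes, but there may be a few more. In that case you can use this script as a starting point for adding more.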

###############
## Copy a sheet with style, format, layout, etc. from one Excel file to another Excel file
## Please add the ..path\\+\\file..  and  ..sheet_name.. according to your desire.

import openpyxl
from copy import copy

def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet)  # copy all the cell values and styles
    copy_sheet_attributes(source_sheet, target_sheet)


def copy_sheet_attributes(source_sheet, target_sheet):
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)

    # set row dimensions
    # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])

    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column width')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)

    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)   # Excel actually groups multiple columns under 1 key. Use the min and max attributes to also group the columns in the target sheet
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)  # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discusses the issue. Note that this is also the case for the width, not only the hidden property
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)


def copy_cells(source_sheet, target_sheet):
    for (row, col), source_cell in source_sheet._cells.items():
        target_cell = target_sheet.cell(column=col, row=row)

        target_cell._value = source_cell._value
        target_cell.data_type = source_cell.data_type

        if source_cell.has_style:
            target_cell.font = copy(source_cell.font)
            target_cell.border = copy(source_cell.border)
            target_cell.fill = copy(source_cell.fill)
            target_cell.number_format = copy(source_cell.number_format)
            target_cell.protection = copy(source_cell.protection)
            target_cell.alignment = copy(source_cell.alignment)

        if source_cell.hyperlink:
            target_cell._hyperlink = copy(source_cell.hyperlink)

        if source_cell.comment:
            target_cell.comment = copy(source_cell.comment)


wb_target = openpyxl.Workbook()
target_sheet = wb_target.create_sheet(..sheet_name..)

wb_source = openpyxl.load_workbook(..path\\+\\file_name.., data_only=True)
source_sheet = wb_source[..sheet_name..]

copy_sheet(source_sheet, target_sheet)

if 'Sheet' in wb_target.sheetnames:  # remove default sheet
    wb_target.remove(wb_target['Sheet'])

wb_target.save('out.xlsx')

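【Comments】:

  • It works. If only it handled images too...
  • Update: to make this magic handle images as well, the only thing you need to do is add the line target_sheet._images = copy(source_sheet._images) after the line target_sheet.freeze_panes = copy(source_sheet.freeze_panes) in copy_sheet_attributes. Splendid! By the way, this is the second case where I wonder why, for unknown reasons (see, I am restraining myself and not saying "dumb"), the openpyxl developers hide functionality from users. For the first case, look here.
  • Thanks! It works fine in Excel, but when I open the document in LibreOffice I get an error: "The data could not be loaded completely because the maximum number of rows per sheet was exceeded." But then it goes on to load the output file just fine, so ¯\_(ツ)_/¯
【Solution 2】:

You cannot use copy_worksheet() to copy between workbooks, because it depends on global constants that may differ between workbooks. The only safe and reliable way is to go row by row and cell by cell.

You may want to read the discussions about this feature.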
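
The row-by-row, cell-by-cell approach described above can be sketched roughly as follows. This is only a minimal illustration that copies values and no styles; the helper names copy_values and copy_sheet_between_files are made up for this example and are not part of openpyxl.

```python
import openpyxl


def copy_values(source_sheet, target_sheet):
    """Copy cell values only (no styles) from one worksheet to another."""
    # iter_rows() starts at row 1 / column 1 by default, so enumerate from 1.
    for r, row in enumerate(source_sheet.iter_rows(), start=1):
        for c, cell in enumerate(row, start=1):
            if cell.value is not None:
                target_sheet.cell(row=r, column=c, value=cell.value)


def copy_sheet_between_files(src_path, sheet_name, dst_path, new_name):
    """Hypothetical helper: copy one sheet's values into a new workbook file."""
    wb_source = openpyxl.load_workbook(src_path)
    wb_target = openpyxl.Workbook()
    target_sheet = wb_target.active  # reuse the default sheet instead of deleting it
    target_sheet.title = new_name
    copy_values(wb_source[sheet_name], target_sheet)
    wb_target.save(dst_path)
```

Formulas are copied as formula strings here. Loading the source with data_only=True would copy the cached values instead, provided the file was last saved by an application that computed them.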

【Comments】:

    【Solution 3】:

    I found a way by playing around with it:

    import openpyxl
    
    xl1 = openpyxl.load_workbook('workbook1.xlsx')
    # sheet you want to copy
    s = openpyxl.load_workbook('workbook2.xlsx').active
    s._parent = xl1
    xl1._add_sheet(s)
    xl1.save('some_path/name.xlsx')
    

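    【Comments】:

    • This did what I expected... thanks a lot
    • Awesome. It works like a charm.
    • Does not work for me. On save I get write_stylesheet; xf.alignment = wb._alignments[style.alignmentId]; IndexError: list index out of range
    【Solution 4】:

    I had a similar requirement: collating data from multiple workbooks into one workbook, since there is no built-in method available in openpyxl for this.

    I created the script below to do the job for me.

    Note: in my use case, all the workbooks contain data in the same format.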

    from openpyxl import load_workbook
    import os
    
    
    # The below method is used to read data from an active worksheet and store it in memory.
    def reader(file):
        global path
        abs_file = os.path.join(path, file)
        wb_sheet = load_workbook(abs_file).active
        rows = []
        # min_row is set to 2, to ignore the first row which contains the headers
        for row in wb_sheet.iter_rows(min_row=2):
            row_data = []
            for cell in row:
                row_data.append(cell.value)
            # custom column data I am adding, not needed for typical use cases
            row_data.append(file[17:-6])
            # Creating a list of lists, where each list contain a typical row's data
            rows.append(row_data)
        return rows
    
    
    if __name__ == '__main__':
        # Folder in which my source excel sheets are present
        path = r'C:\Users\tom\Desktop\Qt'
        # To get the list of excel files
        files = os.listdir(path)
        for file in files:
            rows = reader(file)
            # below mentioned file name should be already created
            book = load_workbook('new.xlsx')
            sheet = book.active
            for row in rows:
                sheet.append(row)
            book.save('new.xlsx')
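
A possible variant of the script above, sketched under a few assumptions: it creates the destination workbook once instead of reopening new.xlsx for every source file, skips the custom file-name column, and only looks at .xlsx files. The function names read_rows and collate are invented for this example.

```python
import os

from openpyxl import Workbook, load_workbook


def read_rows(path, skip_header=True):
    """Return the active sheet's rows as lists of cell values."""
    sheet = load_workbook(path, data_only=True).active
    first_row = 2 if skip_header else 1  # row 1 is assumed to hold the headers
    return [list(row) for row in sheet.iter_rows(min_row=first_row, values_only=True)]


def collate(folder, output_path):
    """Append the data rows of every .xlsx file in folder to one new workbook."""
    book = Workbook()
    sheet = book.active
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".xlsx"):
            continue
        for row in read_rows(os.path.join(folder, name)):
            sheet.append(row)
    book.save(output_path)
```

It is probably safest to write output_path outside of folder, so that a later run does not pick up the collated file as one of its inputs.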
    

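    【Comments】:

      【Solution 5】:

      My workaround goes like this:

      You have a template file, say "template.xlsx". You open it, change it as needed, save it as a new file, and close the file. Repeat as needed. Just make sure to keep a copy of the original template while testing or messing around.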
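
A minimal sketch of that template workflow, assuming the changes are simple cell writes on the active sheet. The function fill_template and its cell_values parameter are hypothetical names chosen for this example.

```python
import openpyxl


def fill_template(template_path, output_path, cell_values):
    """Load a template, write values into it in memory, save under a new name.

    The template file on disk is left untouched because save() targets
    output_path rather than template_path.
    """
    wb = openpyxl.load_workbook(template_path)
    ws = wb.active
    for coordinate, value in cell_values.items():
        ws[coordinate] = value  # e.g. coordinate "B2"
    wb.save(output_path)
```

Calling fill_template("template.xlsx", "report.xlsx", {"B2": 42}) would then produce report.xlsx with the template's layout plus the new value.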

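      【Comments】:

        【Solution 6】:

        I just ran into this problem. As mentioned here, a good workaround can consist of modifying the original wb in memory and then saving it under another name. For example: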

        import openpyxl
        
        # your starting wb with 2 Sheets: Sheet1 and Sheet2
        wb = openpyxl.load_workbook('old.xlsx')
        
        sheets = wb.sheetnames # ['Sheet1', 'Sheet2']
        
        for s in sheets:
        
            if s != 'Sheet2':
                # wb[s] and wb.remove() replace the deprecated get_sheet_by_name() and remove_sheet()
                wb.remove(wb[s])
        
        # your final wb with just Sheet2
        wb.save('new.xlsx')
        

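        【Comments】:

        • Cell formatting is lost if a cell has multiple text formats (rich text)
        • This answer only helps to delete the old sheets, not to copy a new one
        • This answer just modifies the original and creates a copy with another name/path. What if I do not want to update the original?
        【Solution 7】:

        For speed, I use the data_only and read_only attributes when opening my workbooks. iter_rows() is also really fast.

        @Oscar's excellent answer needs a few changes to support ReadOnlyWorksheet and EmptyCell: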

        # Copy a sheet with style, format, layout, etc. from one Excel file to another Excel file
        # Please add the ..path\\+\\file..  and  ..sheet_name.. according to your desire.
        import openpyxl
        from copy import copy
        
        
        def copy_sheet(source_sheet, target_sheet):
            copy_cells(source_sheet, target_sheet)  # copy all the cell values and styles
            copy_sheet_attributes(source_sheet, target_sheet)
        
        
        def copy_sheet_attributes(source_sheet, target_sheet):
            if isinstance(source_sheet, openpyxl.worksheet._read_only.ReadOnlyWorksheet):
                return
            target_sheet.sheet_format = copy(source_sheet.sheet_format)
            target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
            target_sheet.merged_cells = copy(source_sheet.merged_cells)
            target_sheet.page_margins = copy(source_sheet.page_margins)
            target_sheet.freeze_panes = copy(source_sheet.freeze_panes)
        
            # set row dimensions
            # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
            for rn in range(len(source_sheet.row_dimensions)):
                target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])
        
            if source_sheet.sheet_format.defaultColWidth is None:
                print('Unable to copy default column width')
            else:
                target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)
        
            # set specific column width and hidden property
            # we cannot copy the entire column_dimensions attribute so we copy selected attributes
            for key, value in source_sheet.column_dimensions.items():
                target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)   # Excel actually groups multiple columns under 1 key. Use the min and max attributes to also group the columns in the target sheet
                target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)  # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discusses the issue. Note that this is also the case for the width, not only the hidden property
                target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
                target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)
        
        
        def copy_cells(source_sheet, target_sheet):
            for r, row in enumerate(source_sheet.iter_rows()):
                for c, cell in enumerate(row):
                    source_cell = cell
                    if isinstance(source_cell, openpyxl.cell.read_only.EmptyCell):
                        continue
                    target_cell = target_sheet.cell(column=c+1, row=r+1)
        
                    target_cell._value = source_cell._value
                    target_cell.data_type = source_cell.data_type
        
                    if source_cell.has_style:
                        target_cell.font = copy(source_cell.font)
                        target_cell.border = copy(source_cell.border)
                        target_cell.fill = copy(source_cell.fill)
                        target_cell.number_format = copy(source_cell.number_format)
                        target_cell.protection = copy(source_cell.protection)
                        target_cell.alignment = copy(source_cell.alignment)
        
                    if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.hyperlink:
                        target_cell._hyperlink = copy(source_cell.hyperlink)
        
                    if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.comment:
                        target_cell.comment = copy(source_cell.comment)
        
        

        Usage looks like this:

            from openpyxl import Workbook, load_workbook

            wb = Workbook()
            
            wb_source = load_workbook(filename, data_only=True, read_only=True)
            for sheetname in wb_source.sheetnames:
                source_sheet = wb_source[sheetname]
                ws = wb.create_sheet("Orig_" + sheetname)
                copy_sheet(source_sheet, ws)
        
            wb.save(new_filename)
        

        【Comments】:

          【Solution 8】:

          The workaround I use is to save the current sheet as a pandas DataFrame and load it into the Excel workbook you need.
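
One way this could look, as a rough sketch rather than the answerer's actual code: it assumes pandas 1.3 or later (for the if_sheet_exists option) with openpyxl installed as the engine, and the function name copy_sheet_via_pandas is made up for this example. Only cell values survive the round trip; styles, merged cells, and column widths are not carried over.

```python
import os

import pandas as pd


def copy_sheet_via_pandas(src_path, sheet_name, dst_path, new_name):
    """Copy the values of one sheet into another workbook via a DataFrame."""
    # header=None keeps the first row as data instead of using it as column labels.
    df = pd.read_excel(src_path, sheet_name=sheet_name, header=None)

    if os.path.exists(dst_path):
        # Append to the existing workbook, replacing a sheet of the same name.
        writer_kwargs = {"mode": "a", "if_sheet_exists": "replace"}
    else:
        # if_sheet_exists is only accepted in append mode, so omit it here.
        writer_kwargs = {"mode": "w"}

    with pd.ExcelWriter(dst_path, engine="openpyxl", **writer_kwargs) as writer:
        df.to_excel(writer, sheet_name=new_name, header=False, index=False)
```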

          【Comments】:

          • Could you expand this answer with the source code you used?