【问题标题】:melt / reshape in excel using VBA?使用VBA在excel中融化/重塑?
【发布时间】:2012-06-06 20:32:26
【问题描述】:

我目前正在适应一份新工作,我与同事分享的大部分工作都是通过 MS Excel 完成的。我经常使用数据透视表,因此需要“堆叠”数据,正是我依赖的 R 中 reshape (reshape2) 包中的 melt() 函数的输出。

谁能让我开始使用 VBA 宏来完成此任务,或者是否已经存在?

宏的轮廓是:

  1. 在 Excel 工作簿中选择一系列单元格。
  2. 启动“melt”宏。
  3. 宏将创建一个提示“输入 id 列数”,您可以在其中输入识别信息前面的列数。 (对于下面的示例 R 代码,它是 4)。
  4. 在名为“melt”的 Excel 文件中创建一个新工作表 这将堆叠数据,并创建一个名为“变量”的新列 等于原始选择的数据列标题。

换句话说,输出看起来与在 R 中简单地执行这两行的输出完全相同:

require(reshape)
melt(your.unstacked.dataframe, id.vars = 1:4)

这是一个例子:

# unstacked data
> df1
  Year Month Country  Sport No_wins No_losses High_score Total_games
2 2010     5     USA Soccer       4         3          5           9
3 2010     6     USA Soccer       5         3          4           8
4 2010     5     CAN Soccer       2         9          7          11
5 2010     6     CAN Soccer       4         8          4          13
6 2009     5     USA Soccer       8         1          4           9
7 2009     6     USA Soccer       0         0          3           2
8 2009     5     CAN Soccer       2         0          6           3
9 2009     6     CAN Soccer       3         0          8           3

# stacking the data
> require(reshape)
> melt(df1, id.vars=1:4)

  Year Month Country  Sport    variable value
1  2010     5     USA Soccer     No_wins     4
2  2010     6     USA Soccer     No_wins     5
3  2010     5     CAN Soccer     No_wins     2
4  2010     6     CAN Soccer     No_wins     4
5  2009     5     USA Soccer     No_wins     8
6  2009     6     USA Soccer     No_wins     0
7  2009     5     CAN Soccer     No_wins     2
8  2009     6     CAN Soccer     No_wins     3
9  2010     5     USA Soccer   No_losses     3
10 2010     6     USA Soccer   No_losses     3
11 2010     5     CAN Soccer   No_losses     9
12 2010     6     CAN Soccer   No_losses     8
13 2009     5     USA Soccer   No_losses     1
14 2009     6     USA Soccer   No_losses     0
15 2009     5     CAN Soccer   No_losses     0
16 2009     6     CAN Soccer   No_losses     0
17 2010     5     USA Soccer  High_score     5
18 2010     6     USA Soccer  High_score     4
19 2010     5     CAN Soccer  High_score     7
20 2010     6     CAN Soccer  High_score     4
21 2009     5     USA Soccer  High_score     4
22 2009     6     USA Soccer  High_score     3
23 2009     5     CAN Soccer  High_score     6
24 2009     6     CAN Soccer  High_score     8
25 2010     5     USA Soccer Total_games     9
26 2010     6     USA Soccer Total_games     8
27 2010     5     CAN Soccer Total_games    11
28 2010     6     CAN Soccer Total_games    13
29 2009     5     USA Soccer Total_games     9
30 2009     6     USA Soccer Total_games     2
31 2009     5     CAN Soccer Total_games     3
32 2009     6     CAN Soccer Total_games     3

【问题讨论】:

  • 我的首选方法是: 1. 将 excel 文件另存为 csv; 2.读入R并正常、理智地进行; 3. 将融化/重塑的 csv 写回; 4. 在 Excel 中打开,就像什么都没发生过一样。
  • 有一个插件 (RExcel) 可以让您从 Excel 内部调用 R。
  • 是的,这正是我一直在做的事情(将单元格复制到剪贴板,在 R 中处理,输出为 .csv)。但是,我想制作一个可以与同事分享的解决方案,因此必须在 VBA 中。

标签: r vba excel pivot-table reshape2


【解决方案1】:

我的博客上有两篇关于在 Excel/VBA 中执行此操作的帖子,其中包含可用代码和可下载的工作簿:

http://yoursumbuddy.com/data-normalizer

http://yoursumbuddy.com/data-normalizer-the-sql/

代码如下:

'Arguments
'List: The range to be normalized.
'RepeatingColsCount: The number of columns, starting with the leftmost,
'   whose headings remain the same.
'NormalizedColHeader: The column header for the rolled-up category.
'DataColHeader: The column header for the normalized data.
'NewWorkbook: Put the sheet with the data in a new workbook?
'
'NOTE: The data must be in a contiguous range and the
'columns that will be repeated must be to the left,
'with the columns to be normalized to the right.

Sub NormalizeList(List As Excel.Range, RepeatingColsCount As Long, _
    NormalizedColHeader As String, DataColHeader As String, _
    Optional NewWorkbook As Boolean = False)

Dim FirstNormalizingCol As Long, NormalizingColsCount As Long
Dim ColsToRepeat As Excel.Range, ColsToNormalize As Excel.Range
Dim NormalizedRowsCount As Long
Dim RepeatingList() As String
Dim NormalizedList() As Variant
Dim ListIndex As Long, i As Long, j As Long
Dim wbSource As Excel.Workbook, wbTarget As Excel.Workbook
Dim wsTarget As Excel.Worksheet

With List
    'If the normalized list won't fit, you must quit.
   If .Rows.Count * (.Columns.Count - RepeatingColsCount) > .Parent.Rows.Count Then
        MsgBox "The normalized list will be too many rows.", _
               vbExclamation + vbOKOnly, "Sorry"
        Exit Sub
    End If

    'You have the range to be normalized and the count of leftmost rows to be repeated.
   'This section uses those arguments to set the two ranges to parse
   'and the two corresponding arrays to fill
   FirstNormalizingCol = RepeatingColsCount + 1
    NormalizingColsCount = .Columns.Count - RepeatingColsCount
    Set ColsToRepeat = .Cells(1).Resize(.Rows.Count, RepeatingColsCount)
    Set ColsToNormalize = .Cells(1, FirstNormalizingCol).Resize(.Rows.Count, NormalizingColsCount)
    NormalizedRowsCount = ColsToNormalize.Columns.Count * .Rows.Count
    ReDim RepeatingList(1 To NormalizedRowsCount, 1 To RepeatingColsCount)
    ReDim NormalizedList(1 To NormalizedRowsCount, 1 To 2)
End With

'Fill in every i elements of the repeating array with the repeating row labels.
For i = 1 To NormalizedRowsCount Step NormalizingColsCount
    ListIndex = ListIndex + 1
    For j = 1 To RepeatingColsCount
        RepeatingList(i, j) = List.Cells(ListIndex, j).Value2
    Next j
Next i

'We stepped over most rows above, so fill in other repeating array elements.
For i = 1 To NormalizedRowsCount
    For j = 1 To RepeatingColsCount
        If RepeatingList(i, j) = "" Then
            RepeatingList(i, j) = RepeatingList(i - 1, j)
        End If
    Next j
Next i

'Fill in each element of the first dimension of the normalizing array
'with the former column header (which is now another row label) and the data.
With ColsToNormalize
    For i = 1 To .Rows.Count
        For j = 1 To .Columns.Count
            NormalizedList(((i - 1) * NormalizingColsCount) + j, 1) = .Cells(1, j)
            NormalizedList(((i - 1) * NormalizingColsCount) + j, 2) = .Cells(i, j)
        Next j
    Next i
End With

'Put the normal data in the same workbook, or a new one.
If NewWorkbook Then
    Set wbTarget = Workbooks.Add
    Set wsTarget = wbTarget.Worksheets(1)
Else
    Set wbSource = List.Parent.Parent
    With wbSource.Worksheets
        Set wsTarget = .Add(after:=.Item(.Count))
    End With
End If

With wsTarget
    'Put the data from the two arrays in the new worksheet.
   .Range("A1").Resize(NormalizedRowsCount, RepeatingColsCount) = RepeatingList
    .Cells(1, FirstNormalizingCol).Resize(NormalizedRowsCount, 2) = NormalizedList

    'At this point there will be repeated header rows, so delete all but one.
   .Range("1:" & NormalizingColsCount - 1).EntireRow.Delete

    'Add the headers for the new label column and the data column.
   .Cells(1, FirstNormalizingCol).Value = NormalizedColHeader
    .Cells(1, FirstNormalizingCol + 1).Value = DataColHeader
End With
End Sub

你可以这样称呼它:

Sub TestIt()
NormalizeList ActiveSheet.UsedRange, 4, "Variable", "Value", False
End Sub

【讨论】:

  • 这很棒;但是我无法选择有效范围。当我突出显示要堆叠的单元格时,它会拉入额外的空白列(可能来自您的棒球示例)。如何在运行宏之前划定相关范围?
  • 您可以将呼叫线路更改为:NormalizeList Selection, 4, "Variable", "Value", FalseNormalizeList ActiveSheet.Range("A1:D100"), 4, "Variable", "Value", False 或任何范围。
【解决方案2】:

Microsoft 最近推出了 Power Query,这是一个 Excel 插件,它为 Excel 中的数据操作添加了许多有趣的功能和能力,包括您正在寻找的内容。

加载项中的实际功能称为“Unpivot Columns”,在in this article 中进行了解释。这是它的要点:

  1. Download and install the add-in
  2. 打开您的 Excel/CSV 文件
  3. 选择要融化/重塑的表格/范围
  4. 在“Power Query”选项卡中,单击“From Table”,这将打开“Query Editor”
  5. 选择要融化/重塑的列(ctrl 或 shift-select,不要拖动)
  6. 在“转换”选项卡中单击“取消透视列”(您也可以在返回 Excel 之前在此处应用其他转换)
  7. 在“主页”选项卡中单击“关闭并加载”。这将在 Excel 中创建一个具有所需结果的新表/查询对象。

【讨论】:

    【解决方案3】:

    对于任何寻找可视化方式来规范化 excel 数据的人,请观看此视频教程:

    http://www.youtube.com/watch?v=xmqTN0X-AgY

    【讨论】:

      【解决方案4】:

      或使用:

      Sub M_snb_000()
        With sheet1.Cells(1).CurrentRegion
          sn = .Resize(, .Columns.Count + 1)
        End With
      
        For j = 4 To UBound(sn, 2) - 1
          With Sheet2.Cells(2 + (UBound(sn) - 1) * (j - 4), 1)
             .Resize(UBound(sn) - 1, 5) = Application.Index(sn, Evaluate("row(2:" 
                   & UBound(sn) & ")"), Array(1, 2, 3,UBound(sn, 2), j))
             .Resize(UBound(sn) - 1, 1).Offset(, 3) = sn(1, j)
          End With
        Next
      End Sub
      

      【讨论】:

      • 这似乎不是一个正确的答案,因为它不接受 id 列数的变量。
      • @KHeaney 实际上,这个答案达到了 Excel 专家级别。你试过了吗?这个特定问题的一个小改动......将For j = 4... 行更改为For j = 5... 它完美运行。这是对 Excel-VBA 细微差别的惊人掌握。出色的工作,@snb。
      • @ExcelHero 我没有测试它,只是在分类队列中快速评估它。老实说,它可能会运行良好,因为它似乎会正确地迭代所需的范围。正如我在评论中所说,我不相信它接受输入,而且我不知道用户的用例,所以我回答了他们的问题,即它需要用户的输入。
      【解决方案5】:

      首先创建一个用户表单并将其命名为 Unpivot_Form,其中包含两个 RefEdit 字段 - rng_id 和 value_id 以及一个提交/执行按钮。我也是 R 用户, rng_id 是包含 id 的范围,而 value_id 包含值;两个范围都包括标题。

      做两个宏:

      Sub unpivot()
      Unpivot_Form.Show
      End Sub
      

      另一个宏在该字段的提交/执行按钮内:

      Private Sub submit_Click()
      'Code to unpivot (convert wide to long for excel)
      
      Dim rng_id, rng_id_header, val_id As Range
      Dim colvar, emptyrow, col As Integer
      Dim new_sheet As Worksheet
      
      'Put val_id range into a range object
      Set val_id = Range(value_id.Value)
      
      'Determine the parameter for the value id range
      'This is used for the looping later on
      numrows = val_id.Rows.Count
      numcols = val_id.Columns.Count
      
      'Resize changes the "block" to the size defined by the row and column
      'Offset moves the "block"
      Set rng_id_header = Range(range_id.Value).Resize(1)
      Set rng_id = Range(range_id.Value).Offset(1, 0).Resize(numrows - 1)
      
      Set new_sheet = Worksheets.Add
      
      'Set up the first column and first batch of id vars
      new_sheet.Activate
      Range("A65535").End(xlUp).Activate
      rng_id_header.Copy ActiveCell
      colvar = Range("XFD1").End(xlToLeft).Column + 1
      Range("XFD1").End(xlToLeft).Offset(, 1).Value = "Variable"
      Range("XFD1").End(xlToLeft).Offset(, 1).Value = "Value"
      
      'Start populating the value ids
      For col = 1 To numcols
      
        'populate var_id
        'determine last row
         emptyrow = Range("A65535").End(xlUp).Row + 1
         'no need to activate to source to copy
         rng_id.Copy new_sheet.Cells(emptyrow, 1)
        'copy the variable
        val_id.Offset(, col - 1).Resize(1, 1).Copy new_sheet.Range(Cells(emptyrow, colvar), Cells(emptyrow + numrows - 2, colvar))
        'copy the value
        val_id.Offset(1, col - 1).Resize(numrows - 1, 1).Copy new_sheet.Range(Cells(emptyrow, colvar + 1), Cells(emptyrow + numrows - 2, colvar + 1))
      
      Next
      
      Unload Me
      
      End Sub
      

      享受吧!

      【讨论】:

        猜你喜欢
        • 1970-01-01
        • 2017-03-30
        • 1970-01-01
        • 1970-01-01
        • 2013-01-11
        • 1970-01-01
        • 1970-01-01
        • 1970-01-01
        • 1970-01-01
        相关资源
        最近更新 更多