【Question title】: Ruby array of arrays with repeated values to hash of hashes
【Posted】: 2026-02-14 03:30:01
【Question description】:

I'm new to Ruby and having a hard time figuring out how to convert an array of arrays into a hash of hashes of arrays.

For example, say I have:

[ [38, "s", "hum"], 
  [38, "t", "foo"], 
  [38, "t", "bar"], 
  [45, "s", "hum"], 
  [45, "t", "ram"], 
  [52, "s", "hum"], 
  [52, "t", "cat"], 
  [52, "t", "dog"]
]

What I ultimately want is:

{38 => {"s" => ["hum"],
        "t" => ["foo", "bar"]
       },
 45 => {"s" => ["hum"],
        "t" => ["ram"]
       },
 52 => {"s" => ["hum"],
        "t" => ["cat", "dog"]
       }
 }

I tried group_by and Hash, but neither gave me what I'm looking for.

【Question discussion】:

    Tags: ruby arrays hash


    【Solution 1】:

    There may be a more concise way, but I decided to go the straightforward route:

    input = [ [38, "s", "hum"],
      [38, "t", "foo"],
      [38, "t", "bar"],
      [45, "s", "hum"],
      [45, "t", "ram"],
      [52, "s", "hum"],
      [52, "t", "cat"],
      [52, "t", "dog"]
    ]
    
    output = {}
    
    # I'll talk through the first iteration in the comments.
    
    input.each do |outer_key, inner_key, value|
      # Set output[38] to a new hash, since output[38] isn't set yet.
      # If it were already set, this line would do nothing, so
      # output[38] would keep its previous data.
      output[outer_key] ||= {}
    
      # Set output[38]["s"] to a new array, since output[38]["s"] isn't set yet.
      # If it were already set, this line would do nothing, so
      # output[38]["s"] would keep its previous data.
      output[outer_key][inner_key] ||= []
    
      # Add "hum" to the array at output[38]["s"].
      output[outer_key][inner_key] << value
    end
    

    So, here is just the part you'd actually use, all tidied up:

    output = {}
    
    input.each do |outer_key, inner_key, value|
      output[outer_key] ||= {}
      output[outer_key][inner_key] ||= []
      output[outer_key][inner_key] << value
    end
    

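    【Discussion】:

    • The concise version is input.reduce({}) {|h,(x,y,z)| ((h[x] ||= {})[y] ||= []) << z; h}
    • Thanks! This produces what I was looking for. I chose this solution (Matchu's) because it makes sense to me at this stage of my Ruby knowledge.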
    【Solution 2】:

    This can be considered either horrible or elegant, depending on your sensibilities:

    input.inject(Hash.new {|h1,k1| h1[k1] = Hash.new {|h2,k2| h2[k2] = Array.new}}) {|hash,elem| hash[elem[0]][elem[1]].push(elem[2]); hash}
    => {38=>{"s"=>["hum"], "t"=>["foo", "bar"]}, 45=>{"s"=>["hum"], "t"=>["ram"]}, 52=>{"s"=>["hum"], "t"=>["cat", "dog"]}}
    

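    A more readable version would ideally be the following, although it does not behave as intended, for the reason given below: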

    input.inject(Hash.new(Hash.new(Array.new))) {|hash,elem| hash[elem[0]][elem[1]].push(elem[2]); hash}
    

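    That is, start with an empty hash whose default value is an empty hash, whose default value in turn is an empty array. Then iterate over the input, storing each element in the appropriate place.

    The problem with the latter syntax is that Hash.new(Hash.new(Array.new)) hands back the same default hash and the same default array for every missing key, and never stores them under that key. Every value therefore ends up in one shared array, and the outer hash itself stays empty. The former (block) syntax creates and assigns a new object on each miss, which gives the desired result.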
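    To make the difference concrete, here is a minimal sketch comparing the two forms on two values. The variable names (shared, fresh, shared_keys, shared_leak) are made up for illustration and are not part of the answer.

```ruby
# Value form: Hash.new(obj) returns the SAME obj for every missing key
# and never stores it under that key.
shared = Hash.new(Hash.new(Array.new))
shared[38]["s"].push("hum")
shared[45]["t"].push("ram")

shared_keys = shared.keys                  # => []  (nothing was ever assigned)
shared_leak = shared[:anything][:at_all]   # => ["hum", "ram"]  (the one shared array)

# Block form: the block runs on each miss and assigns a fresh object.
fresh = Hash.new { |h1, k1| h1[k1] = Hash.new { |h2, k2| h2[k2] = [] } }
fresh[38]["s"].push("hum")
fresh[45]["t"].push("ram")

p shared_keys
p shared_leak
p fresh   # 38 and 45 each hold their own inner hash and array
```

    With the value form, every push lands in the single default array; with the block form, each key gets its own container.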

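    【Discussion】:

    • Keep in mind that if I use this hash later in my code, I would probably expect its default behavior to be unmodified. I probably shouldn't rely on that, but changing what I can expect is risky.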
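    One possible way to address this concern, sketched here as an assumption rather than as something the answer proposes, is to clear the default procs once the hash has been built. Hash#default_proc= accepts nil (Ruby 2.0 and later); the names auto and missing are illustrative only.

```ruby
input = [[38, "s", "hum"], [38, "t", "foo"], [45, "t", "ram"]]

# Build the nested structure with auto-creating defaults.
auto = Hash.new { |h1, k1| h1[k1] = Hash.new { |h2, k2| h2[k2] = [] } }
input.each { |outer, inner, value| auto[outer][inner] << value }

# Switch the auto-creating behaviour off again, on the outer hash
# and on every inner hash.
auto.default_proc = nil
auto.each_value { |inner| inner.default_proc = nil }

missing = auto[99]   # => nil, and no key 99 is created
p missing
p auto.keys          # => [38, 45]
```

    After this, lookups of missing keys return nil again, as they would for a hash built with a plain {}.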
    【Solution 3】:

    In this situation, inject (also known as reduce in 1.9) is a great tool:

    input.inject({}) do |acc, (a, b, c)|
      acc[a] ||= {}
      acc[a][b] ||= []
      acc[a][b] << c
      acc
    end
    

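    It calls the block once for each item in input, passing an accumulator and the item. On the first call, the argument given to inject (here {}) is passed as the accumulator; each subsequent call receives the return value of the previous call as the accumulator, which is why the block ends with acc.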
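    As a related sketch, and not part of the answer itself, each_with_object is a different method that passes the same object to every call, so the block does not need to return the accumulator. Note that its block arguments come in the opposite order (item first, then accumulator).

```ruby
input = [
  [38, "s", "hum"], [38, "t", "foo"], [38, "t", "bar"],
  [45, "s", "hum"], [45, "t", "ram"]
]

# each_with_object returns the object it was given ({} here),
# regardless of what the block returns.
output = input.each_with_object({}) do |(a, b, c), acc|
  acc[a] ||= {}
  acc[a][b] ||= []
  acc[a][b] << c
end

p output
```

    The trade-off is purely stylistic: inject threads the block's return value through, while each_with_object relies on mutating one object in place.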

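    【Discussion】:

    • See glenn's comment on Matchu's answer.
    【Solution 4】:

    In the example given in the question, each element array has length three, but the approach below uses recursion and works for arrays of any length.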

    a = [ [38, "s", "hum", 1], 
        [38, "t", "foo", 2],
        [38, "t", "bar", 3], 
        [45, "s", "hum", 1], 
        [45, "t", "ram", 1], 
        [52, "s", "hum", 3], 
        [52, "t", "cat", 3], 
        [52, "t", "dog", 2]
    ]
    
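    class Array
      # Group rows by their first element, drop that element from each row,
      # then recurse on the remainder until only one column is left, at which
      # point the single-element rows are flattened into a plain array.
      def rep
        group_by{|k, _| k}.
        each_value{|v| v.map!{|_, *args| args}}.
        tap{|h| h.each{|k, v| h[k] = (v.first.length > 1 ? v.rep : v.flatten(1))}}
      end
    end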
    
    p a.rep
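    The same recursive idea can also be sketched as a standalone method, which avoids reopening the core Array class. This is an illustrative variant under stated assumptions, not the answer's code: the name nest_rows is hypothetical, and transform_values requires Ruby 2.4 or later.

```ruby
# Hypothetical helper (not from the answer): same grouping idea,
# without monkey-patching Array.
def nest_rows(rows)
  return [] if rows.empty?
  # One column left: collapse [["foo"], ["bar"]] into ["foo", "bar"].
  return rows.flatten(1) if rows.first.length <= 1

  rows
    .group_by(&:first)
    .transform_values { |group| nest_rows(group.map { |_, *rest| rest }) }
end

rows3 = [[38, "s", "hum"], [38, "t", "foo"], [38, "t", "bar"], [45, "s", "hum"]]
nested = nest_rows(rows3)
p nested
```

    Unlike rep, this version builds new arrays at every level instead of modifying the grouped arrays in place with map!.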
    

    【Discussion】: