Ruby: Counting the Number of Unique Chars in a String (Answers)

【Question title】: Ruby Counting The Number of Unique Chars in a String
【Posted】: 2020-11-09 07:41:52
【Question】:

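I am working with a string of characters, alphabet = "AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ". I want to write a def that counts how often each unique character occurs in the string, and what percentage of the string it makes up, without calling alphabet.count("A"), alphabet.count("B"), alphabet.count("C"), etc., so that I don't have to waste time tediously typing every character into the .count() method.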

In a sense I have succeeded: I get the output I want, but because of how I built the for loop, each result is repeated multiple times.

Here is my code:

alphabet ="AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ"

def count_num_of_uniq_chars(string)
  len = string.length
  len = len.to_f
  for i in 0..len-1

    uniq_char=string[i]
    puts "uniq_chars --> #{uniq_char}"

    count_of_uniq_char = string.count(string[i])

    puts "count_of_uniq_char--> #{count_of_uniq_char}"


    percent_of_uniq_char = ( (count_of_uniq_char / len) * 100 )
    percent_of_uniq_char=percent_of_uniq_char.to_f

    puts "there are #{count_of_uniq_char} letter '#{uniq_char}'s in the string which is #{percent_of_uniq_char}% of strings length "
    puts
  end # loop end

end #def end

count_num_of_uniq_chars(alphabet)

The output is:

uniq_chars --> A
count_of_uniq_char--> 2
there are 2 letter 'A's in the string which is 4.878048780487805% of strings length

uniq_chars --> A
count_of_uniq_char--> 2
there are 2 letter 'A's in the string which is 4.878048780487805% of strings length

uniq_chars --> B
count_of_uniq_char--> 3
there are 3 letter 'B's in the string which is 7.317073170731707% of strings length

uniq_chars --> B
count_of_uniq_char--> 3
there are 3 letter 'B's in the string which is 7.317073170731707% of strings length

uniq_chars --> B
count_of_uniq_char--> 3
there are 3 letter 'B's in the string which is 7.317073170731707% of strings length

uniq_chars --> C
count_of_uniq_char--> 4
there are 4 letter 'C's in the string which is 9.75609756097561% of strings length

uniq_chars --> C
count_of_uniq_char--> 4
there are 4 letter 'C's in the string which is 9.75609756097561% of strings length

uniq_chars --> C
count_of_uniq_char--> 4
there are 4 letter 'C's in the string which is 9.75609756097561% of strings length

uniq_chars --> C
count_of_uniq_char--> 4
there are 4 letter 'C's in the string which is 9.75609756097561% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> E
count_of_uniq_char--> 1
there are 1 letter 'E's in the string which is 2.4390243902439024% of strings length

uniq_chars --> F
count_of_uniq_char--> 1
there are 1 letter 'F's in the string which is 2.4390243902439024% of strings length

uniq_chars --> G
count_of_uniq_char--> 1
there are 1 letter 'G's in the string which is 2.4390243902439024% of strings length

uniq_chars --> H
count_of_uniq_char--> 1
there are 1 letter 'H's in the string which is 2.4390243902439024% of strings length

uniq_chars --> I
count_of_uniq_char--> 1
there are 1 letter 'I's in the string which is 2.4390243902439024% of strings length

uniq_chars --> J
count_of_uniq_char--> 1
there are 1 letter 'J's in the string which is 2.4390243902439024% of strings length

uniq_chars --> K
count_of_uniq_char--> 1
there are 1 letter 'K's in the string which is 2.4390243902439024% of strings length

uniq_chars --> L
count_of_uniq_char--> 1
there are 1 letter 'L's in the string which is 2.4390243902439024% of strings length

uniq_chars --> M
count_of_uniq_char--> 1
there are 1 letter 'M's in the string which is 2.4390243902439024% of strings length

uniq_chars --> N
count_of_uniq_char--> 1
there are 1 letter 'N's in the string which is 2.4390243902439024% of strings length

uniq_chars --> O
count_of_uniq_char--> 1
there are 1 letter 'O's in the string which is 2.4390243902439024% of strings length

uniq_chars --> P
count_of_uniq_char--> 1
there are 1 letter 'P's in the string which is 2.4390243902439024% of strings length

uniq_chars --> Q
count_of_uniq_char--> 1
there are 1 letter 'Q's in the string which is 2.4390243902439024% of strings length

uniq_chars --> R
count_of_uniq_char--> 1
there are 1 letter 'R's in the string which is 2.4390243902439024% of strings length

uniq_chars --> S
count_of_uniq_char--> 1
there are 1 letter 'S's in the string which is 2.4390243902439024% of strings length

uniq_chars --> T
count_of_uniq_char--> 1
there are 1 letter 'T's in the string which is 2.4390243902439024% of strings length

uniq_chars --> U
count_of_uniq_char--> 1
there are 1 letter 'U's in the string which is 2.4390243902439024% of strings length

uniq_chars --> V
count_of_uniq_char--> 1
there are 1 letter 'V's in the string which is 2.4390243902439024% of strings length

uniq_chars --> W
count_of_uniq_char--> 1
there are 1 letter 'W's in the string which is 2.4390243902439024% of strings length

uniq_chars --> X
count_of_uniq_char--> 1
there are 1 letter 'X's in the string which is 2.4390243902439024% of strings length

uniq_chars --> Y
count_of_uniq_char--> 1
there are 1 letter 'Y's in the string which is 2.4390243902439024% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

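Notice that the output for each letter is repeated as many times as that letter occurs in the string. How do I get each letter to be output once, regardless of how many times it occurs in the string?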

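【Comments】:

Using String#count is very inefficient, because it has to scan every character of the string once for each unique character. You should use an approach that makes a single pass over the string.

Tags: ruby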

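【Solution 1】:

"I want to create a def that counts the number of unique chars in a string [...]"

You can get the string's characters via String#each_char and let Enumerable#tally count the occurrences (tally requires Ruby 2.7 or later):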

alphabet.each_char.tally
#=> {
#     "A"=>2, "B"=>3, "C"=>4, "D"=>5, "E"=>1, "F"=>1, "G"=>1,
#     "H"=>1, "I"=>1, "J"=>1, "K"=>1, "L"=>1, "M"=>1, "N"=>1,
#     "O"=>1, "P"=>1, "Q"=>1, "R"=>1, "S"=>1, "T"=>1, "U"=>1,
#     "V"=>1, "W"=>1, "X"=>1, "Y"=>1, "Z"=>6
#   }

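To get a percentage, divide each character's number of occurrences by a total. The example below divides by hash.size, the number of distinct characters: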

hash = alphabet.each_char.tally
hash.each do |char, count|
  q = count.quo(hash.size)
  puts format(" %s | %d | %4.1f%%", char, count, q * 100)
end

Output:

 A | 2 |  7.7%
 B | 3 | 11.5%
 C | 4 | 15.4%
 D | 5 | 19.2%
 E | 1 |  3.8%
 F | 1 |  3.8%
 G | 1 |  3.8%
 H | 1 |  3.8%
 I | 1 |  3.8%
 J | 1 |  3.8%
 K | 1 |  3.8%
 L | 1 |  3.8%
 M | 1 |  3.8%
 N | 1 |  3.8%
 O | 1 |  3.8%
 P | 1 |  3.8%
 Q | 1 |  3.8%
 R | 1 |  3.8%
 S | 1 |  3.8%
 T | 1 |  3.8%
 U | 1 |  3.8%
 V | 1 |  3.8%
 W | 1 |  3.8%
 X | 1 |  3.8%
 Y | 1 |  3.8%
 Z | 6 | 23.1%

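Instead of hash.size (the number of unique characters), you can also divide by alphabet.size (the number of characters in the string), depending on what you want.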
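
The difference between the two denominators can be sketched as follows. This is only an illustration: the names counts, by_distinct and by_length are made up for this example, and tally assumes Ruby 2.7 or later.

```ruby
alphabet = "AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ"

counts = alphabet.each_char.tally   # requires Ruby 2.7+

# Denominator 1: the number of distinct characters (26 here).
by_distinct = counts.transform_values { |c| (100.0 * c / counts.size).round(1) }

# Denominator 2: the total number of characters in the string (41 here).
by_length = counts.transform_values { |c| (100.0 * c / alphabet.size).round(1) }

puts by_distinct["A"]  # 7.7
puts by_length["A"]    # 4.9
```

Only the second variant gives percentages that add up to roughly 100 across all characters, which is what the question's "% of strings length" wording asks for.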

【Discussion】:

【Solution 2】:

Here are three ways to do that.

alphabet = "AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ"

Use the method Enumerable#tally, which made its debut in Ruby 2.7.0

h = alphabet.each_char.tally
  #=> {"A"=>2, "B"=>3, "C"=>4,..., "Z"=>6}

Use the form of the class method Hash::new that takes one argument (but no block), the argument being the hash's default value

h = alphabet.each_char.with_object(Hash.new(0)) { |c,h| h[c] += 1 }
  #=> {"A"=>2, "B"=>3, "C"=>4,..., "Z"=>6}

h[c] += 1 expands to h[c] = h[c] + 1. If h has no key c, the h[c] on the right-hand side of the assignment returns the default value, zero, giving h[c] = 0 + 1.
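
The default-value behaviour described above can be checked in isolation. The following is a small sketch, not part of the original answer; it only uses Hash.new, Hash#[] and Hash#key?.

```ruby
h = Hash.new(0)      # 0 is returned for any key that is not present

p h["A"]             # reading a missing key returns the default, 0 ...
p h.key?("A")        # ... but the key is not stored by the read alone

h["A"] += 1          # same as h["A"] = h["A"] + 1, i.e. h["A"] = 0 + 1
p h["A"]             # the key now exists and holds 1
```

Because reading a missing key does not create it, the hash ends up containing only the characters that were actually counted.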

Use the method Enumerable#group_by

h = alphabet.each_char.
             group_by(&:itself).
             transform_values(&:count)
  #=> <same as above>

See Hash#transform_values.

The steps are as follows:

enum = alphabet.each_char
  #=> #<Enumerator: "AABBB...ZZZ":each_char> 
a = enum.group_by(&:itself)
  #=> {"A"=>["A", "A"], "B"=>["B", "B", "B"],...,
  #          "Z"=>["Z", "Z", "Z", "Z", "Z", "Z"]} 
a.transform_values(&:count)
  #=> {"A"=>2, "B"=>3,..., "Z"=>6}

Using the hash

Once you have the hash, you can display the information however you like. For example:

n = alphabet.size
  #=> 41  
h.each { |k,v| puts "#{v} #{k}'s #{(100*v.fdiv(n)).round(2)}%" }
2 A's 4.88%
3 B's 7.32%
4 C's 9.76%
...
1 X's 2.44%
1 Y's 2.44%
6 Z's 14.63%
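
Since the question asks for a def, the counting and the percentage step could be wrapped in one method. This is a sketch under assumptions: the method name char_stats and its return shape are hypothetical, and it uses the Hash.new(0) approach so that it does not depend on tally (transform_values still needs Ruby 2.4 or later).

```ruby
# Hypothetical helper: returns { char => [count, percent_of_string_length] }.
def char_stats(string)
  # Single pass over the string, counting each character.
  counts = string.each_char.with_object(Hash.new(0)) { |c, h| h[c] += 1 }
  total = string.size
  counts.transform_values { |n| [n, (100 * n.fdiv(total)).round(2)] }
end

alphabet = "AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ"
char_stats(alphabet).each do |char, (count, percent)|
  puts "#{count} #{char}'s #{percent}%"
end
```

Returning the data instead of printing inside the method keeps the counting reusable; the caller decides how to format it.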

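【Discussion】:

Brilliant solution. I learned a lot from your approach about hashes and iterators in Ruby.
lbdl, I'm glad I could help. I made a small change to my answer, adding a third approach that uses Enumerable#tally, which may not have existed when the question was asked.
Yes, I saw the tally answer. I'm currently on an older version, but it's good to know. I've only just started with Ruby and I'm happy with it so far.

【Solution 3】:

You can use string.chars.uniq and get rid of len, the for loop, and the uniq_char initialization: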

def count_num_of_uniq_chars(string)
  string.chars.uniq.each do |uniq_char|
    puts "uniq_chars --> #{uniq_char}"

    count_of_uniq_char = string.count(uniq_char)

    puts "count_of_uniq_char--> #{count_of_uniq_char}"


    percent_of_uniq_char = ( (count_of_uniq_char / string.length.to_f) * 100 )
    percent_of_uniq_char=percent_of_uniq_char.to_f

    puts "there are #{count_of_uniq_char} letter '#{uniq_char}'s in the string which is #{percent_of_uniq_char}% of strings length \n\n"
  end
end

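See String#chars and Array#uniq.

Note that percent_of_uniq_char is computed by dividing count_of_uniq_char by the string's length converted to a float, and that this conversion happens on every iteration. If that is a problem in your case, you can compute the length once outside the loop.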
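
Hoisting the length out of the loop might look like the sketch below. The local names total, count and percent are introduced here for illustration; the rest follows the method in this answer.

```ruby
def count_num_of_uniq_chars(string)
  total = string.length.to_f            # converted once, outside the loop
  string.chars.uniq.each do |uniq_char|
    count = string.count(uniq_char)     # still one scan of the string per unique char
    percent = count / total * 100
    puts "there are #{count} letter '#{uniq_char}'s in the string which is #{percent}% of strings length"
  end
end

count_num_of_uniq_chars("AAB")
```

This removes only the repeated conversion; the repeated String#count scans remain.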

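【Discussion】:

Counting elements by calling count for each element is very inefficient, because the whole collection has to be traversed each time.
That part of the code comes from the question posted by the OP. Apart from the question "how do I get it to output each letter once, regardless of how many times it occurs in the string?", I didn't consider any other improvement to the question's code, since he/she may just be playing with the language to learn more about it, nor could I.
Well, I did: I suggested using each instead of for. But who wouldn't?
No offense intended. I just wanted to point out a drawback that seemed quite obvious to me.
None taken, of course, @Stefan. Personally, I really appreciate everyone's comments as a way to improve the code and as advice for future scenarios, so thanks for this.

【Solution 4】:

It can be implemented like this: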

def count_occ(str)
  d = Hash.new(0)                 # missing keys default to 0
  str.each_char do |ch|
    d[ch] += 1                    # single pass: count each character
  end
  d.each do |key, value|
    # multiply by 100 so the printed value really is a percentage
    percentage = value / Float(str.length) * 100
    puts "there are #{value} letter '#{key}'s in the string which is #{percentage}% of strings length"
  end
end

【Discussion】: