【Question title】:Converting millions, billion, and trillion to a number in Python
【Posted】:2021-04-21 04:07:30
【Question】:

I have a column containing values such as "5.00 M", "1.00 T", and "1.29 Juta", and I want a simple way to convert them to numeric values. I tried this:

import re
powers = {'M': 10 ** 9, 'T': 10 ** 12, 'Juta': 10 ** 6}
var1 = ['4', '7149', '6184.09', '0.00', '8', '134944', '5187.33', '5.00 M', '17', '74104', '60773.22', '260.00 M', '7', '347334', '451922.68', '1.00 T', '80', '18469', '483386.83', '2.50 M', '12', '4716', '14946.30', '0.00', '18', '7119', '111617.66', '0.00', '31', '23131', '814413.09', '0.00', '21', '16281', '192020.50', '0.00', '20', '98381', '57850.37', '0.00', '31', '12501', '39384.40', '0.00', '31', '2851', '1.29 Juta', '0.00', '34', '9440', '171364.82', '0.00', '26', '25442', '54394.00', '0.00', '24', '2492', '165295.95', '0.00', '12', '675', '51301.40', '0.00', '7', '5', '8057.77', '0.00', '6', '704', '35579.19', '0.00', '5', '2133', '15683.20', '0.00', '3', '1356', '5021.00', '0.00', '3', '966', '5456.32', '0.00', '5', '2636', '4097.42', '0.00', '8', '1878', '4554.50', '0.00', '6', '3518', '13900.00', '0.00', '2', '1', '61000.00', '0.00', '3', '0', '1688.00', '0.00', '4', '10', '1488.33', '0.00', '0', '0', '0.00', '0.00', '0', '0', '0.00', '0.00', '2', '0', '4054.00', '0.00', '0', '0', '0.00', '0.00']

def f(num_str):
    match = re.search(r"([0-9\.]+)\s?(M|T|Juta)", num_str)
    if match is not None:
        quantity = match.group(0)
        magnitude = match.group(1)
        return float(quantity) * powers[magnitude]

for i in var1:
    x = f(i)
    print(x)

But I get this error:

None
None
None
None
None
None
None
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-23-8dd2f89076c3> in <module>
      1 for i in var1:
----> 2     x = f(i)
      3     print(x)

<ipython-input-22-cb419bc71fb8> in f(num_str)
      7         quantity = match.group(0)
      8         magnitude = match.group(1)
----> 9         return float(quantity) * powers[magnitude]

ValueError: could not convert string to float: '5.00 M'

【Comments on the question】:

    Tags: python regex formatting


    【Solution 1】:

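    Just use group(1) and group(2), because group(0) is the entire matching string. Adding an else branch returns values without a suffix unchanged instead of None: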

    import re
    powers = {'M': 10 ** 9, 'T': 10 ** 12, 'Juta': 10 ** 6}
    var1 = ['4', '7149', '6184.09', '0.00', '8', '134944', '5187.33', '5.00 M', '17', '74104', '60773.22', '260.00 M', '7', '347334', '451922.68', '1.00 T', '80', '18469', '483386.83', '2.50 M', '12', '4716', '14946.30', '0.00', '18', '7119', '111617.66', '0.00', '31', '23131', '814413.09', '0.00', '21', '16281', '192020.50', '0.00', '20', '98381', '57850.37', '0.00', '31', '12501', '39384.40', '0.00', '31', '2851', '1.29 Juta', '0.00', '34', '9440', '171364.82', '0.00', '26', '25442', '54394.00', '0.00', '24', '2492', '165295.95', '0.00', '12', '675', '51301.40', '0.00', '7', '5', '8057.77', '0.00', '6', '704', '35579.19', '0.00', '5', '2133', '15683.20', '0.00', '3', '1356', '5021.00', '0.00', '3', '966', '5456.32', '0.00', '5', '2636', '4097.42', '0.00', '8', '1878', '4554.50', '0.00', '6', '3518', '13900.00', '0.00', '2', '1', '61000.00', '0.00', '3', '0', '1688.00', '0.00', '4', '10', '1488.33', '0.00', '0', '0', '0.00', '0.00', '0', '0', '0.00', '0.00', '2', '0', '4054.00', '0.00', '0', '0', '0.00', '0.00']
    
    def f(num_str):
        match = re.search(r"([0-9\.]+)\s?(M|T|Juta)", num_str)
        if match is not None:
            quantity = match.group(1)
            magnitude = match.group(2)
            return float(quantity) * powers[magnitude]
        else:
            return num_str
    
    for i in var1:
        x = f(i)
        print(x)
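
To make the group numbering concrete, here is a minimal sketch that applies the question's pattern to one sample value. The variable name `m` is only illustrative.

```python
import re

# Same pattern as in the question, applied to a single sample value.
m = re.search(r"([0-9\.]+)\s?(M|T|Juta)", "5.00 M")

print(m.group(0))  # the entire match, suffix included: '5.00 M'
print(m.group(1))  # first capture group, the number: '5.00'
print(m.group(2))  # second capture group, the suffix: 'M'
print(m.groups())  # all capture groups as a tuple: ('5.00', 'M')
```

Passing `group(0)` to `float()` is what raised the ValueError in the question, since the string still contains the space and the suffix.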
    

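    【Comments】:

    • Another problem is that the regex allows only 0 or 1 spaces (no more), and it fails to match when there is no M, T, or Juta suffix.
    • I got None None None None None None None 5000000000.0 None None None 260000000000.0 None None None 1000000000000.0 None None None as the result. What if I want the None entries to be the same as the values in var1?
    • Add an else: to handle the case where the regex does not match?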
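
The whitespace point raised in the comments above could be handled as in the following sketch. It assumes any amount of whitespace between the number and the suffix is acceptable; the names `POWERS`, `PATTERN`, and `convert_or_keep` are hypothetical and not part of the original answer.

```python
import re

# Suffix multipliers as given in the question.
POWERS = {"M": 10 ** 9, "T": 10 ** 12, "Juta": 10 ** 6}

# \s* accepts zero or more whitespace characters, where \s? accepts at most one.
PATTERN = re.compile(r"([0-9.]+)\s*(M|T|Juta)")

def convert_or_keep(num_str):
    """Scale suffixed values; return anything without a suffix unchanged."""
    match = PATTERN.search(num_str)
    if match is None:
        return num_str  # no suffix found: keep the original string
    quantity, magnitude = match.groups()
    return float(quantity) * POWERS[magnitude]

print(convert_or_keep("5.00   M"))  # three spaces before the suffix still match
print(convert_or_keep("7149"))      # returned as the original string
```

Note that unsuffixed values stay strings here, so the results mix floats and strings. Solution 2 converts those values to float as well.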
    【Solution 2】:

    Besides using the wrong group numbers, your regex has a few other problems. You can fix it as follows:

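    def f(num_str):
        # Uses `re` and `powers` as defined in the question.
        # Numeric part: digits plus an optional decimal part; the dot is escaped so it matches a literal ".".
        # Suffix: a ? was added after (M|T|Juta) so that plain numbers also match.
        match = re.search(r"(\d+(?:\.\d+)?)\s?(M|T|Juta)?", num_str)
        if match is not None:
            quantity = match.group(1)
            if match.group(2):                # a magnitude suffix is present
                magnitude = match.group(2)
                return float(quantity) * powers[magnitude]
            else:                             # no magnitude suffix: return the plain number
                return float(quantity)
            
    for i in var1:
        x = f(i)
        print(x)
    

    In fact, your regex for the numeric part, [0-9\.]+, is too permissive: it accepts any run of digits and dots, including strings such as "1.2.3" that float() cannot convert. It is better to replace it with \d+(?:\.\d+)?, where \d+ matches the integer part and (?:\.\d+)? matches the optional decimal part. Enclosing the decimal part in (?: ) makes it a non-capturing group, so the group numbers of the number and the suffix stay 1 and 2.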
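
The difference between the two numeric patterns can be checked with a short sketch. The names `LOOSE`, `STRICT`, `samples`, and `results` are illustrative only, and `re.fullmatch` is used here (not in the answer above) so that the whole string must fit the pattern.

```python
import re

LOOSE = re.compile(r"[0-9\.]+")         # any run of digits and dots
STRICT = re.compile(r"\d+(?:\.\d+)?")   # digits, then at most one ".digits" part

samples = ["5.00", "134944", "1.2.3", "..."]

# For each sample, record whether the whole string fits each pattern.
results = {
    s: (bool(LOOSE.fullmatch(s)), bool(STRICT.fullmatch(s)))
    for s in samples
}

for s, (loose_ok, strict_ok) in results.items():
    print(f"{s!r}: loose={loose_ok}, strict={strict_ok}")
```

Both patterns accept "5.00" and "134944", but only the loose one accepts "1.2.3" and "...", neither of which float() can parse.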

    【Comments】:
