【Question Title】: Python Regex with split Method
【Posted】: 2015-09-24 08:30:16
【Question Description】:

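I'm trying to create a list from a csv file. However, I'm having trouble with the split method, because some of the attributes in the csv file contain commas inside quotes. For example: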

csv file:

500,403,34,"hello there, this attribute has a comma in it",567

When I iterate over the file like this, the quoted field is split in two:

for line in f:
    fields = line.split(",")

fields = ['500','403','34','"hello there','this attribute has a comma in it"','567']

How can I get it to look like this instead:

fields = ['500','403','34','"hello there, this attribute has a comma in it"','567']

I would like to use a regex for this, but if there is an easier way I'd love to hear it. Thanks!

【Comments】:

    Tags: python regex csv


    【Solution 1】:

    Just use the existing csv package. Example:

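    import csv

    # newline='' lets the csv module handle line endings itself (Python 3)
    with open('file.csv', newline='') as csvfile:
        reader = csv.reader(csvfile)
        for row in reader:
            # each row is a list of strings; a quoted comma stays inside its field
            print(', '.join(row))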
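
To see the effect on the sample line from the question without needing a file on disk, here is a minimal sketch. It assumes Python 3; the helper name `parse_csv_text` is made up for illustration. Note that the csv module strips the surrounding quotes, so the field comes back without them, unlike the output the question asks for.

```python
import csv
import io

# The sample line from the question.
sample = '500,403,34,"hello there, this attribute has a comma in it",567\n'

def parse_csv_text(text):
    """Hypothetical helper: parse CSV text and return a list of rows."""
    # io.StringIO wraps the string in a file-like object for csv.reader.
    return list(csv.reader(io.StringIO(text, newline='')))

rows = parse_csv_text(sample)
print(rows[0])
# ['500', '403', '34', 'hello there, this attribute has a comma in it', '567']
```

If the quotes themselves must be kept in the field, the regex approach in the next answer keeps them.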
    

    【Comments】:

      【Solution 2】:
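      import re
      x = '500,403,34,"hello there, this attribute has a comma in it",567'
      # split only on commas followed by an even number of double quotes,
      # i.e. commas that sit outside any quoted field
      print(re.split(r""",(?=(?:[^"]*"[^"]*"[^"]*)*[^"]*$)""", x))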
      

      Output: ['500', '403', '34', '"hello there, this attribute has a comma in it"', '567']
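
The lookahead can be hard to read on one line, so here is a sketch of the same idea written with `re.VERBOSE`. It uses a slightly simplified but equivalent form of the pattern, and the names `SPLIT_OUTSIDE_QUOTES` and `split_fields` are made up for illustration. It assumes each record fits on one line; a quoted field that contains a newline would not survive line-by-line splitting.

```python
import re

# A comma counts as a separator only if the rest of the line contains an
# even number of double quotes, meaning the comma is outside any quoted field.
SPLIT_OUTSIDE_QUOTES = re.compile(r'''
    ,                       # a comma
    (?=                     # followed by (without consuming it):
        (?:[^"]*"[^"]*")*   #   zero or more complete pairs of double quotes
        [^"]*$              #   then no more quotes up to the end of the string
    )
''', re.VERBOSE)

def split_fields(line):
    """Hypothetical helper: split one CSV line, keeping quotes in the fields."""
    return SPLIT_OUTSIDE_QUOTES.split(line.rstrip('\r\n'))

line = '500,403,34,"hello there, this attribute has a comma in it",567\n'
fields = split_fields(line)
print(fields)
# ['500', '403', '34', '"hello there, this attribute has a comma in it"', '567']
```

Because the lookahead rescans the rest of the line for every comma, this is likely to be slower than the csv module on long lines.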

      【Comments】:

        【Solution 3】:

        The csv module is the easiest way:

        import csv

        # newline='' is the recommended way to open csv files in Python 3
        with open('input.csv', newline='') as f:
            for row in csv.reader(f):
                print(row)
        

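        For the input input.csv:

        500,403,34,"hello there, this attribute has a comma in it",567
        500,403,34,"hello there this attribute has no comma in it",567
        500,403,34,"hello there, this attribute has multiple commas, in, it",567

        the output is:

        ['500', '403', '34', 'hello there, this attribute has a comma in it', '567']
        ['500', '403', '34', 'hello there this attribute has no comma in it', '567']
        ['500', '403', '34', 'hello there, this attribute has multiple commas, in, it', '567']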

        【Comments】:
