【Question title】: Python CSV to JSON Array Objects with unique values from CSV as one JSON Object where more than one
【Posted】: 2017-01-09 13:19:06
【Question】:

I have this CSV:

Domain,IP,Server,PoweredBy,MetaGenerator,Email
http://www.example1.com,1.1.1.1,,,,
http://www.example2.com,2.2.2.2,Apache,PHP/5.5.9-1ubuntu4.20,,
http://www.example3.com,3.3.3.3,Apache,PHP/5.5.9-1ubuntu4.20,Easy Digital Downloads v2.4.9;Powered by Visual Composer - drag and drop page builder for WordPress.,info@example3.com;sales@example3.com

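I am trying to build an array of JSON objects in which each object is one unique combination of the CSV values, for rows where a field holds several values (separated by ";").

As we can see, www.example3.com has several meta generators and several email addresses.

For this case, the JSON array of objects should look like this, with each combination as one JSON object in the array: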

[{'Domain': 'http://www.example1.com',
  'Email': '',
  'IP': '1.1.1.1',
  'MetaGenerator': '',
  'PoweredBy': '',
  'Server': ''},
 {'Domain': 'http://www.example2.com',
  'Email': '',
  'IP': '2.2.2.2',
  'MetaGenerator': '',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'},
 {'Domain': 'http://www.example3.com',
  'Email': 'sales@example3.com',
  'IP': '3.3.3.3',
  'MetaGenerator': 'Easy Digital Downloads v2.4.9',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'},
 {'Domain': 'http://www.example3.com',
  'Email': 'sales@example3.com',
  'IP': '3.3.3.3',
  'MetaGenerator': 'Powered by Visual Composer - drag and drop page builder for WordPress.',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'},
 {'Domain': 'http://www.example3.com',
  'Email': 'info@example3.com',
  'IP': '3.3.3.3',
  'MetaGenerator': 'Easy Digital Downloads v2.4.9',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'},
 {'Domain': 'http://www.example3.com',
  'Email': 'info@example3.com',
  'IP': '3.3.3.3',
  'MetaGenerator': 'Powered by Visual Composer - drag and drop page builder for WordPress.',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'}]

I have this Python code:

import csv
import pprint
import json

with open("results.csv", 'r') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    out=[]
    d=dict()
    for row in reader:
        if ';' in row['Email']:
          val = row['Email'].split(';')
          for v in val:
            d['Email']=v
            out.append(d)    
        if ';' in row['MetaGenerator']:
          val = row['MetaGenerator'].split(';')
          for v in val:
            d['MetaGenerator']=v
            out.append(d)
        else:
          d=row
          out.append(d) 


pprint.pprint(out)

But it does not work correctly.

How can I achieve this? Pseudocode is fine too. The order does not matter. Which modules should I use?

Thanks,

【Comments】:

    Tags: python json csv


    【Solution 1】:

    Try this (see the itertools documentation):

    import csv
    import pprint
    import json
    import itertools
    
    out=[]
    with open("results.csv", 'r') as csvfile:
        reader = csv.DictReader(csvfile, delimiter=',')
        for row in reader:
    
            Domains = row['Domain'].split(";")
            Ips = row['IP'].split(";")
            Servers = row['Server'].split(";")
            Emails = row['Email'].split(";")
            MetaGenerators = row['MetaGenerator'].split(";")
            PoweredBy = row['PoweredBy'].split(";")
    
            for comb in itertools.product(Domains, Ips, Servers, Emails, MetaGenerators, PoweredBy):
                (cDomain, cIp, cServer, cEmail, cMeta, cPowered) = comb
    
                out.append({
                        'Domain': cDomain,
                        'IP': cIp,
                        'Server': cServer,
                        'Email': cEmail,
                        'MetaGenerator': cMeta,
                        'PoweredBy': cPowered
                    })
    
    pprint.pprint(out)
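
To see why the loop above yields one dict per combination, here is a minimal sketch using only the two multi-valued columns of the www.example3.com row (the variable names are illustrative, not from the answer). It also shows that an empty column still contributes exactly one combination, because splitting an empty string gives a list with one empty string.

```python
import itertools

# The two multi-valued columns of the www.example3.com row, split on ";"
emails = "info@example3.com;sales@example3.com".split(";")
generators = (
    "Easy Digital Downloads v2.4.9;"
    "Powered by Visual Composer - drag and drop page builder for WordPress."
).split(";")

# A single-valued column becomes a one-element list ...
servers = "Apache".split(";")
# ... and so does an empty column, so it never wipes out the product
empty_column = "".split(";")

# Cartesian product: 1 server x 2 emails x 2 generators x 1 empty value
combos = list(itertools.product(servers, emails, generators, empty_column))
for combo in combos:
    print(combo)
```

Rows without any ";" therefore pass through unchanged as a single combination, which is why no special case is needed for them.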
    

    Here is a less readable but smarter solution that does not depend on the CSV field names:

    import csv
    import itertools
    import pprint
    
    out = []
    with open("results.csv", 'r') as csvfile:
        reader = csv.DictReader(csvfile, delimiter=',')
        headers = reader.fieldnames
    
        for row in reader:
            # Split every column in header order so each value lines up with its name
            fields = [row[header].split(";") for header in headers]
            out += [dict(zip(headers, comb)) for comb in itertools.product(*fields)]
    
    pprint.pprint(out)
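
Since the question asks for JSON and neither snippet above actually serializes the result, here is a self-contained sketch of the field-agnostic approach that ends with `json.dumps`. The function name `expand_rows` and its `sep` parameter are hypothetical, and the CSV is read from an in-memory string (the sample from the question) instead of `results.csv` so the example can run on its own.

```python
import csv
import io
import itertools
import json


def expand_rows(csv_text, sep=";"):
    """Return one dict per combination of the sep-separated values in each row.

    Hypothetical helper for illustration; not part of the original answer.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames
    out = []
    for row in reader:
        # One list of candidate values per column, in header order
        fields = [row[h].split(sep) for h in headers]
        out.extend(dict(zip(headers, comb)) for comb in itertools.product(*fields))
    return out


# The sample CSV from the question
sample = (
    "Domain,IP,Server,PoweredBy,MetaGenerator,Email\n"
    "http://www.example1.com,1.1.1.1,,,,\n"
    "http://www.example2.com,2.2.2.2,Apache,PHP/5.5.9-1ubuntu4.20,,\n"
    "http://www.example3.com,3.3.3.3,Apache,PHP/5.5.9-1ubuntu4.20,"
    "Easy Digital Downloads v2.4.9;"
    "Powered by Visual Composer - drag and drop page builder for WordPress.,"
    "info@example3.com;sales@example3.com\n"
)

records = expand_rows(sample)

# Serialize the list of dicts as a JSON array of objects
json_text = json.dumps(records, indent=2)
print(json_text)
```

Note that `json.dumps` emits double-quoted strings, whereas `pprint` prints the Python representation with single quotes; to write to a file, `json.dump(records, f)` on an open file handle should work the same way.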
    

    【Comments】:
