【Question Title】: writer.writerow() doesn't write to the correct column
【Posted】: 2019-05-07 13:01:24
【Question Description】:

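I have three DynamoDB tables. Two of them hold the instance IDs that belong to an application, and the third is a master table of all instances across all of my accounts, along with their tag metadata. I run a scan on each of the two tables to get the instance IDs, then query the master table for the tag metadata. However, when I write the results to a CSV file, I want two separate header sections, one for the unique output of each DynamoDB table. Once the first loop finishes, the second loop starts writing at the row after the last one the first loop wrote, instead of starting again at the top under the second header section. My code and a sample of the output are below to make this clear.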

CODE:

import boto3
import csv
import json 
from boto3.dynamodb.conditions import Key, Attr

dynamo = boto3.client('dynamodb')
dynamodb = boto3.resource('dynamodb')
s3 = boto3.resource('s3')

# Required resource and client calls
all_instances_table = dynamodb.Table('Master')
missing_response = dynamo.scan(TableName='T1')
installed_response = dynamo.scan(TableName='T2')

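# Creates CSV DictWriter object and fieldnames
# (newline='' is what the csv module documentation recommends for files passed to a writer)
with open('file.csv', 'w', newline='') as csvfile: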
    fieldnames = ['Agent Not Installed', 'Not Installed Account', 'Not Installed Tags', 'Not Installed Environment', " ", 'Agent Installed', 'Installed Account', 'Installed Tags', 'Installed Environment']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()

    # Find instances IDs from the missing table in the master table to pull tag metadata 
    for instances in missing_response['Items']:
        instance_missing = instances['missing_instances']['S']
        #print("Missing:" + instance_missing)
        query_missing = all_instances_table.query(KeyConditionExpression=Key('ID').eq(instance_missing))

        for item_missing in query_missing['Items']:
            missing_id = item_missing['ID']
            missing_account = item_missing['Account']
            missing_tags = item_missing['Tags']
            missing_env = item_missing['Environment']
            # Write the data to the CSV file
            writer.writerow({'Agent Not Installed': missing_id, 'Not Installed Account': missing_account, 'Not Installed Tags': missing_tags, 'Not Installed Environment': missing_env})

    # Find instances IDs from the installed table in the master table to pull tag metadata
    for instances in installed_response['Items']:
        instance_installed = instances['installed_instances']['S']
        #print("Installed:" + instance_installed)
        query_installed = all_instances_table.query(KeyConditionExpression=Key('ID').eq(instance_installed))

        for item_installed in query_installed['Items']:
            installed_id = item_installed['ID']
            print(installed_id)
            installed_account = item_installed['Account']
            installed_tags = item_installed['Tags']
            installed_env = item_installed['Environment']

            # Write the data to the CSV file 
            writer.writerow({'Agent Installed': installed_id, 'Installed Account': installed_account, 'Installed Tags': installed_tags, 'Installed Environment': installed_env})

OUTPUT:

This is what the columns/rows look like in the file.

I need the output for each header section to be on the same rows.

DATA:

Here is a sample of the two tables.

SAMPLE OUTPUT:

This is what the for loops print out and append to the lists.

Missing:

i-0xxxxxx 333333333 foo@bar.com int 
i-0yyyyyy 333333333 foo1@bar.com int

Installed:

i-0zzzzzz 44444444 foo2@bar.com int
i-0aaaaaa 44444444 foo3@bar.com int

【Comments】:

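  • You need to write code that combines your data and then writes each row once. Every call to writerow() creates a new row; it does not add extra columns to an existing row.
  • Do you mean the "correct column"? You may want to update your question to say "correct" rather than "right", since columns are thought of positionally: left, right, center, and so on.
  • @Smittles I've updated the question.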
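
The first comment's point can be seen in a minimal sketch. It uses an in-memory buffer and made-up column names `A` and `B` (not the asker's real fields), and assumes the default `restval` of `csv.DictWriter`, which fills absent keys with an empty string:

```python
import csv
import io


def demo_rows():
    """Write two partial dicts and return the resulting CSV lines."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["A", "B"])
    writer.writeheader()
    writer.writerow({"A": "left"})   # B is absent, so it is left empty
    writer.writerow({"B": "right"})  # starts a NEW line; A is left empty
    return buf.getvalue().splitlines()


lines = demo_rows()
print(lines)  # ['A,B', 'left,', ',right']
```

The second call does not go back and fill column `B` of the first data line, which is the staircase effect described in the question.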

Tags: json python-3.x csv boto3


【Solution 1】:

You want to collect the related rows into lists so that they can be written side by side on the same row, for example:

missing = [] # collection for missing_responses
installed = [] # collection for installed_responses

# Find instances IDs from the missing table in the master table to pull tag metadata 
for instances in missing_response['Items']:
    instance_missing = instances['missing_instances']['S']
    #print("Missing:" + instance_missing)
    query_missing = all_instances_table.query(KeyConditionExpression=Key('ID').eq(instance_missing))
    for item_missing in query_missing['Items']:
        missing_id = item_missing['ID']
        missing_account = item_missing['Account']
        missing_tags = item_missing['Tags']
        missing_env = item_missing['Environment']
        # Update first half of row with missing list
        # (list.append() takes exactly one argument, so store the values as a tuple)
        missing.append((missing_id, missing_account, missing_tags, missing_env))

# Find instances IDs from the installed table in the master table to pull tag metadata
for instances in installed_response['Items']:
    instance_installed = instances['installed_instances']['S']
    #print("Installed:" + instance_installed)
    query_installed = all_instances_table.query(KeyConditionExpression=Key('ID').eq(instance_installed))

    for item_installed in query_installed['Items']:
        installed_id = item_installed['ID']
        print(installed_id)
        installed_account = item_installed['Account']
        installed_tags = item_installed['Tags']
        installed_env = item_installed['Environment']
        # Update second half of row by updating installed list
        # (list.append() takes exactly one argument, so store the values as a tuple)
        installed.append((installed_id, installed_account, installed_tags, installed_env))
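# Combine the two lists outside the loops. zip_longest pads the shorter
# list with blanks, so the lists do not need to be the same length.
from itertools import zip_longest

blank = ('', '', '', '')  # stands in for a half-row that has no partner
# Run this while file.csv is still open, i.e. inside the question's
# "with" block, where writer and fieldnames are defined.
for m, inst in zip_longest(missing, installed, fillvalue=blank):
    # The empty string in the middle fills the " " spacer column
    values = list(m) + [''] + list(inst)
    # writer is a DictWriter, so each row must be a dict keyed by fieldnames
    writer.writerow(dict(zip(fieldnames, values)))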

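This will work if your installed and missing tables share a related field (such as a timestamp or account ID) that lets you keep the paired rows in the same order. A data sample would help to really answer the question.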
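
As a self-contained illustration of the pairing idea, here is a sketch that runs without DynamoDB. It assumes the half-rows are paired purely by position, uses the sample values from the question (with only one installed row, to show the padding), and writes to an in-memory buffer. The helper name `side_by_side` is hypothetical:

```python
import csv
import io
from itertools import zip_longest

HEADER = ["Agent Not Installed", "Not Installed Account", "Not Installed Tags",
          "Not Installed Environment", " ", "Agent Installed",
          "Installed Account", "Installed Tags", "Installed Environment"]


def side_by_side(missing, installed, width=4):
    """Pair rows by position, padding the shorter list with blank cells."""
    blank = ("",) * width
    for left, right in zip_longest(missing, installed, fillvalue=blank):
        yield [*left, "", *right]  # "" is the spacer column


missing = [("i-0xxxxxx", "333333333", "foo@bar.com", "int"),
           ("i-0yyyyyy", "333333333", "foo1@bar.com", "int")]
installed = [("i-0zzzzzz", "44444444", "foo2@bar.com", "int")]

buf = io.StringIO()
out = csv.writer(buf)
out.writerow(HEADER)
out.writerows(side_by_side(missing, installed))
rows = buf.getvalue().splitlines()
print(rows[1])  # i-0xxxxxx,333333333,foo@bar.com,int,,i-0zzzzzz,44444444,foo2@bar.com,int
print(rows[2])  # i-0yyyyyy,333333333,foo1@bar.com,int,,,,,
```

A plain `csv.writer` is used here because the rows are already ordered lists; with the question's `DictWriter`, each list would first have to be turned into a dict keyed by the field names.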

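【Comments】:

  • I've updated the question with screenshots of my tables and some output data from the for loops.
  • The approach I recommended will work, but there seems to be a disconnect in why you want them on the same row. If it were me, I would add another column, a boolean called "Installed", whose value is Yes or No (or True/False, or 1/0). The rows could then be sorted on it. Your loop data has no correlation: the accounts differ and the instances don't overlap, so (in my opinion) there is no reason to store them in the same row. Why do they need to be on the same row?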
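
The single-flag layout suggested in the last comment could look roughly like the sketch below. The column names in `FIELDS` and the helper `tag_rows` are assumptions made for illustration, and the data is taken from the question's sample output:

```python
import csv
import io

FIELDS = ["Instance", "Account", "Tags", "Environment", "Installed"]


def tag_rows(rows, installed):
    """Attach an Installed flag to each (id, account, tags, env) tuple."""
    flag = "Yes" if installed else "No"
    return [dict(zip(FIELDS, (*row, flag))) for row in rows]


missing = [("i-0xxxxxx", "333333333", "foo@bar.com", "int")]
installed = [("i-0zzzzzz", "44444444", "foo2@bar.com", "int")]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(tag_rows(missing, False) + tag_rows(installed, True))
lines = buf.getvalue().splitlines()
print(lines[1])  # i-0xxxxxx,333333333,foo@bar.com,int,No
print(lines[2])  # i-0zzzzzz,44444444,foo2@bar.com,int,Yes
```

With one instance per row, every `writerow()` call writes a complete record, so no pairing or padding is needed and the file can be sorted or filtered on the `Installed` column.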