Combine a JSON file with a CSV, similar to VLOOKUP

【Question title】: Combine json file with CSV- similar to vlookup
【Posted】: 2017-02-03 22:19:17
【Question】:

In short, I am trying to merge two sets of data. I am open to using grep/bash or Python.

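Read the directory /mediaid
Read the file names of the .json files
If a .json file name matches a row in the .csv, copy the contents of that JSON file into that row (if there is no match, skip it)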

Input data

file1.csv

testentry, 1234
testentry1, 6789

Input data (the file name is the MEDIAID to check)

1234.json

[
{"id":"1", "text":"Nice man!"},
{"id":"2", "text":"Good job"}
]

6789.json

[
{"id":"1", "text":"Test1"},
{"id":"2", "text":"Test2"}
]

Desired output data (.csv)

testentry, 1234, Nice man!, Good job
testentry1, 6789, Test1, Test2

I am trying to use grep, but I cannot work out how to get the JSON file name to check against and pass that file's data through.

#!/usr/bin/env bash

indir="$HOME/indir"
outdir="$HOME/outdir"

cd "$indir" || exit
mkdir -p "$outdir" || exit
for f in *.csv; do
    [[ -f $f ]] || continue
    lines=()
    while IFS=, read -ra cols; do
        if (( ${#cols[@]} != 2 )); then
            echo "Sorry buddy, you'll have to use a real CSV parser to handle: $f" >&2
            exit 1
        fi
        # Does the basename match the contents of the first column?
        if [[ ${cols[0]} == "${f%.*}" ]]; then
            echo "Match found in $f"
        fi
        lines+=("${cols[0]},${cols[1]}")
    done <"$f"
    # something with JQ to read the json filename, and pass its data into the row
    printf '%s\n' "${lines[@]}" > "$outdir/$f" || exit
done

A failed, but slightly better, attempt in Python:

import csv
import json

path_to_json = 'somedir/'

json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]

print json_files  # 

with open(json_files) as lookuplist:
    # IT NEEDS to match the mediaID from the json FILENAME
    with open('file1.csv', "r") as csvinput:
        with open('VlookupOut','w') as output:

            reader = csv.reader(lookuplist)
            reader2 = csv.reader(csvinput)
            writer = csv.writer(output)

            d = {}
            for xl in reader2:
                d[xl[2]] = xl[3:]

            for i in reader:
                if i[4] in d:
                    i.append(d[i[4]])
                writer.writerow(i)

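【Comments】:

Your requirements are unclear. Do you want all the text in the JSON file, regardless of ID?
Correct, the ID does not matter. It should match on the file name only. I have updated the OP to make this clearer.
Does your CSV really have spaces after the commas?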

Tags: python json bash csv grep

【Solution 1】:

This produces the output you want:

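# Treat every file under /mediaid as CSV input, one "entry, id" pair per line.
for file in /mediaid/*; do
    # read splits on whitespace, so $entry keeps its trailing comma.
    while read -r entry fileid; do
        # The JSON file is looked up relative to the current directory.
        jsonfile="$fileid.json"
        if [[ -f "$jsonfile" ]]; then
            # Collect every .text value into one comma-separated string.
            text=$(jq -r 'map(.text) | join(", ")' "$jsonfile")
            echo "$entry $fileid, $text"
        fi
    done < "$file"
done > output.csv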

It uses jq to parse the JSON files.
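
For readers who prefer Python, the same lookup can be sketched roughly as follows. This is only an illustration, not the answerer's code: the function names `merge_rows` and `write_rows` are made up for this example, and it assumes every JSON file holds a list of objects with a "text" key, as in the samples in the question. Rows without a matching JSON file are skipped, as the question asks.

```python
import csv
import json
from pathlib import Path


def merge_rows(csv_path, json_dir):
    """Return the CSV rows that have a matching <id>.json, with its texts appended.

    Hypothetical helper: assumes column 2 holds the media ID and that each
    JSON file is a list of objects carrying a "text" key.
    """
    merged = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # skipinitialspace drops the blank after each comma ("testentry, 1234").
        for row in csv.reader(handle, skipinitialspace=True):
            if len(row) < 2:
                continue  # blank or malformed line
            json_path = Path(json_dir) / f"{row[1].strip()}.json"
            if not json_path.is_file():
                continue  # no JSON file with that name: skip the row
            with open(json_path, encoding="utf-8") as json_file:
                items = json.load(json_file)
            merged.append(row + [item["text"] for item in items])
    return merged


def write_rows(rows, out_path):
    """Write the merged rows as CSV (no space after the commas)."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
```

Calling `write_rows(merge_rows("file1.csv", "/mediaid"), "output.csv")` would be one way to use it, assuming file1.csv sits in the current directory and the JSON files live in /mediaid. Unlike the shell version, `csv.writer` quotes any text that itself contains a comma.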

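【Comments】:

That did not work. It needs to append to the rows that already exist in the .CSV file. I am not just printing to a CSV file.
Please explain further. I do not understand this new requirement.
I am trying to merge the values from the JSON files so that they are printed into the already populated .CSV file.
So, save the result of processing file1.csv back into file1.csv?
No. Add the contents of the .JSON files to file.csv. If a .JSON file name matches an entry in "file1.csv", print that .JSON file's contents in that row.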
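
One possible reading of this last comment is that the existing CSV should be updated in place: matching rows gain the JSON text, and every other row stays as it is. A minimal Python sketch under that assumption follows. The name `append_json_text` is hypothetical, keeping unmatched rows unchanged is a guess at the intent, and each JSON file is again assumed to be a list of objects with a "text" key.

```python
import csv
import json
from pathlib import Path


def append_json_text(csv_path, json_dir):
    """Rewrite csv_path in place, appending JSON "text" values to matching rows.

    Hypothetical helper: rows whose ID has no <id>.json file are kept
    unchanged, which is an assumption about what the asker wants.
    """
    # Load every row first so the file can safely be reopened for writing.
    with open(csv_path, newline="", encoding="utf-8") as src:
        rows = list(csv.reader(src, skipinitialspace=True))

    for row in rows:
        if len(row) < 2:
            continue  # blank or malformed line: leave it alone
        json_path = Path(json_dir) / f"{row[1].strip()}.json"
        if json_path.is_file():
            with open(json_path, encoding="utf-8") as json_file:
                row.extend(item["text"] for item in json.load(json_file))

    # Overwrite the original file with the extended rows.
    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(rows)
    return rows
```

Because the file is overwritten, it is worth trying this on a copy first; the whole CSV is held in memory, which should be fine for a lookup table of modest size.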