How to read CSV after metadata? Answers

[Question title]: How to read CSV after metadata?
[Posted]: 2021-06-15 20:14:40
[Question description]:

I have a CSV file like this:

#Description
#Param1: value
#Param2: value
...
#ParamN: value

Time (s),Header1,Header2
243.41745,3,1
243.417455,3,5
243.41746,7,6
...

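I need to read it with Python, without using Pandas. How can I read the CSV data itself, ignoring the initial lines up to the blank line? The code below reads the metadata successfully.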

def read(file_path: str):
    '''Read the data of the Digilent WaveForms Logic Analyzer Acquisition
    (model Discovery2).

    Parameter: File path.
    '''
    meta = {}
    RE_CONFIG = re.compile(r'^#(?P<name>[^:]+)(: *(?P<value>.+)\s*$)*')
    with open(file_path, 'r') as fh:
        # Read the metadata and description at the beginning of the file.
        for line in fh.readlines():
            line = line.strip()
            if not line:
                break
            config = RE_CONFIG.match(line)
            if config:
                if not config.group('value'):
                    meta.update({'Description': config.group('name')})
                else:
                    meta.update({config.group('name'): config.group('value')})
        # Read the data itself.
        data = csv.DictReader(fh, delimiter=',')
    return data, meta

[Comments on the question]:

Doesn't this work?
Not the csv.DictReader(fh, delimiter=',') part. I want the result to be {'Time (s)': [], 'Header1': [], 'Header2': []}.
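
The column-oriented shape requested in the comment above can be built from the rows that csv.DictReader yields. The following is a minimal sketch, not part of the original thread: the helper name rows_to_columns is hypothetical, io.StringIO stands in for the real file, and values stay as strings because that is what the csv module returns.

```python
import csv
import io


def rows_to_columns(rows, fieldnames):
    """Turn an iterable of row dicts into a dict of column lists."""
    columns = {name: [] for name in fieldnames}
    for row in rows:
        for name in fieldnames:
            columns[name].append(row[name])
    return columns


# Sample data mirroring the question's file; io.StringIO replaces a file on disk.
sample = io.StringIO(
    "Time (s),Header1,Header2\n"
    "243.41745,3,1\n"
    "243.417455,3,5\n"
)
reader = csv.DictReader(sample)
# Accessing reader.fieldnames reads the header line before the rows are iterated.
columns = rows_to_columns(reader, reader.fieldnames)
print(columns)
```

If numeric values are needed, each column can be converted afterwards, for example with float() for the time column.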

Tags: python csv

[Solution 1]:

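This seems to work. In the part that reads the metadata, I had to change for line in fh.readlines(): to for line in fh:. readlines() reads the whole file up front, so by the time the loop breaks at the blank line there is nothing left for the DictReader to read. Iterating over fh directly consumes only the lines seen so far and leaves the data part unread. After that, create the DictReader and use it to get data, converting it to a list while the file is still open.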

import csv
import re
from pprint import pprint

def read(file_path: str):
    '''Read the data of the Digilent WaveForms Logic Analyzer Acquisition
    (model Discovery2).

    Parameter: File path.
    '''
    meta = {}
    RE_CONFIG = re.compile(r'^#(?P<name>[^:]+)(: *(?P<value>.+)\s*$)*')
    with open(file_path, 'r') as fh:
        # Read the metadata and description at the beginning of the file.
        for line in fh:  # CHANGED
            line = line.strip()
            if not line:
                break
            config = RE_CONFIG.match(line)
            if config:
                if not config.group('value'):
                    meta.update({'Description': config.group('name')})
                else:
                    meta.update({config.group('name'): config.group('value')})

        # Read the data itself.
        reader = csv.DictReader(fh, delimiter=',')
        data = list(reader)

    return data, meta

res = read('mixed.csv')
pprint(res)
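
The difference between the two loops can be checked without a file on disk. The following is a minimal sketch, not part of the original answer: io.StringIO stands in for the open file, and the helper name remaining_after_blank_line is hypothetical. It consumes lines up to the first blank one and reports what is still unread.

```python
import io

TEXT = (
    "#Description\n"
    "#Param1: value\n"
    "\n"
    "Time (s),Header1,Header2\n"
    "243.41745,3,1\n"
)


def remaining_after_blank_line(lines_of, text=TEXT):
    """Consume lines up to the first blank one and return what is left unread.

    lines_of is a callable that takes the file object and returns
    an iterable of its lines.
    """
    fh = io.StringIO(text)
    for line in lines_of(fh):
        if not line.strip():
            break
    return fh.read()


# readlines() loads every line up front, so nothing is left for csv.DictReader.
eager = remaining_after_blank_line(lambda fh: fh.readlines())
# Iterating the file object reads one line at a time, so the CSV part is still unread.
lazy = remaining_after_blank_line(lambda fh: fh)
print(repr(eager))
print(repr(lazy))
```

With readlines() the remainder is empty, which is why the DictReader in the question produced no rows; with plain iteration the header line and the data rows are still available.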

[Discussion]: