Converting a column of floats into cumulative percentages: answers

【Question title】: Converting a column of floats into cumulative percentages
【Posted】: 2020-12-30 08:57:54
【Question description】:


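Hey, I have a column of floats, 'close', that I need to convert into cumulative percentages and store in 'cum_p'.

I was given this script to help me, but I've messed up a step or two: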

import os
import sys
import csv


def adjust_cryptors_file(source, /, values, close):
    with open(source) as f:
        data = [row for row in csv.reader(f)][1:]

    agg_data = []

    ix = 0
    total = 0
    while ix < len(data):
        # value to add to the running total
        row = data[ix]

        # Column 20 / Index 19
        v = float(row[values])

        total += v

        # percentage of running total
        p = (total / 3797.14) * 100

        closed = row[close]

        # add to new list of data
        agg_data.append([v, p, closed])

        # increment index counter
        ix += 1

    agg_data.insert(0, ['timestamp', 'close', 'cum_p'])

    parent = os.path.dirname(source)
    dest = os.path.join(parent, 'modified.data')

    with open('modified.csv', 'w') as f:
        writer = csv.writer(f)
        writer.writerows(agg_data)

    print(f"Your new modified data file: {dest}")


if __name__ == '__main__':
    # Enter your CSV file here
    source = 'BTCUSDT-1d-data.csv'
    column_of_timestamp = 1
    column_of_close = 5

    adjust_cryptors_file(source, values=column_of_timestamp, close=column_of_close)

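I would really appreciate some help or pointers :) The first value should be 0%, and the second value should be the percentage difference between the first and second values in 'close'.

Hopefully someone can help me.

【Question comments】:

Would you be willing to share the .csv file you're using? I'd like to help, but I want to make sure the format matches what you're working with, so that I don't end up with a solution that may be incompatible.

Tags: python pandas csv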

【Solution 1】:

Try this solution:

df["cum_p"] = pd.Series.cumsum(df.close)/df.close.sum()*100

It gives the following output:
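
To see what this expression computes, here is a minimal sketch on a small made-up `close` column (the values are assumptions chosen for easy arithmetic, not the asker's data). Each row's `cum_p` is the running total of `close` up to that row, expressed as a percentage of the column's grand total, so the last row is always 100.

```python
import pandas as pd

# Made-up closing prices, for illustration only.
df = pd.DataFrame({"close": [100.0, 100.0, 200.0, 400.0]})

# Running total of 'close' divided by the grand total, as a percentage.
df["cum_p"] = df["close"].cumsum() / df["close"].sum() * 100

print(df)
```

Note that this is a share of the total, not a percentage change between rows, so the first row is not 0% unless the first `close` value is itself 0.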

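【Comments】:

【Solution 2】:

Since you're saving each row in the agg_data list, we can use that list to access the previous row's data. Based on your instructions, I've added this as a conditional statement: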

        # percentage of running total
        p = (total / 3797.14) * 100

        # new conditional statement to deal with the 'cum_p' column
        if ix == 0:
            closed = 0
        else:
            closed = ((p - agg_data[ix-1][1])/agg_data[ix-1][1]) * 100

        # add to new list of data
        agg_data.append([v, p, closed])

Here is the sample output I now get for the cum_p column in modified.csv.

Edit: I'm using the values of close from the modified.csv file. If you mean close from the original BTCUSDT-1d-data.csv file, let me know.
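
The same loop can be sketched without the CSV handling, which makes it easier to check by hand. The function name `running_percentages` and the `denominator` parameter are hypothetical (the parameter stands in for the hard-coded 3797.14), and the guard against a zero previous percentage is an addition that is not in the answer above.

```python
def running_percentages(values, denominator):
    """Return rows of [value, running_pct, change_vs_previous_pct]."""
    rows = []
    total = 0.0
    for v in values:
        total += v
        # Running total as a percentage of the chosen denominator.
        p = total / denominator * 100
        if not rows or rows[-1][1] == 0:
            # First row, or a zero previous percentage (added guard, an assumption).
            change = 0.0
        else:
            prev = rows[-1][1]
            # Percentage change of the running percentage versus the previous row.
            change = (p - prev) / prev * 100
        rows.append([v, p, change])
    return rows


# Made-up input values, for illustration only.
rows = running_percentages([100.0, 100.0, 200.0, 400.0], denominator=800.0)
for row in rows:
    print(row)
```

Here the third column compares each running percentage with the previous one, which is what the conditional in the snippet above does with `agg_data[ix-1][1]`.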

【Comments】:

【Solution 3】:

Solved it:

import pandas as pd

if __name__ == '__main__':
    source = 'BTCUSDT-1d-data.csv'

    # create dataframe from csv file
    df = pd.read_csv(source)

    # calculate the cum_pnl: running sum of the row-to-row fractional changes in 'close'
    daily_close = df['close']
    daily_returns = pd.Series(daily_close).pct_change(1)
    cum_pnl = daily_returns.cumsum()

    # add cum_pnl column
    df['cum_pnl'] = cum_pnl

    # filter to only columns that we want:
    df = df[['timestamp', 'close', 'cum_pnl']]

    # export it as csv again
    df.to_csv('btcdailychange.csv')

This gave me the individual percentage change for each row. I then simply created a new column ['sum'] and:

df['sum'] = df["cum_pnl"].cumsum()
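
For comparison, here is a minimal sketch on made-up prices showing two quantities that are easy to confuse: the running sum of the row-to-row percentage changes (what `pct_change().cumsum()` computes) and the percentage change relative to the first close. The `fillna(0)` call is an assumption added here so that the first row is 0%, as the question asks; without it, the first row is NaN because it has no previous value.

```python
import pandas as pd

# Made-up closing prices, for illustration only.
close = pd.Series([100.0, 110.0, 121.0], name="close")

# Row-to-row fractional change; the first row has no predecessor, so fill it with 0.
daily_returns = close.pct_change().fillna(0)

# Running sum of the daily changes, as a percentage (roughly 0, 10, 20 here).
summed = daily_returns.cumsum() * 100

# Change relative to the first close, as a percentage (roughly 0, 10, 21 here).
compounded = (close / close.iloc[0] - 1) * 100

print(pd.DataFrame({"close": close, "summed": summed, "compounded": compounded}))
```

The two columns diverge as soon as there is more than one non-zero change, because summing daily returns ignores compounding. Which one is wanted depends on what 'cum_p' is meant to represent.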

Thanks everyone for the replies!

【Comments】: