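[Posted]: 2021-12-02 09:49:49
[Question]:
I have a DataFrame with 15,000 rows of binary data; each string is 365 characters long. I expand each binary string into 365 days, one row per character, starting on 13 December 2020.
Because the data is so large, my program runs very slowly. Is there any way to optimize it?
Sample data: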
| ID | Nature | Binary |
|---|---|---|
| 1122 | M | 1001100100100010010001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100110110010011001001100100110010011001000000100110011011001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100110110010000001001100100110010011001001100 |
Expected output:
| ID | Nature | Date | Code |
|---|---|---|---|
| 1122 | M | 13/12/2020 | 1 |
| 1122 | M | 14/12/2020 | 0 |
| 1122 | M | .......... | ... |
| 1122 | M | 11/12/2021 | 0 |
Code:
import pandas as pd

start_date = '2020-12-13'  # the description and expected output start on 13 December 2020
# Build the table for the first row, then append one table per remaining row.
table_ = pd.DataFrame({'ID': df.id[0], 'Nature': df.Nature[0], 'Date': pd.date_range(start_date, periods=len(df.binairy[0]), freq='D'), 'Code': list(df.binairy[0])})
for i in range(1, len(df)):
    table_i = pd.DataFrame({'ID': df.id[i], 'Nature': df.Nature[i], 'Date': pd.date_range(start_date, periods=len(df.binairy[i]), freq='D'), 'Code': list(df.binairy[i])})
    table_ = pd.concat([table_, table_i], ignore_index=True)
table_
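One likely cause of the slowdown is that `pd.concat` inside the loop copies the whole accumulated table on every iteration, so the work grows roughly quadratically with the number of rows. A minimal sketch of a loop-free alternative follows. It is not the code from the question: the function name `expand_binary` is hypothetical, the default column names `ID`, `Nature` and `Binary` are taken from the sample table (the question's code uses `id` and `binairy`), and `Code` is returned as integers rather than single-character strings.

```python
import numpy as np
import pandas as pd


def expand_binary(df, start_date="2020-12-13",
                  id_col="ID", nature_col="Nature", binary_col="Binary"):
    """Expand each binary string into one row per day, without a concat loop.

    Hypothetical helper; column names are assumptions based on the sample table.
    """
    # Number of characters (days) in each row's string.
    lengths = df[binary_col].str.len().to_numpy()

    # Repeat ID and Nature once per character of the matching string.
    ids = np.repeat(df[id_col].to_numpy(), lengths)
    natures = np.repeat(df[nature_col].to_numpy(), lengths)

    # Day offset inside each string: 0, 1, ..., len-1, restarting for every row.
    starts = np.cumsum(lengths) - lengths
    offsets = np.arange(lengths.sum()) - np.repeat(starts, lengths)
    dates = pd.Timestamp(start_date) + pd.to_timedelta(offsets, unit="D")

    # Join all strings once and decode the ASCII digits '0'/'1' to integers 0/1.
    joined = "".join(df[binary_col]).encode("ascii")
    codes = np.frombuffer(joined, dtype=np.uint8) - ord("0")

    return pd.DataFrame({id_col: ids, nature_col: natures,
                         "Date": dates, "Code": codes})


# Small made-up frame, only to show the call; real strings would be 365 characters.
sample = pd.DataFrame({"ID": [1122, 7], "Nature": ["M", "F"], "Binary": ["101", "01"]})
result = expand_binary(sample)
```

If the loop has to stay, collecting the per-row tables in a list and calling `pd.concat` once after the loop avoids the repeated copying. The `Date` column here is a datetime; `result["Date"].dt.strftime("%d/%m/%Y")` would give the day/month/year strings shown in the expected output.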
[Discussion]: