【发布时间】:2019-10-08 15:31:06
【问题描述】:
我正在尝试构建一个 python 数据框来捕获初始数据集,然后在包含数据的 csv 文件在指定文件夹位置可用时更新字段。我已经建立了我的初始表并将其保存到 csv,但是当我重新打开表并尝试引入下一个文本文件数据时,我每次运行代码时都会不小心复制新数据的标题。读取新文本文件并将数据附加到基于通用列名的行的最佳方法是什么?
我目前的尝试是将新的文本文件构建到他们自己的数据库中,匹配初始导入的列标题,并将公共标题与起始数据集合并。然而,就像我说的那样,这些列只是不断复制自己,然后添加 _x、_y 等。
初始数据帧示例...
Line Name,Bearing,SOL X,SOL Y,EOL X,EOL Y,FSP 0.5m,FSP 1m,Length km,Length m,LSP 0.5m,LSP 1m,Date,Julian Day,Seq,Storage Db,Time Start,SOG Start,Fix Start,SOL Bearing,Line Length,SOL Easting,SOL Northing,Obs SOL X,Obs SOL Y,Time End,Fix End,SOG End,Planned EOL X,Planned EOL Y,Obs EOL X,Obs EOL Y
A901,38.67269998,568453.03,4343701.73,569156.01,4344580.05,250,125,1.125,1125,2500,1250,,,,,,,,,,,,,,,,,,,,
A902,38.67269998,568476.45,4343682.99,569179.43,4344561.31,250,125,1.125,1125,2500,1250,,,,,,,,,,,,,,,,,,,,
A903,38.67269998,568499.87,4343664.24,569202.85,4344542.56,250,125,1.125,1125,2500,1250,,,,,,,,,,,,,,,,,,,,
要合并的数据样本...
,Line Name,Date,Julian Day,Seq,Storage Db,Time Start,SOG Start,Fix Start,SOL Bearing,Line Length,SOL Easting,SOL Northing,Obs SOL X,Obs SOL Y,Time End,Fix End,SOG End,Planned EOL X,Planned EOL Y,Obs EOL X,Obs EOL Y
0,A901,9/26/2019,269,37,0037_JD269_X331 - 0001.db,09:29:54,2.73,12792,128.67,1000.0,587985.95,4380278.68,587811.22,4380427.4,09:43:51,15594,3.28,588766.68,4379653.81,588901.53,4379545.64
1,A902,9/26/2019,269,38,0038_JD269_M104 - 0001.db,11:38:24,3.69,98260,218.67,42875.0,591391.62,4383593.91,591626.99,4383892.87,17:40:02,25764,3.02,564600.29,4350120.19,568980.53,4355589.75
2,A903,9/29/2019,273,80,0080_JD273_M305 - 0001.db,00:50:53,3.64,27721,38.67,1125.0,576455.88,4351038.29,576365.15,4350932.06,01:03:26,30615,3.78,577158.86,4351916.61,577275.9,4352057.35
我尝试过使用不同的规则组合进行合并、附加和连接,但没有任何运气。一些错误,其他重复列...
df_proj = pd.concat([df_in,df_acc], axis=1,sort=False)
或
df_proj = pd.merge(df_in, df_acc, on='Line Name', how = 'right')
我试图在没有运气的情况下找到此类问题的示例,我以错误的方式解决了这个问题。也许我不应该一开始就将新的文本文件视为他们自己的数据框。但任何帮助将不胜感激。谢谢!
编辑:
预期的结果应该是这样的......
Line Name,Bearing,SOL X,SOL Y,EOL X,EOL Y,FSP 0.5m,FSP 1m,Length km,Length m,LSP 0.5m,LSP 1m,Date,Julian Day,Seq,Storage Db,Time Start,SOG Start,Fix Start,SOL Bearing,Line Length,SOL Easting,SOL Northing,Obs SOL X,Obs SOL Y,Time End,Fix End,SOG End,Planned EOL X,Planned EOL Y,Obs EOL X,Obs EOL Y
A901,38.67269998,568453.03,4343701.73,569156.01,4344580.05,250,125,1.125,1125,2500,1250,9/26/2019,269,37,0037_JD269_X331 - 0001.db,09:29:54,2.73,12792,128.67,1000.0,587985.95,4380278.68,587811.22,4380427.4,09:43:51,15594,3.28,588766.68,4379653.81,588901.53,4379545.64
A902,38.67269998,568476.45,4343682.99,569179.43,4344561.31,250,125,1.125,1125,2500,1250,9/26/2019,269,38,0038_JD269_M104 - 0001.db,11:38:24,3.69,98260,218.67,42875.0,591391.62,4383593.91,591626.99,4383892.87,17:40:02,25764,3.02,564600.29,4350120.19,568980.53,4355589.75
A903,38.67269998,568499.87,4343664.24,569202.85,4344542.56,250,125,1.125,1125,2500,1250,9/29/2019,273,80,0080_JD273_M305 - 0001.db,00:50:53,3.64,27721,38.67,1125.0,576455.88,4351038.29,576365.15,4350932.06,01:03:26,30615,3.78,577158.86,4351916.61,577275.9,4352057.35
【问题讨论】:
-
处理后可以添加预期的输出吗?
-
@jezrael 编辑已添加!谢谢!
标签: python python-3.x pandas numpy dataframe