【Question title】: How can I convert this byte or string to a dataframe?
【Posted】: 2018-11-03 08:10:10
【Question description】:

I have data (bytes) in this format:

b'{"datatable":{"data":[["AAPL","1980-12-12",28.75,28.87,28.75,28.75,2093900.0,0.0,1.0,0.42270591588018,0.42447025361603,0.42270591588018,0.42270591588018,117258400.0],
["AAPL","1980-12-15",27.38,27.38,27.25,27.25,785200.0,0.0,1.0,0.40256306006259,0.40256306006259,0.40065169418209,0.40065169418209,43971200.0],
["AAPL","1980-12-16",25.37,25.37,25.25,25.25,472000.0,0.0,1.0,0.37301040298714,0.37301040298714,0.37124606525129,0.37124606525129,26432000.0],
["AAPL","1980-12-17",25.87,26.0,25.87,25.87,385900.0,0.0,1.0,0.38036181021984,0.38227317610034,0.38036181021984,0.38036181021984,21610400.0],
["AAPL","1980-12-18",26.63,26.75,26.63,26.63,327900.0,0.0,1.0,0.39153594921354,0.39330028694939,0.39153594921354,0.39153594921354,18362400.0],
["AAPL","1980-12-19",28.25,28.38,28.25,28.25,217100.0,0.0,1.0,0.41535450864748,0.41726587452798,0.41535450864748,0.41535450864748,12157600.0],
.....,{"name":"adj_high","type":"BigDecimal(50,28)"},{"name":"adj_low","type":"BigDecimal(50,28)"},{"name":"adj_close","type":"BigDecimal(50,28)"},{"name":"adj_volume","type":"double"}]},"meta":{"next_cursor_id":null}}'

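I can convert it to a string with .decode('utf-8'). However, I want to turn it into a DataFrame or some other format so that I can work with the data. Any help would be appreciated.

When I pass it to pd.DataFrame(), I get the following error: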

ValueError: DataFrame constructor not properly called!
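
The error arises because the constructor receives the raw bytes as a single scalar value rather than as rows. A minimal sketch of the difference follows; `raw` is a hypothetical, shortened stand-in for the response above (two rows, four columns), not the actual payload.

```python
import json
import pandas as pd

# Hypothetical stand-in for the API response: same shape, fewer rows and columns.
raw = (
    b'{"datatable":{"data":[["AAPL","1980-12-12",28.75,2093900.0],'
    b'["AAPL","1980-12-15",27.38,785200.0]],'
    b'"columns":[{"name":"ticker","type":"String"},{"name":"date","type":"Date"},'
    b'{"name":"open","type":"BigDecimal(34,12)"},'
    b'{"name":"volume","type":"BigDecimal(37,15)"}]},'
    b'"meta":{"next_cursor_id":null}}'
)

# Passing the bytes straight to the constructor fails, because pandas
# sees one scalar object instead of a list of rows.
try:
    pd.DataFrame(raw)
    constructor_error = None
except ValueError as exc:
    constructor_error = str(exc)

# Decode and parse the JSON first, then pass the nested list of rows.
payload = json.loads(raw.decode("utf-8"))
df = pd.DataFrame(payload["datatable"]["data"])

print(constructor_error)
print(df.shape)
```

Decoding alone is not enough: the decoded string is still a scalar, so the JSON has to be parsed before pandas can see the rows.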

Thank you for the helpful guidance! I used

apple = json.loads(apple1)
apple

and got

{'datatable': {'columns': [{'name': 'ticker', 'type': 'String'},
    {'name': 'date', 'type': 'Date'},
    {'name': 'open', 'type': 'BigDecimal(34,12)'},
    {'name': 'high', 'type': 'BigDecimal(34,12)'},
    {'name': 'low', 'type': 'BigDecimal(34,12)'},
    {'name': 'close', 'type': 'BigDecimal(34,12)'},
    {'name': 'volume', 'type': 'BigDecimal(37,15)'},
    {'name': 'ex-dividend', 'type': 'BigDecimal(42,20)'},
    {'name': 'split_ratio', 'type': 'double'},
    {'name': 'adj_open', 'type': 'BigDecimal(50,28)'},
    {'name': 'adj_high', 'type': 'BigDecimal(50,28)'},
    {'name': 'adj_low', 'type': 'BigDecimal(50,28)'},
    {'name': 'adj_close', 'type': 'BigDecimal(50,28)'},
    {'name': 'adj_volume', 'type': 'double'}],
   'data': [['AAPL',
     '1980-12-12',
     28.75,
     28.87,
     28.75,
     28.75,
     2093900.0,
     0.0,
     1.0,
     0.42270591588018,
     0.42447025361603,
     0.42270591588018,
     0.42270591588018,
     117258400.0],
    ['AAPL',
     '1980-12-15',
     27.38,
     27.38,
     27.25,
     27.25,
     785200.0,
     0.0,
     1.0,
     0.40256306006259,
     0.40256306006259,
     0.40065169418209,
     0.40065169418209,
     43971200.0],

If I run:

pd.DataFrame(apple['datatable']['data'])

I get:

[image: apple dataframe]

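This is good, but I would like the column names to be [date, open, high, low, close, volume, ex-dividend, split_ratio, adj_open, adj_high, adj_low, adj_close, adj_volume] instead of [0,1,2,3,4,5,6,7,8,9,10,11,12,13].

I would also like to drop the current column 1 ('AAPL') and keep a numeric index, so that the result looks like a time series with the date as the first column.

Can you help me with this?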

【Comments】:

Tags: python string dataframe


【Solution 1】:

You may need to tidy up the data first, but the following works.

import json
import pandas as pd
pd.DataFrame(json.loads(data.decode('utf-8'))['datatable']['data'])
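
To get the named columns the question asks for, one option is to read the names from the `columns` list that appears in the parsed output. The sketch below assumes that list lines up one-to-one with each row; `raw` is again a hypothetical, shortened stand-in for the real response, and the elided part of the original dump is not reconstructed.

```python
import json
import pandas as pd

# Hypothetical stand-in for the API response: same shape, fewer rows and columns.
raw = (
    b'{"datatable":{"data":[["AAPL","1980-12-12",28.75,2093900.0],'
    b'["AAPL","1980-12-15",27.38,785200.0]],'
    b'"columns":[{"name":"ticker","type":"String"},{"name":"date","type":"Date"},'
    b'{"name":"open","type":"BigDecimal(34,12)"},'
    b'{"name":"volume","type":"BigDecimal(37,15)"}]},'
    b'"meta":{"next_cursor_id":null}}'
)

table = json.loads(raw.decode("utf-8"))["datatable"]

# Take the column names from the metadata shipped alongside the rows.
names = [col["name"] for col in table["columns"]]
df = pd.DataFrame(table["data"], columns=names)

# Drop the constant ticker column and parse the date strings.
df = df.drop(columns="ticker")
df["date"] = pd.to_datetime(df["date"])

# Optional: a date-indexed copy, if a DatetimeIndex is preferred
# over the default numeric index.
ts = df.set_index("date")

print(df.head())
```

`df` keeps the default numeric index with `date` as its first column, which matches the layout described in the question; `ts` is only there for cases where resampling or date slicing is needed.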

【Comments】:
