【Question title】: Iterate Through Nested Dictionary to Create Dataframe and Add New Column Value
【Posted】: 2021-05-04 23:25:21
【Question】:

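I'm new to Python, so please bear with me.

I have a list of dictionaries of stock information in a variable named "json". I want to convert it to a dataframe and then add a new column next to the data containing the ticker symbol. See below.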

    json = [
        {'Meta Data': {'1. Information': 'Monthly Prices (open, high, low, close) and Volumes',
                       '2. Symbol': 'AAPL', '3. Last Refreshed': '2021-01-29', '4. Time Zone': 'US/Eastern'},
         'Monthly Time Series': {
             '2021-01-29': {'1. open': '133.5200', '2. high': '145.0900', '3. low': '126.3820',
                            '4. close': '131.9600', '5. volume': '2239366098'},
             '2020-12-31': {'1. open': '121.0100', '2. high': '138.7890', '3. low': '120.0100',
                            '4. close': '132.6900', '5. volume': '2319687808'}}},

        {'Meta Data': {'1. Information': 'Monthly Prices (open, high, low, close) and Volumes',
                       '2. Symbol': 'ZM', '3. Last Refreshed': '2021-01-29', '4. Time Zone': 'US/Eastern'},
         'Monthly Time Series': {
             '2021-01-29': {'1. open': '340.4000', '2. high': '404.4400', '3. low': '331.1000',
                            '4. close': '372.0700', '5. volume': '121350349'},
             '2020-12-31': {'1. open': '434.7200', '2. high': '434.9900', '3. low': '336.1000',
                            '4. close': '337.3200', '5. volume': '150168985'}}},
    ]

I run the following and get the dataframes I want, except for the ticker:

    df = [pd.DataFrame.from_dict(i['Monthly Time Series'], orient= 'index').sort_index(axis=1) for i in json]

Output:

    [             1. open   2. high    3. low  4. close   5. volume
    2021-01-29  133.5200  145.0900  126.3820  131.9600  2239366098
    2020-12-31  121.0100  138.7890  120.0100  132.6900  2319687808,
                  1. open   2. high    3. low  4. close  5. volume
    2021-01-29  340.4000  404.4400  331.1000  372.0700  121350349
    2020-12-31  434.7200  434.9900  336.1000  337.3200  150168985]

What I want is to extract the value of '2. Symbol' from json and append the matching ticker to the corresponding data, like this:

    [             1. open   2. high    3. low  4. close   5. volume  ticker
    2021-01-29  133.5200  145.0900  126.3820  131.9600  2239366098  AAPL
    2020-12-31  121.0100  138.7890  120.0100  132.6900  2319687808  AAPL,
                  1. open   2. high    3. low  4. close  5. volume  ticker
    2021-01-29  340.4000  404.4400  331.1000  372.0700  121350349  ZM
    2020-12-31  434.7200  434.9900  336.1000  337.3200  150168985  ZM]

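【Comments】:

  • First of all, json is not a dictionary. Please confirm its type with type() before going further.
  • Thanks. What should I do then?
  • It would help if you could confirm the type.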
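
One way to confirm the type, as the comments above ask, is to look at the container and at its first element. This is only a minimal sketch: it assumes data shaped like the sample in the question, and the names `data` and `describe` are hypothetical (`data` is used instead of `json` so the standard `json` module is not shadowed).

```python
# Hypothetical sample with the same shape as the question's data (truncated).
data = [
    {
        "Meta Data": {"2. Symbol": "AAPL"},
        "Monthly Time Series": {
            "2021-01-29": {"1. open": "133.5200", "4. close": "131.9600"},
        },
    },
]


def describe(obj):
    """Return the type names of a container and of its first element, if any."""
    outer = type(obj).__name__
    # Iterating a dict yields its keys, so the first "element" of a dict is a str.
    first = next(iter(obj), None)
    inner = type(first).__name__ if first is not None else None
    return outer, inner


print(describe(data))     # the list comprehension in the question needs a list of dicts
print(describe(data[0]))  # a dict iterates over its string keys
```

If the first call does not report a list whose elements are dicts, the list comprehension in the question will not behave as intended.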

Tags: python pandas list loops dictionary


【Solution 1】:

Update:

A single loop, written as a one-liner:

    # The sample data below uses the key '2.Symbol' (no space); the question's data uses '2. Symbol'.
    df = [pd.DataFrame.from_dict(i['Monthly Time Series'], orient='index').sort_index(axis=1).assign(ticker=i['Meta Data']['2.Symbol']) for i in json]

The json data:

    json = [
        {
            'Meta Data': {
                '1. Information': 'Monthly Prices (open, high, low, close) and Volumes',
                '2.Symbol': 'AAPL', '3. Last Refreshed': '2021-01-29', '4. Time Zone': 'US/Eastern'},
            'Monthly Time Series': {
                '2020-01-29': {'1. open': '133.5200', '2. high': '145.0900', '3. low': '126.3820',
                               '4. close': '131.9600', '5. volume': '2239366098'},
                '2020-01-30': {'1. open': '121.0100', '2. high': '138.7890', '3. low': '120.0100',
                               '4. close': '132.6900', '5. volume': '2319687808'}}},
        {
            'Meta Data': {
                '1. Information': 'Monthly Prices (open, high, low, close) and Volumes',
                '2.Symbol': 'ZM', '3. Last Refreshed': '2021-01-01', '4. Time Zone': 'US/Eastern'},
            'Monthly Time Series': {
                '2020-02-02': {'1. open': '133.5200', '2. high': '145.0900', '3. low': '126.3820',
                               '4. close': '131.9600', '5. volume': '2239366098'},
                '2020-02-31': {'1. open': '121.0100', '2. high': '138.7890', '3. low': '120.0100',
                               '4. close': '132.6900', '5. volume': '2319687808'}}},
    ]

Use assign to add the new column:

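    import pandas as pd

    # Build the per-ticker frame: dates as rows, columns sorted by name.
    addTimeSeries = lambda i: pd.DataFrame.from_dict(i['Monthly Time Series'], orient='index').sort_index(axis=1)

    # Add a constant 'ticker' column taken from the metadata.
    addVal = lambda i: addTimeSeries(i).assign(ticker=i['Meta Data']['2.Symbol'])
    df = [addVal(i) for i in json]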

Output:

    [             1. open   2. high    3. low  4. close   5. volume ticker
     2020-01-29  133.5200  145.0900  126.3820  131.9600  2239366098   AAPL
     2020-01-30  121.0100  138.7890  120.0100  132.6900  2319687808   AAPL,
                  1. open   2. high    3. low  4. close   5. volume ticker
     2020-02-02  133.5200  145.0900  126.3820  131.9600  2239366098     ZM
     2020-02-31  121.0100  138.7890  120.0100  132.6900  2319687808     ZM]
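
If a single table is more convenient than a list of frames, the per-ticker frames can be stacked with `pd.concat`. The following is a sketch rather than part of the original answer. It assumes the key is spelled `'2. Symbol'` (with a space, as in the question's data, unlike the `'2.Symbol'` in this answer's sample), and the names `records`, `frame_for`, and `combined` are hypothetical.

```python
import pandas as pd

# Hypothetical sample in the question's shape; note the key '2. Symbol' has a space.
records = [
    {
        "Meta Data": {"2. Symbol": "AAPL"},
        "Monthly Time Series": {
            "2021-01-29": {"1. open": "133.5200", "4. close": "131.9600"},
            "2020-12-31": {"1. open": "121.0100", "4. close": "132.6900"},
        },
    },
    {
        "Meta Data": {"2. Symbol": "ZM"},
        "Monthly Time Series": {
            "2021-01-29": {"1. open": "340.4000", "4. close": "372.0700"},
            "2020-12-31": {"1. open": "434.7200", "4. close": "337.3200"},
        },
    },
]


def frame_for(record):
    """Build one frame per record: dates as the index, plus a ticker column."""
    frame = pd.DataFrame.from_dict(record["Monthly Time Series"], orient="index")
    # The values arrive as strings; convert each column to numbers first.
    frame = frame.apply(pd.to_numeric)
    return frame.assign(ticker=record["Meta Data"]["2. Symbol"])


# Stack the per-ticker frames; the ticker column tells the rows apart.
combined = pd.concat([frame_for(r) for r in records])
print(combined)
```

Because both tickers share the same dates, the index of `combined` contains duplicates; the `ticker` column is what distinguishes the rows.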

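【Comments】:

  • Thanks. Unfortunately it still doesn't work. I now get: "TypeError: string indices must be integers"
  • I'm not sure how you got that error, because I don't get any such error when I run the code above.
  • Updated the answer with a one-liner that doesn't use lambda.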
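
The `TypeError` reported in the comments is what Python raises when a string is indexed with a string key. One plausible cause, which the thread never confirms, is that `json` was a single dict (iterating it yields its string keys) or a raw JSON string rather than a list of dicts. Below is a minimal sketch of normalizing the input first; `as_records` is a hypothetical helper, not a pandas function.

```python
import json as jsonlib  # aliased so a variable named `json` cannot shadow the module


def as_records(payload):
    """Coerce a JSON string, a single dict, or a list of dicts into a list of dicts."""
    if isinstance(payload, str):
        payload = jsonlib.loads(payload)  # parse raw JSON text
    if isinstance(payload, dict):
        payload = [payload]  # a single response becomes a one-element list
    if not all(isinstance(item, dict) for item in payload):
        raise TypeError("expected a list of dicts")
    return payload


single = {"Meta Data": {"2. Symbol": "AAPL"}, "Monthly Time Series": {}}

# Iterating the dict directly yields its keys, so i["Meta Data"] indexes a str.
try:
    [i["Meta Data"] for i in single]
except TypeError as exc:
    print("reproduced:", exc)

# After normalizing, each element is a dict and the lookup works.
print([r["Meta Data"]["2. Symbol"] for r in as_records(single)])
```

With the input normalized this way, the list comprehension from the answer can iterate over `as_records(json)` instead of `json`.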