Iterating through and analyzing a list of lists: answers

【Question title】: Iterating through and analyzing a list of lists
【Posted】: 2022-01-10 21:50:54
【Question description】:

I'm trying to iterate through and analyze several nested lists. Typically, the list I start with contains more than 200 sublists:

[
  [
    1499040000000,      // Open time
    "0.01634790",       // Open
    "0.80000000",       // High
    "0.01575800",       // Low
    "0.01577100",       // Close
    "148976.11427815",  // Volume
    1499644799999,      // Close time
    "2434.19055334",    // Quote asset volume
    308,                // Number of trades
    "1756.87402397",    // Taker buy base asset volume
    "28.46694368",      // Taker buy quote asset volume
    "17928899.62484339" // Ignore.
  ]
]

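I want to iterate over a few different subsections of this nested list. For example, I only want to iterate over and analyze the last quarter or the second half of the list.

From these subsections, I want to determine the maximum of the "High" value, i.e. index 2.

Here is what I've tried: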

import itertools

def get_highest_high(klines, lookback_period):
    # Start of the lookback window, assuming klines spans 24 hours
    start = int(len(klines) / 24 * (24 - lookback_period) + 1)
    stop = int(len(klines) + 1)

    highest_high = None
    for line in itertools.islice(klines, start, stop):
        high = float(line[2])  # index 2 is "High"
        if highest_high is None or high > highest_high:
            highest_high = high
    return highest_high

twentyfour_hour_klines = []  # placeholder: the initial list of sublists goes here

# last 6 hours:
lookback_period = int('6')
six_hour_highest_high = get_highest_high(klines=twentyfour_hour_klines, lookback_period=lookback_period)
print(six_hour_highest_high, flush=True)

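It works, but it seems like a rather clunky solution. Is there anything leaner than this? Please also keep in mind that I need to run the calculation many times, so speed is a concern.

【Question discussion】:

This data looks like it would work well in a pandas.DataFrame; pandas also provides ways to look at specific slices of the data, as you requested.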
Thanks, @Kraigolas. pandas does indeed seem to be the right tool for my requirements. From a quick look at the documentation, this:

# Import pandas library
import pandas as pd
# Initialize list of lists
data = [['DS', 'Linked_list', 10], ['DS', 'Stack', 9], ['DS', 'Queue', 7], ['Algo', 'Greedy', 8], ['Algo', 'DP', 6], ['Algo', 'BackTrack', 5]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Category', 'Name', 'Marks'])

together with this: titanic.iloc[9:25, 2:5] should do the trick. I'll check it and update my question once I've worked it out.
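To make the iloc idea concrete for data shaped like the question's rows, here is a small sketch. The shortened rows are made up for illustration and keep only the first three fields (open time, open, high); the variable names are assumptions, not part of the original post.

```python
import pandas as pd

# Hypothetical, shortened klines: [open time, open, high]
klines = [
    [1, "0.10", "0.50"],
    [2, "0.20", "0.90"],
    [3, "0.30", "0.70"],
    [4, "0.40", "0.60"],
    [5, "0.10", "0.30"],
    [6, "0.10", "0.20"],
    [7, "0.10", "0.80"],
    [8, "0.10", "0.40"],
]

df = pd.DataFrame(klines, columns=["Open time", "Open", "High"])
df["High"] = df["High"].astype(float)  # values arrive as strings

# iloc selects by position: rows first, then columns.
quarter = len(df) // 4
last_quarter_high = df.iloc[-quarter:, 2].max()  # last quarter of rows, column 2 ("High")

print(last_quarter_high)
```

Negative positions count from the end, so `-quarter:` takes the last quarter of the rows without computing a start index by hand.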

Tags: python loops max nested-lists

【Solution 1】:

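Whenever I need to do the same thing to every item in a list, I use map. map applies the same function to each item of the list separately.

The only thing left to work out is what the function should look like. We need a lambda function that takes a list and returns its n-th item.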

x = [1499040000000,
    "0.01634790",
    "0.80000000",
    "0.01575800",       
    "0.01577100",
    "148976.11427815",
    1499644799999,
    "2434.19055334",
    308,                
    "1756.87402397",
    "28.46694368",
    "17928899.62484339"
  ]

x[2]    # returns "0.80000000" (a string); indices start at 0

Now let's create a longer list and map over it.

 y = [
      [1499040000000,
       "0.01634790",
       "0.80000000",
       "0.01575800",       
       "0.01577100",
       "148976.11427815",
       1499644799999,
       "2434.19055334",
       308,                
       "1756.87402397",
       "28.46694368",
       "17928899.62484339"
       ],
      [1499040000000,
       "0.01634790",
       "0.80000000",
       "0.01575800",       
       "0.01577100",
       "148976.11427815",
       1499644799999,
       "2434.19055334",
       308,                
       "1756.87402397",
       "28.46694368",
       "17928899.62484339"
     ]    
 ]

 res = map(lambda lst: lst[2], y)
 for a in res:
     print(a)    # prints 0.80000000 for each sublist

Finally, wrap it in a function:

 def extract(lst, n):
     return map(lambda x: x[n],lst)

map returns an iterable (a lazy iterator in Python 3), so you can loop over it with for x in ..., or convert it to a list with list().
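The following is a minimal sketch of two practical consequences: the iterator returned by map can only be consumed once, and slicing the outer list before mapping restricts the work to part of the data. The rows and the names `rows`, `half`, and `second_half_max` are hypothetical and only serve the illustration.

```python
def extract(lst, n):
    """Return a lazy iterator over the n-th item of each sublist."""
    return map(lambda x: x[n], lst)

# Hypothetical, shortened rows: [open time, open, high]
rows = [
    [1, "0.10", "0.50"],
    [2, "0.20", "0.90"],
    [3, "0.30", "0.70"],
    [4, "0.40", "0.60"],
]

highs = extract(rows, 2)
first_pass = list(highs)    # consumes the iterator
second_pass = list(highs)   # already exhausted, so this is empty

# Slice first, then map: highest "High" of the second half only.
half = len(rows) // 2
second_half_max = max(map(float, extract(rows[half:], 2)))

print(first_pass)
print(second_pass)
print(second_half_max)
```

Because the values are strings, they are converted with float before max is applied; comparing the raw strings would sort them lexicographically rather than numerically.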

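【Discussion】:

Thanks for your reply. I've already worked with lambda functions (the neat min() and max() functions would also work). However, I'm having trouble applying this to specific parts/rows of the list as needed.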

【Solution 2】:

Following @Kraigolas's suggestion, I arrived at the following solution:

    import pandas as pd
    from binance.client import Client  # python-binance, assumed from the Client.KLINE_* usage

    # `client` is assumed to be an already-created Client instance.

    def get_minute_data(symbol, interval, start_str):
        price_data = client.futures_historical_klines(symbol=symbol, interval=interval, start_str=start_str)

        # Keep only the first 7 fields of each kline and name them
        df = pd.DataFrame(price_data)
        df = df.iloc[:, :7]
        df.columns = ["Open time", "Open", "High", "Low", "Close", "Volume", "Close time"]

        # Prices and volume arrive as strings; timestamps are in milliseconds
        numeric_cols = ["Open", "High", "Low", "Close", "Volume"]
        df[numeric_cols] = df[numeric_cols].astype(float)
        df["Open time"] = pd.to_datetime(df["Open time"], unit='ms')
        df["Close time"] = pd.to_datetime(df["Close time"], unit='ms')

        return df

    price_data = get_minute_data(symbol="BTCUSDT", interval=Client.KLINE_INTERVAL_5MINUTE, start_str='1 day ago UTC')
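The code above builds the DataFrame but stops short of the original goal, the highest "High" over a lookback window. One possible way to finish it is sketched below; the helper name `highest_high`, its `hours` parameter, and the synthetic hourly rows are assumptions for illustration, not data from Binance.

```python
import pandas as pd

def highest_high(df, hours):
    """Highest "High" among rows whose "Open time" lies within the last `hours` hours."""
    cutoff = df["Open time"].max() - pd.Timedelta(hours=hours)
    return df.loc[df["Open time"] > cutoff, "High"].max()

# Synthetic stand-in for the DataFrame returned by get_minute_data():
# one row per hour for 24 hours, with made-up prices.
open_times = pd.Timestamp("2022-01-10") + pd.to_timedelta(list(range(24)), unit="h")
df = pd.DataFrame({
    "Open time": open_times,
    "High": [float(i) for i in range(24)],
})
df.loc[20, "High"] = 100.0   # spike inside the last 6 hours
df.loc[5, "High"] = 500.0    # spike outside the last 6 hours

six_hour_high = highest_high(df, hours=6)
full_day_high = highest_high(df, hours=24)
print(six_hour_high)
print(full_day_high)
```

Filtering on the timestamp column rather than on row positions keeps the window correct even if the number of rows per hour changes with the kline interval.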

【Discussion】: