【Question title】: How to have multiple MLflow runs in parallel?
【Posted】: 2023-04-10 00:32:01
【Question description】:

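I'm not very familiar with parallelization in Python, and I get an error when trying to train a model on multiple training folds in parallel. Here is a simplified version of my code: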

import mlflow
from multiprocessing.pool import ThreadPool

def train_test_model(fold):
    # here I train the model etc...

    # now I want to save the parameters and metrics
    with mlflow.start_run():
        mlflow.log_param("run_name", run_name)
        mlflow.log_param("modeltype", modeltype)
        # and so on...

if __name__ == "__main__":
    # run_name, modeltype, num_trials and folds are set elsewhere (omitted in this simplified version)
    pool = ThreadPool(processes=num_trials)
    # run folds in parallel
    pool.map(train_test_model, folds)

I get the following error:

Exception: Run with UUID 23e9bb6d22674a518e48af9c51252860 is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True

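According to the documentation, mlflow.start_run() starts a new run and makes it active, which is the root of my problem. Each thread starts an MLflow run for its fold and makes it active, whereas I need the runs to proceed in parallel, i.e. all be active (?) at the same time, each saving the parameters/metrics of its own fold. How can I solve this?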

【Question discussion】:

    Tags: python pyspark parallel-processing mlflow


    【Solution 1】:

    I found a solution; maybe it will be useful to others. See the code example in this issue for details: https://github.com/mlflow/mlflow/issues/3592
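
One common way to avoid the "already active" error is to skip the fluent API (mlflow.start_run()) and create each run explicitly through MlflowClient, so no thread depends on a shared "active run". The sketch below illustrates that idea; it is not necessarily the exact code from the linked issue. The experiment ID "0" (MLflow's default experiment), the "fold" parameter, the placeholder "accuracy" metric, and the helper name run_folds_in_parallel are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def train_test_model(client, experiment_id, fold):
    """Train on one fold and log to a run that belongs only to this call."""
    # ... train and evaluate the model for this fold here ...
    accuracy = 0.0  # placeholder value, stands in for a real metric

    # create_run returns a Run object without touching the fluent API's
    # active-run state, so concurrent calls do not interfere with each other.
    run_id = client.create_run(experiment_id).info.run_id
    try:
        client.log_param(run_id, "fold", fold)
        client.log_metric(run_id, "accuracy", accuracy)
    except Exception:
        client.set_terminated(run_id, status="FAILED")
        raise
    else:
        client.set_terminated(run_id, status="FINISHED")
    return run_id


def run_folds_in_parallel(folds, client=None, experiment_id="0", max_workers=4):
    """Hypothetical helper: run every fold in its own thread and MLflow run."""
    if client is None:
        # Imported lazily so the function can also be exercised with a stub client.
        from mlflow.tracking import MlflowClient
        client = MlflowClient()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda fold: train_test_model(client, experiment_id, fold), folds))
```

Calling run_folds_in_parallel(folds) should then produce one finished run per fold. Every logging call names its run_id explicitly, which is what removes the dependency on a single active run.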

    【Discussion】:
