Training a model with a while loop: answers

【Question title】: Training a model with while loop
【Posted】: 2021-09-13 22:52:36
【Question】:

I am trying to iterate over some values while the length of my dataset S_train is at most 11:

S_new = train
T_new = test
mu_new = mu
mu_test_new = mu_test

while len(S_new) <= 11:
  ground_test =  T_new[target].values.tolist()
  acquisition_function = abs(mu_test - ground_test)
  max_item = np.argmax(acquisition_function) #step 3 : value in test set that maximizes the abs difference of the energy
  alpha_al = test.iloc[[max_item]]  #identify the minimum step in test set
  S_new = S_new.append(alpha_al)
  len(S_new)
  T_new = T_new.drop(test.index[max_item])
  len(T_new)

  gpr = GaussianProcessRegressor(
    # kernel is the covariance function of the gaussian process (GP)
    kernel=Normalization( # kernel equals to normalization -> normalizes a kernel using the cosine of angle formula, k_normalized(x,y) = k(x,y)/sqrt(k(x,x)*k(y,y))
        # graphdot.kernel.fix.Normalization(kernel), set kernel as marginalized graph kernel, which is used to calculate the similarity between 2 graphs
        # implement the random walk-based graph similarity kernel as Kashima, H., Tsuda, K., & Inokuchi, A. (2003). Marginalized kernels between labeled graphs. ICML
        Tang2019MolecularKernel()
    ),
    alpha=1e-4, # value added to the diagonal of the kernel matrix during fitting
    optimizer=True, # default optimizer of L-BFGS-B based on scipy.optimize.minimize
    normalize_y=True, # normalize the y values so that the mean and variance are 0 and 1, respectively. Will be reversed when predictions are returned
    regularization='+', # alpha (1e-4 in this case) is added to the diagonal of the kernel matrix
     )
  
  start_time = time.time()
  gpr.fit(S_new.graphs, S_new[target], repeat=1, verbose=True) # Fitting train set as graphs (independent variable) with train[target] as dependent variable
  end_time = time.time()
  print("the total time consumption is " + str(end_time - start_time) + ".")
 
  gpr.kernel.hyperparameters
  
  rmse_training = []
  rmse_test = []


  mu_new = gpr.predict(S_new.graphs)

  print('Training set')
  print('MAE:', np.mean(np.abs(S_new[target] - mu_new)))
  print('RMSE:', np.std(S_new[target] - mu_new))
  rmse_training.append(np.std(S_new[target] - mu_new)

  mu_test_new = gpr.predict(T_new.graphs)
  print('Test set')
  print('MAE:', np.mean(np.abs(T_new[target] - mu_test_new)))
  print('RMSE:', np.std(T_new[target] - mu_test_new))
  rmse_test.append(np.std(T_new[target] - mu_test_new)

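Basically, I find the element of T_new that maximizes the absolute error between the i-th element of T_new and mu_test, add it to the set S_train, and then remove it from T_new. With the new S_train, I train my model again and then repeat the procedure I described above. I have never used a while loop before and I am trying to get the syntax right. It looks correct to me, but I get the following error message: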

File "<ipython-input-55-d284ca5f9d1f>", line 42
    mu_test_new = gpr.predict(T_new.graphs)
              ^
SyntaxError: invalid syntax

Do you know what is causing this? Any suggestions are much appreciated. Thank you, as always, for your help.
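For readers following along, here is a minimal sketch of the selection step described above: pick the row whose prediction is furthest from its target, add it to the training set, and drop it from the remaining pool. The helper name `move_worst_prediction`, the `energy` column, and all numbers are made up for illustration, and a hard-coded list stands in for the model's predictions. The sketch looks up the row through the current pool's own index, since a position returned by `np.argmax` only lines up with the frame the errors were computed from. It uses `pd.concat` because `DataFrame.append` was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd

def move_worst_prediction(train, pool, predictions, target):
    """Move the pool row with the largest absolute error into the training set."""
    errors = np.abs(np.asarray(predictions) - pool[target].to_numpy())
    position = int(np.argmax(errors))       # position within the *current* pool
    label = pool.index[position]            # translate the position into an index label
    picked = pool.loc[[label]]              # one-row DataFrame
    new_train = pd.concat([train, picked])  # replacement for the removed DataFrame.append
    new_pool = pool.drop(index=label)
    return new_train, new_pool

# Toy data: hypothetical target column and index labels.
train = pd.DataFrame({"energy": [1.0, 2.0]}, index=[0, 1])
pool = pd.DataFrame({"energy": [3.0, 4.0, 5.0]}, index=[2, 3, 4])
predictions = [3.1, 2.5, 5.2]  # made-up model output for the three pool rows

train, pool = move_worst_prediction(train, pool, predictions, "energy")
print(train.index.tolist())
print(pool.index.tolist())
```

In a real loop the predictions would have to be recomputed for the current pool on every pass, so that their length and order match the rows still in it.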

【Comments】:

You are missing a closing parenthesis on the previous line...
You are right. Thank you very much!

Tags: python machine-learning while-loop dataset

【Solution 1】:

The problem is not the while loop. It is just a typo. Specifically, this line:

  rmse_training.append(np.std(S_new[target] - mu_new)

is missing a closing parenthesis.
If you try

  rmse_training.append(np.std(S_new[target] - mu_new))

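the error you are seeing will go away. Note that the rmse_test.append(...) line at the end of the loop is missing its closing parenthesis too and needs the same fix.

It is well worth noting that an error reported for a particular line is sometimes caused by a syntax error on an earlier line, which is something to keep in mind when debugging.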
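To see this behaviour in isolation, the small sketch below compiles a three-line snippet with an unclosed parenthesis on its second line and prints the line number Python reports. The snippet and the helper `syntax_error_line` are made up for illustration. As far as I know, Python 3.9 and earlier usually blame the following line with a bare "invalid syntax", while Python 3.10 and later point at the unclosed parenthesis itself, so the printed number depends on the interpreter version.

```python
# Three lines of source; the closing parenthesis is missing on line 2.
broken = (
    "values = []\n"
    "values.append(abs(-1)\n"
    "total = sum(values)\n"
)
# Same source with the parenthesis restored.
fixed = broken.replace("abs(-1)\n", "abs(-1))\n")

def syntax_error_line(source):
    """Return the line number of the reported SyntaxError, or None if the source compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.lineno
    return None

print("broken source, reported line:", syntax_error_line(broken))
print("fixed source, reported line:", syntax_error_line(fixed))
```

Whatever line is reported, the practical habit is the same: when the flagged line looks fine, check the line or two before it for unbalanced brackets.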

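【Comments】:

Thanks, I will keep that in mind!
Another question about the rmse_training list. I append the values with rmse_training.append(np.std(S_new[target] - mu_new)), but once the loop finishes I only get the last value. How can I collect all of them in the while loop?
I am thinking of moving rmse_training outside the while loop. Do you think that would work?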
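Moving the list outside the loop should work. In the posted code, rmse_training = [] sits inside the while body, so the list is recreated on every pass and only the last appended value survives. Below is a minimal sketch of the pattern with stand-in data: the random targets, the noisy "predictions", and the starting size `n_train = 8` are assumptions for illustration, not the original model. `statistics.pstdev` is used so the example needs only the standard library. It computes the population standard deviation, which is also what np.std returns by default.

```python
import random
import statistics

random.seed(0)
targets = [random.gauss(0.0, 1.0) for _ in range(12)]  # stand-in target values

rmse_training = []   # created once, before the loop, so it survives every pass
n_train = 8          # hypothetical starting size of the training set

while n_train <= 11:
    # Placeholder for gpr.predict(...): the true value plus a little noise.
    predictions = [t + random.gauss(0.0, 0.1) for t in targets[:n_train]]
    residuals = [t - p for t, p in zip(targets[:n_train], predictions)]
    # Population standard deviation of the residuals, as np.std would give by default.
    rmse_training.append(statistics.pstdev(residuals))
    n_train += 1     # the training set grows by one item per pass

print(len(rmse_training))  # one entry per pass through the loop
print(rmse_training)
```

The same applies to rmse_test: initialize it once before the while statement and only append inside the loop.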