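【Posted】: 2021-09-13 22:52:36
【Question】:
I am trying to iterate over some values while the length of my dataset S_train stays within a limit (the loop below uses len(S_new) <= 11):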
S_new = train
T_new = test
mu_new = mu
mu_test_new = mu_test
while len(S_new) <= 11:
    ground_test = T_new[target].values.tolist()
    acquisition_function = abs(mu_test - ground_test)
    max_item = np.argmax(acquisition_function)  # step 3: value in test set that maximizes the abs difference of the energy
    alpha_al = test.iloc[[max_item]]  # identify the minimum step in test set
    S_new = S_new.append(alpha_al)
    len(S_new)
    T_new = T_new.drop(test.index[max_item])
    len(T_new)
    gpr = GaussianProcessRegressor(
        # kernel is the covariance function of the Gaussian process (GP)
        kernel=Normalization(  # normalizes a kernel using the cosine-of-angle formula, k_normalized(x,y) = k(x,y)/sqrt(k(x,x)*k(y,y))
            # graphdot.kernel.fix.Normalization(kernel); the wrapped kernel is a marginalized graph kernel, which is used to calculate the similarity between 2 graphs
            # implements the random walk-based graph similarity kernel of Kashima, H., Tsuda, K., & Inokuchi, A. (2003). Marginalized kernels between labeled graphs. ICML
            Tang2019MolecularKernel()
        ),
        alpha=1e-4,  # value added to the diagonal of the kernel matrix during fitting
        optimizer=True,  # default optimizer of L-BFGS-B based on scipy.optimize.minimize
        normalize_y=True,  # normalize the y values so that the mean and variance are 0 and 1, respectively. Will be reversed when predictions are returned
        regularization='+',  # alpha (1e-4 in this case) is added to the diagonal of the kernel matrix
    )
    start_time = time.time()
    gpr.fit(S_new.graphs, S_new[target], repeat=1, verbose=True)  # fit the train set graphs (independent variable) with train[target] as the dependent variable
    end_time = time.time()
    print("the total time consumption is " + str(end_time - start_time) + ".")
    gpr.kernel.hyperparameters
    rmse_training = []
    rmse_test = []
    mu_new = gpr.predict(S_new.graphs)
    print('Training set')
    print('MAE:', np.mean(np.abs(S_new[target] - mu_new)))
    print('RMSE:', np.std(S_new[target] - mu_new))
    rmse_training.append(np.std(S_new[target] - mu_new)
    mu_test_new = gpr.predict(T_new.graphs)
    print('Training set')
    print('MAE:', np.mean(np.abs(T_new[target] - mu_test_new)))
    print('RMSE:', np.std(T_new[target] - mu_test_new))
    rmse_test.append(np.std(T_new[target] - mu_test_new)
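Basically, I am finding the value in T_new that maximizes the absolute error between the i-th element of T_new and mu_test, adding it to the set S_train, and then removing it from T_new. With the new S_train I train my model again and then repeat the same procedure explained above. I have never used a while loop before; I checked the syntax and it looks correct to me, but I get the following error message: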
File "<ipython-input-55-d284ca5f9d1f>", line 42
mu_test_new = gpr.predict(T_new.graphs)
^
SyntaxError: invalid syntax
Do you know what is causing this? Any suggestions are much appreciated. Thanks as always for your help.
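The selection loop described in the question can be sketched as follows. This is a minimal illustration under stated assumptions, not the asker's graphdot setup: `fit_predict` is a hypothetical stand-in model (a straight-line least-squares fit via `np.polyfit`), and the column names `x` and `y`, the toy data, and `max_train_size` are invented for the example. It uses `pd.concat` because `DataFrame.append` was removed in pandas 2.0, and it looks up the selected row in the current pool `T_new` rather than in the original `test` frame, since positions shift once rows are dropped.

```python
import numpy as np
import pandas as pd


def fit_predict(train, pool, feature, target):
    """Hypothetical stand-in model: straight-line least-squares fit."""
    slope, intercept = np.polyfit(
        train[feature].to_numpy(), train[target].to_numpy(), deg=1
    )
    return slope * pool[feature].to_numpy() + intercept


def active_learning(train, pool, feature, target, max_train_size):
    S_new, T_new = train.copy(), pool.copy()
    rmse_test = []  # created once, outside the loop, so values accumulate
    while len(S_new) < max_train_size and len(T_new) > 0:
        # Refit on the current training set and predict the current pool
        mu_test = fit_predict(S_new, T_new, feature, target)
        error = np.abs(mu_test - T_new[target].to_numpy())
        rmse_test.append(float(np.sqrt(np.mean(error ** 2))))
        # Positional argmax -> index label within the *current* pool
        label = T_new.index[int(np.argmax(error))]
        S_new = pd.concat([S_new, T_new.loc[[label]]])
        T_new = T_new.drop(index=label)
    return S_new, T_new, rmse_test


# Toy data (assumption): y = x^2 sampled at 20 points
data = pd.DataFrame({"x": np.linspace(0.0, 1.0, 20)})
data["y"] = data["x"] ** 2
train, pool = data.iloc[:5], data.iloc[5:]

S_new, T_new, rmse_test = active_learning(
    train, pool, "x", "y", max_train_size=11
)
print(len(S_new), len(T_new), len(rmse_test))
```

Keeping the prediction, the ground truth, and the row lookup all tied to the same shrinking pool avoids length mismatches after the first row is moved.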
【Comments】:
-
You are missing a closing parenthesis on the previous line...
-
You are right. Thank you very much!
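As the comment notes, the `rmse_training.append(...)` call in the question lacks its closing parenthesis, and the `rmse_test.append(...)` line at the end has the same problem. The sketch below illustrates why the traceback can point at the line after the real mistake: the parser keeps reading past the line break while the parenthesis is still open. Python versions before 3.10 report "invalid syntax" at the following line, while 3.10 and later report "'(' was never closed" at the offending line. The source strings here are illustrative stand-ins, not the original code.

```python
# Three-line snippet whose second line is missing a closing parenthesis
broken = (
    "rmse_training = []\n"
    "rmse_training.append(abs(-1.0)\n"
    "mu_test_new = 2.0\n"
)
# Same snippet with the parenthesis restored
fixed = broken.replace("abs(-1.0)\n", "abs(-1.0))\n")


def syntax_error_line(source):
    """Return the line number Python blames, or None if the source compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None


broken_line = syntax_error_line(broken)
fixed_line = syntax_error_line(fixed)
print("broken:", broken_line, "fixed:", fixed_line)
```

When a SyntaxError lands on a line that looks fine, checking the bracket balance of the line just above it is usually the quickest fix.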
Tags: python machine-learning while-loop dataset