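[Posted]: 2023-01-25 18:27:35
[Question]:
I am trying to implement a physics-informed neural network (PINN). The differential term in the loss does bring some improvement in the (supposedly) unknown region, compared with a classical neural network. This region is actually known; I simply removed it from the training and test datasets to compare the PINN against other techniques. Here is the code I use: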
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# epochs, dataset, unknown_area_SOCP_tensor, K, L and R are defined elsewhere.
model = tf.keras.Sequential([
    layers.Dense(units=64, activation='relu', input_shape=(2,)),
    layers.Dense(units=64, activation='relu'),
    layers.Dense(units=1,)
])
optimizer = tf.keras.optimizers.Adam()
objective = tf.keras.losses.Huber()
metric = tf.keras.metrics.MeanAbsoluteError()
w_phys = 0.5
w_loss = 1.0 - w_phys
with tf.device('gpu:0'):
    for epoch in range(epochs):
        cumulative_loss_train = 0.0
        metric.reset_states()
        for mini_batch, gdth in dataset:
            with tf.GradientTape(persistent=True) as tape:
                tape.watch(unknown_area_SOCP_tensor)
                tape.watch(mini_batch)
                # Physics loss: derivative of the network output w.r.t. its inputs
                predictions_unkwon = model(unknown_area_SOCP_tensor, training=True)
                d_f = tape.gradient(predictions_unkwon, unknown_area_SOCP_tensor)
                # Physics part with P
                dp = tf.convert_to_tensor(1/((K*unknown_area_SOCP_tensor[:,0]+L)**2-4*R*unknown_area_SOCP_tensor[:,1]), dtype=np.float64)
                phys_loss_p = 10*tf.cast(tf.math.reduce_mean(tf.math.square(d_f[:,1]**2 - dp)), np.float32)
                # Traditional (data) loss
                predictions = model(mini_batch, training=True)
                loss = objective(gdth, predictions)
                # Combine inside the tape so the weighted sum is recorded
                total_loss = w_loss*loss + w_phys*phys_loss_p
            # Compute and apply gradients
            grads = tape.gradient(total_loss, model.trainable_variables)
            optimizer.apply_gradients(zip(grads, model.trainable_variables))
            cumulative_loss_train += loss
            metric.update_state(gdth, predictions)
            del tape  # release the persistent tape
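Reading the training step above term by term, the objective being minimized appears to be the following. This is only a transcription of the code, not an independent derivation of the physics. The symbols $x_1, x_2$ (the two input columns), $f$ (the network output), $y$ (the targets `gdth`) and $N$ (the number of points in `unknown_area_SOCP_tensor`) are notation introduced here for illustration.

```latex
\mathcal{L}
  = w_{\text{loss}} \,\mathrm{Huber}\bigl(y, f(x)\bigr)
  + w_{\text{phys}} \cdot \frac{10}{N} \sum_{i=1}^{N}
    \left[
      \left( \frac{\partial f}{\partial x_2}\bigl(x^{(i)}\bigr) \right)^{2}
      - \frac{1}{\bigl(K x_1^{(i)} + L\bigr)^{2} - 4 R\, x_2^{(i)}}
    \right]^{2}
```

Note that, as written in `d_f[:,1]**2 - dp`, the derivative is squared before it is compared with the physics term, and the difference is then squared again by `tf.math.square`.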
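So far, so good. K, R and L are fixed parameters. The next step is to assume they are unknown and see whether they can be learned. I first tried with the R parameter only. Here is the code I used: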
with tf.device('gpu:0'):
    for epoch in range(epochs):
        cumulative_loss_train = 0.0
        metric.reset_states()
        for mini_batch, gdth in dataset:
            with tf.GradientTape(persistent=True) as tape:
                tape.watch(unknown_area_SOCP_tensor)
                tape.watch(mini_batch)
                tape.watch(R)  # R is assumed to be a tf.Variable so the optimizer can update it
                # Physics loss: derivative of the network output w.r.t. its inputs
                predictions_unkwon = model(unknown_area_SOCP_tensor, training=True)
                d_f = tape.gradient(predictions_unkwon, unknown_area_SOCP_tensor)
                # Physics part with P
                dp = tf.convert_to_tensor(1/((K*unknown_area_SOCP_tensor[:,0]+L)**2-4*R*unknown_area_SOCP_tensor[:,1]), dtype=np.float64)
                phys_loss_p = 10*tf.cast(tf.math.reduce_mean(tf.math.square(d_f[:,1]**2 - dp)), np.float32)
                # Traditional (data) loss
                predictions = model(mini_batch, training=True)
                loss = objective(gdth, predictions)
                # Combine inside the tape so the weighted sum is recorded
                total_loss = w_loss*loss + w_phys*phys_loss_p
            # Compute and apply gradients for the network weights and for R
            variables = model.trainable_variables + [R]
            grads = tape.gradient(total_loss, variables)
            optimizer.apply_gradients(zip(grads, variables))
            cumulative_loss_train += loss
            metric.update_state(gdth, predictions)
            del tape  # release the persistent tape
But this leads to poor results (high loss and poor metrics). Worse, R must be positive, yet at the end of training it is estimated as a negative value...
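One common way to keep a learned parameter positive is to optimize an unconstrained raw value and map it through softplus. The sketch below is not the code from this question: it is a plain-Python toy, and `descend_plain`, `descend_softplus` and the objective `(R + 1)^2` are hypothetical names and choices made only to show the effect. The toy objective is picked so that gradient descent deliberately pulls R toward a negative value.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)); always > 0 for finite x in a sane range.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    # Derivative of softplus, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def toy_grad(R):
    # Hypothetical objective loss(R) = (R + 1)^2, whose unconstrained minimum is R = -1.
    return 2.0 * (R + 1.0)

def descend_plain(grad, R, lr=0.1, steps=200):
    # Gradient descent directly on R: nothing stops R from becoming negative.
    history = []
    for _ in range(steps):
        R -= lr * grad(R)
        history.append(R)
    return history

def descend_softplus(grad, raw_R, lr=0.1, steps=200):
    # Gradient descent on raw_R, with R = softplus(raw_R) > 0 by construction.
    history = []
    for _ in range(steps):
        R = softplus(raw_R)
        # Chain rule: dLoss/draw_R = dLoss/dR * dR/draw_R = grad(R) * sigmoid(raw_R)
        raw_R -= lr * grad(R) * sigmoid(raw_R)
        history.append(softplus(raw_R))
    return history

unconstrained = descend_plain(toy_grad, R=softplus(0.0))
constrained = descend_softplus(toy_grad, raw_R=0.0)
print("plain R:", unconstrained[-1])
print("softplus R:", constrained[-1])
```

In the TensorFlow loop, the analogous change would presumably be to train a raw variable and use `tf.math.softplus(raw)` wherever R appears in the physics term. Whether that also fixes the high loss depends on details not shown in the question.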
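I am fairly confident in the equation: I have checked it many times, and it appears accurate compared with the simulation software I use. The equation also adds value to the learning, since the predictions on the unknown region are much better.
Am I missing something here?
Thanks for your help!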
[Discussion]:
Tags: tensorflow machine-learning deep-learning tensorflow2.0 gradient-descent