【Title】: Compilation failure: Detected unsupported operations when trying to compile graph get_loss_cond_1_true_88089_rewritten[]
【Posted】: 2020-12-04 09:08:03
【Question】:

When I try to use a custom CRF loss function, I get the following error on a Google Colab TPU. I looked up the FakeParam op at https://cloud.google.com/tpu/docs/tensorflow-ops, and it appears to be available on Cloud TPU.

InvalidArgumentError: 9 root error(s) found.
  (0) Invalid argument: {{function_node __inference_train_function_104228}} Compilation failure: Detected unsupported operations when trying to compile graph get_loss_cond_1_true_88089_rewritten[] on XLA_TPU_JIT: FakeParam (No registered 'FakeParam' OpKernel for XLA_TPU_JIT devices compatible with node {{node get_loss/cond_1/FakeParam_15}}
     (OpKernel was found, but attributes didn't match) Requested Attributes: dtype=DT_VARIANT, shape=[]){{node get_loss/cond_1/FakeParam_15}}
     [[get_loss/cond_1]]
     TPU compilation failed
     [[tpu_compile_succeeded_assert/_12238515605435969423/_6]]
     [[tpu_compile_succeeded_assert/_12238515605435969423/_6/_279]]
  (1) to (3) Invalid argument: same message as (0), ending in [[tpu_compile_succeeded_assert/_12238515605435969423/_6/_223]], [[.../_6/_265]] and [[.../_6/_251]] respectively
  (4) Invalid argument: same message as (0) up to Requested Attributes: dtype=DT_VARIANT, shape=[ ... [truncated]

Here is my code:

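import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from transformers import TFAutoModel
# Assumed import path, matching the crf.py module linked below
from tensorflow_addons.layers.crf import CRF

# labels_ner, strategy, x_tr and y_tr are defined elsewhere in the notebook

def make_model():
  # Token ids and attention mask, both padded to 100 tokens
  input_ids_in = tf.keras.layers.Input(shape=(100,), name='input_token', dtype=tf.int32)
  input_mask_in = tf.keras.layers.Input(shape=(100,), name='input_mask', dtype=tf.int32)
  bert_model = TFAutoModel.from_pretrained("dbmdz/bert-base-turkish-cased")
  # [0] selects the per-token hidden states of the last BERT layer
  embedding_layer = bert_model(input_ids_in, attention_mask=input_mask_in)[0]
  model = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(50, trainable=False,
                            return_sequences=True))(embedding_layer)
  model = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(len(labels_ner), activation="relu"))(model)

  crf = CRF(len(labels_ner))  # CRF layer
  out = crf(model)  # output
  model = Model([input_ids_in, input_mask_in], out)
  model.compile('adam', loss=crf.get_loss)

  print("Baseline/LSTM-CRF model built: ")
  return model

with strategy.scope():
  model = make_model()
  # Labels are one-hot encoded, so argmax converts them to class indices
  model.fit(x_tr, np.argmax(y_tr, axis=-1), batch_size=32, epochs=5, verbose=1, validation_split=0.1)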

I am using this tensorflow_addons crf.py module: https://github.com/howl-anderson/addons/blob/feature/crf_layers/tensorflow_addons/layers/crf.py

Thanks

【Comments】:

    Tags: tensorflow tf.keras tpu


    【Answer 1】:

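    It looks like FakeParam only supports the following dtypes: {bfloat16, bool, complex64, float, int32, int64, uint32, uint64}. dtype=DT_VARIANT is not among them, which is why the error reports that an OpKernel was found but its attributes didn't match.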
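    The mismatch described above comes down to a set-membership check. The snippet below is only an illustrative sketch in plain Python: the set is copied from the list in this answer, and `fakeparam_supported` is a hypothetical helper, not a TensorFlow API.

```python
# dtypes this answer lists as supported by FakeParam on XLA_TPU_JIT
SUPPORTED_FAKEPARAM_DTYPES = frozenset({
    "bfloat16", "bool", "complex64", "float",
    "int32", "int64", "uint32", "uint64",
})


def fakeparam_supported(requested_dtype: str) -> bool:
    """Hypothetical helper: is the requested dtype in the supported set?

    Accepts names with or without the "DT_" prefix used in the error log.
    """
    name = requested_dtype.lower()
    if name.startswith("dt_"):
        name = name[3:]
    return name in SUPPORTED_FAKEPARAM_DTYPES


# DT_VARIANT is the dtype requested by get_loss/cond_1/FakeParam_15
print(fakeparam_supported("DT_VARIANT"))
print(fakeparam_supported("DT_INT32"))
```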

    Enabling automatic outside compilation in TF2 should fix this. Add this line somewhere in your code: tf.config.set_soft_device_placement(True)
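    As a rough sketch of where that line could go, one option is to set it at the top of the notebook, before the TPU strategy and the model are created. The helper name `make_tpu_strategy` and the empty `tpu_address` default are assumptions for illustration, and `tf.distribute.TPUStrategy` assumes a reasonably recent TF2 release.

```python
import tensorflow as tf

# Allow ops that have no matching TPU kernel to run on the host CPU
# instead of failing TPU compilation.
tf.config.set_soft_device_placement(True)


def make_tpu_strategy(tpu_address: str = ""):
    """Hypothetical helper: connect to the TPU and return a TPUStrategy.

    Only call this in an environment that actually has a TPU attached.
    """
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    return tf.distribute.TPUStrategy(resolver)
```

    With this in place, `strategy = make_tpu_strategy()` followed by the `with strategy.scope():` block from the question would build and train the model as before.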

    【Comments】:
