【问题标题】:Train Hugging face AutoModel defined using AutoConfig训练使用 AutoConfig 定义的拥抱脸 AutoModel
【发布时间】:2026-02-02 06:45:01
【问题描述】:

我在transformers 中定义了模型的配置。后来,我使用这个配置来初始化分类器如下

from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained('bert-base-uncased')
classifier = AutoModel.from_config(config)

我已经检查了该类可用的函数列表

>>> dir(classifier)

>>>
['add_memory_hooks',
 'add_module',
 'adjust_logits_during_generation',
 'apply',
 'base_model',
 'base_model_prefix',
 'beam_sample',
 'beam_search',
 'bfloat16',
 'buffers',
 'children',
 'config',
 'config_class',
 'cpu',
 'cuda',
 'device',
 'double',
 'dtype',
 'dummy_inputs',
 'dump_patches',
 'embeddings',
 'encoder',
 'estimate_tokens',
 'eval',
 'extra_repr',
 'float',
 'floating_point_ops',
 'forward',
 'from_pretrained',
 'generate',
 'get_buffer',
 'get_extended_attention_mask',
 'get_head_mask',
 'get_input_embeddings',
 'get_output_embeddings',
 'get_parameter',
 'get_position_embeddings',
 'get_submodule',
 'gradient_checkpointing_disable',
 'gradient_checkpointing_enable',
 'greedy_search',
 'group_beam_search',
 'half',
 'init_weights',
 'invert_attention_mask',
 'is_parallelizable',
 'load_state_dict',
 'load_tf_weights',
 'modules',
 'name_or_path',
 'named_buffers',
 'named_children',
 'named_modules',
 'named_parameters',
 'num_parameters',
 'parameters',
 'pooler',
 'prepare_inputs_for_generation',
 'prune_heads',
 'push_to_hub',
 'register_backward_hook',
 'register_buffer',
 'register_forward_hook',
 'register_forward_pre_hook',
 'register_full_backward_hook',
 'register_parameter',
 'requires_grad_',
 'reset_memory_hooks_state',
 'resize_position_embeddings',
 'resize_token_embeddings',
 'retrieve_modules_from_names',
 'sample',
 'save_pretrained',
 'set_input_embeddings',
 'share_memory',
 'state_dict',
 'supports_gradient_checkpointing',
 'tie_weights',
 'to',
 'to_empty',
 'train',
 'training',
 'type',
 'xpu',
 'zero_grad']

除此之外,只有train 方法似乎相关。但是,在检查该函数的文档字符串后,我得到了

>>> print(classifier.train.__doc__)
>>> Sets the module in training mode.

        This has any effect only on certain modules. See documentations of
        particular modules for details of their behaviors in training/evaluation
        mode, if they are affected, e.g. :class:`Dropout`, :class:`BatchNorm`,
        etc.

        Args:
            mode (bool): whether to set training mode (``True``) or evaluation
                         mode (``False``). Default: ``True``.

        Returns:
            Module: self

如何在自定义数据集上训练这个分类器(最好在transformerstensorflow 中)?

【问题讨论】:

    标签: deep-learning nlp tensorflow2.0 huggingface-transformers


    【解决方案1】:

    以上代码中需要TFAutoModel

    from transformers import AutoConfig, TFAutoModel
    
    config = AutoConfig.from_pretrained('bert-base-uncased')
    
    model = TFAutoModel.from_config(config)
    
    model.compile(
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.RMSprop(),
        metrics=["accuracy"],
    )
    

    然后,我们调用model.fitmodel.predict 函数在自定义数据集上进行训练

    【讨论】: