[Question Title]: Force BERT transformer to use CUDA
[Posted]: 2021-08-29 02:27:07
[Question]:

I want to force the Huggingface transformer (BERT) to use CUDA. nvidia-smi shows that all my CPU cores are maxed out during code execution, but my GPU utilization stays at 0%. Unfortunately, I'm new to the Huggingface library and to PyTorch, and I don't know where to place the CUDA attributes device = cuda:0 or .to(cuda:0).

The code below is essentially a customized part of the german sentiment BERT working example:

from typing import List

import torch as pt
from transformers import AutoModelForSequenceClassification, BertTokenizerFast


class SentimentModel_t(pt.nn.Module):
    def __init__(self, model_name: str = "oliverguhr/german-sentiment-bert"):
        super(SentimentModel_t, self).__init__()
        DEVICE = "cuda:0" if pt.cuda.is_available() else "cpu"
        print(DEVICE)

        self.model = AutoModelForSequenceClassification.from_pretrained(model_name).to(DEVICE)
        self.tokenizer = BertTokenizerFast.from_pretrained(model_name)

    def predict_sentiment(self, texts: List[str]) -> List[str]:
        texts = [self.clean_text(text) for text in texts]
        # add_special_tokens takes care of adding [CLS], [SEP], <s>... tokens in the right way for each model.
        input_ids = self.tokenizer.batch_encode_plus(texts, padding=True, add_special_tokens=True,
                                                     truncation=True,
                                                     max_length=self.tokenizer.max_len_single_sentence)
        input_ids = pt.tensor(input_ids["input_ids"])

        with pt.no_grad():
            logits = self.model(input_ids)

        label_ids = pt.argmax(logits[0], axis=1)

        labels = [self.model.config.id2label[label_id] for label_id in label_ids.tolist()]
        return labels

Edit: After applying @KonstantinosKokos's suggestion (see the edited code above), I get a

RuntimeError: Input, output and indices must be on the current device

pointing to

        with pt.no_grad():
           logits = self.model(input_ids)

The full traceback is given below:

<ipython-input-15-b843edd87a1a> in predict_sentiment(self, texts)
     23 
     24         with pt.no_grad():
---> 25             logits = self.model(input_ids)
     26 
     27         label_ids = pt.argmax(logits[0], axis=1)

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels, output_attentions, output_hidden_states, return_dict)
   1364         return_dict = return_dict if return_dict is not None else self.config.use_return_dict
   1365 
-> 1366         outputs = self.bert(
   1367             input_ids,
   1368             attention_mask=attention_mask,

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, output_attentions, output_hidden_states, return_dict)
    859         head_mask = self.get_head_mask(head_mask, self.config.num_hidden_layers)
    860 
--> 861         embedding_output = self.embeddings(
    862             input_ids=input_ids, position_ids=position_ids, token_type_ids=token_type_ids, inputs_embeds=inputs_embeds
    863         )

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py in forward(self, input_ids, token_type_ids, position_ids, inputs_embeds)
    196 
    197         if inputs_embeds is None:
--> 198             inputs_embeds = self.word_embeddings(input_ids)
    199         token_type_embeddings = self.token_type_embeddings(token_type_ids)
    200 

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/torch/nn/modules/sparse.py in forward(self, input)
    122 
    123     def forward(self, input: Tensor) -> Tensor:
--> 124         return F.embedding(
    125             input, self.weight, self.padding_idx, self.max_norm,
    126             self.norm_type, self.scale_grad_by_freq, self.sparse)

~/PycharmProjects/Test_project/venv/lib/python3.8/site-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   1850         # remove once script supports set_grad_enabled
   1851         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1852     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   1853 
   1854 

[Question Comments]:

    Tags: python pytorch huggingface-transformers transformer


    [Solution 1]:

    You can have the whole class inherit from torch.nn.Module, like so:

    class SentimentModel_t(torch.nn.Module):
        def __init__(self, ...):
            super(SentimentModel_t, self).__init__()
            ...
    

    Once the model is initialized, you can call .to(device) to cast it to the device of your choice, like so:

    sentiment_model = SentimentModel_t(...)
    sentiment_model.to('cuda')
    

    .to() applies recursively to all submodules of the class, model being one of them (the Huggingface models inherit from torch.nn.Module and thus provide an implementation of to()). Note that this makes choosing the device in __init__() redundant: it is now an external context that you can easily switch to and from.
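The recursive behaviour of .to() can be sketched with a toy module (a minimal sketch: the Linear layer below merely stands in for the wrapped Huggingface model, and we cast dtype instead of device so it runs without a GPU; .to("cuda") recurses in exactly the same way):

```python
import torch

class Wrapper(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-in for the wrapped Huggingface model.
        self.inner = torch.nn.Linear(2, 2)

wrapper = Wrapper()
# .to() on the parent recurses into every submodule, so self.inner
# is cast as well -- no per-submodule calls are needed.
wrapper.to(torch.float64)
print(wrapper.inner.weight.dtype)  # torch.float64
```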


    或者,您可以通过将包含的 BERT 模型直接转换为 cuda 来对设备进行硬编码(不太优雅):

    class SentimentModel_t():
        def __init__(self, ...):
            DEVICE = "cuda:0" if pt.cuda.is_available() else "cpu"
            print(DEVICE)

            self.model = AutoModelForSequenceClassification.from_pretrained(model_name).to(DEVICE)

    [Comments]:

    • Thanks for the quick suggestion! I've tried both of your solutions; unfortunately I get a "RuntimeError: Input, output and indices must be on the current device". I've edited the original question.
    • That's a different problem, unrelated to your original one. Honestly, the error description says it all: your model's device and your input's are not the same. You can fix this by casting the input to the appropriate device, again using to() (it works on both Tensor and Module objects), i.e. input_ids = input_ids.to("cuda")
    • Yes, that did it! Thank you, I learned something today.
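Putting the comments above together, the device rule can be sketched minimally (a plain Embedding layer stands in for the BERT model here, so the sketch runs without downloading anything; the principle is identical):

```python
import torch

# Pick the device once; fall back to CPU when CUDA is absent.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Stand-in for the BERT model; .to(device) moves its weights.
model = torch.nn.Embedding(num_embeddings=100, embedding_dim=8).to(device)

# The crucial step: move the inputs to the *same* device as the model.
# Leaving them on the CPU while the model is on CUDA raises
# "RuntimeError: Input, output and indices must be on the current device".
input_ids = torch.tensor([[1, 2, 3]]).to(device)

with torch.no_grad():
    logits = model(input_ids)

print(logits.shape)  # torch.Size([1, 3, 8])
```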
    [Solution 2]:

    I'm a bit late to the party. The Python package I wrote already uses your GPU. You can have a look at the code to see how it was implemented.

    Just install the package:

    pip install germansentiment
    

    and run the code:

    from germansentiment import SentimentModel
    
    model = SentimentModel()
    
    texts = [
        "Mit keinem guten Ergebniss","Das ist gar nicht mal so gut",
        "Total awesome!","nicht so schlecht wie erwartet",
        "Der Test verlief positiv.","Sie fährt ein grünes Auto."]
    
    result = model.predict_sentiment(texts)
    print(result)
    

    Important: If you write your own code to use the model, you also need to run the preprocessing code; otherwise the results may be off.
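The package's actual preprocessing is not reproduced here, but a clean_text helper of the kind the question's code calls might look like the following (purely illustrative: the regex rules are assumptions, not the germansentiment package's real implementation):

```python
import re

def clean_text(text: str) -> str:
    """Illustrative text cleanup. The real preprocessing lives in the
    germansentiment package; these rules are assumptions for the sketch."""
    text = re.sub(r"https?://\S+", "", text)  # drop URLs
    text = re.sub(r"\s+", " ", text)          # collapse runs of whitespace
    return text.strip()

print(clean_text("Das ist   gar nicht mal so gut http://example.com"))
# -> Das ist gar nicht mal so gut
```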

    [Comments]:
