【Question title】:CUDA out of memory problem on the GPU in Google Colab
【Posted】:2022-03-31 21:50:12
【Question】:

I am trying to run code that builds stacked embeddings from Flair and BERT, but I get the error below. One of the suggestions was to reduce the batch size, but how do I pass the data in batches? Here are the code and the error.

from tqdm import tqdm  # tracks progress of the loop
import torch
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, TransformerDocumentEmbeddings
from flair.embeddings import DocumentPoolEmbeddings

# flair_forward / flair_backward are assumed to be FlairEmbeddings
# (their construction is not shown in the question)
flair_forward = FlairEmbeddings('news-forward')
flair_backward = FlairEmbeddings('news-backward')
bert_embeddings = TransformerDocumentEmbeddings('bert-base-uncased')

### initialize the document embeddings, mode = mean ###
document_embeddings = DocumentPoolEmbeddings([
                                             flair_forward,
                                             flair_backward,
                                             bert_embeddings
                                             ])

# Storing size of embedding #
# (a sample sentence has to be embedded first so its dimension is known;
# this step is implied but not shown in the question)
sentence = Sentence(txt[0])
document_embeddings.embed(sentence)
z = sentence.embedding.size()[0]
print(z)
### Vectorising text ###
# creating a tensor for storing sentence embeddings
sen = torch.zeros(0, z)
print(sen)

# iterating Sentences #
for tweet in tqdm(txt):   
  sentence = Sentence(tweet)
  document_embeddings.embed(sentence)# *****this line is giving error*****
  # Adding Document embeddings to list #
  if(torch.cuda.is_available()):
    sen = sen.cuda()
  sen = torch.cat((sen, sentence.embedding.view(-1,z)),0)

Here is the error I get.

RuntimeError                              Traceback (most recent call last)
<ipython-input-24-1eee00445350> in <module>()
     24 for tweet in tqdm(txt):
     25   sentence = Sentence(tweet)
---> 26   document_embeddings.embed(sentence)
     27   # Adding Document embeddings to list #
     28   if(torch.cuda.is_available()):

7 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
    580         if batch_sizes is None:
    581             result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
--> 582                               self.dropout, self.training, self.bidirectional, self.batch_first)
    583         else:
    584             result = _VF.lstm(input, batch_sizes, hx, self._flat_weights, self.bias,

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 7.43 GiB total capacity; 6.54 GiB already allocated; 10.94 MiB free; 6.70 GiB reserved in total by PyTorch)

【Comments】:

  • What are the dimensions of the `sentence` variable?
  • The shape of the sentence variable is [6424, 2864].
  • So you have 6424 examples?
  • Yes @AshwinGeetD'Sa
  • Try passing a small number of samples to document_embeddings, e.g. document_embeddings.embed(sentence[:32]) (here you pass 32 samples). Find a small number of samples that can be embedded without any error; you can even reduce it down to 1. This number is what is called the batch size.
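
The batch-size suggestion in the last comment can be sketched as follows. `txt`, `Sentence`, `document_embeddings`, and `torch` are the objects from the question; `batches` and the batch size of 32 are illustrative assumptions:

```python
def batches(items, batch_size):
    """Yield successive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Applied to the question's loop (not run here; needs the flair objects):
# chunks = []
# for chunk in batches(txt, 32):
#     sents = [Sentence(t) for t in chunk]
#     document_embeddings.embed(sents)  # embed one small batch at a time
#     chunks.append(torch.stack([s.embedding.detach().cpu() for s in sents]))
#     for s in sents:
#         s.clear_embeddings()          # release per-sentence GPU tensors
# sen = torch.cat(chunks, 0)
```

Moving each batch's embeddings to the CPU with `.detach().cpu()` and clearing them afterwards keeps only one batch's activations on the GPU at a time; shrink the batch size further if the error persists.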

Tags: embedding bert-language-model flair


【Solution 1】:

Pass a smaller chars_per_chunk when constructing the Flair embeddings:

embeddings = FlairEmbeddings('news-forward', chars_per_chunk=128)

For example:

embedding_types = [
    WordEmbeddings('glove'),
    FlairEmbeddings('news-forward', chars_per_chunk=128),
    FlairEmbeddings('news-backward'),
]

Edit your code accordingly (chars_per_chunk is a newer option flair added to avoid exactly this situation) and see if it works. In my own experience, Google Colab is not well suited to running large transformer models for tasks such as NER.

For more examples for your specific task, see the documentation on GitHub!
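
Putting this answer's chars_per_chunk suggestion together with the stacked setup from the question might look like the sketch below (model names as in the answer; this is setup code that downloads the models, so it is not executed here):

```python
from flair.embeddings import (FlairEmbeddings, TransformerDocumentEmbeddings,
                              DocumentPoolEmbeddings)

# A smaller chars_per_chunk trades speed for a lower peak memory footprint
# inside the character language model; 128 is the value from the answer.
flair_forward = FlairEmbeddings('news-forward', chars_per_chunk=128)
flair_backward = FlairEmbeddings('news-backward', chars_per_chunk=128)
bert_embeddings = TransformerDocumentEmbeddings('bert-base-uncased')

document_embeddings = DocumentPoolEmbeddings([
    flair_forward,
    flair_backward,
    bert_embeddings,
])
```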

【Discussion】:
