【发布时间】:2022-09-27 16:31:40
【问题描述】:
我正在尝试运行以下代码以使用自定义整理函数将数据集加载到 PyTorch 数据集类中并映射它们,但我收到错误消息。该数据集由 123061 个数据样本组成,因此在下面的代码中我只使用了 10 个样本。如果我使用总数据集,那么我会收到ValueError: 123061 is not in range 的错误。那么我到底在哪里做错了?
class Dataclass(Dataset):
def __init__(self,dataset):
self.dataset = dataset
def __len__(self):
return len(self.dataset)
def __getitem__(self, idx):
solute = self.dataset.loc[idx][\'Drug1_SMILES\']
mol = Chem.MolFromSmiles(solute)
mol = Chem.AddHs(mol)
solute = Chem.MolToSmiles(mol)
solute_graph = get_graph_from_smile(solute)
solvent = self.dataset.loc[idx][\'Drug2_SMILES\']
mol = Chem.MolFromSmiles(solvent)
mol = Chem.AddHs(mol)
solvent = Chem.MolToSmiles(mol)
solvent_graph = get_graph_from_smile(solvent)
delta_g = self.dataset.loc[idx][\'label\']
return [solute_graph, solvent_graph]
tg = Dataclass(train_df[:10])
solute_graphs, solvent_graphs, labels = map(list, zip(*tg))
Error
ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/range.py in get_loc(self, key, method, tolerance)
384 try:
--> 385 return self._range.index(new_key)
386 except ValueError as err:
ValueError: 10 is not in range
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
6 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/range.py in get_loc(self, key, method, tolerance)
385 return self._range.index(new_key)
386 except ValueError as err:
--> 387 raise KeyError(key) from err
388 raise KeyError(key)
389 return super().get_loc(key, method=method, tolerance=tolerance)
KeyError: 10
标签: python dataframe pytorch pytorch-dataloader collate