Same code, same library, but why does my training run slower on a new laptop than on an old one?

【Question title】: Same code, same library, but why my training runs slower in a new laptop compare to an old laptop
【Posted】: 2021-08-18 20:28:42
【Question】:

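Here is the background:

I don't know much about deep learning, and I am not the person who wrote the code. I am following someone else's procedure to test an AI. I tried the same procedure on 3 different laptops. I expected the laptop with better hardware to train faster, but that is not what happened.

Judging from the code, it uses Keras with a TensorFlow backend.

I did some research and tried to speed things up, for example by using the GPU. But then I found that the GPU load on both laptops stays between 0% and 1%. It seems that neither laptop is using the GPU.

So I thought TensorFlow might not be recognizing the GPU, and I tried tensorflow-gpu and installed CUDA and cuDNN...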

>>> from tensorflow.python.client import device_lib
>>> print(device_lib.list_local_devices())
2021-08-18 17:17:00.307495: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2021-08-18 17:17:00.312631: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2021-08-18 17:17:00.364157: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.645GHz coreCount: 16 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2021-08-18 17:17:00.364352: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-08-18 17:17:00.397938: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-08-18 17:17:00.427946: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-08-18 17:17:00.435072: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-08-18 17:17:00.478467: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-08-18 17:17:00.495200: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-08-18 17:17:00.559633: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-08-18 17:17:00.560557: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2021-08-18 17:17:04.129809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-18 17:17:04.129968: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2021-08-18 17:17:04.130734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2021-08-18 17:17:04.132802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 6788 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 2340425778646607054
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 7118530151
locality {
  bus_id: 1
  links {
  }
}
incarnation: 4718765836722936952
physical_device_desc: "device: 0, name: NVIDIA GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1"
]

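Even though tensorflow-gpu appears to recognize the GPU, training did not get any faster, and the laptop with an older CPU and no GPU is actually faster.

The new laptop runs at about 1 it/s, while the old laptop runs at 9 it/s. I also have an even older laptop that runs at 5~6 it/s.

To train on the 14 GB dataset, I estimate it would take 30 days on the old laptop and possibly 45 days on the new one.

What puzzles me is this: with the same code and the same libraries, isn't hardware the next thing that affects training speed? Or am I misunderstanding something?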
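As a rough cross-check of the throughput and duration figures quoted in the question, the sketch below converts between iterations per second and training days. It assumes throughput stays constant for the whole run, which real training rarely does, and the helper names `total_iterations` and `days_needed` are made up for illustration. The total iteration count is not stated in the question, so it is derived here from the "30 days at 9 it/s" figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_iterations(its_per_sec: float, days: float) -> float:
    """Iterations completed in `days` at a constant rate of `its_per_sec`."""
    return its_per_sec * SECONDS_PER_DAY * days


def days_needed(iterations: float, its_per_sec: float) -> float:
    """Days required to finish `iterations` at a constant rate of `its_per_sec`."""
    return iterations / (its_per_sec * SECONDS_PER_DAY)


# Assumption: the workload is whatever the old laptop finishes in 30 days at 9 it/s.
workload = total_iterations(9.0, 30.0)

# 5.5 it/s is the midpoint of the quoted 5~6 it/s range.
for label, rate in [("old laptop", 9.0), ("older laptop", 5.5), ("new laptop", 1.0)]:
    print(f"{label}: {days_needed(workload, rate):.0f} days at {rate} it/s")
```

Under that constant-rate assumption, a 9x gap in it/s translates into a 9x gap in days, so the measured rates and the 30-day and 45-day estimates cannot both hold for the same workload.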

【Comments】:

Did you also change the code to specify which parts should run on the GPU?

Tags: tensorflow artificial-intelligence hardware

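【Answer 1】:

If you want a particular operation to run on a device of your choice rather than the one selected for you automatically, you can use tf.device to create a device context; all operations within that context run on the same specified device.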

import tensorflow as tf

# Log the device each operation is placed on
tf.debugging.set_log_device_placement(True)

# Place tensors on the CPU
with tf.device('/CPU:0'):
  a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Outside the tf.device block, so placement is automatic
# (the GPU is chosen when one is available)
c = tf.matmul(a, b)
print(c)
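To confirm where an operation actually ran, rather than only reading the placement log, one option is to inspect the `device` attribute of the resulting tensor. The following is a minimal sketch, assuming TensorFlow 2.1 or later with eager execution; the fallback to `/CPU:0` when no GPU is visible is an addition for illustration and is not part of the answer above.

```python
import tensorflow as tf

# List the GPUs TensorFlow can see; an empty list means ops will run on the CPU.
gpus = tf.config.list_physical_devices('GPU')
print("Visible GPUs:", gpus)

# Use the first GPU when one is visible, otherwise fall back to the CPU.
device_name = '/GPU:0' if gpus else '/CPU:0'

with tf.device(device_name):
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    c = tf.matmul(a, b)

# In eager mode, each tensor records the device that produced it.
print("matmul ran on:", c.device)
print(c.numpy())
```

If the tensor reports a GPU device but GPU utilization during training still stays near 0%, the bottleneck may lie elsewhere, for example in data loading. That is only a guess here, since the training code is not shown in the question.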

【Comments】: