【Question title】: Overloading "+" operator for Thrust, any ideas?
【Posted】: 2013-05-30 03:58:28
【Question description】:

I am using CUDA and Thrust. I find typing thrust::transform [plus/minus/divide] tedious, so I would like to overload a few simple operators.

It would be great if I could write:

thrust::[host/device]_vector<float> host;
thrust::[host/device]_vector<float> otherHost;
thrust::[host/device]_vector<float> result = host + otherHost;

Here is an example snippet for +:

template <typename T>
__host__ __device__ T& operator+(T &lhs, const T &rhs) {
    thrust::transform(rhs.begin(), rhs.end(),
                      lhs.begin(), lhs.end(), thrust::plus<?>());
    return lhs;
}

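However, either thrust::plus<?> is not being overloaded correctly, or I am not doing it correctly... one of the two. (If overloading simple operators for this is a bad idea, please explain why.) Initially, I thought I could fill the ? placeholder with something like typename T::iterator, but that did not work.

I am not sure how to overload the + operator using both the type of the vector and the type of the vector's iterator. Does that make sense?

Thanks for your help!

【Question comments】:

What does <?> mean?
@Elazar It means I don't know what to put there. Maybe some kind of T::iterator type or something.
You could say "I don't know what I should put as the template argument of thrust::plus".
I am trying to get the template argument for thrust::plus from the template argument of thrust::host_device. If that is unclear, feel free to edit the question yourself.
You might be interested in this library, which already implements what you are trying to do.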

Tags: c++ cuda operator-keyword nvidia nvcc

【Solution 1】:

This seems to work; others may have better ideas:

#include <iostream>   // std::cout, std::endl
#include <iterator>   // std::ostream_iterator
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/functional.h>
#include <thrust/copy.h>
#include <thrust/fill.h>

#define DSIZE 10


// In-place add on the device: lhs is overwritten with lhs + rhs,
// and a copy of the updated lhs is returned.
template <typename T>
thrust::device_vector<T>  operator+(thrust::device_vector<T> &lhs, const thrust::device_vector<T> &rhs) {
    thrust::transform(rhs.begin(), rhs.end(),
                      lhs.begin(), lhs.begin(), thrust::plus<T>());
    return lhs;
}

// Same operation for host vectors.
template <typename T>
thrust::host_vector<T>  operator+(thrust::host_vector<T> &lhs, const thrust::host_vector<T> &rhs) {
    thrust::transform(rhs.begin(), rhs.end(),
                      lhs.begin(), lhs.begin(), thrust::plus<T>());
    return lhs;
}
int main() {


  thrust::device_vector<float> dvec(DSIZE);
  thrust::device_vector<float> otherdvec(DSIZE);
  thrust::fill(dvec.begin(), dvec.end(), 1.0f);
  thrust::fill(otherdvec.begin(), otherdvec.end(), 2.0f);
  thrust::host_vector<float> hresult1 = dvec + otherdvec;

  std::cout << "result 1: ";
  thrust::copy(hresult1.begin(), hresult1.end(), std::ostream_iterator<float>(std::cout, " "));  std::cout << std::endl;

  thrust::host_vector<float> hvec(DSIZE);
  thrust::fill(hvec.begin(), hvec.end(), 5.0f);
  thrust::host_vector<float> hresult2 = hvec + hresult1;


  std::cout << "result 2: ";
  thrust::copy(hresult2.begin(), hresult2.end(), std::ostream_iterator<float>(std::cout, " "));  std::cout << std::endl;

  // this line would produce a compile error:
  // thrust::host_vector<float> hresult3 = dvec + hvec;

  return 0;
}

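Note that in either case I can assign the result to a host vector or a device vector, because Thrust sees the difference and automatically generates the necessary copy operation. So the type of the result vector (host or device) does not have to match the vector type used in my templates.

Also note that the thrust::transform function arguments in your template definition were not quite right.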
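
For reference, the binary form of thrust::transform takes (first1, last1, first2, result, op): the fourth argument is an output iterator, not a second end iterator, and the functor's template argument is the element type rather than an iterator type. Below is a minimal host-only sketch of that shape. It uses std::vector, std::transform and std::plus as stand-ins so that it builds without the CUDA toolkit; the name add_in_place is hypothetical, and it is an assumption here that swapping in thrust::transform and thrust::plus is a mechanical change.

```cpp
#include <algorithm>   // std::transform
#include <cassert>     // assert
#include <functional>  // std::plus
#include <vector>

// Hypothetical helper: element-wise lhs += rhs for any vector-like type
// that exposes value_type, begin(), end() and size().
template <typename Vec>
Vec& add_in_place(Vec& lhs, const Vec& rhs) {
    // The element type answers the "?" in the question: it is
    // Vec::value_type, not Vec::iterator.
    typedef typename Vec::value_type value_type;

    assert(lhs.size() == rhs.size());

    // Argument order: first1, last1, first2, result, op.
    // Passing lhs.begin() as the result iterator makes the operation in-place.
    std::transform(rhs.begin(), rhs.end(),
                   lhs.begin(), lhs.begin(),
                   std::plus<value_type>());
    return lhs;
}
```

Called as add_in_place(a, b), this leaves b untouched and stores the element-wise sum in a.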

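【Comments】:

Ah, I think this will work. I'll check when I get to work in the morning.
One more thing I should point out: (I think) you were proposing an in-place transform (which is why you suggested returning lhs, and I followed your convention), but the effect is that one of the two operands (lhs) gets overwritten. So this can produce the slightly unintuitive behavior that result = vec1 + vec2; puts the answer in both result and vec1.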
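
One possible way to avoid that surprise is to take both operands by const reference and write into a freshly allocated result. This is a sketch of the idea only, not what the answer above does. It again uses std::vector as a host-only stand-in for thrust::host_vector (an assumption made so it builds without CUDA); the same signature shape should carry over to the Thrust vector types.

```cpp
#include <algorithm>   // std::transform
#include <cassert>     // assert
#include <functional>  // std::plus
#include <vector>

// Non-mutating element-wise add: neither operand is modified.
template <typename T>
std::vector<T> operator+(const std::vector<T>& lhs, const std::vector<T>& rhs) {
    assert(lhs.size() == rhs.size());

    // Separate output buffer, so lhs is only read from.
    std::vector<T> result(lhs.size());
    std::transform(lhs.begin(), lhs.end(),
                   rhs.begin(), result.begin(),
                   std::plus<T>());
    return result;
}
```

With this version, result = vec1 + vec2; leaves vec1 and vec2 as they were, at the cost of one extra allocation per expression.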