OpenMP and MPI hybrid dynamic scheduling: answers

【Question title】: OpenMP and MPI hybrid dynamic scheduling
【Posted】: 2015-12-13 02:46:26
【Question description】:

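As the number of threads increases, the count in "temp" decreases. When I set the number of threads to 1 the program gives the correct answer, but as the number of threads increases the run time gets shorter and the answer becomes wrong.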

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <complex.h>
#include <time.h>
#include <omp.h>

#define MAXITERS 1000

// globals
int count = 0;
int nptsside;
float side2;
float side4;
int temp = 0;

int inset(double complex c) {
   int iters;
   float rl,im;
   double complex z = c;
   for (iters = 0; iters < MAXITERS; iters++) { 
      z = z*z + c;
      rl = creal(z);
      im = cimag(z);
      if (rl*rl + im*im > 4) return 0;
   }
   return 1;
}

int main(int argc, char **argv)
{
   nptsside = atoi(argv[1]);
   side2 = nptsside / 2.0;
   side4 = nptsside / 4.0;

   //struct timespec bgn,nd;
   //clock_gettime(CLOCK_REALTIME, &bgn);

   int x,y; float xv,yv;
  double complex z;
  int i;
  int mystart, myend;
  int nrows;
  int nprocs, mype;
  int data;


  MPI_Status status;
  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &mype);
  nrows = nptsside/nprocs;
  printf("%d\n", nprocs);

  mystart = mype*nrows;
  myend = mystart + nrows - 1;


  #pragma omp parallel shared(mystart, myend, temp)
  {
  int nth = omp_get_num_threads();
  printf("%d\n", nth);
  #ifdef STATIC
  #pragma omp for reduction(+:temp) schedule(static)
  #elif defined DYNAMIC
  #pragma omp for reduction(+:temp) schedule(dynamic)
  #elif defined GUIDED
  #pragma omp for reduction(+:temp) schedule(guided)
  #endif
  for (x=mystart; x<=myend; x++) {  

     for ( y=0; y<nptsside; y++)  {
        xv = (x - side2) / side4;
        yv = (y - side2) / side4;
        z = xv + yv*I;
        if (inset(z)) {
           temp++;
        }
     }
  }
  }


  if (mype == 0) {
     count += temp;
     printf("%d\n", temp);

     for (i = 1; i < nprocs; i++) {
        MPI_Recv(&temp, 1, MPI_INT, i, 0, MPI_COMM_WORLD, &status);
        count += temp;
        printf("%d\n", temp);
     }
  }
  else {
     MPI_Send(&temp, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
  }



  MPI_Finalize();

  if(mype==0) {
  printf("%d\n", count);
  }

   //clock_gettime(CLOCK_REALTIME, &nd);
   //printf("%f\n",timediff(bgn,nd));
}

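【Question comments】:

What is the question?
The question is why it gives a wrong answer when the number of threads increases. For example, with one thread it gives 1000, but with more threads it gives 200 or 300 when the correct count (temp) is 1000.

Tags: multithreading mpi openmp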

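【Solution 1】:

You are not declaring any private variables when you enter the OpenMP loop.

First, you should always declare the loop counter of an OpenMP loop (and the counters of any loops nested inside it) private. In C the counter of the loop bound to "omp for" is made private automatically, but a nested counter such as y, declared outside the parallel region, is shared unless you say otherwise.

Second, you have three variables (xv, yv and z) whose values depend on the current iteration of these loops. Each thread therefore needs its own private copy of them as well. Changing your parallel statement to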

#pragma omp parallel shared(mystart, myend, temp) private(x, y, xv, yv, z)

should fix your OpenMP problem.
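To make the effect of the private clause concrete, here is a minimal sketch of the counting loop with x, y, xv and yv private and temp as a reduction. It is not the asker's program: MPI is left out, the complex type is replaced by two doubles, and the function names count_serial and count_parallel are made up for this illustration.

```c
#define MAXITERS 1000

/* Returns 1 if c = cr + ci*i stays bounded for MAXITERS iterations. */
int inset(double cr, double ci) {
    double zr = cr, zi = ci;
    for (int it = 0; it < MAXITERS; it++) {
        double nr = zr * zr - zi * zi + cr;   /* real part of z*z + c */
        double ni = 2.0 * zr * zi + ci;       /* imaginary part of z*z + c */
        zr = nr;
        zi = ni;
        if (zr * zr + zi * zi > 4.0) return 0;
    }
    return 1;
}

/* Serial reference count over an nptsside x nptsside grid. */
int count_serial(int nptsside) {
    double side2 = nptsside / 2.0, side4 = nptsside / 4.0;
    int temp = 0;
    for (int x = 0; x < nptsside; x++)
        for (int y = 0; y < nptsside; y++)
            if (inset((x - side2) / side4, (y - side2) / side4)) temp++;
    return temp;
}

/* Same count with OpenMP. The variables are declared outside the loop,
   as in the question, so they must be listed as private; otherwise all
   threads would overwrite the same y, xv and yv. */
int count_parallel(int nptsside) {
    double side2 = nptsside / 2.0, side4 = nptsside / 4.0;
    int x, y, temp = 0;
    double xv, yv;
    #pragma omp parallel for private(x, y, xv, yv) reduction(+:temp) schedule(dynamic)
    for (x = 0; x < nptsside; x++) {
        for (y = 0; y < nptsside; y++) {
            xv = (x - side2) / side4;
            yv = (y - side2) / side4;
            if (inset(xv, yv)) temp++;
        }
    }
    return temp;
}
```

Built with OpenMP enabled (for example -fopenmp with GCC), count_parallel should return the same value as count_serial for any number of threads. Declaring y, xv and yv inside the loop body would have the same effect as the private clause, because variables declared inside a parallel region are private to each thread.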

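Since you say that setting the number of threads to 1 produces the correct answer, I have not looked at your MPI code.

Edit: OK, I lied, I have now had a brief look at your MPI code. Instead of all the sends and receives, you should write a reduce. This collective will generally be faster than the blocking communication you currently have set up.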

MPI_Reduce(&temp, &count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

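【Comments】:

Thank you very much. I have one more question: why does adding more threads cut the run time roughly in half, while adding more processes does not reduce the run time the way OpenMP does? Is there a reason?
Perhaps the overhead of starting processes is not worth it because you are doing so little work? Or perhaps the communication you do at the end takes more time than you think? I can't say for sure.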