Is this a correct solution to measure thread context switch overhead?

【Question title】: Is this a correct solution to measure thread context switch overhead?
【Posted】: 2017-07-03 17:06:36
【Question】:

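I am trying to measure the overhead of a thread context switch. I have two threads, one shared variable, one mutex, and two condition variables. The two threads switch back and forth, writing 1 or 0 to the shared variable.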

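I assume the time spent waiting in pthread_cond_wait(&cond, &mutex) is roughly equal to 2 x the thread context-switch time. If thread 1 has to wait on the condition variable, it must give up the mutex to thread 2 -> context switch to thread 2 -> thread 2 does its work and signals the condition variable to wake thread 1 -> context switch back to thread 1 -> thread 1 reacquires the lock.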
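Written as a formula, the assumption and the averaging done in the code amount to the following. The symbols are introduced here only for illustration: $t_{\mathrm{cs}}$ is the cost of one context switch, $N$ is the number of recorded measurements (`idx` in the code), and $\mathrm{start}_i$, $\mathrm{stop}_i$ are the timestamps taken around the wait loop, in nanoseconds. This only restates the assumption; it does not show that the assumption holds.

```latex
t_{\mathrm{wait},i} = \mathrm{stop}_i - \mathrm{start}_i \approx 2\,t_{\mathrm{cs}}
\quad\Longrightarrow\quad
t_{\mathrm{cs}} \approx \frac{1}{2N}\sum_{i=1}^{N}\left(\mathrm{stop}_i - \mathrm{start}_i\right)
```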

Is my assumption correct?

My code is below:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>


int var = 0;

int setToZero = 1;

int count = 5000;

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t isZero = PTHREAD_COND_INITIALIZER;

pthread_cond_t isOne = PTHREAD_COND_INITIALIZER;


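struct timespec firstStart;

/* Returns timeA - timeB in nanoseconds. The ULL constant forces 64-bit
   arithmetic so the multiplication cannot overflow a 32-bit long. */
unsigned long long timespecDiff(struct timespec *timeA_p, struct timespec *timeB_p)
{
  return ((timeA_p->tv_sec * 1000000000ULL) + timeA_p->tv_nsec) -
         ((timeB_p->tv_sec * 1000000000ULL) + timeB_p->tv_nsec);
}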

void* thread1(void* param)
{
  int rc;
  struct timespec previousStart;
  struct timespec start; //start timestamp
  struct timespec stop; //stop timestamp
  unsigned long long result;
  int idx = 0;
  int measurements[count];

  clock_gettime(CLOCK_MONOTONIC, &stop);
  result = timespecDiff(&stop,&firstStart);
  printf("first context-switch time:%llu\n", result);

  clock_gettime(CLOCK_MONOTONIC, &previousStart);

  while(count > 0){
    //acquire lock
    rc = pthread_mutex_lock(&mutex);

    clock_gettime(CLOCK_MONOTONIC,&start);

    while(setToZero){
      pthread_cond_wait(&isOne,&mutex); // use condition variables so the threads don't busy wait inside local cache
    }

    clock_gettime(CLOCK_MONOTONIC,&stop);

    var = 0;
    count--;
    setToZero = 1;
    //printf("in thread1\n");

    pthread_cond_signal(&isZero);
    //end of critical section
    rc = pthread_mutex_unlock(&mutex); //release lock

    result = timespecDiff(&stop,&start);
    measurements[idx] = result;
    idx++;
  }

  //average the wait times, halved for the assumed two switches per wait
  result = 0;
  int i = 0;
  while(i < idx)
  {
    result += measurements[i++];
  }

  result = result /(2*idx);
  printf("thread1 result: %llu\n",result);
  return NULL; //a void* thread function must return a value
}


void* thread2(void* param)
{
  int rc;
  struct timespec previousStart;
  struct timespec start; //start timestamp
  struct timespec stop; //stop timestamp
  unsigned long long result;
  int idx = 0;
  int measurements[count];

  while(count > 0){
    //acquire lock
    rc = pthread_mutex_lock(&mutex);

    clock_gettime(CLOCK_MONOTONIC,&start);

    while(!setToZero){
      pthread_cond_wait(&isZero,&mutex);
    }

    clock_gettime(CLOCK_MONOTONIC,&stop);

    var = 1;
    count--;
    setToZero = 0;
    //printf("in thread2\n");

    pthread_cond_signal(&isOne);
    //end of critical section
    rc = pthread_mutex_unlock(&mutex); //release lock

    result = timespecDiff(&stop,&start);
    measurements[idx] = result;
    idx++;
  }

  //average the wait times, halved for the assumed two switches per wait
  result = 0;
  int i = 0;
  while(i < idx)
  {
    result += measurements[i++];
  }

  result = result /(2*idx);
  printf("thread2 result: %llu\n",result);
  return NULL; //a void* thread function must return a value
}

int main(){
  pthread_t threads[2];
  pthread_attr_t attr;

  pthread_attr_init(&attr);

  clock_gettime(CLOCK_MONOTONIC,&firstStart);

  pthread_create(&threads[0],&attr,thread1,NULL);
  pthread_create(&threads[1],&attr,thread2,NULL);

  printf("waiting...\n");

  pthread_join(threads[0],NULL);
  pthread_join(threads[1],NULL);

  pthread_cond_destroy(&isOne);
  pthread_cond_destroy(&isZero);
}

I get the following timings:

first context-switch time:144240
thread1 result: 3660
thread2 result: 3770

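【Comments】:

Unless you have only a single CPU core, the threads may not need to context switch at all in this case.
I am using a single-core machine.
Then there are other processes/threads that need CPU time as well. A context switch is the loading/unloading of state variables and so on; I don't think you can measure it, and you may even run into the observer effect :) The OS is responsible for context switching, and in user space you do not control it; it all happens while your process is asleep.
@selcuk Cihan: I don't think that is necessarily true. There are other SO posts on this topic: stackoverflow.com/questions/304752/…

Tags: c multithreading pthreads context-switch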

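【Solution 1】:

You said:

I assume the time spent waiting in pthread_cond_wait(&cond, &mutex) is roughly equal to 2 x the thread context-switch time.

That is not a valid assumption. Once the mutex is released, the kernel is notified, and the kernel must then wake the other thread. It may choose not to do so immediately, for example if other threads are waiting to run. A mutex, as its name suggests, guarantees when things will not happen. It does not guarantee when they will.

You cannot expect to measure context switches reliably from within a process, and certainly not with the POSIX API, because nothing in it promises to let you do that.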

On Linux, you can count the context switches of a process or thread using /proc/[pid]/status.
On Windows, this information is available from the Performance Monitor API.
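As a rough illustration of the Linux route, the sketch below reads the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` fields from /proc/self/status. The helper name `read_ctxt_switches` is hypothetical and not part of any library, and the sketch assumes a Linux kernel that exposes these two fields. It counts switches; it does not measure how long each one takes.

```c
#include <stdio.h>

/* Hypothetical helper (not a library function): reads the context-switch
 * counters that Linux reports for the calling process in /proc/self/status.
 * Returns 0 if both fields were found, -1 otherwise. */
int read_ctxt_switches(unsigned long *voluntary, unsigned long *nonvoluntary)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    int found = 0;

    if (f == NULL)
        return -1;

    /* Each field sits on its own line, e.g. "voluntary_ctxt_switches:\t12". */
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "voluntary_ctxt_switches: %lu", voluntary) == 1)
            found |= 1;
        else if (sscanf(line, "nonvoluntary_ctxt_switches: %lu", nonvoluntary) == 1)
            found |= 2;
    }

    fclose(f);
    return found == 3 ? 0 : -1;
}
```

Calling the helper before and after the measured loop and subtracting the two readings gives the number of switches that occurred in between. For a single thread, the same fields appear in /proc/[pid]/task/[tid]/status.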

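I don't know whether either of these will help you reach your goal. I suspect what you really want to know is how much using a multithreaded system affects performance, but that requires measuring the performance of your application as a whole.

【Comments】: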