【Posted】:2020-01-04 02:19:00
【Question】:
I am testing an OpenCL kernel with four work-items and one work-group. The kernel is:
__kernel void pgs(__global float *l2_norm)
{
    int gid_x = get_global_id(0);
    int gid_y = get_global_id(1);

    // Work-item (0,0) reports the launch geometry.
    if (gid_x == 0 && gid_y == 0)
        printf("[INFO] local_size_x:%02d, local_size_y:%02d, global_size_x:%02d, global_size_y:%02d, group_size_x:%02d, group_size_y:%02d\n",
               get_local_size(0), get_local_size(1),
               get_global_size(0), get_global_size(1),
               get_group_size(0), get_group_size(1));
    barrier(CLK_GLOBAL_MEM_FENCE);

    printf("%d,%d before: %2.6f\n", gid_x, gid_y, *l2_norm);
    barrier(CLK_GLOBAL_MEM_FENCE);

    // Every work-item increments the same global float (not atomic).
    *l2_norm += 1;
    barrier(CLK_GLOBAL_MEM_FENCE);

    printf("%d,%d after: %2.6f\n", gid_x, gid_y, *l2_norm);
    printf("testing %d,%d\n", gid_x, gid_y);
}
The output is:
1,1 before: 0.000000
0,1 before: 0.000000
1,0 before: 0.000000
[INFO] local_size_x:01, local_size_y:01, global_size_x:02, global_size_y:02, group_size_x:01, group_size_y:01
1,1 after: 1.000000
0,1 after: 2.000000
1,0 after: 3.000000
testing 1,1
0,0 before: 3.000000
testing 0,1
testing 1,0
0,0 after: 4.000000
testing 0,0
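My question is: why isn't the line starting with [INFO] printed first? Shouldn't the global barrier hold back all work-items until work-item 0 has printed the [INFO] line?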
【Discussion】:
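One thing worth checking against the output: the [INFO] line reports local_size 1x1 with global_size 2x2, which suggests the launch ran as four work-groups of one work-item each rather than one work-group of four. barrier() only synchronizes work-items within the same work-group, and CLK_GLOBAL_MEM_FENCE selects which memory accesses are ordered, not how many work-items wait. The following is a minimal host-side sketch in plain C of that arithmetic; the helper names are hypothetical and are not part of the OpenCL API, and it assumes the local size divides the global size evenly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of work-groups along one dimension.
 * Assumes local_size is non-zero and divides global_size evenly. */
static size_t work_groups_per_dim(size_t global_size, size_t local_size)
{
    assert(local_size != 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Hypothetical helper: total number of work-groups in a 2D NDRange. */
static size_t total_work_groups_2d(const size_t global[2], const size_t local[2])
{
    return work_groups_per_dim(global[0], local[0]) *
           work_groups_per_dim(global[1], local[1]);
}

/* barrier() synchronizes only the work-items of one work-group, so it can
 * order every work-item of a launch only when the launch has exactly one
 * work-group. Returns 1 in that case, 0 otherwise. */
static int barrier_spans_all_items(const size_t global[2], const size_t local[2])
{
    return total_work_groups_2d(global, local) == 1;
}
```

With the sizes printed above (global 2x2, local 1x1), total_work_groups_2d gives 4 and barrier_spans_all_items gives 0. With a local size of 2x2 it gives 1 work-group, which is the configuration the question describes. Even then, the order in which printf output from different work-items appears is left to the implementation.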
Tags: c synchronization opencl barrier