How many parallel for directives should a nested loop get in OpenMP?

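10.3.1.3 Inhibitors to Explicit Parallelization
In general, the compiler parallelizes a loop when you explicitly direct it to. There are exceptions, however: some loops the compiler will not parallelize.
The following are the main detectable inhibitors that can prevent a DO loop from being explicitly parallelized:
- The DO loop is nested inside another DO loop that is already parallelized. This exception also applies to indirect nesting: if you explicitly parallelize a loop that contains a subroutine call, the loops in that subroutine do not run in parallel at runtime, even if you ask the compiler to parallelize them.
- A flow control statement allows a jump out of the DO loop.
- The loop's index variable is subject to side effects, for example by being equivalenced.
Compiling with -vpara and -loopinfo produces diagnostic messages that indicate whether the compiler detected a problem while explicitly parallelizing a loop.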
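The following table lists typical parallelization problems that the compiler detects:
Table 10-3 Explicit Parallelization Problems
- The loop is nested inside another loop that is parallelized.
- The loop is in a subroutine called from the body of a parallelized loop.
- A flow control statement allows a jump out of the loop.
- The loop's index variable is subject to side effects.
- A variable in the loop has a loop-carried dependency.
- The loop contains an I/O statement. This is usually unwise, because the order of the output is unpredictable.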
Example: nested loops:
!$OMP PARALLEL DO
      do 900 i = 1, 1000      ! Parallelized (outer loop)
        do 200 j = 1, 1000    ! Not parallelized, no warning
          ...
200     continue
900   continue
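For the question in the title, a C++ version of the same pattern may help. This is a minimal sketch, not code from the guide: `product_table` is a hypothetical helper, and the single directive sits on the outer loop only, so each thread runs the inner loop serially for the rows it was given.

```cpp
#include <vector>

// Hypothetical helper: fills a rows x cols table with i * j.
// Only the outer loop carries a directive. The inner loop is left
// alone and runs serially inside each thread.
std::vector<std::vector<int>> product_table(int rows, int cols) {
    std::vector<std::vector<int>> t(rows, std::vector<int>(cols, 0));
#pragma omp parallel for
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {   // no second directive here
            t[i][j] = i * j;               // each (i, j) cell is written once
        }
    }
    return t;
}
```

If the outer loop has too few iterations to keep every thread busy, OpenMP 3.0 and later offer a collapse clause on that one directive, which is usually a better fit than a second parallel for on the inner loop.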
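Example: a parallelized loop in a subroutine:
      program main
      ...
!$OMP PARALLEL DO
      do 100 i = 1, 200      ! <- parallelized
        ...
        call calc (a, x)
        ...
100   continue
      ...
      end
      subroutine calc ( b, y )
      ...
!$OMP PARALLEL DO
      do 1 m = 1, 1000       ! <- not parallelized
        ...
1     continue
      return
      end
In this example, the loop in the subroutine is not parallelized because the subroutine itself already runs in parallel.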
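Example: jumping out of a loop:
!$omp parallel do
      do i = 1, 1000     ! <- Not parallelized, error issued
        ...
        if (a(i) .gt. min_threshold ) go to 20
        ...
      end do
20    continue
The compiler issues an error diagnostic if a jump leads out of a loop that is marked for parallelization.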
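One way to avoid the jump is to let every iteration run and keep only the earliest match. The sketch below is an illustration rather than anything from the guide: `first_above` is a hypothetical helper, and it assumes a compiler that supports the min reduction for C and C++ (OpenMP 3.1 or later).

```cpp
#include <vector>

// Hypothetical helper: returns the smallest index whose value exceeds
// `threshold`, or -1 if no element does. Nothing jumps out of the loop;
// a min-reduction keeps the earliest matching index instead.
int first_above(const std::vector<double>& a, double threshold) {
    const int n = static_cast<int>(a.size());
    int first = n;   // sentinel meaning "not found"
#pragma omp parallel for reduction(min : first)
    for (int i = 0; i < n; i++) {
        if (a[i] > threshold && i < first) {
            first = i;   // updates this thread's private copy
        }
    }
    return first == n ? -1 : first;
}
```

The trade-off is that the loop always scans the whole array, so this only pays off when the per-element work is cheap or a match is rare.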
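Example: a variable in the loop has a loop-carried dependency:
demo% cat vpfn.f
      real function fn (n,x,y,z)
      real y(*),x(*),z(*)
      s = 0.0
!$omp parallel do private(i,s) shared(x,y,z)
      do i = 1, n
          x(i) = s
          s = y(i)*z(i)
      enddo
      fn = x(10)
      return
      end
demo% f95 -c -vpara -loopinfo -openmp -O4 vpfn.f
"vpfn.f", line 5: Warning: the loop may have parallelization inhibiting reference
"vpfn.f", line 5: PARALLELIZED, user pragma used
Here the loop is parallelized, but the warning diagnoses a possible loop-carried dependency. Note, however, that the compiler cannot diagnose every loop dependency.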
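To make the dependency concrete, the following C++ sketch contrasts a loop that carries a value from one iteration to the next with a rewrite whose iterations are independent. Both function names are hypothetical, and the rewrite assumes the carried value is simply the previous iteration's product, as in the pattern s = y(i)*z(i).

```cpp
#include <vector>

// Serial reference: x[i] receives the product computed one iteration
// earlier, so iteration i depends on iteration i - 1 through s.
std::vector<float> carried(const std::vector<float>& y,
                           const std::vector<float>& z) {
    const int n = static_cast<int>(y.size());
    std::vector<float> x(n);
    float s = 0.0f;
    for (int i = 0; i < n; i++) {
        x[i] = s;
        s = y[i] * z[i];
    }
    return x;
}

// Rewrite: each iteration recomputes the value it needs from the inputs,
// so no iteration reads anything written by another one.
std::vector<float> independent(const std::vector<float>& y,
                               const std::vector<float>& z) {
    const int n = static_cast<int>(y.size());
    std::vector<float> x(n);
#pragma omp parallel for
    for (int i = 0; i < n; i++) {
        x[i] = (i == 0) ? 0.0f : y[i - 1] * z[i - 1];
    }
    return x;
}
```

Making s private, as the directive in the example does, silences the data race but does not preserve the serial result, which is why the dependency has to be removed rather than hidden.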
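10.3.1.4 I/O With Explicit Parallelization
You can perform I/O in a loop that executes in parallel, provided that:
- It does not matter that output from different threads is interleaved (the program output is nondeterministic).
- You can ensure that executing the loop in parallel is safe.
Example: I/O statement in a loop
!$OMP PARALLEL DO PRIVATE(k)
      do i = 1, 10     ! Parallelized
        k = i
        call show ( k )
      end do
      end
      subroutine show( j )
      write(6,1) j
1     format('Line number ', i3, '.')
      end
demo% f95 -openmp t13.f
demo% setenv PARALLEL 4
demo% a.out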
Line number 9.
Line number 4.
Line number 5.
Line number 6.
Line number 1.
Line number 2.
Line number 3.
Line number 7.
Line number 8.
However, recursive I/O, where an I/O statement contains a call to a function that itself performs I/O, causes a runtime error.
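When the interleaved order does matter, one option is to keep the I/O out of the parallel loop altogether. The C++ sketch below is one possible approach, not something the guide prescribes: `build_lines` is a hypothetical helper that formats each line in parallel into its own slot, leaving the printing to a serial loop afterwards.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: formats the lines in parallel. Iteration i writes
// only lines[i], so no two threads touch the same element.
std::vector<std::string> build_lines(int n) {
    std::vector<std::string> lines(n);
#pragma omp parallel for
    for (int i = 0; i < n; i++) {
        lines[i] = "Line number " + std::to_string(i + 1) + ".";
    }
    return lines;
}
```

Printing the returned vector from a single thread, for example with a plain for loop over `build_lines(10)`, then gives the lines in index order on every run.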
From a CSDN blog: OpenMP Parallel Programming: for-Loop Parallelization Explained
#include <iostream>
#include <omp.h>
using namespace std;
int main(int argc, char **argv) {
    // Set the thread count. It usually should not exceed the number of
    // CPU cores; here 4 threads run the parallel region.
    omp_set_num_threads(4);
#pragma omp parallel
    {
        cout << "Hello" << ", I am Thread " << omp_get_thread_num() << endl;
    }
}
Output:
Hello, I am Thread 1
Hello, I am Thread 0
Hello, I am Thread 2
Hello, I am Thread 3
#include <iostream>
#include <stdio.h>
#include <omp.h>
using namespace std;
int main(int argc, char **argv) {
    // Set the thread count. It usually should not exceed the number of
    // CPU cores; here 4 threads run the parallel region.
    omp_set_num_threads(4);
#pragma omp parallel
    for (int i = 0; i < 2; i++)
        //cout << "i = " << i << ", I am Thread " << omp_get_thread_num() << endl;
        printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
}
Output:
i = 0, I am Thread 0
i = 0, I am Thread 1
i = 1, I am Thread 0
i = 1, I am Thread 1
i = 0, I am Thread 2
i = 1, I am Thread 2
i = 0, I am Thread 3
i = 1, I am Thread 3
#pragma omp parallel for
for (int i = 0; i < 6; i++)
    printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
Output:
i = 4, I am Thread 2
i = 2, I am Thread 1
i = 0, I am Thread 0
i = 1, I am Thread 0
i = 3, I am Thread 1
i = 5, I am Thread 3
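The two outputs above differ because a plain loop inside a parallel region is repeated by every thread, while parallel for divides the iterations among the threads. The sketch below counts loop-body executions to show this; the function names are hypothetical, and the team size is measured at run time instead of being assumed to be 4.

```cpp
#include <utility>

// Plain loop inside `parallel`: every thread in the team runs all n
// iterations. Returns {team size, number of body executions}.
std::pair<int, long> replicated_loop(int n) {
    int team = 0;
    long runs = 0;
#pragma omp parallel
    {
#pragma omp atomic
        team += 1;                 // one increment per thread in the team
        for (int i = 0; i < n; i++) {
#pragma omp atomic
            runs += 1;             // counted once per thread per iteration
        }
    }
    return {team, runs};
}

// `parallel for`: the n iterations are shared out, so the body runs
// n times in total regardless of the team size.
long shared_loop(int n) {
    long runs = 0;
#pragma omp parallel for reduction(+ : runs)
    for (int i = 0; i < n; i++) {
        runs += 1;
    }
    return runs;
}
```

With a team of 4 and n = 2, `replicated_loop` reports 8 body executions, which matches the eight lines printed by the earlier program, whereas `shared_loop(6)` reports 6.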
#pragma omp parallel
{
#pragma omp for
    for (int i = 0; i < 6; i++)
        printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
}
#pragma omp parallel
{
#pragma omp parallel for
    for (int i = 0; i < 6; i++)
        printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
}
Output:
i = 0, I am Thread 0
i = 0, I am Thread 0
i = 1, I am Thread 0
i = 1, I am Thread 0
i = 2, I am Thread 0
i = 2, I am Thread 0
i = 3, I am Thread 0
i = 3, I am Thread 0
i = 4, I am Thread 0
i = 4, I am Thread 0
i = 5, I am Thread 0
i = 5, I am Thread 0
i = 0, I am Thread 0
i = 1, I am Thread 0
i = 0, I am Thread 0
i = 2, I am Thread 0
i = 1, I am Thread 0
i = 3, I am Thread 0
i = 2, I am Thread 0
i = 4, I am Thread 0
i = 3, I am Thread 0
i = 5, I am Thread 0
i = 4, I am Thread 0
i = 5, I am Thread 0
#pragma omp parallel for
for (int i = 0; i < 6; i++)
    printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
// Code between the two for loops; it is executed by thread 0, the master thread
printf("I am Thread %d\n", omp_get_thread_num());
#pragma omp parallel for
for (int i = 0; i < 6; i++)
    printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
Output:
i = 0, I am Thread 0
i = 2, I am Thread 1
i = 1, I am Thread 0
i = 3, I am Thread 1
i = 4, I am Thread 2
i = 5, I am Thread 3
I am Thread 0
i = 4, I am Thread 2
i = 2, I am Thread 1
i = 5, I am Thread 3
i = 0, I am Thread 0
i = 3, I am Thread 1
i = 1, I am Thread 0
#pragma omp parallel
{
#pragma omp for
    for (int i = 0; i < 6; i++)
        printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
#pragma omp master
    {
        // This code is executed by the master thread
        printf("I am Thread %d\n", omp_get_thread_num());
    }
#pragma omp for
    for (int i = 0; i < 6; i++)
        printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
}
#include <iostream>
#include <omp.h>
using namespace std;
int main(int argc, char **argv) {
    int n = 100000;
    int sum = 0;
    omp_set_num_threads(4);
#pragma omp parallel
    {
#pragma omp for
        for (int i = 0; i < n; i++) {
            sum += 1;   // unsynchronized update of a shared variable
        }
    }
    cout << " sum = " << sum << endl;
}
Output:
First run: sum = 58544
Second run: sum = 77015
Third run: sum = 78423
#pragma omp parallel
{
#pragma omp for
    for (int i = 0; i < n; i++) {
        {
#pragma omp critical
            sum += 1;   // only one thread at a time performs the update
        }
    }
}
cout << " sum = " << sum << endl;
#pragma omp parallel
{
#pragma omp for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += 1;   // each thread adds to a private copy, combined at the end
    }
}
int n = 100000;
int sum[4] = { 0 };
omp_set_num_threads(4);
#pragma omp parallel
{
#pragma omp for
    for (int i = 0; i < n; i++) {
        sum[omp_get_thread_num()] += 1;
    }
}
cout << " sum = " << sum[0] + sum[1] + sum[2] + sum[3] << endl;
#include <iostream>
#include <omp.h>
#include <stdio.h>
using namespace std;
int main(int argc, char **argv) {
    int n = 12;
    omp_set_num_threads(4);
#pragma omp parallel
    {
#pragma omp for schedule(static, 3)
        for (int i = 0; i < n; i++) {
            printf("i = %d, I am Thread %d\n", i, omp_get_thread_num());
        }
    }
}
Output:
i = 6, I am Thread 2
i = 3, I am Thread 1
i = 7, I am Thread 2
i = 4, I am Thread 1
i = 8, I am Thread 2
i = 5, I am Thread 1
i = 0, I am Thread 0
i = 9, I am Thread 3
i = 1, I am Thread 0
i = 10, I am Thread 3
i = 2, I am Thread 0
i = 11, I am Thread 3
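The assignment in this output follows a simple rule: with schedule(static, chunk), the chunks are handed to the threads round-robin in thread-number order. The sketch below expresses that rule as arithmetic. `static_owner` is a hypothetical helper, and it assumes a loop that starts at 0 with a step of 1, as in the program above.

```cpp
// Hypothetical helper: the thread expected to run iteration i under
// schedule(static, chunk) with `threads` threads, for a loop that
// starts at 0 and increments by 1.
int static_owner(int i, int chunk, int threads) {
    int chunk_index = i / chunk;     // which chunk the iteration falls in
    return chunk_index % threads;    // chunks are dealt out round-robin
}
```

For chunk = 3 and 4 threads this gives thread 0 for i = 0 to 2, thread 1 for i = 3 to 5, thread 2 for i = 6 to 8, and thread 3 for i = 9 to 11, in agreement with the printed lines. With more than 12 iterations the pattern would wrap around, so i = 12 would go back to thread 0.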
