-
Notifications
You must be signed in to change notification settings - Fork 65
/
Copy pathChap_tasking.tex
63 lines (54 loc) · 3.13 KB
/
Chap_tasking.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
\cchapter{Tasking}{tasking}
\label{chap:tasking}
Tasking constructs provide units of work to a thread for execution.
Worksharing constructs do this, too (e.g. \kcode{for}, \kcode{do},
\kcode{sections}, and \kcode{single} constructs);
but the work units are tightly controlled by an iteration limit and limited
scheduling, or a limited number of \kcode{sections} or \kcode{single} regions.
Worksharing was designed
with ``data parallel'' computing in mind. Tasking was designed for
``task parallel'' computing and often involves non-locality or irregularity
in memory access.
The \kcode{task} construct can be used to execute work chunks: in a while loop;
while traversing nodes in a list; at nodes in a tree graph;
or in a normal loop (with a \kcode{taskloop} construct).
Unlike the statically scheduled loop iterations of worksharing, a task is
often enqueued, and then dequeued for execution by any of the threads of the
team within a parallel region. The generation of tasks can be from a single
generating thread (creating sibling tasks), or from multiple generators
in a recursive graph tree traversals.
%(creating a parent-descendents hierarchy of tasks, see example 4 and 7 below).
A \kcode{taskloop} construct
bundles iterations of an associated loop into tasks, and provides
similar controls found in the \kcode{task} construct.
Sibling tasks are synchronized by the \kcode{taskwait} construct, and tasks
and their descendent tasks can be synchronized by containing them in
a \kcode{taskgroup} region. Ordered execution is accomplished by specifying
dependences with a \kcode{depend} clause. Also, priorities can be
specified as hints to the scheduler through a \kcode{priority} clause.
Various clauses can be used to manage and optimize task generation,
as well as reduce the overhead of execution and to relinquish
control of threads for work balance and forward progress.
Once a thread starts executing a task, it is the designated thread
for executing the task to completion, even though it may leave the
execution at a scheduling point and return later. The thread is \plc{tied}
to the task. Scheduling points can be introduced with the \kcode{taskyield}
construct. With an \kcode{untied} clause any other thread is allowed to continue
the task. An \kcode{if} clause with an expression that evaluates to \plc{false}
results in an \plc{undeferred} task, which instructs the runtime to suspend
the generating task until the undeferred task completes its execution.
By including the data environment of the generating task into the generated task with the
\kcode{mergeable} and \kcode{final} clauses, task generation overhead can be reduced.
A complete list of the tasking constructs and details of their clauses
can be found in the \docref{Tasking Constructs} chapter of the OpenMP Specifications.
%in the \docref{OpenMP Application Programming Interface} section.
%===== Examples Sections =====
\input{tasking/tasking}
\input{tasking/task_priority}
\input{tasking/task_dep}
\input{tasking/task_detach}
\input{tasking/taskgroup}
\input{tasking/taskyield}
\input{tasking/taskloop}
\input{tasking/parallel_masked_taskloop}
\input{tasking/taskloop_dep}