<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<title>Building and Installing » Compile Taskflow with SYCL | Taskflow QuickStart</title>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Source+Sans+Pro:400,400i,600,600i%7CSource+Code+Pro:400,400i,600" />
<link rel="stylesheet" href="m-dark+documentation.compiled.css" />
<link rel="icon" href="favicon.ico" type="image/vnd.microsoft.icon" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="theme-color" content="#22272e" />
</head>
<body>
<header><nav id="navigation">
<div class="m-container">
<div class="m-row">
<span id="m-navbar-brand" class="m-col-t-8 m-col-m-none m-left-m">
<a href="https://taskflow.github.io"><img src="taskflow_logo.png" alt="" />Taskflow</a> <span class="m-breadcrumb">|</span> <a href="index.html" class="m-thin">QuickStart</a>
</span>
<div class="m-col-t-4 m-hide-m m-text-right m-nopadr">
<a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
<path id="m-doc-search-icon-path" d="m6 0c-3.31 0-6 2.69-6 6 0 3.31 2.69 6 6 6 1.49 0 2.85-0.541 3.89-1.44-0.0164 0.338 0.147 0.759 0.5 1.15l3.22 3.79c0.552 0.614 1.45 0.665 2 0.115 0.55-0.55 0.499-1.45-0.115-2l-3.79-3.22c-0.392-0.353-0.812-0.515-1.15-0.5 0.895-1.05 1.44-2.41 1.44-3.89 0-3.31-2.69-6-6-6zm0 1.56a4.44 4.44 0 0 1 4.44 4.44 4.44 4.44 0 0 1-4.44 4.44 4.44 4.44 0 0 1-4.44-4.44 4.44 4.44 0 0 1 4.44-4.44z"/>
</svg></a>
<a id="m-navbar-show" href="#navigation" title="Show navigation"></a>
<a id="m-navbar-hide" href="#" title="Hide navigation"></a>
</div>
<div id="m-navbar-collapse" class="m-col-t-12 m-show-m m-col-m-none m-right-m">
<div class="m-row">
<ol class="m-col-t-6 m-col-m-none">
<li><a href="pages.html">Handbook</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
</ol>
<ol class="m-col-t-6 m-col-m-none" start="3">
<li><a href="annotated.html">Classes</a></li>
<li><a href="files.html">Files</a></li>
<li class="m-show-m"><a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
<use href="#m-doc-search-icon-path" />
</svg></a></li>
</ol>
</div>
</div>
</div>
</div>
</nav></header>
<main><article>
<div class="m-container m-container-inflatable">
<div class="m-row">
<div class="m-col-l-10 m-push-l-1">
<h1>
<span class="m-breadcrumb"><a href="install.html">Building and Installing</a> »</span>
Compile Taskflow with SYCL
</h1>
<nav class="m-block m-default">
<h3>Contents</h3>
<ul>
<li><a href="#InstallSYCLCompiler">Install SYCL Compiler</a></li>
<li><a href="#CompileTaskflowWithSYCLDirectly">Compile Source Code Directly</a></li>
<li><a href="#CompileTaskflowWithSYCLSeparately">Compile Source Code Separately</a></li>
</ul>
</nav>
<section id="InstallSYCLCompiler"><h2><a href="#InstallSYCLCompiler">Install SYCL Compiler</a></h2><p>To compile Taskflow programs that contain SYCL code, you need the DPC++ clang compiler, which you can obtain by following <a href="https://intel.github.io/llvm-docs/GetStartedGuide.html">Getting Started with oneAPI DPC++</a>.</p></section><section id="CompileTaskflowWithSYCLDirectly"><h2><a href="#CompileTaskflowWithSYCLDirectly">Compile Source Code Directly</a></h2><p>Taskflow's GPU programming interface for SYCL is <a href="classtf_1_1syclFlow.html" class="m-doc">tf::<wbr />syclFlow</a>. Consider the following <code>simple.cpp</code> program that performs the canonical saxpy (single-precision A·X + Y) operation on a GPU:</p><pre class="m-code"><span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;taskflow/taskflow.hpp&gt;</span><span class="c1"> // core taskflow routines</span><span class="cp"></span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;taskflow/syclflow.hpp&gt;</span><span class="c1"> // core syclflow routines</span><span class="cp"></span>
<span class="kt">int</span><span class="w"> </span><span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
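<span class="w"> </span><span class="k">constexpr</span><span class="w"> </span><span class="kt">size_t</span><span class="w"> </span><span class="n">N</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1000000</span><span class="p">;</span><span class="w"> </span><span class="c1">// number of elements (illustrative value; the original listing left N undefined)</span>
<span class="w"> </span>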
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">Executor</span><span class="w"> </span><span class="n">executor</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">Taskflow</span><span class="w"> </span><span class="n">taskflow</span><span class="p">(</span><span class="s">"saxpy example"</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span>
<span class="w"> </span><span class="n">sycl</span><span class="o">::</span><span class="n">queue</span><span class="w"> </span><span class="n">queue</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span>
<span class="w"> </span><span class="k">auto</span><span class="w"> </span><span class="n">X</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sycl</span><span class="o">::</span><span class="n">malloc_shared</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span><span class="p">(</span><span class="n">N</span><span class="p">,</span><span class="w"> </span><span class="n">queue</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="k">auto</span><span class="w"> </span><span class="n">Y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sycl</span><span class="o">::</span><span class="n">malloc_shared</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span><span class="p">(</span><span class="n">N</span><span class="p">,</span><span class="w"> </span><span class="n">queue</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span>
<span class="w"> </span><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace_on</span><span class="p">([</span><span class="o">&amp;</span><span class="p">](</span><span class="n">tf</span><span class="o">::</span><span class="n">syclFlow</span><span class="o">&amp;</span><span class="w"> </span><span class="n">sf</span><span class="p">){</span><span class="w"></span>
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">syclTask</span><span class="w"> </span><span class="n">fillX</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sf</span><span class="p">.</span><span class="n">fill</span><span class="p">(</span><span class="n">X</span><span class="p">,</span><span class="w"> </span><span class="mf">1.0f</span><span class="p">,</span><span class="w"> </span><span class="n">N</span><span class="p">).</span><span class="n">name</span><span class="p">(</span><span class="s">"fillX"</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">syclTask</span><span class="w"> </span><span class="n">fillY</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sf</span><span class="p">.</span><span class="n">fill</span><span class="p">(</span><span class="n">Y</span><span class="p">,</span><span class="w"> </span><span class="mf">2.0f</span><span class="p">,</span><span class="w"> </span><span class="n">N</span><span class="p">).</span><span class="n">name</span><span class="p">(</span><span class="s">"fillY"</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">syclTask</span><span class="w"> </span><span class="n">saxpy</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sf</span><span class="p">.</span><span class="n">parallel_for</span><span class="p">(</span><span class="n">sycl</span><span class="o">::</span><span class="n">range</span><span class="o">&lt;</span><span class="mi">1</span><span class="o">&gt;</span><span class="p">(</span><span class="n">N</span><span class="p">),</span><span class="w"> </span>
<span class="w"> </span><span class="p">[</span><span class="o">=</span><span class="p">]</span><span class="w"> </span><span class="p">(</span><span class="n">sycl</span><span class="o">::</span><span class="n">id</span><span class="o">&lt;</span><span class="mi">1</span><span class="o">&gt;</span><span class="w"> </span><span class="n">id</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="n">X</span><span class="p">[</span><span class="n">id</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">3.0f</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">X</span><span class="p">[</span><span class="n">id</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Y</span><span class="p">[</span><span class="n">id</span><span class="p">];</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="w"> </span><span class="p">).</span><span class="n">name</span><span class="p">(</span><span class="s">"saxpy"</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">saxpy</span><span class="p">.</span><span class="n">succeed</span><span class="p">(</span><span class="n">fillX</span><span class="p">,</span><span class="w"> </span><span class="n">fillY</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="n">queue</span><span class="p">).</span><span class="n">name</span><span class="p">(</span><span class="s">"syclFlow"</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span>
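<span class="w"> </span><span class="n">executor</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">taskflow</span><span class="p">).</span><span class="n">wait</span><span class="p">();</span><span class="w"></span>
<span class="w"> </span>
<span class="w"> </span><span class="n">sycl</span><span class="o">::</span><span class="n">free</span><span class="p">(</span><span class="n">X</span><span class="p">,</span><span class="w"> </span><span class="n">queue</span><span class="p">);</span><span class="w"> </span><span class="c1">// release the shared allocations</span>
<span class="w"> </span><span class="n">sycl</span><span class="o">::</span><span class="n">free</span><span class="p">(</span><span class="n">Y</span><span class="p">,</span><span class="w"> </span><span class="n">queue</span><span class="p">);</span><span class="w"></span>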
<span class="p">}</span><span class="w"></span></pre><p>Use DPC++ clang to compile the program with the following options:</p><ul><li><code>-fsycl</code>: enable SYCL compilation mode</li><li><code>-fsycl-targets=nvptx64-nvidia-cuda-sycldevice</code>: enable CUDA target</li><li><code>-fsycl-unnamed-lambda</code>: enable unnamed SYCL lambda kernel</li></ul><pre class="m-console"><span class="go">~$ clang++ -fsycl -fsycl-unnamed-lambda \</span>
<span class="go"> -fsycl-targets=nvptx64-nvidia-cuda-sycldevice \</span>
<span class="go"> -I path/to/taskflow -pthread -std=c++17 simple.cpp -o simple</span>
<span class="go">~$ ./simple</span></pre><aside class="m-note m-warning"><h4>Attention</h4><p>You need to include <code>taskflow/syclflow.hpp</code> in order to use <a href="classtf_1_1syclFlow.html" class="m-doc">tf::<wbr />syclFlow</a>.</p></aside></section><section id="CompileTaskflowWithSYCLSeparately"><h2><a href="#CompileTaskflowWithSYCLSeparately">Compile Source Code Separately</a></h2><p>Large GPU applications are often compiled into separate object files that are then linked into an executable or a library. You can do the same with SYCL code: compile each source file to its own object file and link the objects into the final executable. Consider the following example that defines two tasks in two separate source files (<code>main.cpp</code> and <code>syclflow.cpp</code>):</p><pre class="m-code"><span class="c1">// main.cpp</span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;iostream&gt;</span><span class="c1"> // std::cout</span><span class="cp"></span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;taskflow/taskflow.hpp&gt;</span><span class="cp"></span>
<span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="nf">make_syclflow</span><span class="p">(</span><span class="n">tf</span><span class="o">::</span><span class="n">Taskflow</span><span class="o">&amp;</span><span class="w"> </span><span class="n">taskflow</span><span class="p">);</span><span class="w"> </span><span class="c1">// create a syclFlow task</span>
<span class="kt">int</span><span class="w"> </span><span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">Executor</span><span class="w"> </span><span class="n">executor</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">Taskflow</span><span class="w"> </span><span class="n">taskflow</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="n">task1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace</span><span class="p">([](){</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="s">"main.cpp!</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span><span class="w"> </span><span class="p">})</span><span class="w"></span>
<span class="w"> </span><span class="p">.</span><span class="n">name</span><span class="p">(</span><span class="s">"cpu task"</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="n">task2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">make_syclflow</span><span class="p">(</span><span class="n">taskflow</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">task1</span><span class="p">.</span><span class="n">precede</span><span class="p">(</span><span class="n">task2</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">executor</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">taskflow</span><span class="p">).</span><span class="n">wait</span><span class="p">();</span><span class="w"></span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span></pre><pre class="m-code"><span class="c1">// syclflow.cpp</span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;cstdio&gt;</span><span class="c1"> // printf</span><span class="cp"></span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;taskflow/taskflow.hpp&gt;</span><span class="cp"></span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;taskflow/syclflow.hpp&gt;</span><span class="cp"></span>
<span class="kr">inline</span><span class="w"> </span><span class="n">sycl</span><span class="o">::</span><span class="n">queue</span><span class="w"> </span><span class="n">queue</span><span class="p">;</span><span class="w"> </span><span class="c1">// create a global sycl queue</span>
<span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="nf">make_syclflow</span><span class="p">(</span><span class="n">tf</span><span class="o">::</span><span class="n">Taskflow</span><span class="o">&amp;</span><span class="w"> </span><span class="n">taskflow</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace_on</span><span class="p">([](</span><span class="n">tf</span><span class="o">::</span><span class="n">syclFlow</span><span class="o">&amp;</span><span class="w"> </span><span class="n">cf</span><span class="p">){</span><span class="w"></span>
<span class="w"> </span><span class="n">printf</span><span class="p">(</span><span class="s">"syclflow.cpp!</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">cf</span><span class="p">.</span><span class="n">single_task</span><span class="p">([](){}).</span><span class="n">name</span><span class="p">(</span><span class="s">"kernel"</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="n">queue</span><span class="p">).</span><span class="n">name</span><span class="p">(</span><span class="s">"gpu task"</span><span class="p">);</span><span class="w"></span>
<span class="p">}</span><span class="w"></span></pre><p>Compile each source to an object using DPC++ clang:</p><pre class="m-console"><span class="go">~$ clang++ -I path/to/taskflow/ -pthread -std=c++17 -c main.cpp -o main.o</span>
<span class="go">~$ clang++ -fsycl -fsycl-unnamed-lambda \</span>
<span class="go"> -fsycl-targets=nvptx64-nvidia-cuda-sycldevice \</span>
<span class="go"> -I path/to/taskflow/ -pthread -std=c++17 -c syclflow.cpp -o syclflow.o</span>
<span class="gp"># </span>now we have the two compiled .o objects, main.o and syclflow.o
<span class="go">~$ ls</span>
<span class="go">main.o syclflow.o </span></pre><p>Next, link the two object files into the final executable:</p><pre class="m-console"><span class="go">~$ clang++ -fsycl -fsycl-unnamed-lambda \</span>
<span class="go"> -fsycl-targets=nvptx64-nvidia-cuda-sycldevice \</span>
<span class="go"> main.o syclflow.o -pthread -std=c++17 -o main</span>
<span class="gp"># </span>run the main program
<span class="go">~$ ./main</span>
<span class="go">main.cpp!</span>
<span class="go">syclflow.cpp!</span></pre></section>
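<p>As a host-side sanity check for the saxpy example in <a href="#CompileTaskflowWithSYCLDirectly">Compile Source Code Directly</a>, the following sketch reproduces the same arithmetic in plain C++, without SYCL or Taskflow. The helper <code>saxpy_reference</code> and the problem size of 8 are illustrative assumptions and are not part of the Taskflow API. Because <code>X</code> is filled with 1 and <code>Y</code> with 2, every element of <code>X</code> should equal 3 * 1 + 2 = 5 once the syclFlow has finished:</p>
<pre class="m-code">#include &lt;cstddef&gt;
#include &lt;cstdio&gt;
#include &lt;vector&gt;

// Hypothetical helper (not part of Taskflow): computes X = a * X + Y on the host.
void saxpy_reference(float a, std::vector&lt;float&gt;&amp; X, const std::vector&lt;float&gt;&amp; Y) {
  for (std::size_t i = 0; i &lt; X.size(); ++i) {
    X[i] = a * X[i] + Y[i];
  }
}

int main() {
  constexpr std::size_t N = 8;      // small illustrative size
  std::vector&lt;float&gt; X(N, 1.0f);    // mirrors sf.fill(X, 1.0f, N)
  std::vector&lt;float&gt; Y(N, 2.0f);    // mirrors sf.fill(Y, 2.0f, N)
  saxpy_reference(3.0f, X, Y);      // mirrors the parallel_for kernel
  for (float x : X) {
    if (x != 5.0f) {
      std::printf("mismatch\n");
      return 1;
    }
  }
  std::printf("all %zu elements equal %.1f\n", N, X[0]);  // prints "all 8 elements equal 5.0"
  return 0;
}</pre>
<p>This check builds with any C++17 compiler and needs none of the SYCL flags, so it can serve as a reference when inspecting the contents of <code>X</code> after <code>executor.run(taskflow).wait()</code> returns.</p>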
</div>
</div>
</div>
</article></main>
<div class="m-doc-search" id="search">
<a href="#!" onclick="return hideSearch()"></a>
<div class="m-container">
<div class="m-row">
<div class="m-col-m-8 m-push-m-2">
<div class="m-doc-search-header m-text m-small">
<div><span class="m-label m-default">Tab</span> / <span class="m-label m-default">T</span> to search, <span class="m-label m-default">Esc</span> to close</div>
<div id="search-symbolcount">…</div>
</div>
<div class="m-doc-search-content">
<form>
<input type="search" name="q" id="search-input" placeholder="Loading …" disabled="disabled" autofocus="autofocus" autocomplete="off" spellcheck="false" />
</form>
<noscript class="m-text m-danger m-text-center">Unlike everything else in the docs, the search functionality <em>requires</em> JavaScript.</noscript>
<div id="search-help" class="m-text m-dim m-text-center">
<p class="m-noindent">Search for symbols, directories, files, pages or
modules. You can omit any prefix from the symbol or file path; adding a
<code>:</code> or <code>/</code> suffix lists all members of given symbol or
directory.</p>
<p class="m-noindent">Use <span class="m-label m-dim">↓</span>
/ <span class="m-label m-dim">↑</span> to navigate through the list,
<span class="m-label m-dim">Enter</span> to go.
<span class="m-label m-dim">Tab</span> autocompletes common prefix, you can
copy a link to the result using <span class="m-label m-dim">⌘</span>
<span class="m-label m-dim">L</span> while <span class="m-label m-dim">⌘</span>
<span class="m-label m-dim">M</span> produces a Markdown link.</p>
</div>
<div id="search-notfound" class="m-text m-warning m-text-center">Sorry, nothing was found.</div>
<ul id="search-results"></ul>
</div>
</div>
</div>
</div>
</div>
<script src="search-v2.js"></script>
<script src="searchdata-v2.js" async="async"></script>
<footer><nav>
<div class="m-container">
<div class="m-row">
<div class="m-col-l-10 m-push-l-1">
<p>Taskflow handbook is part of the <a href="https://taskflow.github.io">Taskflow project</a>, copyright © <a href="https://tsung-wei-huang.github.io/">Dr. Tsung-Wei Huang</a>, 2018–2022.<br />Generated by <a href="https://doxygen.org/">Doxygen</a> 1.8.14 and <a href="https://mcss.mosra.cz/">m.css</a>.</p>
</div>
</div>
</div>
</nav></footer>
</body>
</html>
