// This file is part of the AMD & HSC Work Graph Playground.
//
// Copyright (C) 2024 Advanced Micro Devices, Inc. and Coburg University of Applied Sciences and Arts.
// All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#include "Common.h"
// In this tutorial, we're going to take a look at all the different options for launching nodes in a work graph.
// Work graphs replace the concepts of draws (e.g., DrawInstanced, DrawIndexedInstanced) and
// dispatches (e.g., Dispatch, DispatchRays) with records and node launches.
// Each record invokes a node and the node can choose from one of three launch modes:
//
// "broadcasting":
// A broadcasting node is - on the surface - very similar to a compute shader:
// Each record invokes a 3D grid of thread groups, with every thread group consisting of a 3D grid of threads.
// You specify the size of the thread group with the [NumThreads(x, y, z)] attribute.
// You can set the dispatch size (i.e., size of the thread-group grid) either statically using the
// [NodeDispatchGrid(x, y, z)] attribute, or dynamically as part of the node input record.
// We'll take a look at the latter part in Task 1.
//
// "thread":
// Each record invokes a single thread, but unlike compute shaders or broadcasting nodes with [NumThreads(1, 1, 1)],
// the work graph runtime can combine multiple records and process them in a single thread group. This lets the
// runtime make better use of the available GPU resources.
// Thread launches are ideal for single-threaded workloads (e.g., our Print functions).
// You have already seen thread node launches in action in tutorial-0 and tutorial-1.
//
// "coalescing":
// Unlike "broadcasting" or "thread" nodes, "coalescing" nodes can accept more than one input record:
// A set of one or more input records invokes a single thread group. The size of the thread group is again denoted
// by the [NumThreads(x, y, z)] attribute.
// The maximum number of input records is declared using the [MaxRecords(...)] attribute, in the same way as
// we declared the output limits of a node in tutorial-1.
// Note that a limit of, say, [MaxRecords(5)] only guarantees that the coalescing node is invoked with one to five records;
// it does not mean that the input will always contain five records.
// The actual number of input records can be queried with the "Count()" method of the "GroupNodeInputRecords" object.
// In this tutorial, we'll use a coalescing node to combine two rectangles into a single one if they share a vertical edge.
//
// See https://microsoft.github.io/DirectX-Specs/d3d/WorkGraphs.html#node-types for more details on all the launch modes.
// In our Work Graph Playground, we skip the experimental "mesh" launch mode. If you're interested,
// you can find more details on mesh nodes in Work Graphs here: https://gpuopen.com/learn/work_graphs_mesh_nodes.
//
// In tutorial-1, we declared inputs to our thread-launch nodes using the "ThreadNodeInputRecord" object.
// Nodes with the "broadcasting" and "coalescing" launch modes use "DispatchNodeInputRecord" and "GroupNodeInputRecords", respectively.
// See https://microsoft.github.io/DirectX-Specs/d3d/WorkGraphs.html#node-input-declaration for details.
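//
// Sketch (not part of the tasks): as a rough orientation, the three input declarations could look as follows.
// "ExampleRecord" and all "Example..." node names are hypothetical and do not exist in this tutorial.
//
//     struct ExampleRecord { uint value; };
//
//     [Shader("node")]
//     [NodeLaunch("broadcasting")]
//     [NodeDispatchGrid(4, 1, 1)]    // every record launches a fixed grid of 4 thread groups
//     [NumThreads(8, 8, 1)]
//     void ExampleBroadcasting(DispatchNodeInputRecord<ExampleRecord> input) { /* ... */ }
//
//     [Shader("node")]
//     [NodeLaunch("thread")]         // every record launches exactly one thread
//     void ExampleThread(ThreadNodeInputRecord<ExampleRecord> input) { /* ... */ }
//
//     [Shader("node")]
//     [NodeLaunch("coalescing")]
//     [NumThreads(32, 1, 1)]
//     void ExampleCoalescing([MaxRecords(5)] GroupNodeInputRecords<ExampleRecord> inputs)
//     {
//         const uint count = inputs.Count();    // between 1 and 5 records per thread group
//     }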
//
// In this assignment, we are again going to draw rectangles, but this time, instead of just drawing the outline,
// we are also going to fill the rectangle. This is an ideal use case for broadcasting nodes: we need as many threads as pixels
// to efficiently fill large areas.
// Additionally, we're going to demonstrate the capabilities of coalescing nodes by combining two neighboring rectangles
// into a single one if they share a vertical edge.
// The resulting graph for this tutorial will be as follows:
//
//             +----------------------------+
//             | NodeLaunch("broadcasting") |
//             |           Entry            |
//             +----------------------------+
//                            |
//             +--------------+--------------+
//             v                             v
// +----------------------+    +--------------------------+
// | NodeLaunch("thread") |    | NodeLaunch("coalescing") |
// |      PrintLabel      |    |      MergeRectangle      |
// +----------------------+    +--------------------------+
//                                           |
//                                           v
//                            +----------------------------+
//                            | NodeLaunch("broadcasting") |
//                            |       FillRectangle        |
//                            +----------------------------+
//
// Task 1: Have a look at the "FillRectangle" node below. It currently uses a fixed dispatch grid set by the
//         [NodeDispatchGrid(...)] attribute, so all rectangles have the same size.
//         As the GetRectanglePositionAndSize helper computes an individual position and size for every rectangle,
//         we need to change this to a dynamic dispatch grid set by the input record.
//         Start by adding variables for the dispatch grid and rectangle size to the "RectangleRecord" struct.
//         Next, change the [NodeDispatchGrid(...)] attribute of the "FillRectangle" node to [NodeMaxDispatchGrid(...)]
//         and update the dispatch size limit in the x dimension.
//         Lastly, set the dispatch grid and rectangle size for the rectangle records in the "Entry" node.
//         Once you're done, the rectangles should cover a continuous horizontal rectangle.
// Task 2: Change the "rectangleOutput" of the "Entry" node to target the "MergeRectangle" coalescing node.
//         The "MergeRectangle" node takes one or two rectangles, and we'll later use this functionality to combine
//         rectangles if they share an edge. In this task, you are going to implement the fallback path and
//         pass all incoming records through to the "FillRectangle" node.
//         Once you're done, everything should still look the same.
// Task 3: Complete the implementation of the "MergeRectangle" node.
//         Complete the call to the "ComputeCombinedRect" helper function.
//         If this helper returns "true", you must emit a single record to the "FillRectangle" node.
//         Position and size of this rectangle are given by the "ComputeCombinedRect" helper.
//         For the color of this rectangle, you can re-use the color from any of the input records (e.g., record[0]).
//         Once you're done, you should see the same area being filled, but this time with just three instead of five rectangles.
//         As five is not divisible by two, there is also one rectangle that could not be merged and is passed through as-is from
//         the "MergeRectangle" node to the "FillRectangle" node.
// Task 4: Increase the dispatch grid of the "Entry" node in the x dimension to emit more rectangles.
//         You should now see the merged rectangles flickering, as the input to the coalescing node is non-deterministic
//         and depends on the timing of the different thread groups of the "Entry" node.
//         This step is omitted from the sample solution.
// Constants that define the layout and positioning of rectangles.
static const int RectangleSize = 48;
// Size increase with every rectangle.
static const int RectangleSizeStep = 4;
static const int2 RectangleCursorOffset = int2(-8, -20);
static const int2 InitialRectanglePosition = int2(20, 60);
struct PrintLabelRecord {
    int2 topLeft;
    uint index;
};
// [Task 1]:
// Add a dispatch size and rectangle size member to the "RectangleRecord" struct.
// The rectangle size will be in pixels, while the dispatch size will control how many thread
// groups are dispatched. Each thread group will then cover an 8x8 pixel area.
// Dispatch size (or dispatch grid) of a broadcasting node is specified in the record with the
// "SV_DispatchGrid" semantic. The dispatch grid can be of type uint, uint2, uint3, uint16_t, uint16_t2 or uint16_t3.
// See https://microsoft.github.io/DirectX-Specs/d3d/WorkGraphs.html#sv_dispatchgrid for more details.
// In our case, we only need two dimensions, thus we recommend using uint2.
struct RectangleRecord {
    int2   topLeft;
    float4 color;
};
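
// Sketch (not part of the tasks): a record with a dynamic dispatch grid could be declared along these lines.
// "ExampleGridRecord" and its member names are hypothetical and only illustrate the SV_DispatchGrid semantic.
//
//     struct ExampleGridRecord {
//         uint2 dispatchGrid : SV_DispatchGrid;    // number of thread groups to launch in x and y
//         uint  payload;                           // any other per-record data
//     };
//
// A broadcasting node that consumes such a record declares [NodeMaxDispatchGrid(x, y, z)] instead of
// [NodeDispatchGrid(x, y, z)], and the grid written to the record has to stay within that declared maximum.
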
// Helper function to compute the "position" and "size" for the rectangles from an "index".
void GetRectanglePositionAndSize(in uint index, out int2 position, out int2 size);
[Shader("node")]
[NodeIsProgramEntry]
[NodeLaunch("broadcasting")]
// [Task 4]: Increment the x dimension of the dispatch grid and observe the changes to the rectangle merging.
[NodeDispatchGrid(1, 1, 1)]
[NumThreads(5, 1, 1)]
void Entry(
    uint dispatchThreadId : SV_DispatchThreadID,
    [MaxRecords(5)]
    [NodeId("PrintLabel")]
    NodeOutput<PrintLabelRecord> printLabelOutput,
    [MaxRecords(5)]
    // [Task 2]: Change this output to target the "MergeRectangle" node.
    //           Hint: As "FillRectangle" and "MergeRectangle" use the same input record type, your change should be rather small.
    [NodeId("FillRectangle")]
    NodeOutput<RectangleRecord> rectangleOutput)
{
    // Rectangle position and size for each thread
    int2 threadRectanglePosition;
    int2 threadRectangleSize;
    GetRectanglePositionAndSize(dispatchThreadId, threadRectanglePosition, threadRectangleSize);

    ThreadNodeOutputRecords<PrintLabelRecord> printLabelRecord =
        printLabelOutput.GetThreadNodeOutputRecords(1);
    printLabelRecord.Get().topLeft = threadRectanglePosition;
    printLabelRecord.Get().index   = dispatchThreadId;
    printLabelRecord.OutputComplete();

    ThreadNodeOutputRecords<RectangleRecord> rectangleOutputRecord =
        rectangleOutput.GetThreadNodeOutputRecords(1);
    // [Task 1]:
    // Set the newly created dispatch size and rectangle size in the record.
    // Rectangle size should be set to "threadRectangleSize" above.
    // "threadRectangleSize" is given in pixels, but the dispatch size is given in thread groups.
    // As each thread group covers an 8x8 pixel area, we need to divide the rectangle size by 8
    // and round up to get the required dispatch size.
    // You can use the "DivideAndRoundUp(int2 dividend, int2 divisor)" function in Common.h to perform this calculation.
    rectangleOutputRecord.Get().topLeft = threadRectanglePosition;
    rectangleOutputRecord.Get().color   = UintToColor(dispatchThreadId);
    rectangleOutputRecord.OutputComplete();
}
[Shader("node")]
[NodeLaunch("broadcasting")]
// [Task 1]:
// Change this from [NodeDispatchGrid(...)] to [NodeMaxDispatchGrid(...)] to allow for a dynamic grid size.
// Together with the SV_DispatchGrid semantic in the "RectangleRecord" struct, this will
// enable the "FillRectangle" node to read its dispatch grid dynamically from the input record.
// The width of the rectangles increases linearly by "RectangleSizeStep" with each one.
// We currently have 5 rectangles, but we want to increase this number later on.
// Set the maximum dispatch grid to allow for at least 20 rectangles
// (i.e., a rectangle which has a width of RectangleSize + 20 * RectangleSizeStep).
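// Worked arithmetic for this limit, assuming the constants at the top of this file stay unchanged
// (RectangleSize = 48, RectangleSizeStep = 4) and each thread group covers 8x8 pixels:
//     widest rectangle  : 48 + 20 * 4 = 128 pixels
//     thread groups in x: 128 / 8     = 16
//     thread groups in y: 48 / 8      = 6 (the height does not change)
// Widths that are not a multiple of 8 have to be rounded up, e.g., 52 pixels need (52 + 7) / 8 = 7 thread groups.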
[NodeDispatchGrid(6, 6, 1)]
[NumThreads(8, 8, 1)]
[NodeId("FillRectangle")]
void FillRectangleNode(
    DispatchNodeInputRecord<RectangleRecord> inputRecord,
    uint2 dispatchThreadId : SV_DispatchThreadID
)
{
    const RectangleRecord record = inputRecord.Get();
    const int2 pixel = record.topLeft + dispatchThreadId;

    // [Task 1]:
    // Each thread group can fill up to 8x8 pixels. If the rectangle size is not divisible by 8,
    // we have to round up to ensure we launch enough thread groups.
    // Thus, some thread groups may extend past the size of the rectangle.
    // Add a check to test if "dispatchThreadId" is within the rectangle size (supplied by the input record).
    if (// Check if pixel is within bounds of render target.
        all(pixel >= 0) && all(pixel < RenderSize)) {
        RenderTarget[pixel] = record.color;
    }
}
[Shader("node")]
[NodeLaunch("thread")]
[NodeId("PrintLabel")]
void PrintLabelNode(ThreadNodeInputRecord<PrintLabelRecord> inputRecord)
{
    const PrintLabelRecord record = inputRecord.Get();

    Cursor cursor = Cursor(record.topLeft + RectangleCursorOffset);
    Print(cursor, "|");
    PrintUint(cursor, record.index);
}
// Helper function to check if two rectangles share a vertical edge.
// Rectangles are defined by the position of their top-left corner and their size.
// If rectangles share a vertical edge, "topLeft" and "size" will contain the position and
// size of a rectangle covering both input rectangles.
bool ComputeCombinedRect(in int2 topLeft0, in int2 size0, in int2 topLeft1, in int2 size1, out int2 topLeft, out int2 size);
[Shader("node")]
[NodeLaunch("coalescing")]
[NumThreads(1, 1, 1)]
[NodeId("MergeRectangle")]
void MergeRectangleNode(
    [MaxRecords(2)]
    GroupNodeInputRecords<RectangleRecord> inputRecords,
    [MaxRecords(2)]
    [NodeId("FillRectangle")]
    NodeOutput<RectangleRecord> output
)
{
    // Only test whether the rectangles can be merged if two rectangles were passed.
    if (inputRecords.Count() == 2) {
        int2 topLeft, size;

        // [Task 3]:
        // Replace the parameters with the data from the input records.
        // Use "inputRecords.Get(uint index)" or "inputRecords[uint index]" to access a specific input record.
        if (ComputeCombinedRect(/* in topLeft0: replace me! */ 0,
                                /* in size0   : replace me! */ 0,
                                /* in topLeft1: replace me! */ 1,
                                /* in size1   : replace me! */ 0,
                                /* out */ topLeft,
                                /* out */ size))
        {
            // [Task 3]:
            // Emit a single record to the "FillRectangle" node here.
            // Use "topLeft" and "size" from above for this rectangle.
            // Compute and set the dispatch size in the same way as you did in the "Entry" node.
            // You can re-use the color from any of the input records, or compute a new color for
            // the merged rectangle here.

            // If we found two rectangles to merge, we can end the node here and thus
            // skip passing the input records through to the "FillRectangle" node.
            // Note: as we only have a single thread in our thread group, such control flow
            //       is allowed, since all calls to output records are still thread-group uniform.
            return;
        }
    }

    // [Task 2]:
    // Pass all incoming records through to the "FillRectangle" output.
    // Use "inputRecords.Count()" to get the number of input records,
    // and thus also the number of required output records.
    // Use ".Get(uint index)" or the "[]"-operator to get/set an input/output record.
}
// ================= Helper Functions =================
// Helper function to compute position and size for the rectangles.
void GetRectanglePositionAndSize(in uint index, out int2 position, out int2 size) {
    position = InitialRectanglePosition +
               int2(index, 0) * RectangleSize +
               int2(index * (index - 1) / 2, 0) * RectangleSizeStep;
    size     = RectangleSize.xx + int2(index, 0) * RectangleSizeStep;
}
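
// Worked example for GetRectanglePositionAndSize, assuming the constants at the top of this file stay unchanged
// (InitialRectanglePosition = (20, 60), RectangleSize = 48, RectangleSizeStep = 4):
//     index 0: position = (20, 60),  size = (48, 48)
//     index 1: position = (68, 60),  size = (52, 48)
//     index 2: position = (120, 60), size = (56, 48)
// Each rectangle starts where the previous one ends (20 + 48 = 68, 68 + 52 = 120), so neighboring
// rectangles share a vertical edge.
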
// Helper function to check if two rectangles share a vertical edge.
bool ComputeCombinedRect(in int2 topLeft0, in int2 size0, in int2 topLeft1, in int2 size1, out int2 topLeft, out int2 size)
{
    const int2 topRight0 = topLeft0 + int2(size0.x, 0);
    const int2 topRight1 = topLeft1 + int2(size1.x, 0);

    // Compute top-left corner of combined rectangle.
    topLeft = min(topLeft0, topLeft1);
    // Compute size of combined rectangle.
    const int width = max(topRight0, topRight1).x - topLeft.x;
    size = int2(width, size0.y);

    return
        // check if rectangles have same height.
        size0.y == size1.y &&
        // check if rectangles share a vertical edge (or overlap horizontally).
        (width <= (size0.x + size1.x));
}
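
// Worked example for ComputeCombinedRect, assuming three rectangles A, B and C of height 48 with
// A: topLeft = (20, 60), width 48;  B: topLeft = (68, 60), width 52;  C: topLeft = (120, 60), width 56.
//     A and B: topLeft = (20, 60), width = (68 + 52) - 20 = 100, size0.x + size1.x = 48 + 52 = 100
//              100 <= 100, so the helper returns true with size = (100, 48).
//     A and C: topLeft = (20, 60), width = (120 + 56) - 20 = 156, size0.x + size1.x = 48 + 56 = 104
//              156 > 104, so the helper returns false, as B lies between the two rectangles.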