4 Terms and definitions[Terms]

This document uses terms consistent with their definitions in isoC and isoCPP. Where a definition is unclear, or where this document diverges from isoC and isoCPP, the definitions in this clause and the attached glossary ([main]) supersede other sources.

4.1 Common Definitions[Terms.Common]

The following definitions are consistent between hlsl and the isoC and isoCPP specifications; they are included here for the reader's convenience.

4.1.1 Correct Data[Terms.CorrectData]

Data is correct if it represents values whose use has specified or unspecified, but not undefined, behavior for all the operations in which the data is used. Data that is the result of undefined behavior is not correct, and may be treated as undefined.

4.1.2 Diagnostic Message[Terms.Diags]

An implementation-defined message, belonging to a subset of the implementation's output messages, which communicates diagnostic information to the user.

4.1.3 Ill-formed Program[Terms.IllFormed]

A program that is not well-formed, for which the implementation is expected to terminate unsuccessfully and produce one or more diagnostic messages.

4.1.4 Implementation-defined Behavior[Terms.ImpDef]

Behavior, of a well-formed program operating on correct data, that may vary by implementation; the implementation is expected to document the behavior.

4.1.5 Implementation Limits[Terms.ImpLimits]

Restrictions imposed upon programs by the implementation of either the compiler or runtime environment. The compiler may seek to surface runtime-imposed limits to the user for improved user experience.

4.1.6 Undefined Behavior[Terms.Undefined]

Behavior of invalid program constructs or incorrect data for which this standard imposes no requirements or which it does not describe in sufficient detail.

4.1.7 Unspecified Behavior[Terms.Unspecified]

Behavior, of a well-formed program operating on correct data, that may vary by implementation; the implementation is not expected to document the behavior.

4.1.8 Well-formed Program[Terms.WellFormed]

An hlsl program constructed according to the syntax rules, diagnosable semantic rules, and the One Definition Rule.

4.2 General Terms[Terms.General]

The following terms are specific to hlsl and are used throughout the specification.

4.2.1 Runtime Implementation[Terms.Runtime]

A runtime implementation refers to a full-stack implementation of a software runtime that can facilitate the execution of hlsl programs. This broad definition includes libraries and device driver implementations. The hlsl specification does not distinguish between the user-facing programming interfaces and the vendor-specific backing implementation.

4.2.2 Host and Device[Terms.HostDevice]

hlsl is a data-parallel programming language designed for programming auxiliary processors in a larger system. In this context the host refers to the primary processing unit that runs the application which in turn uses a runtime to execute hlsl programs on a supported device. There is no strict requirement that the host and device be different physical hardware, although they commonly are. The separation of host and device in this specification is useful for defining the execution and memory model as well as specific semantics of language constructs.

4.3 spmd Terminology[Terms.SPMD]

hlsl is a spmd programming language. The following terms are used to describe the execution model of spmd programs and the semantics of language constructs that are specific to spmd programming.

4.3.1 lane[Terms.Lane]

A lane represents a single computed element in an spmd program. In a traditional programming model it would be analogous to a thread of execution, however it differs in one key way. In multi-threaded programming, threads advance independently of each other. In spmd programs, a group of lanes may execute instructions in lockstep because each instruction may be a simd instruction computing the results for multiple lanes simultaneously, or synchronizing execution across multiple lanes or waves. A lane has an associated lane state which denotes the execution status of the lane (4.3.6).

4.3.2 wave[Terms.Wave]

A grouping of lanes for execution is called a wave. The size of a wave is defined as the maximum number of active lanes the wave supports. wave sizes vary by hardware architecture, and are required to be powers of two. The number of active lanes in a wave can be any value between one and the wave size.

Some hardware implementations support multiple wave sizes. There is no overall minimum wave size requirement, although some language features do have minimum lane count requirements.

hlsl is explicitly designed to run on hardware with arbitrary wave sizes. Hardware architectures may implement waves as simt where each thread executes instructions in lockstep. This is not a requirement of the model. Some constructs in hlsl require synchronized execution. Such constructs will explicitly specify that requirement.
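As an illustrative sketch (not part of the definition), a program can query the wave size and a lane's position within its wave through the WaveGetLaneCount() and WaveGetLaneIndex() intrinsics; the output buffer name here is hypothetical:

```hlsl
// Hypothetical compute shader fragment illustrating wave queries.
RWStructuredBuffer<uint> Out; // hypothetical output buffer

[numthreads(64, 1, 1)]
void main(uint3 tid : SV_DispatchThreadID) {
  uint waveSize = WaveGetLaneCount(); // the wave size, a power of two (e.g. 32 or 64)
  uint lane = WaveGetLaneIndex();     // this lane's index within the wave, in [0, wave size)
  Out[tid.x] = waveSize * 1000 + lane;
}
```

Because the wave size varies by hardware, portable programs treat WaveGetLaneCount() as a runtime value rather than assuming a particular width.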

4.3.3 quad[Terms.Quad]

A quad is a subdivision of four lanes in a wave which are computing adjacent values. In pixel shaders a quad may represent four adjacent pixels and quad operations allow passing data between adjacent lanes. In compute shaders quads may be one or two dimensional depending on the workload dimensionality. Quad operations require four active lanes.
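A minimal sketch of a quad operation in a pixel shader: QuadReadAcrossX() returns the value computed by the horizontally adjacent lane in the same quad. The shader body is hypothetical:

```hlsl
// Hypothetical pixel shader fragment: each lane reads the value
// computed by its horizontal neighbor within the quad.
float4 main(float2 uv : TEXCOORD0) : SV_Target {
  float value = uv.x;                      // per-lane computation
  float neighbor = QuadReadAcrossX(value); // value from the horizontally adjacent lane
  return float4(value, neighbor, 0.0, 1.0);
}
```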

4.3.4 threadgroup[Terms.Group]

A grouping of lanes executing the same shader to produce a combined result is called a threadgroup. threadgroups are independent of simd hardware specifications. The dimensions of a threadgroup are defined in three dimensions. The maximum extent along each dimension of a threadgroup, and the total size of a threadgroup, are implementation limits defined by the runtime and enforced by the compiler. If the threadgroup size is smaller than the wave size, or is not a whole multiple of the wave size, the remaining hardware lanes are implicitly inactive.
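As a sketch, threadgroup dimensions are declared with the numthreads attribute; the buffer name below is hypothetical:

```hlsl
// An 8 x 8 x 1 threadgroup: 64 lanes total. On hardware with a wave
// size of 32 this threadgroup spans two waves; with a wave size of
// 64, exactly one. With a wave size of 128, the upper 64 hardware
// lanes would be inactive.
RWStructuredBuffer<uint> Out; // hypothetical output buffer

[numthreads(8, 8, 1)]
void main(uint3 gid : SV_GroupThreadID, uint gidx : SV_GroupIndex) {
  Out[gidx] = gid.x + gid.y; // trivial per-lane work
}
```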

4.3.5 dispatch[Terms.Dispatch]

A grouping of threadgroups which represents the full execution of an hlsl program and produces a complete result for all input data elements.

4.3.6 lane States[Terms.LaneState]

A lane may be in one of four primary states: active, helper, inactive, and predicated off.

An active lane is enabled to perform computations and produce output results based on the initial launch conditions and program control flow.

A helper lane is a lane which would not be executed by the initial launch conditions except that its computations are required for adjacent pixel operations in pixel fragment shaders. A helper lane will execute all computations but will not perform writes to buffers, and any outputs it produces are discarded. Helper lanes may be required for lane-cooperative operations to execute correctly.

An inactive lane is a lane that is not executed by the initial launch conditions. This can occur if there are insufficient inputs to fill all lanes in the wave, or when lanes are disabled to reduce per-thread memory requirements or register pressure.

A predicated off lane is a lane that is not being executed due to program control flow. A lane may be predicated off when control flow for the lanes in a wave diverge and one or more lanes are temporarily not executing.

The diagram below illustrates the transitions between lane states:

State transitions for lane states

4.3.7 Uniformity[Terms.Uniformity]

Uniformity is a property of either a value or a control flow construct. Uniformity can be discussed in terms of any grouping of lanes (e.g. quad, threadgroup, wave, or dispatch); if no grouping of lanes is explicitly mentioned, a wave is assumed.

When referring to a value, uniformity means that the value is the same across all lanes in the grouping. When referring to a control flow construct, uniformity means that all lanes in the grouping are executing the same path through the program’s control flow.

Uniformity can be a static property that is determined at compile time, or a dynamic property that is determined at runtime.
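As a sketch, dynamic wave uniformity of a value can be queried with the WaveActiveAllEqual() intrinsic, which returns true in every active lane of the wave exactly when the argument has the same value in all of them. The buffer names are hypothetical:

```hlsl
StructuredBuffer<uint> MaterialIDs; // hypothetical input buffer
RWStructuredBuffer<uint> Flags;     // hypothetical output buffer

[numthreads(64, 1, 1)]
void main(uint3 tid : SV_DispatchThreadID) {
  uint materialID = MaterialIDs[tid.x];
  // True in all active lanes iff materialID is dynamically wave-uniform.
  bool waveUniform = WaveActiveAllEqual(materialID);
  Flags[tid.x] = waveUniform ? 1u : 0u;
}
```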

4.3.8 Divergence[Terms.Divergence]

Divergence is a dynamic property of program execution where one or more lanes in a wave are executing different paths through the program’s control flow. Control flow that causes divergence is said to be divergent.
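A minimal sketch of divergent control flow (identifiers and the output buffer are hypothetical): whenever a wave contains lanes with both even and odd x coordinates, the branch below diverges.

```hlsl
RWStructuredBuffer<float> Out; // hypothetical output buffer

[numthreads(64, 1, 1)]
void main(uint3 tid : SV_DispatchThreadID) {
  float result;
  // Divergent branch: within one wave, lanes with even and odd tid.x
  // take different paths. While one path executes, lanes on the other
  // path are predicated off.
  if (tid.x % 2 == 0)
    result = 1.0;
  else
    result = -1.0;
  // Control flow reconverges here.
  Out[tid.x] = result;
}
```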

4.3.9 Convergence[Terms.Convergence]

Convergence is a property of an operation that requires the lanes executing the operation to synchronize in order to produce correct results. Operations requiring convergence are said to be convergent.
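As a sketch of a convergent operation, WaveActiveSum() combines a value from every active lane in the wave, which requires those lanes to synchronize; summing 1 from each active lane yields the number of active lanes. The output buffer is hypothetical:

```hlsl
RWStructuredBuffer<uint> Counts; // hypothetical output buffer

[numthreads(64, 1, 1)]
void main(uint3 tid : SV_DispatchThreadID) {
  // Convergent: the active lanes of the wave synchronize to combine
  // their values. Each active lane contributes 1, so the result is
  // the active lane count.
  uint activeLanes = WaveActiveSum(1u);
  Counts[tid.x] = activeLanes;
}
```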

4.4 Memory Spaces[Terms.MemorySpaces]

Memory spaces refer to the different types of memory that programs can access. Each memory space has different properties and semantics for how it can be accessed and used.