The DirectML wrappers are a key component of Harlinn.AI, providing an API on top of the DirectML API, facilitating robust C++ DirectML development.

Yolo v9

DirectML is a high-performance, hardware-accelerated DirectX 12 library for executing AI workloads. DirectML provides hardware acceleration for common AI tasks across a broad range of hardware, including neural processing units (NPU) and DirectX 12-capable GPUs.

NPUs are processing units that simulate the neural network of the brain. They can process large amounts of data in parallel, performing trillions of operations per second, and provide a cost-efficient alternative for AI workloads.

Using DirectML we can create hardware accelerated AI apps capable of leveraging the capabilities of the system for executing those workloads efficiently, regardless of whether they run on GPUs or NPUs.

DirectML leverages the Direct3D 12 execution model, where efficient execution of AI workloads can be interleaved with execution of graphics workloads. For now, this is an important feature of DirectML, as the current generation of NPUs lacks the processing power of high-end GPUs.

DirectML executes hardware-accelerated AI primitives, called operators, on a suitable device on the system running the AI workload. Operators are building blocks that can be executed individually, or composed into a graph that fully describes an AI task that can be compiled and executed by DirectML.

Harlinn.AI redeclares the operator description structures provided with DirectML, providing a complete set of new structures describing all the operators implemented by DirectML. Every operator description structure in the Harlinn::AI::DML namespace is binary compatible with the original DirectML structure describing the same operator, but with some key differences:

  1. A static constexpr DML::OperatorType OperatorType identifying the operator.
  2. All operator describing structures are directly, or indirectly, derived from struct BaseOperatorDesc.
  3. Every operator member variable gets initialized to a default value.
  4. The operator describing structures have constructors accepting similar arguments, in the same order, even when the struct declares its members in a different order.
  5. Nearly all unary operators are derived from struct UnaryOperatorDesc. The exceptions are the unary operators defined by DirectML that have an incompatible memory layout.
  6. Nearly all binary operators are derived from struct BinaryOperatorDesc. The exceptions are the binary operators defined by DirectML that have an incompatible memory layout.

These features make it easier to create template-based code in C++ for the operators.
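As a rough illustration of why these conventions help with templates, the sketch below uses hypothetical stand-in types in a `sketch` namespace. They only mimic the conventions listed above (a common empty base, unary and binary bases, and a static constexpr `OperatorType` member) and are not the Harlinn.AI declarations. The helper names `KindOf` and `KnownTensorCount` are likewise invented for this example.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins that mimic the conventions described above.
// These are NOT the Harlinn::AI::DML declarations.
namespace sketch
{
    enum class OperatorType : std::uint32_t { Invalid, ElementWiseIdentity, ElementWiseAdd };

    struct TensorDesc; // opaque here; the descriptors only store pointers to it

    struct BaseOperatorDesc { };

    struct UnaryOperatorDesc : BaseOperatorDesc
    {
        const TensorDesc* InputTensor = nullptr;
        const TensorDesc* OutputTensor = nullptr;
    };

    struct BinaryOperatorDesc : BaseOperatorDesc
    {
        const TensorDesc* ATensor = nullptr;
        const TensorDesc* BTensor = nullptr;
        const TensorDesc* OutputTensor = nullptr;
    };

    struct ElementWiseIdentityOperatorDesc : UnaryOperatorDesc
    {
        static constexpr sketch::OperatorType OperatorType = sketch::OperatorType::ElementWiseIdentity;
    };

    struct ElementWiseAddOperatorDesc : BinaryOperatorDesc
    {
        static constexpr sketch::OperatorType OperatorType = sketch::OperatorType::ElementWiseAdd;
    };

    // Generic code can read the operator type from the descriptor type itself.
    template<typename T>
    constexpr sketch::OperatorType KindOf()
    {
        static_assert(std::is_base_of_v<BaseOperatorDesc, T>, "T must be an operator descriptor");
        return T::OperatorType;
    }

    // Compile-time dispatch on the base class. A result of 0 means the tensor
    // layout cannot be inferred from the base class alone.
    template<typename T>
    constexpr int KnownTensorCount()
    {
        if constexpr (std::is_base_of_v<BinaryOperatorDesc, T>) return 3;
        else if constexpr (std::is_base_of_v<UnaryOperatorDesc, T>) return 2;
        else return 0;
    }
}
```

Because both helpers are constexpr, a graph-building framework could select code paths per operator without any runtime type information.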

This is used by Harlinn.AI DirectML eXtensions (DML.X) to implement a framework that makes it easier to compose a graph of operators that describes an AI task that can be compiled and executed by DirectML.

The DirectML wrappers can also be used with the Microsoft.AI.MachineLearning package, which builds on the ONNX Runtime, a cross-platform library that supports the open standard ONNX format for machine learning models. The ONNX Runtime can use DirectML as one of its execution providers, along with other backends such as CPU, CUDA, or TensorRT.

Operator Descriptors

Operator descriptors are used to define the datatype and shape of input and output tensors. In many cases the operator descriptors also hold additional parameters for the operation they describe.

BaseOperatorDesc

All operator descriptors are derived from BaseOperatorDesc, an empty struct that adds zero bytes to the derived operator descriptors thanks to the empty base optimization.

struct BaseOperatorDesc abstract
{
};
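The effect of the empty base optimization can be checked at compile time. The sketch below uses hypothetical stand-ins: `RawUnaryDesc` plays the role of a native descriptor with two pointer fields, and `sketch::UnaryOperatorDesc` plays the role of a wrapper derived from an empty base. The MSVC-specific `abstract` specifier is left out so the example compiles with any C++11 compiler.

```cpp
#include <cstddef>

// Hypothetical stand-ins; not the actual DirectML or Harlinn.AI types.
namespace sketch
{
    struct TensorDesc; // opaque placeholder

    struct BaseOperatorDesc { }; // empty base

    // Plain struct standing in for the native layout.
    struct RawUnaryDesc
    {
        const TensorDesc* InputTensor;
        const TensorDesc* OutputTensor;
    };

    // Wrapper with the same fields, derived from the empty base.
    struct UnaryOperatorDesc : BaseOperatorDesc
    {
        const TensorDesc* InputTensor = nullptr;
        const TensorDesc* OutputTensor = nullptr;
    };

    // The empty base contributes no storage, so size and field offsets match.
    static_assert(sizeof(UnaryOperatorDesc) == sizeof(RawUnaryDesc),
                  "empty base must not add bytes");
    static_assert(offsetof(UnaryOperatorDesc, InputTensor) == offsetof(RawUnaryDesc, InputTensor),
                  "first field must stay at the same offset");
    static_assert(offsetof(UnaryOperatorDesc, OutputTensor) == offsetof(RawUnaryDesc, OutputTensor),
                  "second field must stay at the same offset");
}
```

Checks of this kind are one way to guard the binary compatibility that the wrappers rely on.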

UnaryOperatorDesc

Nearly all unary DirectML operators have const TensorDesc* InputTensor and const TensorDesc* OutputTensor as the first two member fields of their descriptor type. When this is the case, Harlinn.AI DirectML derives the operator descriptor from UnaryOperatorDesc.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BaseOperatorDesc.

UnaryOperatorWithScaleBiasDesc

Many unary DirectML operators have an optional const DML::ScaleBias* as the third member field of their descriptor type. When this is the case, Harlinn.AI DirectML derives the operator descriptor from UnaryOperatorWithScaleBiasDesc.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorDesc.

BinaryOperatorDesc

Many binary DirectML operators have const TensorDesc* ATensor, const TensorDesc* BTensor and const TensorDesc* OutputTensor as the first three member fields of their descriptor type. When this is the case, Harlinn.AI DirectML derives the operator descriptor from BinaryOperatorDesc.

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BaseOperatorDesc.

ElementWiseIdentityOperatorDesc

Computes the identity for each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)=x\]

or

\[f(x)=x \times scale + bias\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseAbsOperatorDesc

Computes the absolute value for each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)=|x|\]

or

\[f(x)=|x \times scale + bias|\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.
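To make the scale, bias and in-place semantics concrete, here is a CPU reference sketch. It is not how DirectML executes the operator, and `sketch::ScaleBias` and `ApplyUnary` are hypothetical names for this example only. It applies \(g(x) = x \times scale + bias\) when a scale and bias are supplied, then the operator function, and it allows the output to alias the input.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference sketch with hypothetical names; not the DirectML implementation.
namespace sketch
{
    struct ScaleBias
    {
        float Scale;
        float Bias;
    };

    // Applies op(x * scale + bias) per element, or op(x) when scaleBias is null.
    // 'output' may refer to the same vector as 'input' (in-place execution).
    template<typename Op>
    void ApplyUnary(const std::vector<float>& input,
                    std::vector<float>& output,
                    const ScaleBias* scaleBias,
                    Op op)
    {
        assert(input.size() == output.size());
        for (std::size_t i = 0; i < input.size(); ++i)
        {
            const float x = input[i];
            const float g = scaleBias ? x * scaleBias->Scale + scaleBias->Bias : x;
            output[i] = op(g);
        }
    }
}
```

Calling `ApplyUnary(data, data, &scaleBias, [](float v) { return std::fabs(v); })` computes \(|x \times scale + bias|\) in place, which mirrors binding OutputTensor to the same tensor as InputTensor.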

ElementWiseACosOperatorDesc

Computes the inverse cosine of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)=cos^{-1}(x)\]

or

\[f(x)=cos^{-1}(x \times scale + bias)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseAddOperatorDesc

Adds every element in ATensor to its corresponding element in BTensor, placing the result into the corresponding element of OutputTensor.

\[f(a,b)=a+b\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, or both, during binding.

ElementWiseAdd1OperatorDesc

Adds every element in ATensor to its corresponding element in BTensor and places the result into the corresponding element of OutputTensor, with the option for fused activation.

\[f(a,b)=a+b\]

or

\[f(a,b)=FusedActivation(a+b)\]

when fusedActivation is specified.

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
fusedActivation const OperatorDesc* An optional operator descriptor for a fused activation operator to be applied to the output.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, or both, during binding.
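The difference between the plain and the fused form can be sketched on the CPU as follows. The names `AddWithFusedActivation`, `Activation` and `Relu` are hypothetical and chosen for this example. In DirectML the fused activation is described by an operator descriptor rather than a callable, so this only illustrates the arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// CPU reference sketch with hypothetical names; not the DirectML implementation.
namespace sketch
{
    using Activation = std::function<float(float)>;

    // Computes a + b per element and, when an activation is supplied,
    // applies it to the sum before the result is stored.
    inline void AddWithFusedActivation(const std::vector<float>& a,
                                       const std::vector<float>& b,
                                       std::vector<float>& output,
                                       const Activation& fusedActivation = {})
    {
        assert(a.size() == b.size() && a.size() == output.size());
        for (std::size_t i = 0; i < a.size(); ++i)
        {
            const float sum = a[i] + b[i];
            output[i] = fusedActivation ? fusedActivation(sum) : sum;
        }
    }

    // Example activation: rectified linear unit.
    inline float Relu(float x)
    {
        return x > 0.0f ? x : 0.0f;
    }
}
```

Fusing the activation avoids writing the intermediate sum out and reading it back in a second pass, which is the usual motivation for fused operators.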

ElementWiseASinOperatorDesc

Computes the arcsine for each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)=sin^{-1}(x)\]

or

\[f(x)=sin^{-1}(x \times scale + bias)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseATanOperatorDesc

Computes the arctangent for each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)=tan^{-1}(x)\]

or

\[f(x)=tan^{-1}(x \times scale + bias)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseCeilOperatorDesc

Computes the ceiling for each element of InputTensor, placing the result into the corresponding element of OutputTensor. The ceiling of \(x\) is the smallest integer that is greater than or equal to \(x\).

\[f(x)=ceil(x)\]

or

\[f(x)=ceil(x \times scale + bias)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseClipOperatorDesc

Clamps (limits) each element of InputTensor to the closed interval \([Min, Max]\), placing the result into the corresponding element of OutputTensor.

\[f(x)=max(Min,min(x,Max))\]

where \(max(a,b)\) returns the larger and \(min(a,b)\) the smaller of the two values \(a\) and \(b\), or

\[f(x)=max(Min,min(x \times scale + bias,Max))\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.
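The clip formula translates directly into a scalar reference function. The function name and parameter names below are hypothetical and only illustrate the arithmetic for a single element.

```cpp
#include <algorithm>
#include <cassert>

// Scalar reference sketch with hypothetical names; not the DirectML implementation.
namespace sketch
{
    // f(x) = max(Min, min(x, Max))
    inline float Clip(float x, float minValue, float maxValue)
    {
        assert(minValue <= maxValue);
        return std::max(minValue, std::min(x, maxValue));
    }
}
```

With `minValue = 0` and `maxValue = 6`, for example, `Clip` maps -2 to 0, leaves 3 unchanged and maps 9 to 6.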

ElementWiseCosOperatorDesc

Computes the trigonometric cosine of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)=cos(x)\]

or

\[f(x)=cos(x \times scale + bias)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseDivideOperatorDesc

Computes the quotient of each element of ATensor over the corresponding element of BTensor, placing the result into the corresponding element of OutputTensor.

\[f(a,b)=\frac{a}{b}\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, during binding.

ElementWiseExpOperatorDesc

Applies the natural exponentiation function to each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)=e^x\]

or

\[f(x)=e^{x \times scale + bias}\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseFloorOperatorDesc

Computes the floor for each element of InputTensor, placing the result into the corresponding element of OutputTensor.

The floor of \(x\) is the largest integer that is less than or equal to \(x\).

\[f(x)=floor(x)\]

or

\[f(x)=floor(x \times scale + bias)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseLogOperatorDesc

Computes the base-e (natural) logarithm of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

If \(x\) is negative, then this function returns NaN. If \(x\) is 0, then this function returns \(-\infty\).

\[f(x)=\ln x\]

or

\[f(x)=\ln{(x \times scale + bias)}\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseLogicalAndOperatorDesc

Performs a logical AND on each pair of corresponding elements of the input tensors, placing the result (1 for true, 0 for false) into the corresponding element of OutputTensor.

\[f(a,b)= a \land b\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, during binding.

ElementWiseLogicalEqualsOperatorDesc

Performs a logical equals on each pair of corresponding elements of the input tensors, placing the result (1 for true, 0 for false) into the corresponding element of OutputTensor.

\[f(a,b)= a \Leftrightarrow b\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, during binding.

ElementWiseLogicalGreaterThanOperatorDesc

Performs a logical greater than on each pair of corresponding elements of the input tensors, placing the result (1 for true, 0 for false) into the corresponding element of OutputTensor.

\[f(a,b)= a > b\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

ElementWiseLogicalLessThanOperatorDesc

Performs a logical less than on each pair of corresponding elements of the input tensors, placing the result (1 for true, 0 for false) into the corresponding element of OutputTensor.

\[f(a,b)= a < b\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

ElementWiseLogicalNotOperatorDesc

Performs a logical NOT on each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)= \neg x\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseLogicalOrOperatorDesc

Performs a logical OR on each pair of corresponding elements of the input tensors, placing the result into the corresponding element of OutputTensor.

\[f(a,b)= a \lor b\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, during binding.

ElementWiseLogicalXorOperatorDesc

Performs a logical XOR (exclusive or) on each pair of corresponding elements of the input tensors, placing the result into the corresponding element of OutputTensor.

\[f(a,b)= a \oplus b\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, during binding.

ElementWiseMaxOperatorDesc

Takes the greater of two corresponding elements from the input tensors, and places the result into the corresponding element of the output tensor.

\[f(a,b)= max(a, b)\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, during binding.

ElementWiseMeanOperatorDesc

Averages each pair of corresponding elements of the input tensors, placing the result into the corresponding element of OutputTensor.

\[f(a,b)= \frac{a+b}{2}\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, during binding.

ElementWiseMinOperatorDesc

Takes the lesser of two corresponding elements from the input tensors, and places the result into the corresponding element of OutputTensor.

\[f(a,b)= min(a, b)\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, during binding.

ElementWisePowOperatorDesc

Computes each element of InputTensor raised to the power of the corresponding element of ExponentTensor, placing the result into the corresponding element of OutputTensor.

Negative bases are supported when the exponent has an integral value (its data type can still be float); otherwise this operator returns NaN.

When the input tensor and exponent tensor both have integral data type, this operator guarantees exact results.

\[f(x,y)= x^y\]

or

\[f(x,y)= (x \times scale + bias)^y\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the first input tensor to read from.
exponentTensor const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from BaseOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor, or ExponentTensor, during binding.
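The rule for negative bases matches the behavior of the C++ standard library function `std::pow`, so a CPU reference can illustrate it. This is a sketch under the assumption that only the cases described above matter. The text does not say whether DirectML agrees with `std::pow` for every other edge case. `ElementWisePow` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference sketch with a hypothetical name; not the DirectML implementation.
namespace sketch
{
    // Raises each input element to the power of the corresponding exponent element.
    inline std::vector<float> ElementWisePow(const std::vector<float>& input,
                                             const std::vector<float>& exponent)
    {
        assert(input.size() == exponent.size());
        std::vector<float> output(input.size());
        for (std::size_t i = 0; i < input.size(); ++i)
        {
            // std::pow yields NaN for a negative finite base with a
            // non-integral finite exponent.
            output[i] = std::pow(input[i], exponent[i]);
        }
        return output;
    }
}
```

For the inputs `{-2, -2, 4}` and exponents `{3, 0.5, 0.5}`, the first element is a negative base with an integral exponent and yields -8, the second is a negative base with a non-integral exponent and yields NaN, and the third yields 2.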

ElementWiseConstantPowOperatorDesc

Raises each element of InputTensor to the power of Exponent, placing the result into the corresponding element of OutputTensor.

\[f(x)= x^{Exponent}\]

or

\[f(x)= (x \times scale + bias)^{Exponent}\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.
exponent float The exponent that all inputs will be raised to.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseReciprocalOperatorDesc

Computes the reciprocal for each element of the input tensor, placing the result into the corresponding element of the output tensor.

\[f(x)= \frac{1}{x}\]

or

\[f(x)= \frac{1}{x \times scale + bias}\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseSinOperatorDesc

Computes the trigonometric sine of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)= sin(x)\]

or

\[f(x)= sin(x \times scale + bias)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseSqrtOperatorDesc

Computes the square root of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)= \sqrt{x}\]

or

\[f(x)= \sqrt{x \times scale + bias}\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseSubtractOperatorDesc

Subtracts each element of BTensor from the corresponding element of ATensor, placing the result into the corresponding element of OutputTensor.

\[f(a,b)=a-b\]

Constructor parameters:

Parameter Type Description
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BinaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as ATensor, or BTensor, or both, during binding.

ElementWiseTanOperatorDesc

Computes the trigonometric tangent of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x)= tan(x)\]

or

\[f(x)= tan(x \times scale + bias)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseThresholdOperatorDesc

Replaces all elements of InputTensor below the given threshold, Min, with Min. Results are placed into the corresponding element of OutputTensor.

\[f(x)= max(x,Min)\]

or

\[f(x)= max(x \times scale + bias,Min)\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.
min float The minimum value, below which the operator replaces the value with Min.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseQuantizeLinearOperatorDesc

Performs the following linear quantization function on every element in InputTensor with respect to its corresponding element in ScaleTensor and ZeroPointTensor, placing the results in the corresponding element of OutputTensor.

\[f(input, scale, zeroPoint) = clamp(round(\frac{input}{scale}) + zeroPoint, Min, Max)\]

Where Min is 0 and Max is 255 for UInt8 output, and Min is -128 and Max is 127 for Int8 output.

Quantizing involves converting to a lower-precision data type in order to accelerate arithmetic. It’s a common way to increase performance at the cost of precision. A group of 8-bit values can be computed faster than a group of 32-bit values can.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleTensor const TensorDesc* The tensor containing the scales. If InputTensor is Int32, then ScaleTensor must be Float32. Otherwise, ScaleTensor must have the same DataType as InputTensor. A scale value of 0 results in undefined behavior.
zeroPointTensor const TensorDesc* The tensor containing the desired zero point for the quantization.

Derived from BaseOperatorDesc.
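
To make the quantization formula concrete, here is a minimal CPU sketch for a single element with UInt8 output. QuantizeLinearUInt8 is a hypothetical helper, not part of Harlinn.AI, and the rounding mode (std::nearbyint under the default floating point environment) is an assumption, since the text above does not say how halfway cases are rounded.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// CPU reference for clamp(round(input / scale) + zeroPoint, 0, 255).
std::uint8_t QuantizeLinearUInt8(float input, float scale, std::uint8_t zeroPoint)
{
    // A scale of 0 results in undefined behavior for the operator.
    assert(scale != 0.0f);
    const float shifted = std::nearbyint(input / scale) + static_cast<float>(zeroPoint);
    // Clamp to the representable range of the UInt8 output.
    const float clamped = std::min(std::max(shifted, 0.0f), 255.0f);
    return static_cast<std::uint8_t>(clamped);
}
```

With a scale of 0.5 and a zero point of 128, an input of 3.0 maps to 134, while values far outside the range saturate at 0 or 255.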

ElementWiseDequantizeLinearOperatorDesc

Performs the following linear dequantization function on every element in InputTensor with respect to its corresponding element in ScaleTensor and ZeroPointTensor, placing the results in the corresponding element of OutputTensor.

\[f(input, scale, zeroPoint) = (input - zeroPoint) \times scale\]

Quantization is a common way to increase performance at the cost of precision. A group of 8-bit int values can be computed faster than a group of 32-bit float values can. Dequantizing converts the encoded data back to its domain.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleTensor const TensorDesc* The tensor containing the scales. If InputTensor is Int32, then ScaleTensor must be Float32. Otherwise, ScaleTensor must have the same DataType as InputTensor. A scale value of 0 results in undefined behavior.
zeroPointTensor const TensorDesc* The tensor containing the zero point that was used for quantization.

Derived from BaseOperatorDesc.
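
The dequantization formula can be sketched on the CPU in the same way. DequantizeLinearUInt8 is a hypothetical helper for a single UInt8 element and does not use the Harlinn.AI API.

```cpp
#include <cassert>
#include <cstdint>

// CPU reference for f(input, scale, zeroPoint) = (input - zeroPoint) * scale.
float DequantizeLinearUInt8(std::uint8_t input, float scale, std::uint8_t zeroPoint)
{
    // The parameter table above lists a scale of 0 as undefined behavior.
    assert(scale != 0.0f);
    // Subtract in a signed type so that input < zeroPoint yields a negative value.
    const int centered = static_cast<int>(input) - static_cast<int>(zeroPoint);
    return static_cast<float>(centered) * scale;
}
```

With a scale of 0.5 and a zero point of 128, the encoded value 134 decodes to 3.0 and 120 decodes to -4.0.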

ActivationELUOperatorDesc

Performs an exponential linear unit (ELU) activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \begin{cases} x, & \text{ if } x > 0\\ \alpha \times (\exp(x) - 1), & \text{ if } x \leq 0 \end{cases}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
alpha float The alpha coefficient. The default for this value is 1.0.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationHardMaxOperatorDesc

Performs a hardmax function on each element of InputTensor, placing the result into the corresponding element of OutputTensor.

The operator computes the hardmax (1 for the first occurrence of the largest value in the layer, and 0 for all other values) of each row in the given input.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. This tensor must have an effective rank no greater than 2. The effective rank of a tensor is the DimensionCount of the tensor, excluding leftmost dimensions of size 1. For example a tensor size of [ 1, 1, BatchCount, Width ] is valid, and is equivalent to a tensor of sizes [ BatchCount, Width ].
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

The operator computes the hardmax (1 for the first maximum value, and 0 for all others) values for each layer in the batch of the given input. The input is a 2-D tensor of size (batch_size x input_feature_dimensions). The output tensor has the same shape and contains the hardmax values of the corresponding input.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.
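
The row-wise behavior described above can be illustrated with a short CPU sketch. HardMaxReference is a hypothetical helper that treats the input as a row-major [ BatchCount, Width ] matrix; it is not the Harlinn.AI API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Writes 1 at the first occurrence of the largest value in each row, 0 elsewhere.
std::vector<float> HardMaxReference(const std::vector<float>& input,
                                    std::size_t batchCount, std::size_t width)
{
    assert(input.size() == batchCount * width);
    std::vector<float> output(input.size(), 0.0f);
    if (width == 0)
    {
        return output;
    }
    for (std::size_t row = 0; row < batchCount; ++row)
    {
        const std::size_t base = row * width;
        std::size_t best = 0;
        for (std::size_t i = 1; i < width; ++i)
        {
            // Strict '>' keeps the first occurrence when values are tied.
            if (input[base + i] > input[base + best])
            {
                best = i;
            }
        }
        output[base + best] = 1.0f;
    }
    return output;
}
```

For the two rows { 1, 3, 3 } and { 5, 2, 1 } the result is { 0, 1, 0 } and { 1, 0, 0 }.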

ActivationHardSigmoidOperatorDesc

Performs a hard sigmoid function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = max(0, min(Alpha \times x + Beta, 1))\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
alpha float The alpha coefficient. The default for this value is 0.2.
beta float The beta coefficient. The default for this value is 0.5.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.
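
A minimal CPU sketch of the hard sigmoid formula, using the default coefficients listed in the table above. HardSigmoidReference is a hypothetical helper, not part of Harlinn.AI.

```cpp
#include <algorithm>
#include <cassert>

// CPU reference for f(x) = max(0, min(alpha * x + beta, 1)).
float HardSigmoidReference(float x, float alpha = 0.2f, float beta = 0.5f)
{
    return std::max(0.0f, std::min(alpha * x + beta, 1.0f));
}
```

The function is linear around zero and saturates at 0 and 1, which makes it a cheap approximation of the sigmoid.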

ActivationIdentityOperatorDesc

Performs the identity activation, effectively copying every element of InputTensor to the corresponding element of OutputTensor.

\[f(x) = x\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationLeakyReLUOperatorDesc

Performs a leaky rectified linear unit (ReLU) activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \begin{cases} x, & \text{ if } x \geq 0\\ \alpha \times x, & \text{ if } x < 0 \end{cases}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
alpha float The alpha coefficient. The default for this value is 0.01.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationLinearOperatorDesc

Performs the linear activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = Alpha \times x + Beta\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
alpha float The alpha coefficient. The default for this value is 1.0.
beta float The beta coefficient. The default for this value is 0.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationLogSoftMaxOperatorDesc

Performs a natural log-of-softmax activation function on each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x_{i}) = \log\left(\frac{\exp(x_i) }{ \sum_j \exp(x_j)} \right)\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. This tensor must have an effective rank no greater than 2. The effective rank of a tensor is the DimensionCount of the tensor, excluding leftmost dimensions of size 1. For example a tensor size of [ 1, 1, BatchCount, Width ] is valid, and is equivalent to a tensor of sizes [ BatchCount, Width ].
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.
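
Evaluating the formula above literally can overflow for large inputs, so CPU reference code usually subtracts the row maximum first. The sketch below does that for a single row; LogSoftMaxReference is a hypothetical helper, and the max-subtraction trick is a standard technique, not something the text above attributes to DirectML.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// CPU reference for f(x_i) = log(exp(x_i) / sum_j exp(x_j)) over one row.
std::vector<float> LogSoftMaxReference(const std::vector<float>& row)
{
    assert(!row.empty());
    const float maxValue = *std::max_element(row.begin(), row.end());
    float sum = 0.0f;
    for (float x : row)
    {
        // Shifting by the maximum keeps every exponent <= 0.
        sum += std::exp(x - maxValue);
    }
    const float logSum = maxValue + std::log(sum);
    std::vector<float> output;
    output.reserve(row.size());
    for (float x : row)
    {
        output.push_back(x - logSum);
    }
    return output;
}
```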

ActivationParameterizedReLUOperatorDesc

Performs a parameterized rectified linear unit (ReLU) activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x, slope) = \begin{cases} x, & \text{ if } x \geq 0\\ slope \times x, & \text{ if } x < 0 \end{cases}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
slopeTensor const TensorDesc* A tensor containing the slope for each corresponding value of the input.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BaseOperatorDesc.

ActivationParametricSoftPlusOperatorDesc

Performs a parametric softplus activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = Alpha \times \log(1 + e^{Beta * x})\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
alpha float The alpha coefficient.
beta float The beta coefficient.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationReLUOperatorDesc

Performs a rectified linear unit (ReLU) activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = max(0,x)\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationScaledELUOperatorDesc

Performs a scaled exponential linear unit (ELU) activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \begin{cases} Gamma \times x, & \text{ if } x > 0\\ Gamma \times (Alpha \times e^x - Alpha), & \text{ if } x \leq 0 \end{cases}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
alpha float The alpha coefficient. The default value is 1.67326319217681884765625.
gamma float The gamma coefficient. The default value is 1.05070102214813232421875.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationScaledTanHOperatorDesc

Performs a scaled hyperbolic tangent activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = Alpha \times tanh(Beta \times x)\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
alpha float The alpha coefficient. The default value is 1.0.
beta float The beta coefficient. The default value is 0.5.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationSigmoidOperatorDesc

Performs the sigmoid function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \frac{1}{1 + e^{-x}}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationSoftMaxOperatorDesc

Performs a softmax activation function on InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x_{i}) = \frac{e^{x_i}}{\sum_j e^{x_j}}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. This tensor must have an effective rank no greater than 2. The effective rank of a tensor is the DimensionCount of the tensor, excluding leftmost dimensions of size 1. For example a tensor size of [ 1, 1, BatchCount, Width ] is valid, and is equivalent to a tensor of sizes [ BatchCount, Width ].
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.
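
A minimal CPU sketch of the softmax formula for a single row. SoftMaxReference is a hypothetical helper; subtracting the row maximum before exponentiating is a common numerical safeguard and an assumption of this sketch, not a statement about how DirectML computes the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// CPU reference for f(x_i) = exp(x_i) / sum_j exp(x_j) over one row.
std::vector<float> SoftMaxReference(const std::vector<float>& row)
{
    assert(!row.empty());
    const float maxValue = *std::max_element(row.begin(), row.end());
    std::vector<float> output;
    output.reserve(row.size());
    float sum = 0.0f;
    for (float x : row)
    {
        const float e = std::exp(x - maxValue);
        output.push_back(e);
        sum += e;
    }
    // Normalize so that the row sums to 1.
    for (float& value : output)
    {
        value /= sum;
    }
    return output;
}
```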

ActivationSoftPlusOperatorDesc

Performs a softplus activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \frac{\log(1 + e^{Steepness \times x})}{ Steepness}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
steepness float The Steepness coefficient. The default value is 1.0. This value cannot be less than 1.0.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationSoftSignOperatorDesc

Performs the softsign function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \frac{x}{1+|x|}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationTanHOperatorDesc

Performs a hyperbolic tangent activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \frac{1 - e^{-2 \times x}}{1 + e^{-2 \times x}}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ActivationThresholdedReLUOperatorDesc

Performs a thresholded rectified linear unit (ReLU) activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \begin{cases} x, & \text{ if } x > \alpha\\ 0, & \text{ if } x \leq \alpha \end{cases}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
alpha float The alpha coefficient. The default value is 1.0.

Derived from UnaryOperatorDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ConvolutionOperatorDesc

Performs a convolution of the FilterTensor with the InputTensor. This operator supports a number of standard convolution configurations. These standard configurations include forward and backward (transposed) convolution by setting the Direction and Mode fields, as well as depth-wise convolution by setting the GroupCount field.

A summary of the steps involved:

  1. Perform the convolution into the output tensor.
  2. Reshape the bias to the same dimension sizes as the output tensor.
  3. Add the reshaped bias tensor to the output tensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
filterTensor const TensorDesc* A tensor containing the filter data.
biasTensor const TensorDesc* An optional tensor containing the bias data. The bias is broadcast across the output tensor and added to the result at the end of the convolution.
mode ConvolutionMode The mode to use for the convolution operation. The default value is ConvolutionMode::CrossCorrelation.
direction ConvolutionDirection The direction of the convolution operation. The default value is ConvolutionDirection::Forward.
dimensionCount UInt32 The number of spatial dimensions for the convolution operation. Spatial dimensions are the lower dimensions of the convolution FilterTensor. For example, the width and height dimension are spatial dimensions of a 4D convolution filter tensor. This value also determines the size of the Strides, Dilations, StartPadding, EndPadding, and OutputPadding arrays. It should be set to 2 when InputTensor.DimensionCount is 4, and 3 when InputTensor.DimensionCount is 5.
strides const UInt32* An array containing the strides of the convolution operation. These strides are applied to the convolution filter.
dilations const UInt32* An array containing the dilations of the convolution operation. Dilations are strides applied to the elements of the filter kernel. This has the effect of simulating a larger filter kernel by padding the internal filter kernel elements with zeros.
startPadding const UInt32* An array containing the padding values to be applied to the beginning of each spatial dimension of the filter and input tensor of the convolution operation. The start padding values are interpreted according to the Direction field.
endPadding const UInt32* An array containing the padding values to be applied to the end of each spatial dimension of the filter and input tensor of the convolution operation. The end padding values are interpreted according to the Direction field.
outputPadding const UInt32* An array containing the output padding of the convolution operation. OutputPadding applies a zero padding to the result of the convolution. This padding is applied to the end of each spatial dimension of the output tensor.
groupCount UInt32 The number of groups into which the convolution operation is divided. This can be used to achieve depth-wise convolution by setting GroupCount equal to the input channel count, and Direction equal to ConvolutionDirection::Forward. This divides the convolution up into a separate convolution per input channel.
fusedActivation const OperatorDesc* An optional fused activation layer to apply after the convolution.

Derived from BaseOperatorDesc.
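
To show how strides, dilations, padding and bias interact, here is a CPU sketch of a forward cross-correlation reduced to one spatial dimension, one channel and a scalar bias. Conv1DParams and CrossCorrelate1D are hypothetical names, and the output size formula is the usual one for padded, strided, dilated convolution; it is an assumption of this sketch rather than something stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical parameter bundle for this sketch; one spatial dimension only.
struct Conv1DParams
{
    std::size_t stride;
    std::size_t dilation;
    std::size_t startPadding;
    std::size_t endPadding;
};

std::vector<float> CrossCorrelate1D(const std::vector<float>& input,
                                    const std::vector<float>& filter,
                                    float bias, const Conv1DParams& p)
{
    assert(!filter.empty() && p.stride > 0 && p.dilation > 0);
    // Dilation spreads the filter taps, so the filter covers this many elements.
    const std::size_t effectiveFilter = p.dilation * (filter.size() - 1) + 1;
    const std::size_t paddedSize = input.size() + p.startPadding + p.endPadding;
    if (paddedSize < effectiveFilter)
    {
        return {};
    }
    const std::size_t outputSize = (paddedSize - effectiveFilter) / p.stride + 1;
    std::vector<float> output(outputSize, 0.0f);
    for (std::size_t o = 0; o < outputSize; ++o)
    {
        float acc = 0.0f;
        for (std::size_t k = 0; k < filter.size(); ++k)
        {
            // Position inside the zero-padded input.
            const std::size_t padded = o * p.stride + k * p.dilation;
            if (padded < p.startPadding)
            {
                continue; // start padding reads as zero
            }
            const std::size_t i = padded - p.startPadding;
            if (i >= input.size())
            {
                continue; // end padding reads as zero
            }
            acc += input[i] * filter[k];
        }
        // Steps 2 and 3 of the summary: broadcast the bias and add it.
        output[o] = acc + bias;
    }
    return output;
}
```

For the input { 1, 2, 3, 4 }, the filter { 1, 1 } and a bias of 0.5, a stride and dilation of 1 without padding yields { 3.5, 5.5, 7.5 }.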

GEMMOperatorDesc

Performs a general matrix multiplication function of the form \(f(A,B,C) = FusedActivation(Alpha * TransA(A) \times TransB(B) + Beta * C)\), where \(\times\) denotes matrix multiplication, and * denotes multiplication by a scalar.

This operator requires 4D tensors with layout [ BatchCount, ChannelCount, Height, Width ], and it will perform BatchCount * ChannelCount number of independent matrix multiplications.

For example, if ATensor has Sizes of [ BatchCount, ChannelCount, M, K ], and BTensor has Sizes of [ BatchCount, ChannelCount, K, N ], and OutputTensor has Sizes of [ BatchCount, ChannelCount, M, N ], then this operator performs BatchCount * ChannelCount independent matrix multiplications of dimensions [M,K] x [K,N] = [M,N].

Constructor parameters:

Parameter Type Description
aTensor const TensorDesc* A tensor containing the A matrix. This tensor’s Sizes should be [ BatchCount, ChannelCount, M, K ] if TransA is MatrixTransform::None, or [ BatchCount, ChannelCount, K, M ] if TransA is MatrixTransform::Transpose.
bTensor const TensorDesc* A tensor containing the B matrix. This tensor’s Sizes should be [ BatchCount, ChannelCount, K, N ] if TransB is MatrixTransform::None, or [ BatchCount, ChannelCount, N, K ] if TransB is MatrixTransform::Transpose.
outputTensor const TensorDesc* The tensor to write the results to. This tensor’s Sizes are [ BatchCount, ChannelCount, M, N ].
cTensor const TensorDesc* An optional tensor containing the C matrix, or nullptr. Values default to 0 when not provided. If provided, this tensor’s Sizes should be [ BatchCount, ChannelCount, M, N ].
alpha float The value of the scalar multiplier for the product of inputs ATensor and BTensor.
beta float The value of the scalar multiplier for the optional input CTensor. If CTensor is not provided, then this value is ignored.
transA MatrixTransform The transform to be applied to ATensor; either a transpose, or no transform.
transB MatrixTransform The transform to be applied to BTensor; either a transpose, or no transform.
fusedActivation const OperatorDesc* An optional fused activation layer to apply after the GEMM.

Derived from BaseOperatorDesc.
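
The following CPU sketch computes one of the BatchCount * ChannelCount independent multiplications, so the tensor layouts above are easier to follow. GemmReference is a hypothetical helper using row-major storage; the fused activation is left out, and plain bool flags stand in for MatrixTransform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes alpha * TransA(A) x TransB(B) + beta * C for a single matrix pair.
// A is stored as [M, K] (or [K, M] when transposeA is true),
// B is stored as [K, N] (or [N, K] when transposeB is true).
std::vector<float> GemmReference(const std::vector<float>& a, const std::vector<float>& b,
                                 const std::vector<float>* c,
                                 std::size_t m, std::size_t k, std::size_t n,
                                 float alpha, float beta,
                                 bool transposeA, bool transposeB)
{
    assert(a.size() == m * k && b.size() == k * n);
    assert(c == nullptr || c->size() == m * n);
    std::vector<float> output(m * n, 0.0f);
    for (std::size_t row = 0; row < m; ++row)
    {
        for (std::size_t col = 0; col < n; ++col)
        {
            float acc = 0.0f;
            for (std::size_t i = 0; i < k; ++i)
            {
                const float av = transposeA ? a[i * m + row] : a[row * k + i];
                const float bv = transposeB ? b[col * k + i] : b[i * n + col];
                acc += av * bv;
            }
            float value = alpha * acc;
            if (c != nullptr)
            {
                // C is optional; beta is ignored when it is absent.
                value += beta * (*c)[row * n + col];
            }
            output[row * n + col] = value;
        }
    }
    return output;
}
```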

ReduceOperatorDesc

Outputs the reduction of elements (sum, product, minimum, and so on) within one or more dimensions of the input tensor.

Each output element is the result of applying a reduction function on a subset of the input tensor. A reduction function, such as sum, maps N input elements to a single output element. The input elements involved in each reduction are determined by the provided input axes: N is equal to the product of the sizes of the reduced axes. If all input axes are specified, then the operator performs a reduction on the entire input tensor and produces a single output element.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* The tensor to read from.
outputTensor const TensorDesc* The tensor to write the results to. Each output element is the result of a reduction on a subset of elements from the InputTensor. DimensionCount must match InputTensor.DimensionCount (the rank of the input tensor is preserved). Sizes must match InputTensor.Sizes, except for dimensions included in the reduced Axes, which must be size 1.
function ReduceFunction Specifies the reduction function to apply to the input.
axisCount UInt32 The number of axes to reduce. This field determines the size of the Axes array.
axes const UInt32* The axes along which to reduce. Values must be in the range [0, InputTensor.DimensionCount - 1].

Derived from BaseOperatorDesc.
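
As a small illustration of how the reduced axis determines which elements are combined, the CPU sketch below sums a row-major [ rows, cols ] tensor along one axis. ReduceSum2D is a hypothetical helper limited to two dimensions and the sum function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a row-major [rows, cols] tensor along 'axis' (0 or 1).
// The result has sizes [1, cols] for axis 0 and [rows, 1] for axis 1,
// so the rank of the input is preserved.
std::vector<float> ReduceSum2D(const std::vector<float>& input,
                               std::size_t rows, std::size_t cols, std::size_t axis)
{
    assert(axis < 2 && input.size() == rows * cols);
    std::vector<float> output(axis == 0 ? cols : rows, 0.0f);
    for (std::size_t r = 0; r < rows; ++r)
    {
        for (std::size_t c = 0; c < cols; ++c)
        {
            // The reduced axis collapses to index 0 in the output.
            output[axis == 0 ? c : r] += input[r * cols + c];
        }
    }
    return output;
}
```

For the tensor { { 1, 2, 3 }, { 4, 5, 6 } }, reducing axis 0 gives { 5, 7, 9 } and reducing axis 1 gives { 6, 15 }.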

AveragePoolingOperatorDesc

Averages values across the elements within the sliding window over the input tensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
dimensionCount UInt32 The number of spatial dimensions of the input tensor InputTensor, which also corresponds to the number of dimensions of the sliding window WindowSize. This value also determines the size of the Strides, StartPadding, and EndPadding arrays. It should be set to 2 when InputTensor is 4D, and 3 when it’s a 5D tensor.
strides const UInt32* The strides for the sliding window dimensions of sizes [ Height, Width ] when the DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
windowSize const UInt32* The dimensions of the sliding window in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
startPadding const UInt32* The number of padding elements to be applied to the beginning of each spatial dimension of the input tensor InputTensor. The values are in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
endPadding const UInt32* The number of padding elements to be applied to the end of each spatial dimension of the input tensor InputTensor. The values are in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
includePadding bool Indicates whether to include the padding elements around the spatial edges when calculating the average value across all elements within the sliding window. When the value is set to false, the padding elements are not counted as part of the divisor value of the averaging calculation.

Derived from UnaryOperatorDesc.
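
The effect of includePadding is easiest to see in one dimension. The CPU sketch below is a hypothetical helper, AveragePool1D, that pads with zeros and changes only the divisor depending on the flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<float> AveragePool1D(const std::vector<float>& input,
                                 std::size_t windowSize, std::size_t stride,
                                 std::size_t startPadding, std::size_t endPadding,
                                 bool includePadding)
{
    assert(windowSize > 0 && stride > 0);
    const std::size_t paddedSize = input.size() + startPadding + endPadding;
    if (paddedSize < windowSize)
    {
        return {};
    }
    const std::size_t outputSize = (paddedSize - windowSize) / stride + 1;
    std::vector<float> output(outputSize, 0.0f);
    for (std::size_t o = 0; o < outputSize; ++o)
    {
        float sum = 0.0f;
        std::size_t valid = 0; // window elements that are not padding
        for (std::size_t w = 0; w < windowSize; ++w)
        {
            const std::size_t padded = o * stride + w;
            if (padded < startPadding || padded - startPadding >= input.size())
            {
                continue; // padding contributes zero to the sum
            }
            sum += input[padded - startPadding];
            ++valid;
        }
        // Padding elements count toward the divisor only when requested.
        const std::size_t divisor = includePadding ? windowSize : valid;
        output[o] = divisor > 0 ? sum / static_cast<float>(divisor) : 0.0f;
    }
    return output;
}
```

For the input { 2, 4 } with a window of 2, a stride of 1 and one padding element on each side, the result is { 1, 3, 2 } when padding is included and { 2, 3, 4 } when it is not.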

LPPoolingOperatorDesc

Computes the Lp-normalized value across the elements within the sliding window over the input tensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
dimensionCount UInt32 The number of spatial dimensions of the input tensor InputTensor, which also corresponds to the number of dimensions of the sliding window WindowSize. This value also determines the size of the Strides, StartPadding, and EndPadding arrays. It should be set to 2 when InputTensor is 4D, and 3 when it’s a 5D tensor.
strides const UInt32* The strides for the sliding window dimensions of sizes [ Height, Width ] when the DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
windowSize const UInt32* The dimensions of the sliding window in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
startPadding const UInt32* The number of padding elements to be applied to the beginning of each spatial dimension of the input tensor InputTensor. The values are in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
endPadding const UInt32* The number of padding elements to be applied to the end of each spatial dimension of the input tensor InputTensor. The values are in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
p UInt32 The value of the P variable in the Lp-normalization function \(Y = (X1^P + X2^P + ... + Xn^P)^{1/P}\), where X1 to Xn represent each of the values within the sliding window. In common use cases, this value is either set to 1 or 2, representing either the L1 or L2 normalization respectively.

Derived from UnaryOperatorDesc.
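
The per-window computation can be sketched on the CPU as follows. LpWindowValue is a hypothetical helper that evaluates the formula from the table exactly as written for the values of one window; it assumes non-negative values, since the formula as written is not meaningful for negative sums with fractional exponents.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Evaluates Y = (X1^P + X2^P + ... + Xn^P)^(1/P) for the values of one window.
float LpWindowValue(const std::vector<float>& window, unsigned p)
{
    assert(p > 0);
    const float exponent = static_cast<float>(p);
    float sum = 0.0f;
    for (float x : window)
    {
        sum += std::pow(x, exponent);
    }
    return std::pow(sum, 1.0f / exponent);
}
```

With P set to 2 the window { 3, 4 } yields 5, and with P set to 1 the window { 1, 2, 3 } yields 6.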

MaxPoolingOperatorDesc

Computes the maximum value across the elements within the sliding window over the input tensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
dimensionCount UInt32 The number of spatial dimensions of the input tensor InputTensor, which also corresponds to the number of dimensions of the sliding window WindowSize. This value also determines the size of the Strides, StartPadding, and EndPadding arrays. It should be set to 2 when InputTensor is 4D, and 3 when it’s a 5D tensor.
strides const UInt32* The strides for the sliding window dimensions of sizes [ Height, Width ] when the DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
windowSize const UInt32* The dimensions of the sliding window in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
startPadding const UInt32* The number of padding elements to be applied to the beginning of each spatial dimension of the input tensor InputTensor. The values are in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
endPadding const UInt32* The number of padding elements to be applied to the end of each spatial dimension of the input tensor InputTensor. The values are in [ Height, Width ] when DimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.

Derived from UnaryOperatorDesc.

ROIPoolingOperatorDesc

Performs a MaxPool function across the input tensor according to regions of interest.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
roiTensor const TensorDesc* A tensor containing the regions of interest (ROI) data. The expected dimensions of ROITensor are [ 1, 1, NumROIs, 5 ] and the data for each ROI is [ BatchID, x1, y1, x2, y2 ]. x1, y1, x2, y2 are the inclusive coordinates of the corners of each ROI, where \(x2 \geq x1\) and \(y2 \geq y1\).
spatialScale float Multiplicative spatial scale factor used to translate the ROI coordinates from their input scale to the scale used when pooling.
pooledSize Size2D The ROI pool output size (height, width), which must match the last 2 dimensions of OutputTensor.

Derived from BaseOperatorDesc.

SliceOperatorDesc

Extracts a single subregion (a “slice”) of an input tensor.

The elements copied in the slice are determined using three values for each dimension.

  • The offset marks the first element to copy in a dimension.
  • The size marks the number of elements to copy in a dimension.
  • The stride indicates the element increment or step in a dimension.

The provided Offsets, Sizes, and Strides must only copy elements that are within the bounds of the input tensor (out-of-bounds reads are not permitted). The Sizes of the slice must exactly match the output tensor sizes. In general, the elements copied are calculated as follows.

\[OutputTensor[OutputCoordinates] = InputTensor[Offsets + Strides \times OutputCoordinates]\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the tensor to extract slices from.
outputTensor const TensorDesc* Describes the tensor to write the sliced data results to.
dimensionCount UInt32 The number of dimensions. This field determines the size of the Offsets, Sizes, and Strides arrays. This value must match the DimensionCount of the input and output tensors. This value must be between 1 and 8, inclusive.
offsets const UInt32* An array containing the slice’s start along each dimension of the input tensor, in elements.
sizes const UInt32* An array containing the slice’s size along each dimension, in elements. The values in this array must match the sizes specified in the output tensor.
strides const UInt32* An array containing the slice’s stride along each dimension of the input tensor, in elements. A stride larger than 1 indicates that elements of the input tensor may be skipped (for example, a stride of 2 will select every second element along the dimension).

Derived from UnaryOperatorDesc.
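
The indexing rule above, reduced to a single dimension, looks like this on the CPU. Slice1D is a hypothetical helper, not the Harlinn.AI API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Output[i] = Input[offset + stride * i] for i in [0, size).
std::vector<float> Slice1D(const std::vector<float>& input,
                           std::size_t offset, std::size_t size, std::size_t stride)
{
    assert(stride > 0);
    // The last element read must be inside the input; out-of-bounds reads are not permitted.
    assert(size == 0 || offset + stride * (size - 1) < input.size());
    std::vector<float> output(size);
    for (std::size_t i = 0; i < size; ++i)
    {
        output[i] = input[offset + stride * i];
    }
    return output;
}
```

With an offset of 1, a size of 3 and a stride of 2, the input { 0, 1, 2, 3, 4, 5, 6, 7 } yields { 1, 3, 5 }.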

CastOperatorDesc

Casts each element in the input to the data type of the output tensor, and stores the result in the corresponding element of the output.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

SplitOperatorDesc

Splits an input tensor along an axis into multiple output tensors.

All input and output tensors must have the same sizes, except for the split axis. The size of the input tensor along the split axis determines the possible splits. For example, if the input tensor’s split axis has size 3, then there are these potential splits:

  • 1+1+1 (3 outputs)
  • 1+2 (2 outputs)
  • 2+1 (2 outputs)
  • 3 (1 output, which is simply a copy of the input tensor).

The output tensors’ split axis sizes must sum up to exactly the input tensor’s split axis size.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensorCount UInt32 This parameter determines the size of the OutputTensors array. This value must be greater than 0.
outputTensors const TensorDesc* An array containing the descriptions of the tensors split off from the input tensor. The output tensors must have the same sizes as the input tensor except for the split axis.
axis UInt32 The index of the dimension of the input tensor to split. All input and output tensors must have identical sizes in all dimensions except for this axis. This value must be in the range [0, InputTensor.DimensionCount - 1].

Derived from BaseOperatorDesc.

JoinOperatorDesc

Concatenates an array of input tensors along a specified axis.

Input tensors may only be joined if their sizes are identical in all dimensions except for the join axis, which may contain any non-zero size. The output sizes are equal to the input sizes except for the join axis, which is the sum of all inputs’ join axis size.

Constructor parameters:

Parameter Type Description
inputTensorCount UInt32 This parameter determines the size of the InputTensors array. This value must be greater than 0.
inputTensors const TensorDesc* An array containing the descriptions of the tensors to join into a single output tensor. All input tensors in this array must have the same sizes except for the join axis, which may have any non-zero value.
outputTensor const TensorDesc* The tensor to write the joined input tensors into. The output tensor must have the same sizes as all input tensors except for the join axis, where its size must equal the sum of all inputs’ join axis sizes.
axis UInt32 The index of the dimension of the input tensors to join. All input and output tensors must have identical sizes in all dimensions except for this axis. This value must be in the range [0, OutputTensor.DimensionCount - 1].

Derived from BaseOperatorDesc.
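As a minimal sketch of the join size rule, the following plain C++ helper computes the expected output sizes. It is independent of the Harlinn.AI and DirectML APIs; Sizes and joinOutputSizes are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Sizes = std::vector<std::uint32_t>;

// Hypothetical helper: computes the output sizes for joining `inputs` along
// `axis`. Returns an empty vector when the inputs disagree in any dimension
// other than the join axis.
Sizes joinOutputSizes(const std::vector<Sizes>& inputs, std::size_t axis)
{
    assert(!inputs.empty());
    assert(axis < inputs.front().size());
    Sizes output = inputs.front();
    output[axis] = 0;
    for (const Sizes& input : inputs)
    {
        if (input.size() != output.size())
        {
            return {};
        }
        for (std::size_t i = 0; i < input.size(); ++i)
        {
            if (i != axis && input[i] != output[i])
            {
                return {};
            }
        }
        output[axis] += input[axis]; // join axis is the sum of all inputs
    }
    return output;
}
```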

PaddingOperatorDesc

Inflates the input tensor with constant or mirrored values on the edges, and writes the result to the output.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
dimensionCount UInt32 The size of the arrays pointed to by StartPadding and EndPadding. This value must be the same value as the dimension count of InputTensor and OutputTensor.
startPadding const UInt32* The sizes of the padding regions to add at the beginning of each dimension. For each dimension i, StartPadding[i] = OutputTensor.Sizes[i] - InputTensor.Sizes[i] - EndPadding[i].
endPadding const UInt32* The sizes of the padding regions to add at the end of each dimension. For each dimension i, EndPadding[i] = OutputTensor.Sizes[i] - InputTensor.Sizes[i] - StartPadding[i].
paddingValue float The padding value to use when paddingMode is DML::PaddingMode::Constant. This value is ignored for other padding modes. Note that if the DataType of the tensors is not DML::TensorDataType::Float16 or DML::TensorDataType::Float32, then the value might be truncated (for example, 10.6 will become 10).
paddingMode DML::PaddingMode The padding mode to use when filling the padding regions.

Derived from UnaryOperatorDesc.
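The relationship between input sizes, padding and output sizes can be illustrated with a small reference sketch. It covers only the constant padding mode on a 1-D sequence, and the names Sizes, paddedSizes and padConstant1D are hypothetical; none of this calls into DirectML.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Sizes = std::vector<std::uint32_t>;

// Output size per dimension: input + start padding + end padding.
Sizes paddedSizes(const Sizes& inputSizes, const Sizes& startPadding,
                  const Sizes& endPadding)
{
    assert(startPadding.size() == inputSizes.size());
    assert(endPadding.size() == inputSizes.size());
    Sizes output(inputSizes.size());
    for (std::size_t i = 0; i < inputSizes.size(); ++i)
    {
        output[i] = inputSizes[i] + startPadding[i] + endPadding[i];
    }
    return output;
}

// Constant-mode padding of a 1-D sequence (mirrored modes are not covered).
std::vector<float> padConstant1D(const std::vector<float>& input,
                                 std::uint32_t startPadding,
                                 std::uint32_t endPadding, float value)
{
    std::vector<float> output(startPadding + input.size() + endPadding, value);
    std::copy(input.begin(), input.end(),
              output.begin() + static_cast<std::ptrdiff_t>(startPadding));
    return output;
}
```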

ValueScale2DOperatorDesc

Performs an element-wise scale-and-bias function:

\[f(x) = x \times Scale + Bias\]

This operator is similar to using an ElementWiseIdentityOperatorDesc with a scale and bias, except that ValueScale2DOperatorDesc applies a different bias for each channel, rather than a single bias for the entire tensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. This tensor’s dimensions should be [ BatchCount, ChannelCount, Height, Width ].
outputTensor const TensorDesc* Describes the output tensor to write the results to. This tensor’s dimensions should match the inputTensor’s dimensions.
scale float Scale value to be applied to all input values.
channelCount UInt32 This parameter determines the size of the Bias array. This parameter must be set to either 1 or 3, and must also match the size of the Channel dimension of the input tensor.
bias const float* An array of float values containing the bias term for each channel of the input tensor.

Derived from UnaryOperatorDesc.
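A CPU reference for the per-channel scale-and-bias computation might look like the sketch below. It assumes a densely packed NCHW buffer, and valueScale2D is a hypothetical function name, not a Harlinn.AI or DirectML call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies f(x) = x * scale + bias[channel] to a packed NCHW buffer.
std::vector<float> valueScale2D(const std::vector<float>& input,
                                std::size_t batchCount, std::size_t channelCount,
                                std::size_t height, std::size_t width,
                                float scale, const std::vector<float>& bias)
{
    assert(bias.size() == channelCount);
    assert(input.size() == batchCount * channelCount * height * width);
    const std::size_t plane = height * width; // elements per channel plane
    std::vector<float> output(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        const std::size_t channel = (i / plane) % channelCount;
        output[i] = input[i] * scale + bias[channel];
    }
    return output;
}
```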

UpSample2DOperatorDesc

Upsamples the input image, writing the result into the output tensor. The order of the dimensions should be NCHW (BatchSize, ChannelCount, Height, Width) or NCDHW (BatchSize, ChannelCount, Depth, Height, Width), but strides can be used if the data is stored in a different format. Unlike ResampleOperatorDesc, only the last 2 dimensions (height and width) can be upsampled.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. The expected dimensions of the InputTensor are [ InputBatchCount, InputChannelCount, InputHeight, InputWidth ] for 4D, and [ InputBatchCount, InputChannelCount, InputDepth, InputHeight, InputWidth ] for 5D.
outputTensor const TensorDesc* Describes the output tensor to write the results to. The expected dimensions of the OutputTensor are [ InputBatchCount, InputChannelCount, InputHeight * HeightScale, InputWidth * WidthScale ] for 4D, and [ InputBatchCount, InputChannelCount, InputDepth, InputHeight * HeightScale, InputWidth * WidthScale ] for 5D.
scaleSize Size2D The width and height scales of type UInt32 to apply when upsampling the input. 0 < ScaleSize.Height <= UINT_MAX / InputHeight and 0 < ScaleSize.Width <= UINT_MAX / InputWidth.
interpolationMode DML::InterpolationMode Determines the kind of interpolation used to choose output pixels.

Derived from UnaryOperatorDesc.

GatherOperatorDesc

Gathers elements from the input tensor along Axis, using IndicesTensor to remap indices.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to. The DimensionCount and DataType of this tensor must match those of InputTensor. The expected OutputTensor.Sizes are the concatenation of the InputTensor.Sizes leading and trailing segments split at the current Axis with the IndicesTensor.Sizes inserted between.
indicesTensor const TensorDesc* A tensor containing the indices. The DimensionCount of this tensor must match InputTensor.DimensionCount. This operator supports negative index values when using a signed integral type with this tensor. Negative indices are interpreted as being relative to the end of the axis dimension. For example, an index of -1 refers to the last element along that dimension. Invalid indices will yield incorrect outputs, but no failure will occur, and all reads will be clamped safely within the input tensor’s memory.
axis UInt32 The axis dimension of InputTensor to gather on, ranging [0, InputTensor.DimensionCount).
indexDimensions UInt32 The number of actual index dimensions within the IndicesTensor after ignoring any irrelevant leading ones, ranging [0, IndicesTensor.DimensionCount). For example, given IndicesTensor.Sizes = [ 1, 1, 4, 6 ] and IndexDimensions = 3, the actual meaningful indices are [ 1, 4, 6 ].

Derived from BaseOperatorDesc.
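The handling of negative indices can be sketched for the 1-D case. This is a simplified illustration under the assumption that all indices are valid; it does not reproduce the clamping behaviour for invalid indices, and gather1D is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Gathers elements of a 1-D input; negative indices count from the end,
// so -1 refers to the last element.
std::vector<float> gather1D(const std::vector<float>& input,
                            const std::vector<std::int32_t>& indices)
{
    const auto count = static_cast<std::int64_t>(input.size());
    std::vector<float> output;
    output.reserve(indices.size());
    for (std::int32_t index : indices)
    {
        const std::int64_t resolved = index < 0 ? index + count : index;
        assert(resolved >= 0 && resolved < count); // sketch requires valid indices
        output.push_back(input[static_cast<std::size_t>(resolved)]);
    }
    return output;
}
```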

SpaceToDepthOperatorDesc

Rearranges blocks of spatial data into depth. The operator outputs a copy of the input tensor where values from the height and width dimensions are moved to the depth dimension.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. The input tensor’s dimensions are [ Batch, Channels, Height, Width ].
outputTensor const TensorDesc* Describes the output tensor to write the results to. The output tensor’s dimensions are [ Batch, Channels * BlockSize * BlockSize, Height / BlockSize, Width / BlockSize ].
blockSize UInt32 The width and height of the Blocks that are moved.

Derived from UnaryOperatorDesc.
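Because space-to-depth moves spatial blocks into the channel dimension, the channel count grows by BlockSize * BlockSize while height and width shrink by BlockSize. A minimal sketch of that size computation, assuming the usual definition of the operation and using the hypothetical names Sizes4 and spaceToDepthOutputSizes:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Sizes4 = std::array<std::uint32_t, 4>; // [ Batch, Channels, Height, Width ]

// Computes the output sizes of a space-to-depth rearrangement.
Sizes4 spaceToDepthOutputSizes(const Sizes4& input, std::uint32_t blockSize)
{
    assert(blockSize > 0);
    assert(input[2] % blockSize == 0 && input[3] % blockSize == 0);
    return {input[0], input[1] * blockSize * blockSize,
            input[2] / blockSize, input[3] / blockSize};
}
```

The total element count is preserved: an input of [ 1, 3, 4, 6 ] with a block size of 2 maps to [ 1, 12, 2, 3 ].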

DepthToSpaceOperatorDesc

Rearranges (permutes) data from depth into blocks of spatial data. The operator outputs a copy of the input tensor where values from the depth dimension are moved in spatial blocks to the height and width dimensions.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. The input tensor’s dimensions are [ BatchCount, InputChannelCount, InputHeight, InputWidth ].
outputTensor const TensorDesc* Describes the output tensor to write the results to. The output tensor’s dimensions are [ BatchCount, OutputChannelCount, OutputHeight, OutputWidth ], where OutputChannelCount is computed as \(\frac{InputChannelCount}{BlockSize \times BlockSize}\), OutputHeight is computed as \(InputHeight \times BlockSize\), and OutputWidth is computed as \(InputWidth \times BlockSize\).
blockSize UInt32 The width and height of the Blocks that are moved.

Derived from UnaryOperatorDesc.
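The output size formulas for depth-to-space can be checked with a short sketch. Sizes4 and depthToSpaceOutputSizes are hypothetical names for this illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Sizes4 = std::array<std::uint32_t, 4>; // [ Batch, Channels, Height, Width ]

// Computes the output sizes of a depth-to-space rearrangement.
Sizes4 depthToSpaceOutputSizes(const Sizes4& input, std::uint32_t blockSize)
{
    assert(blockSize > 0);
    assert(input[1] % (blockSize * blockSize) == 0);
    return {input[0], input[1] / (blockSize * blockSize),
            input[2] * blockSize, input[3] * blockSize};
}
```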

TileOperatorDesc

Constructs an output tensor by tiling the input tensor. The elements in each dimension of the input tensor are repeated by a multiple in the Repeats array.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to, which will hold the tiled output. For each dimension i in [0, InputTensor.DimensionCount-1], the output size is calculated as OutputTensor.Sizes[i] = InputTensor.Sizes[i] * Repeats[i]. This tensor must have the same DimensionCount as the input tensor.
repeatsCount UInt32 This parameter determines the size of the Repeats array. This value must be the same as the InputTensor.DimensionCount.
repeats const UInt32* Each value in this array corresponds to one of the input tensor’s dimensions (in order). Each value is the number of tiled copies to make of that dimension. Values must be larger than 0.

Derived from UnaryOperatorDesc.
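Tiling amounts to reading the input with indices taken modulo the input sizes. The sketch below shows this for a row-major 2-D buffer; tile2D is a hypothetical reference function and makes no DirectML calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tiles a row-major rows x cols matrix repeatRows x repeatCols times.
std::vector<float> tile2D(const std::vector<float>& input, std::size_t rows,
                          std::size_t cols, std::size_t repeatRows,
                          std::size_t repeatCols)
{
    assert(input.size() == rows * cols);
    assert(repeatRows > 0 && repeatCols > 0);
    const std::size_t outRows = rows * repeatRows;
    const std::size_t outCols = cols * repeatCols;
    std::vector<float> output(outRows * outCols);
    for (std::size_t r = 0; r < outRows; ++r)
    {
        for (std::size_t c = 0; c < outCols; ++c)
        {
            // Wrap the output coordinate back into the input.
            output[r * outCols + c] = input[(r % rows) * cols + (c % cols)];
        }
    }
    return output;
}
```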

TopKOperatorDesc

Selects the largest K elements from each sequence along an axis of the InputTensor, and returns the values and indices of those elements in the OutputValueTensor and OutputIndexTensor, respectively. A sequence refers to one of the sets of elements that exist along the Axis dimension of the InputTensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputValueTensor const TensorDesc* The output tensor to write the values of the top K elements to. This tensor must have sizes equal to the InputTensor, except for the dimension specified by the axis parameter, which must have a size equal to k. The k values selected from each input sequence are guaranteed to be sorted descending (largest to smallest).
outputIndexTensor const TensorDesc* The output tensor to write the indices of the top K elements to. This tensor must have sizes equal to the InputTensor, except for the dimension specified by the axis parameter, which must have a size equal to k. The indices returned in this tensor are measured relative to the beginning of their sequence (as opposed to the beginning of the tensor). For example, an index of 0 always refers to the first element for all sequences in an axis. In cases where two or more elements in the top-K have the same value (that is, when there is a tie), the indices of both elements are included, and are guaranteed to be ordered by ascending element index.
axis UInt32 The index of the dimension to select elements across. This value must be less than the DimensionCount of the InputTensor.
k UInt32 The number of elements to select. k must be greater than 0, but less than the number of elements in the inputTensor along the dimension specified by axis.

Derived from BaseOperatorDesc.
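The ordering guarantees (values sorted descending, ties ordered by ascending index) can be reproduced on a single sequence with a stable sort. This is a sketch of the semantics only; TopKResult and topK are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

struct TopKResult
{
    std::vector<float> values;
    std::vector<std::uint32_t> indices;
};

// Selects the k largest elements of one sequence. Indices are relative to
// the start of the sequence.
TopKResult topK(const std::vector<float>& sequence, std::size_t k)
{
    assert(k > 0 && k < sequence.size());
    std::vector<std::uint32_t> order(sequence.size());
    std::iota(order.begin(), order.end(), 0u);
    // stable_sort keeps equal values in ascending index order.
    std::stable_sort(order.begin(), order.end(),
                     [&sequence](std::uint32_t a, std::uint32_t b)
                     { return sequence[a] > sequence[b]; });
    TopKResult result;
    for (std::size_t i = 0; i < k; ++i)
    {
        result.indices.push_back(order[i]);
        result.values.push_back(sequence[order[i]]);
    }
    return result;
}
```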

BatchNormalizationOperatorDesc

Performs a batch normalization on the input. This operator performs the following computation:

\[Output = Scale \times \frac{Input - Mean}{ \sqrt{Variance + Epsilon} } + Bias\]

or

\[Output = FusedActivation(Scale \times \frac{Input - Mean}{ \sqrt{Variance + Epsilon} } + Bias)\]

when FusedActivation is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
meanTensor const TensorDesc* A tensor containing the Mean data.
varianceTensor const TensorDesc* A tensor containing the Variance data.
scaleTensor const TensorDesc* A tensor containing the Scale data.
biasTensor const TensorDesc* A tensor containing the Bias data.
epsilon float The epsilon value to use to avoid division by zero.
fusedActivation const OperatorDesc* An optional fused activation layer to apply after the normalization.

Derived from BaseOperatorDesc.
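For a single element, the batch normalization formula above reduces to a few arithmetic operations. The following scalar sketch omits the optional fused activation; batchNormalize is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

// Output = Scale * (Input - Mean) / sqrt(Variance + Epsilon) + Bias
float batchNormalize(float input, float mean, float variance, float scale,
                     float bias, float epsilon)
{
    assert(variance + epsilon > 0.0f);
    return scale * (input - mean) / std::sqrt(variance + epsilon) + bias;
}
```

With Input = 5, Mean = 1, Variance = 4, Scale = 2, Bias = 3 and Epsilon = 0, the result is 2 * 4 / 2 + 3 = 7.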

MeanVarianceNormalizationOperatorDesc

Performs a mean variance normalization function on the input tensor. This operator will calculate the mean and variance of the input tensor to perform normalization. This operator performs the following computation:

\[Output = Scale \times \frac{Input - Mean}{ \sqrt{Variance + Epsilon} } + Bias\]

or

\[Output = FusedActivation(Scale \times \frac{Input - Mean}{ \sqrt{Variance + Epsilon} } + Bias)\]

when FusedActivation is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. This tensor’s dimensions should be [ BatchCount, ChannelCount, Height, Width ].
outputTensor const TensorDesc* Describes the output tensor to write the results to. This tensor’s dimensions are [ BatchCount, ChannelCount, Height, Width ].
scaleTensor const TensorDesc* An optional tensor containing the Scale data. This tensor’s dimensions should be [ BatchCount, ChannelCount, Height, Width ]. Any dimension can be replaced with 1 to broadcast in that dimension.
biasTensor const TensorDesc* An optional tensor containing the bias data. This tensor’s dimensions should be [ BatchCount, ChannelCount, Height, Width ]. Any dimension can be replaced with 1 to broadcast in that dimension.
crossChannel bool When true, the MeanVariance layer includes channels in the Mean and Variance calculations, meaning they are normalized across axes [ChannelCount, Height, Width]. When false, Mean and Variance calculations are normalized across axes [Height, Width] with each channel being independent.
normalizeVariance bool true if the Normalization layer includes Variance in the normalization calculation. Otherwise, false. If false, then the normalization equation is \(Output = FusedActivation(Scale \times (Input - Mean) + Bias)\).
epsilon float The epsilon value to use to avoid division by zero. A value of 0.00001 is recommended as default.
fusedActivation const OperatorDesc* An optional fused activation layer to apply after the normalization.

Derived from BaseOperatorDesc.
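Unlike batch normalization, this operator derives the mean and variance from the input itself. A minimal sketch over one flat group of elements follows. It assumes the population variance (division by the element count), a Scale of 1, a Bias of 0, and no fused activation; meanVarianceNormalize is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normalizes one group of elements using its own mean and variance.
std::vector<float> meanVarianceNormalize(const std::vector<float>& input,
                                         float epsilon, bool normalizeVariance)
{
    assert(!input.empty());
    const float count = static_cast<float>(input.size());
    float mean = 0.0f;
    for (float x : input) mean += x;
    mean /= count;
    float variance = 0.0f;
    for (float x : input) variance += (x - mean) * (x - mean);
    variance /= count; // population variance (assumption of this sketch)
    const float divisor = normalizeVariance ? std::sqrt(variance + epsilon) : 1.0f;
    std::vector<float> output;
    output.reserve(input.size());
    for (float x : input) output.push_back((x - mean) / divisor);
    return output;
}
```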

LocalResponseNormalizationOperatorDesc

Performs a local response normalization (LRN) function on the input. This operator performs the following computation:

\[b_{c} = a_{c}\left(k + \frac{\alpha}{n}\sum_{c'=\max(0, c-n/2)}^{\min(N-1,c+n/2)}a_{c'}^2\right)^{-\beta}\]

Where \(n\) is the local size (the number of neighbouring channels used for normalization), \(N\) is the total number of channels, \(\alpha\) is a multiplicative factor, \(\beta\) is an exponent and \(k\) is an additive factor (the bias parameter).

Local Response Normalization is a normalization layer that implements the idea of lateral inhibition. Lateral inhibition is a concept in neurobiology that refers to the phenomenon of an excited neuron inhibiting its neighbors: this leads to a peak in the form of a local maximum, creating contrast in that area and increasing sensory perception. In practice, we can either normalize within the same channel or normalize across channels when we apply LRN to convolutional neural networks.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from. This tensor’s Sizes should be [ BatchCount, ChannelCount, Height, Width ].
outputTensor const TensorDesc* Describes the output tensor to write the results to. This tensor’s Sizes should match the InputTensor.
crossChannel bool true if the LRN layer sums across channels; otherwise, false.
localSize UInt32 The number of elements to sum over per dimension: Width, Height, and optionally Channel (if CrossChannel is set). This value must be at least 1.
alpha float The value of the scaling parameter. A value of 0.0001 is recommended as default.
beta float The value of the exponent. A value of 0.75 is recommended as default.
bias float The value of bias. A value of 1 is recommended as default.

Derived from UnaryOperatorDesc.
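The cross-channel form of the LRN formula can be sketched for the values at one spatial position, with one value per channel. In this illustration localSize plays the role of \(n\) and bias the role of \(k\); localResponseNormalize is a hypothetical name and the code does not call DirectML.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Cross-channel LRN over the channel values `a` at one spatial position.
std::vector<float> localResponseNormalize(const std::vector<float>& a,
                                          std::uint32_t localSize, float alpha,
                                          float beta, float bias)
{
    assert(!a.empty());
    assert(localSize >= 1);
    const std::size_t channelCount = a.size();
    const std::size_t half = localSize / 2;
    std::vector<float> b(channelCount);
    for (std::size_t c = 0; c < channelCount; ++c)
    {
        // Window [max(0, c - n/2), min(N - 1, c + n/2)].
        const std::size_t first = c >= half ? c - half : 0;
        const std::size_t last = std::min(channelCount - 1, c + half);
        float sumOfSquares = 0.0f;
        for (std::size_t i = first; i <= last; ++i)
        {
            sumOfSquares += a[i] * a[i];
        }
        const float base =
            bias + alpha / static_cast<float>(localSize) * sumOfSquares;
        b[c] = a[c] * std::pow(base, -beta);
    }
    return b;
}
```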

LPNormalizationOperatorDesc

Performs an Lp-normalization function along the specified axis of the input tensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to. This tensor’s Sizes should match the InputTensor.
axis UInt32 The axis on which to apply normalization.
epsilon float The epsilon value to use to avoid division by zero. A value of 0.00001 is recommended as default.
p UInt32 The order of the normalization (either 1 or 2).

Derived from UnaryOperatorDesc.
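A sketch of Lp-normalization along one axis, reduced to a single 1-D sequence, is given below. The text above does not state exactly where epsilon enters the computation, so adding it to the norm in the divisor is an assumption of this example; lpNormalize is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Divides each element by the L1 or L2 norm of the sequence.
std::vector<float> lpNormalize(const std::vector<float>& input, std::uint32_t p,
                               float epsilon)
{
    assert(p == 1 || p == 2);
    float norm = 0.0f;
    for (float x : input)
    {
        norm += (p == 1) ? std::fabs(x) : x * x;
    }
    if (p == 2)
    {
        norm = std::sqrt(norm);
    }
    const float divisor = norm + epsilon; // assumption: epsilon added to the norm
    assert(divisor > 0.0f);
    std::vector<float> output;
    output.reserve(input.size());
    for (float x : input) output.push_back(x / divisor);
    return output;
}
```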

RNNOperatorDesc

Performs a one-layer simple recurrent neural network (RNN) function on the input. This function is often referred to as the Input Gate. This operator performs this function multiple times in a loop, dictated by the sequence length dimension and the SequenceLengthsTensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* A tensor containing the input data, X. Packed (and potentially padded) into one 4-D tensor with the sizes of [ 1, seq_length, batch_size, input_size ]. seq_length is the dimension that is mapped to the index, t.
weightTensor const TensorDesc* A tensor containing the weight data, W. Concatenation of W_i and W_Bi (if bidirectional). The tensor has sizes [ 1, num_directions, hidden_size, input_size ].
recurrenceTensor const TensorDesc* An optional tensor containing the recurrence weight data, R. Concatenation of R_i and R_Bi (if bidirectional). This tensor has sizes [ 1, num_directions, hidden_size, hidden_size ].
biasTensor const TensorDesc* An optional tensor containing the bias data for the input gate, B. Concatenation of { W_bi, R_bi }, and { W_Bbi, R_Bbi } (if bidirectional). This tensor has sizes [ 1, 1, num_directions, 2 * hidden_size ]. If not specified, then defaults to 0.
hiddenInitTensor const TensorDesc* An optional tensor containing the hidden node initializer tensor, H_[t-1] for the first loop index t. If not specified, then defaults to 0. This tensor has sizes [ 1, num_directions, batch_size, hidden_size ].
sequenceLengthsTensor const TensorDesc* An optional tensor containing an independent seq_length for each element in the batch. If not specified, then all sequences in the batch have length seq_length. This tensor has sizes [ 1, 1, 1, batch_size ].
outputSequenceTensor const TensorDesc* An optional tensor with which to write the concatenation of all the intermediate layer output values of the hidden nodes, H_t. This tensor has sizes [ seq_length, num_directions, batch_size, hidden_size ]. seq_length is mapped to the loop index t.
outputSingleTensor const TensorDesc* An optional tensor with which to write the final output value of the hidden nodes, H_t. This tensor has sizes [ 1, num_directions, batch_size, hidden_size ].
activationDescCount UInt32 This parameter determines the size of the ActivationDescs array.
activationDescs const OperatorDesc* An array of DML::OperatorDesc containing the descriptions of the activation operators. The number of activation functions is equal to the number of directions. For forwards and backwards directions there is expected to be 1 activation function. For Bidirectional there are expected to be 2.
direction RecurrentNetworkDirection The direction of the operator: forward, backward, or bidirectional.

Derived from BaseOperatorDesc.

LSTMOperatorDesc

Performs a one-layer long short-term memory (LSTM) function on the input. This operator uses multiple gates to perform this layer. These gates are performed multiple times in a loop, dictated by the sequence length dimension and the SequenceLengthsTensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* A tensor containing the input data, X. Packed (and potentially padded) into one 4-D tensor with the sizes of [ 1, seq_length, batch_size, input_size ]. seq_length is the dimension that is mapped to the index, t.
weightTensor const TensorDesc* A tensor containing the weight data, W. Concatenation of W_[iofc] and W_B[iofc] (if bidirectional). The tensor has sizes [ 1, num_directions, 4 * hidden_size, input_size ].
recurrenceTensor const TensorDesc* A tensor containing the recurrence data, R. Concatenation of R_[iofc] and R_B[iofc] (if bidirectional). This tensor has sizes [ 1, num_directions, 4 * hidden_size, hidden_size ].
biasTensor const TensorDesc* An optional tensor containing the bias data, B. Concatenation of { W_b[iofc], R_b[iofc] }, and { W_Bb[iofc], R_Bb[iofc] } (if bidirectional). This tensor has sizes [ 1, 1, num_directions, 8 * hidden_size ]. If not specified, then defaults to 0 bias.
hiddenInitTensor const TensorDesc* An optional tensor containing the hidden node initializer data, H_(t-1). Contents of this tensor are only used on the first loop index t. If not specified, then defaults to 0. This tensor has sizes [ 1, num_directions, batch_size, hidden_size ].
cellMemInitTensor const TensorDesc* An optional tensor containing the cell initializer data, C_(t-1). Contents of this tensor are only used on the first loop index t. If not specified, then defaults to 0. This tensor has sizes [ 1, num_directions, batch_size, hidden_size ].
sequenceLengthsTensor const TensorDesc* An optional tensor containing an independent seq_length for each element in the batch. If not specified, then all sequences in the batch have length seq_length. This tensor has sizes [ 1, 1, 1, batch_size ].
peepholeTensor const TensorDesc* An optional tensor containing the weight data for peepholes, P. If not specified, then defaults to 0. Concatenation of P_[iof] and P_B[iof] (if bidirectional). This tensor has sizes [ 1, 1, num_directions, 3 * hidden_size ].
outputSequenceTensor const TensorDesc* An optional tensor with which to write the concatenation of all the intermediate output values of the hidden nodes, H_t. This tensor has sizes [ seq_length, num_directions, batch_size, hidden_size ]. seq_length is mapped to the loop index t.
outputSingleTensor const TensorDesc* An optional tensor with which to write the last output value of the hidden nodes, H_t. This tensor has sizes [ 1, num_directions, batch_size, hidden_size ].
outputCellSingleTensor const TensorDesc* An optional tensor with which to write the last output value of the cell, C_t. This tensor has sizes [ 1, num_directions, batch_size, hidden_size ].
activationDescCount UInt32 The size of the activationDescs array.
activationDescs const OperatorDesc* An array of DML::OperatorDesc containing the descriptions of the activation operators f(), g(), and h(). f(), g(), and h() are defined independently of direction, meaning that if RecurrentNetworkDirection::Forward or RecurrentNetworkDirection::Backward is supplied in Direction, then three activations must be provided. If RecurrentNetworkDirection::Bidirectional is supplied, then six activations must be provided. For bidirectional, the activations must be provided as f(), g(), and h() for the forward direction, followed by f(), g(), and h() for the backward direction.
direction RecurrentNetworkDirection The direction of the operator: forward, backward, or bidirectional.
clipThreshold float The cell clip threshold. Clipping bounds the elements of a tensor in the range of [-ClipThreshold, +ClipThreshold], and is applied to the input of activations.
useClipThreshold bool true if clipThreshold should be used. Otherwise, false.
coupleInputForget bool true if the input and forget gates should be coupled. Otherwise, false.

Derived from BaseOperatorDesc.
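The tensor sizes listed in the table follow mechanically from seq_length, batch_size, input_size, hidden_size and the direction. The sketch below collects a subset of them in one place, which can help when preparing buffers; LstmShapes and lstmShapes are hypothetical names and are not part of Harlinn.AI.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Sizes = std::vector<std::uint32_t>;

struct LstmShapes
{
    Sizes input;
    Sizes weight;
    Sizes recurrence;
    Sizes bias;
    Sizes peephole;
    Sizes hiddenInit;
    Sizes outputSequence;
    std::uint32_t activationCount;
};

// Derives the LSTM tensor sizes described in the parameter table.
LstmShapes lstmShapes(std::uint32_t seqLength, std::uint32_t batchSize,
                      std::uint32_t inputSize, std::uint32_t hiddenSize,
                      bool bidirectional)
{
    assert(seqLength > 0 && batchSize > 0 && inputSize > 0 && hiddenSize > 0);
    const std::uint32_t directions = bidirectional ? 2u : 1u;
    LstmShapes shapes;
    shapes.input = {1, seqLength, batchSize, inputSize};
    shapes.weight = {1, directions, 4 * hiddenSize, inputSize};
    shapes.recurrence = {1, directions, 4 * hiddenSize, hiddenSize};
    shapes.bias = {1, 1, directions, 8 * hiddenSize};
    shapes.peephole = {1, 1, directions, 3 * hiddenSize};
    shapes.hiddenInit = {1, directions, batchSize, hiddenSize};
    shapes.outputSequence = {seqLength, directions, batchSize, hiddenSize};
    shapes.activationCount = 3 * directions; // f(), g(), h() per direction
    return shapes;
}
```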

GRUOperatorDesc

Performs a (standard layers) one-layer gated recurrent unit (GRU) function on the input. This operator uses multiple gates to perform this layer. These gates are performed multiple times in a loop dictated by the sequence length dimension and the SequenceLengthsTensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* A tensor containing the input data, X. Packed (and potentially padded) into one 4D tensor with the Sizes of [ 1, seq_length, batch_size, input_size ]. seq_length is the dimension that is mapped to the index, t.
weightTensor const TensorDesc* A tensor containing the weight data, W. Concatenation of W_[zrh] and W_B[zrh] (if bidirectional). The tensor has Sizes [ 1, num_directions, 3 * hidden_size, input_size ].
recurrenceTensor const TensorDesc* A tensor containing the recurrence data, R. Concatenation of R_[zrh] and R_B[zrh] (if bidirectional). The tensor has Sizes [ 1, num_directions, 3 * hidden_size, hidden_size ].
biasTensor const TensorDesc* An optional tensor containing the bias data, B. Concatenation of (W_b[zrh], R_b[zrh]) and (W_Bb[zrh], R_Bb[zrh]) (if bidirectional). The tensor has Sizes [ 1, 1, num_directions, 6 * hidden_size ].
hiddenInitTensor const TensorDesc* An optional tensor containing the hidden node initializer tensor, H_[t-1] for the first loop index t. If not specified, then defaults to 0. This tensor has Sizes [ 1, num_directions, batch_size, hidden_size ].
sequenceLengthsTensor const TensorDesc* An optional tensor containing an independent seq_length for each element in the batch. If not specified, then all sequences in the batch have length seq_length. This tensor has Sizes [ 1, 1, 1, batch_size ].
outputSequenceTensor const TensorDesc* An optional tensor with which to write the concatenation of all the intermediate output values of the hidden nodes, H_t. This tensor has Sizes [ seq_length, num_directions, batch_size, hidden_size ]. seq_length is mapped to the loop index t.
outputSingleTensor const TensorDesc* An optional tensor with which to write the last output value of the hidden nodes, H_t. This tensor has Sizes [ 1, num_directions, batch_size, hidden_size ].
activationDescCount UInt32 The size of the activationDescs array.
activationDescs const OperatorDesc* An array of DML::OperatorDesc containing the descriptions of the activation operators, f() and g(). Both f() and g() are defined independently of direction, meaning that if RecurrentNetworkDirection::Forward or RecurrentNetworkDirection::Backward is supplied in Direction, then two activations must be provided. If RecurrentNetworkDirection::Bidirectional is supplied, then four activations must be provided. For bidirectional, the activations must be provided as f() and g() for the forward direction, followed by f() and g() for the backward direction.
direction RecurrentNetworkDirection The direction of the operator: forward, backward, or bidirectional.
linearBeforeReset bool true to specify that, when computing the output of the hidden gate, the linear transformation should be applied before multiplying by the output of the reset gate. Otherwise, false.

Derived from BaseOperatorDesc.

ElementWiseSignOperatorDesc

Returns a value representing the sign of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \begin{cases} -1, & \text{ if } x < 0\\ 0, & \text{ if } x = 0\\ 1, & \text{ if } x > 0 \end{cases}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

ElementWiseIsNaNOperatorDesc

For each element of the input tensor, returns 1 if the input is NaN (as defined by IEEE-754), and 0 otherwise. The result is placed into the corresponding element of the output tensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from UnaryOperatorDesc.

ElementWiseErfOperatorDesc

Performs the Gaussian error function (erf) on each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[erf(x)=\frac{2}{\sqrt{\pi}}\int_{0}^{x}e^{-t^{2}}\, dt\]

or

\[erf(x)=\frac{2}{\sqrt{\pi}}\int_{0}^{x \times scale + bias}e^{-t^{2}}\, dt\]

when scaleBias is specified.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.
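The effect of the optional scale and bias, and of in-place execution, can be mimicked on the CPU with std::erf. In this sketch ScaleBiasExample is a hypothetical stand-in for DML::ScaleBias and erfInPlace is a hypothetical function; a null pointer means that no scale and bias are applied.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for an optional scale and bias pair.
struct ScaleBiasExample
{
    float scale;
    float bias;
};

// Overwrites each element x with erf(x * scale + bias), or erf(x) when
// scaleBias is null. Input and output share the same buffer.
void erfInPlace(float* data, std::size_t count, const ScaleBiasExample* scaleBias)
{
    assert(data != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
    {
        const float x =
            scaleBias ? data[i] * scaleBias->scale + scaleBias->bias : data[i];
        data[i] = std::erf(x);
    }
}
```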

ElementWiseSinHOperatorDesc

Computes the hyperbolic sine of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseCosHOperatorDesc

Computes the hyperbolic cosine of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseTanHOperatorDesc

Computes the hyperbolic tangent of each element of InputTensor, placing the result into the corresponding element of OutputTensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseASinHOperatorDesc

Computes the hyperbolic arcsine for each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = log_e(x + \sqrt{x^2 + 1})\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseACosHOperatorDesc

Computes the hyperbolic arccosine for each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = log_e(x + \sqrt{x^2 - 1})\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseATanHOperatorDesc

Computes the hyperbolic arctangent for each element of InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \frac{1}{2} \log_e\left(\frac{1 + x}{1 - x}\right)\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
scaleBias const DML::ScaleBias* An optional scale and bias to apply to the input. If present, this has the effect of applying the function \(g(x) = x \times scale + bias\) to each input element prior to computing this operator.

Derived from UnaryOperatorWithScaleBiasDesc.

This operator supports in-place execution, meaning that OutputTensor can be bound to the same tensor as InputTensor during binding.

ElementWiseIfOperatorDesc

Selects elements either from ATensor or BTensor, depending on the value of the corresponding element in ConditionTensor. Non-zero elements of ConditionTensor select from ATensor, while zero-valued elements select from BTensor.

Constructor parameters:

Parameter Type Description
conditionTensor const TensorDesc* Describes the condition tensor to read from.
inputTensorA const TensorDesc* Describes the first input tensor to read from.
inputTensorB const TensorDesc* Describes the second input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.

Derived from BaseOperatorDesc.
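
The selection rule can be sketched on the CPU as follows. This is an illustration only: ReferenceIf is a hypothetical helper, the condition is assumed to be stored as 8-bit unsigned integers, and all three inputs are assumed to hold the same number of elements.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference for the element-wise selection: a non-zero condition
// element picks from a, a zero-valued condition element picks from b.
// Assumes condition, a and b all have the same element count.
inline std::vector<float> ReferenceIf(
    const std::vector<std::uint8_t>& condition,
    const std::vector<float>& a,
    const std::vector<float>& b )
{
    std::vector<float> output( condition.size( ) );
    for ( std::size_t i = 0; i < condition.size( ); ++i )
    {
        output[ i ] = condition[ i ] != 0 ? a[ i ] : b[ i ];
    }
    return output;
}
```

With condition { 1, 0, 2 }, a = { 1, 2, 3 } and b = { 10, 20, 30 }, the sketch returns { 1, 20, 3 }: any non-zero condition value, not just 1, selects from a.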

ActivationShrinkOperatorDesc

Performs the shrink activation function on every element in InputTensor, placing the result into the corresponding element of OutputTensor.

\[f(x) = \begin{cases} x - Bias, & \text{if } x > Threshold\\ x + Bias, & \text{if } x < -Threshold\\ 0, & \text{otherwise} \end{cases}\]

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
bias float The value of the bias. The default for this parameter is 0.0.
threshold float The value of the threshold. The default for this parameter is 0.5.

Derived from UnaryOperatorWithScaleBiasDesc.
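
The piecewise definition above translates directly into a scalar function. The sketch below is a CPU illustration with a hypothetical name, ReferenceShrink, whose default arguments mirror the documented defaults for bias and threshold.

```cpp
// CPU reference for the shrink activation on a single element.
// Defaults follow the documented parameter defaults (bias 0.0, threshold 0.5).
inline float ReferenceShrink( float x, float bias = 0.0f, float threshold = 0.5f )
{
    if ( x > threshold )
    {
        return x - bias;
    }
    if ( x < -threshold )
    {
        return x + bias;
    }
    // Values inside [ -threshold, threshold ] are shrunk to zero.
    return 0.0f;
}
```

With bias 0.5 and threshold 1.0, an input of 2.0 maps to 1.5, an input of -2.0 maps to -1.5, and an input of 0.25 maps to 0.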

MaxPooling1OperatorDesc

Computes the maximum value across the elements within the sliding window over the input tensor, and optionally returns the indices of the maximum values selected.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from, with Sizes [ BatchCount, ChannelCount, Height, Width ] if InputTensor.DimensionCount is 4, and [ BatchCount, ChannelCount, Depth, Height, Width ] if InputTensor.DimensionCount is 5.
outputTensor const TensorDesc* Describes the output tensor to write the results to.
outputIndicesTensor const TensorDesc* An optional output tensor that receives, for each maximum value stored in OutputTensor, the index of that value in the input tensor InputTensor. These index values are zero-based and treat the input tensor as a contiguous one-dimensional array. When multiple elements within the sliding window have the same value, the index points to the first one encountered and the later equal values are ignored. OutputTensor and OutputIndicesTensor have the same tensor sizes.
dimensionCount UInt32 The number of spatial dimensions of the input tensor inputTensor, which also corresponds to the number of dimensions of the sliding window windowSize. This parameter is also the size of the strides, startPadding, and endPadding arrays. It should be set to 2 when inputTensor is 4D, and 3 when it’s a 5D tensor.
strides const UInt32* The strides for the sliding window dimensions of sizes [ Height, Width ] when the dimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
windowSize const UInt32* The dimensions of the sliding window in [ Height, Width ] when dimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
startPadding const UInt32* The number of padding elements to be applied to the beginning of each spatial dimension of the input tensor InputTensor. The values are in [ Height, Width ] when dimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.
endPadding const UInt32* The number of padding elements to be applied to the end of each spatial dimension of the input tensor InputTensor. The values are in [ Height, Width ] when dimensionCount is set to 2, or [ Depth, Height, Width ] when set to 3.

The sizes of the output tensor can be calculated like this:

OutputTensor.Sizes[0] = InputTensor.Sizes[0];
OutputTensor.Sizes[1] = InputTensor.Sizes[1];

for (UInt32 i = 0; i < DimensionCount; ++i) 
{
  UInt32 PaddedSize = InputTensor.Sizes[i + 2] + StartPadding[i] + EndPadding[i];
  OutputTensor.Sizes[i + 2] = (PaddedSize - WindowSize[i]) / Strides[i] + 1;
}

Derived from UnaryOperatorDesc.
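
To make the size calculation and the index convention concrete, here is a CPU sketch for a single 2D plane (batch and channel counts of 1). Extent2D, MaxPoolResult, PooledSize and ReferenceMaxPool2D are hypothetical names, not Harlinn.AI or DirectML types, and the sketch assumes that padding elements never win the comparison.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using Extent2D = std::array<std::uint32_t, 2>; // [ Height, Width ]

struct MaxPoolResult
{
    Extent2D sizes{ };                  // [ Height, Width ] of the output
    std::vector<float> values;          // maximum of each window
    std::vector<std::uint32_t> indices; // flat index of each maximum in the input
};

// Output size along one spatial dimension, as in the formula above.
inline std::uint32_t PooledSize( std::uint32_t inputSize, std::uint32_t startPadding,
    std::uint32_t endPadding, std::uint32_t windowSize, std::uint32_t stride )
{
    const std::uint32_t paddedSize = inputSize + startPadding + endPadding;
    return ( paddedSize - windowSize ) / stride + 1;
}

inline MaxPoolResult ReferenceMaxPool2D( const std::vector<float>& input,
    const Extent2D& inputSizes, const Extent2D& strides, const Extent2D& windowSize,
    const Extent2D& startPadding, const Extent2D& endPadding )
{
    MaxPoolResult result;
    for ( std::size_t i = 0; i < 2; ++i )
    {
        result.sizes[ i ] = PooledSize( inputSizes[ i ], startPadding[ i ],
            endPadding[ i ], windowSize[ i ], strides[ i ] );
    }
    for ( std::uint32_t oy = 0; oy < result.sizes[ 0 ]; ++oy )
    {
        for ( std::uint32_t ox = 0; ox < result.sizes[ 1 ]; ++ox )
        {
            float best = -std::numeric_limits<float>::infinity( );
            std::uint32_t bestIndex = 0;
            for ( std::uint32_t wy = 0; wy < windowSize[ 0 ]; ++wy )
            {
                for ( std::uint32_t wx = 0; wx < windowSize[ 1 ]; ++wx )
                {
                    // Map the window position back to input coordinates.
                    const std::int64_t y = std::int64_t( oy ) * strides[ 0 ] + wy - startPadding[ 0 ];
                    const std::int64_t x = std::int64_t( ox ) * strides[ 1 ] + wx - startPadding[ 1 ];
                    if ( y < 0 || y >= inputSizes[ 0 ] || x < 0 || x >= inputSizes[ 1 ] )
                    {
                        continue; // assumption: padding is skipped
                    }
                    const auto index = static_cast<std::uint32_t>( y * inputSizes[ 1 ] + x );
                    // Strict comparison keeps the first of several equal values.
                    if ( input[ index ] > best )
                    {
                        best = input[ index ];
                        bestIndex = index;
                    }
                }
            }
            result.values.push_back( best );
            result.indices.push_back( bestIndex );
        }
    }
    return result;
}
```

For a 4 x 4 input holding the values 1 to 16 in row-major order, a 2 x 2 window with strides of 2 and no padding gives a 2 x 2 output with values { 6, 8, 14, 16 } and indices { 5, 7, 13, 15 }.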

MaxUnpoolingOperatorDesc

Inverts a max-pooling operation (see MaxPooling1OperatorDesc for details) by filling the output tensor OutputTensor with the values in the input tensor InputTensor, as obtained from a max-pooling operation, according to the index values provided in the IndicesTensor. The elements in the output tensor untouched by this process are left with zero values.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* Describes the input tensor to read from, with Sizes [ Batch, Channel, Height, Width ]. The tensor values are obtained from the values in the OutputTensor of a max-pooling operation.
indicesTensor const TensorDesc* A tensor of indices to the output tensor OutputTensor for the values given in the input tensor InputTensor. These index values are zero-based, and treat the output tensor as a contiguous one-dimensional array. Both the InputTensor and IndicesTensor have the same tensor sizes. The tensor values are obtained from the OutputIndicesTensor of a max-pooling operation.
outputTensor const TensorDesc* Describes the output tensor to write the results to with the same number of dimensions as the input tensor.

Derived from BaseOperatorDesc.
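
The inversion amounts to writing each input value at its recorded flat index in a zero-filled output. The following CPU sketch illustrates this under the assumption that the output is treated as a contiguous one-dimensional array; ReferenceMaxUnpool is a hypothetical helper, not part of Harlinn.AI.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference: scatters each pooled value back to the flat position
// recorded for it. Elements that are never written stay zero.
// Assumes input and indices have the same element count and that every
// index is smaller than outputElementCount.
inline std::vector<float> ReferenceMaxUnpool(
    const std::vector<float>& input,
    const std::vector<std::uint32_t>& indices,
    std::size_t outputElementCount )
{
    std::vector<float> output( outputElementCount, 0.0f );
    for ( std::size_t i = 0; i < input.size( ); ++i )
    {
        output[ indices[ i ] ] = input[ i ];
    }
    return output;
}
```

Given the values { 6, 8, 14, 16 } with indices { 5, 7, 13, 15 } and an output of 16 elements, the result holds those four values at positions 5, 7, 13 and 15, with zeros everywhere else.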

DiagonalMatrixOperatorDesc

Generates an identity-like matrix with ones (or another explicit value) on the major diagonal, and zeros everywhere else. The diagonal ones may be shifted (via Offset) where OutputTensor[i, i + Offset] = Value, meaning that an argument of Offset greater than zero shifts all values to the right, and less than zero shifts them to the left. This generator operator is useful for models to avoid storing a large constant tensor. Any leading dimensions before the last two are treated as a batch count, meaning that the tensor is treated as a stack of 2D matrices.

Constructor parameters:

Parameter Type Description
outputTensor const TensorDesc* Describes the output tensor to write the results to. The dimensions are [ Batch1, Batch2, OutputHeight, OutputWidth ]. The height and width don’t need to be equal, so the matrix doesn’t have to be square.
offset Int32 An offset to shift the diagonal lines of Value, with positive offsets shifting the written value to the right/up (viewing the output as a matrix with the top left as 0,0), and negative offsets to the left/down.
value float A value to fill along the 2D diagonal. The default value is 1.0. Note that if the DataType of the tensors is not DML::TensorDataType::Float16 or DML::TensorDataType::Float32, then the value might be truncated (for example, 10.6 will become 10).

Derived from BaseOperatorDesc.
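
The rule OutputTensor[i, i + Offset] = Value can be illustrated for a single 2D matrix with the CPU sketch below. ReferenceDiagonalMatrix is a hypothetical helper, and the output is assumed to be stored in row-major order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference for one height x width matrix in row-major order:
// element ( i, i + offset ) receives value, everything else is zero.
inline std::vector<float> ReferenceDiagonalMatrix( std::uint32_t height,
    std::uint32_t width, std::int32_t offset = 0, float value = 1.0f )
{
    std::vector<float> output( static_cast<std::size_t>( height ) * width, 0.0f );
    for ( std::uint32_t i = 0; i < height; ++i )
    {
        const std::int64_t j = std::int64_t( i ) + offset;
        // Skip rows where the shifted diagonal falls outside the matrix.
        if ( j >= 0 && j < width )
        {
            output[ static_cast<std::size_t>( i ) * width + static_cast<std::size_t>( j ) ] = value;
        }
    }
    return output;
}
```

For a 2 x 3 matrix with an offset of 1, the sketch yields the rows [ 0, 1, 0 ] and [ 0, 0, 1 ]: the diagonal is shifted one column to the right.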

ScatterElementsOperatorDesc

Copies the whole input tensor to the output, then overwrites selected indices with corresponding values from the updates tensor.

Constructor parameters:

Parameter Type Description
inputTensor const TensorDesc* The tensor to read from.
indicesTensor const TensorDesc* A tensor containing the indices into the output tensor. The Sizes must match inputTensor.Sizes for every dimension except axis. Negative indices are interpreted as being relative to the end of the axis dimension. For example, an index of -1 refers to the last element along that dimension.
updatesTensor const TensorDesc* A tensor containing the new values to replace the existing input values at the corresponding indices. The Sizes of this tensor must match indicesTensor.Sizes. The DataType must match InputTensor.DataType.
outputTensor const TensorDesc* The tensor to write the results to. The Sizes and DataType of this tensor must match InputTensor.
axis UInt32 The axis dimension to use for indexing in OutputTensor, in the range [0, OutputTensor.DimensionCount).

This operator performs the following pseudocode:

output = input
output[index[i, j, k, ...], j, k, ...] = updates[i, j, k, ...] // if axis == 0
output[i, index[i, j, k, ...], k, ...] = updates[i, j, k, ...] // if axis == 1
output[i, j, index[i, j, k, ...], ...] = updates[i, j, k, ...] // if axis == 2
...

If two output element indices overlap (which is invalid), then there is no guarantee as to which write wins.

ScatterOperatorDesc is the inverse of GatherOperatorDesc.

Derived from BaseOperatorDesc.
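
The pseudocode above can be made concrete for the two-dimensional case. The sketch below is a CPU illustration only: Matrix, IndexMatrix and ReferenceScatterElements2D are hypothetical names, indices are assumed to be signed 32-bit integers, and every index is assumed to be valid and unique per output element.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Matrix = std::vector<std::vector<float>>;
using IndexMatrix = std::vector<std::vector<std::int32_t>>;

// CPU reference for a 2D scatter along axis 0 or 1.
// output starts as a copy of input; each update is then written at the
// position where the coordinate along axis is replaced by the index.
inline Matrix ReferenceScatterElements2D( const Matrix& input,
    const IndexMatrix& indices, const Matrix& updates, std::uint32_t axis )
{
    Matrix output = input;
    const std::int64_t rows = static_cast<std::int64_t>( input.size( ) );
    const std::int64_t columns = rows > 0 ? static_cast<std::int64_t>( input[ 0 ].size( ) ) : 0;
    const std::int64_t extent = axis == 0 ? rows : columns;
    for ( std::size_t i = 0; i < indices.size( ); ++i )
    {
        for ( std::size_t j = 0; j < indices[ i ].size( ); ++j )
        {
            std::int64_t index = indices[ i ][ j ];
            if ( index < 0 )
            {
                index += extent; // negative indices count from the end of the axis
            }
            if ( axis == 0 )
            {
                output[ static_cast<std::size_t>( index ) ][ j ] = updates[ i ][ j ];
            }
            else
            {
                output[ i ][ static_cast<std::size_t>( index ) ] = updates[ i ][ j ];
            }
        }
    }
    return output;
}
```

Scattering the updates { 1.1, 2.1 } into the single row { 1, 2, 3, 4, 5 } along axis 1 with indices { 1, -1 } gives { 1, 1.1, 3, 4, 2.1 }, since -1 refers to the last element of the row.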