Tensors
DiffKt provides several types of differentiable tensors. A tensor is a multi-dimensional array: a float scalar is a 0D tensor, a vector is a 1D tensor, a 2D array is a 2D tensor, a 3D array is a 3D tensor, and so on.
DTensor is the interface for all differentiable tensors in DiffKt. A differentiable tensor can be a scalar, a 1D tensor, a 2D tensor, a 3D tensor, or have even more dimensions. Scalars also inherit from DTensor. The interface defines a number of properties, functions, and extensions. The ones discussed here are size, rank, shape, isScalar, and indexing.
A tensor has a size, which is the number of elements in the tensor.
A tensor has a rank, which indicates the number of dimensions: rank 0 - scalar, rank 1 - 1D tensor, rank 2 - 2D tensor, rank 3 - 3D tensor, and so on.
A tensor has a shape, which indicates the number of axes and the length of each axis of the tensor.
A tensor has a Boolean property, isScalar, that indicates whether it is a scalar.
To retrieve an element of a tensor, use indexing, with the indices indicating the location of the element, such as [0,0] to get the first element of a 2D tensor.
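The relationship between these properties can be sketched in plain Kotlin. The SimpleTensor class below is a hypothetical stand-in written for this illustration, not DiffKt's DTensor, and it assumes a row-major layout where the last index varies fastest.

```kotlin
// Hypothetical stand-in for illustration; not DiffKt's DTensor.
class SimpleTensor(val shape: List<Int>, val data: FloatArray) {
    init {
        require(data.size == shape.fold(1) { acc, d -> acc * d }) {
            "data length must equal the product of the axis lengths"
        }
    }

    // Number of axes: 0 for a scalar, 1 for a vector, 2 for a matrix, ...
    val rank: Int get() = shape.size

    // Total number of elements.
    val size: Int get() = data.size

    val isScalar: Boolean get() = rank == 0

    // Row-major lookup (an assumption of this sketch): the last index varies fastest.
    operator fun get(vararg indices: Int): Float {
        require(indices.size == rank) { "expected $rank indices" }
        var offset = 0
        for (axis in indices.indices) {
            require(indices[axis] in 0 until shape[axis]) { "index out of bounds on axis $axis" }
            offset = offset * shape[axis] + indices[axis]
        }
        return data[offset]
    }
}

fun main() {
    val m = SimpleTensor(listOf(2, 3), floatArrayOf(1f, 2f, 3f, 4f, 5f, 6f))
    println("rank=${m.rank} size=${m.size} isScalar=${m.isScalar}")
    println(m[0, 0])   // first element of the 2D tensor
    println(m[1, 2])   // last element of the 2D tensor
}
```

A shape of [2, 3] gives rank 2 and size 6, while an empty shape describes a scalar holding a single element.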
FloatTensor is an abstract class that implements DTensor for floating-point numbers. There are multiple implementations, such as scalar, dense, and sparse tensors.
DScalar is the interface for all differentiable scalars.
FloatScalar is an implementation of the interfaces DScalar and FloatTensor.
tensorOf is a factory function that creates a FloatTensor from a set of float numbers. The initial tensor is a 1D array. After creating a tensor with tensorOf, you may need to reshape the tensor to the shape you want.
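The idea of starting from a flat 1D list of values and then reshaping it can be sketched without the library. The reshapeTo2D helper below is hypothetical and written only for this illustration. It is not the DiffKt reshape API, and it assumes a row-major layout.

```kotlin
// Hypothetical helper for illustration; not part of DiffKt.
// Splits a flat, row-major array of values into the rows of a 2D layout.
fun reshapeTo2D(values: FloatArray, rows: Int, cols: Int): List<List<Float>> {
    require(cols > 0) { "cols must be positive" }
    require(rows * cols == values.size) {
        "cannot reshape ${values.size} elements into $rows x $cols"
    }
    return values.toList().chunked(cols)
}

fun main() {
    val flat = floatArrayOf(1f, 2f, 3f, 4f, 5f, 6f)   // 1D: shape [6]
    val matrix = reshapeTo2D(flat, 2, 3)               // 2D: shape [2, 3]
    println(matrix)
}
```

A reshape only changes how the same elements are addressed, so the product of the new axis lengths has to equal the original size.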
Tensor Operations
The DTensor interface has many operations that can be applied to a tensor. Click on the Extensions tab in the Kotlin docs of DTensor to see all the operations. Some of the operations support traditional arithmetic notation through operator overloading. We will look at a few of the operations in the examples below:
pow,
sin,
cos,
sum and,
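To show how operator overloading lets such operations read like ordinary arithmetic, here is a small plain-Kotlin sketch. The Vec class and the sin and cos functions over it are hypothetical and defined only for this example. They mimic the flavor of the operations listed above but are not DiffKt's implementations.

```kotlin
import kotlin.math.pow

// Hypothetical 1D vector type for illustration; not a DiffKt class.
class Vec(val values: List<Double>) {
    // Element-wise arithmetic exposed through Kotlin operator overloading.
    operator fun plus(other: Vec) = Vec(values.zip(other.values) { a, b -> a + b })
    operator fun times(other: Vec) = Vec(values.zip(other.values) { a, b -> a * b })

    // Element-wise integer power.
    fun pow(exponent: Int) = Vec(values.map { it.pow(exponent) })

    // Sum of all elements.
    fun sum(): Double = values.sum()
}

// Element-wise trigonometric functions over the hypothetical Vec type.
fun sin(v: Vec) = Vec(v.values.map { kotlin.math.sin(it) })
fun cos(v: Vec) = Vec(v.values.map { kotlin.math.cos(it) })

fun main() {
    val v = Vec(listOf(0.0, 1.0, 2.0))
    val identity = sin(v).pow(2) + cos(v).pow(2)   // each element is sin^2 + cos^2
    println(identity.sum())                        // close to the number of elements
    println((v * v).sum())                         // sum of squares
}
```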
Calculating the Derivative of a Scalar Function
There are two different algorithms for calculating the derivative of a function over a DScalar variable: the forward derivative algorithm and the reverse derivative algorithm. The forward derivative algorithm is more efficient when a function has more output variables than input variables. The reverse derivative algorithm is more efficient when a function has more input variables than output variables. In most cases of optimizing a scalar function, where the output of the function is a single variable, the reverse derivative algorithm is more efficient.
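As a rough rule of thumb for a function with N inputs and M outputs, the cost of obtaining every partial derivative can be summarized as follows. This is a general property of automatic differentiation, not a measurement of DiffKt, and it ignores memory use and constant factors.

```latex
f : \mathbb{R}^N \to \mathbb{R}^M, \qquad
\text{forward mode} \approx N \text{ passes over } f, \qquad
\text{reverse mode} \approx M \text{ passes over } f
```

For a scalar loss over many parameters, M = 1 and N is large, so the reverse algorithm needs roughly a single pass.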
Each of the functions below takes the scalar variable to differentiate with respect to, a DScalar, and a reference to the function of that variable. In Kotlin, if you declare the function fun f(x), then the function reference is ::f.
forwardDerivative calculates the derivative of a function over a DScalar, evaluated at the DScalar x, using the forward derivative algorithm.
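The forward derivative idea can be illustrated with dual numbers, which carry a value and its derivative through every operation. This is a minimal sketch in plain Kotlin. The Dual class and forwardDerivativeSketch are hypothetical names introduced here, and DiffKt's internals may differ.

```kotlin
// Hypothetical dual-number type for illustration; DiffKt's internals may differ.
// value carries f(x); deriv carries df/dx alongside it.
data class Dual(val value: Double, val deriv: Double) {
    operator fun plus(o: Dual) = Dual(value + o.value, deriv + o.deriv)

    // Product rule: (uv)' = u'v + uv'
    operator fun times(o: Dual) = Dual(value * o.value, deriv * o.value + value * o.deriv)

    // Scaling by a constant scales both the value and the derivative.
    operator fun times(k: Double) = Dual(value * k, deriv * k)
}

// Seeds the input with derivative 1 and reads the derivative off the output.
fun forwardDerivativeSketch(x: Double, f: (Dual) -> Dual): Double = f(Dual(x, 1.0)).deriv

// f(x) = x^2 + 3x, so the analytic derivative is 2x + 3.
fun f(x: Dual): Dual = x * x + x * 3.0

fun main() {
    println(forwardDerivativeSketch(2.0, ::f))   // analytic value at x = 2 is 7
}
```

The function is passed with the ::f reference, in the same style as the calls described above.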
reverseDerivative calculates the derivative of a function over a DScalar, evaluated at the DScalar x, using the reverse derivative algorithm.
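The reverse derivative idea can be sketched as well: first evaluate the function while recording how each intermediate value depends on its inputs, then walk backward from the output accumulating derivatives. The Node class and reverseDerivativeSketch below are hypothetical names for this illustration and do not reflect how DiffKt is implemented.

```kotlin
// Hypothetical reverse-mode sketch for illustration; not DiffKt's implementation.
// Each node remembers its parents and the local partial derivative toward each one.
class Node(val value: Double, val parents: List<Pair<Node, Double>> = emptyList()) {
    var grad = 0.0

    operator fun plus(o: Node) = Node(value + o.value, listOf(this to 1.0, o to 1.0))
    operator fun times(o: Node) = Node(value * o.value, listOf(this to o.value, o to value))
}

// Collects nodes so that every node appears after all of its parents.
fun topologicalOrder(root: Node): List<Node> {
    val visited = HashSet<Node>()
    val order = ArrayList<Node>()
    fun visit(n: Node) {
        if (visited.add(n)) {
            n.parents.forEach { visit(it.first) }
            order.add(n)
        }
    }
    visit(root)
    return order
}

fun reverseDerivativeSketch(x: Double, f: (Node) -> Node): Double {
    val input = Node(x)
    val output = f(input)
    output.grad = 1.0                                  // seed: d(output)/d(output) = 1
    for (n in topologicalOrder(output).reversed()) {   // walk from the output back to the input
        for ((parent, local) in n.parents) parent.grad += n.grad * local
    }
    return input.grad
}

// g(x) = x^2 + 3x, so the analytic derivative is 2x + 3.
fun g(x: Node): Node = x * x + x * Node(3.0)

fun main() {
    println(reverseDerivativeSketch(2.0, ::g))   // analytic value at x = 2 is 7
}
```

A single backward walk yields the derivative with respect to every input that was recorded, which is why this approach suits functions with many inputs and one output.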
In many cases it is more efficient to calculate the original scalar function and its derivative at the same time.
The functions below return a Pair<DTensor, DTensor>. The first value, called the primal, is the value of the function evaluated at x. The second value, called the tangent, is the derivative of the function evaluated at x. In both cases x is a tensor.
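A small worked case makes the two halves of the pair concrete. The function below is chosen only for illustration:

```latex
\begin{aligned}
f(x) &= x^2, & f'(x) &= 2x \\
x = 3:\quad \text{primal} &= f(3) = 9, & \text{tangent} &= f'(3) = 6
\end{aligned}
```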
primalAndForwardDerivative calculates a function over a DScalar and its derivative, evaluated at the DScalar x, using the forward derivative algorithm.
primalAndReverseDerivative calculates a function over a DScalar and its derivative, evaluated at the DScalar x, using the reverse derivative algorithm.
Derivatives of a Function over a Tensor
The symbol nabla, $\nabla$, is an inverted Greek delta, $\Delta$. The gradient of a function $f$ over a vector of variables $x$ is $\nabla f(x)$, the vector of partial derivatives of the function with respect to each variable. The Jacobian of a vector-valued function, written either $J_f$ or $\nabla f$, collects the gradient of each vector component of the function, that is, the partial derivatives of each vector component of the function with respect to each variable.
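The partial derivatives of a function with N inputs and 1 output at a point $x$, where $x$ is a vector of size N, that is, a function $f: \mathbb{R}^N \to \mathbb{R}$, form the gradient of the function, which is a function $\nabla f: \mathbb{R}^N \to \mathbb{R}^N$. The gradient of a function of N variables, where $x = [x_1, x_2, \ldots, x_N]$, is
$$\nabla f(x) = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_N} \right].$$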
For example, if $f(x) = x_1^2 + 3x_2$ then $\nabla f(x) = [2x_1, 3]$, where $x = [x_1, x_2]$.
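The partial derivatives of a function with N inputs and M outputs at a point $x$, where $x$ is of size N, that is, a function $f: \mathbb{R}^N \to \mathbb{R}^M$, form the Jacobian of the function, or $J_f(x)$. The point $x$ is a vector of variables, $x = [x_1, x_2, \ldots, x_N]$. The function is a vector of functions evaluated at $x$, $f(x) = [f_1(x), f_2(x), \ldots, f_M(x)]$.
The Jacobian of a function is the partial derivatives of each component function with respect to each variable:
$$J_f(x) = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_N} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_M}{\partial x_1} & \cdots & \frac{\partial f_M}{\partial x_N} \end{bmatrix}.$$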
For example, if $f(x) = [x_1 x_2,\; x_1 + x_2^2]$ then the Jacobian is $J_f(x) = \begin{bmatrix} x_2 & x_1 \\ 1 & 2x_2 \end{bmatrix}$.
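One way to sanity-check a gradient or Jacobian is to approximate it with central finite differences. The numericalJacobian helper and the example function below are hypothetical, written only for this sketch. Finite differences are a numerical approximation and not how an automatic differentiation library such as DiffKt computes derivatives.

```kotlin
// Hypothetical helper for illustration; approximates the Jacobian numerically.
// Returns an M x N matrix whose entry [i][j] approximates d f_i / d x_j.
fun numericalJacobian(
    f: (DoubleArray) -> DoubleArray,
    x: DoubleArray,
    h: Double = 1e-6
): Array<DoubleArray> {
    val m = f(x).size
    val jac = Array(m) { DoubleArray(x.size) }
    for (j in x.indices) {
        // Perturb only variable j in each direction.
        val plus = x.copyOf().also { it[j] += h }
        val minus = x.copyOf().also { it[j] -= h }
        val fPlus = f(plus)
        val fMinus = f(minus)
        for (i in 0 until m) jac[i][j] = (fPlus[i] - fMinus[i]) / (2 * h)
    }
    return jac
}

// Example chosen for this sketch: f(x1, x2) = [x1 * x2, x1 + x2^2].
// Its analytic Jacobian is [[x2, x1], [1, 2 * x2]].
fun example(x: DoubleArray): DoubleArray = doubleArrayOf(x[0] * x[1], x[0] + x[1] * x[1])

fun main() {
    val jac = numericalJacobian(::example, doubleArrayOf(2.0, 3.0))
    jac.forEach { row -> println(row.joinToString(prefix = "[", postfix = "]")) }
}
```

A function with a single output gives a Jacobian with one row, which is its gradient.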
forwardDerivative calculates the derivative of a function over a tensor, evaluated at the tensor x, using the forward derivative algorithm.
reverseDerivative calculates the derivative of a function over a tensor, evaluated at the tensor x, using the reverse derivative algorithm.
When the result is a Jacobian or 2D tensor, the reverse derivative algorithm returns the transpose of the result that the forward derivative algorithm returns.
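When comparing the two results for the same function, one of them therefore has to be transposed first. The helper below is a hypothetical plain-Kotlin sketch of a 2D transpose over nested arrays. It does not use DiffKt's tensor types and makes no assumption about which of the two layouts either algorithm produces.

```kotlin
// Hypothetical helper for illustration: swaps the rows and columns of a 2D array,
// so entry [i][j] of the input becomes entry [j][i] of the result.
fun transpose(m: Array<DoubleArray>): Array<DoubleArray> {
    if (m.isEmpty()) return m
    return Array(m[0].size) { j -> DoubleArray(m.size) { i -> m[i][j] } }
}

fun main() {
    val a = arrayOf(doubleArrayOf(1.0, 2.0, 3.0), doubleArrayOf(4.0, 5.0, 6.0))   // 2 x 3
    val t = transpose(a)                                                           // 3 x 2
    t.forEach { row -> println(row.joinToString(prefix = "[", postfix = "]")) }
}
```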
In many cases it is more efficient to calculate the original function and its partial derivatives at the same time.
The functions below return a Pair<DTensor, DTensor>. The first value, called the primal, is the value of the function evaluated at x. The second value, called the tangent, is the derivative of the function evaluated at x. In both cases x is a tensor.
primalAndForwardDerivative calculates a function over a tensor x and its derivative, evaluated at the tensor x, using the forward derivative algorithm.
primalAndReverseDerivative calculates a function over a tensor x and its derivative, evaluated at the tensor x, using the reverse derivative algorithm.
As with reverseDerivative, when the result is a Jacobian or 2D tensor, the reverse derivative algorithm returns the transpose of the result that the forward derivative algorithm returns.