Python Class Documentation & Reference

This section provides a breakdown of the Python classes and the functions each of them provides. The relationship between Kompute objects and Vulkan SDK resources primarily concerns ownership of CPU and/or GPU memory.

Manager

The Kompute Manager provides a high-level interface that simplifies interaction with the underlying kp.Sequence of operations.

class kp.Manager

Base orchestrator which creates and manages device and child components

algorithm(*args, **kwargs)

Overloaded function.

  1. algorithm(self: kp.Manager, tensors: List[kp.Tensor], spirv: bytes, workgroup: List[int[3]] = [0, 0, 0], spec_consts: List[float] = [], push_consts: List[float] = []) -> kp.Algorithm

Create a managed algorithm that will be destroyed by this manager if it has not already been destroyed by its reference count reaching zero.

@param tensors (optional) The tensors to initialise the algorithm with
@param spirv (optional) The SPIR-V bytes for the algorithm to dispatch
@param workgroup (optional) kp::Workgroup for the algorithm to use; defaults to (tensor[0].size(), 1, 1)
@param spec_consts (optional) List of float values to use for specialization constants; defaults to empty
@param push_consts (optional) List of float values to use for push constants; defaults to empty
@returns Shared pointer to the initialised algorithm

  2. algorithm(self: kp.Manager, tensors: List[kp.Tensor], spirv: bytes, workgroup: List[int[3]] = [0, 0, 0], spec_consts: numpy.ndarray = [], push_consts: numpy.ndarray = []) -> kp.Algorithm

Create a managed algorithm that will be destroyed by this manager if it has not already been destroyed by its reference count reaching zero.

@param tensors (optional) The tensors to initialise the algorithm with
@param spirv (optional) The SPIR-V bytes for the algorithm to dispatch
@param workgroup (optional) kp::Workgroup for the algorithm to use; defaults to (tensor[0].size(), 1, 1)
@param spec_consts (optional) numpy array of values to use for specialization constants; defaults to empty
@param push_consts (optional) numpy array of values to use for push constants; defaults to empty
@returns Shared pointer to the initialised algorithm
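
For illustration, a minimal sketch of creating an algorithm through the manager, assuming shader.spv is a precompiled SPIR-V compute shader (a hypothetical file path):

    import numpy as np
    import kp

    mgr = kp.Manager()

    # Tensors are bound to the shader in the order they are passed
    tensor_in = mgr.tensor(np.array([1.0, 2.0, 3.0], dtype=np.float32))
    tensor_out = mgr.tensor(np.zeros(3, dtype=np.float32))

    # Assumes shader.spv holds precompiled SPIR-V (hypothetical path)
    spirv = open("shader.spv", "rb").read()

    # Workgroup [0, 0, 0] keeps the default (tensor[0].size(), 1, 1)
    algo = mgr.algorithm([tensor_in, tensor_out], spirv)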

destroy(self: kp.Manager) → None

Destroy the GPU resources and all resources managed by the manager.

get_device_properties(self: kp.Manager) → dict

Return a dict containing information about the device

list_devices(self: kp.Manager) → list

Return a list containing information about the available devices
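
A short sketch of querying device information, assuming mgr is an existing kp.Manager; the exact dictionary keys depend on the installed Vulkan driver, so the sketch prints the raw structures:

    print(mgr.get_device_properties())  # dict describing the selected device
    for device in mgr.list_devices():
        print(device)                   # one entry per available device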

sequence(self: kp.Manager, queue_index: int = 0, total_timestamps: int = 0) → kp.Sequence

Create a managed sequence that will be destroyed by this manager if it has not already been destroyed by its reference count reaching zero.

@param queue_index The queue to use from the available queues
@param total_timestamps The maximum number of timestamps to allocate; if zero (default), timestamp latching is disabled
@returns Shared pointer to the initialised sequence
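
A sketch of sequence creation with timestamp profiling enabled, reusing mgr and tensor_in from the earlier sketch (the device must support timestamp queries):

    # Reserve timestamp slots so get_timestamps() can be used after eval()
    seq = mgr.sequence(queue_index=0, total_timestamps=4)
    seq.eval(kp.OpTensorSyncDevice([tensor_in]))
    print(seq.get_timestamps())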

tensor(self: kp.Manager, data: numpy.ndarray[numpy.float32], tensor_type: kp.TensorTypes = <TensorTypes.device: 0>) → kp.Tensor
tensor_t(self: kp.Manager, data: numpy.ndarray, tensor_type: kp.TensorTypes = <TensorTypes.device: 0>) → kp.Tensor

Create a managed tensor that will be destroyed by this manager if it has not already been destroyed by its reference count reaching zero.

@param data The data to initialise the tensor with
@param tensor_type The type of tensor to initialise
@returns Shared pointer to the initialised tensor
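
A sketch contrasting the two factory methods: tensor() always creates a float32 tensor, while tensor_t() derives the data type from the numpy array's dtype (assuming mgr from the earlier sketch):

    import numpy as np

    t_f32 = mgr.tensor(np.array([0.0, 1.0, 2.0], dtype=np.float32))
    t_u32 = mgr.tensor_t(np.array([0, 1, 2], dtype=np.uint32))
    print(t_f32.size(), t_u32.size())  # 3 3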

Sequence

A Kompute Sequence consists of batches of Kompute Operations, which are executed on a respective GPU queue. Sequences can be executed synchronously or asynchronously, and their execution can be coordinated through their respective vk::Fence.

class kp.Sequence
clear(self: kp.Sequence) → None

Clears all currently recorded operations and starts recording again.

destroy(self: kp.Sequence) → None

Destroys and frees the GPU resources, which include the buffer and memory, and marks the sequence as uninitialised (init=False).

eval(*args, **kwargs)

Overloaded function.

  1. eval(self: kp.Sequence) -> kp.Sequence

Eval sends all the recorded and stored operations in the vector of operations into the GPU as a submit job synchronously (with a barrier).

@return shared_ptr<Sequence> of the Sequence class itself

  2. eval(self: kp.Sequence, arg0: kp.OpBase) -> kp.Sequence

Resets all the recorded and stored operations, records the operation provided, and submits it into the GPU as a submit job synchronously (with a barrier).

@return shared_ptr<Sequence> of the Sequence class itself
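
A sketch of both synchronous forms, reusing mgr, tensor_in, and algo from the earlier sketches:

    seq = mgr.sequence()

    # eval(): submit everything recorded so far, synchronously
    seq.record(kp.OpTensorSyncDevice([tensor_in]))
    seq.eval()

    # eval(op): reset the recording, record the given operation, submit it
    seq.eval(kp.OpTensorSyncLocal([tensor_in]))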

eval_async(*args, **kwargs)

Overloaded function.

  1. eval_async(self: kp.Sequence) -> kp.Sequence

Eval Async sends all the recorded and stored operations in the vector of operations into the GPU as a submit job without a barrier. eval_await() must always be called afterwards to ensure the sequence terminates correctly.

@return shared_ptr<Sequence> of the Sequence class itself

  2. eval_async(self: kp.Sequence, arg0: kp.OpBase) -> kp.Sequence

Resets all the recorded and stored operations, records the operation provided, and submits it into the GPU as a submit job without a barrier. eval_await() must always be called afterwards to ensure the sequence terminates correctly.

@return shared_ptr<Sequence> of the Sequence class itself

eval_await(*args, **kwargs)

Overloaded function.

  1. eval_await(self: kp.Sequence) -> kp.Sequence

Eval Await waits for the fence to finish processing and, once it finishes, runs the postEval of all operations.

@return shared_ptr<Sequence> of the Sequence class itself

  2. eval_await(self: kp.Sequence, arg0: int) -> kp.Sequence

Eval Await waits for the fence to finish processing and, once it finishes, runs the postEval of all operations.

@param waitFor Number of milliseconds to wait before timing out
@return shared_ptr<Sequence> of the Sequence class itself
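
A sketch of the asynchronous pattern these two calls are designed for, reusing mgr and algo from the earlier sketches:

    seq = mgr.sequence()
    seq.record(kp.OpAlgoDispatch(algo))

    seq.eval_async()      # submit without a barrier; returns immediately
    # ... overlapping CPU work can run here ...
    seq.eval_await(1000)  # wait up to 1000 ms for the fence, then run postEval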

get_timestamps(self: kp.Sequence) → List[int]

Return the timestamps that were latched at the beginning of the sequence and after each operation during the last eval() call.

is_init(self: kp.Sequence) → bool

Returns true if the sequence has been initialised, based on whether its GPU resources are referenced.

@return Boolean stating if is initialized

is_recording(self: kp.Sequence) → bool

Returns true if the sequence is currently recording.

@return Boolean stating if recording ongoing.

is_running(self: kp.Sequence) → bool

Returns true if the sequence is currently running - mostly used for async workloads.

@return Boolean stating if currently running.

record(self: kp.Sequence, arg0: kp.OpBase) → kp.Sequence

Record function for operations to be added to the GPU queue in batch. The operation must derive from the OpBase class, and the Sequence must be recording; otherwise the operation cannot be added.

@param op Object derived from kp::OpBase that will be recorded by the sequence, to be used when the operation is evaluated
@return shared_ptr<Sequence> of the Sequence class itself

rerecord(self: kp.Sequence) → None

Clears the command buffer and triggers a re-record of all the currently saved operations, which is useful if the underlying kp::Tensors or kp::Algorithms are modified and need to be re-recorded.
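
A sketch of the rerecord() flow, reusing names from the earlier sketches:

    seq = mgr.sequence()
    seq.record(kp.OpTensorSyncDevice([tensor_in]))
    seq.record(kp.OpAlgoDispatch(algo))
    seq.eval()

    # After the underlying tensors or algorithm change, refresh the
    # command buffer instead of clearing and re-recording by hand
    seq.rerecord()
    seq.eval()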

Tensor

The Kompute Tensor is the atomic unit in Kompute, and it is used primarily for handling Host and GPU Device data.

class kp.Tensor

Structured data used in GPU operations.

Tensors are the base building block in Kompute for performing operations across GPUs. Each tensor has a respective Vulkan memory and buffer, which is used to store its data. Tensors can be used for GPU data storage or transfer.

data(self: kp.Tensor) → numpy.ndarray

Retrieve the tensor's data as a numpy array.

data_type(self: kp.Tensor) → kp::Tensor::TensorDataTypes

Retrieve the underlying data type of the Tensor

@return Data type of tensor of type kp::Tensor::TensorDataTypes

destroy(self: kp.Tensor) → None

Destroys and frees the GPU resources which include the buffer and memory.

is_init(self: kp.Tensor) → bool

Check whether the tensor is initialised, based on the created GPU resources.

@returns Boolean stating whether tensor is initialized

size(self: kp.Tensor) → int

Returns the size/magnitude of the Tensor, which will be the total number of elements across all dimensions

@return Unsigned integer representing the total number of elements

tensor_type(self: kp.Tensor) → kp.TensorTypes

Retrieve the tensor type of the Tensor

@return Tensor type of tensor
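
A short sketch exercising the accessors above, assuming mgr and numpy as np from the earlier sketches:

    t = mgr.tensor(np.array([1.0, 2.0], dtype=np.float32))
    print(t.data())         # numpy array with the tensor's host-visible data
    print(t.size())         # 2
    print(t.data_type())    # data type enum for float32 data
    print(t.tensor_type())  # TensorTypes.device (the default)
    print(t.is_init())      # True while the GPU resources exist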

Algorithm

class kp.Algorithm

Main constructor for algorithm with configuration parameters to create the underlying resources.

@param device The Vulkan device to use for creating resources
@param tensors (optional) The tensors to use to create the descriptor resources
@param spirv (optional) The SPIR-V code to use to create the algorithm
@param workgroup (optional) The kp::Workgroup to use for the dispatch; defaults to kp::Workgroup(tensor[0].size(), 1, 1) if not set
@param specializationConstants (optional) The std::vector<float> used to initialise the specialization constants, which cannot be changed once set
@param pushConstants (optional) The std::vector<float> used when initialising the pipeline, which sets the size of the push constants; these can be modified, but all new values must have the same vector size as this initial value

destroy(self: kp.Algorithm) → None

Destroys and frees the GPU resources owned by the algorithm.

get_tensors(self: kp.Algorithm) → List[kp::Tensor]

Gets the current tensors that are used in the algorithm.

@returns The list of tensors used in the algorithm.

is_init(self: kp.Algorithm) → bool

Checks all the GPU resource components to verify that they have been created, and returns true if all are valid.

@returns True if the algorithm is currently initialised
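
A short sketch of inspecting an algorithm, reusing algo from the Manager sketch:

    if algo.is_init():
        for t in algo.get_tensors():
            print(t.size())  # each tensor bound to the algorithm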

Operations

class kp.OpAlgoDispatch

Operation that provides a general abstraction that simplifies the use of algorithm and parameter components which can be used with shaders. By default it enables the user to provide a dynamic number of tensors which are then passed as inputs.
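
A minimal sketch of dispatching an algorithm through a sequence, reusing mgr and algo from the earlier sketches:

    seq = mgr.sequence()
    seq.eval(kp.OpAlgoDispatch(algo))  # run the algorithm's shader once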

class kp.OpBase

Base Operation which provides the high level interface that Kompute operations implement in order to perform a set of actions in the GPU.

Operations can perform actions on tensors, and optionally can also own an Algorithm with respective parameters. kp::Operations with kp::Algorithms would inherit from kp::OpBaseAlgo.

class kp.OpMult

Operation that performs multiplication on two tensors and outputs the result to a third tensor.

class kp.OpTensorCopy

Operation that copies the data from the first tensor to the rest of the tensors provided, using a record command for all the tensors. This operation does not own/manage the memory of the tensors passed to it. The operation must only receive tensors of type

class kp.OpTensorSyncDevice

Operation that syncs a tensor's device memory by mapping local data into the device memory. For TensorTypes::eDevice it will use a record operation for the memory to be synced into GPU memory, which means the operation will be done in sync with GPU commands. For TensorTypes::eHost it will only map the data into host memory, which happens during preEval, before the recorded commands are dispatched.

class kp.OpTensorSyncLocal

Operation that syncs a tensor's local memory by mapping device data into the local CPU memory. For TensorTypes::eDevice it will use a record operation for the memory to be synced into local memory, which means the operation will be done in sync with GPU commands. For TensorTypes::eHost it will only map the data into host memory, which happens during preEval, before the recorded commands are dispatched.
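
A sketch of the usual host-to-device-to-host roundtrip built from the two sync operations, reusing names from the earlier sketches:

    mgr.sequence() \
       .record(kp.OpTensorSyncDevice([tensor_in])) \
       .record(kp.OpAlgoDispatch(algo)) \
       .record(kp.OpTensorSyncLocal([tensor_out])) \
       .eval()
    print(tensor_out.data())  # shader results are now visible on the host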

TensorType

class kp.TensorTypes

Members:

device : Type is device memory, source and destination

host : Type is host memory, source and destination

storage : Type is device memory (only)

property name

The string name of the enum member.
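
A sketch of selecting a tensor type at creation time, assuming mgr from the earlier sketches and that the tensor factories accept any kp.TensorTypes value, as their signatures indicate:

    import numpy as np

    # host: host-visible memory, usable as transfer source and destination
    t_host = mgr.tensor(np.zeros(4, dtype=np.float32), kp.TensorTypes.host)

    # storage: device-only memory, not used for host transfers
    t_storage = mgr.tensor(np.zeros(4, dtype=np.float32), kp.TensorTypes.storage)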