Python Class Documentation & Reference

This section provides a breakdown of the Python classes and the functions each of them provides. The relationship between Kompute objects and Vulkan SDK resources primarily concerns ownership of CPU and/or GPU memory.

Manager

The Kompute Manager provides a high-level interface that simplifies interaction with the underlying kp.Sequence of operations.

class kp.Manager

Base orchestrator which creates and manages device and child components

algorithm(*args, **kwargs)

Overloaded function.

  1. algorithm(self: kp.Manager, tensors: List[kp.Tensor], spirv: bytes, workgroup: List[int[3]] = [0, 0, 0], spec_consts: List[float] = [], push_consts: List[float] = []) -> kp.Algorithm

Create a managed algorithm that will be destroyed by this manager if it has not already been destroyed by its reference count reaching zero.

@param tensors (optional) The tensors to initialise the algorithm with
@param spirv (optional) The SPIR-V bytes for the algorithm to dispatch
@param workgroup (optional) kp::Workgroup for the algorithm to use; defaults to (tensor[0].size(), 1, 1)
@param spec_consts (optional) List of float values to use for specialization constants; defaults to empty
@param push_consts (optional) List of float values to use for push constants; defaults to empty
@returns Shared pointer to the initialised algorithm

  2. algorithm(self: kp.Manager, tensors: List[kp.Tensor], spirv: bytes, workgroup: List[int[3]] = [0, 0, 0], spec_consts: numpy.ndarray = [], push_consts: numpy.ndarray = []) -> kp.Algorithm

Create a managed algorithm that will be destroyed by this manager if it has not already been destroyed by its reference count reaching zero.

@param tensors (optional) The tensors to initialise the algorithm with
@param spirv (optional) The SPIR-V bytes for the algorithm to dispatch
@param workgroup (optional) kp::Workgroup for the algorithm to use; defaults to (tensor[0].size(), 1, 1)
@param spec_consts (optional) numpy array of values to use for specialization constants; defaults to empty
@param push_consts (optional) numpy array of values to use for push constants; defaults to empty
@returns Shared pointer to the initialised algorithm
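
For illustration, a minimal sketch of creating an algorithm through the manager, assuming shader.spv is a precompiled SPIR-V compute shader (a hypothetical file path):

    import numpy as np
    import kp

    mgr = kp.Manager()

    # Tensors are bound to the shader in the order they are passed
    tensor_in = mgr.tensor(np.array([1.0, 2.0, 3.0], dtype=np.float32))
    tensor_out = mgr.tensor(np.zeros(3, dtype=np.float32))

    # Assumes shader.spv holds precompiled SPIR-V (hypothetical path)
    spirv = open("shader.spv", "rb").read()

    # Workgroup [0, 0, 0] keeps the default (tensor[0].size(), 1, 1)
    algo = mgr.algorithm([tensor_in, tensor_out], spirv)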

destroy(self: kp.Manager) → None

Destroy the GPU resources and all resources managed by the manager.

get_device_properties(self: kp.Manager) → dict

Return a dict containing information about the device

list_devices(self: kp.Manager) → list

Return a list containing information about the available devices
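
A short sketch of querying device information, assuming mgr is an existing kp.Manager; the exact dictionary keys depend on the installed Vulkan driver, so the sketch prints the raw structures:

    print(mgr.get_device_properties())  # dict describing the selected device
    for device in mgr.list_devices():
        print(device)                   # one entry per available device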

sequence(self: kp.Manager, queue_index: int = 0, total_timestamps: int = 0) → kp.Sequence

Create a managed sequence that will be destroyed by this manager if it has not already been destroyed by its reference count reaching zero.

@param queue_index The queue to use from the available queues
@param total_timestamps The maximum number of timestamps to allocate; if zero (default), timestamp latching is disabled
@returns Shared pointer to the initialised sequence
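
A sketch of sequence creation with timestamp profiling enabled, reusing mgr and tensor_in from the earlier sketch (the device must support timestamp queries):

    # Reserve timestamp slots so get_timestamps() can be used after eval()
    seq = mgr.sequence(queue_index=0, total_timestamps=4)
    seq.eval(kp.OpTensorSyncDevice([tensor_in]))
    print(seq.get_timestamps())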

tensor(self: kp.Manager, data: numpy.ndarray[numpy.float32], tensor_type: kp.TensorTypes = <TensorTypes.device: 0>) → kp.Tensor
tensor_t(self: kp.Manager, data: numpy.ndarray, tensor_type: kp.TensorTypes = <TensorTypes.device: 0>) → kp.Tensor

Create a managed tensor that will be destroyed by this manager if it has not already been destroyed by its reference count reaching zero.

@param data The data to initialise the tensor with
@param tensor_type The type of tensor to initialise
@returns Shared pointer to the initialised tensor
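
A sketch contrasting the two factory methods: tensor() always creates a float32 tensor, while tensor_t() derives the data type from the numpy array's dtype (assuming mgr from the earlier sketch):

    import numpy as np

    t_f32 = mgr.tensor(np.array([0.0, 1.0, 2.0], dtype=np.float32))
    t_u32 = mgr.tensor_t(np.array([0, 1, 2], dtype=np.uint32))
    print(t_f32.size(), t_u32.size())  # 3 3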

Sequence

A Kompute Sequence consists of batches of Kompute Operations, which are executed on a respective GPU queue. Sequences can be executed synchronously or asynchronously, and their execution can be coordinated through their respective vk::Fence.

class kp.Sequence
clear(self: kp.Sequence) → None

Clears all currently recorded operations and starts recording again.

destroy(self: kp.Sequence) → None

Destroys and frees the GPU resources, which include the buffer and memory, and marks the sequence as uninitialised (init=False).

eval(*args, **kwargs)

Overloaded function.

  1. eval(self: kp.Sequence) -> kp.Sequence

Eval sends all the recorded and stored operations in the vector of operations into the GPU as a submit job synchronously (with a barrier).

@return shared_ptr<Sequence> of the Sequence class itself

  2. eval(self: kp.Sequence, arg0: kp.OpBase) -> kp.Sequence

Resets all the recorded and stored operations, records the operation provided, and submits it into the GPU as a submit job synchronously (with a barrier).

@return shared_ptr<Sequence> of the Sequence class itself
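
A sketch of both synchronous forms, reusing mgr, tensor_in, and algo from the earlier sketches:

    seq = mgr.sequence()

    # eval(): submit everything recorded so far, synchronously
    seq.record(kp.OpTensorSyncDevice([tensor_in]))
    seq.eval()

    # eval(op): reset the recording, record the given operation, submit it
    seq.eval(kp.OpTensorSyncLocal([tensor_in]))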

eval_async(*args, **kwargs)

Overloaded function.

  1. eval_async(self: kp.Sequence) -> kp.Sequence

Eval Async sends all the recorded and stored operations in the vector of operations into the GPU as a submit job without a barrier. eval_await() must always be called afterwards to ensure the sequence terminates correctly.

@return shared_ptr<Sequence> of the Sequence class itself

  2. eval_async(self: kp.Sequence, arg0: kp.OpBase) -> kp.Sequence

Resets all the recorded and stored operations, records the operation provided, and submits it into the GPU as a submit job without a barrier. eval_await() must always be called afterwards to ensure the sequence terminates correctly.

@return shared_ptr<Sequence> of the Sequence class itself

eval_await(*args, **kwargs)

Overloaded function.

  1. eval_await(self: kp.Sequence) -> kp.Sequence

Eval Await waits for the fence to finish processing and, once it finishes, runs the postEval of all operations.

@return shared_ptr<Sequence> of the Sequence class itself

  2. eval_await(self: kp.Sequence, arg0: int) -> kp.Sequence

Eval Await waits for the fence to finish processing and, once it finishes, runs the postEval of all operations.

@param waitFor Number of milliseconds to wait before timing out
@return shared_ptr<Sequence> of the Sequence class itself
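
A sketch of the asynchronous pattern these two calls are designed for, reusing mgr and algo from the earlier sketches:

    seq = mgr.sequence()
    seq.record(kp.OpAlgoDispatch(algo))

    seq.eval_async()      # submit without a barrier; returns immediately
    # ... overlapping CPU work can run here ...
    seq.eval_await(1000)  # wait up to 1000 ms for the fence, then run postEval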

get_timestamps(self: kp.Sequence) → List[int]

Return the timestamps that were latched at the beginning of the sequence and after each operation during the last eval() call.

is_init(self: kp.Sequence) → bool

Returns true if the sequence has been initialised, based on whether its GPU resources are referenced.

@return Boolean stating if is initialized

is_recording(self: kp.Sequence) → bool

Returns true if the sequence is currently recording.

@return Boolean stating if recording ongoing.

is_running(self: kp.Sequence) → bool

Returns true if the sequence is currently running - mostly used for async workloads.

@return Boolean stating if currently running.

record(self: kp.Sequence, arg0: kp.OpBase) → kp.Sequence

Record function for operations to be added to the GPU queue in batch. The operation must derive from the OpBase class, and the Sequence must be recording; otherwise the operation cannot be added.

@param op Object derived from kp::OpBase that will be recorded by the sequence, to be used when the operation is evaluated
@return shared_ptr<Sequence> of the Sequence class itself

rerecord(self: kp.Sequence) → None

Clears the command buffer and triggers a re-record of all the currently saved operations, which is useful if the underlying kp::Tensors or kp::Algorithms are modified and need to be re-recorded.
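
A sketch of the rerecord() flow, reusing names from the earlier sketches:

    seq = mgr.sequence()
    seq.record(kp.OpTensorSyncDevice([tensor_in]))
    seq.record(kp.OpAlgoDispatch(algo))
    seq.eval()

    # After the underlying tensors or algorithm change, refresh the
    # command buffer instead of clearing and re-recording by hand
    seq.rerecord()
    seq.eval()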

Tensor

The Kompute Tensor is the atomic unit in Kompute, and it is used primarily for handling Host and GPU Device data.

class kp.Tensor

Structured data used in GPU operations.

Tensors are the base building block in Kompute for performing operations across GPUs. Each tensor has a respective Vulkan memory and buffer, which is used to store its data. Tensors can be used for GPU data storage or transfer.

data(self: kp.Tensor) → numpy.ndarray

Retrieve the tensor's data as a numpy array.

data_type(self: kp.Tensor) → kp::Tensor::TensorDataTypes

Retrieve the underlying data type of the Tensor

@return Data type of tensor of type kp::Tensor::TensorDataTypes

destroy(self: kp.Tensor) → None

Destroys and frees the GPU resources which include the buffer and memory.

is_init(self: kp.Tensor) → bool

Check whether the tensor is initialised, based on the created GPU resources.

@returns Boolean stating whether tensor is initialized

size(self: kp.Tensor) → int

Returns the size/magnitude of the Tensor, which will be the total number of elements across all dimensions

@return Unsigned integer representing the total number of elements

tensor_type(self: kp.Tensor) → kp.TensorTypes

Retrieve the tensor type of the Tensor

@return Tensor type of tensor
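
A short sketch exercising the accessors above, assuming mgr and numpy as np from the earlier sketches:

    t = mgr.tensor(np.array([1.0, 2.0], dtype=np.float32))
    print(t.data())         # numpy array with the tensor's host-visible data
    print(t.size())         # 2
    print(t.data_type())    # data type enum for float32 data
    print(t.tensor_type())  # TensorTypes.device (the default)
    print(t.is_init())      # True while the GPU resources exist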

Algorithm

class kp.Algorithm

Main constructor for algorithm with configuration parameters to create the underlying resources.

@param device The Vulkan device to use for creating resources
@param tensors (optional) The tensors to use to create the descriptor resources
@param spirv (optional) The SPIR-V code to use to create the algorithm
@param workgroup (optional) The kp::Workgroup to use for the dispatch; defaults to kp::Workgroup(tensor[0].size(), 1, 1) if not set
@param specializationConstants (optional) The std::vector<float> used to initialise the specialization constants, which cannot be changed once set
@param pushConstants (optional) The std::vector<float> used when initialising the pipeline, which sets the size of the push constants; these can be modified, but all new values must have the same vector size as this initial value

destroy(self: kp.Algorithm) → None

Destroys and frees the GPU resources owned by the algorithm.

get_tensors(self: kp.Algorithm) → List[kp::Tensor]

Gets the current tensors that are used in the algorithm.

@returns The list of tensors used in the algorithm.

is_init(self: kp.Algorithm) → bool

Checks all the GPU resource components to verify that they have been created, and returns true if all are valid.

@returns True if the algorithm is currently initialised
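
A short sketch of inspecting an algorithm, reusing algo from the Manager sketch:

    if algo.is_init():
        for t in algo.get_tensors():
            print(t.size())  # each tensor bound to the algorithm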

Operations

class kp.OpAlgoDispatch

Operation that provides a general abstraction that simplifies the use of algorithm and parameter components which can be used with shaders. By default it enables the user to provide a dynamic number of tensors which are then passed as inputs.
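
A minimal sketch of dispatching an algorithm through a sequence, reusing mgr and algo from the earlier sketches:

    seq = mgr.sequence()
    seq.eval(kp.OpAlgoDispatch(algo))  # run the algorithm's shader once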

class kp.OpBase

Base Operation which provides the high level interface that Kompute operations implement in order to perform a set of actions in the GPU.

Operations can perform actions on tensors, and optionally can also own an Algorithm with respective parameters. kp::Operations with kp::Algorithms would inherit from kp::OpBaseAlgo.

class kp.OpMult

Operation that performs multiplication on two tensors and outputs the result to a third tensor.

class kp.OpTensorCopy

Operation that copies the data from the first tensor to the rest of the tensors provided, using a record command for all the tensors. This operation does not own/manage the memory of the tensors passed to it. The operation must only receive tensors of type

class kp.OpTensorSyncDevice

Operation that syncs a tensor's device memory by mapping local data into the device memory. For TensorTypes::eDevice it will use a record operation for the memory to be synced into GPU memory, which means the operation will be done in sync with GPU commands. For TensorTypes::eHost it will only map the data into host memory, which happens during preEval, before the recorded commands are dispatched.

class kp.OpTensorSyncLocal

Operation that syncs a tensor's local memory by mapping device data into the local CPU memory. For TensorTypes::eDevice it will use a record operation for the memory to be synced into local memory, which means the operation will be done in sync with GPU commands. For TensorTypes::eHost it will only map the data into host memory, which happens during preEval, before the recorded commands are dispatched.
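
A sketch of the usual host-to-device-to-host roundtrip built from the two sync operations, reusing names from the earlier sketches:

    mgr.sequence() \
       .record(kp.OpTensorSyncDevice([tensor_in])) \
       .record(kp.OpAlgoDispatch(algo)) \
       .record(kp.OpTensorSyncLocal([tensor_out])) \
       .eval()
    print(tensor_out.data())  # shader results are now visible on the host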

TensorType

class kp.TensorTypes

Members:

device : Type is device memory, source and destination

host : Type is host memory, source and destination

storage : Type is device memory (only)

property name

The string name of the enum member.
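
A sketch of selecting a tensor type at creation time, assuming mgr from the earlier sketches and that the tensor factories accept any kp.TensorTypes value, as their signatures indicate:

    import numpy as np

    # host: host-visible memory, usable as transfer source and destination
    t_host = mgr.tensor(np.zeros(4, dtype=np.float32), kp.TensorTypes.host)

    # storage: device-only memory, not used for host transfers
    t_storage = mgr.tensor(np.zeros(4, dtype=np.float32), kp.TensorTypes.storage)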