Kompute
#include <Sequence.hpp>
Public Member Functions | |
Sequence (std::shared_ptr< vk::PhysicalDevice > physicalDevice, std::shared_ptr< vk::Device > device, std::shared_ptr< vk::Queue > computeQueue, uint32_t queueIndex, uint32_t totalTimestamps=0) | |
~Sequence () | |
std::shared_ptr< Sequence > | record (std::shared_ptr< OpBase > op) |
template<typename T , typename... TArgs> | |
std::shared_ptr< Sequence > | record (std::vector< std::shared_ptr< Tensor >> tensors, TArgs &&... params) |
template<typename T , typename... TArgs> | |
std::shared_ptr< Sequence > | record (std::shared_ptr< Algorithm > algorithm, TArgs &&... params) |
std::shared_ptr< Sequence > | eval () |
std::shared_ptr< Sequence > | eval (std::shared_ptr< OpBase > op) |
template<typename T , typename... TArgs> | |
std::shared_ptr< Sequence > | eval (std::vector< std::shared_ptr< Tensor >> tensors, TArgs &&... params) |
template<typename T , typename... TArgs> | |
std::shared_ptr< Sequence > | eval (std::shared_ptr< Algorithm > algorithm, TArgs &&... params) |
std::shared_ptr< Sequence > | evalAsync () |
std::shared_ptr< Sequence > | evalAsync (std::shared_ptr< OpBase > op) |
template<typename T , typename... TArgs> | |
std::shared_ptr< Sequence > | evalAsync (std::vector< std::shared_ptr< Tensor >> tensors, TArgs &&... params) |
template<typename T , typename... TArgs> | |
std::shared_ptr< Sequence > | evalAsync (std::shared_ptr< Algorithm > algorithm, TArgs &&... params) |
std::shared_ptr< Sequence > | evalAwait (uint64_t waitFor=UINT64_MAX) |
void | clear () |
std::vector< std::uint64_t > | getTimestamps () |
void | begin () |
void | end () |
bool | isRecording () const |
bool | isInit () const |
void | rerecord () |
bool | isRunning () const |
void | destroy () |
Container of operations that can be sent to the GPU as a batch.
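The batching idea behind the member functions listed above can be sketched without Vulkan. The block below is a simplified stand-in, not Kompute's implementation: `ToySequence`, its `Op` alias and `runToyBatch` are hypothetical names, and operations are plain callables rather than `OpBase` objects. It only illustrates that `record` queues work, `eval` runs the queue in order, and returning a `shared_ptr` to the sequence itself is what makes chained calls possible.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for kp::Sequence: no Vulkan, only the batching idea.
class ToySequence : public std::enable_shared_from_this<ToySequence> {
public:
    using Op = std::function<void(std::vector<std::string>&)>;

    // Queue an operation; nothing runs until eval() is called.
    std::shared_ptr<ToySequence> record(Op op) {
        ops_.push_back(std::move(op));
        return shared_from_this();
    }

    // Run every queued operation in recording order, as one batch.
    std::shared_ptr<ToySequence> eval() {
        for (auto& op : ops_) {
            op(log_);
        }
        return shared_from_this();
    }

    // Drop all queued operations.
    void clear() { ops_.clear(); }

    std::size_t size() const { return ops_.size(); }
    const std::vector<std::string>& log() const { return log_; }

private:
    std::vector<Op> ops_;
    std::vector<std::string> log_;
};

// Records two operations, evaluates them as a batch and returns the log.
inline std::vector<std::string> runToyBatch() {
    auto seq = std::make_shared<ToySequence>();
    seq->record([](std::vector<std::string>& log) { log.push_back("sync to device"); })
       ->record([](std::vector<std::string>& log) { log.push_back("dispatch"); })
       ->eval();
    return seq->log();
}
```

Because `shared_from_this()` is used, the object has to be owned by a `std::shared_ptr` (as in `runToyBatch`) before `record` or `eval` is called.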
kp::Sequence::Sequence | ( | std::shared_ptr< vk::PhysicalDevice > | physicalDevice, std::shared_ptr< vk::Device > | device, std::shared_ptr< vk::Queue > | computeQueue, uint32_t | queueIndex, uint32_t | totalTimestamps = 0 | ) |
Main constructor for the sequence, which requires the core Vulkan components to generate all dependent resources.
physicalDevice | Vulkan physical device |
device | Vulkan logical device |
computeQueue | Vulkan compute queue |
queueIndex | Vulkan compute queue index in device |
totalTimestamps | Maximum number of timestamps to allocate |
kp::Sequence::~Sequence | ( | ) |
Destructor for the sequence, which is responsible for cleaning up all operations it owns.
void kp::Sequence::begin | ( | ) |
Begins recording commands to be submitted into the command buffer.
void kp::Sequence::clear | ( | ) |
Clears all operations currently recorded and starts recording again.
void kp::Sequence::destroy | ( | ) |
Destroys and frees the GPU resources which include the buffer and memory and sets the sequence as init=False.
void kp::Sequence::end | ( | ) |
Ends the recording, so that commands sent through record are no longer recorded after this call.
std::shared_ptr<Sequence> kp::Sequence::eval | ( | ) |
Eval sends all the recorded and stored operations in the vector of operations to the GPU as a submit job synchronously (with a barrier).
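template<typename T , typename... TArgs>
std::shared_ptr<Sequence> kp::Sequence::eval | ( | std::shared_ptr< Algorithm > | algorithm, TArgs &&... | params | ) |
inline
Resets all the recorded and stored operations, records an operation of type T created from the algorithm and parameters provided, and submits it to the GPU as a submit job synchronously (with a barrier).
algorithm | Algorithm to use for the record; often used for OpAlgo operations |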
TArgs | Template parameters that are used to initialise operation which allows for extensible configurations on initialisation. |
std::shared_ptr<Sequence> kp::Sequence::eval | ( | std::shared_ptr< OpBase > | op | ) |
Resets all the recorded and stored operations, records the operation provided and submits it to the GPU as a submit job synchronously (with a barrier).
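template<typename T , typename... TArgs>
std::shared_ptr<Sequence> kp::Sequence::eval | ( | std::vector< std::shared_ptr< Tensor >> | tensors, TArgs &&... | params | ) |
inline
Resets all the recorded and stored operations, records an operation of type T created from the tensors and parameters provided, and submits it to the GPU as a submit job synchronously (with a barrier).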
tensors | Vector of tensors to use for the operation |
TArgs | Template parameters that are used to initialise operation which allows for extensible configurations on initialisation. |
std::shared_ptr<Sequence> kp::Sequence::evalAsync | ( | ) |
Eval Async sends all the recorded and stored operations in the vector of operations to the GPU as a submit job without a barrier. evalAwait() must ALWAYS be called afterwards to ensure the sequence is terminated correctly.
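template<typename T , typename... TArgs>
std::shared_ptr<Sequence> kp::Sequence::evalAsync | ( | std::shared_ptr< Algorithm > | algorithm, TArgs &&... | params | ) |
inline
Clears the current operations, records an operation of type T created from the algorithm and parameters provided, and submits it to the GPU as a submit job without a barrier. evalAwait() must ALWAYS be called afterwards to ensure the sequence is terminated correctly.
algorithm | Algorithm to use for the record; often used for OpAlgo operations |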
TArgs | Template parameters that are used to initialise operation which allows for extensible configurations on initialisation. |
std::shared_ptr<Sequence> kp::Sequence::evalAsync | ( | std::shared_ptr< OpBase > | op | ) |
Clears the current operations, records the operation provided and submits it to the GPU as a submit job without a barrier. evalAwait() must ALWAYS be called afterwards to ensure the sequence is terminated correctly.
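template<typename T , typename... TArgs>
std::shared_ptr<Sequence> kp::Sequence::evalAsync | ( | std::vector< std::shared_ptr< Tensor >> | tensors, TArgs &&... | params | ) |
inline
Clears the current operations, records an operation of type T created from the tensors and parameters provided, and submits it to the GPU as a submit job without a barrier. evalAwait() must ALWAYS be called afterwards to ensure the sequence is terminated correctly.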
tensors | Vector of tensors to use for the operation |
TArgs | Template parameters that are used to initialise operation which allows for extensible configurations on initialisation. |
std::shared_ptr<Sequence> kp::Sequence::evalAwait | ( | uint64_t | waitFor = UINT64_MAX | ) |
Eval Await waits for the fence to finish processing and, once it finishes, runs the postEval of all operations.
waitFor | Number of milliseconds to wait before timing out. |
std::vector<std::uint64_t> kp::Sequence::getTimestamps | ( | ) |
Returns the timestamps that were latched at the beginning and after each operation during the last eval() call.
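Since one timestamp is latched at the beginning and one after each operation, N operations give N+1 values, and the cost of each operation is the difference between neighbouring entries. The helper below is a minimal sketch of that arithmetic. `opDurationsNs` and `nsPerTick` are hypothetical names, and the sketch assumes the values are raw device ticks that are converted to nanoseconds with a per-device factor such as Vulkan's `VkPhysicalDeviceLimits::timestampPeriod`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts N+1 timestamps into N per-operation durations in nanoseconds.
// nsPerTick is an assumed input (nanoseconds per timestamp tick).
inline std::vector<double> opDurationsNs(const std::vector<std::uint64_t>& timestamps,
                                         double nsPerTick) {
    std::vector<double> durations;
    for (std::size_t i = 1; i < timestamps.size(); ++i) {
        // Difference between the tick latched after operation i-1 and the one before it.
        const std::uint64_t ticks = timestamps[i] - timestamps[i - 1];
        durations.push_back(static_cast<double>(ticks) * nsPerTick);
    }
    return durations;
}
```

With an empty or single-element input the result is empty, which matches a sequence that recorded no operations.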
bool kp::Sequence::isInit | ( | ) | const |
Returns true if the sequence has been initialised, based on whether the GPU resources are referenced.
bool kp::Sequence::isRecording | ( | ) | const |
Returns true if the sequence is currently recording.
bool kp::Sequence::isRunning | ( | ) | const |
Returns true if the sequence is currently running - mostly used for async workloads.
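template<typename T , typename... TArgs>
std::shared_ptr<Sequence> kp::Sequence::record | ( | std::shared_ptr< Algorithm > | algorithm, TArgs &&... | params | ) |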
inline |
Record function for operation to be added to the GPU queue in batch. This template requires classes to be derived from the OpBase class. This function also requires the Sequence to be recording, otherwise it will not be able to add the operation.
algorithm | Algorithm to use for the record; often used for OpAlgo operations |
TArgs | Template parameters that are used to initialise operation which allows for extensible configurations on initialisation. |
std::shared_ptr<Sequence> kp::Sequence::record | ( | std::shared_ptr< OpBase > | op | ) |
Record function for operation to be added to the GPU queue in batch. The operation must be derived from the OpBase class. This function also requires the Sequence to be recording, otherwise it will not be able to add the operation.
op | Object derived from kp::OpBase that will be recorded by the sequence and used when the operation is evaluated. |
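template<typename T , typename... TArgs>
std::shared_ptr<Sequence> kp::Sequence::record | ( | std::vector< std::shared_ptr< Tensor >> | tensors, TArgs &&... | params | ) |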
inline |
Record function for operation to be added to the GPU queue in batch. This template requires classes to be derived from the OpBase class. This function also requires the Sequence to be recording, otherwise it will not be able to add the operation.
tensors | Vector of tensors to use for the operation |
TArgs | Template parameters that are used to initialise operation which allows for extensible configurations on initialisation. |
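The templated overloads take an operation type T and forward the remaining arguments to its constructor. A minimal sketch of that pattern follows. It is an illustration under assumptions, not Kompute's code: `TensorSketch`, `OpBaseSketch`, `OpScale` and `RecorderSketch` are hypothetical stand-ins, and the `static_assert` shows one way the "must derive from the base operation class" requirement could be enforced at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these types are part of Kompute.
struct TensorSketch { float value; };

struct OpBaseSketch {
    virtual ~OpBaseSketch() = default;
    virtual void apply() = 0;
};

// Example operation: multiplies every tensor value by a factor.
struct OpScale : OpBaseSketch {
    OpScale(std::vector<std::shared_ptr<TensorSketch>> tensors, float factor)
      : tensors_(std::move(tensors)), factor_(factor) {}

    void apply() override {
        for (auto& t : tensors_) {
            t->value *= factor_;
        }
    }

    std::vector<std::shared_ptr<TensorSketch>> tensors_;
    float factor_;
};

class RecorderSketch {
public:
    // Builds a T from the tensors plus the forwarded arguments and stores it.
    template<typename T, typename... TArgs>
    void record(std::vector<std::shared_ptr<TensorSketch>> tensors, TArgs&&... params) {
        static_assert(std::is_base_of<OpBaseSketch, T>::value,
                      "T must derive from OpBaseSketch");
        ops_.push_back(std::make_shared<T>(std::move(tensors),
                                           std::forward<TArgs>(params)...));
    }

    // Applies every stored operation in recording order.
    void eval() {
        for (auto& op : ops_) {
            op->apply();
        }
    }

    std::size_t size() const { return ops_.size(); }

private:
    std::vector<std::shared_ptr<OpBaseSketch>> ops_;
};
```

A call such as `rec.record<OpScale>({tensor}, 3.0f)` names the operation type explicitly, while the types of the extra constructor arguments are deduced from the call.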
void kp::Sequence::rerecord | ( | ) |
Clears command buffer and triggers re-record of all the current operations saved, which is useful if the underlying kp::Tensors or kp::Algorithms are modified and need to be re-recorded.