SBgl 0.1.0
A graphics framework in C99
Performance Telemetry System

SBgl provides a high-precision telemetry system designed to identify CPU and GPU bottlenecks in real time. By using hardware-native timing mechanisms (Vulkan timestamp queries on the GPU, OS performance counters on the CPU), the system delivers ground-truth performance data with minimal execution overhead.


Architecture

The telemetry system independently measures three primary components of the frame lifecycle:

  1. GPU Execution Time: Measures the absolute duration the hardware was processing commands. This is implemented using Vulkan Timestamp Queries (vkCmdWriteTimestamp). To prevent CPU stalls, results are read back asynchronously from the previous frame.
  2. CPU Frame Duration: Measures the total time between sbgl_BeginDrawing and sbgl_EndDrawing.
  3. CPU Sorting/Baking Overhead: Measures the specific duration spent in the Data-Oriented sorting and indirect command generation phase.
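
The asynchronous readback described in item 1 can be illustrated with a small ring of timestamp slots. This is a minimal CPU-only sketch, not SBgl's implementation: the names TimestampSlot, TimestampRing, FRAMES_IN_FLIGHT and the ring_* helpers are hypothetical, and ring_record stands in for the GPU writing its begin/end timestamps. In a Vulkan backend the non-blocking read would presumably correspond to calling vkGetQueryPoolResults without VK_QUERY_RESULT_WAIT_BIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAMES_IN_FLIGHT 2 /* assumption: one slot per in-flight frame */

/* One begin/end timestamp pair (hypothetical layout). */
typedef struct {
    uint64_t begin_ticks;
    uint64_t end_ticks;
    bool     ready; /* true once both timestamps have been written */
} TimestampSlot;

typedef struct {
    TimestampSlot slots[FRAMES_IN_FLIGHT];
    uint64_t      frame_index;
} TimestampRing;

/* Slot that the current frame records into. */
static TimestampSlot *ring_current(TimestampRing *r) {
    return &r->slots[r->frame_index % FRAMES_IN_FLIGHT];
}

/* Stand-in for the GPU writing its two timestamps for the current frame. */
static void ring_record(TimestampRing *r, uint64_t begin, uint64_t end) {
    assert(end >= begin);
    TimestampSlot *s = ring_current(r);
    s->begin_ticks = begin;
    s->end_ticks   = end;
    s->ready       = true;
}

/* Non-blocking read of the PREVIOUS frame's duration in ticks.
 * Returns false instead of waiting when no result is available. */
static bool ring_read_previous(const TimestampRing *r, uint64_t *out_ticks) {
    if (r->frame_index == 0) {
        return false; /* no previous frame yet */
    }
    const TimestampSlot *s =
        &r->slots[(r->frame_index - 1) % FRAMES_IN_FLIGHT];
    if (!s->ready) {
        return false;
    }
    *out_ticks = s->end_ticks - s->begin_ticks;
    return true;
}

/* Move to the next frame and invalidate the slot that is about to be reused. */
static void ring_advance(TimestampRing *r) {
    r->frame_index++;
    ring_current(r)->ready = false;
}
```

Each frame calls ring_read_previous before recording its own timestamps, so the CPU only ever consumes results that are already complete and never blocks on the frame currently in flight.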

Timing Precision

  • Internal: Utilizes high-resolution performance counters via the Platform HAL (sbgl_os_GetPerfCount).
  • Linux: Utilizes clock_gettime with CLOCK_MONOTONIC via the HAL.
  • Windows: Utilizes QueryPerformanceCounter via the HAL.
  • GPU: Precision is determined by the hardware's timestampPeriod (nanoseconds per tick).
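
Because GPU timestamps are raw tick counts, they have to be scaled by timestampPeriod before they can be compared with CPU timings. The helper below is a minimal sketch of that conversion; the function name gpu_ticks_to_ms is hypothetical and is not part of the SBgl API. It assumes the period is the nanoseconds-per-tick value reported in VkPhysicalDeviceLimits and that the counter did not wrap between the two samples.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pair of GPU timestamps (in ticks) to milliseconds.
 * timestamp_period_ns: nanoseconds per tick, as reported by the device.
 * Assumes end_ticks >= begin_ticks (no counter wraparound). */
static double gpu_ticks_to_ms(uint64_t begin_ticks,
                              uint64_t end_ticks,
                              float timestamp_period_ns) {
    assert(timestamp_period_ns > 0.0f);
    assert(end_ticks >= begin_ticks);
    double elapsed_ns =
        (double)(end_ticks - begin_ticks) * (double)timestamp_period_ns;
    return elapsed_ns / 1.0e6; /* ns -> ms */
}
```

For example, a span of 1,000,000 ticks on a device with a period of 1.0 ns per tick corresponds to 1.0 ms of GPU execution time.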

The Telemetry Structure

Performance metrics are encapsulated in the sbgl_Telemetry structure:

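/* Performance telemetry data for a single frame (sbgl_types.h). */
typedef struct {
    float    cpu_frame_time;   /* time between sbgl_BeginDrawing and sbgl_EndDrawing */
    float    cpu_sort_time;    /* time spent in sorting and indirect command generation */
    float    gpu_render_time;  /* GPU execution time from timestamp queries */
    uint32_t draw_calls;
    uint32_t instance_count;
} sbgl_Telemetry;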

Usage Example

Telemetry data for the previous frame is retrieved via the engine context. This asynchronous pattern ensures that the CPU never waits for the GPU to finish its work before starting the next frame's logic.

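#include "sbgl.h"
#include <stdio.h>

int main(void) {
    sbgl_Context* ctx = sbgl_Init(...).ctx;
    while (!sbgl_WindowShouldClose(ctx)) {
        double frame_start = sbgl_GetTime(ctx);
        sbgl_BeginDrawing(ctx);
        // ... rendering logic ...
        sbgl_EndDrawing(ctx);
        // Retrieve and display metrics for the previous frame
        sbgl_Telemetry stats = sbgl_GetTelemetry(ctx);
        printf("CPU: %.2fms | GPU: %.2fms | Batches: %u\n",
               stats.cpu_frame_time, stats.gpu_render_time,
               stats.draw_calls);
    }
    sbgl_Shutdown(ctx);
    return 0;
}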

Bottleneck Identification

The telemetry data allows for clear identification of the limiting factor:

Observation                       | Probable Bottleneck           | Recommended Action
----------------------------------+-------------------------------+-------------------------------------------------------------
High gpu_render_time              | Fragment Shading / Fill-rate  | Reduce render resolution or simplify shaders.
High cpu_sort_time                | Batcher Overhead              | Optimize sorting keys or reduce total draw packets.
Large cpu_frame_time difference   | Driver / CPU Logic            | Profile application-level update logic or driver submission.
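
The table can be turned into a simple automated check. The sketch below is illustrative only: demo_Telemetry mirrors the documented sbgl_Telemetry fields so that the block compiles on its own, and demo_Bottleneck, demo_classify, the frame budget parameter and the 50% sorting threshold are all hypothetical choices, not part of SBgl. It assumes all timings are in milliseconds, and it reads "large cpu_frame_time difference" as CPU frame time that is not explained by sorting.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-alone mirror of the documented telemetry fields (hypothetical name). */
typedef struct {
    float    cpu_frame_time;
    float    cpu_sort_time;
    float    gpu_render_time;
    uint32_t draw_calls;
    uint32_t instance_count;
} demo_Telemetry;

typedef enum {
    DEMO_BOUND_NONE,      /* frame fits within the budget */
    DEMO_BOUND_GPU,       /* fragment shading / fill-rate */
    DEMO_BOUND_SORT,      /* batcher overhead */
    DEMO_BOUND_CPU_OTHER  /* driver submission or application logic */
} demo_Bottleneck;

/* Classify one frame against a frame budget (e.g. 16.6 ms for 60 Hz).
 * Thresholds are illustrative assumptions, not tuned values. */
static demo_Bottleneck demo_classify(const demo_Telemetry *t, float budget_ms) {
    assert(budget_ms > 0.0f);
    if (t->gpu_render_time > budget_ms) {
        return DEMO_BOUND_GPU;
    }
    if (t->cpu_sort_time > 0.5f * budget_ms) {
        return DEMO_BOUND_SORT;
    }
    /* CPU time not accounted for by the sorting/baking phase. */
    float cpu_other = t->cpu_frame_time - t->cpu_sort_time;
    if (cpu_other > budget_ms) {
        return DEMO_BOUND_CPU_OTHER;
    }
    return DEMO_BOUND_NONE;
}
```

In practice the inputs would likely be averaged over several frames before classification, since a single frame's timings can be noisy.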

Future Considerations

Planned extensions include Granular Profiling, which will enable timestamp markers for individual render queues and compute passes, allowing for deep-dive analysis of complex frame structures.