The SBgl Rendering Pipeline is built on Vulkan 1.3's Dynamic Rendering, eliminating the need for complex RenderPass and Framebuffer boilerplate. It follows Data-Oriented Design (DOD) principles by using lightweight integer handles for resource management.
Key Concepts
Handle-Based Resource Management
Resources like Buffers, Shaders, and Pipelines are referenced via sbgl_Buffer, sbgl_Shader, and sbgl_Pipeline handles (32-bit unsigned integers). Internally, these handles map to contiguous arrays in the Vulkan backend, ensuring cache-efficient access.
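As a rough illustration of the handle idea (not SBgl's actual backend, which this page does not show), the sketch below keeps resources in one contiguous array and hands out 32-bit indices. The names demo_Pool, demo_Handle, and DEMO_MAX_BUFFERS are hypothetical, and reserving 0 as an invalid handle is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not SBgl's real backend layout. */
#define DEMO_MAX_BUFFERS 8u
#define DEMO_INVALID_HANDLE 0u      /* assumption: 0 is reserved as "no resource" */

typedef uint32_t demo_Handle;

typedef struct {
    size_t size_bytes;              /* stand-in for backend state (VkBuffer, memory, ...) */
    int    in_use;
} demo_BufferSlot;

typedef struct {
    demo_BufferSlot slots[DEMO_MAX_BUFFERS];  /* contiguous storage, indexed by handle - 1 */
} demo_Pool;

/* Returns a handle to the first free slot, or DEMO_INVALID_HANDLE when the pool is full. */
static demo_Handle demo_pool_create(demo_Pool *pool, size_t size_bytes) {
    for (uint32_t i = 0; i < DEMO_MAX_BUFFERS; i++) {
        if (!pool->slots[i].in_use) {
            pool->slots[i].in_use = 1;
            pool->slots[i].size_bytes = size_bytes;
            return i + 1u;
        }
    }
    return DEMO_INVALID_HANDLE;
}

/* Resolves a handle to its slot, or NULL for out-of-range or freed handles. */
static demo_BufferSlot *demo_pool_get(demo_Pool *pool, demo_Handle h) {
    if (h == DEMO_INVALID_HANDLE || h > DEMO_MAX_BUFFERS) return NULL;
    return pool->slots[h - 1u].in_use ? &pool->slots[h - 1u] : NULL;
}

/* Marks the slot free so a later create call can reuse it. */
static void demo_pool_destroy(demo_Pool *pool, demo_Handle h) {
    demo_BufferSlot *slot = demo_pool_get(pool, h);
    if (slot) slot->in_use = 0;
}
```

Because a freed slot is reused, an old handle becomes valid again after the next create call. Pools that need to detect this usually pack a generation counter into the handle; whether SBgl does so is not stated here.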
Configurable Resource Limits
The maximum number of resources (buffers, shaders, pipelines) is configurable at context initialization via sbgl_InitWithConfig(). See Window Setup for details on initialization options.
Explicit Pipeline State Objects (PSO)
SBgl requires explicit creation of graphics pipelines. A pipeline encapsulates the vertex/fragment shaders and the vertex input layout. This design ensures predictable performance by moving pipeline compilation to initialization time.
Vertex Input Layout
The sbgl_VertexLayout structure defines how vertex data in a buffer is mapped to shader input locations.
- Stride: Total size in bytes of a single vertex.
- Attributes: Array of sbgl_VertexAttribute defining the location and offset of each component.
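To make stride and offset concrete, here is a minimal sketch built around a hypothetical 16-byte vertex (three floats plus a packed 32-bit color). demo_Vertex, demo_Attribute, and demo_Layout are stand-ins; the field names of the real sbgl_VertexLayout and sbgl_VertexAttribute may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vertex: 12 bytes of position + 4 bytes of packed RGBA = 16 bytes. */
typedef struct {
    float    position[3];
    uint32_t color;
} demo_Vertex;

/* Stand-in for one attribute: shader input location and byte offset within a vertex. */
typedef struct {
    uint32_t location;
    uint32_t offset;
} demo_Attribute;

/* Stand-in for a layout: stride plus a pointer to the attribute array. */
typedef struct {
    uint32_t              stride;
    uint32_t              attributeCount;
    const demo_Attribute *attributes;
} demo_Layout;

static const demo_Attribute demo_attributes[] = {
    { .location = 0u, .offset = (uint32_t)offsetof(demo_Vertex, position) },
    { .location = 1u, .offset = (uint32_t)offsetof(demo_Vertex, color) },
};

static const demo_Layout demo_layout = {
    .stride = (uint32_t)sizeof(demo_Vertex),  /* bytes from one vertex to the next */
    .attributeCount = 2u,
    .attributes = demo_attributes
};

/* Byte address of attribute `attr` of vertex `index` inside a tightly packed buffer. */
static const void *demo_attribute_ptr(const void *buffer, const demo_Layout *layout,
                                      uint32_t index, uint32_t attr) {
    assert(attr < layout->attributeCount);
    return (const unsigned char *)buffer
         + (size_t)index * layout->stride
         + layout->attributes[attr].offset;
}
```

Deriving offsets with offsetof and the stride with sizeof keeps the layout in sync with the struct if fields are later added or reordered.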
Shader Loading Strategies
SBgl supports two primary ways to load SPIR-V shaders: dynamic file loading and hardcoded byte arrays.
Dynamic File Loading
This approach is ideal for development and modding: shaders can be recompiled without rebuilding the application.
size_t size;
uint32_t *bytecode = read_spirv_file("examples/shaders/example_shader.vert.spv", &size);
sbgl_Shader vert = sbgl_LoadShader(ctx, SBGL_SHADER_STAGE_VERTEX, bytecode, size);
API referenced in this example:
- sbgl_Shader sbgl_LoadShader(sbgl_Context *ctx, sbgl_ShaderStage stage, const uint32_t *bytecode, size_t size): Loads a shader from SPIR-V bytecode.
- sbgl_Shader (uint32_t): Handle for a shader module.
- SBGL_SHADER_STAGE_VERTEX: Stage enumerator for vertex shaders.
Hardcoded (Static) Loading
Recommended for production releases: embedding the bytecode keeps the application self-contained and makes casual tampering with core assets harder.
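The xxd tool converts a compiled .spv file into a C header. It derives the array name from the path given on the command line, so invoke it from the shader directory to obtain the example_shader_vert_spv and example_shader_vert_spv_len identifiers: (cd examples/shaders && xxd -i example_shader.vert.spv) > example_shader_vert.h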
#include "example_shader_vert.h"
sbgl_Shader vert = sbgl_LoadShader(ctx, SBGL_SHADER_STAGE_VERTEX,
    (const uint32_t *)example_shader_vert_spv,
    example_shader_vert_spv_len);
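Because xxd -i emits an unsigned char array, casting it to uint32_t * relies on the array happening to be 4-byte aligned. One cautious alternative, sketched below, assembles the bytes into a word array first. demo_bytes_to_words and the sample array are hypothetical; the SPIR-V magic number 0x07230203 is the only library-independent fact used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_SPIRV_MAGIC 0x07230203u

/* Stand-in for an xxd-generated array: the magic word plus one more word, little-endian. */
static const unsigned char demo_shader_spv[] = {
    0x03, 0x02, 0x23, 0x07,
    0x00, 0x06, 0x01, 0x00
};
static const unsigned int demo_shader_spv_len = 8;

/*
 * Assembles little-endian bytes into a freshly allocated, properly aligned word array.
 * Returns NULL if the length is not a multiple of 4, the magic number is missing,
 * or allocation fails. The caller frees the result.
 */
static uint32_t *demo_bytes_to_words(const unsigned char *bytes, size_t len, size_t *word_count) {
    if (len < 4 || len % 4 != 0) return NULL;
    size_t n = len / 4;
    uint32_t *words = malloc(n * sizeof *words);
    if (!words) return NULL;
    for (size_t i = 0; i < n; i++) {
        words[i] = (uint32_t)bytes[4 * i]
                 | (uint32_t)bytes[4 * i + 1] << 8
                 | (uint32_t)bytes[4 * i + 2] << 16
                 | (uint32_t)bytes[4 * i + 3] << 24;
    }
    if (words[0] != DEMO_SPIRV_MAGIC) { free(words); return NULL; }
    *word_count = n;
    return words;
}
```

The resulting pointer could then be handed to sbgl_LoadShader. Whether its size parameter counts bytes or 32-bit words is not stated on this page, so check the API reference before passing either value.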
Vertex Layout Use Case
Defining a standard vertex with Position and Color attributes using the library-provided sbgl_Vertex structure:
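sbgl_VertexAttribute attributes[] = {
    ...
};
sbgl_VertexLayout layout = {
    ...
    .attributeCount = 2,
    .attributes = attributes
};
API referenced in this example:
- SBGL_FORMAT_R32G32B32_SFLOAT: Attribute format enumerator (three 32-bit floats).
- sbgl_VertexAttribute: Vertex attribute definition.
- sbgl_VertexLayout: Vertex input layout definition.
- sbgl_Vertex: Standard vertex structure for basic geometry rendering. Optimized for cache density (16 bytes).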
Complete Drawing Example
float time = get_current_time();
API referenced in this example:
- void sbgl_SetClearColor(sbgl_Context *ctx, float r, float g, float b, float a): Sets the clear color for the next frame.
- void sbgl_BeginDrawing(sbgl_Context *ctx): Prepares the engine for a new frame of drawing.
- void sbgl_BindPipeline(sbgl_Context *ctx, sbgl_Pipeline pipeline): Binds a graphics pipeline for subsequent draw calls.
- void sbgl_BindBuffer(sbgl_Context *ctx, sbgl_Buffer buffer, sbgl_BufferUsage usage): Binds a buffer to the pipeline.
- void sbgl_PushConstants(sbgl_Context *ctx, size_t size, const void *data): Updates push constants for the currently bound pipeline.
- void sbgl_Draw(sbgl_Context *ctx, uint32_t vertexCount, uint32_t firstVertex, uint32_t instanceCount): Submits a non-indexed draw command.
- void sbgl_EndDrawing(sbgl_Context *ctx): Finalizes the current frame and presents it to the screen.
- void sbgl_DeviceWaitIdle(sbgl_Context *ctx): Synchronizes the CPU with the GPU, waiting for all commands to complete.
- void sbgl_DestroyBuffer(sbgl_Context *ctx, sbgl_Buffer buffer): Destroys a GPU buffer.
- void sbgl_DestroyPipeline(sbgl_Context *ctx, sbgl_Pipeline pipeline): Destroys a graphics pipeline.
- SBGL_BUFFER_USAGE_VERTEX: Usage enumerator for vertex buffers.
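The sketch below shows one plausible per-frame call order using the signatures documented on this page. It compiles without the SBgl headers because every sbgl_ function here is a local stub that only logs its own name; the stub bodies, the stand-in types, demo_trace, demo_draw_frame, and the chosen order are all assumptions for illustration, not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in types so the sketch compiles without the SBgl headers. */
typedef struct sbgl_Context { char log[256]; } sbgl_Context;
typedef uint32_t sbgl_Buffer;
typedef uint32_t sbgl_Pipeline;
typedef enum { SBGL_BUFFER_USAGE_VERTEX } sbgl_BufferUsage;

/* Appends "name;" to the context's call log. */
static void demo_trace(sbgl_Context *ctx, const char *name) {
    size_t used = strlen(ctx->log);
    size_t need = strlen(name) + 1;              /* name plus ';' */
    assert(used + need < sizeof ctx->log);
    memcpy(ctx->log + used, name, need - 1);
    ctx->log[used + need - 1] = ';';
    ctx->log[used + need] = '\0';
}

/* Stubs mirroring the documented signatures; each one only records that it ran. */
static void sbgl_SetClearColor(sbgl_Context *ctx, float r, float g, float b, float a) {
    (void)r; (void)g; (void)b; (void)a; demo_trace(ctx, "SetClearColor");
}
static void sbgl_BeginDrawing(sbgl_Context *ctx) { demo_trace(ctx, "BeginDrawing"); }
static void sbgl_BindPipeline(sbgl_Context *ctx, sbgl_Pipeline pipeline) {
    (void)pipeline; demo_trace(ctx, "BindPipeline");
}
static void sbgl_BindBuffer(sbgl_Context *ctx, sbgl_Buffer buffer, sbgl_BufferUsage usage) {
    (void)buffer; (void)usage; demo_trace(ctx, "BindBuffer");
}
static void sbgl_PushConstants(sbgl_Context *ctx, size_t size, const void *data) {
    (void)size; (void)data; demo_trace(ctx, "PushConstants");
}
static void sbgl_Draw(sbgl_Context *ctx, uint32_t vertexCount, uint32_t firstVertex,
                      uint32_t instanceCount) {
    (void)vertexCount; (void)firstVertex; (void)instanceCount; demo_trace(ctx, "Draw");
}
static void sbgl_EndDrawing(sbgl_Context *ctx) { demo_trace(ctx, "EndDrawing"); }

/* One frame: clear color, begin, bind state, upload per-frame data, draw, present. */
static void demo_draw_frame(sbgl_Context *ctx, sbgl_Pipeline pipeline, sbgl_Buffer vbo, float time) {
    sbgl_SetClearColor(ctx, 0.1f, 0.1f, 0.1f, 1.0f);
    sbgl_BeginDrawing(ctx);
    sbgl_BindPipeline(ctx, pipeline);
    sbgl_BindBuffer(ctx, vbo, SBGL_BUFFER_USAGE_VERTEX);
    sbgl_PushConstants(ctx, sizeof time, &time);
    sbgl_Draw(ctx, 3, 0, 1);   /* vertexCount = 3, firstVertex = 0, instanceCount = 1 */
    sbgl_EndDrawing(ctx);
}
```

At shutdown, the teardown rule on this page implies calling sbgl_DeviceWaitIdle before sbgl_DestroyBuffer and sbgl_DestroyPipeline.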
Depth Buffering & 3D Sorting
SBgl utilizes a dedicated depth attachment to ensure correct geometry sorting in 3D space.
- Automatic Management: The engine automatically creates a depth buffer matching the window resolution.
- Pipeline Integration: All pipelines created via sbgl_CreatePipeline have depth testing and depth writing enabled by default.
- Clearing: Every frame, the depth buffer is automatically cleared to 1.0 during the sbgl_BeginDrawing phase. The clear color is set via sbgl_SetClearColor (formerly sbgl_Clear).
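The effect of clearing to 1.0 can be seen in a tiny software model of a depth test. This is a conceptual sketch rather than SBgl code: all names are hypothetical, and the less-than comparison is an assumption, since this page does not state which compare operation the pipelines use.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_W 4
#define DEMO_H 4

typedef struct {
    float depth[DEMO_W * DEMO_H];   /* one depth value per pixel, 0.0 = near, 1.0 = far */
    int   color[DEMO_W * DEMO_H];   /* stand-in for the color attachment */
} demo_Target;

/* Start of frame: every pixel is reset to the far plane so any nearer fragment can pass. */
static void demo_clear(demo_Target *t, int clear_color) {
    for (size_t i = 0; i < (size_t)(DEMO_W * DEMO_H); i++) {
        t->depth[i] = 1.0f;
        t->color[i] = clear_color;
    }
}

/* Depth test + depth write: the fragment wins only if it is closer than what is stored. */
static int demo_write_fragment(demo_Target *t, int x, int y, float z, int color) {
    size_t i = (size_t)y * DEMO_W + (size_t)x;
    if (!(z < t->depth[i])) return 0;   /* rejected: stored depth is nearer or equal */
    t->depth[i] = z;
    t->color[i] = color;
    return 1;
}
```

With this model, draw order stops mattering for opaque geometry: a far fragment submitted after a near one is simply rejected.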
Synchronization & Frames in Flight
To maximize efficiency and prevent CPU/GPU bottlenecks, SBgl implements a Frames in Flight model.
- Double Buffering: The engine uses 2 sets of command buffers and synchronization primitives (semaphores, fences).
- Overlapping Execution: This allows the CPU to begin recording the next frame while the GPU is still processing the previous one.
- Safe Teardown: Before destroying resources (e.g., exiting an application), sbgl_DeviceWaitIdle(ctx) MUST be called to ensure all in-flight GPU work is complete.
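The slot rotation behind this model can be sketched without any GPU by treating each fence as a flag. The names below are hypothetical and the GPU is simulated by an explicit call; in a real Vulkan backend the flag would be a VkFence that the CPU waits on before reusing a slot.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FRAMES_IN_FLIGHT 2u   /* matches the two sets described above */

typedef struct {
    int      fence_signaled;   /* 1 when the GPU has finished with this slot */
    uint64_t recorded_frame;   /* which frame number last used this slot */
} demo_FrameSlot;

typedef struct {
    demo_FrameSlot slots[DEMO_FRAMES_IN_FLIGHT];
    uint32_t       current;       /* slot the CPU records into next */
    uint64_t       frame_number;
} demo_Frames;

static void demo_frames_init(demo_Frames *f) {
    for (uint32_t i = 0; i < DEMO_FRAMES_IN_FLIGHT; i++) {
        f->slots[i].fence_signaled = 1;   /* start signaled so the first wait passes */
        f->slots[i].recorded_frame = 0;
    }
    f->current = 0;
    f->frame_number = 0;
}

/* Returns 1 if the CPU may record into the current slot, 0 if it would have to wait. */
static int demo_frames_can_begin(const demo_Frames *f) {
    return f->slots[f->current].fence_signaled;
}

/* Records and submits one frame, then moves to the other slot. Returns the slot used. */
static uint32_t demo_frames_submit(demo_Frames *f) {
    uint32_t slot = f->current;
    assert(f->slots[slot].fence_signaled);   /* a real engine would block on the fence here */
    f->slots[slot].fence_signaled = 0;       /* the GPU now owns this slot */
    f->slots[slot].recorded_frame = ++f->frame_number;
    f->current = (slot + 1u) % DEMO_FRAMES_IN_FLIGHT;
    return slot;
}

/* Simulates the GPU finishing the work submitted from `slot`. */
static void demo_frames_gpu_done(demo_Frames *f, uint32_t slot) {
    f->slots[slot].fence_signaled = 1;
}
```

With two slots the CPU can run at most one frame ahead: a third submission has to wait until the GPU releases the oldest slot.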
Automated Batching
SBgl provides an automated batching system to minimize CPU-to-GPU communication overhead. This system collects multiple draw requests into a single submission, leveraging GPU-side features like Indirect Drawing.
Render Queues
An sbgl_RenderQueue acts as a collector for draw commands. It is initialized using an SblArena to ensure efficient, contiguous memory allocation for the queued data.
API referenced in this example:
- sbgl_RenderQueue *sbgl_CreateRenderQueue(sbgl_Context *ctx, struct SblArena *arena): Creates a thread-local render queue for collecting draw commands.
- SBL_ARENA_DEF bool sbl_arena_init(SblArena *arena, uint64_t initial_size)
- sbgl_RenderQueue: Internal storage for draw packets awaiting submission.
Submitting Draws
Instead of immediate execution, draw calls are submitted to a queue. The system stores the vertex count, first vertex, and instance count for each draw.
for (int i = 0; i < 10000; i++) {
    ...
}
API referenced in this example:
- void sbgl_SubmitDraw(sbgl_RenderQueue *queue, uint32_t mesh, uint32_t material, uint32_t blendMode, uint32_t sidedness, uint32_t tags, sbgl_SortKey key, const sbgl_InstanceData *data): Appends a draw command to the render queue.
- static sbgl_Vec3 sbgl_Vec3Set(float x, float y, float z): Creates a Vec3, correctly padded.
- static sbgl_Mat4 sbgl_Mat4Translate(sbgl_Vec3 v): Creates a translation matrix.
- static sbgl_Vec4 sbgl_Vec4Set(float x, float y, float z, float w): Creates a Vec4.
- sbgl_InstanceData: Per-instance data for automated batching.
Executing the Queue
The sbgl_RenderQueues function processes one or more queues. It performs internal sorting (e.g., via Radix Sort in the backend) and "bakes" the draws into a single Vulkan Indirect Draw command.
API referenced in this example:
- void sbgl_RenderQueues(sbgl_Context *ctx, sbgl_RenderQueue **queues, uint32_t queueCount, const sbgl_Mat4 *viewProj): Merges, sorts, and submits pending draw commands to the GPU.
- static sbgl_Mat4 sbgl_Mat4Identity(void): Returns an identity matrix.
- sbgl_Mat4: 4x4 Matrix, 16-byte aligned, column-major.
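For readers unfamiliar with the sorting step, the following is a minimal sketch of a stable byte-wise radix sort over draw packets. It is not taken from the SBgl backend: demo_Packet is hypothetical, and treating the sort key as a 64-bit integer is an assumption, since the layout of sbgl_SortKey is not shown on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet: a 64-bit sort key plus the draw it refers to. */
typedef struct {
    uint64_t key;
    uint32_t draw_index;
} demo_Packet;

/*
 * Stable least-significant-digit radix sort on `key`, one byte per pass.
 * `scratch` must hold at least `count` packets. The result ends up in `packets`.
 */
static void demo_radix_sort(demo_Packet *packets, demo_Packet *scratch, size_t count) {
    demo_Packet *src = packets;
    demo_Packet *dst = scratch;
    for (unsigned shift = 0; shift < 64; shift += 8) {
        size_t offsets[256] = {0};
        /* Histogram: how many keys have each byte value at this digit. */
        for (size_t i = 0; i < count; i++)
            offsets[(src[i].key >> shift) & 0xFFu]++;
        /* Exclusive prefix sum turns counts into starting positions. */
        size_t running = 0;
        for (size_t b = 0; b < 256; b++) {
            size_t c = offsets[b];
            offsets[b] = running;
            running += c;
        }
        /* Scatter in input order, which keeps packets with equal digits stable. */
        for (size_t i = 0; i < count; i++)
            dst[offsets[(src[i].key >> shift) & 0xFFu]++] = src[i];
        /* The output of this pass is the input of the next. */
        demo_Packet *tmp = src; src = dst; dst = tmp;
    }
    /* Eight passes means an even number of swaps, so the sorted data is back in `packets`. */
}
```

Stability matters here: packets with identical keys keep their submission order, which keeps output deterministic from frame to frame.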
Optimized Batch Submission
To resolve CPU bottlenecks during high-frequency geometry submission (e.g., voxels, particle systems), SBgl utilizes an internal Transient Allocation system.
Persistent Mapping
Dynamic data required for each frame, such as per-instance transformation matrices and indirect draw commands, is written directly into a set of persistently mapped GPU buffers.
- Zero Allocation Overhead: Unlike standard buffer creation, transient allocation simply increments a pointer in a pre-allocated pool.
- Cache Efficiency: Data is written sequentially by the CPU and read sequentially by the GPU, maximizing throughput.
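The pointer-increment allocation described above can be sketched as a small bump allocator. The names are hypothetical, and a plain byte array stands in for the persistently mapped GPU buffer; the real pool sizes and alignment rules are not given on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-frame pool. `base` stands in for a persistently mapped GPU buffer. */
typedef struct {
    unsigned char *base;
    size_t         capacity;
    size_t         offset;     /* next free byte */
} demo_Transient;

/*
 * Reserves `size` bytes aligned to `align` (a power of two) and returns the byte
 * offset of the reservation, or (size_t)-1 if the pool is exhausted.
 */
static size_t demo_transient_alloc(demo_Transient *t, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (t->offset + align - 1) & ~(align - 1);   /* round up to alignment */
    if (start > t->capacity || size > t->capacity - start) return (size_t)-1;
    t->offset = start + size;
    return start;
}

/* Called once the GPU is done with the frame that used this pool; nothing is freed individually. */
static void demo_transient_reset(demo_Transient *t) {
    t->offset = 0;
}
```

Returning an offset rather than a pointer is a deliberate choice in this sketch: the same number can be used by the CPU (base + offset) and passed to the GPU as a buffer offset.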
Multi-Draw Indirect (MDI)
The system "bakes" sorted draw packets into sbgl_IndirectCommand structures. These are submitted using vkCmdDrawIndexedIndirect, allowing the GPU to process massive numbers of draw calls with a single command dispatch from the CPU.
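As an illustration of what baking might involve, the sketch below collapses runs of sorted packets that share a mesh into one instanced command each. The command struct mirrors the field order of Vulkan's VkDrawIndexedIndirectCommand; demo_DrawPacket, demo_bake, and the run-merging rule are assumptions, and the actual layout of sbgl_IndirectCommand is not shown on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the field order of Vulkan's VkDrawIndexedIndirectCommand (20 bytes). */
typedef struct {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
} demo_IndirectCommand;

/* Hypothetical sorted draw packet: which mesh range to draw. */
typedef struct {
    uint32_t mesh;          /* packets are sorted so equal meshes are adjacent */
    uint32_t indexCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
} demo_DrawPacket;

/*
 * Collapses each run of packets sharing a mesh into one instanced command.
 * firstInstance records where the run's per-instance data starts.
 * Returns the number of commands written; `out` must hold at least `count` entries.
 */
static size_t demo_bake(const demo_DrawPacket *packets, size_t count, demo_IndirectCommand *out) {
    size_t written = 0;
    for (size_t i = 0; i < count; ) {
        size_t run = 1;
        while (i + run < count && packets[i + run].mesh == packets[i].mesh) run++;
        out[written].indexCount    = packets[i].indexCount;
        out[written].instanceCount = (uint32_t)run;
        out[written].firstIndex    = packets[i].firstIndex;
        out[written].vertexOffset  = packets[i].vertexOffset;
        out[written].firstInstance = (uint32_t)i;
        written++;
        i += run;
    }
    return written;
}
```

In Vulkan, an array like this is consumed by a single vkCmdDrawIndexedIndirect call with drawCount set to the number of commands and stride set to the command size; a drawCount above 1 requires the multiDrawIndirect device feature.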
Batch Rendering & DOD Alignment
The handle system is designed for batching: geometry is submitted by iterating through arrays of transformation data (SoA) while binding buffers only once. The automated batcher then uses Multi-Draw Indirect (MDI) to submit that geometry with a single Vulkan command, reducing CPU overhead.
Dynamic Vertex Updates
For CPU-driven effects (e.g., particle systems), vertex buffers can be recreated or updated every frame. The system handles the underlying synchronization to ensure the GPU has finished using the old buffer before it is destroyed.
simulate_particles(particles, 100);
sbgl_DestroyBufferDeferred(ctx, vbo);
API referenced in this example:
- sbgl_Buffer sbgl_CreateBuffer(sbgl_Context *ctx, sbgl_BufferUsage usage, size_t size, const void *data): Creates a GPU buffer.
- sbgl_Buffer (uint32_t): Handle for a GPU-side buffer.
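One common way to implement deferred destruction is a retire queue keyed by frame number, sketched below. This is an assumption about how a call such as sbgl_DestroyBufferDeferred could work, not a description of SBgl's internals; all demo_ names are hypothetical, and the delay of two frames simply matches the two frames in flight described earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FRAMES_IN_FLIGHT 2u
#define DEMO_MAX_PENDING 16u

typedef struct {
    uint32_t handle;
    uint64_t retire_frame;   /* first frame at which destruction is safe */
} demo_Pending;

typedef struct {
    demo_Pending pending[DEMO_MAX_PENDING];
    size_t       count;
    uint64_t     frame;       /* current frame number */
    size_t       destroyed;   /* total buffers actually destroyed, for illustration */
} demo_Graveyard;

/* Queues a buffer for destruction instead of destroying it while the GPU may still read it. */
static int demo_destroy_deferred(demo_Graveyard *g, uint32_t handle) {
    if (g->count == DEMO_MAX_PENDING) return 0;
    g->pending[g->count].handle = handle;
    g->pending[g->count].retire_frame = g->frame + DEMO_FRAMES_IN_FLIGHT;
    g->count++;
    return 1;
}

/* Advances to the next frame and destroys everything whose retire frame has arrived. */
static void demo_next_frame(demo_Graveyard *g) {
    g->frame++;
    size_t kept = 0;
    for (size_t i = 0; i < g->count; i++) {
        if (g->pending[i].retire_frame <= g->frame) {
            g->destroyed++;                      /* a real backend would free the GPU buffer here */
        } else {
            g->pending[kept++] = g->pending[i];  /* possibly still in flight: keep it */
        }
    }
    g->count = kept;
}
```

The scheme is only safe if the start of each frame waits for the GPU to finish the frame that previously used the same slot, as in a fence-based frames-in-flight loop.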
Multi-Queue Submission
Managing separate render queues allows for logical separation of draw calls (e.g., Opaque, Transparent, UI). These queues can be merged and processed in a single backend call to optimize pipeline state changes.
Multithreading Considerations
The current implementation records commands into a single primary command buffer per context. To support multithreading:
- The backend will be extended to support Secondary Command Buffers.
- Each worker thread will record commands into its own buffer.
- The main thread will execute all recorded buffers in a single submission.