SBgl 0.1.0
A graphics framework in C99
SBgl Rendering Pipeline

The SBgl Rendering Pipeline is built on Vulkan 1.3's Dynamic Rendering, eliminating the need for complex RenderPass and Framebuffer boilerplate. It follows Data-Oriented Design (DOD) principles by using lightweight integer handles for resource management.

Key Concepts

Handle-Based Resource Management

Resources like Buffers, Shaders, and Pipelines are referenced via sbgl_Buffer, sbgl_Shader, and sbgl_Pipeline handles (32-bit unsigned integers). Internally, these handles map to contiguous arrays in the Vulkan backend, ensuring cache-efficient access.
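
As a rough illustration of the handle idea (not SBgl's actual backend), the sketch below keeps resources in a fixed-size contiguous array and hands out 32-bit indices into it. The names demo_Pool, demo_Buffer, and the demo_pool_* functions are hypothetical, and reserving index 0 as an invalid handle is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_BUFFERS 8u      /* hypothetical limit, not an SBgl constant */
#define DEMO_INVALID_HANDLE 0u   /* assumption: slot 0 is never handed out */

typedef uint32_t demo_BufferHandle;

typedef struct {
    size_t size;    /* payload standing in for real GPU state */
    int    in_use;
} demo_Buffer;

typedef struct {
    demo_Buffer slots[DEMO_MAX_BUFFERS];  /* contiguous storage */
} demo_Pool;

/* Returns the first free slot as a handle, or DEMO_INVALID_HANDLE if full. */
static demo_BufferHandle demo_pool_alloc(demo_Pool *pool, size_t size)
{
    assert(pool != NULL);
    for (uint32_t i = 1; i < DEMO_MAX_BUFFERS; i++) {
        if (!pool->slots[i].in_use) {
            pool->slots[i].in_use = 1;
            pool->slots[i].size = size;
            return i;
        }
    }
    return DEMO_INVALID_HANDLE;
}

/* Resolves a handle to its slot; NULL for invalid or freed handles. */
static demo_Buffer *demo_pool_get(demo_Pool *pool, demo_BufferHandle h)
{
    assert(pool != NULL);
    if (h == DEMO_INVALID_HANDLE || h >= DEMO_MAX_BUFFERS) return NULL;
    return pool->slots[h].in_use ? &pool->slots[h] : NULL;
}

/* Marks the slot as free so a later allocation can reuse it. */
static void demo_pool_free(demo_Pool *pool, demo_BufferHandle h)
{
    demo_Buffer *b = demo_pool_get(pool, h);
    if (b) b->in_use = 0;
}
```

Because a handle is only an index, copying it is free and a lookup is a single array access; a production design would typically also guard against stale handles, which this sketch does not attempt.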

Configurable Resource Limits

The maximum number of resources (buffers, shaders, pipelines) is configurable at context initialization via sbgl_InitWithConfig(). See Window Setup for details on initialization options.

Explicit Pipeline State Objects (PSO)

SBgl requires explicit creation of graphics pipelines. A pipeline encapsulates the vertex/fragment shaders and the vertex input layout. This design ensures predictable performance by moving pipeline compilation to initialization time.

Vertex Input Layout

The sbgl_VertexLayout structure defines how vertex data in a buffer is mapped to shader input locations.

  • Stride: Total size in bytes of a single vertex.
  • Attributes: Array of sbgl_VertexAttribute defining the location and offset of each component.

Shader Loading Strategies

SBgl supports two primary ways to load SPIR-V shaders: dynamic file loading and hardcoded byte arrays.

Dynamic File Loading

Ideal for development and modding, allowing shaders to be recompiled without rebuilding the application.

size_t size;
uint32_t* bytecode = read_spirv_file("examples/shaders/example_shader.vert.spv", &size);
sbgl_Shader shader = sbgl_LoadShader(ctx, SBGL_SHADER_STAGE_VERTEX, bytecode, size);
// ... handle cleanup of bytecode if necessary ...
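
The read_spirv_file helper used above is not part of the documented SBgl API. One possible implementation is sketched here, assuming the size passed to sbgl_LoadShader is a byte count and that the caller releases the returned buffer with free().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole SPIR-V file into a malloc'd buffer.
 * Returns NULL on failure; on success *out_size is the size in bytes. */
static uint32_t *read_spirv_file(const char *path, size_t *out_size)
{
    assert(path != NULL && out_size != NULL);
    *out_size = 0;

    FILE *f = fopen(path, "rb");
    if (!f) return NULL;

    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    /* SPIR-V is a stream of 32-bit words, so the length must be a multiple of 4. */
    if (len <= 0 || (len % 4) != 0) { fclose(f); return NULL; }
    rewind(f);

    uint32_t *words = malloc((size_t)len);
    if (!words) { fclose(f); return NULL; }

    if (fread(words, 1, (size_t)len, f) != (size_t)len) {
        free(words);
        fclose(f);
        return NULL;
    }

    fclose(f);
    *out_size = (size_t)len;
    return words;
}
```

Allocating through malloc keeps the data suitably aligned for uint32_t access, which a plain char buffer would not guarantee.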

Hardcoded (Static) Loading

Recommended for production releases to ensure the application is self-contained and prevents tampering with core assets.

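The xxd tool converts a compiled .spv file into a C header. xxd derives the generated identifiers from the input path exactly as given, so run it from the shader directory to get the short names used below: cd examples/shaders && xxd -i example_shader.vert.spv > example_shader_vert.h

#include "example_shader_vert.h"
// xxd generates an unsigned char array named 'example_shader_vert_spv'
// and an unsigned int byte count named 'example_shader_vert_spv_len'
sbgl_Shader shader = sbgl_LoadShader(ctx, SBGL_SHADER_STAGE_VERTEX,
    (uint32_t*)example_shader_vert_spv,
    example_shader_vert_spv_len);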

Vertex Layout Use Case

Defining a standard vertex with Position and Color attributes using the library-provided sbgl_Vertex structure:

// sbgl_Vertex is defined as:
// typedef struct {
// sbgl_Vec3 position;
// sbgl_Vec3 color;
// } sbgl_Vertex;
sbgl_VertexAttribute attributes[] = {
    { .location = 0, .offset = offsetof(sbgl_Vertex, position), .format = SBGL_FORMAT_R32G32B32_SFLOAT },
    { .location = 1, .offset = offsetof(sbgl_Vertex, color), .format = SBGL_FORMAT_R32G32B32_SFLOAT }
};
sbgl_VertexLayout layout = {
    .stride = sizeof(sbgl_Vertex),
    .attributeCount = 2,
    .attributes = attributes
};

Complete Drawing Example

sbgl_SetClearColor(ctx, 0.0f, 0.0f, 0.0f, 1.0f); // Sets clear color for next frame
sbgl_BeginDrawing(ctx);
sbgl_BindPipeline(ctx, example_pipeline);
sbgl_BindBuffer(ctx, example_vbo, SBGL_BUFFER_USAGE_VERTEX);
// Update interactive data via Push Constants
float time = get_current_time();
sbgl_PushConstants(ctx, sizeof(float), &time);
sbgl_Draw(ctx, 3, 0, 1); // 3 vertices, starting at 0, 1 instance
sbgl_EndDrawing(ctx);
// Teardown: Wait for GPU to finish before destroying resources
sbgl_DeviceWaitIdle(ctx);
sbgl_DestroyPipeline(ctx, example_pipeline);
sbgl_DestroyBuffer(ctx, example_vbo);

Depth Buffering & 3D Sorting

SBgl utilizes a dedicated depth attachment to ensure correct geometry sorting in 3D space.

  • Automatic Management: The engine automatically creates a depth buffer matching the window resolution.
  • Pipeline Integration: All pipelines created via sbgl_CreatePipeline have depth testing and depth writing enabled by default.
  • Clearing: Every frame, the depth buffer is automatically cleared to 1.0 during the sbgl_BeginDrawing phase. The clear color is set via sbgl_SetClearColor (formerly sbgl_Clear).

Synchronization & Frames in Flight

To maximize efficiency and prevent CPU/GPU bottlenecks, SBgl implements a Frames in Flight model.

  • Double Buffering: The engine uses 2 sets of command buffers and synchronization primitives (semaphores, fences).
  • Overlapping Execution: This allows the CPU to begin recording the next frame while the GPU is still processing the previous one.
  • Safe Teardown: Before destroying resources (e.g., exiting an application), sbgl_DeviceWaitIdle(ctx) MUST be called to ensure all in-flight GPU work is complete.
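
The slot rotation behind this model can be sketched without any Vulkan calls. The example below is a simulation only: demo_FrameRing and its functions are hypothetical names, and the gpu_busy flag stands in for a fence that the real backend would wait on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FRAMES_IN_FLIGHT 2u  /* matches the two sets described above */

typedef struct {
    uint64_t submitted_frame;  /* frame number last recorded into this slot */
    int      gpu_busy;         /* stands in for an unsignaled fence */
} demo_FrameSlot;

typedef struct {
    demo_FrameSlot slots[DEMO_FRAMES_IN_FLIGHT];
    uint64_t       frame_counter;
} demo_FrameRing;

/* Picks the slot for the next frame. A real backend would block on the
 * slot's fence here; this sketch only reports whether a wait is needed. */
static uint32_t demo_begin_frame(const demo_FrameRing *ring, int *must_wait)
{
    assert(ring != NULL && must_wait != NULL);
    uint32_t index = (uint32_t)(ring->frame_counter % DEMO_FRAMES_IN_FLIGHT);
    *must_wait = ring->slots[index].gpu_busy;
    return index;
}

/* Marks the slot as submitted to the GPU and advances to the next frame. */
static void demo_end_frame(demo_FrameRing *ring, uint32_t index)
{
    assert(ring != NULL && index < DEMO_FRAMES_IN_FLIGHT);
    ring->slots[index].submitted_frame = ring->frame_counter;
    ring->slots[index].gpu_busy = 1;
    ring->frame_counter++;
}

/* Simulates the GPU finishing the work in a slot (fence signaled). */
static void demo_gpu_retire(demo_FrameRing *ring, uint32_t index)
{
    assert(ring != NULL && index < DEMO_FRAMES_IN_FLIGHT);
    ring->slots[index].gpu_busy = 0;
}
```

With two slots, the CPU can record frame N+1 while frame N is still executing, and only has to wait when it comes back around to a slot whose work has not yet retired.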

Automated Batching

SBgl provides an automated batching system to minimize CPU-to-GPU communication overhead. This system collects multiple draw requests into a single submission, leveraging GPU-side features like Indirect Drawing.

Render Queues

A sbgl_RenderQueue acts as a collector for draw commands. It is initialized using an SblArena to ensure efficient, contiguous memory allocation for the queued data.

SblArena arena;
sbl_arena_init(&arena, 16 * 1024 * 1024); // 16MB for high-frequency batching
sbgl_RenderQueue* queue = sbgl_CreateRenderQueue(ctx, &arena);

Submitting Draws

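Instead of executing immediately, draw calls are submitted to a queue. Each call to sbgl_SubmitDraw passes a mesh ID, material ID, blend mode, sidedness, tags, a sort key, and per-instance data. The system stores the vertex count, first vertex, and instance count for each draw.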

// Queue up 10,000 instances of a single triangle
sbgl_InstanceData data;
for (int i = 0; i < 10000; i++) {
    data.transform = sbgl_Mat4Translate(sbgl_Vec3Set((float)i * 0.1f, 0.0f, 0.0f));
    data.color = sbgl_Vec4Set(1.0f, 1.0f, 1.0f, 1.0f);
    // meshId=0, materialId=0, blend=0, sided=0, tags=0, sortKey=0
    sbgl_SubmitDraw(queue, 0, 0, 0, 0, 0, 0, &data);
}

Executing the Queue

The sbgl_RenderQueues function processes one or more queues. It performs internal sorting (e.g., via Radix Sort in the backend) and "bakes" the draws into a single Vulkan Indirect Draw command.

sbgl_Mat4 vp = sbgl_Mat4Identity(); // identity used here as a placeholder view-projection matrix
sbgl_BindPipeline(ctx, pipeline);
// Process and execute all queued draws
sbgl_RenderQueues(ctx, &queue, 1, &vp);
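
To give a sense of the sorting step mentioned in this section, here is a generic least-significant-digit radix sort over 32-bit keys. It is a sketch only: the layout of sbgl_SortKey and the backend's actual sort are not shown in this documentation, so demo_SortItem and demo_radix_sort are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t key;      /* hypothetical 32-bit sort key */
    uint32_t payload;  /* e.g. index of the draw packet */
} demo_SortItem;

/* Stable LSD radix sort, 8 bits per pass, ascending by key.
 * `scratch` must have room for `count` items. */
static void demo_radix_sort(demo_SortItem *items, demo_SortItem *scratch, size_t count)
{
    assert(items != NULL && scratch != NULL);
    demo_SortItem *src = items;
    demo_SortItem *dst = scratch;

    for (unsigned shift = 0; shift < 32; shift += 8) {
        size_t histogram[256] = {0};

        /* Count how many keys fall into each bucket for this byte. */
        for (size_t i = 0; i < count; i++)
            histogram[(src[i].key >> shift) & 0xFFu]++;

        /* An exclusive prefix sum turns counts into starting offsets. */
        size_t offset = 0;
        for (size_t b = 0; b < 256; b++) {
            size_t n = histogram[b];
            histogram[b] = offset;
            offset += n;
        }

        /* Scatter in input order, which keeps the sort stable. */
        for (size_t i = 0; i < count; i++)
            dst[histogram[(src[i].key >> shift) & 0xFFu]++] = src[i];

        demo_SortItem *tmp = src;
        src = dst;
        dst = tmp;
    }
    /* Four passes (an even number) leave the sorted data back in `items`. */
}
```

Radix sort suits this kind of workload because its cost grows linearly with the number of draws and its stability preserves submission order among draws with equal keys.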

Optimized Batch Submission

To resolve CPU bottlenecks during high-frequency geometry submission (e.g., voxels, particle systems), SBgl utilizes an internal Transient Allocation system.

Persistent Mapping

Dynamic data required for each frame—such as per-instance transformation matrices and indirect draw commands—is written directly into a set of persistently mapped GPU buffers.

  • Zero Allocation Overhead: Unlike standard buffer creation, transient allocation simply increments a pointer in a pre-allocated pool.
  • Cache Efficiency: Data is written sequentially by the CPU and read sequentially by the GPU, maximizing throughput.
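
The pointer-increment allocation described above is commonly called a bump allocator. A minimal CPU-side sketch follows; demo_TransientPool and its functions are hypothetical, and an ordinary byte array stands in for the persistently mapped GPU memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *base;      /* start of the pre-allocated pool (assumed suitably aligned) */
    size_t   capacity;  /* pool size in bytes */
    size_t   offset;    /* next free byte */
} demo_TransientPool;

/* Bump-allocates `size` bytes at an offset aligned to `align` (a power of two).
 * Returns NULL when the pool is exhausted. */
static void *demo_transient_alloc(demo_TransientPool *pool, size_t size, size_t align)
{
    assert(pool != NULL && align != 0 && (align & (align - 1)) == 0);
    size_t start = (pool->offset + (align - 1)) & ~(align - 1);
    if (start > pool->capacity || size > pool->capacity - start) return NULL;
    pool->offset = start + size;
    return pool->base + start;
}

/* Releases everything at once, e.g. when the frame's data is no longer in use. */
static void demo_transient_reset(demo_TransientPool *pool)
{
    assert(pool != NULL);
    pool->offset = 0;
}
```

There is no per-allocation bookkeeping and no individual free: the whole pool is recycled with a single reset, which is what keeps the per-frame cost so low.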

Multi-Draw Indirect (MDI)

The system "bakes" sorted draw packets into sbgl_IndirectCommand structures. These are submitted using vkCmdDrawIndexedIndirect, allowing the GPU to process massive numbers of draw calls with a single command dispatch from the CPU.
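
The baking step might look roughly like the sketch below, which collapses runs of identical mesh IDs in an already sorted packet list into one indirect command each. The real sbgl_IndirectCommand layout is not shown in this documentation; demo_IndirectCommand simply mirrors the field order of Vulkan's VkDrawIndexedIndirectCommand, and demo_Mesh and demo_bake are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as Vulkan's VkDrawIndexedIndirectCommand. */
typedef struct {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
} demo_IndirectCommand;

/* Hypothetical mesh table entry: where a mesh lives in shared buffers. */
typedef struct {
    uint32_t indexCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
} demo_Mesh;

/* Converts sorted mesh IDs into indirect commands. Consecutive packets that
 * share a mesh become one command with a larger instanceCount.
 * Returns the number of commands written (at most max_commands). */
static size_t demo_bake(const uint32_t *sorted_mesh_ids, size_t packet_count,
                        const demo_Mesh *meshes,
                        demo_IndirectCommand *out, size_t max_commands)
{
    assert(sorted_mesh_ids != NULL && meshes != NULL && out != NULL);
    size_t written = 0;
    size_t i = 0;

    while (i < packet_count && written < max_commands) {
        uint32_t mesh = sorted_mesh_ids[i];
        size_t run = 1;
        while (i + run < packet_count && sorted_mesh_ids[i + run] == mesh)
            run++;

        out[written].indexCount    = meshes[mesh].indexCount;
        out[written].instanceCount = (uint32_t)run;
        out[written].firstIndex    = meshes[mesh].firstIndex;
        out[written].vertexOffset  = meshes[mesh].vertexOffset;
        out[written].firstInstance = (uint32_t)i;  /* offset into per-instance data */
        written++;
        i += run;
    }
    return written;
}
```

Sorting first is what makes this merge effective: the more packets that share a mesh end up adjacent, the fewer commands the GPU has to read.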

Batch Rendering & DOD Alignment

The handle system is designed for batching: iterating over arrays of transformation data (SoA) and binding buffers once lets many objects be submitted together. The automated batcher uses Multi-Draw Indirect (MDI) to reduce CPU overhead by submitting geometry with a single Vulkan command.

Dynamic Vertex Updates

For CPU-driven effects (e.g., particle systems), vertex buffers can be recreated or updated every frame. The system handles the underlying synchronization to ensure the GPU has finished using the old buffer before it is destroyed.

// Updating a vertex array every frame
sbgl_Vertex particles[100];
simulate_particles(particles, 100);
// Create a new buffer for this frame
sbgl_Buffer vbo = sbgl_CreateBuffer(ctx, SBGL_BUFFER_USAGE_VERTEX, sizeof(particles), particles);
// Bind and draw (100 vertices, starting at vertex 0, 1 instance)
sbgl_BindBuffer(ctx, vbo, SBGL_BUFFER_USAGE_VERTEX);
sbgl_Draw(ctx, 100, 0, 1);
// Use deferred destruction to safely release the buffer after the frame is retired
sbgl_DestroyBufferDeferred(ctx, vbo);

Multi-Queue Submission

Managing separate render queues allows for logical separation of draw calls (e.g., Opaque, Transparent, UI). These queues can be merged and processed in a single backend call to optimize pipeline state changes.

sbgl_RenderQueue* opaque_queue;
sbgl_RenderQueue* ui_queue;
// ... populate queues ...
sbgl_RenderQueue* queues[] = { opaque_queue, ui_queue };
sbgl_Mat4 vp = get_view_proj();
// Process all queues, performing cross-queue sorting for state optimization
sbgl_RenderQueues(ctx, queues, 2, &vp);

Multithreading Considerations

The current implementation records commands into a single primary command buffer per context. To support multithreading:

  • The backend will be extended to support Secondary Command Buffers.
  • Each worker thread will record commands into its own buffer.
  • The main thread will execute all recorded buffers in a single submission.