Push versus Pull Renderers
The biggest choice a renderer has to make is whether it is a push or a pull type. In practice, however, most choose push and so never really think about the choice.
In these discussions we split the HW into two types: one that runs the world simulation (physics, inputs, game code, etc.), which we call the CPU, and another that performs the computation related to rendering, the GPU. In some HW topologies a physical CPU core may be used as a GPU slave, in which case this text considers it part of the GPU.
A push renderer is almost certainly what you’ve seen in 99% of places. It’s defined as a system where the world database and simulation are held on the CPU, and the CPU selects what to render and pushes the data required to render its selection to the GPU. It’s known as a ‘Push’ renderer because the CPU pushes data to the GPU.
A pull renderer is a very different beast. Here the renderer looks directly in the world database and uses that to render things to the user. A full pull renderer would have the CPU doing nothing for the scene to be rendered. The GPU decides what and how to render without the CPU doing anything, hence the GPU ‘Pulls’ render data directly.
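To make the contrast concrete, here is a minimal C++ sketch of the two shapes. Every name in it (WorldObject, Gpu, renderPush, renderPull) is made up for illustration, the 'GPU' is just an object that collects draw commands, and the visible flag stands in for a real visibility test. It is only meant to show which side does the selecting, not how a real renderer is structured.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical world database entry; names are illustrative only.
struct WorldObject {
    std::uint32_t meshId;
    bool visible;  // stand-in for a real visibility test
};

struct DrawCommand {
    std::uint32_t meshId;
};

// The "GPU side" of this sketch: something that ends up with draw commands.
struct Gpu {
    std::vector<DrawCommand> submitted;

    // Push: the GPU side only ever sees what the CPU hands to it.
    void submit(const DrawCommand& cmd) { submitted.push_back(cmd); }

    // Pull: the GPU side walks the world database and selects for itself.
    void pullFrom(const std::vector<WorldObject>& world) {
        for (const WorldObject& obj : world) {
            if (obj.visible) {
                submitted.push_back({obj.meshId});
            }
        }
    }
};

// Push renderer: the CPU selects what to render and pushes it across.
void renderPush(const std::vector<WorldObject>& world, Gpu& gpu) {
    for (const WorldObject& obj : world) {
        if (obj.visible) {
            gpu.submit({obj.meshId});
        }
    }
}

// Pull renderer: the CPU does no per-object work at all.
void renderPull(const std::vector<WorldObject>& world, Gpu& gpu) {
    gpu.pullFrom(world);
}
```

Both paths end up with the same draw commands; the difference is purely which side of the CPU/GPU split runs the selection loop.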
Pull renderers are rare due to many issues, from graphics APIs having CPU-only bindings to the CPU hardware being the only device flexible enough to render something. Render APIs are almost always push biased: you have an object on which you make change-state and draw calls directly. Pull requires a higher-level API than is currently in favour; it has to know about your chosen bounding volumes and potentially understand your material and geometry structures. It also requires the choices about what to render to be made in GPU space.
At first glance this sounds like a classic render thread architecture, where a thread receives commands from the CPU and turns them into renders. However, that is still a push renderer, as the CPU still has to push some data to the render thread. A pull renderer might use a render thread, but if so it goes and fetches what to render rather than just following a command list.
First I’ll describe a PS3 pull renderer I’ve written in the past (should be safe from NDAs), and then some ideas for a future PC pull renderer.
The key to a PS3 pull renderer is slaving at least one SPU to the GPU, effectively using an SPU as a fancy front end for the GPU. Apart from writing the SPU pull code, getting the CPU synchronisation right is the first part needed. A key point of any pull renderer is ensuring there are no data races when accessing the shared world database. The simplest method is a copy inside a critical section, so that the two sides only contend during the lock period whilst the world database is duplicated.
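As a rough illustration of the copy-inside-a-critical-section idea, here is a small C++ sketch using a standard mutex. It is not the PS3 code (SPU code would use DMA and platform synchronisation primitives rather than std::mutex); WorldDatabase, ObjectState, update and snapshot are hypothetical names, and the whole thing assumes the database is small enough to duplicate each frame.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical per-object state; the fields are illustrative only.
struct ObjectState {
    float x, y, z;
    float radius;
};

class WorldDatabase {
public:
    // Simulation side: writes go through the same lock as the copy.
    void update(std::size_t index, const ObjectState& state) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (index >= objects_.size()) {
            objects_.resize(index + 1);
        }
        objects_[index] = state;
    }

    // Render side: duplicate the whole database inside the critical
    // section, then work from the private copy with no lock held.
    std::vector<ObjectState> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return objects_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<ObjectState> objects_;
};
```

The render side calls snapshot() once per frame and never touches the live data again, so the only contention between the two sides is the duration of the copy.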
Once for each frame required, the CPU issues a ‘kickoff’ signal telling the GPU to start pulling (in theory it’s possible to have a free-running pull renderer, but in practice it helps to have a kickoff signal from the CPU side).
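One possible shape for such a kickoff signal, sketched in portable C++ with an atomic counter. This is an assumption for illustration, not the mechanism the PS3 renderer used; KickoffSignal, signal and tryConsume are hypothetical names, and the sketch assumes exactly one render-side consumer that polls.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical kickoff signal with a single render-side consumer.
class KickoffSignal {
public:
    // CPU side: ask for one more frame. Cheap and never blocks.
    void signal() { requested_.fetch_add(1, std::memory_order_release); }

    // Render side: returns true if at least one kickoff has arrived since
    // the last successful call. Several pending kickoffs collapse into
    // one frame, so the renderer never builds up a backlog.
    bool tryConsume() {
        const std::uint32_t requested =
            requested_.load(std::memory_order_acquire);
        if (requested == consumed_) {
            return false;
        }
        consumed_ = requested;
        return true;
    }

private:
    std::atomic<std::uint32_t> requested_{0};
    std::uint32_t consumed_ = 0;  // only ever touched by the render side
};
```

The render side would poll tryConsume() in its main loop and start a pull when it returns true. Collapsing multiple kickoffs into one frame is a design choice of this sketch; a renderer that must draw every requested frame would compare counters differently.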
When signalled, the GPU slave starts reading the world database and culls objects using their bounding volume and world space matrix. If an object is visible, the slave decodes its render mesh data structure and writes a portion of a command list to render that object. As the slave is the GPU front end, the entire render pipeline lives here and the CPU gets straight back to its simulation; its entire render overhead is just the time for the signal and a lock.
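The cull-then-emit loop can be sketched roughly as follows. This is plain C++ rather than SPU code, and everything in it is an assumption made for illustration: the bounding volume is taken to be a sphere, the frustum is a list of inward-facing planes, the matrix is row-major with translation in elements 3, 7 and 11 and no scale, and the 'command list' is just a vector of hypothetical DrawCommand records rather than real GPU commands.

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct Plane { Vec3 n; float d; };          // inside when dot(n, p) + d >= 0
struct Sphere { Vec3 centre; float radius; };
using Mat4 = std::array<float, 16>;         // row-major, assumed unscaled

// Hypothetical world database entry and command record.
struct WorldEntry {
    Mat4 world;
    Sphere localBounds;
    std::uint32_t meshId;
};
struct DrawCommand {
    std::uint32_t meshId;
    Mat4 world;
};

// Transform a point by the world matrix (w is assumed to be 1).
Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    return { m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
             m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
             m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11] };
}

// A sphere is culled once it lies entirely behind any one plane.
bool sphereVisible(const Sphere& s, const std::vector<Plane>& frustum) {
    for (const Plane& pl : frustum) {
        const float dist = pl.n.x * s.centre.x + pl.n.y * s.centre.y +
                           pl.n.z * s.centre.z + pl.d;
        if (dist < -s.radius) {
            return false;
        }
    }
    return true;
}

// Walk the (snapshotted) world database and emit one command per
// visible object.
std::vector<DrawCommand> buildCommandList(const std::vector<WorldEntry>& world,
                                          const std::vector<Plane>& frustum) {
    std::vector<DrawCommand> commands;
    for (const WorldEntry& entry : world) {
        // No scale assumed, so the radius carries over unchanged.
        const Sphere worldBounds{
            transformPoint(entry.world, entry.localBounds.centre),
            entry.localBounds.radius};
        if (sphereVisible(worldBounds, frustum)) {
            commands.push_back({entry.meshId, entry.world});
        }
    }
    return commands;
}
```

In a real pull renderer the push_back would be replaced by decoding the mesh structure and writing actual GPU commands, but the control flow (read entry, transform bounds, test, emit) is the part this sketch is meant to show.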