Saturday, October 24, 2009

Fast Rendering With libtcod in Python

Python, being an interpreted language, is not all that speedy.  Obviously a great deal of effort has gone into optimisation of the interpreter, but by far the best way of producing fast code is via the liberal use of functionality implemented in C.  By this I do not mean that one should go out of one's way to write C extensions for every taxing operation.  It is usually sufficient to make use of existing libraries, both standard and extension.

The Chronicles of Doryen Library, or libtcod, is an ASCII rendering engine, primarily intended for rogue-like games.  It features full 24-bit colour and a very nice set of utility modules -- basically everything one might need to produce a rogue-like.  As I said, though, it is a rendering engine and not a game engine.  One still has to implement one's own data structures and game loop.

The most obvious way of printing characters on screen in libtcod is via the put_char function.  It places a character at the specified location in a buffer.  The character's colour may be set either by previously altering the default foreground and background colours (via set_background_color and set_foreground_color) or by afterward setting the colours for that particular location on screen (via set_back and set_fore).  The problem here is obvious: it takes a minimum of three function calls for each and every character on screen.  If one is rendering to a traditional 80 by 25 viewport, that is two thousand characters or six thousand function calls every frame.  That is not including whatever happens throughout the rest of the program loop.
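To put a number on that cost, here is a small sketch that counts the calls using a stand-in object rather than libtcod itself.  FakeConsole and its three methods are hypothetical names that merely mirror the three calls described above; only the arithmetic is the point.

```python
class FakeConsole(object):
    """Hypothetical stand-in for a libtcod console; it only counts calls."""

    def __init__(self):
        self.calls = 0

    def put_char(self, x, y, char):
        self.calls += 1

    def set_fore(self, x, y, colour):
        self.calls += 1

    def set_back(self, x, y, colour):
        self.calls += 1


def draw_per_cell(con, width, height):
    # One put_char plus two colour calls for every cell on screen.
    for y in range(height):
        for x in range(width):
            con.put_char(x, y, '.')
            con.set_fore(x, y, (1, 255, 1))
            con.set_back(x, y, (1, 1, 1))
    return con.calls


calls_per_frame = draw_per_cell(FakeConsole(), 80, 25)
```

For an 80 by 25 viewport, calls_per_frame comes out at 80 * 25 * 3, and each of those calls would cross from Python into the library in a real program.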

Clearly another method is required.  Fortunately, libtcod provides functions capable of printing multiple characters (i.e. a string) with a single call: print_left, print_center, and print_right.  Furthermore, it provides functionality through which one may specify colours within the string.  This is how I implemented my solution.

First off, let us start with a data structure:

class Cell (object):
    __slots__ = 'fg', 'bg', 'char'

    def __init__ (self, fg,  bg, char):
        self.fg = fg
        self.bg = bg
        self.char = char

To be honest, using a class is somewhat slower than simply storing this data in a tuple, but this makes the code more readable.  I used __slots__ because it does at least provide a small boost in speed.
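If you would like to check that trade-off on your own machine, a rough measurement along these lines should do.  It is only a sketch: PlainCell and SlottedCell are names I have made up for the comparison, the repeat count is arbitrary, and it times object creation only, so the numbers will vary with interpreter and hardware.

```python
import timeit


class PlainCell(object):
    def __init__(self, fg, bg, char):
        self.fg = fg
        self.bg = bg
        self.char = char


class SlottedCell(object):
    __slots__ = ('fg', 'bg', 'char')

    def __init__(self, fg, bg, char):
        self.fg = fg
        self.bg = bg
        self.char = char


FG, BG, CHAR = (1, 255, 1), (1, 1, 1), '.'
RUNS = 20000  # arbitrary repeat count; raise it for steadier numbers

# Time how long it takes to build each kind of cell RUNS times.
timings = {
    'tuple': timeit.timeit(lambda: (FG, BG, CHAR), number=RUNS),
    'plain class': timeit.timeit(lambda: PlainCell(FG, BG, CHAR), number=RUNS),
    'slotted class': timeit.timeit(lambda: SlottedCell(FG, BG, CHAR), number=RUNS),
}

for name in sorted(timings):
    print('%-14s %.4f s' % (name, timings[name]))
```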

The easiest way to store the scene is in a list.  More specifically, in a list of lists: the outer list is a collection of rows, and each inner list holds the cells of one row.  Let us generate a random scene:

import random

# width and height are the dimensions of the scene, defined elsewhere
chars = ['\'', '`', ',', '.', '.', '.']
raw_scene = [[Cell ((1, 255, 1), (1, 1, 1), random.choice (chars)) for x in range (width)] for y in range (height)]

There: a Dwarf Fortress-like grassy field!

If none of that code makes sense to you, I suggest that you study list comprehensions.  They are immensely useful.
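For comparison, the scene-building line above does the same work as a pair of nested loops.  The sketch below spells that out; to keep it self-contained it stores plain tuples instead of Cell objects and uses small made-up dimensions, both of which are my assumptions rather than part of the original program.

```python
import random

width, height = 8, 4  # small hypothetical dimensions for the example
chars = ["'", '`', ',', '.', '.', '.']

rows = []
for y in range(height):          # outer loop: one pass per row
    row = []
    for x in range(width):       # inner loop: one pass per cell in the row
        row.append(((1, 255, 1), (1, 1, 1), random.choice(chars)))
    rows.append(row)
```

The one-line version simply folds both loops and both append calls into a single expression.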

Anyway, this is the raw data from which we will construct the scene.  We keep it in memory for cases where we must update the scene, as the compiled data may not be as intuitive to alter (though it is certainly possible).

The scene is compiled by converting every cell into a string consisting of libtcod colour codes and the cell's character:

def compile_scene (raw_scene):
    scene = []

    for y in range (height):
        scene.append ([])

        for x in range (width):
            cell = raw_scene[y][x]
            scene[y].append ('%c%c%c%c%c%c%c%c%c%c' % ((tcod.COLCTRL_FORE_RGB, ) + cell.fg + (tcod.COLCTRL_BACK_RGB, ) + cell.bg + (cell.char, tcod.COLCTRL_STOP)))

    return scene

Leaving the individual little strings separate from each other allows one to later swap any of them for a different one.  Besides, Python provides the functionality to quickly combine them when we draw the scene.
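As a rough illustration of that swap, the sketch below recompiles a single cell and drops it into place.  The helper names compile_cell and replace_cell are hypothetical, and the three control-code values are stand-ins I have assumed so that the example runs without libtcod; in a real program you would use tcod.COLCTRL_FORE_RGB, tcod.COLCTRL_BACK_RGB and tcod.COLCTRL_STOP instead.

```python
# Assumed stand-in values so the sketch runs without libtcod installed.
COLCTRL_FORE_RGB, COLCTRL_BACK_RGB, COLCTRL_STOP = 6, 7, 8


def compile_cell(fg, bg, char):
    """Build the ten-character colour-coded string for a single cell."""
    return '%c%c%c%c%c%c%c%c%c%c' % (
        (COLCTRL_FORE_RGB,) + fg + (COLCTRL_BACK_RGB,) + bg
        + (char, COLCTRL_STOP))


def replace_cell(scene, x, y, fg, bg, char):
    """Overwrite one compiled cell in place; the rest of the row is untouched."""
    scene[y][x] = compile_cell(fg, bg, char)


# A tiny 3 by 2 compiled scene of green dots on a dark background.
scene = [[compile_cell((1, 255, 1), (1, 1, 1), '.') for x in range(3)]
         for y in range(2)]

# Swap the middle cell of the first row for a red '@'.
replace_cell(scene, 1, 0, (255, 1, 1), (1, 1, 1), '@')
row = ''.join(scene[0])
```

Only one short string is rebuilt; every other cell in the scene is reused as it was.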

Drawing the scene is simply a matter of using the join function of Python strings to combine the strings in every row.  We could technically also add a newline character to the end of every row and then combine the rows, allowing one to draw the entire scene in a single function call, but for some reason that produced a strange bug for me whereby sometimes the scene would not fully render.  I even ported the program to C and encountered the same issue.  Printing each line one-by-one seems to work, though, and it is not much slower.

def draw (con, scene):
    scene_str = [''.join (scene[y]) for y in range (height)]

    for y in range (height):
        tcod.console_print_left (con, 0, y, tcod.BKGND_SET, scene_str[y])

We can even go further and draw only a portion of the scene:

def draw (con, scene, left, top):
    scene_str = [''.join (scene[y][left:left + scr_width]) for y in range (top, top + scr_height)]

    for y in range (scr_height):
        tcod.console_print_left (con, 0, y, tcod.BKGND_SET, scene_str[y])

This is useful in cases where the scene encompasses an area larger than the viewport.
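One thing to watch with this approach is that left and top must stay inside the scene: a negative left makes the slice wrap around, and a top too near the bottom runs off the end of the list.  Here is a minimal sketch of one way to pick them, centring on some point of interest such as the player.  The function name and its parameters are hypothetical, and it assumes the scene is at least as large as the viewport.

```python
def clamp_viewport(focus_x, focus_y, width, height, scr_width, scr_height):
    """Return (left, top) centred on the focus where possible, but never
    extending past the edges of the scene."""
    left = focus_x - scr_width // 2
    top = focus_y - scr_height // 2

    # Keep the viewport within [0, scene size - viewport size] on each axis.
    left = max(0, min(left, width - scr_width))
    top = max(0, min(top, height - scr_height))
    return left, top


# A 200 by 100 scene viewed through an 80 by 25 window,
# with the focus near the bottom-left corner.
left, top = clamp_viewport(5, 90, 200, 100, 80, 25)
```

The resulting left and top can then be passed straight to the second draw function.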

The full version of my test program also incorporates an update function and several light sources altering the visibility of each cell.  It is in that version that the bug I mentioned appeared.

Anyway, I hope this helps anyone who has had problems with rendering speed in Python.  It has been a long time since I wrote an article on my blog, so at the very least it was worthwhile for me. . .
