Monday, December 30, 2013

Simple Ring Buffer


 

A simple ring buffer
As an example for which classes are handy, let's consider the implementation of a circular queue or ring buffer. A ring buffer can be a useful structure for transmitting data between asynchronous processes. For instance, you might use a ring buffer to buffer character data coming from or going to a peripheral device such as a serial port.

A ring buffer is a first-in-first-out data structure. You insert data at the buffer's back end and remove it from the front end. A typical implementation for a character ring buffer uses three variables:
char array[N];
int head, tail;

where:

  • array is a fixed-size array of characters
  • head is the index of the front element in the buffer
  • tail is the index of the array element just beyond the back element, or equivalently, the index in the array at which the next element will be inserted
To insert a new element at the back of the buffer, store the new value at array[tail]; and increment tail; if the new value for tail is N (the number of array elements), then reset tail to zero.
To remove an element from the front of the buffer, simply increment head, resetting head to zero when it reaches N.









Figure 1: A ring buffer with a maximum capacity of N elements


In effect, head and tail chase each other around the array, as shown in Figure 1. Initially, the head and tail have the same value, typically zero, indicating that the ring buffer is empty. As the tail pulls away from the head, the buffer fills up. If and when the tail gets so far ahead that it wraps around and catches up to the head, the buffer will be full. As the head catches up to the tail, the buffer empties. When the head completely overtakes the tail, the buffer will be empty once again.
Implementing a ring buffer that you can safely share between asynchronous processes requires attention to details that I don't want to get into just yet. For now, I'm just interested in using the buffer as an example to illustrate the need for class types. I'll defer further discussion of concurrency problems for a later column and focus on building a ring buffer suitable for use within a single process.
A ring buffer in C
Many C programmers would implement a ring buffer as a loose collection of variables. They might name the variables with a common prefix to show that they're related, but do little else to package the ring buffer as a single entity. For example, the declarations for a ring buffer used to transmit data to a UART (a serial port) might look like:
#define TXBUFSIZ 32
char txbuf[TXBUFSIZ];
int txhead, txtail;

Using this style, operations on the ring buffers are typically scattered throughout the code. For example, a handler that responds to an interrupt when the UART transmitter is ready for more data might look like:
if (/* UART transmit interrupt pending */)
    {
    if (txhead != txtail)
        {
        put_to_UART(txbuf[txhead]);
        if (++txhead >= TXBUFSIZ)
            txhead = 0;
        }
    /* clear interrupt */;
    }

Once you know how the ring buffer works, this code isn't particularly hard to read. For instance, the condition txhead != txtail tests that the buffer isn't empty. The expression txbuf[txhead] refers to the front element in the buffer, so that:
put_to_UART(txbuf[txhead]);
sends the front character of the ring buffer to the UART. Finally:
if (++txhead >= sizeof(txbuf))
    txhead = 0;

removes the front character from the buffer.
The problem with this style of programming is that it's harder to read and maintain than it needs to be. For example, suppose you discover that implementing head and tail as pointers, rather than as integers, yields better (smaller or faster) code. To implement this change, you must change the declarations for the transmit buffer to:
#define TXBUFSIZ 32
char txbuf[TXBUFSIZ];
char *txhead, *txtail;

Then you have to scrutinize all uses of head and tail throughout the code, and rewrite them if necessary. For example, you have to rewrite the code for the handler as:
if (/* UART transmit interrupt pending */)
    {
    if (txhead != txtail)
        {
        put_to_UART(*txhead);
        if (++txhead >= txbuf + TXBUFSIZ)
            txhead = txbuf;
        }
    /* clear interrupt */;
    }

Notice that the condition txhead != txtail remains the same, but the code to copy and remove the front element changes slightly.
Using abstract types in C
Skilled C programmers anticipate such changes and package data structures such as ring buffers as abstract types. An abstract type is one that's packaged to separate its functional behavior from its implementation. This lets you use an abstract type with little or no regard for its underlying implementation. Ideally, changes in the type's implementation should not change the way you use the type.
For example, you can implement the ring buffer as a structure type, such as:
enum { rb_bufsiz = 32 };
struct ring_buffer
    {
    char array[rb_bufsiz];
    int head, tail;
    };
typedef struct ring_buffer ring_buffer;

This version of the ring buffer uses an enumeration rather than a macro to define the constant for the array size. Enumeration constants are often preferable to macros for defining symbolic constants. (See "Enumeration Constants vs. Constant Objects," December 2001, p13.) The typedef immediately after the structure definition elevates the name ring_buffer from a tag to a full-fledged type name. (See "Tag vs. Type Names," October 2002, p7.)
Using the ring_buffer structure, you can declare the ring buffer for the UART transmitter as simply:
ring_buffer tx;
The ring_buffer structure should come with a comprehensive set of functions to implement the fundamental ring buffer operations. One such function is:
bool rb_empty(ring_buffer const *b)
    {
    return b->head == b->tail;
    }

Calling rb_empty(&tx) returns true if and only if tx is empty. In C99, you'd probably declare this function with the keyword inline to eliminate the function call overhead. In earlier C dialects, you might implement it as a macro:
#define rb_empty(b) ((b)->head == (b)->tail)
Then you can rewrite:
if (txhead != txtail)
    ...

more abstractly as:
if (!rb_empty(&tx))
    ...

Listing 1: C99 header file
// ring_buffer.h
#ifndef RING_BUFFER_INCLUDED
#define RING_BUFFER_INCLUDED

enum { rb_size = 32 };
struct ring_buffer
        {
        char buffer[rb_size];
        int head, tail;
        };

inline
bool rb_empty(ring_buffer const *b)
        {
        return b->head == b->tail;
        }

inline
char rb_front(ring_buffer const *b)
        {
        return b->buffer[b->head];
        }

inline
void rb_pop_front(ring_buffer *b)
        {
        if (++b->head >= rb_size)
                b->head = 0;
        }

#endif
Listing 1 contains a C99 header file, ring_buffer.h, which defines ring_buffer as an abstract type with supporting functions. Using these functions, the code for the UART handler looks like:
if (/* UART transmit interrupt pending */)
    {
    if (!rb_empty(&tx))
        {
        put_to_UART(rb_front(&tx));
        rb_pop_front(&tx);
        }
    /* clear interrupt */;
    }

I borrowed the function names front and pop_front from functions that support similar operations in the Standard C++ Library. Calling rb_front(&tx) returns the front element of tx, and calling rb_pop_front(&tx) removes the element from the front of tx.
What's missing?
Although you can implement abstract types in C, it's much harder to do it in C than in C++. C compilers don't give you any help with catching mistakes that undermine the abstraction.
For example, ring_buffer.h in Listing 1 doesn't provide a function to initialize a ring_buffer. Before you insert elements into a ring_buffer, you should initialize the head and tail to the same value, typically zero. However, you can get away without initializing the head and tail if you declare the ring_buffer as a statically allocated object. Objects with static storage are initialized to zero by default.
Now, suppose you change the ring_buffer implementation so that head and tail are declared as pointers instead of integer-valued indices. In that case, head and tail should be initialized to point to the initial element of the ring_buffer's array. However, if your program contains a ring_buffer initialized to zero by default, its head and tail pointers will be initialized to null pointers.
Of course, you can fix the bug by adding code in your application to initialize the ring_buffer properly, as in:
ring_buffer tx;
...
tx.head = tx.tail = tx.array;

But this undermines the whole purpose of using abstract types. Again, an abstract type should provide a sufficiently complete set of basic operations so that you can use that type without knowing its implementation. Ideally, the compiler shouldn't allow the user of an abstract ring_buffer type to write statements such as:
tx.head = tx.tail = tx.array;
which depend on knowing how the type is implemented.
A language that truly supports abstract type can detect and complain about statements that undermine abstractions. C compilers can't do this. C++ compilers can. Next month, I'll show you how.