Arbitrary Data Sizes with C

Arbitrary Data Types in C

C includes an abundant amount of data types. It also allows you to create your own data types by using structs.

But the largest amount of precision C provides as a data type is a byte. So how can we get smaller data types, such as the 4-bit long nibble?

This is no theory question: the nibble certainly has some application. In situations where a full byte is not needed or RAM is tight, 2 nibbles fit in the same amount of space as a byte. On desktop computers, this technique is simply a novelty. On microcontrollers, this could be a very important optimization. The nibble and similar small-sized data types are also used in networking.

Bitfield Solution

Most of the suggestions I’ve read include this:

    struct t {
        unsigned char s : 4; // 4-bit long s
        unsigned char b : 4; // 4-bit long b
    };

This method has one major problem, memory alignment, when the compiler will pad (add zeros at the end or between elements of a structure) the struct to fit into a word (or into a multiple of the largest element of a structure, because the definition of how large to pad the structure varies as well). Although access is faster when the data type is padded by the compiler into a word, this may present problems where keeping RAM usage to an absolute minimum is important. The problem is that memory alignment is hard to control. Some compilers offer a packed struct, where no memory alignment can be used (__attribute__((packed)) or compile with -fpack-struct for gcc). This solution is compiler dependent. In addition, the size of a word varies from computer to computer, so trying to make sure your structs fit in a word all the time is not entirely possible. Modern compilers can usually optimize this to the point where it’s relatively fast, but a decent amount of the time the code will be slow. Bitfields cannot be used with arrays, so access time is not O(1).

This is not to say that this method doesn’t have it’s merits, such as ease of use and the fact that the compiler can optimize it easily.

Ribble Solution

The advantage of the bitfield solution should not be overlooked: it’s mostly native. There are no function calls involved and it can be implanted in very little developer-visible code. These were some of the things I wanted to include in my alternative.

I made a new solution. I call it the ribble (phonetically: Romeo India Bravo Bravo Lima Echo). It is basically just bitshifting and masking joined in a macro. I am sure someone out there has done it before.

    #define ribble_get(storage, rept, size, offset) (((storage) >> ((rept) * (size) + (offset))) & ((1 << (size)) - 1))

This can seem quite awkward at first, so let us break it down step by step.

  • storage – the data type which will contain the smaller ribble. It can be any size, but int is recommended because int is usually the size of a word.
  • rept – indicates that the rept^th ribble should be the one read.
  • size – the size of the ribble.
  • offset – the offset that the ribble should be read from.

First, the storage is shifted the appropriate amount of bytes. This is basically the number of ribbles times the size of the ribbles plus the offset. If the ribbles are of varying lengths, the number of ribbles should be set to 0 and the offset should be used to represent distance. In order to remove the other unimportant values (the other ribbles which were not removed by shifting), the ribble is bitwise-and’d with ((1 << (size)) - 1). This is 1 shifted to the left size spaces, and then subtracted by one (so 1 << 4 = 0b10000, 0b10000 - 0b00001 = 0b01111). The result is just the ribble we wanted!

This allows us to get a ribble. In order to set a ribble, you would use:

    #define ribble_set(storage, rept, size, offset, set) ((storage) = ((storage) & ~(((1 << (size)) - 1) << ((rept) * (size) + (offset)))) | ((set) << ((rept) * (size) + (offset))))

That is more complicated than our get method. Note that we added a new variable, set, which is what we want to set the ribble to.

A diagram might aid in understanding this:

t = 0b0110
        ^^ overall objective: set that to 01
        (storage) = ((storage)
t & 0b1100 = 0b0110 & 0b1100 = 0b0100
        ^^ create a mask such that everything but those two are set to 0.
        & ~((1 << ((rept * (size) + (offset)))) - 1))
0b0100 | 0b0001 = 0b0101
             ^^ create a mask such that everything is 0 except the part we want to set
             | (((set) << ((rept) * (size) + (offset))))

Well, that is essentially the ribble!


Note that this method still requires you to create a byte if you want just one nibble. The advantage is that most of the time you know that you’re just using a byte and you have complete control over that, unlike structs which can pack to an amount you might not know.

The Offset Only Ribble

Originally when creating the ribble, I planned to have a uniform size for every element. That doesn’t seem like a good idea now, especially because it relies on the compiler to calculate the constants at compile time, otherwise a significant speed penalty will occur. I have made an offset only version of the two ribble commands so you can fine tune exactly what you want to access.

    #define ribble_get(storage, size, offset) (((storage) >> (offset)) & ((1 << (size)) - 1))

    #define ribble_set(storage, size, offset, set) ((storage) = ((storage) & ~(((1 << (size)) - 1) << (offset))) | ((set) << (offset)))

Benchmarks

I have created benchmarks for the three methods: Bitfield and Ribble. The source code and results follow. I do know there is a bias against bitfield.c (when we use a conditional) here, because if you really wanted to access the nth element you would either use a switch or an if.

Be sure to pass an argument. Each of these programs take an argument of which nibble to set [0 or 1]. You should be sure not to compile with optimization as the compiler will remove the dead code (in a real world application it would work fine).

ribble.c:

#define ribble_set(storage, rept, size, offset, set) (storage) = ((storage) & ~(((1 << (size)) - 1) << ((rept) * (size) + (offset)))) | ((set) << ((rept) * (size) + (offset)))
#define ribble_get(storage, rept, size, offset) ((storage) >> ((rept) * (size) + (offset))) & ((1 << (size)) - 1)

#include <stdlib.h>

int main(int argc, char **argv) {
    unsigned char t = 0;
    int which = iota(argv[1]);
    for (loop_count = 0; loop_count <= 1000000000; loop_count++) {
        ribble_set(t, which, 4, 0, 5);
        ribble_get(t, which, 4, 0);
    }
    return 0;
}

bitfield.c:

#include <stdlib.h>

struct t {
    unsigned char s : 4;
    unsigned char r : 4;
};

int main(int argc, char **argv) {
    struct t t = {.s = 4, .r = 15};
    int which = iota(argv[1]);
    unsigned long long loop_count;

    for (loop_count = 0; loop_count <= 1000000000; loop_count++) {
        if (which == 0) {
            t.s = 5;
            t.s;
        } else {
            t.r = 5;
            t.r;
        }
    }
    return 0;
}

offset_only_ribble.c:

#define ribble_get(storage, size, offset) ((storage) >> (offset)) & ((1 << (size)) - 1)

#define ribble_set(storage, size, offset, set) (storage) = ((storage) & ~(((1 << (size)) - 1) << (offset))) | ((set) << (offset))

#include <stdlib.h>

int main(int argc, char **argv) {
    unsigned char t = 0;
    int which = atoi(argv[1]) * 4;
    unsigned long long loop_count;

    for (loop_count = 0; loop_count <= 1000000000; loop_count++) {
        ribble_set(t, which, 4, 0, 5);
        ribble_get(t, 4, which);
    }
    return 0;
}

Benchmark Results

End results are accurate to the nearest millisecond. Tests were run 10 times each with no processes being created nor killed while the test was running.

ribble.c : 2932 milliseconds.
bitfield.c : 3243 milliseconds.
ribble_offset_only : 2924 milliseconds.

When using gcc, ribble_offset_only produces the same assembly output as ribble.c. This is because gcc pre-calculates the values. You would be hardpressed to find a compiler that doesn’t, but ribble_offset_only is included just in case.

Conclusion

The ribble is a very powerful tool that can be used for the micromanagement of data types. However, it’s a little confounded. Unless you really, desperately need the RAM or crave the micro-sized increase in speed, stay away from the ribble. It will do you more harm than good.

Known Bugs / Disadvantages / Errata

You can’t create a ribble to the size of the variable that contains it without getting an overflow warning / an overflow. It’s due to the (1 << size) - 1 code, if it shifts 9 slots (if the maximum is 8) you get 0b100000000 (9 slots) and then you go back to 0b1111111. This isn’t a real problem because if you want to use the variable size anyway, you shouldn’t be using a ribble.

I feel as if the ribble_set function could be made faster. I haven’t looked into it much but it seems to take a lot longer than ribble_get. If anybody has any idea on this, I would appreciate the feedback.

A previous version of this post used 0b# for binary numbers. This worked in gcc but produced problems for other compilers.

The ribble is not thread safe.

I made the article (hopefully) easier to read.

I don’t know any other bugs, or problems, with the ribble. Corrections are welcome and encouraged.

To Recap / tl;dr

    #define ribble_get(storage, size, offset) (((storage) >> (offset)) & ((1 << (size)) - 1))

    #define ribble_set(storage, size, offset, set) ((storage) = ((storage) & ~(((1 << (size)) - 1) << (offset))) | ((set) << (offset)))

ribble_get: read the bits in storage, with a size of size and an offset of offset

ribble_set: set the bits in storage to the bits that are set in set, with a size of size, an offset of offset.

Explore posts in the same categories: Programming

Tags: ,

You can comment below, or link to this permanent URL from your own site.

One Comment on “Arbitrary Data Sizes with C”