Array of bits

pozz 5 years ago

I often need to pass an array of bits to functions, or to serialize one. If there are just a few bits (8-16), I use a standard array with one element per bit:

    uint8_t my_short_array_of_bits[16];
    void abits_set_bit(uint8_t *array, size_t size, uint8_t nbit);
    uint8_t abits_get_bit(uint8_t *array, size_t size, uint8_t nbit);

Sometimes the number of bits is much greater, maybe 100 or 200. If I were developing on a desktop/server machine, I would keep using an array of uint8_t, but I'm developing on embedded platforms with limited resources.

So I started using uint32_t for 32 bits. It is very comfortable to test, set and clear a bit. I often need to set the N lowest bits or get the lowest bit set (__builtin_ctz).
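A minimal sketch of the single-word operations mentioned here, assuming bit 0 is the least significant bit. The bits32_* names are hypothetical, and the lowest-set-bit search uses a portable loop rather than __builtin_ctz, which is a GCC/Clang builtin whose result is undefined for 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; bit 0 is the least significant bit. */
static inline uint32_t bits32_set(uint32_t v, unsigned n)
{
    assert(n < 32);
    return v | (UINT32_C(1) << n);
}

static inline uint32_t bits32_clear(uint32_t v, unsigned n)
{
    assert(n < 32);
    return v & ~(UINT32_C(1) << n);
}

static inline int bits32_test(uint32_t v, unsigned n)
{
    assert(n < 32);
    return (int)((v >> n) & 1u);
}

/* Mask with the n lowest bits set. n == 32 is handled separately because
   shifting a 32-bit value by 32 is undefined behaviour in C. */
static inline uint32_t bits32_low_mask(unsigned n)
{
    assert(n <= 32);
    return (n == 32) ? UINT32_MAX : ((UINT32_C(1) << n) - 1u);
}

/* Index of the lowest set bit, or -1 when no bit is set. */
static inline int bits32_lowest_set(uint32_t v)
{
    int n = 0;
    if (v == 0) {
        return -1;
    }
    while ((v & 1u) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}
```

The n == 32 special case is the kind of detail that tends to bite once the same idea is widened to uint64_t or to arrays.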

For a recent project, I needed to increase the number of bits to 60, so my first idea was to use uint64_t. However, I'm thinking of generalizing to a larger number of bits, because tomorrow someone will ask to increase the length again to 100 or 200.

Do you have some suggestions for this? Maybe in the same project I need to use an array of 100 bits and an array of 200 bits. I would prefer a typedef for both types, for example:

    typedef uint8_t inputs_t[100];
    uint8_t inputs_get_bit(inputs_t inputs, unsigned int nbit);
    void inputs_set_bit(...);
    void inputs_reset_bit(...);
    void inputs_toggle_bit(...);
    int inputs_ctz(inputs_t inputs);
    void inputs_set_low_bits(inputs_t inputs, unsigned int nbits);

    typedef uint8_t outputs_t[200];
    uint8_t outputs_get_bit(outputs_t outputs, unsigned int nbit);
    ...
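As a rough illustration of how inputs_ctz and inputs_set_low_bits might extend past one word, here is a sketch that packs 100 bits into uint32_t words. The struct wrapper, the INPUTS_* macros, and the choice that set_low_bits clears every other bit are assumptions made for the example, not something stated in the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INPUTS_NBITS 100u                         /* hypothetical size */
#define INPUTS_WORDS ((INPUTS_NBITS + 31u) / 32u) /* 4 words for 100 bits */

/* Wrapping the array in a struct gives a distinct type per bit array. */
typedef struct { uint32_t w[INPUTS_WORDS]; } inputs_t;

/* Index of the lowest set bit in the whole array, or -1 if none is set. */
static int inputs_ctz(const inputs_t *in)
{
    for (size_t i = 0; i < INPUTS_WORDS; i++) {
        uint32_t v = in->w[i];
        if (v != 0) {
            int n = 0;
            while ((v & 1u) == 0) {
                v >>= 1;
                n++;
            }
            return (int)(i * 32u) + n;
        }
    }
    return -1;
}

/* Set the nbits lowest bits and clear all the others (assumed semantics). */
static void inputs_set_low_bits(inputs_t *in, unsigned nbits)
{
    assert(nbits <= INPUTS_NBITS);
    memset(in->w, 0, sizeof in->w);

    size_t full = nbits / 32u;   /* words that become all ones */
    unsigned rest = nbits % 32u; /* bits left for the next word */

    for (size_t i = 0; i < full; i++) {
        in->w[i] = UINT32_MAX;
    }
    if (rest != 0) {
        in->w[full] = (UINT32_C(1) << rest) - 1u;
    }
}
```

With a plain array typedef, inputs_t and outputs_t both decay to uint8_t * when passed to a function, so the compiler cannot tell them apart; the struct wrapper is one way to keep them distinct.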

Maybe I can create a .h file with preprocessor macros that automatically define static inline functions for many types of arrays. Something similar to:

    #define ABITS_TYPE_NAME inputs
    #define ABITS_LENGTH 100
    #include "abits.h"

    --- abits.h ---
    #define ABITS_TYPE CONCAT(ABITS_TYPE_NAME, _t)
    typedef uint8_t ABITS_TYPE[ABITS_LENGTH];

    #define ABITS_GET_FUNC CONCAT(ABITS_TYPE_NAME, _get_bit)
    static inline uint8_t ABITS_GET_FUNC(ABITS_TYPE ABITS_TYPE_NAME, unsigned int nbits) { ... }

...

--- end of abits.h ---

I know I can try to write this .h file myself, but maybe there is some ready-to-use public domain code.
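One way such a generator could look, sketched as a single macro instead of the define-then-include header described above. ABITS_DEFINE and everything it generates are hypothetical names, and the sketch packs eight bits per byte rather than using one byte per bit.

```c
#include <assert.h>
#include <stdint.h>

/* ABITS_DEFINE(name, nbits) is a hypothetical macro: it generates a type
   name_t plus static inline accessors for nbits packed bits.
   Invoke it at file scope without a trailing semicolon. */
#define ABITS_DEFINE(name, nbits)                                          \
    typedef struct { uint8_t b[((nbits) + 7) / 8]; } name##_t;             \
    static inline int name##_get_bit(const name##_t *a, unsigned n)        \
    {                                                                      \
        assert(n < (unsigned)(nbits));                                     \
        return (a->b[n / 8] >> (n % 8)) & 1;                               \
    }                                                                      \
    static inline void name##_set_bit(name##_t *a, unsigned n)             \
    {                                                                      \
        assert(n < (unsigned)(nbits));                                     \
        a->b[n / 8] |= (uint8_t)(1u << (n % 8));                           \
    }                                                                      \
    static inline void name##_reset_bit(name##_t *a, unsigned n)           \
    {                                                                      \
        assert(n < (unsigned)(nbits));                                     \
        a->b[n / 8] &= (uint8_t)~(1u << (n % 8));                          \
    }                                                                      \
    static inline void name##_toggle_bit(name##_t *a, unsigned n)          \
    {                                                                      \
        assert(n < (unsigned)(nbits));                                     \
        a->b[n / 8] ^= (uint8_t)(1u << (n % 8));                           \
    }

ABITS_DEFINE(inputs, 100)
ABITS_DEFINE(outputs, 200)
```

Each invocation produces its own type, so passing an outputs_t * to inputs_get_bit is a compile-time diagnostic. The cost is one copy of every function per bit-array type.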

Don Y 5 years ago

Why would you want to allocate a BYTE (uint8_t) for EACH BIT?

8 bits would fit into:
    uint8_t my_short_array_of_bits[1];
while 16 would fit in:
    uint8_t my_short_array_of_bits[2];
generally:
    uint8_t my_short_array_of_bits[(NUMBITS+7)/8];

    uint8_t my_short_array_of_bits[(100+7)/8];
    uint8_t my_short_array_of_bits[(200+7)/8];

So?

Ideally, you would conditionally pick your underlying implementation (i.e., whether to use uint8 vs uint32 vs uint64) based on whatever the native hardware best supports.

Using a wider base type can be less efficient on targets that have narrower natural data sizes. E.g., support for a uint32 on an 8 bit processor may ADD code where a "more natural" 8 bit type would better fit with the instruction set/architecture.

E.g., I use BigRationals in some of my math. These are pairs of BigIntegers (which, in turn, are implemented as arrays of some convenient *base* data type wide enough to support the value(s) they are being asked to represent). So, if the value is currently 0x123456789, you'd need at least 33 bits to represent the value. On an architecture where bytes are the nominal data type, this would require an array of 5 bytes; on a 16bit architecture, it would require 3 words; on a 32bit architecture, 2 longs.
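The arithmetic in that example can be checked with a small sketch. The two function names are hypothetical; the rounding is the usual ceiling division.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent v (0 is treated as needing 0 bits). */
static unsigned bits_needed(uint64_t v)
{
    unsigned n = 0;
    while (v != 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Number of base-type elements, each unit_bits wide, needed for nbits. */
static unsigned units_needed(unsigned nbits, unsigned unit_bits)
{
    assert(unit_bits > 0);
    return (nbits + unit_bits - 1u) / unit_bits;
}
```

For 0x123456789 this gives 33 bits, which rounds up to 5 bytes, 3 16-bit words, or 2 32-bit words.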

In my case, the size of the array is hidden from the type; a BigInteger value of 0x123456789 would consume less resources than one having the value 0x112233445566778899 -- but, both would still be treated as "BigIntegers".

If you want to create stronger typing and create a type for each possible bit-array size, then you'll need to "templatize" the routines that process those data types as a bit_array_16_t is not type compatible with a bit_array_18_t -- even though the underlying implementations are strikingly similar.

OTOH, if you just treat them all as "bit_array_t" -- and assume responsibility for manually ensuring that you've specified the correct size for each argument processed, then you can share a parameterized (instead of templatized) implementation.

boB 5 years ago

Don, isn't this kind of like Bit Banding for an ARM Cortex-M3, 4, 7 etc ? Except I ~think~ that ARM uses more than 1 byte for each bit ?

I would prefer not to use 1 byte to represent 1 bit, but I have seen it done before for true/false, I think back in the 8 bit days
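For reference, a sketch of the bit-band address mapping as ARM documents it for the Cortex-M3/M4 SRAM region, where bit-banding is an optional feature: each bit of the 1 MB region is aliased to its own 32-bit word in a separate alias region. That costs address space, not extra RAM. The function name is hypothetical, and the code only computes addresses, so it runs on any host.

```c
#include <assert.h>
#include <stdint.h>

/* SRAM bit-band region and its alias region on Cortex-M3/M4. */
#define SRAM_BITBAND_BASE  UINT32_C(0x20000000)
#define SRAM_BITBAND_SIZE  UINT32_C(0x00100000) /* 1 MB */
#define SRAM_BITBAND_ALIAS UINT32_C(0x22000000)

/* Address of the alias word for bit `bit` (0..7) of the byte at `addr`. */
static uint32_t bitband_alias(uint32_t addr, unsigned bit)
{
    assert(bit < 8);
    assert(addr >= SRAM_BITBAND_BASE);
    assert(addr - SRAM_BITBAND_BASE < SRAM_BITBAND_SIZE);

    /* alias = alias_base + byte_offset * 32 + bit_number * 4 */
    return SRAM_BITBAND_ALIAS + (addr - SRAM_BITBAND_BASE) * 32u + bit * 4u;
}
```

Writing 0 or 1 to the alias word changes just that one bit of the underlying byte, which is why it resembles a "one element per bit" array without storing one.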

Don Y 5 years ago

Sure, you can use a long long to represent a bit -- if handling long longs is efficient (for your architecture) *and* if you have the resources.

But, you only *need* a single bit.

Some multiple-bit operations (e.g., set all bits to the right or left of a particular bit) are far more efficient when bits are packed into a "datum" (byte/word/long/etc.) -- you don't need to do multiple load/stores to access them at different addresses.

More than one bit set in a "variable" (useful for scanning keypads)?

    if ((variable - 1) & variable) {
        // multiple bits set
    } else {
        // at most one bit set (exactly one if variable != 0)
    }

By contrast, if each "bit" was an array element, you'd have to walk through the array -- the *entire* array (or, at least enough of it to conclusively rule out "only one bit")
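A small sketch of that test wrapped into functions, with the zero case split out, since (v - 1) & v is also 0 when no bit is set at all. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when two or more bits are set.
   v & (v - 1) clears the lowest set bit; anything left means more bits. */
static int has_multiple_bits(uint32_t v)
{
    return (v & (v - 1u)) != 0;
}

/* Nonzero when exactly one bit is set (e.g. exactly one key pressed). */
static int has_exactly_one_bit(uint32_t v)
{
    return v != 0 && (v & (v - 1u)) == 0;
}
```

For a keypad scan, has_exactly_one_bit separates "one key down" from "no key down", which the bare expression does not.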

Count number of set bits?

    int bitcount(uint32_t value) {
        value = value - ((value >> 1) & 0x55555555);
        value = (value & 0x33333333) + ((value >> 2) & 0x33333333);
        value = (value + (value >> 4)) & 0x0F0F0F0F;
        return (value * 0x01010101) >> (3 * 8);  // top byte holds the total
    }

[hopefully I didn't misremember any typos!]

constant time, regardless of number of set bits (no branches). And, *faster*/smaller than examining individual elements of an array.

Etc. (there are lots of clever tricks that can be applied when bits are *packed*)

The OP doesn't explain what he intends to do with these data. What other operations will be performed on them? *Between* them? merge_bit_arrays(array1, array2) intersect_bit_arrays(...) etc.
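To make the "between them" operations concrete, here is a sketch of word-wise union, intersection, and counting over packed arrays. The bits_* names and the choice of uint32_t words are assumptions; the count uses a simple loop that clears one set bit per iteration rather than the branch-free version above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dst = a | b, word by word; all three arrays hold nwords words. */
static void bits_union(uint32_t *dst, const uint32_t *a,
                       const uint32_t *b, size_t nwords)
{
    assert(dst && a && b);
    for (size_t i = 0; i < nwords; i++) {
        dst[i] = a[i] | b[i];
    }
}

/* dst = a & b, word by word. */
static void bits_intersect(uint32_t *dst, const uint32_t *a,
                           const uint32_t *b, size_t nwords)
{
    assert(dst && a && b);
    for (size_t i = 0; i < nwords; i++) {
        dst[i] = a[i] & b[i];
    }
}

/* Total number of set bits in the array. */
static unsigned bits_count(const uint32_t *a, size_t nwords)
{
    unsigned total = 0;
    assert(a);
    for (size_t i = 0; i < nwords; i++) {
        uint32_t v = a[i];
        while (v != 0) {
            v &= v - 1u; /* clear the lowest set bit */
            total++;
        }
    }
    return total;
}
```

With packed storage each loop iteration handles 32 bits at once; with one element per bit the same operations would touch every element individually.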

As always, horses for courses...

pozz 5 years ago

Because there are only a few bits, I don't lose much space, and loops for operations on the bits are very fast.

Of course, I make this decision when the bits are completely independent data and I'm not interested in classical "bit operations".

For example, when you have 10 boolean options for your software. Why should you pack those options in a bitmask if they are unrelated?

Yes, but my question was related to a "generic" approach that can be used if the bits are 32, 64 or 256. In the latter case, arrays are necessary.

Yes, something similar. I only asked if there was a public domain code with this kind of approach.

Yes, this is another solution.

Don Y 5 years ago

Why should you pack them into a bit_array_t -- if they are unrelated?

    Bool flagX;
    Bool flag3;
    Bool running;
    Bool OK;

Putting them into an array suggests that they are related to each other more than simply by the fact that they are of the same *type*.

You don't put all of your "integers" into a int_array_t, do you?

    int count;
    int books;
    int iteration;
    int bank_balance;

Whether or not an array is necessary depends on the size of the "base type" on which the array will be built. If your processor handles uint8_t's efficiently, you'd pack 8 bits into a uint8_t and then number_of_bits/8 of these into a bit_array_t. If your processor handles uint32_t's efficiently, you'd pack 32 of them into a uint32_t and then pack number_of_bits/32 of these into a bit_array_t.

Note that "handles efficiently" may not be the same as the native data size for your processor. E.g., a 32b CPU with an 8 bit data path would likely handle uint8_t's more efficiently than uint32_t's -- because it could reduce the array member access to a single "byte access" (memory cycle). OTOH, if the cost of accessing a uint32_t is the same (or lower!), then you'd opt for that as a base type.

Once you have selected a base type, the code is the same (except for adjustments in manifest constants related to the number of bits packed in each byte)

It would likely take longer to FIND than it would take to WRITE. Pseudocode:

    // note that this relies on BASE_T being the "expected" width
    // AND sizeof yielding that value!
    // if it is greater, then bits of memory get wasted

    #define BASE_T ... // uint8_t, uint16_t, uint32_t, etc.
    #define BASE_BITS (8 * sizeof(BASE_T))

    result_t test_bit(
        uint bit,        // [0,size)
        BASE_T *array,   // (size + BASE_BITS - 1) / BASE_BITS elements
        uint size        // (0,UINT_MAX]
    ) {
        ASSERT( size > 0 ); // makes no sense to have zero or fewer bits!

        // ensure the referenced bit resides within the structure
        if ( (bit < 0) || (bit >= size) ) {
            return ERROR;
        }

        if ( array[bit / BASE_BITS] & ((BASE_T)1 << (bit % BASE_BITS)) ) {
            return (1);
        } else {
            return (0);
        }
    }

This implementation counts bits "the normal way" (from LSb upward).

Pick different BASE_T's and see what the code looks like. Or, profile its execution on YOUR hardware to see if there are advantages to one BASE_T over others.
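For experimenting with different base types, a compilable sketch along the lines of the pseudocode could look like this. It is an assumption-laden variant, not the original listing: it returns int with -1 for an out-of-range bit instead of result_t/ERROR, uses the standard assert, and adds hypothetical set_bit and clear_bit companions plus a BASE_UNITS sizing macro.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#ifndef BASE_T
#define BASE_T uint32_t /* try uint8_t or uint16_t as well */
#endif

#define BASE_BITS        (CHAR_BIT * sizeof(BASE_T))
#define BASE_UNITS(size) (((size) + BASE_BITS - 1u) / BASE_BITS)

/* Returns 1 or 0 for the bit's value, or -1 if bit is out of range. */
static int test_bit(unsigned bit, const BASE_T *array, unsigned size)
{
    assert(size > 0);
    if (bit >= size) {
        return -1;
    }
    return (int)((array[bit / BASE_BITS] >> (bit % BASE_BITS)) & 1u);
}

/* Returns 0 on success, -1 if bit is out of range. */
static int set_bit(unsigned bit, BASE_T *array, unsigned size)
{
    assert(size > 0);
    if (bit >= size) {
        return -1;
    }
    array[bit / BASE_BITS] |= (BASE_T)((BASE_T)1 << (bit % BASE_BITS));
    return 0;
}

/* Returns 0 on success, -1 if bit is out of range. */
static int clear_bit(unsigned bit, BASE_T *array, unsigned size)
{
    assert(size > 0);
    if (bit >= size) {
        return -1;
    }
    array[bit / BASE_BITS] &= (BASE_T)~((BASE_T)1 << (bit % BASE_BITS));
    return 0;
}
```

Compiling with, say, -DBASE_T=uint8_t and comparing the generated code against the uint32_t build is a quick way to see which base type suits a given target.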

Note that the larger the BASE_T, the greater the chance for "wasted bits" (e.g., if you only need 5 bits, then using a ulonglong will "cost" you considerably more than you *need*!)

By comparison, an implementation that uses uint8_t's for each "bit" would likely look like:

    result_t test_bit(
        uint bit,        // [0,size)
        BASE_T *array,   // size elements
        uint size        // (0,UINT_MAX]
    ) {
        ASSERT( size > 0 ); // makes no sense to have zero or fewer bits!

        // ensure the referenced bit resides within the structure
        if ( (bit < 0) || (bit >= size) ) {
            return ERROR;
        }

        if ( array[bit] ) {
            return (1);
        } else {
            return (0);
        }
    }

i.e., once you start considering the overhead of error checking, there's little performance gain.

As implemented, above. But, now the compiler can't tell you if you've used the wrong type in a particular line of code.

Time for breakfast (which most people call "lunch"!). Sunday... The Pork Dish!

Don Y 5 years ago

Note that many compilers will flag the "bit < 0" test as unnecessary (because I've chosen to define it as unsigned -- it can't take on a negative value!). However, I include this as it guards against someone deciding to change bit's data type to signed and introducing a latent error.

Don Y 5 years ago

(And, no, I'm not going to make the other changes!)
