Unions and Structures

- Seeps

Wed, Jan 7, 2004 12:46 AM

I have a 100 byte long packet that is to be transmitted. I want to use it both as a linear array and I want to reference individual bits and bytes that have special meaning. Something like this...

    /* 32bit machine */
    typedef union PACKET_U {
        struct LABEL_S {
            unsigned char dst_addr;
            unsigned char src_addr;

            unsigned char data1 : 1;
            unsigned char dummy : 4;
            signed char   data2 : 3;

            unsigned char data3 : 4;
            signed char   data4 : 4;

            unsigned int  data5;

            unsigned char data6[95];
        } label;

        unsigned char linear_array[100];
    } packet;

Can I mix bitfields and other types in the same scope (LABEL_S) ?

Is there a better way of doing this?

I am aware of the issues with read-modify-write, and atomic operations, etc.

Thanks.

- Steve at fivetrees

Wed, Jan 7, 2004 1:10 AM

Personally I wouldn't depend on union element alignment - I'd write a state machine to parse the packet byte by byte and assign values to a structure. I'd do the same whether the packet came from comms or a file. Same applies to bitfields - I've wound up distrusting 'em... much prefer to use explicit masking.
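A minimal sketch of the explicit-masking approach, working directly on the byte array instead of through bit-fields. The bit positions are an assumption for illustration only (data1 in bit 7, four unused bits, data2 as a 3-bit two's complement value in bits 2..0 of byte 2); the thread never states the real wire layout, and the helper names are hypothetical.

```c
/* Hypothetical layout of packet byte 2 (not specified in the thread):
 *   bit 7     : data1 (1-bit flag)
 *   bits 6..3 : unused
 *   bits 2..0 : data2 (3-bit two's complement value)
 */
#define DATA1_MASK 0x80u
#define DATA2_MASK 0x07u
#define DATA2_SIGN 0x04u

/* Return the 1-bit flag as 0 or 1. */
static unsigned get_data1(const unsigned char *pkt)
{
    return (pkt[2] & DATA1_MASK) ? 1u : 0u;
}

/* Return the 3-bit signed field as an int in the range -4..3. */
static int get_data2(const unsigned char *pkt)
{
    int v = (int)(pkt[2] & DATA2_MASK);
    /* Sign-extend by hand: raw values 4..7 map to -4..-1. */
    return (v & DATA2_SIGN) ? v - 8 : v;
}

/* Store the low 3 bits of value, leaving the other bits of the byte alone. */
static void set_data2(unsigned char *pkt, int value)
{
    pkt[2] = (unsigned char)((pkt[2] & ~DATA2_MASK) |
                             ((unsigned)value & DATA2_MASK));
}
```

Because every bit position is written out, the layout no longer depends on how a particular compiler orders bit-fields.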

More generally: beware of passing raw binary data with higher-level structure superimposed - what happens when one end is big-endian and the other end isn't?

YMMV.

Steve


- Zevel

Wed, Jan 7, 2004 1:34 AM

In reply to "Steve at fivetrees":

Additionally:

1) Don't assume the size of type int. Use long for 4 byte and short for 2 byte scalar members.
2) Explicitly force the compiler to pack on byte boundaries. This is very important for union alignment. For example:

    typedef struct {
        unsigned char byteValue;
        unsigned long longValue;
    } t_struct;

If the compiler packs on 4 byte boundaries (and long is 4 bytes), the size of t_struct is 8 bytes: the compiler adds three bytes of padding between byteValue and longValue.

For gcc or VC++, use the following:

    #pragma pack(1)   // force packing on byte boundaries
    typedef struct {
        unsigned char byteValue;
        unsigned long longValue;
    } t_struct;
    #pragma pack()    // restore default packing

Here the size of t_struct is 5 bytes as desired.
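A small sketch for checking the effect of packing on a given compiler. It assumes a compiler that accepts #pragma pack(1) and #pragma pack() as used above, and it swaps unsigned long for uint32_t, because long is 8 bytes on many 64-bit targets and the 5-byte figure would not hold there. The type names t_default and t_packed are made up for the comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may insert padding after byteValue. */
typedef struct {
    unsigned char byteValue;
    uint32_t      longValue;
} t_default;

#pragma pack(1)   /* pack on byte boundaries */
typedef struct {
    unsigned char byteValue;
    uint32_t      longValue;
} t_packed;
#pragma pack()    /* restore default packing */

/* Sizes and offsets as the compiler actually laid them out. */
static const size_t default_size  = sizeof(t_default);
static const size_t packed_size   = sizeof(t_packed);
static const size_t packed_offset = offsetof(t_packed, longValue);
```

With packing in effect, packed_size is 5 and packed_offset is 1. On some architectures an unaligned access through a pointer to a packed member can be slow or can fault, so this is worth checking per target.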

-Zevel

- CBFalconer

Wed, Jan 7, 2004 8:00 AM

The bit-fields should be ints. They may not pack the way you expect if CHAR_BIT is larger than 8, so you are better off specifying more unsigned chars and some extraction masks. Also, there is no guarantee on the packing order, so again use masks.

Again, data5 is sensitive both to sizeof(int) and to endianness. Better to express it as a two char array. Then the value can portably be: 256 * data5[0] + data5[1]; You may also need to mask the individual bytes off to 8 bits.
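The formula above can be sketched as a pair of helpers. This assumes the most significant byte comes first on the wire, as the formula implies; the function names are hypothetical.

```c
/* Read a 16-bit value stored most-significant-byte first.
 * The & 0xFFu masks keep only 8 bits per byte, which matters
 * on targets where CHAR_BIT is larger than 8. */
static unsigned get_u16_be(const unsigned char *p)
{
    return 256u * (p[0] & 0xFFu) + (p[1] & 0xFFu);
}

/* Store the low 16 bits of value, most significant byte first. */
static void put_u16_be(unsigned char *p, unsigned value)
{
    p[0] = (unsigned char)((value >> 8) & 0xFFu);
    p[1] = (unsigned char)(value & 0xFFu);
}
```

Neither function depends on the host's byte order or on sizeof(int), since unsigned is guaranteed to hold at least 16 bits.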

If you really have a 32 bit machine, with 32 bit ints, your structure formed a 103 byte object already.

--
Chuck F

- George Neuner

Wed, Jan 7, 2004 8:45 AM

Just further emphasis for what Steve and Zevel said ...

Bit field alignment and msb .. lsb assignment within bit fields are completely compiler dependent - there is no standard and you cannot rely on them even using the same brand of compiler across different architectures.

You have to be aware of packing, alignment and CPU endianness to use structured binary data across architectures. If you have mixed architectures, you should define a standard external "network" representation for message data and have each end convert the data appropriately on sending or receiving. If most of the messages go one way (like a data logger) you can save by defining the sending machine's representation as the standard and doing most data conversions in the receiving machine.
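A minimal sketch of converting between an in-memory structure and a fixed external representation. The message layout (two address bytes followed by a 32-bit value, most significant byte first) and all names here are assumptions for illustration, not something specified in the thread.

```c
#include <stdint.h>

enum { MSG_WIRE_SIZE = 6 };   /* hypothetical: 1 + 1 + 4 bytes */

/* In-memory form: whatever layout the local compiler prefers. */
typedef struct {
    uint8_t  dst_addr;
    uint8_t  src_addr;
    uint32_t data5;
} msg_t;

/* Host structure -> wire bytes (big-endian by assumption). */
static void msg_encode(const msg_t *m, uint8_t out[MSG_WIRE_SIZE])
{
    out[0] = m->dst_addr;
    out[1] = m->src_addr;
    out[2] = (uint8_t)(m->data5 >> 24);
    out[3] = (uint8_t)(m->data5 >> 16);
    out[4] = (uint8_t)(m->data5 >> 8);
    out[5] = (uint8_t)(m->data5);
}

/* Wire bytes -> host structure. */
static void msg_decode(msg_t *m, const uint8_t in[MSG_WIRE_SIZE])
{
    m->dst_addr = in[0];
    m->src_addr = in[1];
    m->data5 = ((uint32_t)in[2] << 24) | ((uint32_t)in[3] << 16) |
               ((uint32_t)in[4] << 8)  |  (uint32_t)in[5];
}
```

Only the two conversion functions know the wire format, so struct padding and host byte order never reach the other end.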

And be very careful about sending floating point data as binary. You didn't mention what CPUs you are using, but don't assume they use the same format for floating point data - not all hardware uses IEEE formats and software FP emulations may do just about anything.

George


- Simon Hosie

Thu, Jan 8, 2004 9:43 AM

There is no hope.

If you're trying to write portable code then you'll fail somehow, somewhere, no matter what you try. Even once you get past the portability question, everybody has different opinions on what's ugly and what's neat, and what's safe and what's unsafe, and what's readable and what's unreadable.

When implementing a protocol someone else specified for some other machine, I prefer to just convert the data to bytes by pushing things around at the last minute. Some people hate that, because there's a risk that you'll re-order your code and then the bytes will go out in the wrong order.

Others define an array and #define or enumerate all the indices into that array. I'd sooner make a struct where all the elements were exactly the same type, but whatever.
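A sketch of the enumerated-index style. The offsets are assumptions (address bytes at 0 and 1, two bytes of packed flags, a 4-byte data5, and the rest as payload); the names are hypothetical.

```c
/* Byte offsets into the 100-byte packet (assumed layout). */
enum {
    PKT_DST_ADDR = 0,
    PKT_SRC_ADDR = 1,
    PKT_FLAGS    = 2,    /* data1 / data2 bits */
    PKT_NIBBLES  = 3,    /* data3 / data4 nibbles */
    PKT_DATA5    = 4,    /* 4 bytes */
    PKT_DATA6    = 8,    /* payload up to the end of the packet */
    PKT_SIZE     = 100
};

/* Fill in the two address bytes by named index. */
static void set_addresses(unsigned char *pkt,
                          unsigned char dst, unsigned char src)
{
    pkt[PKT_DST_ADDR] = dst;
    pkt[PKT_SRC_ADDR] = src;
}
```

The packet stays a plain unsigned char array, so it can be transmitted as is, and the enum documents the layout in one place.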

Some people try to make structs with heterogeneous elements, and do lots of pre-processor work to adjust the structure so that some trivial function can re-order the bytes as they're transmitted if the endianness is wrong. I've never seen that work coherently, but that's just my opinion.

You don't have to give names to your filler bitfields, by the way.
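For instance, the "dummy" field from the original post could be left unnamed. This sketch uses int-based bit-fields; the struct name is hypothetical, and the actual bit placement remains compiler dependent.

```c
struct flags {
    unsigned int data1 : 1;
    unsigned int       : 4;   /* unnamed filler: reserves 4 bits, cannot be accessed */
    signed int   data2 : 3;
};
```

Unnamed bit-fields are skipped by initializers, so `struct flags f = {1, -2};` sets data1 and data2 only.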

- Bill Davy

Thu, Jan 8, 2004 10:50 AM

I think you have to do this, and the fact that you can dates back to the use of C as a SIL (systems implementation language) on the PDP-11 and similar machines. Others have given some sound advice, but I'd add the following.

1) Typedef in a platform dependent header S8, U8, I8, S16, U16, I16 (etc) and use them.
2) Include somewhere (a function called Preconditions() called at the start of main() perhaps):

    {
        packet a_packet;
        assert( sizeof(a_packet.linear_array) == sizeof(a_packet.label) );
    }

3) "packet" is such a nice name for a packet and not a very good name for a packet type. You might use packet_t (or PacketT) for the type?
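A sketch combining the three suggestions, under the assumption that the bit-fields and data5 are folded into plain bytes so the structure can actually be 100 bytes long. The member layout and the packet_size_check typedef are illustrative additions, not part of the original post.

```c
#include <assert.h>

typedef unsigned char U8;   /* would live in a platform dependent header */

typedef union {
    struct {
        U8 dst_addr;
        U8 src_addr;
        U8 flags;           /* hypothetical: data1/data2 bits */
        U8 nibbles;         /* hypothetical: data3/data4 nibbles */
        U8 data5[4];
        U8 data6[92];
    } label;
    U8 linear_array[100];
} packet_t;

/* C89-compatible compile-time check: the array size becomes -1,
 * which is a compile error, if the two views differ in size. */
typedef char packet_size_check[
    (sizeof(((packet_t *)0)->label) ==
     sizeof(((packet_t *)0)->linear_array)) ? 1 : -1];

/* Run-time version, to be called at the start of main(). */
static void Preconditions(void)
{
    packet_t a_packet;
    assert(sizeof(a_packet.linear_array) == sizeof(a_packet.label));
    assert(sizeof(a_packet) == 100);
}
```

The run-time assert catches a mismatch on the first run; the typedef trick catches it at build time, at the cost of a less readable error message.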

- Dave Hansen

Thu, Jan 8, 2004 5:37 PM

Re: "Use long for 4 byte and short for 2 byte scalar members."

If you need exact-width types, use the ones the (new) standard provides: e.g. int32_t, uint8_t, etc. If you want the smallest types with at least the required number of bits, use int_least32_t instead. If you want the "fastest" type with at least the required number of bits, use int_fast32_t.

To use these you need to #include <stdint.h>. If your compiler doesn't have a stdint.h, make one of your own that defines the stuff you need.
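A sketch of such a home-made fallback, assuming a project that needs only three unsigned types. It includes the real header when the compiler reports C99 or later, and otherwise picks types by checking <limits.h>; which types to define is a per-project choice.

```c
#include <limits.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>             /* C99 or later: use the standard header */
#else
/* Pre-C99 fallback: define only what this project needs. */
#if UCHAR_MAX == 0xFF
typedef unsigned char uint8_t;
#else
#error "no 8-bit unsigned type on this target"
#endif

#if USHRT_MAX == 0xFFFF
typedef unsigned short uint16_t;
#else
#error "no 16-bit unsigned type on this target"
#endif

#if UINT_MAX == 0xFFFFFFFFUL
typedef unsigned int uint32_t;
#elif ULONG_MAX == 0xFFFFFFFFUL
typedef unsigned long uint32_t;
#else
#error "no 32-bit unsigned type on this target"
#endif
#endif
```

The #error branches make the build fail loudly on a target that lacks an exact-width type, rather than silently substituting a wider one.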

Regards,

-=Dave

--
Change is inevitable, progress is not.

- Not Really Me

Fri, Jan 9, 2004 3:38 PM

...snip...

What new standard?

--
Scott
ExoTech R&D, Inc.

- Alan Balmer

Fri, Jan 9, 2004 5:22 PM

C99.

--
Al Balmer
Balmer Consulting

- Dave Hansen

Fri, Jan 9, 2004 8:04 PM

[...]

ISO/IEC 9899:1999, better known as C99, is the first to define these types. I guess a four-year-old standard isn't exactly "new," but most compilers claim compliance with the previous standard (if at all).

Regards,

-=Dave

--
Change is inevitable, progress is not.