I'm trying to pack some 16-bit values for output after doing some work on them. They currently sit in two registers as r7 = v0|v4 and r10 = v1|v5 (high halfword | low halfword). Each value needs an arithmetic shift right by 3, either before or after repacking. The output words should be v1|v0 and v5|v4 (little-endian architecture).

I managed to get v1|v0 written reasonably efficiently:

    mov   r8, r10, asr #3        ; v1>>3 | xxx
    pkhtb r8, r8, r7, asr #19    ; v1>>3 | v0>>3
    str   r8, [r0], r2           ; o1|o0, post-increment
But v5|v4 is uglier, because those values sit in the low halfwords: right-shifting the whole register is going to drag the bottom bits of the upper halfword into the top of the lower one (right?). Right now I'm sign-extending each value and then writing the halfwords individually:

    sxth  r10, r10               ; isolate v5, sign-extended
    mov   r8, r10, asr #3        ; v5 >> 3
    strh  r8, [r0, #2]           ; o5
    sxth  r7, r7                 ; isolate v4, sign-extended
    mov   r8, r7, asr #3         ; v4 >> 3
    strh  r8, [r0], r2           ; o4, post-increment
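To pin down the result I'm after, here is a plain C model of the repacking. It is only a sketch of the intended semantics, not a proposed implementation: the helper names `pack`, `asr3` and `repack` are made up for illustration, and it assumes that `>>` on a negative signed value is an arithmetic shift and that out-of-range conversions to `int16_t` wrap (both implementation-defined in C, but true for the usual ARM compilers).

```c
#include <stdint.h>

/* Hypothetical helper: build a register image "hi|lo" from two signed halfwords. */
static uint32_t pack(int16_t hi, int16_t lo)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}

/* Arithmetic shift right by 3 of one signed halfword.
 * Assumes >> on a negative value is an arithmetic shift. */
static int16_t asr3(int16_t v)
{
    return (int16_t)(v >> 3);
}

/* Inputs:  r7 = v0|v4, r10 = v1|v5 (high halfword | low halfword).
 * Outputs: *o10 = (v1>>3)|(v0>>3), *o54 = (v5>>3)|(v4>>3). */
static void repack(uint32_t r7, uint32_t r10, uint32_t *o10, uint32_t *o54)
{
    int16_t v0 = (int16_t)(r7 >> 16);
    int16_t v4 = (int16_t)(r7 & 0xFFFFu);
    int16_t v1 = (int16_t)(r10 >> 16);
    int16_t v5 = (int16_t)(r10 & 0xFFFFu);

    *o10 = pack(asr3(v1), asr3(v0));
    *o54 = pack(asr3(v5), asr3(v4));
}
```

For example, with v0 = 80, v4 = -16, v1 = 24, v5 = -8 the inputs are r7 = 0x0050FFF0 and r10 = 0x0018FFF8, and the expected outputs are 0x0003000A and 0xFFFFFFFE. A plain `asr #3` on that r10 gives 0x00031FFF, so the low halfword would be 0x1FFF rather than 0xFFFF, which is the contamination from the upper halfword that I'm worried about.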
I found
Is there a better way to do this?