I've heard the kids here opine that compilers are so good these days that assembly makes no sense.
There's lots to be said for high-level languages. But I disassembled some Arduino code--here's the output for the built-in function that writes a port pin:
00001006 :
    1006:  90 e0        ldi   r25, 0x00     ; 0
    1008:  fc 01        movw  r30, r24
    100a:  e3 52        subi  r30, 0x23     ; 35
    100c:  fd 4f        sbci  r31, 0xFD     ; 253
    100e:  34 91        lpm   r19, Z
    1010:  fc 01        movw  r30, r24
    1012:  e7 53        subi  r30, 0x37     ; 55
    1014:  fd 4f        sbci  r31, 0xFD     ; 253
    1016:  24 91        lpm   r18, Z
    1018:  fc 01        movw  r30, r24
    101a:  eb 54        subi  r30, 0x4B     ; 75
    101c:  fd 4f        sbci  r31, 0xFD     ; 253
    101e:  e4 91        lpm   r30, Z
    1020:  ee 23        and   r30, r30
    1022:  09 f4        brne  .+2           ; 0x1026
    1024:  3b c0        rjmp  .+118         ; 0x109c
    1026:  33 23        and   r19, r19
    1028:  39 f1        breq  .+78          ; 0x1078
    102a:  33 30        cpi   r19, 0x03     ; 3
    102c:  91 f0        breq  .+36          ; 0x1052
    102e:  38 f4        brcc  .+14          ; 0x103e
    1030:  31 30        cpi   r19, 0x01     ; 1
    1032:  a9 f0        breq  .+42          ; 0x105e
    1034:  32 30        cpi   r19, 0x02     ; 2
    1036:  01 f5        brne  .+64          ; 0x1078
    1038:  84 b5        in    r24, 0x24     ; 36
    103a:  8f 7d        andi  r24, 0xDF     ; 223
    103c:  12 c0        rjmp  .+36          ; 0x1062
    103e:  37 30        cpi   r19, 0x07     ; 7
    1040:  91 f0        breq  .+36          ; 0x1066
    1042:  38 30        cpi   r19, 0x08     ; 8
    1044:  a1 f0        breq  .+40          ; 0x106e
    1046:  34 30        cpi   r19, 0x04     ; 4
    1048:  b9 f4        brne  .+46          ; 0x1078
    104a:  80 91 80 00  lds   r24, 0x0080   ; 0x800080
    104e:  8f 7d        andi  r24, 0xDF     ; 223
    1050:  03 c0        rjmp  .+6           ; 0x1058
    1052:  80 91 80 00  lds   r24, 0x0080   ; 0x800080
    1056:  8f 77        andi  r24, 0x7F     ; 127
    1058:  80 93 80 00  sts   0x0080, r24   ; 0x800080
    105c:  0d c0        rjmp  .+26          ; 0x1078
    105e:  84 b5        in    r24, 0x24     ; 36
    1060:  8f 77        andi  r24, 0x7F     ; 127
    1062:  84 bd        out   0x24, r24     ; 36
    1064:  09 c0        rjmp  .+18          ; 0x1078
    1066:  80 91 b0 00  lds   r24, 0x00B0   ; 0x8000b0
    106a:  8f 77        andi  r24, 0x7F     ; 127
    106c:  03 c0        rjmp  .+6           ; 0x1074
    106e:  80 91 b0 00  lds   r24, 0x00B0   ; 0x8000b0
    1072:  8f 7d        andi  r24, 0xDF     ; 223
    1074:  80 93 b0 00  sts   0x00B0, r24   ; 0x8000b0
    1078:  f0 e0        ldi   r31, 0x00     ; 0
    107a:  ee 0f        add   r30, r30
    107c:  ff 1f        adc   r31, r31
    107e:  e5 55        subi  r30, 0x55     ; 85
    1080:  fd 4f        sbci  r31, 0xFD     ; 253
    1082:  a5 91        lpm   r26, Z+
    1084:  b4 91        lpm   r27, Z
    1086:  8f b7        in    r24, 0x3f     ; 63
    1088:  f8 94        cli
    108a:  ec 91        ld    r30, X
    108c:  61 11        cpse  r22, r1
    108e:  03 c0        rjmp  .+6           ; 0x1096
    1090:  20 95        com   r18
    1092:  2e 23        and   r18, r30
    1094:  01 c0        rjmp  .+2           ; 0x1098
    1096:  2e 2b        or    r18, r30
    1098:  2c 93        st    X, r18
    109a:  8f bf        out   0x3f, r24     ; 63
    109c:  08 95        ret

60 lines.
Assembly takes one or two lines, and is proportionally faster.
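For anyone who wants the same effect without leaving C++, here's a rough sketch of what a direct port write boils down to. It runs on a PC, so the register is faked: `fake_port` and `write_pin` are names I made up for illustration, and on a real AVR you'd use the actual port register from `<avr/io.h>` instead. The set and clear branches correspond to the `or` and `com`/`and` pair at the tail of the listing above.

```cpp
#include <cstdint>

// Hypothetical stand-in for a memory-mapped port register.
// On real hardware this would be something like PORTB from <avr/io.h>.
static volatile std::uint8_t fake_port = 0;

// Set or clear a single bit of the port, leaving the other bits alone.
// 'bit' is the bit position within the port (0..7), not an Arduino pin number.
inline void write_pin(volatile std::uint8_t &port, std::uint8_t bit, bool high) {
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << bit);
    if (high) {
        port = static_cast<std::uint8_t>(port | mask);   // like "or r18, r30"
    } else {
        port = static_cast<std::uint8_t>(port & ~mask);  // like "com r18; and r18, r30"
    }
}
```

When the port and bit are known at compile time and the register sits in the low I/O space, avr-gcc will typically boil a write like this down to a single `sbi` or `cbi`. The built-in function can't do that, because it looks the pin up in flash tables at run time and turns off any PWM timer attached to it first.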
Meanwhile, I just spent three days tracking down a compiler error: the compiler insisted (0xff & 1) = 254. It was no fun to track down, and in the end it took a workaround.
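For what it's worth, this is the sort of sanity check that can catch a result like that early. It is only a sketch, not the workaround I ended up with: `and_at_runtime` is a made-up helper, and the `volatile` copies are there on the assumption that you want to keep the compiler from folding the whole expression at build time, so the compile-time and run-time answers can be compared.

```cpp
#include <cstdint>

// Compile-time check: the constant folder must agree that 0xff & 1 is 1.
static_assert((0xff & 1) == 1, "constant-folded (0xff & 1) is not 1");

// Hypothetical helper: do the same AND at run time. The volatile copies
// force real loads, so the AND cannot be precomputed by the compiler.
inline std::uint8_t and_at_runtime(std::uint8_t a, std::uint8_t b) {
    volatile std::uint8_t va = a;
    volatile std::uint8_t vb = b;
    return static_cast<std::uint8_t>(va & vb);
}
```

If the static_assert passes but `and_at_runtime(0xff, 1)` comes back as anything other than 1, the problem is in the generated code rather than in the constant folder.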
Cute system, horrible documentation. I'm afraid it sets a lot of kids off on the wrong foot, hacking instead of thinking.
YMMV.
Cheers, James Arthur