8051 Speed Optimization

I'm working on optimizing code for a C8051F120 that needs to run
extremely fast.  The following few lines are repeated over and over, as
fast as possible.  The looping is a straightforward djnz.  The bulk of
the CPU time is spent on the following lines.  This block is repeated
(cut and paste, with only the bit mask changed) 8 times per loop.

; for reference
rLEVEL           equ   R0
mDATAOUT         DATA  64

; the code
            row0:
                movx    A, @DPTR
                inc     DPTR
                subb    A, rLEVEL
                jc      row1
                orl     mDATAOUT, #00000001b
            row1:

At row1 the next set of those 5 lines executes.  Essentially, this
takes the data byte at @DPTR, compares it to the current value of
rLEVEL, and sets a bit in mDATAOUT if the byte is greater than or equal
to rLEVEL.  It does this for eight sequential bytes at DPTR, one for
each bit of the bitfield (#00000001b to #10000000b).

Can anyone see a way to optimize out some cycles from this process?
For reference, this is a chip doing video output.  Even single cycle
optimizations can be big, at the rate that this block is being iterated
over.

Thanks to anyone that can help.

Alex McHale


Re: 8051 Speed Optimization

How about putting mDATAOUT in the bit-addressable area
and then doing:

    subb A,rLEVEL
    mov  C,mDATAOUT.0
row1:

Note that the subb you are using depends on the state of Carry,
so your present code has some LSB strangeness :^)
(you could possibly use an add instruction instead and
then complement mDATAOUT after you have processed all 8 bits)
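
To make the carry dependency concrete, here is a minimal sketch of one row that clears Carry before the subtraction and then copies the borrow straight into the output bit. The address 20h is only an assumption for illustration; any free byte in the bit-addressable range 20h..2Fh should do (the original address 64 lies outside that range). Note that the resulting bit is inverted compared to the ORL version.

```
; Sketch only: mDATAOUT is moved to 20h (an assumed address) so that
; its bits can be addressed individually.
rLEVEL          equ   R0
mDATAOUT        DATA  20h

row0:
    movx    A, @DPTR        ; fetch the next data byte
    inc     DPTR
    clr     C               ; SUBB computes A - rLEVEL - C, so drop the old borrow
    subb    A, rLEVEL       ; C = 1 if data < rLEVEL
    mov     mDATAOUT.0, C   ; bit 0 = 1 when data is below the level (inverted)
row1:
```

Since every row writes its bit unconditionally, mDATAOUT does not need to be cleared before each pass in this variant.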

Greetings,

Frieder

Re: 8051 Speed Optimization

Ooops, I meant to write:
       mov  mDATAOUT.0,C

Re: 8051 Speed Optimization
snipped-for-privacy@gmail.com wrote:

You could also use CJNE for the comparison - it also sets carry, but is
non-destructive, so you don't need to reload A from XRAM for each test:

    movx    a,@dptr
    cjne    a,ar0,$+3
    mov    mDataOut.0,c
    cjne    a,...
    mov    ...,c

Like Frieder's approach, this one also resets the output bit if the
data value is lower than the threshold. If you want latching
operation, you need to use conditional jumps again or take the
previous state into account:

    movx    a,@dptr
    cjne    a,ar0,$+3
    orl    c,mDataOut.0
    mov    mDataOut.0,c
    cjne    a,...
    orl    c,...
    mov    ...,c

Then it's only slightly faster than your original code; however, the time
is fixed (independent of the data), which might also be an advantage.
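
If each row has to fetch its own byte, as in the original question, the CJNE variant might look roughly like the sketch below. It assumes register bank 0 is active, so that AR0 is the direct address of R0 (in A51 this is declared with USING 0), and that mDATAOUT sits in the bit-addressable range 20h..2Fh. CJNE ignores the incoming Carry, so no CLR C is needed.

```
; Sketch only: one row, non-latching, with a fresh byte per row.
row0:
    movx    A, @DPTR        ; fetch the next data byte
    inc     DPTR
    cjne    A, AR0, $+3     ; 3-byte instruction, so $+3 is the next one either way
                            ; C = 1 if A < R0, C = 0 otherwise
    mov     mDATAOUT.0, C   ; bit 0 = 1 when data is below the level
row1:
```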

--
Dipl.-Ing. Tilmann Reh
http://www.autometer.de - Elektronik nach Maß

Re: 8051 Speed Optimization
Tilmann Reh wrote:

I forgot to mention: The polarity of the output bits is opposed to that
of your code. You'd have to negate the data before comparison or CPL the
output bits afterwards (if you can't switch the polarity otherwise).
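
As a rough sketch of the two ways to flip the polarity (assuming again that mDATAOUT is bit-addressable and AR0 refers to R0 in the active bank): either complement Carry before storing each bit, or invert the whole byte once after all eight bits are done. The second form only makes sense for the non-latching version.

```
; Option 1: fix the polarity per bit
    cjne    A, AR0, $+3     ; C = 1 if A < R0
    cpl     C               ; now C = 1 if A >= R0
    mov     mDATAOUT.0, C

; Option 2: store the inverted bits, then flip all of them at once
    xrl     mDATAOUT, #0FFh ; executed once, after row7
```

Option 1 adds an instruction to every row, while option 2 adds a single instruction per group of eight rows.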

--
Dipl.-Ing. Tilmann Reh
http://www.autometer.de - Elektronik nach Maß
