Motorola 68000 (m68k) Optimizations

Overview

This page contains a number of simple tricks and speed optimizations for the Motorola 68000 (m68k) chip as used in the Commodore Amiga.

I compiled this list in 2016 while relearning Amiga assembler programming after a 25-year gap. I wish I’d had this information in the 90s – I didn’t even know about moveq for the longest time!

Most of the information was found whilst trawling through the English Amiga Board coding forums and this amazing article from BYTE magazine.

Note: The information here is specifically targeted at the Amiga and base 68000 – some of these optimizations won’t be effective on a 68020 or later.

Basic Optimizations

Here are some basic optimizations. Generally these are drop-in replacements that can be used without much thought.

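Unoptimized:
    clr.l d0
Optimized:
    moveq.l #0,d0
Notes: moveq is faster than clr. moveq also sign-extends its 8-bit value to 32 bits, so although the value is limited to -128 to +127, it is very useful for ensuring the high 16 bits are cleared.

Unoptimized:
    movea.l #0,a0
    or
    sub.l a0,a0
Optimized:
    moveq.l #0,d0
    move.l d0,a0
Notes: On a plain 68000 the moveq/move pair takes 8 cycles against 12 for movea.l #0,a0. sub.l a0,a0 also takes 8 cycles, so the pair only pays off over it when a zero in d0 is useful as well. The pair overwrites d0.

Unoptimized:
    lsl.w #1,d0
Optimized:
    add.w d0,d0
Notes: A single add is quicker than a shift.

Unoptimized:
    lsl.w #2,d0
Optimized:
    add.w d0,d0
    add.w d0,d0
Notes: Even two adds are quicker than a shift.

Unoptimized:
    adda.w #10,a0
Optimized:
    lea 10(a0),a0

Unoptimized:
    moveq.l #16,d0
    ror.l d0,d1
Optimized:
    swap d1
Notes: Rotating by 16 bits. Swapping is quicker.

Unoptimized:
    moveq.l #15,d0
    ror.l d0,d1
Optimized:
    swap d1
    rol.l #1,d1
Notes: Rotating right by 15 bits. A swap followed by a single rotate is quicker.

Unoptimized:
    add.w d0,a0
    move.l (a0),a1
Optimized:
    move.l 0(a0,d0.w),a1
Notes: add.w sign-extends the low word of d0 before adding, so the matching index size is d0.w; d0.l gives the same address only when the high word of d0 is already a sign extension of the low word. Unlike the unoptimized version, a0 is left unchanged.

Unoptimized:
    cmpi.l #num,d0
Optimized:
    moveq.l #num,d1
    cmp.l d1,d0
Notes: num must be in the range -128 to +127.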
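As a rough sketch of how a few of these replacements combine, the two routines below fetch a longword from a table using a byte-sized index. The labels lookup_slow and lookup_fast, the register assignments (a0 = table base, a2 = pointer to the index byte, result in a1) and the table layout are assumptions made for illustration. The cycle counts in the comments are nominal 68000 timings and ignore chip-RAM contention on the Amiga.

```asm
; Hypothetical example, not from the original article.
; In:  a0 = base of a table of longwords
;      a2 = pointer to an unsigned byte index (0..255)
; Out: a1 = table entry

; Straightforward version: 44 cycles before the rts
lookup_slow:
        clr.l   d0              ;  6 cycles: clear d0
        move.b  (a2),d0         ;  8 cycles: load byte index
        lsl.w   #2,d0           ; 10 cycles: index * 4
        adda.w  d0,a0           ;  8 cycles: a0 += sign-extended d0.w
        move.l  (a0),a1         ; 12 cycles: fetch entry (a0 is now modified)
        rts

; Same job with the replacements from the table: 38 cycles before the rts
lookup_fast:
        moveq   #0,d0           ;  4 cycles: clears all 32 bits of d0
        move.b  (a2),d0         ;  8 cycles: load byte index
        add.w   d0,d0           ;  4 cycles: index * 2
        add.w   d0,d0           ;  4 cycles: index * 4 (at most 1020)
        move.l  0(a0,d0.w),a1   ; 18 cycles: fetch entry, a0 left unchanged
        rts
```

The index is used as d0.w here because adda.w sign-extends a word; with a byte index the scaled value never exceeds 1020, so the sign extension is harmless. A caller that relied on lookup_slow advancing a0 would need adjusting, since lookup_fast leaves a0 alone.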

 

Additional Optimizations

These are some additional optimizations that can be used:

Unoptimized:
    bsr.s someroutine
    rts

    someroutine:

    rts
Optimized:
    bra.s someroutine

    someroutine:

    rts
Notes: If returning directly after a “jsr” or “bsr” call then you can just use a basic branch and let the subroutine do the return.

Unoptimized:
    jsr sub1
    jsr sub2
    jsr sub3
    rts
Optimized:
    pea sub3
    pea sub2
    jmp sub1
Notes: Pushing return addresses onto stack in reverse order, then directly jumping to the first sub-routine. (68000 only). After each sub-routine does “rts” it will “return” to the start of the next subroutine.
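The sketch below shows both patterns from the table above with the stack contents spelled out. All labels (finish_slow, finish_fast, cleanup, frame_slow, frame_fast, init, update, draw) are placeholder names invented for illustration, and the cycle counts are nominal 68000 timings for the addressing modes shown, ignoring bus contention.

```asm
; Hypothetical example, not from the original article.

; --- Tail call: a trailing bsr/rts pair becomes a plain branch ---
finish_slow:
        bsr.s   cleanup         ; 18 cycles: pushes a return address
        rts                     ; 16 cycles: only to return again

finish_fast:
        bra.s   cleanup         ; 10 cycles: cleanup's rts returns
                                ; straight to finish_fast's caller
cleanup:
        rts

; --- Chained calls: 3 x jsr + rts = 76 cycles of call overhead ---
frame_slow:
        jsr     init            ; 20 cycles (absolute long)
        jsr     update          ; 20 cycles
        jsr     draw            ; 20 cycles
        rts                     ; 16 cycles

; --- Same order of execution: 52 cycles of call overhead ---
frame_fast:
        pea     draw            ; 20 cycles: stack = draw, caller
        pea     update          ; 20 cycles: stack = update, draw, caller
        jmp     init            ; 12 cycles: init's rts pops "update",
                                ; update's rts pops "draw",
                                ; draw's rts pops the caller's address

init:   rts                     ; stub bodies so the sketch assembles
update: rts
draw:   rts
```

In both fast versions the called routines see a different stack layout than they would after a normal call, so this only suits routines that take their inputs in registers rather than at fixed offsets from the stack pointer.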