ASM Tips n Trix

This thread will be a list of mini-guides/tips to help shorten or optimize your ASM codes. This is tailored towards a coder who recently started learning/utilizing ASM.




I. Using Offset Values to complete Memory Addresses

Let's say we want to load a word from the memory address 0x80001650. A beginner might write the following functions....

lis r12, 0x8000
ori r12, r12, 0x1650
lwz r11, 0 (r12)

This is not completely optimized. The use of the ori function is not needed. We can shorten this...

lis r12, 0x8000
lwz r11, 0x1650 (r12)

As you can see, the offset value was used to complete the 2nd half of the memory address. Keep in mind that the offset is a signed 16-bit value, so it cannot exceed 0x7FFF. That means this method won't work as-is for a memory address such as 0x8045FFBF, where the 2nd half (0xFFBF) is above 0x7FFF.
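For an address whose 2nd half is above 0x7FFF, one common workaround is to take advantage of the offset being signed: raise the lis value by 1 and use a negative offset. A minimal sketch for the 0x8045FFBF example (the register choices are simply carried over from the snippets above):

```asm
lis r12, 0x8046          # r12 = 0x80460000 (upper half + 1)
lwz r11, -0x0041 (r12)   # 0x80460000 - 0x41 = 0x8045FFBF
```

The offset range is -0x8000 through 0x7FFF, so the negative offset here is the 2nd half minus 0x10000 (0xFFBF - 0x10000 = -0x41).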




II. 'Register into a Register'

Let's say we have a code where we load a word using Register 12 as the base address and place it into Register 11...

lwz r11, 0x00AB (r12)

However, after this function, let's pretend we no longer need the value in Register 12. In that case there's no need to waste the use of Register 11, especially if we need that register for later. Therefore you should do this instead....

lwz r12, 0x00AB (r12)

Now we aren't wasting the use of Register 11.




III. Using a singular lis function for multiple loading/storing

Let's say we have the following functions...

lis r12, 0x8000
lwz r11, 0x1500 (r12)
lis r10, 0x8000
lwz r9, 0x1800 (r10)

This ASM is redundant. We are executing the same lis function into two different registers, which is a waste of code. Use this instead...

lis r12, 0x8000
lwz r11, 0x1500 (r12)
lwz r9, 0x1800 (r12)

Now we have saved a line of code and freed up Register 10.
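The same idea applies when a code both loads and stores. A small sketch, assuming we want to copy the word at 0x80001500 to 0x80001800 (both addresses are only illustrative):

```asm
lis r12, 0x8000          # set the upper half once
lwz r11, 0x1500 (r12)    # load from 0x80001500
stw r11, 0x1800 (r12)    # store to 0x80001800, reusing r12 as the base
```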




IV. Changing '3-Liner' stw-lwz ASM codes to 'Single Liner' li Type

We have the following list of functions..

li r5, 0xC
stw r5, 0x177B (r30)
lwz r5, 0x177B (r30)

This is redundant. There's no need to take our r5 value, store it to memory, and then immediately load it back from memory. Remove both the stw and lwz functions. You are left with this...

li r5, 0xC

In rare cases, some codes may require the stw-lwz functions back to back (like this code HERE). There's really no way (other than testing) to tell whether a code's stw-lwz pair can be replaced by a singular li function.
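If testing shows that the game really does need the value in memory, a possible middle ground is to keep the stw and drop only the lwz, since r5 already holds the value that was just stored. A sketch using the same registers and offset as above, assuming the address is ordinary RAM and not a hardware register:

```asm
li r5, 0xC
stw r5, 0x177B (r30)     # memory is still updated for anything that reads it later
                         # no lwz needed: r5 is already 0xC
```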




V. Optimizing Branch Routes

We have the following ASM...

cmpwi r21, 0x1
beq- the_label
b finish_code

the_label:
li r28, 0x14

finish_code:
stb r28, 0x2 (r30)

This is not fully optimized branch routing. There's no need to have two label names. You can do this instead...

cmpwi r21, 0x1
bne+ finish_code

li r28, 0x14

finish_code:
stb r28, 0x2 (r30)

As you can see, if r21 is equal to one, the code will continue down to the li function. Otherwise it skips straight to finish_code. This is more efficient than making two whole separate branch labels/routes.



VI. Avoiding Pushing/Popping the Stack

What some beginner coders will do (when needing extra registers in a code) is use the method of 'pushing/popping' the stack. Info for this is HERE. This will cause any code to naturally have more lines of compiled code. It is nice to have free registers, but if you want to cut down the length of your code, you should avoid the push/pop stack method.

We know Registers r11 and r12 are always free for use without restoration (99% of the time). You can also use a volatile register (r3 thru r10) and restore its original value at the end of your code. However, finding a volatile register that holds the same value every time the ASM function is executed (test this via a breakpoint over and over again) is actually rare.

Instead, you can use more registers (without restoring their original values) by looking ahead at the ASM functions that follow your code's address. For example...let's say we have a code address of 0x80456000, and we have the following addresses plus ASM functions.

0x80456000 lwz r4, 0 (r5) #Default ASM
0x80456004 add r23, r6, r9
0x80456008 mflr r0
0x8045600C cmpwi r31, 0x1

If the default ASM at your code's address is a loading type function (lwz, lhz, etc.), and you are able to place the default ASM at the end of your code, you can use r4 (for our example). r4 is free w/o restoration because it will get written to anyway.

r23 is also free, because it will get written to later and nothing reads it before then. Same with r0. Obviously, we can't use r5, r6, r9, or r31, because they are being used as inputs for the functions. Even using them and restoring their original values afterward is really not safe.

So with the functions listed above, our list of free registers would be r0, r4, r11, r12, and r23. That will most likely be enough to avoid having to push/pop the stack.
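Putting this together, a code hooked at 0x80456000 might look something like the sketch below. The addresses 0x80001500, 0x80001800, and 0x80001650 are made up for illustration. Only the register choices come from the listing above.

```asm
lis r12, 0x8000          # r12: free without restoration
lwz r4, 0x1500 (r12)     # r4: free, the default ASM overwrites it
lwz r23, 0x1800 (r12)    # r23: free, the add at 0x80456004 overwrites it
add r0, r4, r23          # r0: free, the mflr at 0x80456008 overwrites it
stw r0, 0x1650 (r12)     # save the sum
lwz r4, 0 (r5)           # Default ASM, placed last
```

One thing to watch with r0: in addi, and as the base register of a load or store, r0 is read as the literal value 0 rather than the register's contents, so it is best kept for plain data like this.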




VII. Optimizing conditions with the Record (dot) function

We have the following ASM...

lwz r5, 0x1AAE (r31)
add r6, r6, r5
cmpwi r6, 0x0
bne+ some_label

Certain ASM functions can have a dot (.) added to them. This is known as 'Record'. Record is a shortcut for cmpwi rD, 0x0, where rD is the register the function writes its result to. Please note that there's no way I can list all the functions that do or do not have the Record option. Refer to a full ASM handbook for assistance.

The add ASM function has the ability to add this Record feature. Like this...

lwz r5, 0x1AAE (r31)
add. r6, r6, r5
bne+ some_label

We have now gotten rid of an unnecessary line of code.
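The same trick works with other functions that have a Record form. For instance, subf. and or. both exist, and andi. always records. A small sketch along the lines of the example above (the label name is a placeholder):

```asm
lwz r5, 0x1AAE (r31)
subf. r6, r5, r6         # r6 = r6 - r5, and cr0 is set as if by cmpwi r6, 0x0
beq- some_label          # taken when the result is zero
```

Record always compares against zero and always writes to cr0, so it can only stand in for a cmpwi rD, 0x0 on cr0. A comparison against any other value still needs its own cmpwi.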




VIII. Writing Activators/Deactivators within an ASM correctly

We have an ASM (NTSC-U code) where we will set a controller address (GCN), and we will load the button value (ZZZZ value; halfword) from the NTSC-U GCN controller address. We want to use the Y button (0x0880 value) to jump to the_label in our ASM, so we have this snippet of code...

lis r12, 0x8034
lhz r11, 0x3E80 (r12)
cmpwi r11, 0x0880
beq- the_label

While this ASM will work, it will not allow a user to hold other buttons while pressing the Y button, because the comparison only succeeds when the halfword is exactly 0x0880. To allow other buttons to be held, we need to use what is called a Logical AND. Since the button value is an immediate value and not a register, this is the andi. function. Like this....

lis r12, 0x8034
lhz r11, 0x3E80 (r12)
andi. r11, r11, 0x0880
cmpwi r11, 0x0880
beq- the_label

Even though this snippet of code is one line longer, we now gain the ability to allow a user to hold other buttons while pressing the Y button.
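When the mask is a single bit, the comparison can be dropped as well, because andi. records its result in cr0. A sketch assuming a hypothetical single-bit button value of 0x0800 (check the real value for your button):

```asm
lis r12, 0x8034
lhz r11, 0x3E80 (r12)
andi. r11, r11, 0x0800   # result is 0x0800 if the bit is set, otherwise 0
bne- the_label           # non-zero result means the button is held
```

With a mask of more than one bit, such as 0x0880, the cmpwi is still needed to confirm that every bit of the mask is set.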