Creating Loops (Pt. 2)

Part 2: Complex memcpy loops

NOTICE: This is for veteran ASM coders who are writing a code with multiple loops in it.

If you have a very complex assembly code that has multiple loops, it may be best to use what are called memcpy loops. Instead of writing a bunch of separate loops, you jump to (call) the game's built-in looping function every time a loop needs to be executed.



Step 1. Backing Up Registers



The memcpy function makes use of the following registers: r0, r4, r5, r6. It makes sense to back up the original values of these registers beforehand, and there are multiple ways to do that. Since a code that requires memcpy is already going to be very long and complex, you are probably already pushing the stack with this method HERE. You can push the stack, then move r0 and r4 thru r6 to some upper registers for backup.

You can also use the following version of pushing the stack to back up registers 2 thru 31 (credits to Star for this snippet of code). Note that stmw r2 starts at r2, so r0 is not saved by it:

stwu r1,-0x80(r1)
stmw r2,8(r1)

...

lmw r2,8(r1)
addi r1,r1,0x80



Step 2. Preview of the memcpy loop function



In RAM, the memcpy loop function sits at the following addresses:

80005F4C lbzu r0, 0x0001 (r4)
80005F50 stbu r0, 0x0001 (r6)
80005F54 subic. r5, r5, 1
80005F58 bne+ -0xC ###if r5 not equal to zero, jump back to address 0x80005F4C###
80005F5C blr

As you can see, the memcpy function uses a 'subic.' type loop. You will also notice a blr instruction at the end. If you are reading this tutorial, you should already understand the basic use of blr.
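If it helps to see the loop outside of a debugger, here is a rough Python model of those five instructions. This is only a sketch: the `mem` dictionary is a made-up stand-in for RAM, `memcpy_loop` is a hypothetical name, and the model assumes r5 is at least 1 on entry. It shows why the update forms (lbzu/stbu) need r4 and r6 to start one byte before the real addresses.

```python
def memcpy_loop(mem, r4, r5, r6):
    """Model of the five-instruction loop at 0x80005F4C.

    mem is a dict of byte address -> byte value (hypothetical stand-in for RAM).
    Assumes r5 >= 1 on entry.
    """
    while True:
        r4 += 1                 # lbzu: add 0x1 to r4 first...
        r0 = mem.get(r4, 0)     # ...then load the byte at the new r4
        r6 += 1                 # stbu: add 0x1 to r6 first...
        mem[r6] = r0            # ...then store the byte at the new r6
        r5 -= 1                 # subic.: decrement r5 and compare with zero
        if r5 == 0:             # bne+ falls through once r5 reaches zero
            return r4, r5, r6   # blr


# Two words of example data placed at 0x80002000.
mem = {0x80002000 + i: b for i, b in enumerate(bytes.fromhex("1111222233334444"))}

# r4 and r6 start one byte BEFORE the source and destination.
end_r4, end_r5, end_r6 = memcpy_loop(mem, 0x80001FFF, 8, 0x81404FFF)

copied = bytes(mem[0x81405000 + i] for i in range(8))
print(copied.hex())
```

If r4 and r6 were started at the real addresses instead, the first byte would be skipped and everything would land one byte too far.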


Step 3. Setting up the appropriate Registers



Alright, at this point r0, r4, r5, r6 should be backed up.

We can tell that r4 holds the address to load data from. r0 is the register that temporarily holds said data. r6 holds the address to store to. r5 is the 'countdown' register telling the game how many times to execute the loop.

Let's say we have a string of data...

Address    Data
80002000 11112222
80002004 33334444
80002008 44445555
8000200C 66667777
80002010 88889999

We want to copy these 5 words of data to the starting address 81405000. Let's set up our loop. Because lbzu adds 0x1 to r4 before it loads, r4 must start one byte before the first source address (0x80002000 - 0x1):

lis r4, 0x8000
ori r4, r4, 0x1FFF

Now we are going to set up the 'countdown' register, which is r5 for memcpy. Memcpy uses lbzu & stbu instead of lwzu & stwu, so the offset value it uses for updating is 0x1 instead of 0x4. Therefore, we need to load the number of bytes instead of words into register 5. 5 words of data is 20 bytes. 20 converted to hex is 0x14. Load 0x14 into r5.

li r5, 0x14

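lis r6, 0x8140
ori r6, r6, 0x4FFF

Just like r4, r6 starts one byte before the storing address (0x81405000 - 0x1) because stbu adds 0x1 before it stores. No need to set r0, of course, as r0 is the register used to temporarily hold the data between the load and the store.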
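If you want to double-check the numbers for a byte-wise update loop like this one, a few lines of Python will do the arithmetic. This is just a helper sketch; `memcpy_setup` and `split_halves` are hypothetical names, not anything from the game or from an assembler.

```python
def memcpy_setup(src, dst, words):
    """Return the r4, r5, r6 starting values for the byte-wise loop."""
    r4 = src - 1        # lbzu adds 0x1 before loading
    r5 = words * 4      # countdown is in bytes, not words
    r6 = dst - 1        # stbu adds 0x1 before storing
    return r4, r5, r6


def split_halves(value):
    """Split a 32-bit value into the lis (upper) and ori (lower) immediates."""
    return value >> 16, value & 0xFFFF


r4, r5, r6 = memcpy_setup(0x80002000, 0x81405000, 5)

for name, value in (("r4", r4), ("r6", r6)):
    upper, lower = split_halves(value)
    print(f"lis {name}, 0x{upper:04X}")
    print(f"ori {name}, {name}, 0x{lower:04X}")
print(f"li r5, 0x{r5:X}")
```

For the example data this gives 0x80001FFF for r4, 0x14 for r5 and 0x81404FFF for r6.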


Step 4. Calling the memcpy function



Alright our required registers are configured. Now let's call the loop function to get the loop going...

To call a function, we will set up a memory address value. That value will be the start of the memcpy, which is 0x80005F4C. We will use register 7 to set up this address.

lis r7, 0x8000
ori r7, r7, 0x5F4C

Next we need to move this address to the Link Register. You should always back up the original Link Register value first. Let's back up the OG Link Register value to register 8...

mflr r8

Now let's move r7's value (the memcpy address) to the LR

mtlr r7

Time to call the function!!!

blr

Alright great, we are now at the first address of the memcpy function. But there's an issue. Can you figure it out?............Waits.....

The blr after our 'mtlr r7' is the issue. Let's pretend we continued on and did the memcpy loop. When the loop is finally done, it will (at address 0x80005F5C) execute a blr. That blr will jump right back to 0x80005F4C, because that's the value we currently have in the LR from the mtlr r7 instruction.

This will cause the loop to keep looping back forever... Thus, the game will freeze.


Step 5. Blrl



Welp, blr won't work. So what can we do? We use the clever instruction blrl (officially Branch to Link Register and Link; think of it as Branch To Link Register Then Re-Link).

The blrl branches to the address in our link register, and sets the LR to the address right after the blrl. That allows us to exit the loop once memcpy's own blr is executed.

So instead of using blr, let's plug in a blrl.

blrl

Before continuing, let's take a look at a portion of the game's memory, pretending we are stepping through it instruction by instruction. Obviously I will just throw in a list of random addresses...

800022C0 mtlr r7 #Moving memcpy address to LR
800022C4 blrl #Branch to memcpy then re-link (update LR) to address 0x800022C8
800022C8 stw r10, 0 (r29) #Random ASM placed here, has nothing to do with the loop.

So as our code is executing, we execute the blrl. At this point, we jump to the address in the LR (0x80005F4C), and at the same time the Link Register is updated with the address 0x800022C8.

So once we execute memcpy's own blr (once the loop is done), we will be at address 0x800022C8 and the stw instruction will be executed (code continuing like normal).
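To make the difference between blr and blrl concrete, here is a tiny Python model of what happens to the LR in both cases. It is only a sketch; `branch_via_lr` is a hypothetical helper and the addresses are the example ones from the listing above.

```python
MEMCPY = 0x80005F4C       # first address of the memcpy loop (Step 2)
MEMCPY_BLR = 0x80005F5C   # memcpy's own blr
CALL_SITE = 0x800022C4    # example address of our blr/blrl


def branch_via_lr(pc, lr, link):
    """Return (next_pc, new_lr) for blr (link=False) or blrl (link=True)."""
    target = lr                        # both forms branch to the LR
    new_lr = pc + 4 if link else lr    # only blrl rewrites the LR
    return target, new_lr


# Case 1: plain blr at the call site. LR still holds the memcpy address.
pc, lr = branch_via_lr(CALL_SITE, MEMCPY, link=False)
return_with_blr, _ = branch_via_lr(MEMCPY_BLR, lr, link=False)

# Case 2: blrl at the call site. LR now holds the address after the blrl.
pc, lr = branch_via_lr(CALL_SITE, MEMCPY, link=True)
return_with_blrl, _ = branch_via_lr(MEMCPY_BLR, lr, link=False)

print(hex(return_with_blr))    # back into memcpy: endless loop
print(hex(return_with_blrl))   # back to our code
```

With a plain blr, memcpy's own blr lands on 0x80005F4C again. With blrl it lands on 0x800022C8.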


Step 6. Conclusion



As you can see, this is complicated to set up. However, if you are using a lot of loops, it may come in handy and could shorten the overall length of your code. Blrl is not really used in any MKWii codes, as it's only needed for very complex function-jumping within a lengthy ASM code.

Let's take a final look at all our instructions put together...

##some ASM here##

lis r4, 0x8000 #Set up 1st half of Loading Address
ori r4, r4, 0x1FFF #Set up 2nd half of Loading Address (one byte before 0x80002000, since lbzu adds 0x1 first)

li r5, 0x14 #Load 0x14 (byte count) into the 'countdown' register

lis r6, 0x8140 #Set up 1st half of Storing Address
ori r6, r6, 0x4FFF #Set up 2nd half of Storing Address (one byte before 0x81405000, since stbu adds 0x1 first)
 
lis r7, 0x8000 #Set up 1st half address of memcpy function
ori r7, r7, 0x5F4C #Set up 2nd half address of memcpy function
 
mflr r8 #Backup OG Link Register

mtlr r7 #Copy memcpy address to the Link Register

blrl #Branch to Link Register Then Re-Link! (we are now at memcpy, even though you cannot see it)

##some ASM here##

mtlr r8 #Restore OG LR's value



Thanks for reading! For a 'real world' example/demonstration, I took this code HERE and 'converted' it to call memcpy instead of writing a 'subic.' type loop from scratch.

(left address blank; using 01230123012301230123456789905 as the Mii Name)
C2000000 00000010
9421FF80 BC410008
7CE802A6 38C30067
48000045 00003000
31003200 33003000
31003200 33003000
31003200 33003000
31003200 33003000
31003200 33003400
35003600 37003800
39003900 30003500
00000000 7C8802A6
38A0003C 3D008000
61085F4C 7D0803A6
4E800021 7CE803A6
B8410008 38210080
8003006C 00000000

Source (using 01230123012301230123456789905 as the Mii Name):

####################
###START ASSEMBLY###
####################

#

####################
##Register Storage##
####################

stwu r1,-0x80(r1)
stmw r2,8(r1)

mflr r7 #Backup original Link Register into Register 7

################################################
##Address Config For Loop Mem Storage Location##
################################################

addi r6, r3, 0x0067 #Add 0x67 to value of r3 to setup proper address location in register 6 for upcoming memcpy loop

#########################################################
##Setup Loading Address Relative to the Program Counter##
#########################################################

##Following address after the 'bl' function will be stored in the link register##
##This will allow us to use it to later load the mii data into the loop##

bl link_label #branch to link_label, store the address right after this bl (start of the data table below) in the link register

################################
##Pseudo Ops (Mii Data Table)##
################################

##Pseudo ops listed below are the mii data characters##
##.space is for bytes of zeros##
##.llong is to put in a doubleword##
##.long is to put in a normal word##

##The .space 1 is put in because the first loaded address in the loop uses an offset of 0x1##
##So one byte of zeros needs to be added beforehand##

.space 1
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0034003500360037
.llong 0x0038003900390030
.long 0x00350000
.space 3

##The 0x0000 is added after the final character as the terminator; it gets copied too, since the memcpy loop only stops when r5 reaches zero##
##The last .space 3 is for alignment reasons##

##############
##Link Label##
##############

##Now that we have our address that is at the start of the Mii Data table...##
##We want to move that address from the LR to r4 to begin loading data from it to the loop##

link_label:
mflr r4 #Move address that is 0x1 before the mii data (the .space 1 byte) from the link register to register 4

###########################
##Preparing Function Call##
###########################

li r5, 0x003C

lis r8, 0x8000
ori r8, r8, 0x5F4C
mtlr r8

####################################
##Calling Memcpy; Creating Re-Link##
####################################

blrl

###############################################
##Post Loop; Restore Original Register Values##
###############################################

mtlr r7 #Restore original Link Register

lmw r2,8(r1)
addi r1,r1,0x80

###############
##Default ASM##
###############

lwz r0, 0x006C (r3) #Default ASM

#

##################
###END ASSEMBLY###
##################
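As a sanity check on the numbers in this code (the 0x3C loaded into r5 and the 0x44 offset inside the compiled bl, 48000045), here is a short Python sketch that rebuilds the data table from the Mii name. It assumes the name is stored two bytes per character, big-endian, which is what the 0x0030, 0x0031... values in the table show. The variable names are made up for illustration.

```python
name = "01230123012301230123456789905"

# Two bytes per character (0x0030 for '0', and so on), plus the 0x0000 terminator.
payload = name.encode("utf-16-be") + b"\x00\x00"

# .space 1, then the .llong/.long data.
table = b"\x00" + payload
# Pad with zeros up to a word boundary (this is the .space 3).
table += b"\x00" * (-len(table) % 4)

byte_count = len(payload)     # value for 'li r5'
bl_offset = 4 + len(table)    # bl skips itself (4 bytes) plus the whole table

print(hex(byte_count), hex(bl_offset))
```

This comes out to 0x3c bytes for r5 and a branch offset of 0x44, matching the source and the compiled code.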

Code creator: zak
Code contributor(s): Star (used his Mii Extender code to setup a Write Breakpoint)