Assembly Tutorial
#1



This tutorial will teach you how to read/write basic PowerPC Assembly Language. After this tutorial, you should be able to read/write the majority of basic MKWii codes that have been made in Assembly. Also, this tutorial is a supplement to my thread - 'How to Make your own Cheat Codes on Dolphin', which can be read HERE. This Assembly tutorial is designed specifically to teach the use of Assembly in MKWii Cheat Codes. However, one can still use this tutorial for learning PowerPC Assembly in a general sense.


Chapter 1: Introduction

As mentioned earlier, this thread is related to the Dolphin Cheat Code Tut thread. Therefore the requirements for reading this thread are the same as what was listed in the Dolphin Thread (under Chapter 1: Requirements).

Now let's begin...

All CPUs have an assembly language (ASM for short). It is a low-level computer programming language. Unlike higher-level languages (such as C++, Python, Java), ASM is not 'portable' to other types of CPUs. Of course companies can produce a line of CPUs that share the same or nearly the same Assembly language, but taking ASM written for an old Intel CPU and trying to use it on a new non-Intel CPU would not be possible. So, learning one ASM language will not teach you the entirety of the ASM language for a different CPU.

This portability issue isn't a problem because all Wiis use the same CPU model. The ASM language that all Wiis' CPUs use is called PowerPC ASM. PowerPC was jointly created by IBM, Apple, and Motorola. The only computer programming language that is at a lower level than ASM is binary code (pure machine language). Thus, ASM allows the user to understand how MKWii functions at near its lowest level. The downside is that the more complex the code you need to make, the faster the difficulty climbs. This is why very complex ASM codes are rare in MKWii. They take a lot of intelligence, creativity, and patience.


Chapter 2: Registers

A register is an accessible storage location inside the Wii's CPU. Assembly language is just a set/list of instructions that utilize these locations in all sorts of ways. You can load, move, store, etc. various types of data within these Registers; the possibilities are endless. There are several types of Registers. First things first: there are 32 normal integer registers. These registers are referred to as the General Purpose Registers (GPR for short). Most MKWii codes only use the GPRs. For the beginner code creator, these registers are the only registers you need to know. However, let's go over some more types.

There is a 'PC' register, aka the Program Counter. It's simply a register that holds the address of the next ASM instruction that will be executed. Easy enough. There are also 32 Floating Point Registers (FPR for short). They hold floating point values instead of normal integer values.

The Link Register (LR) holds the return address: the address to 'jump' back to once a subroutine is finished. It is set by branch-and-link instructions (such as bl) and read by the branch instructions that return (such as blr).

Most registers hold 32 bits of data. FPRs hold 64 bits, along with a few other special types of registers which won't be covered.


Chapter 3: Variables

There are 3 sizes of data you will work with: words, halfwords, and bytes. Words are 32 bits of data, halfwords are 16 bits, and bytes are 8 bits. The majority of the time, words are used in writing ASM for codes. Once a beginner code creator is familiar with using words, he/she can then implement halfwords and bytes for less basic codes.
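If the three sizes feel abstract, here is a small Python sketch (Python only because it is easy to run; it is not ASM) that cuts one example 32-bit value down to its lowest halfword and lowest byte. The value 0x12345678 is made up for illustration.

```python
# Rough model of the three data sizes, using one made-up 32-bit value.
value = 0x12345678

word = value & 0xFFFFFFFF      # all 32 bits
halfword = value & 0xFFFF      # lowest 16 bits only
byte = value & 0xFF            # lowest 8 bits only

print(f"word     = 0x{word:08X}")
print(f"halfword = 0x{halfword:04X}")
print(f"byte     = 0x{byte:02X}")
```

Notice how each size is written with 8, 4, or 2 hex digits: every hex digit is 4 bits.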

Integers and memory address locations:

Ok, now that you understand what the Registers are and what data sizes can be used within those registers, let's talk about how registers are mainly used for MKWii cheat codes.

The most common form of action in a Register is loading (setting) a value in a Register. This can be a word/halfword/byte that represents a character, vehicle, track, item, etc. The second most common use of a Register is holding an address (memory location). A code creator can set a Register (word value) to represent a memory address, which he/she can later use to store/load data to/from. Pretty simple stuff.

Characters/symbol set:

When you write out ASM instructions to later be compiled into a cheat code, various symbols are used for proper formatting. This allows the compiler (strictly speaking it is an 'assembler', but this tutorial calls it the compiler) to read your ASM and compile it correctly.

List of symbols:
. (period)
: (colon)
, (comma)
() (parentheses)
+ (plus)
- (minus)
_ (underscore)
# (hash tag)
x (not multiply, this is for writing Hex values)

A basic code will only use commas, parentheses, and the x (if using Hex values). Before continuing further, let's talk about using Hex vs Decimal for writing ASM.

Hex vs Decimal:

Compiled cheat codes are shown in Hex byte code, so it's common sense that you need to know Hex beforehand. When writing ASM, there are certain elements of an ASM function (immediate values and offsets) that you can write in Hex. However, the downside is that most compilers will decompile (disassemble) a cheat code using decimal. If you are not sure what to use, then I recommend sticking with decimal for integer data (such as an item value). However, for Memory addresses, Hex is vastly superior to decimal. Regardless, you need to know Hex to write ASM.
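If you want to double check a conversion while you are still getting used to Hex, a couple of lines of Python will do it. This is only a convenience sketch; the address below is an example value, not a real MKWii address.

```python
# Decimal <-> Hex conversions for the kinds of values used in codes.
item_value = 12
item_as_hex = hex(item_value)        # '0xc'
item_from_hex = int("0xC", 16)       # 12

address = 0x80001500                 # example address only
address_as_decimal = address         # same number, just printed in decimal

print(item_as_hex, item_from_hex)
print(address_as_decimal)            # 2147489024
```

The last line shows why decimal is a poor fit for addresses: 2147489024 tells you nothing at a glance, while 0x80001500 does.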


Chapter 4: Format for Writing ASM

Alright, now let's get into proper formatting of writing ASM. Remember those registers I told you about? Any GPR is written as rX, where X = the register's number. Remember there are 32 registers. However, the first register is Register 0, aka r0. The last is Register 31, aka r31. Most ASM instructions have a destination Register. The destination Register is the Register that holds the result of an ASM function. A source Register is a Register whose value is used to compute that result. Some instructions will have one source register, while others will have two. An ASM function of this kind can only have one destination Register. Here's a basic format of two source registers with the dest. register

rD, rA, rB

rD = Destination Register
rA = 1st Source Register
rB = 2nd Source Register

Keep in mind this is not an actual ASM function, or an exact correct format. This is just to show you a very very very general look of any ASM function that uses two source registers to compute a value for the destination register. Now let's look at an example with just one source register..

rD, rA, VALUE

rD = Destination Register
rA = Source Register
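VALUE = Any Immediate Decimal/Hex Value

VALUE is an immediate value (you may also see it loosely called an 'unsigned value'), written in decimal or hex, that is used along with the source register to compute a value for the Destination Register. Immediate simply means you wrote it in from scratch, directly in the instruction. You did not do any computing, or execute any previous ASM functions, for that value to appear. Immediate values cannot exceed 16 bits. For instructions that treat the value as signed (such as addi), that means a range of -32768 to 32767 (-0x8000 to 0x7FFF). For instructions that treat it as unsigned (such as ori), the range is 0 to 65535 (0 to 0xFFFF).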


Chapter 5: Integer (Basic) ASM Instructions

Ok, at this point you should have a good understanding of the...
Registers
Symbols that can be used in ASM
General Format/Layout of ASM instructions

Let's go over actual real world instructions that a person would use to make codes. Here is one of the most basic ASM instructions....

Add (adds two source registers to compute the value of the destination register)

add rD, rA, rB

Very elementary. The value of rA is added with the value of rB. rD will hold the result of the two values added together. Notice the use of the comma (one of the symbols listed earlier). The commas will let the compiler know there are three registers being used.

Let's say we add the values of r1 and r25. The result will be stored in r20. Our 'add' ASM instruction would be this...

add r20, r1, r25

For ASM instructions where the order of the two source values doesn't change the result (such as add), you can swap the source registers. So you can also write this as...

add r20, r25, r1
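To see what these two versions do to the registers, here is a rough Python model of 'add'. The list named regs stands in for the 32 GPRs, and the starting values in r1 and r25 are made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # a GPR holds 32 bits, so results wrap around

def add(regs, rd, ra, rb):
    """Model of 'add rD, rA, rB': rD = rA + rB (kept to 32 bits)."""
    regs[rd] = (regs[ra] + regs[rb]) & MASK32

regs = [0] * 32
regs[1] = 0x10      # made-up value for r1
regs[25] = 0x05     # made-up value for r25

add(regs, 20, 1, 25)        # add r20, r1, r25
first_result = regs[20]

add(regs, 20, 25, 1)        # add r20, r25, r1
second_result = regs[20]

print(hex(first_result), hex(second_result))  # 0x15 0x15
```

Both orders give 0x15 in r20, and r1 and r25 still hold their original values afterwards.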

You obviously can't change the spot where the destination register is within the function. Keep in mind some ASM instructions (subtraction, for example) won't allow this swapping of source registers. Let's move on to another very basic ASM instruction...

Add Immediate

addi r4, r30, 12

Notice the number 12. It doesn't have the letter 'r' before it. Therefore we know the 12 is an immediate value (a number written directly into the instruction) and not a register. Thus this function adds the value of r30 and the value 12. The result will be stored in r4. For the addi function, you CANNOT swap the positions of 12 and r30! If you wanted to write this same function in Hex, it would be like this..

addi r4, r30, 0xC

The '0x' must be put before any hex value, or the compiler will compile it as decimal or not compile it at all (throw an error). You can of course put a minus (-) before your immediate value to designate a negative number. So if we did.....

addi r4, r30, -12

This would be adding the value of r30 and negative 12. Thus we are actually subtracting 12 from the value in r30. To save ASM writers time, you can use what are called simplified mnemonics. A simplified mnemonic is a 'shortcut'/'simplified' version of an ASM function.

The simplified mnemonic for addi r4, r30, -12 is...

subi r4, r30, 12
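Here is a rough Python sketch of why the two spellings are the same thing. It models addi with its 16-bit signed immediate and builds subi on top of it. The starting value of r30 is made up, and the special r0 rule is left out of this sketch to keep it short.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide

def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed number."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def addi(regs, rd, ra, imm):
    # Simplified model: the special r0 case is ignored here.
    regs[rd] = (regs[ra] + sign_extend16(imm)) & MASK32

def subi(regs, rd, ra, imm):
    # The simplified mnemonic is just addi with the value negated.
    addi(regs, rd, ra, -imm)

regs = [0] * 32
regs[30] = 100                 # made-up starting value

addi(regs, 4, 30, -12)         # addi r4, r30, -12
via_addi = regs[4]
subi(regs, 4, 30, 12)          # subi r4, r30, 12
via_subi = regs[4]

print(via_addi, via_subi)      # 88 88
```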

Subi stands for Subtract Immediate. It is actually not a REAL ASM function per se, but compilers have been configured to read these 'ASM shortcuts'. Let's go over the most common simplified mnemonic of all....

Load Immediate

li r1, 0xFF

As you can see there are no source registers in this ASM function. It is a shortcut. Easier to write, easier to read, easier to understand. The actual ASM for li r1, 0xFF is addi r1, 0, 0xFF. You will right away notice the 0 in the middle doesn't have an r in front of it...

Special note about r0:
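In the ASM functions addi & addis (add immediate shifted), r0 (when used as the source register) is NOT read. The CPU treats it as the number 0 instead, no matter what value r0 actually holds. This is what makes li and lis possible: to load a value into a register using a real ASM function such as addi, the value has to be added to something, so a zero sits in the middle between the destination register and the immediate hex/decimal value. The same rule applies to the address register inside the parentheses of load and store instructions (covered in Chapter 6).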

Going back to our 'li' function...

This 'li' function simply sets a register to the designated immediate value, which is 0xFF in our case. Therefore, after that ASM function is executed, register 1 now holds 0xFF.
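The r0 rule and the 'li' shortcut can be sketched together in Python. This is only a model of the behavior described above; the junk value placed in r0 is made up to prove that it gets ignored.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed number."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def addi(regs, rd, ra, imm):
    # When the source register is number 0, the literal value 0 is used
    # instead of whatever r0 currently holds.
    base = 0 if ra == 0 else regs[ra]
    regs[rd] = (base + sign_extend16(imm)) & MASK32

def li(regs, rd, imm):
    # li rD, VALUE is the shortcut for addi rD, 0, VALUE
    addi(regs, rd, 0, imm)

regs = [0] * 32
regs[0] = 0x1234          # made-up junk in r0; it must not affect li

li(regs, 1, 0xFF)         # li r1, 0xFF
print(hex(regs[1]))       # 0xff
```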

Important NOTE About Viewing the Registers in Dolphin or USB Gecko:
Every Register for PowerPC is displayed in Hex. It doesn't matter if you use Dolphin or USB Gecko. When Registers are displayed, their entire length of data is shown. Please also note that further ahead in this tutorial, values of Registers will have a '0x' in front of them just to remind the user they are in Hex. However, when you are viewing Registers in Dolphin or USB Gecko, there will be no '0x' in front of the Register value.

Since register 1 is 0xFF, it will actually be displayed on Dolphin & USB Gecko as just '000000FF'.
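If you want to predict how a value will look in the register view, this tiny Python helper pads it to the full 8 hex digits with no '0x'. The function name show_register is my own and is not part of Dolphin or USB Gecko.

```python
def show_register(value):
    """Format a 32-bit register value as 8 uppercase hex digits, no '0x'."""
    return f"{value & 0xFFFFFFFF:08X}"

print(show_register(0xFF))         # 000000FF
print(show_register(0x80E6FF30))   # 80E6FF30
```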

At this point, you are probably wondering if data in Registers gets erased after certain ASM functions. Source Registers do NOT get their value erased when used in an ASM function. Only the Destination Register loses its original value. So in our earlier 'add r20, r1, r25' ASM function, r1 & r25 still retain their original values; they don't lose any data or reset to zero.


Chapter 6: Store, Load (Less Basic) ASM Instructions

Ok we've gone over the most basic instructions. Let's kick it up a notch. Now we are diving into the realm of loading and storing data to/from memory locations. Let's take a look at one of the most basic store-type instructions...

Store Word

stw rS, VALUE (rA)

In this function, the value in rS is stored to memory at the address held in rA (plus VALUE). rA is treated as a memory location (address). The use of the parentheses around the register lets the compiler know that. Notice the first register is written as rS and not rD: in store-type ASM functions it is a source register, because its value is only read and copied out to memory. rA is only read as well. Therefore neither register loses its data; the only thing that changes is the word in memory. Let's look at an example...

stw r3, 0x0020 (r28)

The word (32 bit value) of r3 (which is the entire value of the register) will be stored at the memory location (address) that is held in r28. r28 is not being used as normal data per se; its value is being used as a location in RAM. The immediate value of 0x0020 is known as the 'offset'. This offset is added to the value in r28 to get the finalized memory location where the word in r3 is stored.

So let's say our value in r3 is 0x0000200A, and r28 is 0x80001500. Ok, first we add the offset to 0x80001500. We now have the finalized memory location value of 0x80001520. Let's say before the ASM function, the word at 0x80001520 was 0xFFEF1023. After the ASM function is executed, that word is erased and replaced with 0x0000200A.
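The same example can be walked through in Python. This is a simplified model: memory is a plain dictionary that maps an address to a whole word, which glosses over how real RAM is laid out byte by byte.

```python
MASK32 = 0xFFFFFFFF

def stw(regs, memory, rs, offset, ra):
    """Model of 'stw rS, offset (rA)': write the word in rS to rA + offset."""
    address = (regs[ra] + offset) & MASK32
    memory[address] = regs[rs] & MASK32

regs = [0] * 32
regs[3] = 0x0000200A
regs[28] = 0x80001500
memory = {0x80001520: 0xFFEF1023}   # old word at the target address

stw(regs, memory, 3, 0x0020, 28)    # stw r3, 0x0020 (r28)

print(f"{memory[0x80001520]:08X}")  # 0000200A
```

After the store, r3 and r28 are unchanged; only the word at 0x80001520 is different.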

Onto another ASM function...

Load Word and Zero

lwz rD, VALUE (rA)

This is simply the 'reverse' of stw. rA is treated as a memory location. VALUE is the offset that is added to rA for the finalized memory location that will be used. However, in a lwz function, the destination register will be replaced by whatever word is at the finalized memory location.

lwz r31, 0 (r1)

For this lwz function, the offset is 0 (no offset). Therefore, nothing is added to r1 to use as the finalized memory location to load the word from. Let's say r1 is 0x806553E4, and the word at that address is 0x00000001. After the ASM function is executed, r31 is now 0x00000001. The previous data in r31 is erased. The word at the address of r1 is NOT erased, the 0x00000001 at that address in RAM remains intact.
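Here is the lwz example as a rough Python model, using the same simplified dictionary-as-memory idea. The old value placed in r31 is made up so you can see it get replaced.

```python
MASK32 = 0xFFFFFFFF

def lwz(regs, memory, rd, offset, ra):
    """Model of 'lwz rD, offset (rA)': read the word at rA + offset into rD."""
    address = (regs[ra] + offset) & MASK32
    regs[rd] = memory[address]

regs = [0] * 32
regs[1] = 0x806553E4
regs[31] = 0xDEADBEEF               # made-up old contents of r31
memory = {0x806553E4: 0x00000001}

lwz(regs, memory, 31, 0, 1)         # lwz r31, 0 (r1)

print(f"{regs[31]:08X}")            # 00000001
```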

Quick tip:
You are probably wondering at this point how to write a whole 32 bit value from scratch to a Register. This is useful for establishing memory locations to later use for store-type and load-type ASM functions. So let's say we want to write the value of 0x80E6FF30 to Register 22. How do we do this? Simple, with just two ASM functions like this...

First we write the 'top half' or 'left side' of the Register. This is known as the upper 16 bits. For example:

Load Immediate Shifted

lis r22, 0x80E6

This function is called Load Immediate Shifted. It is similar to the li function, but of course we are writing the upper 16 bits. (Just like li is a shortcut for addi, lis is a simplified mnemonic for addis with 0 as the source.) Now what happens to the lower 16 bits, or right side of the register? They are CLEARED/DESTROYED. This means the lower 16 bits are always set to 0x0000 anytime 'lis' is executed.

So at this point, r22 has a value of 0x80E60000.

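Ok, now let's write in the right side/bottom half (lower 16 bits). We do this with a function called OR Immediate. It takes a 16-bit immediate value, does a logical OR with the source Register, and stores the result in the destination Register (here both are the same register). Since the lower 16 bits of r22 are all zero after the lis, the OR simply drops our value into them. We use ori rather than addi because ori treats its value as truly unsigned, while addi would treat 0xFF30 as a negative number and mess up the upper 16 bits. Like this....

OR Immediate

ori r22, r22, 0xFF30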

Now r22 will have our desired value of 0x80E6FF30. Simple to do!
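Here is the two-step load as a rough Python model, including what would go wrong if addi were used for the second step. The old value in r22 is made up to show that lis wipes it.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed number."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def lis(regs, rd, imm):
    # Upper 16 bits = imm, lower 16 bits = 0x0000
    regs[rd] = (imm & 0xFFFF) << 16

def ori(regs, rd, ra, imm):
    # The immediate is NOT sign-extended for ori
    regs[rd] = (regs[ra] | (imm & 0xFFFF)) & MASK32

regs = [0] * 32
regs[22] = 0x12345678               # made-up old contents

lis(regs, 22, 0x80E6)               # lis r22, 0x80E6
after_lis = regs[22]
ori(regs, 22, 22, 0xFF30)           # ori r22, r22, 0xFF30
after_ori = regs[22]

# What the second step would produce with an add of a signed 16-bit value:
wrong_with_addi = (after_lis + sign_extend16(0xFF30)) & MASK32

print(f"{after_lis:08X} {after_ori:08X} {wrong_with_addi:08X}")
# 80E60000 80E6FF30 80E5FF30
```

The last number shows the upper half getting knocked down to 0x80E5 when the lower half is treated as negative.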


Chapter 7: Branch, Compare (Intermediate) ASM Functions

Remember that list of symbols I showed you in Chapter 3? You already know about the use of commas and parentheses; we will go over some more in this chapter. This will also take us into intermediate level ASM functions.

The earlier instructions were known as Integer, Load, and Store Instructions. Now let's cover Branch instructions. Let's look at the most basic branch ASM function......

Branch

b 0x8
li r1, 1
stw r1, 0 (r31)

The letter b is used for what is known as an unconditional branch. Unconditional meaning the branch is executed no matter what the conditions are. Think of it like a 'jump'. The branch will skip/jump over a certain amount of ASM below, thus not executing it. In the provided example, the 'li r1, 1' function would be skipped. Now, the '0x8' next to the branch is the amount to 'jump/skip', in bytes, counted from the branch itself. Every ASM instruction is 4 bytes long, so 0x8 lands two instructions ahead, on the stw. Obviously, the larger the jump, the harder it would be to correctly calculate this amount. Therefore we use a trick called 'labels'.

Labels are just that, they are labels. ;)

To allow the compiler to know you are using labels, you designate labels with two symbols: the underscore symbol and the colon symbol. When you first come up with a branch label name, I recommend putting an underscore somewhere in the name, so it can never be mistaken for an instruction or register name. Like this...

b the_label

You can name labels whatever you want; just include the underscore and do NOT use special characters like percent signs or dollar signs. Stick to basic letters. Okay, you have set the label name. Now all you need to do is put that same label name right before the first ASM function that you want executed after the jump. Put in the label name and add a colon afterwards like this...

b the_label
li r1, 1

the_label:
stw r1, 0 (r31)

Btw, you are not limited to just jumping 'forward/down', you can jump backwards/up too.

the_label:
add r1, r10, r20
lwz r1, 0 (r15)

b the_label
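If you are curious how the label turns back into a number, here is a simplified Python sketch of the idea: give every instruction an address 4 bytes after the previous one, remember where each label points, then subtract. It only understands the plain 'b' branch and is not how any particular compiler is actually written.

```python
def resolve_labels(lines):
    """Replace 'b label' with 'b <byte offset>' in a short list of ASM lines."""
    labels = {}
    instructions = []                      # (address, text) pairs
    address = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue                       # blank lines take no space
        if text.endswith(":"):
            labels[text[:-1]] = address    # label marks the NEXT instruction
        else:
            instructions.append((address, text))
            address += 4                   # every instruction is 4 bytes

    resolved = []
    for address, text in instructions:
        mnemonic, _, operand = text.partition(" ")
        if mnemonic == "b" and operand in labels:
            resolved.append(f"b {labels[operand] - address:#x}")
        else:
            resolved.append(text)
    return resolved

forward = ["b the_label", "li r1, 1", "", "the_label:", "stw r1, 0 (r31)"]
backward = ["the_label:", "add r1, r10, r20", "lwz r1, 0 (r15)", "", "b the_label"]

print(resolve_labels(forward))    # ['b 0x8', 'li r1, 1', 'stw r1, 0 (r31)']
print(resolve_labels(backward))   # ['add r1, r10, r20', 'lwz r1, 0 (r15)', 'b -0x8']
```

The forward jump comes out as 0x8 and the backward jump as a negative amount.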

Very simple to understand, very easy to implement. Now the branches in the examples above would be useless on their own. Why would you randomly skip over ASM functions? Well, branches are needed if you want to create a subroutine. Think of your list of ASM functions like a road. When the game is performing the list of functions one after another, think of that like traffic driving on the road. However, you can now put a fork in the road, and tell the traffic which route to take. The two routes will then later merge back together.

Now that you have a mental image of how ASM functions run, let's dive into Conditional Branches. Normal (unconditional) branches used only by themselves would not make logical sense. We need to create that 'fork' in the road. The easiest way to create that fork is conditional branching.

Conditional branches are branches that only execute based on an 'if'. For example let's look at the 'branch if not equal' ASM function...

Branch If Not Equal

bne the_label

li r1, 1

the_label:
stw r1, 0 (r31)

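Ok, the_label will only be 'jumped to' if the condition of the branch is true. In order to set up this 'if' for a conditional branch, we need to make a comparison. The most common ASM function to establish a comparison is the 'cmpwi' ASM function.

Compare Word Immediate

cmpwi rA, VALUE

The value in rA is compared to VALUE. Both are only read, so there is no destination Register here; the result of the comparison (less than, greater than, or equal) is recorded in the CPU's Condition Register, which is what the conditional branch checks.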

cmpwi r10, 0xA

The value in r10 will be compared to the value of 0xA. We have thus created our 'if statement'. So now add in the bne function from earlier....

cmpwi r10, 0xA
bne the_label

li r1, 1

the_label:
stw r1, 0 (r31)

So in this example, the value in r10 is compared to the value of 0xA. If the value in r10 is NOT equal to 0xA, you will 'jump' to the_label, thus skipping the 'li r1, 1' ASM function.
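The same flow can be written out in Python to make the two paths easier to follow. This is only a sketch: the function name run_bne_example, the starting value of r1, and the address in r31 are all made up.

```python
def run_bne_example(r10, r1, r31=0x80001000):
    """Model of: cmpwi r10, 0xA / bne the_label / li r1, 1 / the_label: stw r1, 0 (r31)"""
    memory = {}
    is_equal = (r10 == 0xA)      # cmpwi r10, 0xA
    if is_equal:                 # bne NOT taken, so we fall through to li
        r1 = 1                   # li r1, 1
    # the_label:
    memory[r31] = r1             # stw r1, 0 (r31)
    return memory[r31]

print(run_bne_example(r10=0xA, r1=7))   # 1  (li was executed)
print(run_bne_example(r10=0xB, r1=7))   # 7  (li was skipped)
```

Note the flip: 'bne' jumps when the values differ, so the 'li' only runs when they are equal.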

Let's look at another example using a different conditional branch...

Branch If Equal

cmpwi r10, 0xA
beq the_label

li r1, 1

b the_end

the_label:
stw r1, 0 (r31)

the_end:
stw r3, 0x0010 (r24)

As you can see, not only are we using 'beq' now, we are adding an unconditional branch and a second label called the_end. You should quickly see why I added the unconditional branch. Remember the road analogy I used earlier... Let's take the first route of the fork in the road (if r10 does equal 0xA).

If r10 equals 0xA, we jump to the_label. We thus execute the first 'stw'.... Now remember the traffic/road analogy: we go right on to the next ASM function, the second 'stw'. The label name itself is NOT a barrier in our 'road' in any way, shape, or form. The labels are just names the compiler uses to calculate the branch offsets so you don't have to do the work.

Let's instead take the second route of the fork in the road. If r10 isn't equal to 0xA, we don't jump at all to the_label. We instead go straight down our road to the 'li' function. After that, we encounter our unconditional branch. This obviously means we take the branch/jump no matter what. We do this because why would we go to the_label when our r10 value wasn't equal to 0xA? That would make no sense. Therefore we jump to the_end, thus skipping the first stw function.
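If you know any higher-level language, this fork is just an if/else. Here is a rough Python equivalent of the example; the names R31_ADDRESS and R24_ADDRESS and their values are made up stand-ins for whatever r31 and r24 hold.

```python
R31_ADDRESS = 0x80002000   # made-up contents of r31
R24_ADDRESS = 0x80003000   # made-up contents of r24

def run_beq_example(r10, r1, r3):
    memory = {}
    if r10 == 0xA:                        # cmpwi r10, 0xA / beq the_label
        memory[R31_ADDRESS] = r1          # the_label: stw r1, 0 (r31)
    else:
        r1 = 1                            # li r1, 1
                                          # b the_end (skips the first stw)
    memory[R24_ADDRESS + 0x10] = r3       # the_end: stw r3, 0x0010 (r24)
    return r1, memory

print(run_beq_example(r10=0xA, r1=5, r3=9))
print(run_beq_example(r10=0xB, r1=5, r3=9))
```

Both routes end at the second store, which is the 'merge back together' part of the road analogy.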

Here is a list of commonly used 'if-type' Branches..
beq - Branch If Equal
bne - Branch If Not Equal
bgt - Branch If Greater Than
blt - Branch If Less Than
bge - Branch If Greater Than Or Equal To
ble - Branch If Less Than Or Equal To
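As a quick reference, here is the list above expressed as a Python lookup table. Each branch name is paired with the comparison it stands for, where 'left' is the register from the compare and 'right' is what it was compared against. The helper name branch_taken is my own.

```python
import operator

# Branch mnemonic -> the comparison that makes the branch get taken
CONDITIONS = {
    "beq": operator.eq,   # left == right
    "bne": operator.ne,   # left != right
    "bgt": operator.gt,   # left >  right
    "blt": operator.lt,   # left <  right
    "bge": operator.ge,   # left >= right
    "ble": operator.le,   # left <= right
}

def branch_taken(mnemonic, left, right):
    """Would this conditional branch jump after comparing left to right?"""
    return CONDITIONS[mnemonic](left, right)

print(branch_taken("bge", 10, 10))   # True
print(branch_taken("blt", 10, 10))   # False
```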

Let's go over another Compare ASM function really quick... 

Compare Word

cmpw rA, rB

This will simply compare the values of two registers instead of using an immediate value.

cmpw r4, r8
bgt the_label

In this example, if the value in r4 is greater than the value in r8, then the jump to the_label will be taken.
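One thing worth knowing about cmpw and cmpwi is that they compare the values as signed numbers, so a register showing FFFFFFFF counts as -1, not as a huge number. A small Python sketch of that, with made-up register values (the helper names are my own):

```python
def to_signed32(value):
    """Interpret a 32-bit register value as a signed number."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def cmpw_then_bgt(r4, r8):
    """Model of 'cmpw r4, r8' followed by 'bgt': is the jump taken?"""
    return to_signed32(r4) > to_signed32(r8)

print(cmpw_then_bgt(5, 3))            # True
print(cmpw_then_bgt(0xFFFFFFFF, 1))   # False, because FFFFFFFF is -1 here
```

PowerPC also has cmplw and cmplwi (the 'l' stands for logical) for when you want the values compared as unsigned instead.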


Chapter 8: Extra Stuff

Let's go over some more symbols that we haven't covered yet.

Period (.):

You can use the period to give a value its own unique name. Btw, this has nothing to do with branch labels. Think of these like making definitions, or having 'macros'. The period is followed by the word 'set'. Just like with branch labels, I recommend incorporating an underscore in the name. For example:

.set ITEM_MUSHROOM, 0x4

...some ASM here....

li r31, ITEM_MUSHROOM

This now allows the ASM writer to put ITEM_MUSHROOM any time he/she wants to use the value of 0x4. It's a very basic 'macro', so to speak. It can come in handy if you are writing lengthy ASM.

Plus & Minus (+ and -):

The plus and minus symbols are used for conditional branches. Whenever a branch is done, you can help the CPU by giving it a 'hint'. The plus symbol stands for more-likely, while the minus symbol stands for less-likely. For example....

cmpwi r8, 0xC
bne+ the_label

The plus symbol next to the 'bne' will tell the CPU that the branch is more-likely to occur.

Hash Tag (#):

Whenever someone is writing very lengthy ASM, it can be handy to add notes that will let that someone know why he/she wrote those functions. Here's an example of using hash tags to add notes/comments:

lis r4, 0x8000 #Set upper half of the address to store the word to
stw r30, 0x157C (r4) #Store word to memory location 0x8000157C, the offset completes the lower half of the address
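The trick in that example, using the stw offset to fill in the lower half of the address, works because the offset is added to the register. Here is a rough Python model of the math. Keep in mind the offset is a signed 16-bit value, so this only works directly when the lower half is 0x7FFF or less. The helper names are my own.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed number."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def final_address(lis_value, offset):
    """Address used by 'lis rA, lis_value' followed by 'stw rS, offset (rA)'."""
    base = (lis_value & 0xFFFF) << 16
    return (base + sign_extend16(offset)) & MASK32

print(f"{final_address(0x8000, 0x157C):08X}")    # 8000157C
print(f"{final_address(0x8000, 0x9000):08X}")    # 7FFF9000 (offset counted as negative)
print(f"{final_address(0x8001, -0x7000):08X}")   # 80009000
```

So to reach an address like 0x80009000 this way, you would bump the lis value up by one and use a negative offset, or just build the full address with lis and ori.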


Chapter 9: Conclusion & Credits

Alright, this should help get you started writing PowerPC ASM for your cheat codes. For more ASM examples, visit this thread HERE.

Once you have created a couple of basic ASM codes, read the WiiBrew ASM tutorial HERE. It is more in depth and gives a more technical approach. Keep in mind, they are teaching ASM in a general sense for program creation, not for using ASM in codes specifically.

Credits:
IBM, Apple, and Motorola (creators of PowerPC ASM)
WiiBrew (a lot of information was gathered from there)
Star (taught me ASM)
#2
blyatful
#3
Wow, this guide is actually quite comprehensive and well-written. It helped me a lot in getting started on ASM and led me into learning Hex (which I didn't know beforehand, although it's actually pretty simple). Kudos
#4
Thank you for the kind words. Assembly by itself isn't that tough to learn, it's just very difficult coming up with code ideas from scratch and applying your ASM knowledge into making an actual cheat code.

I would suggest going through the codes forum and looking at the Source of basic ASM codes, ones written either by me or Star. We put plenty of good comments in our Source to help others understand how the code(s) work.

EDIT:

Here is essentially the most basic ASM you can do. Writing a value in a register before that value gets stored.

https://mkwii.org/showthread.php?tid=848

I was looking at a value in the RAM Viewer. I noticed it would get written to whenever I did a certain action with my item. Therefore I set a Write BP. I used my item, the value in memory got written with a new value, the Write BP got hit, and the game paused. I saw in the Code view that the value in Register 31 was getting stored to a spot in memory.

This is easy to manipulate. As you can see in the Source, I simply load a custom value into Register 31 (replacing the legit value), and then include the game's default ASM to allow the game to store the new value to memory. Very simple.