Edited recently, added better content
This is a tutorial made by me:
Instructions: An instruction is a single operation the processor (CPU) can perform, as defined by its instruction set.
Instruction Set: An instruction set is the group of instructions that a CPU can execute (reference: instructions)
Now: Let me start off with some basic instructions for x86.
mov - move
mov eax,eax
This copies the value of eax into eax, so the register does not change. mov always copies: the source keeps its value.
mov ebp, esp
This copies the address held in esp into ebp, so both registers now point to the same location. This is commonly used at the beginning of a function to set up its stack frame.
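mov dword [eax], 50h
mov [ecx], eax
The first instruction stores the value 50h in memory at the address held in eax (the accumulator register); eax itself is unchanged. The dword size is spelled out because the assembler can't infer it from an immediate value.
The second instruction stores the value of eax in memory at the address held in ecx. Neither register changes, only the memory that ecx points to.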
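If the difference between a register and the memory it points to is still fuzzy, here is a rough C sketch of the two stores above. It is only a toy model, not real assembly: the Machine struct, store32, load32 and run_mov_example are names I made up for illustration, and the 64-byte memory is an assumption to keep it small.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: two registers plus a tiny byte-addressed memory. */
typedef struct {
    uint32_t eax, ecx;
    uint8_t mem[64];
} Machine;

/* Models "mov dword [addr], value": writes 4 bytes into memory at addr. */
static void store32(Machine *m, uint32_t addr, uint32_t value) {
    assert(addr + 4 <= sizeof m->mem);   /* stay inside the toy memory */
    memcpy(&m->mem[addr], &value, 4);
}

/* Models reading a dword back from memory at addr. */
static uint32_t load32(const Machine *m, uint32_t addr) {
    uint32_t value;
    assert(addr + 4 <= sizeof m->mem);
    memcpy(&value, &m->mem[addr], 4);
    return value;
}

/* Models: mov dword [eax], 50h  followed by  mov [ecx], eax */
static void run_mov_example(Machine *m) {
    store32(m, m->eax, 0x50);     /* memory at eax receives 50h */
    store32(m, m->ecx, m->eax);   /* memory at ecx receives the value of eax */
}
```

Set eax and ecx to two different offsets inside mem, call run_mov_example, and read the memory back with load32: the registers keep their values, only the memory changed.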
add - arithmetic add
mov eax, 00401000h
Moves the hexadecimal value 0x00401000 into eax. Used as an address, eax now points to 0x00401000.
add eax, 100h
Adds the hexadecimal value 100 (256 in decimal) to eax, so eax now holds 0x00401100
sub - arithmetic subtract
Assuming eax still holds 0x00401100, let's subtract from it
sub eax, 100h
Why did we subtract from it? Because in this scenario, 0x00401100 was a location that didn't have any memory allocated to it, and trying to read or write it would raise an access violation exception. Registers can hold addresses that aren't accessible; the exception only happens when an instruction actually accesses memory through such an address, for example with [eax]. So now the value is 0x00401000 again.
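To double check the arithmetic above, here is a small C sketch. It assumes a 32-bit register can be modelled as a uint32_t, and add32, sub32 and walkthrough are hypothetical helper names, not anything from an assembler. Note that nothing here touches memory: add and sub on a register only change the number inside it.

```c
#include <assert.h>
#include <stdint.h>

/* Models "add reg, imm" on a 32-bit register (wraps around on overflow). */
static uint32_t add32(uint32_t reg, uint32_t imm) {
    return reg + imm;
}

/* Models "sub reg, imm" on a 32-bit register (wraps around on underflow). */
static uint32_t sub32(uint32_t reg, uint32_t imm) {
    return reg - imm;
}

/* Replays the three instructions from the tutorial text. */
static void walkthrough(void) {
    uint32_t eax = 0x00401000u;     /* mov eax, 00401000h */
    eax = add32(eax, 0x100u);       /* add eax, 100h */
    assert(eax == 0x00401100u);
    eax = sub32(eax, 0x100u);       /* sub eax, 100h */
    assert(eax == 0x00401000u);
}
```

The wraparound matters too: in this model, adding 1 to 0xFFFFFFFF gives 0, which is how a 32-bit register behaves.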
inc - increase/increment
dec - decrease/decrement
The inc and dec instructions increase or decrease a register (or a value in memory) by 1
inc eax
jmp - jump, jumps to a code location
jmp eax
The jump instruction is useful for changing the flow of execution in a program. It is much like goto in certain languages, and for some languages and compilers, goto will just compile to the jmp instruction.
The conditional versions of this jump include instructions like jne and je (jump if not equal, jump if equal). They're also referred to as jnz and jz (jump if not zero, jump if zero), which are the same instructions under different names.
cmp - compare
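This instruction compares two values. It subtracts the second operand from the first, sets the CPU flags based on the result, and throws the result away. A conditional jump placed after it reads those flags, which is how equality/inequality and greater/less-than checks are done.
cmp eax, 00401000h
jne 00501000h ; eax does not equal the value we compared it with, so jump
{code that runs if it was equal goes here}
0x00501000:
ret
There are more comparison instructions, another being test, which performs a bitwise AND instead of a subtraction and likewise only sets flags.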
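Here is a hedged C sketch of how cmp and the conditional jumps fit together. It only models the zero flag (the real CPU sets several more), and Flags, cmp32, test32, je_taken, jne_taken and cmp_walkthrough are made-up names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags register holding only the zero flag (ZF). */
typedef struct {
    bool zf;
} Flags;

/* Models "cmp a, b": computes a - b, keeps only the flags. */
static Flags cmp32(uint32_t a, uint32_t b) {
    Flags f = { .zf = (uint32_t)(a - b) == 0 };
    return f;
}

/* Models "test a, b": computes a AND b, keeps only the flags. */
static Flags test32(uint32_t a, uint32_t b) {
    Flags f = { .zf = (a & b) == 0 };
    return f;
}

/* je / jz jumps when ZF is set; jne / jnz jumps when ZF is clear. */
static bool je_taken(Flags f)  { return f.zf; }
static bool jne_taken(Flags f) { return !f.zf; }

/* Replays the cmp / jne pair with a value that does not match. */
static void cmp_walkthrough(void) {
    uint32_t eax = 0x00401100u;
    Flags f = cmp32(eax, 0x00401000u);   /* cmp eax, 00401000h */
    assert(jne_taken(f));                /* values differ, so jne jumps */
}
```

This also shows why je and jz are the same thing: "equal" just means the subtraction came out as zero.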
push - pushes a value onto the stack (esp - stack pointer register)
push eax
pop - pops a value off of the stack into the specified location
pop eax
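A quick C sketch of what push and pop do to esp, assuming 32-bit values and a stack that grows towards lower addresses like on x86. The Stack struct, STACK_WORDS, stack_init, push32 and pop32 are hypothetical names, and esp here is just a byte offset into a small array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16   /* assumed size of the toy stack, in dwords */

/* Hypothetical stack model: esp is a byte offset into slots. */
typedef struct {
    uint32_t slots[STACK_WORDS];
    uint32_t esp;
} Stack;

/* An empty stack has esp just past the end of the storage. */
static void stack_init(Stack *s) {
    s->esp = STACK_WORDS * 4;
}

/* Models "push value": esp goes down by 4, then the value is stored at [esp]. */
static void push32(Stack *s, uint32_t value) {
    assert(s->esp >= 4);                      /* no room left otherwise */
    s->esp -= 4;
    s->slots[s->esp / 4] = value;
}

/* Models "pop reg": the value at [esp] is read, then esp goes up by 4. */
static uint32_t pop32(Stack *s) {
    assert(s->esp + 4 <= STACK_WORDS * 4);    /* nothing to pop otherwise */
    uint32_t value = s->slots[s->esp / 4];
    s->esp += 4;
    return value;
}
```

Values come back in the reverse order they went in, so if you push eax and then ebx, the first pop gives you back ebx.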
call - pushes the return address (the address of the instruction after the call) onto the stack so the called function knows where to return to, then jumps to the function being called
call 0x00400000
0x00400000:
push ebp ; save the caller's stackframe
mov ebp, esp ; point to this subroutine's stackframe
mov eax, [ebp+8h] ; move the first argument into eax ([ebp+4h] is the return address)
jmp eax ; eax holds a pointer to 0x00401000
0x00401000:
mov esp, ebp ; begin returning from the function
pop ebp ; get the caller's stackframe again
ret ; return
As you can see, the flow of execution jumps to a new code location and returns from that location after doing work.
Furthermore
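push esi ; contains the address of "hello world"
call 0x00601000 ; print(buffer)
0x00601000:
push ebp
mov ebp, esp
push dword [ebp + 8] ; the first argument
call msvcrt.printf
add esp, 04 ; remove printf's argument from the stack
mov esp, ebp
pop ebp
ret ; return to the caller
Remember that push ebp and mov ebp, esp set up the stack frame for the subroutine. On entry to the function, [esp+0] holds the return address and the first argument is at [esp+4]. After push ebp and mov ebp, esp, everything sits 4 bytes further away: [ebp+0] is the saved ebp, [ebp+4] is the return address, and the first argument is at [ebp+8].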
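The stack offsets are easy to get wrong, so here is a rough C model of what the stack looks like once a function has been called with one argument and has run push ebp and mov ebp, esp. It assumes a 32-bit stack that grows downward; Cpu, STACK_BYTES, push32, read32 and enter_function are hypothetical names, and the addresses are just offsets into a small array.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64   /* assumed size of the toy stack */

/* Hypothetical CPU model: stack memory plus the two stack registers. */
typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp, ebp;
} Cpu;

/* Models "push value". */
static void push32(Cpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* Reads the dword at a stack address, like [ebp+N]. */
static uint32_t read32(const Cpu *c, uint32_t addr) {
    assert(addr % 4 == 0 && addr + 4 <= STACK_BYTES);
    return c->mem[addr / 4];
}

/* Models: push arg ; call f ; and inside f: push ebp ; mov ebp, esp */
static void enter_function(Cpu *c, uint32_t arg, uint32_t return_addr) {
    push32(c, arg);           /* caller pushes the argument */
    push32(c, return_addr);   /* call pushes the return address */
    push32(c, c->ebp);        /* push ebp saves the caller's frame */
    c->ebp = c->esp;          /* mov ebp, esp starts the new frame */
}
```

After enter_function, reading [ebp+0] gives the saved ebp, [ebp+4] gives the return address, and [ebp+8] gives the argument.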
I didn’t include all the instructions, nor all the jump instructions either
Quick reminder:
mov eax,ecx; this copies the value in ecx into eax [assembly comments come after the semicolon ";" and are ignored by the assembler]
add dword [ecx], 0xFF; this adds 255 (decimal), which is FF in hexadecimal, to the value stored in memory at the address held in ecx
add dword [ecx], 255; you can do this as well
register - no brackets = the value held in the register itself
[register] - brackets = the value held in memory at the address stored in the register
As you can see, each instruction is used together with its operands. So think of it like this:
add = instruction (the mnemonic)
add eax,15 = the instruction plus its operands, which the assembler encodes as machine code
mov byte [eax+15], 0x00; this stores the value 0 in the byte of memory 15 places past the address in eax
Register: Registers are, for now, basically fast-access storage units inside the CPU for holding values (EAX, EBX, ECX, EDX, EDI, ESI, ESP, EBP). That is the list of 32-bit general-purpose registers.
Stackframe: Keeping it basic, the region of the stack that belongs to one subroutine call. It holds the arguments pushed for the function, the return address, and whatever values the function pushes and pops while it runs.
Now let's make this more of a program: (I’m not going to include all of the code, but you’ll still understand it conceptually)
This is a simple register check that is compatible with FASM in Windows:
add dword [eax], 100
cmp dword [eax], 100
je successful ; if the comparison is not successful then this jump is skipped
cmp dword [eax], 100
jne fail
successful:
ccall [printf], "good" ; printf is a C function
push 0h
call [ExitProcess]
fail:
push 0h
call [ExitProcess] ; immediately exits the process
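For comparison, this is roughly what the same check looks like in C. It is only a sketch of the logic: register_check is a made-up name, the pointer stands in for the address held in eax, and it returns to the caller instead of calling ExitProcess.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 when the "successful" path runs, 0 when the "fail" path runs. */
static int register_check(uint32_t *cell) {
    assert(cell != NULL);     /* eax must hold a valid address */
    *cell += 100;             /* add dword [eax], 100 */
    if (*cell == 100) {       /* cmp dword [eax], 100 then je successful */
        printf("good");       /* ccall [printf], "good" */
        return 1;
    }
    return 0;                 /* jne fail */
}
```

Notice that the check only succeeds when the memory held 0 before the add, since anything else plus 100 is no longer 100.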
Memory Address: A memory address is just a specific location in an address space
Important Memory Regions:
- .data is where most global variables and static information is located
- .code/.text is where the executable routines of the program are located
Edited by jasonfish4, 13 November 2018 - 11:48 PM.