Architecture 1001: x86-64 Assembly

Introduction

About this Class

x86 - named after the progression of Intel chips: 8086, 80186, 80286, etc.
x86-64 - the 64-bit extension of x86, used by server systems and supercomputers

Refresher: Binary to Hex to Decimal

Decimal (base 10)    Binary (base 2)    Hexadecimal (aka Hex, base 16)
0                    0000b              0x00
1                    0001b              0x01
2                    0010b              0x02
3                    0011b              0x03
4                    0100b              0x04
5                    0101b              0x05
6                    0110b              0x06
7                    0111b              0x07
8                    1000b              0x08
9                    1001b              0x09
10                   1010b              0x0A
11                   1011b              0x0B
12                   1100b              0x0C
13                   1101b              0x0D
14                   1110b              0x0E
15                   1111b              0x0F

Example:

Given: 0x1337

To Decimal:

= 1 x 16^3 + 3 x 16^2 + 3 x 16^1 + 7 x 16^0

= 1 x 4096 + 3 x 256 + 3 x 16 + 7 x 1

= 4096 + 768 + 48 + 7

= 4919 or 4919d

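To Binary:

Place value  4096  2048  1024   512   256   128    64    32    16     8     4     2     1
Bit             1     0     0     1     1     0     0     1     1     0     1     1     1

= 4096 + 512 + 256 + 32 + 16 + 4 + 2 + 1

= 1001100110111 or 1001100110111b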
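The conversion above can be checked with a short Python sketch. The function name hex_to_decimal is made up for illustration; it expands each hex digit by its power of 16, the same way the worked example does.

```python
def hex_to_decimal(hex_digits: str) -> int:
    """Sum digit x 16^position, counting positions from the rightmost digit."""
    total = 0
    for position, digit in enumerate(reversed(hex_digits)):
        total += int(digit, 16) * 16 ** position
    return total


value = hex_to_decimal("1337")
print(value)               # 4919
print(format(value, "b"))  # 1001100110111
```

Python's built-in int("1337", 16) gives the same result; the loop is only there to mirror the manual steps.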

Refresher: two's complement negative numbers

Signed numbers - can hold either positive or negative values
Unsigned numbers - can hold only non-negative values
Signed char - 0x00 to 0x7F represent 0 to 127, and 0x80 to 0xFF represent -128 to -1
Unsigned char - can hold 0 to 255

A negative value is represented as the "two's complement" of its positive value, which is computed by flipping all bits and adding 1.

Example:

Given: 0xFF is -1

= 15 x 16^1 + 15 x 16^0

= 255 in decimal (when read as unsigned)

= 11111111b in binary

= 00000000b (flip)

= 00000000b + 1

= 00000001b

= 1, so 0xFF as a signed char is -1

Example:

Given: -128

= 10000000b in binary (0x80)

= 01111111b (flip)

= 01111111b + 1

= 10000000b

= 128 in decimal, so 0x80 as a signed char is -128
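The flip-and-add-1 steps can be sketched in Python. The helper names twos_complement and as_signed are hypothetical, and the 8-bit default width is an assumption chosen to match the signed char examples.

```python
def twos_complement(value: int, bits: int = 8) -> int:
    """Flip all bits, add 1, and keep only the low `bits` bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


def as_signed(raw: int, bits: int = 8) -> int:
    """Interpret a raw bit pattern as a two's complement signed number."""
    sign_bit = 1 << (bits - 1)
    return raw - (1 << bits) if raw & sign_bit else raw


print(hex(twos_complement(0x01)))  # 0xff
print(as_signed(0xFF))             # -1
print(as_signed(0x80))             # -128
print(as_signed(0x7F))             # 127
```

Note that 0x80 is its own two's complement in 8 bits, which is why -128 has no positive counterpart in a signed char.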

Questions:

1. What is the hexadecimal two's complement representation of the lowest possible value of an 8-byte signed value?

0x8000000000000000 or 8000000000000000 or 8000000000000000h

2. What does that value correspond to in decimal?

-9223372036854775808 (that is, -2^63, roughly -9.22 x 10^18)
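A quick way to check both answers, as a sketch using Python's built-in int.from_bytes with signed=True (the variable names are illustrative):

```python
# The 8-byte pattern with only the sign bit set, written most significant byte first.
raw = bytes.fromhex("8000000000000000")

lowest = int.from_bytes(raw, byteorder="big", signed=True)
print(lowest)               # -9223372036854775808
print(lowest == -(2 ** 63)) # True
```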

Refresher: C data type sizes

char - single byte
short - two bytes
word (WORD) - Intel's native 16-bit data size, from when x86 was a 16-bit architecture
double word (DWORD) - 32 bits, from when x86 was expanded to 32-bit
quad word (QWORD) - 64 bits, for 64-bit
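The sizes above can be confirmed with Python's struct module. This is a sketch: the "<" prefix asks struct for its standard sizes rather than the host compiler's, and the pairing of format codes with WORD/DWORD/QWORD is an assumption made for illustration.

```python
import struct

# "<" selects struct's standard sizes, independent of the host platform.
SIZES = {
    "char / byte": struct.calcsize("<b"),
    "short / WORD": struct.calcsize("<h"),
    "DWORD": struct.calcsize("<i"),
    "QWORD": struct.calcsize("<q"),
}

for name, size in SIZES.items():
    print(f"{name}: {size} byte(s), {size * 8} bits")
```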

Background: Endianness

Little Endian - (little end first) the least significant byte (LSB) of a word or larger value is stored at the lowest address, e.g. 0x12345678 -> 0x78, 0x56, 0x34, 0x12

Intel is Little Endian

Big Endian - (big end first) the most significant byte (MSB) of a word or larger value is stored at the lowest address, e.g. 0x12345678 -> 0x12, 0x34, 0x56, 0x78

Network traffic is Big Endian, as are many RISC systems. ARM started out as Little Endian and is now Bi-Endian.

Endianness applies only in memory, NOT in registers
Endianness applies to bytes, NOT to bits
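The two byte orders for 0x12345678 can be reproduced with Python's int.to_bytes, as a minimal sketch. Index 0 of each result stands for the lowest memory address.

```python
value = 0x12345678

little = value.to_bytes(4, byteorder="little")  # LSB at the lowest address
big = value.to_bytes(4, byteorder="big")        # MSB at the lowest address

print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78

# Reading the bytes back with the matching byte order recovers the value.
print(hex(int.from_bytes(little, byteorder="little")))  # 0x12345678
```

The bits inside each byte are not reordered; only the order of whole bytes changes.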

Computer Registers

Memory Hierarchy

In the memory hierarchy, the first 3 levels at the top represent short-term memory and the last 3 levels at the bottom represent long-term memory.

x86-64 General Purpose Registers

Registers - small memory storage areas built into the processor (volatile memory)
Intel - x86-64 has 16 "General Purpose" Registers plus the instruction pointer, which points at the next instruction to be executed

x86-32, registers are 32 bits wide
x86-64, registers are 64 bits wide

Intel Register Evolution
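One way to picture the evolution from 16-bit to 32-bit to 64-bit registers: the narrower registers are the low bits of the wider ones. The sketch below uses bit masks to model this. The names EAX, AX, AH and AL are the standard x86 sub-register names and do not appear elsewhere in these notes, and the value placed in rax is arbitrary.

```python
rax = 0x1122334455667788  # arbitrary 64-bit value for illustration

eax = rax & 0xFFFFFFFF   # low 32 bits (the x86-32 register)
ax = rax & 0xFFFF        # low 16 bits (the original 16-bit register)
al = rax & 0xFF          # low byte of AX
ah = (rax >> 8) & 0xFF   # high byte of AX

print(hex(eax))  # 0x55667788
print(hex(ax))   # 0x7788
print(hex(ah))   # 0x77
print(hex(al))   # 0x88
```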

Intel Recommended Register Conventions

RAX - Stores the function return values
RBX - Base Pointer to the data section
RCX - Counter for string and loop operations
RDX - I/O Pointer
RSI - Source Index Pointer for string operations
RDI - Destination Index Pointer for string operations
RSP - Stack (top) Pointer, points to the last value that was put on the stack
RBP - Stack Frame Base Pointer, points to the base of the current stack frame
RIP - Instruction Pointer, pointer to the next instruction to execute

First Instruction

No-Operation (NOP) - no registers, no values, nothing. It is just there to pad/align bytes or to delay time. Its one-byte opcode is 0x90.
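The padding use can be sketched in Python as byte manipulation. The function pad_with_nops and the 16-byte default alignment are assumptions for illustration; the single input byte 0xC3 (a ret instruction) is only an example of existing code bytes.

```python
NOP = 0x90  # one-byte NOP opcode


def pad_with_nops(code: bytes, alignment: int = 16) -> bytes:
    """Append NOP bytes until len(code) is a multiple of `alignment`."""
    padding = (-len(code)) % alignment
    return code + bytes([NOP]) * padding


padded = pad_with_nops(b"\xc3", alignment=4)
print(padded.hex(" "))  # c3 90 90 90
```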

The Stack

Overview

Stack - a Last-In-First-Out (LIFO) data structure where data is pushed onto the top of the stack and popped off the top; also a conceptual area of RAM

Different OSes start the stack at different addresses by their own convention. Some also randomize it with Address Space Layout Randomization (ASLR).

By convention, the stack grows toward lower addresses. Adding to the stack means the top of the stack is now at a lower address.

RSP - points at the top of the stack, which is the lowest stack address in use
You can find on the stack:
- Return addresses of function calls
- Local variables
- Arguments passed to a function
- Save space for registers
- Dynamically allocated memory via alloca()

Push and Pop Instructions

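Push - instruction that automatically decrements the stack pointer, RSP, by 8, then stores the value at the new top of the stack
r/mX - a term that refers to r/m8, r/m16, r/m32 or r/m64 in the Intel manuals, meaning the operand can be a register or a memory location
[ ] - brackets mean to treat the value inside as a memory address and fetch the value at that address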
- Register -> rbx
- Memory, base-only -> [rbx]
- Memory, base+index*scale -> [rbx+rcx*X]
- Memory, base+index*scale+displacement -> [rbx+rcx*X+Y]
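The address forms above all reduce to base + index*scale + displacement, where X is the scale (1, 2, 4 or 8 on x86) and Y is the displacement. A minimal Python sketch, with a hypothetical effective_address helper and a made-up memory dictionary standing in for RAM:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def effective_address(base: int, index: int = 0, scale: int = 1,
                      displacement: int = 0) -> int:
    """Compute base + index*scale + displacement, wrapped to 64 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & MASK64


# Hypothetical memory contents, keyed by address.
memory = {0x1000: 0xAA, 0x1020: 0xBB, 0x1028: 0xCC}

rbx, rcx = 0x1000, 4
print(hex(memory[effective_address(rbx)]))             # [rbx]         -> 0xaa
print(hex(memory[effective_address(rbx, rcx, 8)]))     # [rbx+rcx*8]   -> 0xbb
print(hex(memory[effective_address(rbx, rcx, 8, 8)]))  # [rbx+rcx*8+8] -> 0xcc
```

Without brackets, rbx is just the value 0x1000; with brackets, it is the value stored at address 0x1000.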

Pop - instruction that reads the value at the top of the stack into a register or memory location, then increments RSP by 8

Push/Pop in 64-bit mode decrement and increment RSP by 8
Push/Pop in 32-bit mode decrement and increment ESP by 4
Push/Pop in 16-bit mode decrement and increment SP by 2
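The 64-bit push/pop behavior can be modeled with a toy Python class. This is a sketch, not how a CPU is implemented: ToyStack is a hypothetical name, the starting RSP value is arbitrary, and memory is a plain dictionary keyed by address.

```python
class ToyStack:
    """Toy model of 64-bit push/pop: the stack grows toward lower addresses."""

    def __init__(self, rsp: int = 0x7FFFFFFFE000):  # arbitrary start address
        self.rsp = rsp
        self.memory = {}

    def push(self, value: int) -> None:
        self.rsp -= 8                                   # decrement RSP first
        self.memory[self.rsp] = value & 0xFFFFFFFFFFFFFFFF

    def pop(self) -> int:
        value = self.memory[self.rsp]                   # read the top of the stack
        self.rsp += 8                                   # then increment RSP
        return value


stack = ToyStack()
start = stack.rsp
stack.push(0x1111)
stack.push(0x2222)
print(hex(start - stack.rsp))  # 0x10 (two pushes moved RSP down by 16)
print(hex(stack.pop()))        # 0x2222 (last in, first out)
print(hex(stack.pop()))        # 0x1111
```

As in this model, pop only moves RSP; the old value stays in memory below the top of the stack until something overwrites it.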

Examples (RBP is red, RSP is blue):

If a diagram shows the High and Low Addresses flipped, remember that moving toward the Low Address is (-) and moving toward the High Address is (+).
