Memory Layout & Architecture

Overview

Every running process has a virtual address space divided into segments with different purposes and permissions. Understanding this layout is the foundation of exploit development — you need to know where your input lands, where the return address lives, and which memory regions are executable.

This file covers x86-64 Linux process memory. The concepts apply broadly to x86 (32-bit) and other Unix-like systems with minor differences in address sizes and calling conventions.

Process Memory Segments

  High addresses
┌──────────────────────┐
│     Kernel space     │  (inaccessible from userland)
├──────────────────────┤
│        Stack         │  ↓ grows downward
│     (local vars,     │
│      return addrs,   │
│      saved regs)     │
├──────────────────────┤
│          ↕           │  (mostly unmapped; mmap regions and shared libraries sit here)
├──────────────────────┤
│        Heap          │  ↑ grows upward
│     (malloc/free)    │
├──────────────────────┤
│        BSS           │  (uninitialized globals, zeroed)
├──────────────────────┤
│        Data          │  (initialized globals)
├──────────────────────┤
│        Text          │  (executable code, read-only)
└──────────────────────┘
  Low addresses

Text Segment

Contains the compiled machine code of the program. Mapped read-only and executable. On non-PIE binaries, the text segment loads at a fixed address (typically 0x400000 on x86-64 Linux). PIE binaries load at a base address that ASLR randomizes on each run.
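
Whether a binary loads at a fixed address can be read from its ELF header. The following is a minimal sketch, assuming a well-formed ELF file; the helper names elf_type and elf_type_of_file are made up for illustration. Note that an e_type of ET_DYN covers both PIE executables and shared objects, so this check alone cannot tell the two apart.

```python
import struct

ET_EXEC = 2  # executable linked at a fixed address (non-PIE)
ET_DYN = 3   # position-independent executable or shared object


def elf_type(header: bytes) -> str:
    """Classify an ELF file from its first 18 bytes (hypothetical helper)."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field immediately after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    if e_type == ET_EXEC:
        return "non-PIE (fixed load address)"
    if e_type == ET_DYN:
        return "PIE or shared object (base chosen at load time)"
    return f"other (e_type={e_type})"


def elf_type_of_file(path: str) -> str:
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return elf_type(f.read(18))
```

For example, elf_type_of_file("./binary") could be run before deciding whether hard-coded code addresses are usable.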

Data and BSS Segments

Data — initialized global and static variables (int x = 42;)
BSS — uninitialized global and static variables (int y;), zeroed at load

Both are readable and writable but not executable.

Heap

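Dynamically allocated memory via malloc(), calloc(), and realloc(). The brk-based heap starts above BSS and grows upward toward higher addresses; glibc serves large requests from separate mmap regions instead. Managed by the allocator (glibc ptmalloc2 on Linux).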

Stack

Stores function call frames — local variables, saved registers, return addresses. Grows downward from high addresses. Each function call pushes a new frame; each return pops it.

Examining Memory Layout

# View memory mappings of a running process
cat /proc/<pid>/maps

# Or use pmap for a cleaner view
pmap <pid>

Example output (abridged: offset, device, and inode columns omitted) for a process running /tmp/binary:

00400000-00401000 r--p  /tmp/binary     ← ELF header
00401000-00402000 r-xp  /tmp/binary     ← text (executable)
00402000-00403000 r--p  /tmp/binary     ← rodata
00403000-00404000 r--p  /tmp/binary     ← RELRO (relocated data, read-only after startup)
00404000-00405000 rw-p  /tmp/binary     ← data/bss (writable)
00405000-00426000 rw-p  [heap]          ← heap (brk-based)
7ffff7d00000-...  r-xp  /lib/libc.so.6  ← shared library
7ffffffde000-...  rw-p  [stack]         ← stack

The permission flags show read (r), write (w), and execute (x) access for each region. The fourth character is p for a private (copy-on-write) mapping or s for a shared one.
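
The maps format is easy to process in a script. The following is a minimal sketch, assuming the standard six-column layout of /proc/<pid>/maps (address range, permissions, offset, device, inode, optional path); the names Mapping, parse_maps, and writable_and_executable are hypothetical, and the SAMPLE text is illustrative rather than captured from a real process.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps(text: str) -> List[Mapping]:
    """Parse the contents of /proc/<pid>/maps into Mapping records."""
    mappings = []
    for line in text.splitlines():
        # Columns: range, perms, offset, dev, inode, then an optional path
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(start, end, fields[1], path))
    return mappings


def writable_and_executable(mappings: List[Mapping]) -> List[Mapping]:
    """Regions that are both writable and executable."""
    return [m for m in mappings if "w" in m.perms and "x" in m.perms]


# Illustrative input in the full six-column format
SAMPLE = """\
00400000-00401000 r--p 00000000 08:01 131 /tmp/binary
00401000-00402000 r-xp 00001000 08:01 131 /tmp/binary
00405000-00426000 rw-p 00000000 00:00 0 [heap]
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""

for m in parse_maps(SAMPLE):
    print(f"{m.start:#x}-{m.end:#x} {m.perms} {m.path}")
```

On a Linux system, parse_maps(open("/proc/self/maps").read()) would apply the same parsing to the running Python interpreter.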

With GDB

# GDB
# https://www.gnu.org/software/gdb/
gdb ./binary

# Inside GDB:
(gdb) info proc mappings
(gdb) maintenance info sections

With pwndbg loaded, the vmmap command provides a colored, annotated memory map:

# pwndbg
# https://github.com/pwndbg/pwndbg
# Inside GDB with pwndbg:
pwndbg> vmmap

x86-64 Registers

General-Purpose Registers

Register	Purpose	Caller/Callee Saved
RAX	Return value, syscall number	Caller
RBX	General purpose	Callee
RCX	4th argument (function calls), counter	Caller
RDX	3rd argument	Caller
RSI	2nd argument	Caller
RDI	1st argument	Caller
RBP	Base pointer (frame pointer)	Callee
RSP	Stack pointer	Callee
R8	5th argument	Caller
R9	6th argument	Caller
R10-R11	General purpose	Caller
R12-R15	General purpose	Callee

Special Registers

RIP — instruction pointer (next instruction to execute)
RFLAGS — status flags (zero, carry, sign, overflow)

In exploit development, controlling RIP is the primary goal — it determines what code executes next.

System V AMD64 Calling Convention (Linux)

Integer and pointer arguments are passed in registers first, then on the stack (floating-point arguments use XMM0-XMM7 instead):

RDI — 1st argument
RSI — 2nd argument
RDX — 3rd argument
RCX — 4th argument
R8 — 5th argument
R9 — 6th argument
Additional arguments pushed onto the stack (right to left)

The return value goes in RAX, with RDX holding the upper 64 bits of a 128-bit return.
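
The register order above can be turned into a small lookup. This is a sketch covering integer and pointer arguments only; arg_locations is a hypothetical helper, and the stack offsets assume the state at callee entry, before the prologue runs, when [rsp] still holds the return address.

```python
from typing import List

INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def arg_locations(n_args: int) -> List[str]:
    """Where each of n integer/pointer arguments lives at callee entry."""
    locations = []
    for i in range(n_args):
        if i < len(INT_ARG_REGS):
            locations.append(INT_ARG_REGS[i])
        else:
            # [rsp] holds the return address at entry, so the 7th
            # argument sits at [rsp+8], the 8th at [rsp+16], and so on.
            offset = 8 * (i - len(INT_ARG_REGS) + 1)
            locations.append(f"[rsp+{offset}]")
    return locations


print(arg_locations(8))
```

This matters when building a call by hand: the first six values must be loaded into registers, and only the remainder is read from the stack.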

Linux Syscall Convention

Syscalls use a different register mapping than function calls:

Register	Purpose
RAX	Syscall number
RDI	1st argument
RSI	2nd argument
RDX	3rd argument
R10	4th argument (not RCX)
R8	5th argument
R9	6th argument

Invoke with the syscall instruction. The return value comes back in RAX (a negative errno value on failure), and the kernel clobbers RCX and R11.

Common syscall numbers (x86-64):

Syscall	Number	Arguments
read	0	fd, buf, count
write	1	fd, buf, count
open	2	filename, flags, mode
execve	59	filename, argv, envp
dup2	33	oldfd, newfd
socket	41	domain, type, protocol
connect	42	sockfd, addr, addrlen
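
Combining the register mapping with the numbers in the table gives the register state needed just before a syscall instruction. This is a minimal sketch; syscall_registers is a hypothetical helper, and the address 0x404000 stands in for wherever a "/bin/sh" string might live in a target process.

```python
# Numbers copied from the table above (x86-64 Linux)
SYSCALL_NUMBERS = {
    "read": 0, "write": 1, "open": 2, "dup2": 33,
    "socket": 41, "connect": 42, "execve": 59,
}
# Note r10 in the fourth slot, where the function-call convention uses rcx
SYSCALL_ARG_REGS = ["rdi", "rsi", "rdx", "r10", "r8", "r9"]


def syscall_registers(name: str, *args: int) -> dict:
    """Register values required immediately before `syscall` executes."""
    if len(args) > len(SYSCALL_ARG_REGS):
        raise ValueError("a syscall takes at most six arguments")
    regs = {"rax": SYSCALL_NUMBERS[name]}
    regs.update(zip(SYSCALL_ARG_REGS, args))
    return regs


# execve("/bin/sh", NULL, NULL) with a hypothetical string address
BIN_SH_ADDR = 0x404000
print(syscall_registers("execve", BIN_SH_ADDR, 0, 0))
```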

Stack Frame Layout

When a function is called, the stack frame looks like this (x86-64):

  High addresses (bottom of stack)
┌──────────────────────┐
│   Caller's frame     │
├──────────────────────┤
│   Arguments > 6      │  (pushed right to left, if any)
├──────────────────────┤
│   Return address     │  ← pushed by CALL instruction
├──────────────────────┤
│   Saved RBP          │  ← pushed by function prologue
├──────────────────────┤  ← RBP points here
│   Local variable 1   │
│   Local variable 2   │
│   ...                │
│   Buffer             │
└──────────────────────┘  ← RSP points here
  Low addresses (top of stack)

Function Prologue and Epilogue

; Prologue — set up frame
push rbp           ; save caller's base pointer
mov  rbp, rsp      ; set new base pointer
sub  rsp, 0x40     ; allocate 64 bytes for local variables

; ... function body ...

; Epilogue — tear down frame
leave              ; equivalent to: mov rsp, rbp; pop rbp
ret                ; pop return address into RIP

The ret instruction pops the top of the stack into RIP. If an attacker overwrites the saved return address on the stack, ret will jump to the attacker-controlled address.
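
The effect of such an overwrite can be modeled without touching a real process. This is a simulation sketch, assuming the buffer occupies the entire 0x40-byte local area from the prologue above, with no padding or stack canary between it and the saved RBP; real compilers often lay frames out differently. All addresses are hypothetical.

```python
import struct

BUF_SIZE = 0x40               # matches `sub rsp, 0x40` in the prologue
SAVED_RBP = 0x7FFFFFFFE100    # hypothetical caller frame pointer
ORIGINAL_RET = 0x401200       # hypothetical legitimate return address
TARGET = 0x401156             # hypothetical address the attacker wants


def build_frame() -> bytearray:
    """Frame bytes ordered from lowest address (buffer) to highest (return address)."""
    return bytearray(
        b"\x00" * BUF_SIZE
        + struct.pack("<Q", SAVED_RBP)
        + struct.pack("<Q", ORIGINAL_RET)
    )


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Mimic a copy into the buffer that never checks the buffer size."""
    frame[0:len(data)] = data


def return_address(frame: bytearray) -> int:
    """Value that `ret` would pop into RIP."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]


frame = build_frame()
# 64 bytes fill the buffer, 8 bytes overwrite saved RBP, 8 bytes replace the return address
payload = b"A" * BUF_SIZE + b"B" * 8 + struct.pack("<Q", TARGET)
unchecked_copy(frame, payload)
print(hex(return_address(frame)))  # prints 0x401156
```

Under these assumptions the return address sits 72 bytes (0x40 + 8) past the start of the buffer, which is the offset an exploit has to reach.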

GDB Quick Reference for Memory Inspection

# GDB
# https://www.gnu.org/software/gdb/
gdb ./binary

# Set Intel syntax (easier to read for exploit dev)
(gdb) set disassembly-flavor intel

# Disassemble a function
(gdb) disassemble main
(gdb) disassemble vulnerable

# Set a breakpoint
(gdb) break *0x401150
(gdb) break vulnerable

# Run with arguments
(gdb) run AAAA

# Examine registers
(gdb) info registers
(gdb) info registers rsp rbp rip

# Examine memory
(gdb) x/20xw $rsp          # 20 words (4 bytes each) in hex at RSP
(gdb) x/10xg $rsp          # 10 giant (8-byte) words at RSP
(gdb) x/s $rdi             # string at RDI
(gdb) x/10i $rip           # 10 instructions at RIP

# Step execution
(gdb) si                    # single instruction
(gdb) ni                    # next instruction (skip calls)
(gdb) c                     # continue

# Print expression
(gdb) print $rbp - $rsp     # current frame size (base pointer minus stack pointer)
(gdb) print/x $rsp          # RSP in hex

x86 vs x86-64 Key Differences

Feature	x86 (32-bit)	x86-64 (64-bit)
Address size	4 bytes	8 bytes
Register names	EAX, EBX, ESP, EBP, EIP	RAX, RBX, RSP, RBP, RIP
Calling convention	Arguments on stack	First 6 in registers
Stack alignment	4-byte	16-byte (before CALL)
Syscall instruction	`int 0x80`	`syscall`
Return address size	4 bytes	8 bytes
Packing function	`p32()` in pwntools	`p64()` in pwntools

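On x86-64 Linux with the usual 48-bit address space, the upper two bytes of every user-space address are zero (e.g., 0x0000000000401234), and low addresses such as non-PIE code contain even more null bytes. This complicates exploits that deliver their payload through string functions: strcpy stops copying at the first null byte.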
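
The null-byte problem is easy to see by packing an address the way pwntools' p64() does. This sketch uses only the standard library; the local p64 is a stand-in for the pwntools function of the same name, c_string_copy is a hypothetical model of what a strcpy-style copy transfers, and the 72-byte padding is an assumed offset to the return address.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


def c_string_copy(data: bytes) -> bytes:
    """Bytes a strcpy-style copy would transfer: everything before the first NUL."""
    return data.split(b"\x00", 1)[0]


addr = p64(0x401234)
print(addr)  # prints b'4\x12@\x00\x00\x00\x00\x00'

# Assumed layout: 72 bytes of padding, then the address, then more payload
payload = b"A" * 72 + addr + b"more data"
copied = c_string_copy(payload)
print(len(payload), len(copied))  # prints 89 75
```

Because the address is stored little-endian, its three non-zero bytes come first and do get copied; everything after them is lost. An address can therefore usually appear only once, at the very end of a payload delivered through a string function.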