Malware Analysis — Chapter 01

x86 Architecture Overview

Malware doesn't conjure itself out of nowhere — it runs on a real CPU, in real memory, using the same architecture as everything else. Before I can look at a malicious binary and understand what it's doing, I need to understand the environment it's living in. This is the foundation that everything else in malware analysis builds on.
// Contents
  1. The Von Neumann Architecture
  2. CPU Registers
  3. Status Flag Registers
  4. Segment Registers
  5. Memory Layout
  6. The Stack

01 The Von Neumann Architecture

Most modern CPUs — the ones we'll encounter in malware analysis — are based on the Von Neumann architecture. It defines how a processor fetches, decodes, and executes instructions. That three-step loop is the heartbeat of every program running on a system, including any malware.

The key components are:

CPU Component
Control Unit

Fetches the next instruction from memory. Uses the Instruction Pointer register to know where to look.

CPU Component
Arithmetic Logic Unit

Actually executes the instruction — does the maths, the comparisons, the logic. Puts the result in a register or back in memory.

CPU Component
Registers

Ultra-fast, tiny storage built into the CPU itself. Holds whatever the CPU is actively working with.

External
Main Memory (RAM)

Where the full program — its code and data — lives while it's running. Much larger than registers but slower.

fetch_decode_execute.py
# Simplified fetch-decode-execute cycle
while program_running:
    instruction = memory[EIP]                   # fetch from address in EIP/RIP
    EIP += instruction.length                   # advance instruction pointer
    decoded = control_unit.decode(instruction)  # decode it
    result = ALU.execute(decoded)               # execute it
    store(result, register_or_memory)           # save the result

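What matters for malware analysis: the Instruction Pointer (EIP in 32-bit, RIP in 64-bit) always tells the CPU what to run next. Code cannot write to it directly; it changes only through control-flow instructions such as jmp, call and ret. Most control-flow hijacking techniques (stack buffer overflows, ROP chains, shellcode injection) ultimately work by corrupting a value, such as a saved return address, that one of those instructions later loads into this register.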
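To make the loop concrete, here is a minimal runnable sketch of a toy machine. The opcodes (LOAD, ADD, JMP, HALT) and the tuple encoding are invented for illustration and are not real x86; the point is only that a jump is nothing more than a write to the instruction pointer.

```python
# Toy machine: the opcodes and encoding are made up for this example.
def run(program):
    """Run a list of (opcode, operand) tuples and return the final registers."""
    regs = {"EIP": 0, "EAX": 0}
    while regs["EIP"] < len(program):
        opcode, operand = program[regs["EIP"]]   # fetch
        regs["EIP"] += 1                          # advance instruction pointer
        if opcode == "LOAD":                      # decode and execute
            regs["EAX"] = operand
        elif opcode == "ADD":
            regs["EAX"] = (regs["EAX"] + operand) & 0xFFFFFFFF
        elif opcode == "JMP":
            regs["EIP"] = operand                 # redirecting control flow
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return regs


program = [
    ("LOAD", 1),    # index 0
    ("JMP", 3),     # index 1: skip the next instruction
    ("ADD", 100),   # index 2: never executed
    ("ADD", 2),     # index 3
    ("HALT", 0),    # index 4
]
print(run(program))
```

Because the JMP overwrites EIP, the ADD 100 at index 2 never runs and EAX finishes at 3 rather than 103. Whoever controls the value loaded into the instruction pointer controls what executes next.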

02 CPU Registers

Registers are the fastest storage available — they sit directly on the CPU die. Because there are so few of them, they're split by purpose. There are four main categories: the Instruction Pointer, General Purpose Registers, Status Flags, and Segment Registers.

The Instruction Pointer

Holds the address of the next instruction to execute. Simple concept, enormous importance. Called EIP in 32-bit systems and RIP in 64-bit systems. You'll see this referenced constantly in exploit writeups — it's always the prize.

General Purpose Registers

These are the CPU's working hands. Each has a conventional role (accumulated results, loop counters, stack tracking), but they can generally be used for any computation. Knowing their conventions is what makes reading disassembly faster.

Each register can be accessed at different widths. RAX (64-bit) contains EAX (lower 32 bits), which contains AX (lower 16 bits), which in turn splits into AH (upper 8 bits) and AL (lower 8 bits). This sub-register access shows up constantly in assembly.

64-bit   32-bit   16-bit   8-bit
RAX      EAX      AX       AH / AL
RCX      ECX      CX       CH / CL
RSP      ESP      SP
RBP      EBP      BP
Register | Nickname | What it's for | 64-bit
EAX | Accumulator | Stores arithmetic results; holds return values from function calls | RAX
EBX | Base Register | Base address for memory offset references | RBX
ECX | Counter | Loop counters, string operation counts | RCX
EDX | Data Register | High half of multiplication results and division remainders; I/O port operations | RDX
ESP | Stack Pointer | Always points to the top of the stack; updates on every push/pop | RSP
EBP | Base Pointer | Stable reference point for the current function's stack frame | RBP
ESI | Source Index | Source address for string/memory copy operations | RSI
EDI | Destination Index | Destination address for string/memory copy operations | RDI
R8–R15 | Extended GPRs | 64-bit only. Additional registers; used heavily in x64 calling conventions for passing function arguments | R8–R15
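The sub-register widths described above are just bit masks over the same storage. A small sketch, where the function name sub_registers and the sample value are purely illustrative:

```python
def sub_registers(rax):
    """Split a 64-bit RAX value into its narrower views."""
    return {
        "EAX": rax & 0xFFFFFFFF,    # low 32 bits
        "AX": rax & 0xFFFF,         # low 16 bits
        "AH": (rax >> 8) & 0xFF,    # bits 8-15
        "AL": rax & 0xFF,           # bits 0-7
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name} = {value:#x}")
```

For the sample value this gives EAX = 0x55667788, AX = 0x7788, AH = 0x77 and AL = 0x88. When a disassembly listing tests AL right after a call, it is looking at the low byte of the return value in EAX/RAX.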

03 Status Flag Registers

The EFLAGS register (64-bit: RFLAGS) is a 32-bit register made up mostly of individual 1-bit flags. Arithmetic and logic instructions update the relevant flags as a side effect; plain data moves such as MOV leave them alone. These flags drive conditional branches (if ZF is set, jump here; otherwise go there), so understanding them is essential for following program logic in a disassembler.

ZF — Zero Flag

Set to 1 when an instruction produces a zero result. Used constantly in comparisons: CMP subtracts its operands and sets ZF if they are equal, then JE (jump if equal) checks it.

CF — Carry Flag

Set when an unsigned result does not fit in its destination: an addition carries out of the most significant bit, or a subtraction borrows into it. This signals unsigned overflow or underflow.

SF — Sign Flag

Set when the most significant bit of a result is 1, meaning the result is negative in signed arithmetic.

TF — Trap Flag

When set, forces single-step execution: the CPU raises a debug exception after each instruction. Debuggers use it to implement single-stepping. Some malware checks for it, or sets it deliberately, to detect whether it is being debugged.
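The arithmetic flags above can be sketched for a 32-bit CMP, which computes a - b and discards the result. The helper name cmp_flags is hypothetical, and only ZF, CF and SF are modelled:

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a, b):
    """Return ZF, CF and SF as a 32-bit `cmp a, b` would set them."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": int(result == 0),      # operands were equal
        "CF": int(a < b),            # unsigned borrow
        "SF": (result >> 31) & 1,    # top bit of the result
    }


print(cmp_flags(5, 5))
print(cmp_flags(3, 5))
print(cmp_flags(5, 3))
```

Equal operands set only ZF, which is why CMP followed by JE reads as "if equal". Comparing 3 against 5 sets both CF (unsigned borrow) and SF (the result wraps to 0xFFFFFFFE).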

⚠ Malware Relevance — Anti-Debugging

The Trap Flag is a classic anti-debugging trick. Malware can deliberately set TF and then check whether the resulting single-step exception reaches its own exception handler or is swallowed by an external debugger. If a debugger is present, behaviour changes: the payload hides or the sample terminates early. TF manipulation is always worth checking for during dynamic analysis.
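When a debugger shows a raw EFLAGS value, the individual flags can be pulled out by bit position. The positions below (CF = bit 0, ZF = bit 6, SF = bit 7, TF = bit 8) are the architectural ones; the function names and sample values are just for illustration.

```python
# Bit positions of the flags discussed in this section.
FLAG_BITS = {"CF": 0, "ZF": 6, "SF": 7, "TF": 8}
TF_MASK = 1 << FLAG_BITS["TF"]   # 0x100


def decode_eflags(value):
    """Return each known flag as 0 or 1 for a raw EFLAGS value."""
    return {name: (value >> bit) & 1 for name, bit in FLAG_BITS.items()}


def trap_flag_set(value):
    """True if single-stepping is enabled in this EFLAGS value."""
    return bool(value & TF_MASK)


print(decode_eflags(0x246))
print(trap_flag_set(0x346))
```

The mask 0x100 is the thing to look for in disassembly: code that pushes the flags, ORs the saved value with 0x100 and pops it back is setting TF on purpose.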

04 Segment Registers

Segment registers are 16-bit registers originally designed to help address memory by dividing the address space into segments. Modern OSes use a flat memory model, so these are less critical than they once were, but they still appear in disassembly, and Windows actively uses two of them in ways malware exploits: FS on 32-bit Windows and GS on 64-bit Windows.

Register | Segment | What it points to
CS | Code Segment | The executable code section of the program
DS | Data Segment | The data section: global and static variables
SS | Stack Segment | The program's stack
ES, FS, GS | Extra Segments | Additional data sections. On 32-bit Windows, FS points to the Thread Information Block (TIB), a structure malware often reads to get process/thread info without calling obvious APIs. 64-bit Windows uses GS for the same purpose.
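In practice this shows up as FS- or GS-relative operands in a listing. The sketch below annotates a few well-known offsets; the annotate helper and the operand syntax it matches (`fs:[0x30]`) are assumptions for this example, since disassemblers format these operands differently.

```python
import re

# Well-known offsets: FS-relative on 32-bit Windows, GS-relative on 64-bit.
KNOWN_FIELDS = {
    ("fs", 0x00): "TIB.ExceptionList (head of the SEH chain)",
    ("fs", 0x18): "TIB.Self (address of the TEB)",
    ("fs", 0x30): "TEB.ProcessEnvironmentBlock (pointer to the PEB)",
    ("gs", 0x30): "TIB.Self (address of the TEB)",
    ("gs", 0x60): "TEB.ProcessEnvironmentBlock (pointer to the PEB)",
}

# Assumed operand format: fs:[0x30] or gs:[0x60]
OPERAND = re.compile(r"\b(fs|gs):\[0x([0-9a-f]+)\]", re.IGNORECASE)


def annotate(line):
    """Append a comment to a disassembly line that reads a known TIB/TEB field."""
    match = OPERAND.search(line)
    if not match:
        return line
    key = (match.group(1).lower(), int(match.group(2), 16))
    note = KNOWN_FIELDS.get(key)
    return f"{line}  ; {note}" if note else line


print(annotate("mov eax, fs:[0x30]"))
print(annotate("mov rax, gs:[0x60]"))
print(annotate("mov eax, [ebp+8]"))
```

A read of fs:[0x30] or gs:[0x60] fetches the PEB pointer, which is a common first step before checking for a debugger or walking the loaded module list without calling any API.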

05 Memory Layout

When Windows loads a program, it doesn't give it access to all of physical RAM. The OS presents each process with its own virtual address space — an abstracted view of memory that looks complete but is isolated from other processes. Within that space, memory is divided into four sections:

Stack: local vars · args · return addresses
Heap: dynamic allocations at runtime
Code: executable instructions (.text)
Data: initialised globals and constants

Code — machine code instructions. Has execute permissions. If malware can inject shellcode into a region and get the CPU to treat it as code, that's code execution.

Data — initialised globals and constants set at compile time. Rarely changes at runtime. Often contains interesting strings that show up in static analysis.

Heap — dynamically allocated memory (think malloc). Created and freed at runtime. Heap spray attacks and use-after-free bugs live here.

Stack — the most important section for malware analysis. Contains local variables, function arguments, and return addresses. Because return addresses control execution flow, the stack is the primary target for memory corruption attacks.
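One way to picture the four sections is as address ranges with permissions. In the sketch below every address, size and permission string is made up for illustration, and it assumes the stack and heap are mapped without execute permission; a real layout comes from the loader and differs per process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int
    size: int
    perms: str   # "r-x" = readable, not writable, executable

    def contains(self, address):
        return self.start <= address < self.start + self.size


# Hypothetical layout: these values are invented for the example.
LAYOUT = [
    Region("stack", 0x0012F000, 0x1000, "rw-"),
    Region("code", 0x00401000, 0x1000, "r-x"),
    Region("data", 0x00402000, 0x1000, "rw-"),
    Region("heap", 0x00500000, 0x10000, "rw-"),
]


def find_region(address):
    """Return the region containing address, or None if it is unmapped."""
    for region in LAYOUT:
        if region.contains(address):
            return region
    return None


def is_executable(address):
    region = find_region(address)
    return region is not None and "x" in region.perms


print(find_region(0x00500010).name)
print(is_executable(0x00401234))
print(is_executable(0x0012F800))
```

Under these assumptions, bytes written to the stack or heap are only data until something makes that memory executable or redirects execution to code that already is.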

06 The Stack

The stack is a Last In, First Out (LIFO) structure. Picture a stack of plates — you can only add to or remove from the top. The CPU tracks it with two registers at all times:

ESP / RSP (Stack Pointer) — always points to the current top of the stack. Every push decrements it; every pop increments it.

EBP / RBP (Base Pointer) — a stable reference point for the current function's stack frame. While ESP moves around, EBP stays fixed, making it easy to reference local variables and arguments by a constant offset.
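The push/pop behaviour of ESP can be modelled in a few lines. This is a toy 32-bit stack backed by a bytearray; the class name, base address and size are arbitrary, and there are no overflow or underflow checks.

```python
import struct


class Stack:
    """Toy 32-bit stack that grows toward lower addresses."""

    def __init__(self, size=64, base=0x1000):
        self.base = base               # lowest address of the backing memory
        self.mem = bytearray(size)
        self.esp = base + size         # empty stack: ESP sits just past the end

    def push(self, value):
        self.esp -= 4                  # push decrements ESP first
        offset = self.esp - self.base
        self.mem[offset:offset + 4] = struct.pack("<I", value & 0xFFFFFFFF)

    def pop(self):
        offset = self.esp - self.base
        (value,) = struct.unpack_from("<I", self.mem, offset)
        self.esp += 4                  # pop increments ESP after reading
        return value


stack = Stack()
start = stack.esp
stack.push(0xAAAA)
stack.push(0xBBBB)
print(hex(start), hex(stack.esp))      # ESP moved down by 8 bytes
print(hex(stack.pop()), hex(stack.pop()))
```

The last value pushed is the first one popped, and ESP ends where it started. Note that pop does not erase anything: the old values stay in memory below ESP until something overwrites them.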

Stack Frame Layout

Every time a function is called, a chunk of the stack is set up for it: a stack frame. Reading from high addresses (top of the diagram) down to low addresses (the top of the stack):

↑ High addresses
Argument 2
Argument 1
Return Address ⚠
Saved EBP           ← EBP
Local Variable 1
Local Variable 2
Local Variable 3    ← ESP
↓ Low addresses (top of stack)

Function Prologue and Epilogue

There's a standard sequence of instructions at the start and end of most functions, at least when the compiler keeps a frame pointer. Once you recognise these patterns in a disassembler, function boundaries become obvious.

prologue_epilogue.asm
; ── FUNCTION PROLOGUE (entry) ──────────────────
push ebp            ; save caller's base pointer on the stack
mov  ebp, esp       ; new base pointer = current stack top
sub  esp, 0x18      ; reserve space for local variables

; ... function body executes here ...

; ── FUNCTION EPILOGUE (exit) ───────────────────
mov  esp, ebp       ; restore stack pointer (tear down locals)
pop  ebp            ; restore caller's base pointer
ret                 ; pop return address → jump back to caller
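To see what these instructions do to ESP and EBP, the sequence can be stepped through in a toy model. The Cpu class, its starting register values and the addresses used below are all invented for the example; memory is simply a dict of address to 32-bit value.

```python
class Cpu:
    """Toy 32-bit register and stack model (not a real emulator)."""

    def __init__(self):
        self.mem = {}
        self.esp = 0x1000    # arbitrary starting stack top
        self.ebp = 0x2000    # arbitrary caller frame base
        self.eip = 0

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, return_address, target):
        self.push(return_address)    # call pushes the return address
        self.eip = target

    def prologue(self, locals_size):
        self.push(self.ebp)          # push ebp
        self.ebp = self.esp          # mov ebp, esp
        self.esp -= locals_size      # sub esp, locals_size

    def epilogue(self):
        self.esp = self.ebp          # mov esp, ebp
        self.ebp = self.pop()        # pop ebp
        self.eip = self.pop()        # ret


cpu = Cpu()
cpu.call(return_address=0x401005, target=0x402000)
cpu.prologue(0x18)
print(hex(cpu.ebp), hex(cpu.esp), hex(cpu.mem[cpu.ebp + 4]))
cpu.epilogue()
print(hex(cpu.ebp), hex(cpu.esp), hex(cpu.eip))
```

Inside the function, the saved return address sits at EBP+4 and the saved EBP at EBP itself. After the epilogue, ESP and EBP are back to their pre-call values and EIP holds whatever was stored in the return address slot.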
⚠ Stack Buffer Overflow

Notice that local variables sit right below the Saved EBP and Return Address. If a local buffer (say, char buf[64]) is written to without bounds checking, an attacker can write past the end of that buffer and overwrite the Return Address with any value they want. When the function hits ret, the CPU jumps to the attacker's address — their shellcode, a ROP gadget, anywhere. This is a Stack Buffer Overflow, and it's one of the most fundamental techniques in binary exploitation.
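The overwrite can be simulated safely with a byte array standing in for the frame. This is a simplified model: it assumes a 64-byte buffer sits directly below the saved EBP with no padding or other protection in between, and the function names, addresses and marker value 0xDEADBEEF are illustrative.

```python
import struct

BUF_SIZE = 64


def build_frame(return_address, saved_ebp):
    """Lay out buf[64], saved EBP and return address from low to high addresses."""
    return bytearray(BUF_SIZE) + struct.pack("<II", saved_ebp, return_address)


def unchecked_copy(frame, data):
    """Copy data into the start of buf with no bounds check."""
    frame[0:len(data)] = data


def read_return_address(frame):
    """Read the 4-byte slot that ret would pop into EIP."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]


frame = build_frame(return_address=0x00401005, saved_ebp=0x0012FF80)

unchecked_copy(frame, b"A" * 16)                  # fits inside the buffer
print(hex(read_return_address(frame)))

payload = b"A" * BUF_SIZE + b"B" * 4 + struct.pack("<I", 0xDEADBEEF)
unchecked_copy(frame, payload)                    # 8 bytes too long
print(hex(read_return_address(frame)))
```

The 16-byte write leaves the return address at 0x401005. The 72-byte write fills the buffer, clobbers the saved EBP with "BBBB", and replaces the return address with 0xDEADBEEF, which is the value ret would then load into EIP.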
