Hacking The Address Space: Can I Write To Null?
Introduction
Memory management and address space layout are fundamental concepts in operating systems and low-level programming. This comprehensive guide explores how modern operating systems handle memory addressing, with a particular focus on null pointer behavior and the lowest regions of virtual memory. We’ll dive deep into memory mapping, segmentation faults, and practical experiments with memory allocation.
Memory Layout Fundamentals
Virtual Memory Address Space
Modern operating systems employ virtual memory systems that provide each process with its own isolated address space. This virtual address space typically spans from address 0 (null) to the maximum address supported by the system architecture (e.g., 2^64 - 1 for 64-bit systems).
However, a fascinating aspect of modern systems is that they deliberately avoid placing program segments at low memory addresses, particularly near address 0. Instead, most program segments (code, data, heap) start at relatively high addresses, often around 0x400000 or higher. This design choice serves multiple purposes:
- Null Pointer Detection: By keeping the lowest memory addresses unmapped, the system can easily detect and trap null pointer dereferences
- Security: Creates a guard zone that helps prevent certain types of memory corruption exploits
- Address Space Layout Randomization (ASLR): Provides more flexibility for randomizing segment locations
Memory Mapping and Protection
Operating systems use page tables to map virtual addresses to physical memory. Each page (typically 4KB) can have different protection flags:
- Read (R)
- Write (W)
- Execute (X)
- No access (None)
The kernel maintains strict control over these mappings and enforces protection at the hardware level through the Memory Management Unit (MMU).
Practical Implementation: Memory Mapping Experiments
Let’s explore how we can interact with memory mapping through practical code examples.
Basic Memory Mapping Example
First, let’s create a program that attempts to map and access memory at different addresses:
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <stdint.h>
#include <errno.h>
#include <string.h>
#define PAGE_SIZE 4096
void print_memory_info(void* addr) {
printf("Mapped address: %p\n", addr);
if (addr == MAP_FAILED) {
printf("Mapping failed: %s\n", strerror(errno));
return;
}
printf("Page alignment: 0x%lx\n", (uintptr_t)addr & (PAGE_SIZE - 1));
}
int main() {
// Attempt 1: Let the system choose the address
void* addr1 = mmap(NULL,
PAGE_SIZE,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
printf("\nAttempt 1 - System-chosen address:\n");
print_memory_info(addr1);
// Attempt 2: Request a specific low address
void* addr2 = mmap((void*)0x1000,
PAGE_SIZE,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
-1,
0);
printf("\nAttempt 2 - Requested low address (0x1000):\n");
print_memory_info(addr2);
// Clean up
if (addr1 != MAP_FAILED) munmap(addr1, PAGE_SIZE);
if (addr2 != MAP_FAILED) munmap(addr2, PAGE_SIZE);
return 0;
}
To compile and run this code:
gcc -o memory_map memory_map.c -Wall
./memory_map
Expected output will vary by system, but typically looks like:
Attempt 1 - System-chosen address:
Mapped address: 0x7f9b23400000
Page alignment: 0x0
Attempt 2 - Requested low address (0x1000):
Mapping failed: Operation not permitted
Advanced Memory Access Patterns
Let’s create a more sophisticated example that demonstrates memory protection and access patterns:
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <stdint.h>
#include <errno.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
#define PAGE_SIZE 4096
static void handler(int sig, siginfo_t *si, void *unused) {
printf("Caught signal %d (SIGSEGV/SIGBUS) accessing address: %p\n",
sig, si->si_addr);
exit(EXIT_FAILURE);
}
void setup_signal_handler() {
struct sigaction sa;
sa.sa_flags = SA_SIGINFO;
sigemptyset(&sa.sa_mask);
sa.sa_sigaction = handler;
if (sigaction(SIGSEGV, &sa, NULL) == -1) {
perror("sigaction");
exit(EXIT_FAILURE);
}
if (sigaction(SIGBUS, &sa, NULL) == -1) {
perror("sigaction");
exit(EXIT_FAILURE);
}
}
int main() {
setup_signal_handler();
// Map a page with read/write permissions
void* addr = mmap(NULL,
PAGE_SIZE,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
if (addr == MAP_FAILED) {
perror("mmap");
exit(EXIT_FAILURE);
}
printf("Successfully mapped page at: %p\n", addr);
// Write to the page
*(int*)addr = 42;
printf("Successfully wrote to page\n");
// Change permissions to read-only
if (mprotect(addr, PAGE_SIZE, PROT_READ) == -1) {
perror("mprotect");
exit(EXIT_FAILURE);
}
printf("Changed page permissions to read-only\n");
printf("Value at address: %d\n", *(int*)addr);
// Attempt to write (should trigger SIGSEGV)
printf("Attempting to write to read-only memory...\n");
*(int*)addr = 43;
// Should never reach here
munmap(addr, PAGE_SIZE);
return 0;
}
To compile and run:
gcc -o memory_protection memory_protection.c -Wall
./memory_protection
Expected output:
Successfully mapped page at: 0x7f9b23400000
Successfully wrote to page
Changed page permissions to read-only
Value at address: 42
Attempting to write to read-only memory...
Caught signal 11 (SIGSEGV/SIGBUS) accessing address: 0x7f9b23400000
Memory Layout Analysis
Using objdump
To analyze the memory layout of compiled programs, we can use tools like objdump
. Here’s a program that helps visualize its own memory layout:
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
const char global_str[] = "Global string";
static int static_var = 42;
int global_var;
void print_address(const char* name, const void* addr) {
printf("%20s: %p\n", name, addr);
}
int main() {
static int local_static = 100;
int local_var = 200;
int* heap_var = malloc(sizeof(int));
if (heap_var == NULL) {
fprintf(stderr, "Failed to allocate memory\n");
return 1;
}
*heap_var = 300;
printf("Memory Layout Analysis\n");
printf("=====================\n");
print_address("main function", (void*)main);
print_address("global string", global_str);
print_address("static variable", &static_var);
print_address("global variable", &global_var);
print_address("local static", &local_static);
print_address("local variable", &local_var);
print_address("heap variable", heap_var);
free(heap_var);
return 0;
}
To compile and analyze:
gcc -o memory_layout memory_layout.c -Wall
objdump -t memory_layout | sort
The output will show addresses that are likely to be different on each run (due to ASLR) but should demonstrate a clear pattern where:
- Text segment (main function) is in one region
- Global/static data is in another region
- Stack variables (local_var) are in a different region
- Heap variables are in yet another region
Output will be something like below:
0000000000000000 F *UND* 0000000000000000 __libc_start_main@GLIBC_2.2.5
0000000000000000 F *UND* 0000000000000000 free@GLIBC_2.2.5
0000000000000000 F *UND* 0000000000000000 fwrite@GLIBC_2.2.5
0000000000000000 F *UND* 0000000000000000 malloc@GLIBC_2.2.5
0000000000000000 F *UND* 0000000000000000 printf@GLIBC_2.2.5
0000000000000000 F *UND* 0000000000000000 puts@GLIBC_2.2.5
0000000000000000 w *UND* 0000000000000000 __gmon_start__
0000000000000000 l df *ABS* 0000000000000000
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000000000 l df *ABS* 0000000000000000 elf-init.c
0000000000000000 l df *ABS* 0000000000000000 init.c
0000000000000000 l df *ABS* 0000000000000000 memory_layout.c
0000000000400538 g F .init 0000000000000000 _init
00000000004005e0 g F .text 000000000000002a _start
0000000000400610 l F .text 0000000000000000 deregister_tm_clones
0000000000400640 l F .text 0000000000000000 register_tm_clones
0000000000400680 l F .text 0000000000000000 __do_global_dtors_aux
00000000004006a0 l F .text 0000000000000000 frame_dummy
00000000004006cd g F .text 000000000000002c print_address
00000000004006f9 g F .text 00000000000000ea main
00000000004007f0 g F .text 0000000000000065 __libc_csu_init
0000000000400860 g F .text 0000000000000002 __libc_csu_fini
0000000000400864 g F .fini 0000000000000000 _fini
0000000000400870 g O .rodata 0000000000000004 _IO_stdin_used
0000000000400874 g O .rodata 000000000000000e global_str
000000000040093c l .eh_frame_hdr 0000000000000000 __GNU_EH_FRAME_HDR
0000000000400aa4 l O .eh_frame 0000000000000000 __FRAME_END__
0000000000401dd8 l .init_array 0000000000000000 __init_array_start
0000000000401dd8 l O .init_array 0000000000000000 __frame_dummy_init_array_entry
0000000000401de0 l .init_array 0000000000000000 __init_array_end
0000000000401de0 l O .fini_array 0000000000000000 __do_global_dtors_aux_fini_array_entry
0000000000401de8 l O .jcr 0000000000000000 __JCR_END__
0000000000401de8 l O .jcr 0000000000000000 __JCR_LIST__
0000000000401df0 l O .dynamic 0000000000000000 _DYNAMIC
0000000000401fe8 l O .got.plt 0000000000000000 _GLOBAL_OFFSET_TABLE_
0000000000402030 w .data 0000000000000000 data_start
0000000000402030 g .data 0000000000000000 __data_start
0000000000402038 g O .data 0000000000000000 .hidden __dso_handle
0000000000402040 l O .data 0000000000000004 static_var
0000000000402044 l O .data 0000000000000004 local_static.2724
0000000000402048 g .bss 0000000000000000 __bss_start
0000000000402048 g .data 0000000000000000 _edata
0000000000402048 g O .bss 0000000000000008 stderr@GLIBC_2.2.5
0000000000402048 g O .data 0000000000000000 .hidden __TMC_END__
0000000000402050 l O .bss 0000000000000001 completed.6354
0000000000402054 g O .bss 0000000000000004 global_var
0000000000402058 g .bss 0000000000000000 _end
SYMBOL TABLE:
memory_layout: file format elf64-x86-64
Memory Safety and Null Pointer Protection
Modern operating systems implement several mechanisms to prevent null pointer dereferences and protect low memory addresses:
- Page Protection: The first page of virtual memory (containing address 0) is typically unmapped
- MMU Traps: Hardware-level protection that generates exceptions on invalid memory access
- Compiler Optimizations: Modern compilers can detect and warn about potential null pointer dereferences
Practical Implications and Best Practices
Understanding memory layout and protection mechanisms is crucial for:
- Security: Proper memory management helps prevent buffer overflows and other memory-related vulnerabilities
- Performance: Efficient memory layout can improve cache utilization and overall program performance
- Debugging: Understanding memory layout helps in debugging segmentation faults and memory corruption issues
Recommended Practices:
- Always check return values from memory allocation functions
- Use tools like Valgrind to detect memory leaks and invalid accesses
- Implement proper error handling for memory operations
- Consider using memory protection mechanisms for sensitive data
- Maintain proper alignment for better performance
Conclusion
Memory management and protection mechanisms form the foundation of modern operating system security and stability. Understanding these concepts is essential for system-level programming and debugging. We’ve explored how systems protect against null pointer dereferences and manage memory access permissions through practical experimentation and analysis.
The examples provided demonstrate both the flexibility and restrictions of memory mapping operations, highlighting why certain memory regions are protected and how different protection mechanisms work together to ensure system stability and security.