Unique Pointers In C
In our previous blog post, we explored shared pointers in C. Now, let’s dive into another crucial smart pointer concept: unique pointers. We’ll implement a robust unique pointer in C, analyze its behavior, and examine the resulting assembly code to understand how it works at the machine level.
Understanding Unique Pointers
A unique pointer is a smart pointer that owns and manages another object through a pointer and disposes of that object when the unique pointer goes out of scope. Some key characteristics of unique pointers are:
- Exclusive ownership: Only one unique pointer can own a resource at a time.
- Automatic deletion: The owned object is automatically deleted when the unique pointer is destroyed.
- Move semantics: Ownership can be transferred to another unique pointer, but it cannot be copied.
Let’s implement a unique pointer in C that embodies these characteristics.
Implementing a Unique Pointer in C
Here’s an advanced implementation of a unique pointer in C:
#include <stdlib.h>
#include <stdio.h>
#include <stdatomic.h>
#include <stdbool.h>
#define DEFINE_UNIQUE_PTR(T) \
typedef struct { \
T* ptr; \
void (*deleter)(T*); \
_Atomic bool is_owned; \
} unique_ptr_##T; \
\
unique_ptr_##T unique_ptr_create_##T(T* p, void (*d)(T*)) { \
unique_ptr_##T up; \
up.ptr = p; \
up.deleter = d ? d : (void (*)(T*))free; \
atomic_init(&up.is_owned, true); \
return up; \
} \
\
void unique_ptr_destroy_##T(unique_ptr_##T* up) { \
if (atomic_exchange(&up->is_owned, false)) { \
if (up->ptr) { \
up->deleter(up->ptr); \
up->ptr = NULL; \
} \
} \
} \
\
T* unique_ptr_get_##T(unique_ptr_##T* up) { \
return up->ptr; \
} \
\
T* unique_ptr_release_##T(unique_ptr_##T* up) { \
T* tmp = up->ptr; \
if (atomic_exchange(&up->is_owned, false)) { \
up->ptr = NULL; \
} else { \
tmp = NULL; \
} \
return tmp; \
} \
\
void unique_ptr_reset_##T(unique_ptr_##T* up, T* p) { \
if (atomic_load(&up->is_owned)) { \
T* old = atomic_exchange(&up->ptr, p); \
if (old) { \
up->deleter(old); \
} \
} \
} \
\
unique_ptr_##T unique_ptr_move_##T(unique_ptr_##T* up) { \
unique_ptr_##T new_up = *up; \
if (atomic_exchange(&up->is_owned, false)) { \
up->ptr = NULL; \
} else { \
new_up.ptr = NULL; \
atomic_store(&new_up.is_owned, false); \
} \
return new_up; \
}
DEFINE_UNIQUE_PTR(int)
void custom_int_deleter(int* ptr) {
printf("Custom deleter called for int: %d\n", *ptr);
free(ptr);
}
int main() {
int* raw_ptr = malloc(sizeof(int));
*raw_ptr = 42;
unique_ptr_int up1 = unique_ptr_create_int(raw_ptr, custom_int_deleter);
printf("up1 value: %d\n", *unique_ptr_get_int(&up1));
unique_ptr_int up2 = unique_ptr_move_int(&up1);
printf("up2 value: %d\n", *unique_ptr_get_int(&up2));
if (unique_ptr_get_int(&up1) == NULL) {
printf("up1 is now empty\n");
}
unique_ptr_destroy_int(&up1);
unique_ptr_destroy_int(&up2);
return 0;
}
Let’s break down the key components of this implementation:
-
Type-safe macro: The
DEFINE_UNIQUE_PTR
macro generates type-specific implementations, ensuring type safety at compile-time. -
Ownership tracking: We use an atomic boolean
is_owned
to track ownership, allowing for thread-safe ownership transfers. -
Custom deleters: The implementation supports custom deletion functions, providing flexibility in resource management.
-
Move semantics: The
unique_ptr_move_##T
function implements move semantics, transferring ownership from one unique pointer to another. -
Release and reset: The
unique_ptr_release_##T
andunique_ptr_reset_##T
functions provide fine-grained control over the owned resource.
Advanced Features and Safety Considerations
Thread Safety
While unique pointers are typically used in single-threaded contexts (due to their exclusive ownership model), our implementation uses atomic operations to ensure thread-safe behavior during ownership transfers. This is particularly important for the move
and release
operations.
Exception Safety
C doesn’t have built-in exception handling, but our implementation is designed to be exception-safe in the context of multi-threaded applications. The use of atomic operations ensures that the ownership state remains consistent even if a thread is interrupted.
Resource Leak Prevention
The destructor (unique_ptr_destroy_##T
) automatically calls the deleter when the unique pointer goes out of scope, preventing resource leaks. The reset
function also properly deletes the old resource before assigning a new one.
Move Semantics
The unique_ptr_move_##T
function implements move semantics, which is crucial for unique pointers. It transfers ownership to a new unique pointer while ensuring the old one becomes null, preventing double frees or use-after-free bugs.
Assembly Analysis
Let’s analyze the assembly code generated for the unique_ptr_move_int
function to understand how unique pointer mechanics work at the machine level. We’ll use GCC with optimization level -O2 on an x86_64 architecture.
unique_ptr_move_int:
push rbp
mov rbp, rsp
mov QWORD PTR [rbp-24], rdi
mov rax, QWORD PTR [rbp-24]
mov rdx, QWORD PTR [rax]
mov rax, QWORD PTR [rbp-24]
mov rax, QWORD PTR [rax+8]
mov QWORD PTR [rbp-16], rdx
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-24]
add rax, 16
mov esi, 0
mov rdi, rax
call atomic_exchange_1
test al, al
je .L2
mov rax, QWORD PTR [rbp-24]
mov QWORD PTR [rax], 0
jmp .L3
.L2:
mov QWORD PTR [rbp-16], 0
lea rax, [rbp-8]
add rax, 8
mov esi, 0
mov rdi, rax
call atomic_store_1
.L3:
mov rax, QWORD PTR [rbp-16]
mov rdx, QWORD PTR [rbp-8]
pop rbp
ret
Let’s break down the key parts of this assembly code:
- Function prologue and argument setup:
push rbp mov rbp, rsp mov QWORD PTR [rbp-24], rdi
This sets up the stack frame and stores the function argument (the unique_ptr to be moved) in a local variable.
- Copying the unique_ptr structure:
mov rax, QWORD PTR [rbp-24] mov rdx, QWORD PTR [rax] mov rax, QWORD PTR [rbp-24] mov rax, QWORD PTR [rax+8] mov QWORD PTR [rbp-16], rdx mov QWORD PTR [rbp-8], rax
This copies the
ptr
anddeleter
fields of the input unique_ptr to local variables. - Atomic exchange for ownership transfer:
mov rax, QWORD PTR [rbp-24] add rax, 16 mov esi, 0 mov rdi, rax call atomic_exchange_1
This is the core of the move operation. It atomically sets the
is_owned
flag of the source unique_ptr to false and returns its previous value. - Conditional nulling of pointers:
test al, al je .L2 mov rax, QWORD PTR [rbp-24] mov QWORD PTR [rax], 0 jmp .L3 .L2: mov QWORD PTR [rbp-16], 0 lea rax, [rbp-8] add rax, 8 mov esi, 0 mov rdi, rax call atomic_store_1
This section nulls out the appropriate pointer based on the result of the atomic exchange. If the source was owned, it nulls the source’s pointer. Otherwise, it nulls the destination’s pointer and sets its
is_owned
flag to false. - Function epilogue and return:
.L3: mov rax, QWORD PTR [rbp-16] mov rdx, QWORD PTR [rbp-8] pop rbp ret
This prepares the return value (the new unique_ptr) and restores the stack frame.
The key to the unique pointer’s behavior lies in the atomic exchange operation. This ensures that only one instance of the unique pointer can own the resource at any given time, even in a multi-threaded environment.
Performance Considerations
Unique pointers generally have very little overhead compared to raw pointers. The main performance implications come from:
-
Atomic operations: The use of atomic operations for ownership management can introduce some overhead, especially on architectures with weak memory models.
-
Function calls: The use of function pointers for custom deleters introduces an indirect function call, which can have a small performance impact compared to direct deletion.
-
Move operations: Moving unique pointers is more expensive than moving raw pointers due to the additional logic involved in transferring ownership.
However, these performance costs are generally negligible compared to the benefits of automatic resource management and the prevention of common pointer-related bugs.
Use Cases and Best Practices
Unique pointers are particularly useful in scenarios where you need exclusive ownership semantics, such as:
-
Managing non-shared resources: File handles, network connections, or other resources that shouldn’t be shared between different parts of your program.
-
Implementing data structures: Unique pointers can be used to implement trees, linked lists, and other data structures where each node should have a single owner.
-
Factories and dependency injection: Unique pointers can be used to return newly created objects from factory functions, ensuring that the ownership is properly transferred to the caller.
Best practices for using unique pointers include:
-
Use unique pointers for exclusive ownership: If a resource needs to be shared, consider using shared pointers instead.
-
Prefer unique pointers over raw pointers: This helps prevent resource leaks and makes ownership semantics explicit.
-
Use std::move() when transferring ownership: Always use move semantics when you want to transfer ownership of a unique pointer.
-
Don’t use unique pointers for arrays: Unlike C++, our C implementation doesn’t handle arrays specially. For arrays, consider implementing a separate unique_array type.
-
Be cautious with custom deleters: While powerful, custom deleters can introduce complexity. Use them judiciously.
Conclusion
Implementing unique pointers in C requires careful consideration of ownership semantics, thread safety, and resource management. While C++ provides std::unique_ptr as part of the standard library, implementing unique pointers in C gives us a deeper understanding of their mechanics and the challenges involved in creating safe, efficient memory management systems.
Above implementation demonstrates advanced C programming techniques, including atomic operations, function pointers, and macro metaprogramming. It provides many of the benefits of C++’s std::unique_ptr while maintaining C compatibility.
Remember that while this implementation is instructive and can be useful in C projects, it’s not a drop-in replacement for std::unique_ptr. For production use in C projects, it’s important to thoroughly test and possibly refine this implementation based on specific project needs. In projects where extensive use of smart pointers is required, consider using C++ if possible, as it provides these features with compiler and standard library support.