C# has two places to store data in memory: the Stack and the Heap. Local variables of Value Types (struct, enum, and built-in types like bool, int, char, ...) live on the Stack, while instances of Reference Types (class, interface, delegate, object, string, dynamic) live on the Heap.
The default Stack size (per Thread) is typically 1MB (it depends on the platform and the executable). We can specify the maxStackSize when creating a new Thread (passing 0 uses the default; in partially trusted code, maxStackSize is ignored if it is greater than the default Stack size). When we use more memory on the Stack than is available, for example through very deep recursion, we get the infamous StackOverflowException.
Why do we need to know that?
Performance! Allocating memory on the Stack is much cheaper than allocating on the Heap, and data on the Stack is usually faster to access.
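As a rough sketch of the maxStackSize parameter, the snippet below starts a thread with a larger stack and runs a deliberately deep recursion on it. The 16MB size, the recursion depth, and the names StackDemo, Depth and RunOnBigStack are assumptions chosen for illustration; how deep a given stack can go depends on the runtime and the platform.

```csharp
using System;
using System.Threading;

// Run a deep recursion on a worker thread that has a 16MB stack (hypothetical size).
int result = StackDemo.RunOnBigStack(50_000);
Console.WriteLine($"Reached recursion depth: {result}");

static class StackDemo
{
    // Not a tail call, so every level keeps its own stack frame alive.
    public static int Depth(int n) => n == 0 ? 0 : 1 + Depth(n - 1);

    public static int RunOnBigStack(int depth)
    {
        int reached = 0;
        // Thread(ThreadStart, int maxStackSize): the size is given in bytes.
        var worker = new Thread(() => reached = Depth(depth), maxStackSize: 16 * 1024 * 1024);
        worker.Start();
        worker.Join();
        return reached;
    }
}
```

A StackOverflowException cannot be caught in modern .NET and terminates the process, so sizing the stack (or removing the recursion) is the only real protection.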
- Data stored on the Stack is removed from memory the moment the method (function) finishes (stack frame pop)
- Data stored on the Heap is managed by the garbage collector (GC)
Garbage Collector (GC)
The GC automatically releases memory on the Heap based on an algorithm (basically it traces which objects are still reachable from roots such as local variables and static fields, and reclaims everything that is no longer reachable).
Even though the GC engine is highly optimized (also see generations), it can pause (freeze) our application's threads while it performs a collection.
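To make the idea of generations a bit more tangible, here is a minimal sketch using the GC class. It assumes the default GC settings, and the forced GC.Collect() call is there for demonstration only. Objects that survive a collection are usually promoted to an older generation, but the exact numbers printed depend on the runtime and its configuration.

```csharp
using System;

var data = new byte[1024];   // small object, allocated on the managed Heap
Console.WriteLine($"Generation after allocation: {GC.GetGeneration(data)}");

GC.Collect();                // force a blocking collection (demo only, avoid in production code)
Console.WriteLine($"Generation after one collection: {GC.GetGeneration(data)}");

Console.WriteLine($"Gen 0 collections so far: {GC.CollectionCount(0)}");
GC.KeepAlive(data);          // keeps 'data' reachable until this point
```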
C# pointer vs C pointer
A pointer stores a memory address. For a reference type, the variable on the Stack holds the address of the actual object on the Heap.
C# references are managed, which means the Garbage Collector (GC) knows about them and can update the memory address they hold. A raw pointer in C, or in a C# unsafe block, is just an address that nobody updates. This is important, because the GC will perform defragmentation.
Fragmentation occurs when memory is allocated and deallocated in a way that leaves small, unusable gaps (holes) between used memory blocks, hindering efficient memory usage and allocation of larger objects.
The GC will perform defragmentation of the Heap to remove those gaps, which means those memory addresses can change. The GC will automatically update those memory addresses with the new ones.
This is why we need to pin an object (with the fixed statement) before taking a raw pointer to it. Pinning prevents the GC from moving the object while the pointer is in use.
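// Requires <AllowUnsafeBlocks>true</AllowUnsafeBlocks> in the project file.
unsafe
{
    byte[] bytes = [1, 2, 3];

    // The array stays pinned only inside the fixed block.
    fixed (byte* pointerToFirstByte = bytes)
    {
        Console.WriteLine($"The address of the first array element: {(long)pointerToFirstByte:X}.");
        Console.WriteLine($"The value of the first array element: {*pointerToFirstByte}.");
    }
}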
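Pinning is also possible without an unsafe block. The following sketch uses GCHandle from System.Runtime.InteropServices as one alternative to the fixed statement. It is meant as an illustration of the idea, not as a recommendation over fixed, and the printed address will differ on every run.

```csharp
using System;
using System.Runtime.InteropServices;

byte[] bytes = { 1, 2, 3 };

// Pin the array so the GC cannot move it while the handle is allocated.
GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
try
{
    IntPtr address = handle.AddrOfPinnedObject();
    Console.WriteLine($"Pinned address: 0x{address.ToInt64():X}");
    Console.WriteLine($"First element read through the address: {Marshal.ReadByte(address)}");
}
finally
{
    handle.Free(); // always release the pin, otherwise the array stays pinned
}
```

Long-lived pins get in the way of defragmentation, so a pin should be held as briefly as possible.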
Boxing and Unboxing
Boxing is the process of converting a value type to a reference type: the value is copied into a new object on the Heap. Unboxing is the reverse: the value is copied out of that object again.
int i = 10;
object obj = i; // boxing
int i2 = (int)obj; // unboxing
Boxing and Unboxing should be avoided for performance reasons. A way to avoid it is the use of generics.
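A small sketch of how generics avoid boxing, comparing the non-generic ArrayList with List<int>. The variable names are made up for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Non-generic collection: elements are stored as object, so every int gets boxed.
var boxed = new ArrayList();
boxed.Add(42);                  // boxing
int fromBoxed = (int)boxed[0];  // unboxing

// Generic collection: the ints are stored directly, no boxing involved.
var generic = new List<int>();
generic.Add(42);
int fromGeneric = generic[0];

Console.WriteLine(fromBoxed + fromGeneric); // 84
```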
Difference between Struct and Class?
- a struct stores the value itself on the Stack*
- a class stores the memory address (reference) on the Stack that points to the value on the Heap
The size of a C# pointer (memory address) depends on whether the process runs as 32-bit or 64-bit.
- 32bit (x86) - 4 bytes
- 64bit (x64) - 8 bytes
*when defining a struct field on a class, that field will be stored on the heap as part of the class instance.
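The difference shows up most clearly on assignment. In this sketch, PointStruct and PointClass are hypothetical types introduced only for the comparison.

```csharp
using System;

var p1 = new PointStruct { X = 1 };
var p2 = p1;   // copies the whole value
p2.X = 99;     // changes only the copy

var c1 = new PointClass { X = 1 };
var c2 = c1;   // copies only the reference
c2.X = 99;     // both variables point to the same object

Console.WriteLine($"struct: p1.X = {p1.X}, p2.X = {p2.X}"); // struct: p1.X = 1, p2.X = 99
Console.WriteLine($"class:  c1.X = {c1.X}, c2.X = {c2.X}"); // class:  c1.X = 99, c2.X = 99
Console.WriteLine($"address size: {IntPtr.Size} bytes");    // 4 in a 32-bit process, 8 in a 64-bit process

struct PointStruct { public int X; }
class PointClass { public int X; }
```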
Struct size
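It is important to understand how to calculate the size of a struct to determine whether to use a struct or a class.
The size of a struct is fixed and can be determined with the sizeof operator (which requires an unsafe context for user-defined structs) or with Marshal.SizeOf (which works outside an unsafe context and reports the size the struct has when marshaled to unmanaged memory).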
using System.Runtime.InteropServices;

Console.WriteLine(Marshal.SizeOf<Person>());

struct Person
{
    public string Name; // 8 bytes on x64
    public int Age;     // 4 bytes (int is an alias for System.Int32)
}
So the struct Person should take 12 bytes in total. But wait, why are we getting 16 bytes!? It is called Padding!
Padding is the process of adding extra bytes to a data structure to ensure that it is aligned on a memory boundary. When data is not aligned on a memory boundary, it can cause the CPU to perform additional operations to access the data, which can slow down performance. We can control padding in C# by setting the Pack field of the StructLayout attribute.
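The effect of the Pack field can be sketched with two otherwise identical structs. DefaultLayout and PackedLayout are hypothetical names, and the fields are limited to byte and int so the result does not depend on the architecture.

```csharp
using System;
using System.Runtime.InteropServices;

Console.WriteLine(Marshal.SizeOf<DefaultLayout>()); // 8: 1 byte + 3 padding bytes + 4 bytes
Console.WriteLine(Marshal.SizeOf<PackedLayout>());  // 5: 1 byte + 4 bytes, no padding

// Default packing: the int is aligned on a 4-byte boundary.
struct DefaultLayout
{
    public byte Flag;
    public int Value;
}

// Pack = 1: fields are laid out back to back without padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct PackedLayout
{
    public byte Flag;
    public int Value;
}
```

Removing padding saves memory but leaves Value unaligned, so it trades size against the access cost described above.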
When to use a struct?
The general rule is to not use a struct larger than 16 bytes. Otherwise we risk slowing down performance, because the whole value is copied every time:
- the struct gets copied when assigned to another variable
- the struct gets copied when passed as an argument to a method
- the struct gets copied when returned from a method
A class only copies the memory address (reference), which is always 8 bytes on a 64-bit system.
Struct value by reference
To avoid copying a struct we could take advantage of the ref keyword.
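A minimal sketch of passing a struct by reference. BigStruct and Mutator are hypothetical names; the in modifier is included as the read-only variant of ref.

```csharp
using System;

var big = new BigStruct { A = 1, B = 2, C = 3, D = 4 };

Mutator.DoubleByValue(big);             // works on a copy, 'big' is unchanged
Console.WriteLine(big.A);               // 1

Mutator.DoubleByRef(ref big);           // works on the original, no copy is made
Console.WriteLine(big.A);               // 2

Console.WriteLine(Mutator.Sum(in big)); // 11, passed as a read-only reference

struct BigStruct { public long A, B, C, D; } // 32 bytes, above the 16-byte guideline

static class Mutator
{
    public static void DoubleByValue(BigStruct s) => s.A *= 2;
    public static void DoubleByRef(ref BigStruct s) => s.A *= 2;
    public static long Sum(in BigStruct s) => s.A + s.B + s.C + s.D;
}
```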
Microsoft itself is pushing the usage of struct in their .NET libraries to improve performance on hot paths (perf-critical code). They heavily rely on System.Span<T>.
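As a short illustration of Span<T>, which is itself a struct, the sketch below wraps a stack-allocated buffer and slices it without copying. The buffer size and values are arbitrary.

```csharp
using System;

// stackalloc places the buffer on the Stack; Span<T> gives safe, bounds-checked access to it.
Span<int> numbers = stackalloc int[4] { 1, 2, 3, 4 };

// Slicing creates a view over the same memory, nothing is copied or allocated on the Heap.
Span<int> lastTwo = numbers.Slice(2);
lastTwo[0] = 30;

Console.WriteLine(numbers[2]);     // 30, because the slice aliases the original buffer
Console.WriteLine(lastTwo.Length); // 2
```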