Rust ownership, references and pointers - part 1

Most developers rarely think about references and pointers anymore. That used to be a daily concern in the early days of programming, but pointers never went away, and thanks to Rust they are now pretty well managed.

The memory #

The memory our programs use (RAM) is logically divided into several sections; two of them are the heap and the stack. But why do we have these two spaces? They have a few key differences:

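The stack is faster than the heap to allocate and access because it works like a LIFO pile: elements are pushed and popped following the execution context (function calls and returns), so the elements in the stack are easy to find.
The elements in the stack have a fixed size that is known at compile time.
The stack is managed automatically behind the scenes: the compiler generates the code that pushes and pops each function’s frame, inside a region the operating system reserves for the thread.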

If we want to see a visual representation of the memory we can imagine something like this:

+-------------------+  (High memory address)
|       Stack       |
| (grows downwards) |
+-------------------+
|                   | 
|    Free space     |
|   (Unused RAM)    |
|                   |
+-------------------+
|        Heap       |
|  (grows upwards)  |
+-------------------+  
|    Data segment   |
|  (global/static)  |
+-------------------+
|    Code Segment   |
|   (program text)  |
+-------------------+  (Low memory address)

In this classic, simplified layout the stack usually sits at higher addresses and grows downwards, while the heap usually sits at lower addresses and grows upwards.

Stack and heap are spaces in the same physical RAM, but they behave differently and are used for different purposes. The stack is used for function parameters, local variables and return addresses (to know where to go after a function returns). The heap is used for dynamic allocation, that is, for objects whose size the compiler cannot know in advance (what malloc or new give you in other languages).
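
As a rough illustration of that split, here is a minimal sketch. The function names stack_sizes and heap_numbers are made up for this example. It shows a fixed-size value living on the stack, a Box whose pointer lives on the stack while its content lives on the heap, and a vector whose length is only known at run time.

```rust
use std::mem::size_of;

/// Hypothetical helper: sizes (in bytes) that an i32 and a Box<i32>
/// occupy in a stack frame.
fn stack_sizes() -> (usize, usize) {
    (size_of::<i32>(), size_of::<Box<i32>>())
}

/// Hypothetical helper: builds a vector whose length is only known at
/// run time, so its elements have to live on the heap.
fn heap_numbers(count: usize) -> Vec<u32> {
    (0..count as u32).collect()
}

fn main() {
    // Fixed size, known at compile time: stored in main's stack frame.
    let on_stack: i32 = 42;
    // The 42 is stored on the heap; the Box itself (a pointer) is on the stack.
    let on_heap: Box<i32> = Box::new(42);
    println!("{} {}", on_stack, on_heap);

    let (int_size, box_size) = stack_sizes();
    println!("i32: {} bytes, Box<i32>: {} bytes", int_size, box_size);

    let numbers = heap_numbers(5);
    println!("{:?}", numbers);
}
```

Note that the Box takes one pointer's worth of space on the stack no matter how large the value it points to is.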

So, what is ownership in Rust? #

Let’s start with the two classic approaches to releasing memory:

Garbage collector: a runtime component (the GC) is responsible for releasing the objects once they are no longer reachable.
Manual management: you decide when an object is not needed anymore and release it from memory yourself (free or delete).

In the first approach, the runtime (GC) is responsible for releasing the objects, while in the second the developer is responsible for releasing them (and for not leaving dangling pointers behind).

Rust comes with its own approach to this problem: the “ownership system”, a set of rules checked at compile time that helps developers deal with memory safely.

Every piece of data has a single “owner” (usually a variable), and this owner is responsible for managing that data’s memory. When the owner goes out of scope, Rust automatically frees the memory associated with the data, which rules out use-after-free and double-free bugs and makes memory leaks unlikely in ordinary code.
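
To make “freed when the owner goes out of scope” visible, here is a small sketch. It assumes a made-up Tracked type that appends a line to a shared log when it is dropped; neither Tracked nor run_scopes is a standard API.

```rust
use std::cell::RefCell;

/// Hypothetical type that records the moment it is dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Tracked<'_> {
    // Rust calls this automatically when the owner goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Runs two nested scopes and returns the events in the order they happened.
fn run_scopes() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
            log.borrow_mut().push("end of inner scope".to_string());
        } // _inner goes out of scope here, so it is dropped
        log.borrow_mut().push("end of outer scope".to_string());
    } // _outer goes out of scope here, so it is dropped
    log.into_inner()
}

fn main() {
    for line in run_scopes() {
        println!("{}", line);
    }
}
```

Each value is released exactly at the closing brace of the scope that owns it, with no explicit free call anywhere.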

Let’s take a look at the following piece of code:

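fn print_vec() {
    let mut some_vec = vec![1, 1, 1]; // heap buffer allocated here
    for i in 3..10 {
        // copy the last element and append it, until the vector holds 10 items
        let next = some_vec[i - 1];
        some_vec.push(next);
    }
    println!("P(1..10) = {:?}", some_vec);
} // some_vec goes out of scope: the heap buffer is released here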

We can imagine the previous code in memory as follows:

+----------------------------------------------+
|                Stack Frame                   |
+-------------------+------------+-------------+
|   buffer pointer  |  length    |  capacity   | 
+-------------------+------------+-------------+
|  Points to Heap   |     10     |    >= 10    |
+-------------------+------------+-------------+
          |
          v
+-----------------------------------------------------+
|                        Heap                         |
+-----------------------------------------------------+
|  1 |  1 |  1 |  1 |  1 |  1 |  1 |  1 |  1 |  1 |   |
+-----------------------------------------------------+

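This means that the variable some_vec holds a pointer to the data in the heap, and that the length and the capacity (both counted in elements, with the capacity always at least as large as the length) are stored alongside it in the stack. From this we can say that some_vec owns the data that it points to.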
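
The three fields from the diagram can be inspected directly. The sketch below reuses the loop from print_vec inside a helper called build_vec, a name made up for this example. The exact capacity depends on how the standard library grows the buffer, so the code only prints it and makes no assumption about its value.

```rust
use std::mem::size_of;

/// Hypothetical helper: same loop as print_vec, but returns the vector.
fn build_vec() -> Vec<i32> {
    let mut some_vec = vec![1, 1, 1];
    for i in 3..10 {
        let next = some_vec[i - 1];
        some_vec.push(next);
    }
    some_vec
}

fn main() {
    let some_vec = build_vec();
    // The handle stored in the stack frame: pointer + length + capacity.
    println!("handle size on the stack: {} bytes", size_of::<Vec<i32>>());
    // Address of the first element in the heap buffer.
    println!("heap buffer starts at:    {:p}", some_vec.as_ptr());
    println!("length:                   {}", some_vec.len());
    // Number of elements the buffer can hold before it has to grow.
    println!("capacity:                 {}", some_vec.capacity());
}
```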

So then, what are references? #

Well, the name is quite revealing: a reference refers to something that lives somewhere else, like the table of contents of a book:

                           TABLE OF CONTENTS

    PREFACE ........................................................

1.  INTRODUCTION ..................................................... 

    1.1 Motivation .................................................... 100
    1.2 Scope ......................................................... 120
    1.3 About This Document ........................................... 130
    1.4 Interfaces .................................................... 140
    1.5 Operation...................................................... 150
    ...

In the previous table of contents, we know “Motivation” points to page 100 while “Operation” points to page 150. In Rust, these pointers are called “references”, and they are “memory-safe” pointers because they always point to memory that some owner still holds: the compiler rejects any reference that could outlive its owner. Remember that every piece of data has a single “owner” (usually a variable), and this owner is responsible for managing that data’s memory (like we saw with some_vec).

A reference is created by placing an ampersand (&) before the name of the variable we want to refer to. This may sound pretty weird, so let’s look at an example.

let string = String::from("Hello");
let reference = &string;  // reference borrows the memory owned by string
println!("reference = {:?}", reference); // works and prints the value
println!("string = {:?}", string); // works and prints the value

Let’s dig into this piece of code.

When reference is assigned &string, it is saying “point to the value that string is holding”. This is a way to refer to the value without taking ownership of it. If, instead of referencing, we assign string directly to reference, the value is moved: its owner changes from string to reference, like this:

let string = String::from("Hello");
let reference = string;  // now reference is the owner of the value that string used to own

In the previous code, if we try to print string, the compiler will complain because string is no longer available: the value is owned by reference now.

let string = String::from("Hello");
let reference = string;  // now reference is the owner of the value that string used to own
println!("reference = {:?}", reference); // works and prints the value
// println!("string = {:?}", string); // doesn't compile: borrow of moved value (error E0382)

What we can say about these examples:

When using &string, the variable reference is “borrowing the value” of string.
While reference holds the value borrowed from string, printing both variables works because both lead to the same value.
If we assign string to reference (without using the &), the value changes owner and string is no longer available: the value has been moved out of it, so the compiler refuses any further use.
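
The same two behaviours show up when passing values to functions. In this minimal sketch, length_of and consume are names made up for the example. The first only borrows its argument; the second takes ownership of it. Idiomatic code would usually take &str instead of &String, but &String is kept here to match the examples above.

```rust
/// Borrows the string: the caller keeps ownership.
fn length_of(text: &String) -> usize {
    text.len()
}

/// Takes ownership: the caller can no longer use the value it passed in.
fn consume(text: String) -> usize {
    text.len()
} // text is dropped here, and its heap buffer is released

fn main() {
    let string = String::from("Hello");

    let borrowed_len = length_of(&string); // string is only borrowed
    println!("{} has {} bytes", string, borrowed_len); // still usable

    let moved_len = consume(string); // string is moved into the function
    println!("consumed {} bytes", moved_len);
    // println!("{}", string); // would not compile: value was moved
}
```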

Conclusions #

This is the tip of the iceberg. I am going to dig more into pointers and references, but for now we can say that pointers in Rust usually mean references, and that a reference “borrows” the value from its owner. For a String or a Vec, this works by creating a pointer to the owner’s handle in the stack, which in turn points to the data in the heap.

Also, every time we reference a value we say we are “borrowing” that value. When the variable holding the borrow (reference, for example) goes out of scope, the original value remains safe with its owner, because what is being dropped is just the reference.
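
Both points can be checked with a short sketch. The function names borrow_then_release and addresses_differ are made up for this example. The first shows that the owner is still usable after the reference is gone. The second shows that the reference holds the address of the owner’s handle, which is different from the address of the heap buffer.

```rust
/// The reference is dropped at the end of the inner scope;
/// the owner is untouched and can still be returned.
fn borrow_then_release() -> String {
    let owner = String::from("Hello");
    {
        let reference = &owner;
        println!("borrowed: {}", reference);
    } // only the reference goes away here
    owner
}

/// Compares the address stored in the reference (the String handle)
/// with the address of the heap buffer that the handle points to.
fn addresses_differ() -> bool {
    let owner = String::from("Hello");
    let reference: &String = &owner;
    let handle_addr = reference as *const String as usize;
    let buffer_addr = owner.as_ptr() as usize;
    handle_addr != buffer_addr
}

fn main() {
    println!("owner after the borrow: {}", borrow_then_release());
    println!("reference points to the handle, not the buffer: {}", addresses_differ());
}
```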