Understanding C pointer shenanigans

This is part one of a two-part series about pointers in C.
Find part 2 here: Understanding pointers with structs in c

Pointers are, well, pointers: they literally point at things. If you stop me on the road and ask whether I own a cafe, I'll say no. But I know a good one, and I can point you to it. That doesn't mean I have the cafe; I just know its address. And that's how pointers work: they know addresses. Now, obviously, we are talking about a programming language, so we need syntactical shenanigans. In C, we generally deal with the following symbols while working with pointers:

  1. * (Asterisk)

  2. & (Ampersand)

  3. -> (Arrow like symbol)

  4. . (period)

Let's understand all of them through examples

Example 1

No pointers, plain function calls:

#include <stdio.h>

int add(int a, int b){
    return a + b;
}


int main() {
    int a = 1;
    int b = 2;

    int ret = add(a, b);
    if (ret != 3) {
        printf("Failed expected: 3, got: %d", ret);
    } else {
        printf("Passed expected: 3, got: %d", ret);
    }

    return 0;
}

Output: Works as expected, yayy.

Passed expected: 3, got: 3

Example 2

The function takes in pointers. Here we introduce our first symbol, the asterisk. Note the new declaration of the add function:

int add(int *a, int *b){
    return a + b;
}

Earlier, in example 1, we were sending the values of a and b directly to add, which means that inside add we have copies of a and b. In our cafe analogy, that's like literally handing the cafe to our new friend. Sounds weird, but it's what we are doing. Now, with the changed function signature and its two new *, we are only sharing the addresses of a and b. That's what the asterisk does in a declaration: it makes the variable a pointer, a type that holds an address of some place in memory.

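But if you try to run the program with this new signature, it will fail, because:

  1. We are passing integers while calling the add function, which accepts pointers now. error: invalid conversion from 'int' to 'int*' [-fpermissive]

  2. We are trying to add two pointers, which, well, is not allowed. error: invalid operands of types 'int*' and 'int*' to binary 'operator+' return a + b;

These messages are in the wording g++ prints. A C compiler such as gcc reports the same two problems with slightly different text, and some older C compilers only warn about the first one.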

Let's fix them one by one. First, we can't pass integers where pointers are expected, so let's change our calling line. Our add function accepts pointers, so you might think: well, it needs pointers, let's give it pointers. From:

int ret = add(a, b);

to:

int ret = add(*a, *b);

But, this will fail again. And we'll get the following error: error: invalid type argument of unary '*' (have 'int')

This is because we have used * in a different context. When defining the function, it was part of a declaration: it declared a as a pointer to an integer. Here, in the call, it acts as an operator, like add or subtract. To be exact, this is called dereferencing a pointer (a unary operator), which means telling the pointer: go to this address and get what's there. But in our main function a is an integer, not a pointer, so you effectively said "Go to address number 1 and get what's there." But 1 isn't an address, it's just a number. The compiler doesn't know what to do with this instruction. Hence the invalid type error.

So, how do we solve this? This is where we introduce our next symbol, the ampersand, &. It's again an operator, and it says: give me the address of this thing, or, tell me where this variable lives in memory. Let's use this new knowledge. First we get the addresses of a and b. Note how we are still using the pointer notation to declare the variables holding these addresses:

int *aAddress = &a;
int *bAddress = &b;

If you try to print the values of these variables, for example:

printf("Address of a: %p\n", (void *)aAddress);
printf("Address of b: %p\n", (void *)bAddress);

You will get an output like:

Address of a: 0x7ffd8c673110
Address of b: 0x7ffd8c673114

These are the actual memory addresses where a and b live, and they are what the pointers hold.

Interesting point to note: every time you run the program, you'll likely get different addresses, because these memory addresses are assigned at runtime. But if you run the same program on, let's say, an old system or a microcontroller, you should get the same addresses every time. This is because modern systems use Address Space Layout Randomization (ASLR), which, in simple terms, randomly changes where programs and data are placed in memory each time you run a program. Embedded systems and older systems typically use fixed memory layouts, where the same program loads into the same memory addresses each time it runs.

Ok, let's use this new operator to solve our issue. We'll change our program from:

int ret = add(a, b);

past our failed attempt:

int ret = add(*a, *b);

to:

int ret = add(aAddress, bAddress);
// or 
int ret = add(&a, &b);

That solves our first bug, but our compiler still throws the error that you can't add pointers. Let's circle back to dereferencing. Just as we use the add and subtract operators with other data types, we use * as the dereference operator. It says: go to this address and get what's there. So if we use * with aAddress, as in *aAddress, we get back the value of a. Since we can't add pointers inside the add function, we have to get the values back from the pointers; in other words, we need to dereference them.

Let's change our function from

int add(int *a, int *b){
    return a + b;
}

to

int add(int *a, int *b){
    return *a + *b;
}

Now, if we run the program, it works. Yayy.

Complete program:

#include <stdio.h>

int add(int *a, int *b){
    return *a + *b;
}

int main() {
    int a = 1;
    int b = 2;

    int *aAddress = &a;
    int *bAddress = &b;

    printf("Address of a: %p\n", (void *)aAddress);
    printf("Address of b: %p\n", (void *)bAddress);

    int ret = add(aAddress, bAddress); 
    // or
    // int ret = add(&a, &b);

    if (ret != 3) {
        printf("Failed expected: 3, got: %d", ret);
    } else {
        printf("Passed expected: 3, got: %d", ret);
    }
    return 0;
}

A quick TL;DR:

#include <stdio.h>

int main() {
    int a = 1;             // declares a simple int variable a with value 1
    int *ptr = &a;         // creates a pointer variable ptr holding the address where a is stored in memory; `&` says: get the address of a
    int value = *ptr;      // dereferencing: go to the address held by ptr and get the value there, `1` in this case
    int alsoValue = *(&a); // dereferencing an address: go to the address of a and get the value there, `1` in this case. You can think of * and & as cancelling each other (order matters)
    // int alsoValue = &(*a); // This fails: *a tries to dereference a, which doesn't make sense as a is not a pointer
    int *alsoAddressLikeptr = &(*ptr); // Works and holds the same address as ptr. Declare it as int *: a plain int draws a compiler diagnostic and may keep only part of the address

    printf("a: %d\n", a);
    printf("address of a: %p\n", (void *)&a);
    printf("ptr: %p\n", (void *)ptr);
    printf("*ptr: %d\n", *ptr);
    printf("value: %d\n", value);
    printf("alsoValue: %d\n", alsoValue);

    return 0;
}

This is my drawing attempt at a mental model of pointers and addresses.

Ok, I hope this makes the first two symbols clear. Take a breather. Go through the above pointers (hehe) again if you didn't understand something. Run these programs on an online C compiler (just google 'online c compilers').

We still have two operators left: the arrow (->) and the period (.). Let's discuss those in the second part: Understanding pointers with structs in c