All about pointers



This item was added on: 2004/09/26

This article has a sibling at Prelude's website.

  • Nice and simple...
  • Function parameters and return types
  • Dynamic data structures
  • Iterators and saved locations
  • Miscellany

    Nice and simple...

    Pointers are surprisingly simple: all you have to do is remember that you're working with memory addresses. For example, let us use an imaginary block of memory (while at the same time showing off my beautiful artistic abilities):

    -----------------------------------
    | 123 || 124 || 125 || 126 || 127 |
    -----------------------------------
    

    Now, when you declare a regular variable, for example:
    char ch;

    The name 'ch' becomes a synonym for the memory location it is given. If ch is given the location 123 you can imagine the imaginary memory block to look like this:
    ----------------------------------
    | ch || 124 || 125 || 126 || 127 |
    ----------------------------------
    

    Now, a pointer is a separate variable that refers to another location in memory. When it comes down to it, though, a pointer is just another variable. You declare it using the pointer notation *:
    char *pch;

    Since pch is just another variable, our imaginary memory block might look like this (assuming pointers take up two bytes and chars one byte for the sake of relative accuracy):
    -----------------------------------------
    | ch(123) || pch(124,125) || 126 || 127 |
    -----------------------------------------
    

    To give ch a value, all you do is assign that value to the memory location (cleverly disguised by the name 'ch'):
    ch = 'A';

    The process is exactly the same with pointers except instead of arbitrary values, their value is an actual address in memory. To obtain the address of a variable, you prefix it with &. So to assign the address of ch to pch you would do this:
    pch = &ch;

    Now pch's value is the address of ch, exactly like ch's value is 'A'. If you access pch, you'll get that address (the %p format expects a void pointer, hence the cast):
    printf("%p\n", (void *)pch);

    Now, the great thing about pointers is that since they refer to the address they contain, you can "dereference" them by prefixing the * character. This accesses the memory at the address contained in the pointer. The following printf statements output the exact same thing:
    printf("%c\n", ch);
    printf("%c\n", *pch);

    Why? Because ch is the memory location that we assigned 'A'. By using a dereference, *pch is also the memory location that we assigned 'A'.

    This is the basic idea behind the pointer. Any data type in C can have a pointer to that data type (even functions!). Let's look at a few uses of pointers.

    Before going any further, it's a good idea to mention two special variations of a pointer. The first is a type of pointer and the second is a special value. First is the void pointer:

    void *pv;

    A void pointer is special in that it is the generic pointer: you can assign any object pointer to a void pointer and later convert it back to the original type. C performs both conversions implicitly on assignment, but to reach the pointed-to value inside an expression you need a cast, because void pointers cannot be dereferenced:
    void *pv;
    int  *pi;
    int   i = 10;
    
    pi = &i;
    pv = pi;
    
    printf("%d\n", *pi);       /* Prints 10 */
    printf("%d\n", *(int*)pv); /* Also prints 10 */
    

    Note that the cast is made *before* the dereference is performed. This is important: if the expression were
    (int*)*pv

    you would be dereferencing a void pointer and then converting the (nonexistent) result to a pointer to int. Both are quite wrong.

    Next is the null pointer. A null pointer is a value that you assign to a pointer variable when you want it to point to nothing in particular. It is used as a safe starting value for pointer variables and as the error return value of functions that return pointers, such as malloc. In source code a null pointer is written as a null pointer constant: an integer constant expression with the value 0 (zero), or such an expression cast to void *. The two common spellings are the integer constant 0 and the macro NULL, which is defined in several standard headers:

    #include <stddef.h> /* For NULL */
    
    int  *pi = 0;
    char *pc = NULL;
    

    A null pointer must not be dereferenced because it does not point to any valid object; doing so is undefined behavior. You can only assign null and test for it:
    if (pc != NULL)
        /* Do something */
    
    if (pc != 0)
        /* Do something */
    

    Since NULL and 0 are both null pointer constants, even if you assign NULL to a pointer variable, you can still test it against 0 to see if it's a null pointer.

    Now that all of that is out of the way, a few uses of pointers. :)

    Function parameters and return types

    Pointers are useful as function parameters and return types in two ways. First is a size issue. Imagine that you have a structure variable that is big, say, 30 bytes. If you pass it to a function the usual way:

    void f(struct MYSTRUCT mys);
    ...
    f(var);
    

    A copy of the entire 30 bytes is made. This can be wasteful in both memory and performance. Pointers to the rescue! By passing a pointer to that variable, only its memory address is copied and passed. Using our imaginary memory scheme, only 2 bytes are copied instead of 30. This is faster and uses less memory. Yay!
    void f(struct MYSTRUCT *pmys);
    ...
    f(&var);
    

    Notice that the syntax of the pointer hasn't changed from our original examples. This is important: consistency is a good thing.

    The second big help with pointers as function parameters and return types is the referencing feature. By passing mys in the first example, you get a *copy* of the variable passed to the function. So if you want to make any changes to the original variable, you have to return the copy with changes made and assign it to the original:

    struct MYSTRUCT f(struct MYSTRUCT mys);
    ...
    var = f(var);
    

    But what if you want to return an error code instead? Pointers to the rescue! By passing a pointer to the function, you are really copying the memory address of the original variable. Since you're able to access the memory address of the original variable, you can make *changes* to it from within another function!
    int f(struct MYSTRUCT *pmys);
    ...
    errorcode = f(&var);
    

    Return values work the same way with pointers: you can return a pointer for improved efficiency, and so that a variable can be changed outside of the function. There is one big pitfall that you should take into consideration though. Local (automatic) variables are destroyed and their memory reclaimed at the end of the enclosing block, so returning a pointer to a local variable is fraught with peril:
    int *f(void)
    {
        int i = 10;
        return &i; /* No! Returning the address of a local variable is bad! */
    }
    

    Returning memory that you control (such as memory returned by malloc and friends) is okay though:
    int *f(void)
    {
        int *i = malloc(sizeof (int));
        return i; /* Okay, you control when the memory is released */
    }
    

    Dynamic data structures

    Pointers are used heavily for data structures that can grow and shrink dynamically at runtime. This dynamic sizing requires the programmer to handle memory through pointers using malloc/calloc, realloc, and free. Such data structures can be node based (linked lists, binary trees, etc.) or block based (resizable arrays). To use a pointer to create an array, simply malloc enough memory for it by multiplying the size of the element type by the number of elements you want:

    int *parray = malloc(10 * sizeof (int));
    if (parray != NULL)
    {
        /* Okay, use it */
    }
    

    Note that malloc, calloc, and realloc all return null pointers if they fail. The above code is basically the same thing as using an ordinary fixed-size array:
    int array[10];

    There are two big differences. The first is that you have control of the size of parray at runtime: you can grow it or shrink it by using realloc. The second is that you *must* remember to free the memory you allocate using malloc, calloc, or realloc.

    Iterators and saved locations

    Pointers are helpful for iterating through arrays and saving things that you want to come back to. For example, if you have the following string:

    char s[] = "This is a test";

    And you want a pointer to the beginning of each word, you can simply do this:
    char *a = s;
    char *b = s + 5;
    char *c = s + 8;
    char *d = s + 10;
    

    Now a points to "This is a test", b points to "is a test", c points to "a test", and d points to "test". Note that pointers can have arithmetic performed on them. In most expressions the name of an array is converted to a pointer to its first element, so if you add 1 to it you get a pointer to the second element. The above could also have been written as
    char *a = &s[0];
    char *b = &s[5];
    char *c = &s[8];
    char *d = &s[10];
    

    Arrays and pointers are closely related but are *not* the same. Keep this in mind when mixing them.

    If a beginner in C wanted to print each element in an array, she would most likely do this:

    int array[10] = {1,2,3,4,5,6,7,8,9,-1};
    int i;
    
    for (i = 0; i < 10; i++)
        printf("%d\n", array[i]);
    

    or
    for (i = 0; array[i] != -1; i++)
        printf("%d\n", array[i]);
    

    This can also be done using pointers:
    int  array[10] = {1,2,3,4,5,6,7,8,9,-1};
    int *pi;
    
    for (pi = array; *pi != -1; pi++)
        printf("%d\n", *pi);
    

    By using arithmetic on a pointer to the first element of the array, you can walk every element in the array by adding one. Note that by adding one to a pointer, the pointer moves forward by the size of its data type. If char is one byte and int is two bytes (and assuming c is a char and i is an int):
    char *pc = &c;
    int  *pi = &i;
    
    pc++; /* Moves forward 1 byte */
    pi++; /* Moves forward 2 bytes */
    

    This is important to remember because you don't have to manage how much memory is jumped over; the compiler does all of this for you.

    Miscellany

    Pointers can be constant, but since a pointer really has two parts (the pointer itself and the object it points to), one or both or neither can be constant. You do it like this:

    const int *p;        /* The pointed-to value is const */
    int * const p;       /* The pointer itself is const */
    int *p;              /* Neither is const */
    const int * const p; /* Both are const */
    

    A pointer can point to a function, believe it or not. There are two things you can do with a function: call it, or take its address. If you aren't calling a function, then using its name yields its address, which you can assign to a pointer. The declaration for a pointer to a function looks complex, but just pretend you're writing a function prototype, then prefix the name with * and surround both with parentheses:
    void (*pf)(void); /* Pointer to a function that takes no args and returns nothing */
    

    Now you can assign a function with the same type:
    void f(void)
    {
        printf("Hello, world!\n");
    }
    
    pf = f; /* Not calling f, must be taking the address */
    

    Now that pf points to f, you can call f *through* pf:
    (*pf)();

    Wow. Of course, that notation can get annoying after a while, so C allows you to call f through pf without dereferencing pf:
    pf();

    Double wow.

    Pointers can also point to other pointers! My standard example of a pointer to a pointer is in the case of memory allocation. Say you want to pass a pointer to a function and allocate memory to it. You want to return an error code, so returning the freshly allocated memory isn't an option:

    int allocme(int *parray)
    {
        parray = malloc(10 * sizeof (int));
    
        if (parray == NULL)
            return -1;
    
        return 0;
    }
    
    ...
    
    int *pa;
    
    if (allocme(pa) != -1)
        /* All is well, use pa */
    

    This looks like it should work fine. By passing a pointer to int you can change the original, right? Right, but that's not what we want to do. We want to change the *pointer* that points to int, not the int that it points to. This is a problem because the pointer is a copy: the assignment inside allocme changes only the local parray, so pa in the caller is left untouched and the allocated memory is leaked. The solution is the same as when we wanted to make changes to the original structure variable: pass a pointer to it. :)
    int allocme(int **pparray)
    {
        *pparray = malloc(10 * sizeof (int));
    
        if (*pparray == NULL)
            return -1;
    
        return 0;
    }
    
    ...
    
    int *pa;
    
    if (allocme(&pa) != -1)
        /* All is well, use pa */
    

    Once again note that the syntax for a double pointer is consistent with a single pointer: just tack on an extra * for the declaration and the dereference. Just make sure that you dereference the outer pointer first. Pointers to pointers to structures can cause problems here because -> binds more tightly than the unary * operator, so when you want to access a member you have to surround the first dereference with parentheses:
    struct MYSTRUCT **ppmys;
    ...
    (*ppmys)->member; /* Works okay */
    *ppmys->member;   /* Parsed as *(ppmys->member), which does not compile */
    

    You can have as many levels of pointers to pointers as you want. The syntax remains consistent, but it gets nasty as you end up having to nest the dereferences and use parentheses. The most I've ever used is five:
    int *****crazy;

    I don't recommend it unless there's no other viable option.

    I hope this helps you to understand pointers a little bit better.
