C/Pointers with words

From Attie's Wiki
Revision as of 14:47, 13 March 2012

Having been criticized by my colleagues, I decided to add some words to my Pointers page.

A Variable

Before we begin: this page assumes a 32-bit x86 system, which means that addresses and integers are both 32 bits (4 bytes) wide.

Let's start by thinking about a normal variable, i, of type int.

An int is just a block of 4 bytes that the compiler promises it will find memory for.
For a local variable this memory comes from the stack, but that is a detail for another time.
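The 4-byte assumption is easy to check on any given machine. The sketch below is only an illustration: the function names are made up for this page, and on a 64-bit system the address size will be reported as 8 bytes rather than 4.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if this platform matches the page's
 * assumption of 4-byte integers and 4-byte addresses, 0 otherwise. */
int matches_page_assumptions(void)
{
    return sizeof(int) == 4 && sizeof(void *) == 4;
}

/* Prints the real sizes so they can be compared with the assumption. */
void print_sizes(void)
{
    printf("sizeof(int)    = %zu\n", sizeof(int));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
}
```

Calling print_sizes() from main() shows what your own compiler uses. If matches_page_assumptions() returns 0, the ideas on this page still apply, but the memory maps will have a different number of rows per variable.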

Let's declare i and assign 0xDEADBEEF to it.

int i;
 
i = 0xDEADBEEF;

If we were to inspect the memory for this process and represent it visually, we might find the following (take note of the address of the LSB, it will become important later).

Address      Value   Note
...
0xBF893E8F   0xDE    i MSB
0xBF893E8E   0xAD    i
0xBF893E8D   0xBE    i
0xBF893E8C   0xEF    i LSB
...

From this memory map, we can see that the variable i is stored at address 0xBF893E8C and, because the memory is byte-addressable, i occupies 4 'slots' holding 0xDE, 0xAD, 0xBE and 0xEF. x86 is little-endian, so the least significant byte (LSB) sits at the lowest address.
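The byte-by-byte layout in the memory map can be observed from C by copying a value's bytes into an array. This is a minimal sketch with made-up function names. It uses uint32_t instead of int so that 0xDEADBEEF is guaranteed to fit, and the byte order it reports depends on the machine (x86 is assumed to be little-endian, as above).

```c
#include <stdint.h>
#include <string.h>

/* Copies the 4 bytes of 'value' into 'out', lowest address first. */
void bytes_of_u32(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 if the least significant byte is stored at the lowest
 * address (little-endian, as on x86), 0 otherwise. */
int lsb_is_at_lowest_address(void)
{
    unsigned char b[4];

    bytes_of_u32(0xDEADBEEFu, b);
    return b[0] == 0xEF;
}
```

On a little-endian machine, out[0] corresponds to the LSB row of the map and out[3] to the MSB row.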

A Pointer

Now let's consider a pointer.
A pointer is just like a normal variable, but its value has a special meaning to the compiler: it is treated as an address.

int *p;
 
p = (int *)0xDEADBEEF;  /* the cast is needed to store a plain number in a pointer */

The code above stores exactly the same bytes as before, and assuming that p is located at the same place in memory that i was, the previous memory map is still valid.
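To see that a pointer really does just hold a number, its value can be converted back to an integer. The sketch below is an assumption-laden illustration: the function name is made up, and uintptr_t is used because on a 64-bit system a pointer is wider than an int. The pointer is never de-referenced, so nothing crashes.

```c
#include <stdint.h>

/* Stores the number 0xDEADBEEF in a pointer, then reads the pointer's
 * own value back as an integer. The pointer is never de-referenced. */
uintptr_t pointer_as_number(void)
{
    int *p = (int *)(uintptr_t)0xDEADBEEFu;

    return (uintptr_t)p;
}
```

On mainstream platforms the value that comes back is the same 0xDEADBEEF that went in, although the C standard leaves the exact result of such conversions up to the implementation.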

But when we come to 'de-reference' p we are going to have a problem.
'De-referencing' is the act of following a pointer to access the variable that it points to. This can be written in two ways:

*p     /* usually used when 'p' points at a single variable, as it does here */
p[0]   /* usually used when 'p' points to an array of variables */
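The two forms are interchangeable, because p[n] is defined to mean *(p + n). A minimal sketch, using function names invented for this page:

```c
/* Both functions read the int that 'p' points at. */
int read_with_star(const int *p)
{
    return *p;
}

int read_with_index(const int *p)
{
    return p[0];        /* identical to *(p + 0), which is *p */
}

/* Reads the int that follows the one 'p' points at.
 * Only valid when 'p' points into an array with at least 2 elements. */
int read_second_element(const int *p)
{
    return p[1];        /* identical to *(p + 1) */
}
```

Given int i = 42; both read_with_star(&i) and read_with_index(&i) return 42. Given int a[3] = {10, 20, 30}; read_second_element(a) returns 20.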

The map below shows the value of p as well as the memory that it points to.
As you can see, this is no longer a simple variable, because its value is used to locate some other memory.
In this case, the memory at 0xDEADBEEF is not allocated to anything, and accessing it will typically cause a segfault (http://en.wikipedia.org/wiki/Segmentation_fault), meaning that your program will crash.

Address      Value   Valid?   Note
...
0xDEADBEF2   ??      No       ?
0xDEADBEF1   ??      No       ?
0xDEADBEF0   ??      No       ?
0xDEADBEEF   ??      No       ?
...
0xBF893E8F   0xDE    Yes      p MSB
0xBF893E8E   0xAD    Yes      p
0xBF893E8D   0xBE    Yes      p
0xBF893E8C   0xEF    Yes      p LSB
...

Let's make something useful, instead of something that will crash.
Consider the following code:

int i;            /* our variable */
int *p;           /* our pointer */
 
p = &i;           /* assign the address of 'i' to 'p' */
 
*p = 0x12345678;  /* effectively assigns 0x12345678 to 'i' */

In this snippet, we make p point at i, and then store 0x12345678 in the memory that p points to.
This effectively stores 0x12345678 in i, and could result in the following memory map. Notice that the 4 bytes of p hold 0xBF893E8C, the address of i.

Address      Value   Valid?   Note
...
0xBF893E8F   0x12    Yes      i MSB
0xBF893E8E   0x34    Yes      i
0xBF893E8D   0x56    Yes      i
0xBF893E8C   0x78    Yes      i LSB
0xBF893E8B   0xBF    Yes      p MSB
0xBF893E8A   0x89    Yes      p
0xBF893E89   0x3E    Yes      p
0xBF893E88   0x8C    Yes      p LSB
...
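The same idea can be wrapped up in a couple of small functions. This is only a sketch, and the function names are made up for this page. It shows that writing through a pointer changes the variable it points at, and that the pointer's own value is simply that variable's address.

```c
/* Stores 'value' in whatever int 'p' points at. */
void store_through(int *p, int value)
{
    *p = value;
}

/* Returns 1 if 'p' holds the address of 'target', 0 otherwise. */
int points_at(const int *p, const int *target)
{
    return p == target;
}
```

After int i; int *p = &i; store_through(p, 0x12345678); the variable i holds 0x12345678 and points_at(p, &i) returns 1. The pointer p is itself a variable with its own address, &p, which is different from &i, just as p and i occupy different rows in the memory map.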