C/Pointers with words
Revision as of 14:47, 13 March 2012
Having been criticized by my colleagues, I decided to add some words to my Pointers page.
A Variable
Before we begin, this page assumes an x86 system (32-bit) - which means that addresses and integers are 32-bit / 4 bytes.
Let's start with a normal variable, `i`, of type `int`.
An `int` is just a block of 4 bytes that the compiler promises it will find memory for.
In reality this comes from the stack, but that is a detail for another time.
Let's declare `i` and assign `0xDEADBEEF` to it.

```c
int i;

i = 0xDEADBEEF;
```
If we were to inspect the memory of this process and represent it visually, we might find the following.

| Address | Value | Note |
|---|---|---|
| ... | | |
| `0xBF893E8F` | `0xDE` | `i` MSB |
| `0xBF893E8E` | `0xAD` | `i` |
| `0xBF893E8D` | `0xBE` | `i` |
| `0xBF893E8C` | `0xEF` | `i` LSB |
| ... | | |
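From this memory map, we can see that the variable `i` is stored at address `0xBF893E8C`, and because the memory is byte-addressable, `i` occupies 4 'slots' holding `0xDE`, `0xAD`, `0xBE` and `0xEF`.
The least significant byte sits at the lowest address because x86 is little-endian.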
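The byte order in the map can be checked from code. The sketch below is only an illustration, not part of the original page: the helper name `copy_bytes` is hypothetical, and it uses `uint32_t` so that the value is exactly 4 bytes whatever size `int` happens to be.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 4 bytes of 'value' into 'out' in the order they sit in memory,
 * lowest address first. 'copy_bytes' is a hypothetical helper name. */
void copy_bytes(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}
```

On a little-endian machine such as x86, calling `copy_bytes(0xDEADBEEF, out)` should fill `out` with `0xEF`, `0xBE`, `0xAD`, `0xDE`, which is the map above read from the lowest address upwards.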
A Pointer
Now let's consider a pointer.
A pointer is just like a normal variable, but it has a special meaning to the compiler.

```c
int *p;

p = (int *)0xDEADBEEF; /* the cast is needed to store a plain number in a pointer */
```

The code above gives exactly the same result: assuming that `p` is located at the same place in memory that `i` was, the previous memory map is still valid.
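To see that a pointer really is stored as ordinary bytes, we can copy those bytes out and rebuild the pointer from them. This is a minimal sketch; `pointer_bytes` and `pointer_from_bytes` are hypothetical helper names, and the array size is `sizeof(int *)`, which is 4 on the 32-bit system assumed here but may be 8 elsewhere.

```c
#include <string.h>

/* Copy the bytes that make up the pointer 'p' itself (not the data it
 * points at) into 'out'. 'pointer_bytes' is a hypothetical helper name. */
void pointer_bytes(int *p, unsigned char out[sizeof(int *)])
{
    memcpy(out, &p, sizeof p);
}

/* Rebuild a pointer from bytes previously written by pointer_bytes(). */
int *pointer_from_bytes(const unsigned char in[sizeof(int *)])
{
    int *p;

    memcpy(&p, in, sizeof p);
    return p;
}
```

The round trip gives back the original pointer, which is what we would expect if its value is nothing more than an address held in memory.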
But when we come to 'de-reference' `p`, we are going to have a problem.
'De-referencing' is the act of following a pointer to access the variable to which it points. It can be written in two ways:

```c
*p    /* usually used in a situation like this one, when 'p' points at a single variable */
p[0]  /* usually used when 'p' points to an array of variables */
```
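Both forms read the same memory. The following sketch shows them side by side using a pointer that is valid; the function names are made up for this illustration.

```c
/* Return the int that 'p' points at, using the '*p' form. */
int first_by_star(const int *p)
{
    return *p;
}

/* Return the same int, using the 'p[0]' form. */
int first_by_index(const int *p)
{
    return p[0];
}

/* Return element 'n' of an array; p[n] is defined as *(p + n). */
int nth(const int *p, int n)
{
    return *(p + n);
}
```

Given `int many[3] = {10, 20, 30};`, both `first_by_star(many)` and `first_by_index(many)` return 10, and `nth(many, 2)` returns 30.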
The map below shows the value of `p` as well as the memory that it points to.
As you can see, `p` is no longer a simple variable, because its value is used to locate some other memory.
In this case, the memory at `0xDEADBEEF` is not allocated to anything, and accessing it will typically cause a [segfault](http://en.wikipedia.org/wiki/Segmentation_fault) (your program will crash).
| Address | Value | Valid? | Note |
|---|---|---|---|
| ... | | | |
| `0xDEADBEF2` | `??` | No | `?` |
| `0xDEADBEF1` | `??` | No | `?` |
| `0xDEADBEF0` | `??` | No | `?` |
| `0xDEADBEEF` | `??` | No | `?` |
| ... | | | |
| `0xBF893E8F` | `0xDE` | Yes | `p` MSB |
| `0xBF893E8E` | `0xAD` | Yes | `p` |
| `0xBF893E8D` | `0xBE` | Yes | `p` |
| `0xBF893E8C` | `0xEF` | Yes | `p` LSB |
| ... | | | |
Let's make something useful, instead of something that will crash.
Consider the following code:
```c
int i;    /* our variable */
int *p;   /* our pointer */

p = &i;   /* assign the address of 'i' to 'p' */

*p = 0x12345678;   /* effectively assigns 0x12345678 to 'i' */
```
In this snippet, we make `p` point at `i`, and then store `0x12345678` in the memory that `p` points to.
This effectively stores `0x12345678` in `i`, and could result in the following memory map.
| Address | Value | Valid? | Note |
|---|---|---|---|
| ... | | | |
| `0xBF893E8F` | `0x12` | Yes | `i` MSB |
| `0xBF893E8E` | `0x34` | Yes | `i` |
| `0xBF893E8D` | `0x56` | Yes | `i` |
| `0xBF893E8C` | `0x78` | Yes | `i` LSB |
| `0xBF893E8B` | `0xBF` | Yes | `p` MSB |
| `0xBF893E8A` | `0x89` | Yes | `p` |
| `0xBF893E89` | `0x3E` | Yes | `p` |
| `0xBF893E88` | `0x8C` | Yes | `p` LSB |
| ... | | | |
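The same idea can be wrapped in a function, which is where pointers start to earn their keep: the function can change a variable that belongs to its caller. This is a small sketch, and `store_through` is a hypothetical name chosen for the example.

```c
/* Store 'value' in whatever int 'p' points at.
 * 'store_through' is a hypothetical helper name. */
void store_through(int *p, int value)
{
    *p = value;
}
```

After `int i; int *p = &i; store_through(p, 0x12345678);`, the variable `i` holds `0x12345678` and `p` still holds the address of `i`, just as in the map above.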