Most people find arrays and pointers hard to comprehend. My personal theory is that this happens because their knowledge is limited to the coding side of it. I hope this illustrated tutorial helps you understand them easily. If you don't know what happens inside RAM when you declare a variable, don't skip ahead to the pointers part; read through the basics first, where I explain what happens inside RAM when you declare a variable.
You can download the PowerPoint presentation from this link.
PS: This guide assumes that you have already dabbled with pointers and are here for a better understanding.
For all the code explained below, we shall be discussing it with the understanding that the compiler does no optimisations.
Behind the scenes
Let's start by looking at a piece of code,
#include <stdio.h>

int main(void)
{
    int a;      /* reserve memory for an integer */
    a = 5;      /* store 5 in that memory */
    printf("Variable a has integer %d stored in it.", a);
    return 0;
}
We know that when we run it, it will print "Variable a has integer 5 stored in it." But do you know what happened behind the scenes? Well, a lot of things happened. The compiler understood that it needed to reserve enough memory to store the data of the integer variable a, then the value 5 was stored into that reserved area. Then the value was fetched and given to the printf function, which gave us our output. Let us take a closer look.
System Memory (RAM)
RAM stands for Random Access Memory and is usually what people mean when they say memory. Beware, though, that memory does not always refer to RAM: depending on the context it can also refer to Read Only Memory (ROM), where programs get stored. (To those of you who are technically advanced: let's just stick to this simpler concept of memory for now.) To understand pointers it is important to understand how RAM stores and retains data.
Why should anyone understand how RAM works?
- The output of a pointer operation depends on what is stored in RAM and where
- All pointer operations eventually end up as operations on RAM data
- It will help you understand why pointers are a double-edged sword
In RAM all data is stored in binary form.
- Binary form is a combination of 1s and 0s.
- The elementary unit of binary form is called a bit (it can be a 1 or a 0).
e.g. If we store 120 in RAM, then somewhere in RAM the sequence "01111000" (8 bits) is stored.
- The bit is the basic unit: 8 bits form a byte, 1024 bytes form a kilobyte (KB), 1024 kilobytes form a megabyte (MB), 1024 megabytes form a gigabyte (GB), ...
When we stored 5 into variable a in the previous program, what actually happened was that the binary form of 5 (00000101) got stored at some location in RAM.
> RAM is just a collection of bits (or storage space for bits).
> So a 256 MB RAM is a collection of 2147483648 bits.
> If we gave an address to each bit in RAM there would be far too many addresses, so instead each byte of RAM is given a unique address.
> Hence a 256 MB RAM has 2147483648 / 8 = 268435456 bytes, and that many addresses.
If we represent every byte of RAM by a box, it would look like this:
1000 | 1001 | 1002 | 1003 | 1004 | 1005 | 1006 | 1007 | 1008
1009 | 1010 | 1011 | 1012 | 1013 | 1014 | 1015 | 1016 | 1017
1018 | 1019 | 1020 | 1021 | 1022 | 1023 | 1024 | 1025 | 1026
1027 | 1028 | 1029 | 1030 | 1031 | 1032 | 1033 | 1034 | 1035
1036 | 1037 | 1038 | 1039 | 1040 | 1041 | 1042 | 1043 | 1044
1045 | 1046 | 1047 | 1048 | 1049 | 1050 | 1051 | 1052 | 1053
1054 | 1055 | 1056 | 1057 | 1058 | 1059 | 1060 | 1061 | 1062
1063 | 1064 | 1065 | 1066 | 1067 | 1068 | 1069 | 1070 | 1071
Let's assume each byte is given an address and the addresses start at 1000. There would be 268435456 boxes in the actual scenario. Every box holds 8 bits.
While the boxes above are a good visual aid, there are a few things that the table misrepresents.
1. 1009 appears on a separate row. RAM is really one continuous stretch of bytes; the table is split into rows and columns only for ease of presentation.
2. The starting address of 1000. The real starting address changes from system to system; what matters is that the address of the second byte is the address of the first byte + 1.
Important Binary Concepts
An important concept related to binary data is that of the MSB and the LSB.
MSB stands for most significant bit and LSB stands for least significant bit.
We know that,
01111000 in binary corresponds to 120 in decimal.
01111001 in binary corresponds to 121 in decimal.
11111000 in binary corresponds to 248 in decimal.
As you can see, changing the rightmost bit changed the value by only 1 (120 to 121), while changing the leftmost bit changed it by 128 (120 to 248). The leftmost bit has the greatest significance for the value, so it is called the MSB; the rightmost bit is called the LSB.
Different counting styles
A major reason why people get their programs wrong is that they don't understand the difference between how a C program counts and how humans count. C programs start counting from 0, while we humans start counting from 1.
Let's look at the initialization of a 1D array of integers.
int a[2] = {10, 20};
Whatever we give inside the [] of the declaration is the count of data items, in our style of counting (1, 2, 3, ...). The positions, however, are numbered in the computer's style (0, 1, 2, 3, ...).
Due to this difference in counting style, what we call the first position is the zeroth position for the computer.
0th position of computer | 1st position of computer | 2nd position of computer
Our 1st position | Our 2nd position | Our 3rd position
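Are the following statements correct?
1 to 5 contains 5 numbers
0 to 4 contains 5 numbers
Both are correct: a range from m to n contains n - m + 1 numbers, so each of these ranges holds 5 numbers.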
char a; Explained
As I mentioned earlier, RAM is a collection of bits. It can represent just about anything in binary form.
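#include <stdio.h>

int main(void)
{
    char a;      /* reserve 1 byte for a character */
    a = 'C';     /* store the number that stands for 'C' */
    printf("Variable a has character %c stored in it.", a);
    return 0;
}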
When the compiler sees the declaration ‘char a’, 1 byte of memory space (8 bits) is reserved for storing the character data. The variable name ‘a’ is associated with the reserved byte’s address.
Since only numbers have a binary representation, it's necessary that each character be associated with a number. ASCII (American Standard Code for Information Interchange) does just that.
In ASCII, all characters present at the time of its creation were assigned a number. (New characters were later added in Extended ASCII tables.)
Standard ASCII uses the numbers 0 to 127, and Extended ASCII tables go up to 255, which in binary is 11111111 (8 bits/1 byte). Hence 8 bits (1 byte) are enough to store any of these characters, and char a; reserves 1 byte.
How exactly is the data stored?
Let's look at the code,
scanf("%c", &a);
‘a’ having previously been declared as char a;
Suppose the user enters ‘D’ and hits enter. Then the binary form of ‘D’ is stored in a, that is, in the reserved byte of a, which we assume has the address 1000.
The binary form of ‘D’ is 01000100 (decimal 68) as per the ASCII table.
int no; Explained
In the case of characters the maximum number is 255, hence only 8 bits are necessary.
Integers extend from -∞ to +∞, so would we need ∞ bits to store them?!
No. The C language and its compilers set a fixed size for storing integers.
In this tutorial we assume the C compiler reserves two bytes of memory (16 bits) for an int, as older 16-bit compilers do. The C standard only requires at least 16 bits, and most modern compilers reserve 4 bytes.
Integers can be negative or positive, so the starting bit is reserved for sign representation
(sign bit) [0 for +ve and 1 for -ve]
The remaining 15 bits store the number.
On practically all systems, negative numbers are stored in 2’s complement form.
With the sign bit and the 15 remaining bits we can store numbers from -2^15 to 2^15 - 1 (-32768 to 32767).
[The sign bit ends up as 1 only for negative numbers, as a result of taking the 2’s complement.]
Let's see how this code works.
scanf("%d", &no);
And the user inputs 25.
The binary form of 25 is 11001.
Since 25 is positive it is stored in memory directly. [No 2’s complement calculation necessary]
‘no’ has two bytes associated with it, at two memory addresses [say 1200 and 1201].
Only the starting address is associated with ‘no’.
The binary form of 25 is 11001, but that of -25 is the 2’s complement of 11001, i.e. 1111111111100111 in a 16-bit signed representation. (11001 is actually 0000000000011001, as integers are 16 bits long.) Yes, taking the 2’s complement sets the MSB.
BONUS:
Remember how we stored 25 as 00000000 at address 1200 and 00011001 at address 1201?
It is legitimate to ask why it was stored that way. Why not store 00011001 at address 1200 and 00000000 at 1201 instead? Yes, that is also acceptable; in fact there are systems out there which store it in that fashion. Based on which order is used, a system is classified as BIG ENDIAN (most significant byte at the lowest address) or LITTLE ENDIAN (least significant byte at the lowest address).
unsigned int no; Explained
In this case the sign bit’s space is also used for data storage. Hence 16 bits can hold all values from 0 to 2^16 - 1 (0 to 65535).
float no; Explained
The compiler reserves 4 bytes (32 bits) for you, of which the starting bit is a sign bit.
On most systems float follows the IEEE 754 standard: 1 sign bit, 8 exponent bits and the remaining 23 bits for the mantissa.
Learning multi-dimensional arrays
Concept of nD arrays, n = 1, 2, 3
The best way to learn an nD array is to compare it with an nD space. This makes it easy to use 2D, 3D, ... arrays.
i.e. for a 1D array, 1D space = X axis
For a 2D array, 2D space = X-Y space
For a 3D array, 3D space = X-Y-Z space
The concept of a 4D array will be explained later.
Concept of 1D arrays
Normally declared as
char a[5]; (5 – our style of counting)
int b[5]; ( " )
……
And we access it like a[2], a[4], ….
In a[position], position takes the values 0, 1, 2, …, 4 [computer style], so we get the data at a[0], a[1], …, a[4] (0th position, 1st position, …).
We see that the change in the value of position can be imagined along a straight line, like the x-axis.
Concept of 2D arrays
Normally declared as
char a[3][3]; (3,3 – our style of counting)
int b[3][3]; ( " )
……
And we access it like a[0][0], a[0][1], …, a[2][1], a[2][2].
Let's consider it in 2D space.
Let's read the access as a[Y position][X position].
When Y position is held at a fixed value, the 2D space reduces to a 1D space at that value of Y position.
If Y position is 1, then a[1][X position] is like having a 1D array at Y position 1.
So with Y position fixed at 1, varying X position from 0 to 2 in a[1][X position] selects all the data in row 1 (computer style).
Likewise, if we keep X position at a fixed value, say 2, we can only vary Y position. Varying Y position from 0 to 2 retrieves all the data in column 2 (computer style).
A 3D array is an i x j x k matrix. In a 3D array there is one more axis, viz. the Z-axis.
It is declared as char a[Z size][Y size][X size] and accessed as a[Z-position][Y-position][X-position].
When accessing it, if Z-position is fixed, the 3D array reduces to a 2D array at that point. In graphical terms, ‘a plane at the corresponding Z-position’. If you have understood the concept of 1D and 2D arrays, I really don’t think I have to say much more.
Visualizing 3x3x3 3D array
How are arrays stored in RAM? How do pointer operations work on arrays?
Learn more at this link.