Arrays & Pointers

Most people find arrays and pointers hard to comprehend. My personal theory is that this happens because their knowledge is limited to just the coding part of it. I hope this illustrated tutorial will help you understand it easily. If you don't know what happens inside RAM when you declare a variable, don't skip ahead to the pointers part. Read through the basics first, where I explain what happens inside RAM when you declare a variable.

You can download the PowerPoint presentation from this link.

PS: This guide assumes that you have already dabbled with pointers and are here for a better understanding.

For all the code explained below, we shall be discussing it with the understanding that the compiler does no optimisations.

Behind the scenes
Let's start by looking at a piece of code,

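#include <stdio.h>

int main(void)
{
    int a;      /* reserve memory for an integer named a */
    a = 5;      /* store 5 in that memory */

    printf("Variable a has integer %d stored in it.", a);
    return 0;
}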

We know that when we run it, it will print "Variable a has integer 5 stored in it." But do you know what happened behind the scenes? Well, a lot of things happened. The compiler understood that it needed to reserve enough memory to store the data of the integer variable a, then the program stored the value 5 in that reserved area. Then it fetched the value of that variable and gave it to the printf function, which gave us our output. Let us take a closer look.

System Memory (RAM)
RAM stands for Random Access Memory and is usually what people refer to when they say memory. Beware, though, that memory does not always mean RAM: depending on the context it can also refer to Read Only Memory (ROM), where programs get stored. (To those of you who are technically advanced: let's just stick to this simpler concept of memory for now.) To understand pointers, it is important to understand how RAM stores and retains data.

Why should anyone understand how RAM works?
- Output of pointers will depend on what is stored in RAM & where
- All pointer operations will eventually end up as some operations on RAM data
- It will help you understand why pointers are a double edged sword

In RAM all data is stored in binary form.
- Binary form is a combination of 1s and 0s.
- An elementary unit of binary form is called a bit (can be a 1 or a 0)
e.g. If we stored 120 in RAM, then somewhere in RAM the sequence "01111000" (8 bits) is stored.
- A bit is the basic unit; 8 bits form a byte, 1024 bytes form a kilobyte (KB), 1024 kilobytes form a megabyte (MB), 1024 megabytes form a gigabyte (GB), ...

When we stored 5 into variable a in the previous program, what actually happened was that the binary form of 5 (00000101) got stored at some location in RAM.

> RAM is just a collection of bits (or storage space for bits).
> So a 256 MB RAM is a collection of 2147483648 bits.
> If we were to give an address to each bit in RAM, there would be far too many addresses, so instead each byte of RAM is given a unique address.
> Hence in a 256 MB RAM there are 2147483648 / 8 = 268435456 bytes, and that many addresses.

If we represent every byte of RAM by a box, it would look like,

1000	1001	1002	1003	1004	1005	1006	1007	1008
1009	1010	1011	1012	1013	1014	1015	1016	1017
1018	1019	1020	1021	1022	1023	1024	1025	1026
1027	1028	1029	1030	1031	1032	1033	1034	1035
1036	1037	1038	1039	1040	1041	1042	1043	1044
1045	1046	1047	1048	1049	1050	1051	1052	1053
1054	1055	1056	1057	1058	1059	1060	1061	1062
1063	1064	1065	1066	1067	1068	1069	1070	1071

Let's assume each byte is given an address and that the addresses start at 1000. There would be 268435456 boxes in the actual scenario. In every box there are 8 bits.

While the boxes above are a good visual aid, there are a few things that the table misrepresents.
1. 1009 appears on a separate row. RAM is one continuous stretch of bytes; the table is shown in rows and columns only for ease of presentation.
2. The starting address of 1000. It changes from system to system; what matters is that the address of the second byte is the address of the first byte + 1.

Important Binary Concepts
An important concept related to binary data is MSB and LSB!
MSB stands for the most significant bit and LSB stands for least significant bit.
We know that,
01111000 in binary corresponds to 120 in decimal.
01111001 in binary corresponds to 121 in decimal.
11111000 in binary corresponds to 248 in decimal.
As you can clearly see, the leftmost bit has a higher significance on the value than the one on the right. Hence the leftmost bit is called the MSB and the rightmost bit is called the LSB.

Different counting styles
A major reason why people get their programs wrong is that they don't understand the difference between the counting style of a C program and that of humans. C programs start counting from 0, while we humans start counting from 1.

Let's look at the initialization of a 1D array of integers.

int a[2] = {10, 20};

Whatever we give in the [] of the declaration is the count of data items. We count from one (1, 2, 3, ...) and computers count from zero (0, 1, 2, 3, ...).

Due to this difference in counting style, what we refer to as the first position, the computer refers to as the zeroth position.

0th position of computer	1st position of computer	2nd position of computer
Our 1st position	Our 2nd position	Our 3rd position

Is the following correct?
1 to 5 contains 5 numbers
0 to 4 contains 5 numbers

char a; Explained
As I have mentioned earlier, RAM is a collection of bits. It can/could represent just about anything in binary form.

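#include <stdio.h>

int main(void)
{
    char a;     /* reserve 1 byte for a character named a */
    a = 'C';    /* store the character code of 'C' in that byte */

    printf("Variable a has character %c stored in it.", a);
    return 0;
}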

When the compiler sees the declaration ‘char a’, 1 byte of memory space (8 bits) is reserved for storing the character data. The variable name ‘a’ is associated with the reserved byte’s address.

Since only numbers have a binary representation, it's necessary that each character be associated with a number. ASCII (American Standard Code for Information Interchange) does just that.

In ASCII, all characters present at the time of its creation were assigned a number. (New characters were added in Extended ASCII tables.)

Standard ASCII uses the numbers 0 to 127, and the extended tables go up to 255, which in binary is 11111111 (8 bits/1 byte). Hence 8 bits (1 byte) are enough to store any of these characters, and char a; reserves 1 byte.

How exactly is the data stored?

Let's look at the code,

scanf("%c", &a);

‘a’ being previously declared as char a;

Suppose the user enters ‘D’ and hits enter. Then the binary of ‘D’ is stored in a.

The reserved byte of a has an assumed address of 1000. The binary of ‘D’ (decimal 68) is 1000100 as per the ASCII table, so the byte at address 1000 holds 01000100.

int no; Explained

In the case of characters the maximum ASCII number is 255, hence only 8 bits are necessary.

Integers extend from -∞ to +∞, so will we need ∞ bits to store them?!

To avoid that, C compilers store integers in a fixed number of bytes.

i.e. the C compiler will reserve two bytes of memory for you (16 bits). (Two bytes is the size assumed throughout this tutorial; the C standard only guarantees at least 16 bits, and many current compilers reserve four bytes for an int.)

Integers can be negative or positive, so the starting bit is reserved for sign representation

(sign bit)[0 for +ve and 1 for -ve]

The remaining 15 bits store the number.

The C compiler stores negative numbers using 2’s complement form.

With the sign bit and the 15 remaining bits we can store numbers from -2^15 to 2^15 - 1 (-32768 to 32767).

[The sign bit is 1 only for negative numbers, as a result of taking the two’s complement.]

Let's see how this code works.

scanf("%d", &no);

And the user inputs 25.

Binary of ‘25’ is 11001.

Since 25 is positive, it’s stored in memory directly. [No 2’s complement calculation necessary]

And ‘no’ has two bytes associated with it, at two memory addresses [say 1200 and 1201].

Only the starting address is assigned to ‘no’.

The binary of 25 is 11001, but that of -25 is the 2’s complement of 11001, i.e. 1111111111100111 in a 16-bit signed representation. (11001 is actually 0000000000011001, as integers are 16 bits in length.) Yes, taking the 2's complement SETS the MSB.

BONUS:

Remember how we stored 25 as 00000000 at address 1200 and 00011001 at address 1201?

It is a legitimate question to ask why it was stored that way. Why not store 00011001 at address 1200 and 00000000 at 1201 instead? Yes, that is also acceptable; in fact, there are systems out there which store it in that fashion. Based on how it stores the bytes, a system is classified as BIG ENDIAN (most significant byte at the lowest address, as in our example) or LITTLE ENDIAN (least significant byte at the lowest address).

unsigned int no; Explained

In this case the sign bit’s space is also used for data storage. Hence it can hold all values from 0 to 2^16 - 1 (0 to 65535).

float no; Explained

The compiler reserves 4 bytes for you, of which the starting bit is a sign bit.

float follows the IEEE 754 standard: 1 sign bit, 8 exponent bits and the remaining 23 bits for the mantissa.

Learning multi-dimensional arrays

Concept of nD arrays n = 1,2,3

The best way to learn nD arrays is to compare them with an nD space. This makes it easy to use 2D, 3D, ... arrays.

i.e. for a 1D array, 1D space = X axis

For a 2D array, 2D space = X-Y space

For a 3D array, 3D space = X-Y-Z space

The 4D array concept will be explained later.

Concept of 1D arrays

Normally declared as

char a[5]; (5 – Our style of counting)

int b[5]; ( " )

……

And we call it like a[2], a[4], ….

In a[position], as position takes the values 0, 1, 2, …, 4 [Computer style], we get the data at a[0], a[1], …, a[4] (0th position, 1st position, ….)

We see that the change in the value of position can be imagined along a straight line, like the x-axis.

Concept of 2D arrays

Normally declared as

char a[3][3]; (3,3 – Our style of counting)

int b[3][3]; ( " )

……

And we call it like a[2][0], a[2][1], …., a[0][2], …. (each index runs from 0 to 2)

Let's consider it in 2D space.

Let's consider the call as a[Y position][X position]

When Y position is a fixed value, the 2D space reduces to a 1D space at that fixed value of Y position.

If Y position is 1, then a[1][X position] is like having a 1D array at Y position 1.

We see that if Y position is 1, then by varying X position from 0 to 2 in a[1][X position] we are able to select all the data of the 1st row (computer style).

And if we keep X position at a fixed value, say 2, we can only vary Y position, which retrieves all the data of the 2nd column (computer style) as Y position varies from 0 to 2.

A 3D array is an i x j x k matrix. In a 3D array there is one more axis, viz. the Z-axis.

Declared as char a[Z-position][Y-position][X-position].

During a call, when Z-position is fixed, the 3D array reduces to a 2D array at that point. In graphical terms, ‘a plane at the corresponding Z-position’. If you have understood the concept of 1D and 2D arrays, I really don’t think I have to say much.

Visualizing 3x3x3 3D array