Internal Mechanics of Arrays
Course: DSA › Arrays › Introduction
So far, we learned what an array is and how it solves problems where we need to store and manipulate large-scale data easily. We can now look at how the array data structure works under the hood and what makes it so fast and easy to use.
Memory Addresses
Array elements are accessed using indices because arrays are stored contiguously in memory. To understand why, let’s revisit the memory model.
Note: This is how an array data structure is stored at the lowest level. Higher-level programming languages abstract all this from the user, but at their core, use the same mechanism.
Memory in RAM is logically organized as a sequence of blocks, each 1 byte (8 bits) long. Every block has a unique identifier — its address — which is simply its relative position from the start (starting from 0).
---
config:
theme: base
themeVariables:
primaryColor: "#dbeafe"
primaryBorderColor: "#3b82f6"
primaryTextColor: "#1e3a5f"
lineColor: "#64748b"
secondaryColor: "#ede9fe"
tertiaryColor: "#fef9c3"
---
flowchart LR
b0["0<br/>8 bits"] --- b1["1<br/>8 bits"] --- b2["2<br/>8 bits"] --- b3["3<br/>8 bits"] --- b4["4<br/>8 bits"] --- b5["5<br/>8 bits"] --- b6["6<br/>8 bits"] --- b7["7<br/>8 bits"]
addr(["Address = 3"]) --> b3
Memory is logically organized as a linear sequence of blocks.
Layout in Memory
An array is just a continuous segment of memory that stores data of a single type. Each element in the array has a fixed size equal to the size of its data type. So:
Total size of array = size of datatype × number of elements
The address of the memory block where an array starts is called the array’s base address.
Base address: The address of the block of memory where an array starts. The base address, along with the index, is used to access data items in an array.
Here’s what an array of 5 integers looks like in memory, with a base address of 2 and each int occupying 4 bytes:
---
config:
theme: base
themeVariables:
primaryColor: "#dbeafe"
primaryBorderColor: "#3b82f6"
primaryTextColor: "#1e3a5f"
lineColor: "#64748b"
secondaryColor: "#ede9fe"
tertiaryColor: "#fef9c3"
---
flowchart LR
base(["base address = 2"]) --> e0
e0["value1<br/>index: 0<br/>addr: 2 → 5"]
e1["value2<br/>index: 1<br/>addr: 6 → 9"]
e2["value3<br/>index: 2<br/>addr: 10 → 13"]
e3["value4<br/>index: 3<br/>addr: 14 → 17"]
e4["value5<br/>index: 4<br/>addr: 18 → 21"]
e0 --- e1 --- e2 --- e3 --- e4
Structure of an array in memory. Each element spans 4 consecutive bytes (size of int).
Key observations:
- Elements are laid out back to back with no gaps
- Each element spans exactly
size_of_datatypebytes - The next element always starts exactly
size_of_datatypebytes after the previous one
Accessing Data Items
Now that we know how an array maps into continuous memory, we can derive a simple formula to calculate the address of any element, given:
- the base address (where the array starts)
- the size of the datatype (bytes per element)
- the index (which element we want)
---
config:
theme: base
themeVariables:
primaryColor: "#dbeafe"
primaryBorderColor: "#3b82f6"
primaryTextColor: "#1e3a5f"
lineColor: "#64748b"
secondaryColor: "#ede9fe"
tertiaryColor: "#fef9c3"
---
flowchart LR
F["address(index)"] --> EQ["="]
EQ --> B["base_address"]
B --> PLUS["+"]
PLUS --> PAREN["( size_of_datatype × index )"]
style F fill:#ede9fe,stroke:#7c3aed,color:#3b0764
style B fill:#fef9c3,stroke:#d97706,color:#78350f
style PAREN fill:#dcfce7,stroke:#16a34a,color:#14532d
Calculating the address of a data item stored at a given index in an array.
Let’s verify with our example (base address = 2, int = 4 bytes):
| Index | Formula | Address |
|---|---|---|
| 0 | 2 + (4 × 0) | 2 |
| 1 | 2 + (4 × 1) | 6 |
| 2 | 2 + (4 × 2) | 10 |
| 3 | 2 + (4 × 3) | 14 |
| 4 | 2 + (4 × 4) | 18 |
This matches the layout exactly.
This is why array indices start at 0, not 1.
Think of the index not as a position number, but as a how many elements to skip count.
- To reach the first element — skip 0 elements → index
0- To reach the second element — skip 1 element → index
1- To reach the third element — skip 2 elements → index
2The formula makes this concrete. With base address
2and int size4:
index 0→ 2 + (4 × 0) = 2 ✓ lands exactly at the first elementindex 1→ 2 + (4 × 1) = 6 ✓ lands exactly at the second elementIf indices started at
1instead,index 1would give address6— jumping straight past the first element at address2, which would never be reachable. You’d need a messy correction likebase + (size × (index − 1)). Starting at0keeps the formula clean and direct.
The CPU computes this address instantly using a single multiplication and addition — no iteration, no searching. That’s why array access is O(1).
Key Takeaway
The power of arrays comes from this formula. Once you know the base address and the datatype size, you can jump to any element in constant time. The CPU doesn’t need to scan from the beginning — it does one arithmetic operation and lands exactly at the right memory address.
# Simulating the address formula in Python
base_address = 2
size_of_int = 4 # bytes
def address_of(index: int) -> int:
return base_address + (size_of_int * index)
for i in range(5):
print(f"value{i+1} at index {i} → address {address_of(i)}")