Understanding Column Major Order

The Mirror Image of Row-Major

You’ve seen that row-major order flattens a 2D array row by row. Column-major order does the exact opposite — it flattens the array column by column.

Column-major: store all elements of column 0 top to bottom, then column 1, then column 2, and so on.

Think of reading a Chinese newspaper — you start at the top of a column and read downward, then move to the next column. That’s column-major order.

Languages like Fortran, MATLAB, Julia, and R store their arrays this way. This matters enormously when you’re calling numerical libraries or exchanging data between ecosystems.

Side-by-Side: What Changes

Take the same 3×4 array from the previous lessons:

arr = [
  [10, 20, 30, 40],   ← Row 0
  [50, 60, 70, 80],   ← Row 1
  [90, 11, 12, 13],   ← Row 2
]

	Flat memory sequence
Row-major	10, 20, 30, 40, 50, 60, 70, 80, 90, 11, 12, 13
Column-major	10, 50, 90, 20, 60, 11, 30, 70, 12, 40, 80, 13

The 2D grid is identical. The only thing that changes is which elements end up next to each other in memory.

Visualising Column-Major Order

Here’s the same 3×4 logical grid:

---
config:
  theme: base
  themeVariables:
    primaryColor: "#dbeafe"
    primaryBorderColor: "#3b82f6"
    primaryTextColor: "#1e3a5f"
    lineColor: "#64748b"
    secondaryColor: "#ede9fe"
    tertiaryColor: "#fef9c3"
---
block-beta
  columns 4
  A["[0][0] = 10"] B["[0][1] = 20"] C["[0][2] = 30"] D["[0][3] = 40"]
  E["[1][0] = 50"] F["[1][1] = 60"] G["[1][2] = 70"] H["[1][3] = 80"]
  I["[2][0] = 90"] J["[2][1] = 11"] K["[2][2] = 12"] L["[2][3] = 13"]

The logical 2D view — same 3×4 grid as before. What changes is the memory layout below.

In column-major order, this is flattened one complete column at a time — top to bottom within each column:

---
config:
  theme: base
  themeVariables:
    primaryColor: "#dbeafe"
    primaryBorderColor: "#3b82f6"
    primaryTextColor: "#1e3a5f"
    lineColor: "#64748b"
    secondaryColor: "#ede9fe"
    tertiaryColor: "#fef9c3"
---
flowchart LR
  subgraph C0["Column 0"]
    a["10"] --> b["50"] --> c["90"]
  end
  subgraph C1["Column 1"]
    d["20"] --> e["60"] --> f["11"]
  end
  subgraph C2["Column 2"]
    g["30"] --> h["70"] --> i["12"]
  end
  subgraph C3["Column 3"]
    j["40"] --> k["80"] --> l["13"]
  end
  c --> d
  f --> g
  i --> j

Generic representation of the 2D array in column-major order in memory — Column 0 is placed first (top to bottom), then Column 1, then Column 2, then Column 3.

Notice how neighbours in memory are now vertically adjacent in the grid, not horizontally. 10 (addr 0) is next to 50 (addr 1), not 20.

The N-Dimensional Rule

For a general N-dimensional array with sizes Dₙ × Dₙ₋₁ × ... × D₁:

Column-major: the highest dimension index (Dₙ) moves the fastest. The lowest dimension index (D₁) moves the slowest.

This is the exact reverse of row-major (where D₁ moves fastest). Think of it as flipping the odometer — the leftmost digit now ticks, and the rightmost digit is the slowest.

For the 3×4 array (D₂=3, D₁=4), the indices progress like this:

---
config:
  theme: base
  themeVariables:
    primaryColor: "#dbeafe"
    primaryBorderColor: "#3b82f6"
    primaryTextColor: "#1e3a5f"
    lineColor: "#64748b"
    secondaryColor: "#ede9fe"
    tertiaryColor: "#fef9c3"
---
flowchart LR
  subgraph Col0["Column 0  (D₁=0)  —  D₂ races 0→1→2"]
    direction TB
    r00["[0][0]"] --> r10["[1][0]"] --> r20["[2][0]"]
  end
  subgraph Col1["Column 1  (D₁=1)  —  D₂ races 0→1→2"]
    direction TB
    r01["[0][1]"] --> r11["[1][1]"] --> r21["[2][1]"]
  end
  subgraph Col2["Column 2  (D₁=2)  —  D₂ races 0→1→2"]
    direction TB
    r02["[0][2]"] --> r12["[1][2]"] --> r22["[2][2]"]
  end
  subgraph Col3["Column 3  (D₁=3)  —  D₂ races 0→1→2"]
    direction TB
    r03["[0][3]"] --> r13["[1][3]"] --> r23["[2][3]"]
  end
  r20 -->|"D₁ increments"| r01
  r21 -->|"D₁ increments"| r02
  r22 -->|"D₁ increments"| r03

Column-major order lays out elements by moving the highest dimension the fastest — D₂ (row index) races 0→1→2 within each column; D₁ (column index) only increments when the column is exhausted.

The full memory sequence for our 3×4 array:

[0][0] → [1][0] → [2][0] → [0][1] → [1][1] → [2][1] → [0][2] → [1][2] → [2][2] → [0][3] → [1][3] → [2][3]
  ↑ D₂ races ↓         ↑ D₂ races ↓         ↑ D₂ races ↓         ↑ D₂ races ↓

The Address Formula

Reframe the Mental Model

In row-major, you think of skipping rows. In column-major, you think of skipping columns.

To reach element arr[i][j] in a matrix with num_rows rows and num_cols columns:

Skip past j complete columns — each column has num_rows elements, so skip j × num_rows elements
Walk i steps down into the current column — add i more

The offset from the start of the array:

offset = j × num_rows + i

The memory address:

address = base_address + (j × num_rows + i) × element_size

Compare this directly with row-major:

	Formula	What you multiply by
Row-major	`i × num_cols + j`	Row index × column count
Column-major	`j × num_rows + i`	Column index × row count

The structure is symmetric. Row-major uses num_cols as the stride; column-major uses num_rows as the stride.

Walk Through It With Exact Numbers

Let’s find arr[1][2] in our 3×4 array (base_address = 1000, element_size = 4 bytes, num_rows = 3):

---
config:
  theme: base
  themeVariables:
    primaryColor: "#dbeafe"
    primaryBorderColor: "#3b82f6"
    primaryTextColor: "#1e3a5f"
    lineColor: "#64748b"
    secondaryColor: "#ede9fe"
    tertiaryColor: "#fef9c3"
---
flowchart TB
  Input["Looking for arr[1][2]<br/>i = 1,  j = 2,  num_rows = 3<br/>base = 1000,  element_size = 4 bytes"]
  Step1["Skip j = 2 complete columns<br/>2 × 3 = 6 elements skipped"]
  Step2["Walk i = 1 step into Column 2<br/>6 + 1 = 7"]
  Offset["offset = 7"]
  Addr["address = 1000 + (7 × 4) = 1028"]

  Input --> Step1 --> Step2 --> Offset --> Addr

Calculating the base address of arr[1][2] in column-major order — skip 2 complete columns (6 elements), then step 1 into the current column.

Contrast with row-major for the same element:

Row-major offset  = 1 × 4 + 2 = 6   → address 1024
Column-major offset = 2 × 3 + 1 = 7  → address 1028

The same logical element arr[1][2] lives at different memory addresses depending on the storage order. This is why mixing row-major and column-major code (e.g., calling a Fortran library from C) requires explicit transposition.

# Compare row-major vs column-major address for the same element
base = 1000
element_size = 4
num_rows, num_cols = 3, 4

i, j = 1, 2  # element arr[1][2]

row_major_offset  = i * num_cols + j         # i × C + j
col_major_offset  = j * num_rows + i         # j × R + i

print(f"arr[{i}][{j}]")
print(f"  Row-major:    offset={row_major_offset}, address={base + row_major_offset * element_size}")
print(f"  Column-major: offset={col_major_offset}, address={base + col_major_offset * element_size}")

What Breaks If You Mix Them Up?

Imagine you store a 3×4 matrix in column-major order (like MATLAB does), then access it with row-major indexing (like C does):

You want arr[0][1] = 20
Row-major formula gives:    offset = 0 × 4 + 1 = 1  → reads 50  ✗  (actually arr[1][0])
Column-major formula gives: offset = 1 × 3 + 0 = 3  → reads 20  ✓

The wrong formula silently gives you the wrong value with no error. This is one of the most insidious bugs in scientific computing — it doesn’t crash, it just produces subtly incorrect results.

Cache Performance: The Column-Major Flip

In the previous lesson you saw that row-major traversal is cache-friendly in row-major languages. Column-major languages (Fortran, MATLAB, Julia) flip this completely:

Language family	Storage	Cache-friendly loop	Cache-unfriendly loop
C, C++, Python, Java	Row-major	Outer=row, inner=col	Outer=col, inner=row
Fortran, MATLAB, Julia	Column-major	Outer=col, inner=row	Outer=row, inner=col

The rule: always iterate in the direction that matches how the data is stored in memory. The hardware rewards sequential access and punishes random jumps.

Key Takeaways

Concept	Row-Major	Column-Major
Strategy	Row by row, left to right	Column by column, top to bottom
Index that moves fastest	D₁ — lowest (column)	Dₙ — highest (row)
2D offset formula	`i × num_cols + j`	`j × num_rows + i`
Stride	`num_cols`	`num_rows`
Languages	C, C++, Python, Java, Go	Fortran, MATLAB, Julia, R
Cache-friendly inner loop	column index `j`	row index `i`

Column-major is not better or worse than row-major — it’s a different convention with equally sound reasoning. The danger only appears when you assume one and the data is stored in the other.

Keyboard shortcuts

Note Book