Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

1. SQL Foundations – Introduction, SELECT, DDL, DML & Filtering Data

1. Summary

Core building blocks of SQL:

  • What databases and tables are.
  • How to define tables (DDL).
  • How to manipulate data (DML).
  • How to query data with SELECT.
  • How to filter, group, sort, and limit results.

Examples are based on tables like customers, orders, and persons from below.


2. Big Picture – Where This Fits in SQL

2.1 Database & SQL Overview

Think of the structure like this:

  • Server → hosts one or more databases.
  • Database → container for related data (e.g., Sales, HR).
  • Schema → subgroup inside a database (e.g., Orders, Customers).
  • Table → grid with columns (fields) and rows (records).
  • Column → attribute of data (e.g., id, name, birthdate); each has a data type.
  • Row → one record (e.g., one customer).
  • Primary Key → column(s) that uniquely identify each row (e.g., id).

SQL is the language we use to talk to this data.

SQL commands are grouped into 3 big families:

  • DQL – Data Query Language

    • SELECT → read/query data.
  • DDL – Data Definition Language

    • CREATE, ALTER, DROP → define or change table structure.
  • DML – Data Manipulation Language

    • INSERT, UPDATE, DELETE → modify data inside tables.

2.2 Logical Query Execution Order (How SQL “Thinks”)

We write a query like this:

SELECT ...
FROM ...
WHERE ...
GROUP BY ...
HAVING ...
ORDER BY ...
TOP / LIMIT ...

But internally, SQL logically processes in this order:

  1. FROM (and joins – later)
  2. WHERE
  3. GROUP BY
  4. Aggregates (e.g., SUM, AVG, COUNT)
  5. HAVING
  6. SELECT (columns, expressions, aliases)
  7. DISTINCT
  8. ORDER BY
  9. TOP / LIMIT (engine-dependent)

Why this matters:

  • You cannot use aggregates in WHERE (they don’t exist yet).
  • HAVING filters groups, after GROUP BY.
  • Some aliases from SELECT can be used in ORDER BY but not in WHERE.

Mnemonic:

Fresh Water Gets Athletes Hydrated So Don’t Over Train.”

Fat Whales Go Around Heavy Ships Diving Over Tides.”

FROM → WHERE → GROUP BY → Aggregates → HAVING → SELECT → DISTINCT → ORDER BY → TOP


3. Key Concepts & Definitions

3.1 SQL Components in a Statement

Example:

-- Retrieve Customers Data
SELECT name, LOWER(country)
FROM customers
WHERE country = 'Italy';
  • Comment: -- Retrieve Customers Data
  • Clauses: SELECT, FROM, WHERE
  • Keywords: SELECT, FROM, WHERE (reserved words)
  • Function: LOWER(country) – built-in function applied to values
  • Identifiers: name, country, customers – table/column names
  • Operator: =
  • Value (literal): 'Italy'

3.2 DDL vs DML vs DQL – Quick Table

CategoryNamePurposeExamples
DQLData Query LanguageRead/query dataSELECT
DDLData Definition LanguageDefine/change table structureCREATE, ALTER, DROP
DMLData Manipulation LanguageInsert/update/delete table rowsINSERT, UPDATE, DELETE

3.3 WHERE Operators (Filtering)

  • Comparison: =, <> or !=, >, >=, <, <=
  • Logical: AND, OR, NOT
  • Range: BETWEEN
  • Membership: IN, NOT IN
  • Pattern: LIKE (with wildcards % and _)

3.4 Function Toolbox

CategoryFunctionDescriptionExample
StringLOWER()Convert text to lowercaseLOWER(country)
AggregateSUM()Add up numeric valuesSUM(score)
AggregateCOUNT()Count rows / non-NULL valuesCOUNT(id)
AggregateAVG()Average numeric valuesAVG(score)

You can extend this table later as you learn more functions.


4. Syntax Cheat-Sheet

4.1 Core SELECT Patterns

-- Basic SELECT all columns
SELECT *
FROM table_name;

-- Select specific columns
SELECT column1, column2, ...
FROM table_name;

-- SELECT with WHERE filter
SELECT column1, column2, ...
FROM table_name
WHERE condition;

-- ORDER BY for sorting
SELECT column1, column2, ...
FROM table_name
ORDER BY column1 ASC, column2 DESC;

-- GROUP BY with aggregates
SELECT group_column,
       AGG_FUNC(other_column) AS alias
FROM table_name
GROUP BY group_column;

-- GROUP BY with HAVING (filter groups)
SELECT group_column,
       AGG_FUNC(other_column) AS alias
FROM table_name
GROUP BY group_column
HAVING AGG_FUNC(other_column) condition;

-- DISTINCT values
SELECT DISTINCT column1
FROM table_name;

-- TOP N rows (T-SQL style)
SELECT TOP N *
FROM table_name
ORDER BY column_to_rank DESC;

4.2 DDL – Defining Table Structure

-- CREATE table
CREATE TABLE table_name (
    column1 DATA_TYPE [constraints],
    column2 DATA_TYPE [constraints],
    ...
    CONSTRAINT constraint_name PRIMARY KEY (column1)
);

-- ALTER table - add column
ALTER TABLE table_name
ADD new_column DATA_TYPE [constraints];

-- ALTER table - drop column
ALTER TABLE table_name
DROP COLUMN column_name;

-- DROP table
DROP TABLE table_name;

4.3 DML – Manipulating Data

-- INSERT values (explicit columns)
INSERT INTO table_name (column1, column2, ...)
VALUES 
  (value1, value2, ...),
  (value1b, value2b, ...);  -- multiple rows

-- INSERT using SELECT
INSERT INTO target_table (col1, col2, ...)
SELECT colA, colB, ...
FROM source_table
WHERE ...;

-- UPDATE rows
UPDATE table_name
SET column1 = value1,
    column2 = value2
WHERE condition;   -- always use WHERE unless you want all rows

-- DELETE rows
DELETE FROM table_name
WHERE condition;

-- Delete all rows (fast)
TRUNCATE TABLE table_name;

4.4 WHERE & Filtering Patterns

-- Comparison
WHERE column = value
WHERE column <> value
WHERE column > value
WHERE column >= value
WHERE column < value
WHERE column <= value

-- Logical
WHERE condition1 AND condition2
WHERE condition1 OR condition2
WHERE NOT condition

-- BETWEEN (inclusive)
WHERE column BETWEEN lower AND upper

-- IN / NOT IN
WHERE column IN (value1, value2, ...)
WHERE column NOT IN (value1, value2, ...)

-- LIKE patterns
WHERE text_column LIKE 'M%'    -- starts with M
WHERE text_column LIKE '%n'    -- ends with n
WHERE text_column LIKE '%r%'   -- contains r
WHERE text_column LIKE '__r%'  -- r in 3rd character position

5. Worked Examples (With Annotations)

5.1 SELECT – All Columns vs Few Columns

All customers:

-- Retrieve All Customer Data
SELECT *
FROM customers;
  • SELECT * → return every column from customers.
  • Good for quick exploration, but in real queries you usually list columns explicitly.

Specific columns:

-- Retrieve each customer's name, country, and score.
SELECT 
    first_name,
    country, 
    score
FROM customers;
  • Only returns 3 columns – clearer and often more efficient.

5.2 WHERE – Basic Filtering

Score not equal to 0:

-- Retrieve customers with a score not equal to 0
SELECT *
FROM customers
WHERE score != 0;
  • != or <> → “not equal to”.

Country equals Germany:

-- Retrieve customers from Germany
SELECT *
FROM customers
WHERE country = 'Germany';

Columns + filter combined:

-- Retrieve the name and country of customers from Germany
SELECT
    first_name,
    country
FROM customers
WHERE country = 'Germany';

5.3 ORDER BY – Sorting

/* Retrieve all customers and 
   sort the results by the highest score first. */
SELECT *
FROM customers
ORDER BY score DESC;
  • DESC → descending (high to low).
/* Retrieve all customers and 
   sort the results by the country and then by the highest score. */
SELECT *
FROM customers
ORDER BY country ASC, score DESC;
  • First sort by country (A–Z),
  • Then, within each country, sort by score from high to low.

5.4 GROUP BY & HAVING – Aggregation

Total score per country:

-- Find the total score for each country
SELECT 
    country,
    SUM(score) AS total_score
FROM customers
GROUP BY country;
  • SUM(score) → adds up score for each country.
  • GROUP BY country → one row per country.

Invalid example (on purpose):

SELECT 
    country,
    first_name,
    SUM(score) AS total_score
FROM customers
GROUP BY country;
  • This fails because first_name is neither:

    • in GROUP BY, nor
    • inside an aggregate like SUM.
  • SQL doesn’t know which first_name to show for each group.

Total score & customer count per country:

SELECT 
    country,
    SUM(score) AS total_score,
    COUNT(id) AS total_customers
FROM customers
GROUP BY country;

Using HAVING – Filter groups after aggregation:

/* Find the average score for each country
   and return only those countries with an average score greater than 430 */
SELECT
    country,
    AVG(score) AS avg_score
FROM customers
GROUP BY country
HAVING AVG(score) > 430;

Adds a row-level filter first:

/* considering only customers with a score not equal to 0 */
SELECT
    country,
    AVG(score) AS avg_score
FROM customers
WHERE score != 0
GROUP BY country
HAVING AVG(score) > 430;
  • WHERE limits rows before grouping.
  • HAVING limits groups after aggregating.

5.5 DISTINCT & TOP

-- Return unique list of all countries
SELECT DISTINCT country
FROM customers;
-- Retrieve the Top 3 Customers with the Highest Scores
SELECT TOP 3 *
FROM customers
ORDER BY score DESC;
  • Always use TOP with ORDER BY if you want the “top N” in a meaningful way.

5.6 DDL – CREATE / ALTER / DROP

/* Create a new table called persons 
   with columns: id, person_name, birth_date, and phone */
CREATE TABLE persons (
    id INT NOT NULL,
    person_name VARCHAR(50) NOT NULL,
    birth_date DATE,
    phone VARCHAR(15) NOT NULL,
    CONSTRAINT pk_persons PRIMARY KEY (id)
);
-- Add a new column called email to the persons table
ALTER TABLE persons
ADD email VARCHAR(50) NOT NULL;
-- Remove the column phone from the persons table
ALTER TABLE persons
DROP COLUMN phone;
-- Delete the table persons from the database
DROP TABLE persons;
  • DROP TABLE removes structure + data permanently.

5.7 DML – INSERT, UPDATE, DELETE

INSERT – explicit columns:

-- Insert new records into the customers table
INSERT INTO customers (id, first_name, country, score)
VALUES 
    (6, 'Anna', 'USA', NULL),
    (7, 'Sam', NULL, 100);
  • Two rows inserted.
  • NULL → missing/unknown.

Incorrect type example:

-- Incorrect data type in values
INSERT INTO customers (id, first_name, country, score)
VALUES 
    ('Max', 9, 'Max', NULL);
  • 'Max' where id INT is expected → error.

No column list (not recommended):

INSERT INTO customers 
VALUES 
    (9, 'Andreas', 'Germany', NULL);
  • Values must match all columns in exact table order – fragile when table changes.

Partial columns (others become NULL or default):

INSERT INTO customers (id, first_name)
VALUES 
    (10, 'Sahra');

INSERT using SELECT (move data):

-- Copy data from the 'customers' table into 'persons'
INSERT INTO persons (id, person_name, birth_date, phone)
SELECT
    id,
    first_name,
    NULL,
    'Unknown'
FROM customers;

UPDATE:

-- Change the score of customer with ID 6 to 0
UPDATE customers
SET score = 0
WHERE id = 6;
-- Change the score of customer with ID 10 to 0 and update the country to 'UK'
UPDATE customers
SET score = 0,
    country = 'UK'
WHERE id = 10;
-- Update all customers with a NULL score by setting their score to 0
UPDATE customers
SET score = 0
WHERE score IS NULL;
  • Always use IS NULL/IS NOT NULL, not = NULL.

DELETE & TRUNCATE:

-- Delete all customers with an ID greater than 5
DELETE FROM customers
WHERE id > 5;
-- Delete all data from the persons table
DELETE FROM persons;

-- Faster method to delete all rows
TRUNCATE TABLE persons;
  • DELETE without WHERE deletes all rows.
  • TRUNCATE TABLE is faster but also removes all rows; structure stays.

5.8 Filtering Examples (Comparison, Logical, BETWEEN, IN, LIKE)

Comparison & logical:

-- Retrieve all customers with a score greater than 500.
SELECT *
FROM customers
WHERE score > 500;

-- from USA and score > 500
SELECT *
FROM customers
WHERE country = 'USA' AND score > 500;

-- from USA or score > 500
SELECT *
FROM customers
WHERE country = 'USA' OR score > 500;

-- score not less than 500 (i.e. >= 500)
SELECT *
FROM customers
WHERE NOT score < 500;

BETWEEN (inclusive):

-- score between 100 and 500 (inclusive)
SELECT *
FROM customers
WHERE score BETWEEN 100 AND 500;

IN:

-- from Germany or USA
SELECT *
FROM customers
WHERE country IN ('Germany', 'USA');

LIKE patterns:

-- first name starts with 'M'
SELECT *
FROM customers
WHERE first_name LIKE 'M%';

-- ends with 'n'
SELECT *
FROM customers
WHERE first_name LIKE '%n';

-- contains 'r'
SELECT *
FROM customers
WHERE first_name LIKE '%r%';

-- 'r' in third position
SELECT *
FROM customers
WHERE first_name LIKE '__r%';
  • % = any number of characters (including zero).
  • _ = exactly one character.

6. Common Mistakes, Tips & Error Patterns

6.1 Typical Mistakes & How to Avoid Them

  1. Forgetting WHERE in UPDATE / DELETE

    • Risk: change or delete all rows.
    • Tip: Always test with SELECT first, then turn it into UPDATE/DELETE.
  2. Mismatched columns and values in INSERT

    • Number and order must line up.
    • Tip: Always specify the column list explicitly.
  3. Wrong data types in INSERT

    • Example: string 'Max' into an INT column.
    • Tip: Check CREATE TABLE definitions.
  4. Using = with NULL

    • Wrong: WHERE score = NULL (always false).
    • Correct: WHERE score IS NULL or IS NOT NULL.
  5. SELECT with GROUP BY – non-aggregated columns

    • Error when you select a column not in GROUP BY or an aggregate.
    • Rule: every selected column must be grouped or aggregated.
  6. Using HAVING for row-level filtering

    • Row conditions (e.g., score != 0) belong in WHERE, not HAVING.
  7. Misunderstanding BETWEEN boundaries

    • BETWEEN 100 AND 500 includes both 100 and 500.
  8. Relying on TOP without ORDER BY

    • Without ORDER BY, the “top 3” could be random-ish.

6.2 Error-Pattern Cheat Sheet

When you see an error, think of this quick map:

  • “Column … is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause” → You’re selecting a non-aggregated column that’s not in GROUP BY. ✱ Fix: add it to GROUP BY or wrap it in an aggregate.

  • “Cannot insert the value NULL into column ‘…’; column does not allow nulls” → You didn’t provide a value for a NOT NULL column. ✱ Fix: insert a value or change the column definition.

  • Type mismatch errors (e.g., converting varchar to int failed) → Value type doesn’t match the column type. ✱ Fix: correct the literal or adjust the column type.

  • “Syntax error near …” around WHERE / GROUP BY / HAVING → Often caused by wrong clause order or missing comma/semicolon. ✱ Fix: check statement order and punctuation.


7. Memory Boosters

7.1 Mnemonics

  1. SQL Categories – “Q, D, M”:

    • Query → DQLSELECT
    • Define → DDLCREATE, ALTER, DROP
    • Modify → DMLINSERT, UPDATE, DELETE
  2. SELECT Clause Order – “SF WOGHD” → “See Funny WOGH Dogs”:

    • SELECT
    • FROM
    • WHERE
    • ORDER BY
    • GROUP BY
    • HAVING
    • (DISTINCT, TOP are usually next to SELECT)
  3. Filtering Operators – “CLRMS” (think “colors”):

    • Comparison (=, >, <, <> …)
    • Logical (AND, OR, NOT)
    • Range (BETWEEN)
    • Membership (IN, NOT IN)
    • Search (LIKE)
  4. LIKE Wildcards – “Percent = many, Underscore = one”:

    • % → any number of characters.
    • _ → exactly one character.

7.2 Analogies

  • Database = folder; table = Excel sheet; row = one line; column = one field/heading.
  • SELECT = you ask a question; result set = the answer.
  • WHERE = filter funnel: only rows passing the condition flow through.
  • GROUP BY = grouping exam papers by class, then summarizing each group with stats (avg, sum, count).

8. Active Recall, Flashcards & Practice

8.1 Active Recall Questions (Based on This Material Only)

Try answering without looking; then check.

Concepts & Types

  1. What are the three main categories of SQL commands and which verbs belong to each?
  2. What is the difference between DDL and DML?
  3. What is a primary key and why is it important?

SELECT & Filtering

  1. Write a query to get first_name and country from customers where score is greater than 500.
  2. How do you sort customers by country ascending and then by score descending?
  3. What’s the difference between WHERE and HAVING? When do you use each?

Aggregation

  1. Write a query to calculate the total score and number of customers per country.
  2. Modify that query to only show countries with total score > 1000. Should that filter be in WHERE or HAVING?

DDL / DML

  1. Write a CREATE TABLE statement for a simple products table with id, product_name, and price.
  2. How do you add an email column to the persons table?
  3. How do you change the score of customer with ID 6 to 0?
  4. How do you delete all customers with id > 5?

Filtering Operators

  1. Write a query to get all customers whose score is between 100 and 500 (inclusive).
  2. Write a query to get all customers from either Germany or USA using IN.
  3. Write a query to get all customers whose first name starts with M and ends with a.
  4. How do you find all customers where country is not Germany and score is greater than 500?

8.2 Flashcard Templates (Reusable)

You can create infinite cards with these templates:

Template A – Table Understanding

  • Q: “What are the columns and data types of <table_name>?”
  • Q: “What is the primary key of <table_name> and why?”

Template B – Query Variants

For any table X and column Y:

  • Q: “Write a query to list all rows in X ordered by Y descending.”
  • Q: “Write a query to count rows in X where Y meets some condition.”
  • Q: “Write a query to get the top 3 rows in X according to Y.”

Template C – Clause Replacement

Pick one query and ask:

  • “What happens if I remove WHERE?”
  • “What happens if I move a condition from WHERE to HAVING or vice versa?”

8.3 Practice Checklist (Level 1–3)

Level 1 – Basics

  • Write 5 SELECT queries on customers:

    • 2 with WHERE,
    • 1 with ORDER BY,
    • 1 with DISTINCT,
    • 1 with TOP.
  • For each operator (=, <>, >, <, BETWEEN, IN, LIKE), write one example.

Level 2 – Aggregation

  • Write 3 GROUP BY queries on customers:

    • Total score per country,
    • Average score per country (excluding score = 0),
    • Number of customers per country.
  • For each, add a HAVING condition and explain in plain words what it does.

Level 3 – DDL & DML

  • Create your own table (e.g., books or movies) using CREATE TABLE.

  • Insert at least 5 rows (include one intentional mistake, then fix it).

  • Write:

    • 1 UPDATE with a WHERE that affects 1 row,
    • 1 DELETE with a WHERE that affects a few rows,
    • 1 TRUNCATE TABLE (on a test table only).

9. Very Short Recap

9.1 Key Takeaways (Bullet Recap)

  1. Databases live on a server and store data in tables made of rows and columns.
  2. SQL commands are grouped into DQL (SELECT), DDL (CREATE, ALTER, DROP), and DML (INSERT, UPDATE, DELETE).
  3. SELECT is built with clauses like FROM, WHERE, GROUP BY, HAVING, ORDER BY, plus extras like DISTINCT and TOP.
  4. WHERE filters individual rows; HAVING filters groups after aggregation.
  5. ORDER BY controls sort order; always combine TOP with ORDER BY for meaningful “top N” queries.
  6. GROUP BY + aggregates (SUM, AVG, COUNT) let you summarize data per group.
  7. DDL defines the structure of your tables; DML changes the data in them.
  8. Common pitfalls: missing WHERE in UPDATE/DELETE, mismatched columns/values in INSERT, using = with NULL, violating GROUP BY rules.
  9. Logical execution order (FROM → WHERE → GROUP BY → Aggregates → HAVING → SELECT → DISTINCT → ORDER BY → TOP) explains many rules and errors.
  10. Active recall questions + regular practice on a small table like customers will make these basics automatic.