Master SQL - Complete Database Course

01. 🗃️ SQL Fundamentals & Introduction

SQL (Structured Query Language) is the standard language for managing and manipulating relational databases. It allows you to create, read, update, and delete data efficiently.

What is SQL?

SQL is a domain-specific language used for managing data in relational database management systems (RDBMS). It's used by popular databases like MySQL, PostgreSQL, SQL Server, Oracle, and SQLite.

SQL Language Components

DDL (Data Definition Language): CREATE, ALTER, DROP
DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE
DCL (Data Control Language): GRANT, REVOKE
TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT

Key Concept: SQL is declarative - you describe what data you want, not how to get it. The database engine determines the most efficient way to execute your query.

Example: Basic SQL Syntax

-- Single line comment
/* Multi-line
   comment */

-- Create a database
CREATE DATABASE company;

-- Use a database
USE company;

-- Create a table
CREATE TABLE employees (
    id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    hire_date DATE,
    salary DECIMAL(10,2),
    department_id INT
);

-- Basic SELECT query
SELECT * FROM employees;

-- Insert data
INSERT INTO employees (first_name, last_name, email, hire_date, salary)
VALUES ('John', 'Doe', 'john.doe@company.com', '2023-01-15', 50000.00);

-- Update data
UPDATE employees 
SET salary = 55000.00 
WHERE id = 1;

-- Delete data
DELETE FROM employees 
WHERE id = 1;

Note: SQL keywords are not case-sensitive, but it's common practice to write them in uppercase for readability. Always end statements with a semicolon.

02. 🏗️ Databases & Tables Design

A well-designed database structure is crucial for efficient data management. Understanding tables, relationships, and data types is essential.

Database Schema Example

Employees

PK employee_id INT

first_name VARCHAR(50)

last_name VARCHAR(50)

email VARCHAR(100)

hire_date DATE

FK department_id INT

Departments

PK department_id INT

department_name VARCHAR(100)

location VARCHAR(100)

budget DECIMAL(15,2)

Projects

PK project_id INT

project_name VARCHAR(100)

start_date DATE

end_date DATE

budget DECIMAL(15,2)

Common Data Types

INT, BIGINT: Integer numbers
DECIMAL(p,s), FLOAT: Decimal numbers
VARCHAR(n), TEXT: Variable-length strings
CHAR(n): Fixed-length strings
DATE, TIME, DATETIME: Date and time
BOOLEAN: True/False values
BLOB: Binary large objects

Constraints

Constraints enforce rules on data:
PRIMARY KEY: Unique identifier
FOREIGN KEY: References another table
NOT NULL: Column cannot be empty
UNIQUE: All values must be unique
CHECK: Validates values against condition
DEFAULT: Sets default value

Example: Table Creation with Constraints

-- Create a comprehensive database schema
CREATE DATABASE ecommerce;
USE ecommerce;

-- Customers table
CREATE TABLE customers (
    customer_id INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    birth_date DATE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    is_active BOOLEAN DEFAULT TRUE,
    CONSTRAINT chk_email CHECK (email LIKE '%@%.%')
);

-- Products table
CREATE TABLE products (
    product_id INT PRIMARY KEY AUTO_INCREMENT,
    product_name VARCHAR(200) NOT NULL,
    description TEXT,
    price DECIMAL(10,2) NOT NULL CHECK (price > 0),
    stock_quantity INT NOT NULL DEFAULT 0 CHECK (stock_quantity >= 0),
    category_id INT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_category (category_id),
    INDEX idx_price (price)
);

-- Categories table
CREATE TABLE categories (
    category_id INT PRIMARY KEY AUTO_INCREMENT,
    category_name VARCHAR(100) NOT NULL UNIQUE,
    parent_category_id INT NULL,
    FOREIGN KEY (parent_category_id) REFERENCES categories(category_id)
);

-- Orders table
CREATE TABLE orders (
    order_id INT PRIMARY KEY AUTO_INCREMENT,
    customer_id INT NOT NULL,
    order_date DATETIME DEFAULT CURRENT_TIMESTAMP,
    total_amount DECIMAL(12,2) CHECK (total_amount >= 0),
    status ENUM('pending', 'processing', 'shipped', 'delivered', 'cancelled') DEFAULT 'pending',
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ON DELETE CASCADE,
    INDEX idx_order_date (order_date),
    INDEX idx_status (status)
);

-- Order items table (junction table)
CREATE TABLE order_items (
    order_item_id INT PRIMARY KEY AUTO_INCREMENT,
    order_id INT NOT NULL,
    product_id INT NOT NULL,
    quantity INT NOT NULL CHECK (quantity > 0),
    unit_price DECIMAL(10,2) NOT NULL CHECK (unit_price >= 0),
    subtotal DECIMAL(12,2) GENERATED ALWAYS AS (quantity * unit_price) STORED,
    FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE,
    FOREIGN KEY (product_id) REFERENCES products(product_id),
    UNIQUE KEY unique_order_product (order_id, product_id)
);

-- Alter table examples
ALTER TABLE products 
ADD COLUMN weight_kg DECIMAL(5,2) AFTER price;

ALTER TABLE customers 
MODIFY COLUMN email VARCHAR(150) NOT NULL;

ALTER TABLE products 
ADD CONSTRAINT fk_category 
FOREIGN KEY (category_id) REFERENCES categories(category_id);

-- Drop table (be careful!)
-- DROP TABLE IF EXISTS old_table_name;

Database Schema Visualization

customers

PK customer_id

username

email

first_name

last_name

orders

PK order_id

FK customer_id

order_date

total_amount

status

products

PK product_id

product_name

price

stock_quantity

FK category_id

Relationships: customers 1 → ∞ orders, orders ∞ → ∞ products (via order_items), products N → 1 categories

03. 🔍 Basic SELECT Queries

The SELECT statement is the most frequently used SQL command. It retrieves data from one or more tables.

SELECT Statement Syntax

The basic structure: SELECT columns FROM table WHERE conditions ORDER BY columns LIMIT count

SELECT Clause Components

SELECT: Specifies columns to retrieve
FROM: Specifies table(s) to query
WHERE: Filters rows based on conditions
ORDER BY: Sorts the result set
LIMIT/OFFSET: Controls result pagination
DISTINCT: Removes duplicate rows

Key Concept: Use SELECT * cautiously in production. Always specify only the columns you need for better performance and security.

Example: Basic SELECT Queries

-- Sample data for examples
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10,2),
    hire_date DATE,
    manager_id INT
);

INSERT INTO employees VALUES
(1, 'John', 'Doe', 'Sales', 55000, '2020-03-15', NULL),
(2, 'Jane', 'Smith', 'Marketing', 60000, '2019-07-22', 1),
(3, 'Bob', 'Johnson', 'Sales', 52000, '2021-01-10', 1),
(4, 'Alice', 'Williams', 'IT', 75000, '2018-11-05', NULL),
(5, 'Charlie', 'Brown', 'Marketing', 58000, '2022-05-30', 2),
(6, 'Diana', 'Miller', 'Sales', 56000, '2020-08-12', 1),
(7, 'Eve', 'Davis', 'HR', 48000, '2021-03-25', NULL),
(8, 'Frank', 'Wilson', 'IT', 80000, '2017-09-18', 4);

-- Select all columns from table
SELECT * FROM employees;

-- Select specific columns
SELECT first_name, last_name, department 
FROM employees;

-- Select with column aliases
SELECT 
    first_name AS "First Name",
    last_name AS "Last Name",
    salary AS "Annual Salary"
FROM employees;

-- Select with expressions
SELECT 
    first_name,
    last_name,
    salary,
    salary * 1.1 AS "Proposed Salary",
    salary / 12 AS "Monthly Salary"
FROM employees;

-- Select distinct values
SELECT DISTINCT department 
FROM employees;

-- Select with calculated columns
SELECT 
    CONCAT(first_name, ' ', last_name) AS "Full Name",
    department,
    YEAR(hire_date) AS "Hire Year"
FROM employees;

-- Select from multiple tables (preview of joins)
SELECT 
    e.first_name,
    e.last_name,
    d.department_name,
    d.location
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;

-- Select with system functions
SELECT 
    CURRENT_DATE AS "Today",
    CURRENT_TIME AS "Now",
    DATABASE() AS "Database Name",
    USER() AS "Current User";

Query Results Demo

employee_id	first_name	last_name	department	salary	hire_date
1	John	Doe	Sales	55000.00	2020-03-15
2	Jane	Smith	Marketing	60000.00	2019-07-22
3	Bob	Johnson	Sales	52000.00	2021-01-10
4	Alice	Williams	IT	75000.00	2018-11-05
5	Charlie	Brown	Marketing	58000.00	2022-05-30

Example Queries on This Data:

SELECT DISTINCT department FROM employees; → Sales, Marketing, IT
SELECT first_name, salary FROM employees WHERE department = 'Sales'; → John 55000, Bob 52000
SELECT AVG(salary) FROM employees; → 60000 (average)

04. 🎯 Filtering, Sorting & Operators

Filtering allows you to retrieve only the rows that meet specific criteria. Sorting organizes the result set in a meaningful order.

WHERE Clause Operators

SQL provides various operators for filtering:
Comparison: =, <>, <, >, <=, >=
Logical: AND, OR, NOT
Pattern Matching: LIKE, ILIKE
Range: BETWEEN, IN

Filtering Techniques

WHERE: Basic condition filtering
LIKE/ILIKE: Pattern matching with wildcards
IN: Match against list of values
BETWEEN: Range filtering
IS NULL/IS NOT NULL: Null value checking
ORDER BY: Single/multi-column sorting
LIMIT/OFFSET: Result pagination

Example: Advanced Filtering & Sorting

-- Using WHERE with comparison operators
SELECT * FROM employees 
WHERE salary > 60000;

SELECT * FROM employees 
WHERE department = 'Sales' AND salary > 50000;

-- Using OR operator
SELECT * FROM employees 
WHERE department = 'Sales' OR department = 'Marketing';

-- Using NOT operator
SELECT * FROM employees 
WHERE NOT department = 'HR';

-- Using IN operator
SELECT * FROM employees 
WHERE department IN ('Sales', 'Marketing', 'IT');

SELECT * FROM employees 
WHERE salary IN (50000, 55000, 60000);

-- Using BETWEEN operator
SELECT * FROM employees 
WHERE salary BETWEEN 50000 AND 70000;

SELECT * FROM employees 
WHERE hire_date BETWEEN '2020-01-01' AND '2021-12-31';

-- Using LIKE for pattern matching
SELECT * FROM employees 
WHERE first_name LIKE 'J%';  -- Starts with J

SELECT * FROM employees 
WHERE last_name LIKE '%son';  -- Ends with son

SELECT * FROM employees 
WHERE first_name LIKE '_a%';  -- Second letter is 'a'

SELECT * FROM employees 
WHERE email LIKE '%@company.com';

-- Using ILIKE (case-insensitive, PostgreSQL)
SELECT * FROM employees 
WHERE first_name ILIKE 'john%';

-- Using IS NULL and IS NOT NULL
SELECT * FROM employees 
WHERE manager_id IS NULL;

SELECT * FROM employees 
WHERE manager_id IS NOT NULL;

-- Sorting with ORDER BY
SELECT * FROM employees 
ORDER BY salary DESC;  -- Highest salary first

SELECT * FROM employees 
ORDER BY department ASC, salary DESC;

SELECT * FROM employees 
ORDER BY hire_date DESC;  -- Most recent hires first

-- Using LIMIT and OFFSET for pagination
SELECT * FROM employees 
ORDER BY salary DESC 
LIMIT 5;  -- Top 5 highest paid

SELECT * FROM employees 
ORDER BY hire_date 
LIMIT 10 OFFSET 20;  -- Page 3 with 10 records per page

-- Complex filtering combinations
SELECT 
    first_name,
    last_name,
    department,
    salary,
    hire_date
FROM employees 
WHERE (department = 'Sales' OR department = 'Marketing')
    AND salary >= 55000
    AND hire_date >= '2020-01-01'
    AND manager_id IS NOT NULL
ORDER BY department ASC, salary DESC
LIMIT 10;

-- Using CASE in ORDER BY for custom sorting
SELECT 
    first_name,
    last_name,
    department,
    salary
FROM employees 
ORDER BY 
    CASE department
        WHEN 'IT' THEN 1
        WHEN 'Sales' THEN 2
        WHEN 'Marketing' THEN 3
        ELSE 4
    END,
    salary DESC;

Filtering Examples

Query	Result Count	Description
`WHERE salary > 60000`	3 rows	High earners
`WHERE department IN ('Sales', 'Marketing')`	5 rows	Sales & Marketing staff
`WHERE first_name LIKE 'J%'`	2 rows	Names starting with J
`WHERE hire_date BETWEEN '2020-01-01' AND '2021-12-31'`	4 rows	Hired in 2020-2021
`ORDER BY salary DESC LIMIT 3`	3 rows	Top 3 salaries

Wildcard Patterns

% - Matches any sequence of characters
_ - Matches any single character
[abc] - Matches a, b, or c
[a-z] - Matches any lowercase letter
[^abc] - Matches any character except a, b, or c

05. ⚙️ SQL Functions & Operations

SQL functions perform operations on data, modify individual data items, and return results. They're categorized into different types based on their purpose.

Function Categories

Aggregate Functions: Operate on multiple rows (SUM, AVG, COUNT)
Scalar Functions: Operate on single values (UPPER, LOWER, ROUND)
Date Functions: Handle date/time operations
String Functions: Manipulate text data
Conversion Functions: Convert between data types

Common SQL Functions

String: CONCAT, SUBSTRING, LENGTH, REPLACE, TRIM
Numeric: ROUND, CEIL, FLOOR, ABS, MOD
Date: NOW, DATE_ADD, DATEDIFF, EXTRACT, DATE_FORMAT
Aggregate: COUNT, SUM, AVG, MIN, MAX
Conditional: CASE, COALESCE, NULLIF, IFNULL

Key Concept: Aggregate functions ignore NULL values by default. Use COUNT(*) to count all rows, but COUNT(column) counts non-null values only.

Example: SQL Functions in Action

-- String Functions
SELECT 
    first_name,
    last_name,
    CONCAT(first_name, ' ', last_name) AS full_name,
    UPPER(first_name) AS upper_first,
    LOWER(last_name) AS lower_last,
    LENGTH(CONCAT(first_name, last_name)) AS name_length,
    SUBSTRING(first_name, 1, 3) AS first_three,
    REPLACE(department, 'Sales', 'Business Development') AS new_dept_name,
    TRIM('  Hello World  ') AS trimmed_text
FROM employees;

-- Numeric Functions
SELECT 
    salary,
    ROUND(salary, -3) AS rounded_salary,  -- Nearest thousand
    CEIL(salary / 1000) * 1000 AS ceiling_salary,
    FLOOR(salary / 1000) * 1000 AS floor_salary,
    ABS(salary - 60000) AS diff_from_60k,
    MOD(salary, 1000) AS remainder,
    salary * 0.1 AS bonus,
    ROUND(salary * 1.07, 2) AS salary_with_raise
FROM employees;

-- Date Functions
SELECT 
    hire_date,
    CURRENT_DATE AS today,
    DATEDIFF(CURRENT_DATE, hire_date) AS days_employed,
    DATEDIFF(CURRENT_DATE, hire_date) / 365.25 AS years_employed,
    DATE_ADD(hire_date, INTERVAL 1 YEAR) AS first_anniversary,
    DATE_ADD(hire_date, INTERVAL 90 DAY) AS probation_end,
    EXTRACT(YEAR FROM hire_date) AS hire_year,
    EXTRACT(MONTH FROM hire_date) AS hire_month,
    DAYNAME(hire_date) AS hire_day_name,
    DATE_FORMAT(hire_date, '%M %d, %Y') AS formatted_date
FROM employees;

-- Conditional Functions
SELECT 
    first_name,
    last_name,
    salary,
    CASE 
        WHEN salary < 50000 THEN 'Entry Level'
        WHEN salary BETWEEN 50000 AND 70000 THEN 'Mid Level'
        WHEN salary > 70000 THEN 'Senior Level'
        ELSE 'Not Specified'
    END AS salary_level,
    CASE department
        WHEN 'IT' THEN 'Technology'
        WHEN 'Sales' THEN 'Revenue'
        WHEN 'Marketing' THEN 'Growth'
        ELSE 'Other'
    END AS department_category,
    COALESCE(manager_id, 0) AS manager_or_zero,
    IFNULL(department, 'No Department') AS safe_department
FROM employees;

-- Aggregate Functions (preview)
SELECT 
    COUNT(*) AS total_employees,
    COUNT(DISTINCT department) AS unique_departments,
    AVG(salary) AS average_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary,
    SUM(salary) AS total_salary_budget
FROM employees;

-- Combining multiple functions
SELECT 
    CONCAT(
        UPPER(SUBSTRING(first_name, 1, 1)),
        LOWER(SUBSTRING(first_name, 2))
    ) AS proper_first_name,
    CONCAT(
        UPPER(SUBSTRING(last_name, 1, 1)),
        LOWER(SUBSTRING(last_name, 2))
    ) AS proper_last_name,
    ROUND(salary * 1.05, 0) AS new_salary,
    CONCAT(
        'Hired on ',
        DATE_FORMAT(hire_date, '%M %d, %Y'),
        ' (',
        DATEDIFF(CURRENT_DATE, hire_date),
        ' days ago)'
    ) AS employment_details
FROM employees;

Function Results Demo

Function	Example	Result
CONCAT	`CONCAT('Hello', ' ', 'World')`	Hello World
UPPER/LOWER	`UPPER('sql'), LOWER('DATABASE')`	SQL, database
LENGTH	`LENGTH('Database')`	8
SUBSTRING	`SUBSTRING('Programming', 1, 4)`	Prog
ROUND	`ROUND(123.4567, 2)`	123.46
CEIL/FLOOR	`CEIL(4.3), FLOOR(4.7)`	5, 4
NOW()	`SELECT NOW();`	Current date/time
DATEDIFF	`DATEDIFF('2024-01-01', '2023-01-01')`	365
COALESCE	`COALESCE(NULL, 'default')`	default

Important Notes:

Function names and syntax may vary between database systems
Always test functions with your specific database
Use database documentation for exact syntax
Consider performance when using functions in WHERE clauses

06. 🔗 SQL Joins & Relationships

JOIN operations combine rows from two or more tables based on related columns. Understanding joins is crucial for working with relational databases.

Visualizing JOIN Types

Table A

INNER JOIN

Table B

A

B

FULL OUTER JOIN

Types of JOINs

INNER JOIN: Returns matching rows from both tables
LEFT JOIN: All rows from left table, matching from right
RIGHT JOIN: All rows from right table, matching from left
FULL OUTER JOIN: All rows from both tables
CROSS JOIN: Cartesian product of both tables
SELF JOIN: Join a table with itself

Key Concept: Always use explicit JOIN syntax instead of implicit joins (comma-separated tables in FROM clause). It's clearer and less error-prone.

Example: SQL JOIN Operations

-- Sample data for JOIN examples
CREATE TABLE departments (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(50),
    location VARCHAR(50)
);

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    emp_name VARCHAR(100),
    salary DECIMAL(10,2),
    dept_id INT,
    FOREIGN KEY (dept_id) REFERENCES departments(dept_id)
);

CREATE TABLE projects (
    project_id INT PRIMARY KEY,
    project_name VARCHAR(100),
    budget DECIMAL(12,2)
);

CREATE TABLE employee_projects (
    emp_id INT,
    project_id INT,
    role VARCHAR(50),
    PRIMARY KEY (emp_id, project_id),
    FOREIGN KEY (emp_id) REFERENCES employees(emp_id),
    FOREIGN KEY (project_id) REFERENCES projects(project_id)
);

INSERT INTO departments VALUES
(1, 'Engineering', 'New York'),
(2, 'Sales', 'Chicago'),
(3, 'Marketing', 'Los Angeles'),
(4, 'HR', 'Boston');

INSERT INTO employees VALUES
(101, 'Alice Johnson', 75000, 1),
(102, 'Bob Smith', 65000, 2),
(103, 'Charlie Brown', 80000, 1),
(104, 'Diana Prince', 55000, 3),
(105, 'Eve Adams', 70000, NULL),
(106, 'Frank Wilson', 60000, 2);

INSERT INTO projects VALUES
(201, 'Website Redesign', 50000),
(202, 'Mobile App', 75000),
(203, 'Marketing Campaign', 30000),
(204, 'Sales Training', 20000);

INSERT INTO employee_projects VALUES
(101, 201, 'Lead Developer'),
(101, 202, 'Consultant'),
(102, 203, 'Project Manager'),
(103, 201, 'Developer'),
(103, 202, 'Lead Developer'),
(104, 203, 'Designer'),
(106, 204, 'Trainer');

-- INNER JOIN (most common)
SELECT 
    e.emp_name,
    e.salary,
    d.dept_name,
    d.location
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

-- LEFT JOIN (all employees, even without department)
SELECT 
    e.emp_name,
    e.salary,
    COALESCE(d.dept_name, 'No Department') AS department,
    COALESCE(d.location, 'N/A') AS location
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;

-- RIGHT JOIN (all departments, even without employees)
SELECT 
    COALESCE(e.emp_name, 'No Employee') AS employee,
    d.dept_name,
    d.location
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id;

-- FULL OUTER JOIN (MySQL doesn't have FULL JOIN, emulate with UNION)
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id
UNION
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id
WHERE e.dept_id IS NULL;

-- Multiple JOINs (employees with their projects)
SELECT 
    e.emp_name,
    d.dept_name,
    p.project_name,
    ep.role,
    p.budget
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
INNER JOIN employee_projects ep ON e.emp_id = ep.emp_id
INNER JOIN projects p ON ep.project_id = p.project_id;

-- SELF JOIN (employees and their managers - assuming manager_id column)
ALTER TABLE employees ADD COLUMN manager_id INT REFERENCES employees(emp_id);

UPDATE employees SET manager_id = 103 WHERE emp_id = 101;
UPDATE employees SET manager_id = 103 WHERE emp_id = 102;

SELECT 
    e.emp_name AS employee,
    m.emp_name AS manager,
    e.salary AS employee_salary,
    m.salary AS manager_salary
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.emp_id;

-- CROSS JOIN (cartesian product)
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
CROSS JOIN departments d
WHERE e.dept_id IS NULL OR d.dept_id != e.dept_id;

-- JOIN with filtering
SELECT 
    e.emp_name,
    d.dept_name,
    e.salary
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
WHERE d.location = 'New York'
    AND e.salary > 70000;

-- JOIN with aggregation
SELECT 
    d.dept_name,
    COUNT(e.emp_id) AS employee_count,
    AVG(e.salary) AS avg_salary,
    SUM(e.salary) AS total_salary
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
GROUP BY d.dept_name
ORDER BY employee_count DESC;

-- JOIN using USING clause (when column names match)
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
INNER JOIN departments d USING (dept_id);

-- NATURAL JOIN (automatic join on matching column names - use with caution)
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
NATURAL JOIN departments d;

JOIN Visualization

Employees

Alice Johnson

1

Bob Smith

2

Charlie Brown

1

Diana Prince

3

Eve Adams

NULL

Departments

Engineering

1

Sales

2

Marketing

3

HR

4

INNER JOIN Results

Alice Johnson

Engineering

Bob Smith

Sales

Charlie Brown

Engineering

Diana Prince

Marketing

Note: Eve Adams (dept_id NULL) is excluded from INNER JOIN

07. 📊 Aggregation & Grouping

Aggregate functions perform calculations on sets of rows and return single values. GROUP BY groups rows that have the same values in specified columns.

Aggregate Function Behavior

Aggregate functions ignore NULL values except COUNT(*). Use GROUP BY to create summary rows for each unique combination of grouping columns. HAVING filters groups after aggregation.

Key Aggregation Concepts

GROUP BY: Groups rows sharing a property
HAVING: Filters groups (like WHERE for groups)
Aggregate Functions: COUNT, SUM, AVG, MIN, MAX
ROLLUP: Creates subtotals and grand totals
CUBE: All possible grouping combinations
GROUPING SETS: Specific grouping combinations

Key Concept: The order of operations in SQL: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. HAVING comes after GROUP BY, while WHERE comes before.

Example: Aggregation & Grouping Queries

-- Basic aggregate functions
SELECT 
    COUNT(*) AS total_employees,
    COUNT(DISTINCT dept_id) AS departments_with_employees,
    AVG(salary) AS average_salary,
    MIN(salary) AS minimum_salary,
    MAX(salary) AS maximum_salary,
    SUM(salary) AS total_salary_budget,
    STDDEV(salary) AS salary_std_dev,
    VARIANCE(salary) AS salary_variance
FROM employees;

-- GROUP BY with single column
SELECT 
    dept_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary,
    SUM(salary) AS dept_salary_budget
FROM employees
WHERE dept_id IS NOT NULL
GROUP BY dept_id
ORDER BY avg_salary DESC;

-- GROUP BY with multiple columns
SELECT 
    d.dept_name,
    d.location,
    COUNT(e.emp_id) AS employee_count,
    AVG(e.salary) AS avg_salary
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
GROUP BY d.dept_name, d.location
ORDER BY employee_count DESC;

-- HAVING clause (filter groups)
SELECT 
    dept_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
WHERE dept_id IS NOT NULL
GROUP BY dept_id
HAVING COUNT(*) > 1 
    AND AVG(salary) > 60000;

-- Complex aggregation with JOIN
SELECT 
    d.dept_name,
    COUNT(DISTINCT e.emp_id) AS total_employees,
    COUNT(DISTINCT ep.project_id) AS projects_involved,
    AVG(e.salary) AS average_salary,
    SUM(p.budget) AS total_project_budget
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
LEFT JOIN employee_projects ep ON e.emp_id = ep.emp_id
LEFT JOIN projects p ON ep.project_id = p.project_id
GROUP BY d.dept_name
HAVING COUNT(DISTINCT e.emp_id) > 0
ORDER BY total_project_budget DESC NULLS LAST;

-- Using ROLLUP for subtotals (MySQL)
SELECT 
    COALESCE(d.dept_name, 'All Departments') AS department,
    COALESCE(e.emp_name, 'Department Total') AS employee,
    COUNT(*) AS project_count,
    SUM(p.budget) AS total_budget
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
LEFT JOIN employee_projects ep ON e.emp_id = ep.emp_id
LEFT JOIN projects p ON ep.project_id = p.project_id
GROUP BY d.dept_name, e.emp_name WITH ROLLUP
HAVING d.dept_name IS NOT NULL OR e.emp_name IS NULL;

-- Conditional aggregation with CASE
SELECT 
    dept_id,
    COUNT(*) AS total_employees,
    COUNT(CASE WHEN salary > 70000 THEN 1 END) AS high_earners,
    COUNT(CASE WHEN salary BETWEEN 50000 AND 70000 THEN 1 END) AS mid_earners,
    COUNT(CASE WHEN salary < 50000 THEN 1 END) AS low_earners,
    SUM(CASE WHEN salary > 70000 THEN salary ELSE 0 END) AS high_earners_salary,
    ROUND(AVG(CASE WHEN salary > 60000 THEN salary END), 2) AS avg_high_salary
FROM employees
GROUP BY dept_id;

-- Percentages and ratios
SELECT 
    dept_id,
    COUNT(*) AS employee_count,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) AS percentage_of_total,
    AVG(salary) AS avg_salary,
    ROUND(AVG(salary) * 100.0 / AVG(AVG(salary)) OVER (), 2) AS salary_percentage_of_avg
FROM employees
WHERE dept_id IS NOT NULL
GROUP BY dept_id;

-- Window functions with aggregation
SELECT 
    emp_name,
    salary,
    dept_id,
    AVG(salary) OVER (PARTITION BY dept_id) AS dept_avg_salary,
    salary - AVG(salary) OVER (PARTITION BY dept_id) AS diff_from_dept_avg,
    RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS salary_rank_in_dept,
    salary * 100.0 / SUM(salary) OVER (PARTITION BY dept_id) AS salary_percentage_of_dept
FROM employees
WHERE dept_id IS NOT NULL
ORDER BY dept_id, salary DESC;

-- Multiple aggregation levels
SELECT 
    dept_id,
    COUNT(*) AS total,
    AVG(salary) AS avg_salary,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary,
    PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY salary) AS q1_salary,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY salary) AS q3_salary,
    MAX(salary) - MIN(salary) AS salary_range
FROM employees
GROUP BY dept_id
HAVING COUNT(*) >= 2;

-- Aggregation with date functions
SELECT 
    YEAR(hire_date) AS hire_year,
    MONTH(hire_date) AS hire_month,
    COUNT(*) AS hires_count,
    AVG(salary) AS avg_starting_salary
FROM employees
WHERE hire_date IS NOT NULL
GROUP BY YEAR(hire_date), MONTH(hire_date)
ORDER BY hire_year DESC, hire_month DESC;

Aggregation Results

dept_name	employee_count	avg_salary	total_salary
Engineering	2	77500.00	155000.00
Sales	2	62500.00	125000.00
Marketing	1	55000.00	55000.00
HR	0	NULL	NULL
TOTAL	5	65000.00	325000.00

Aggregate Function Notes:

COUNT(*) counts all rows including NULLs
COUNT(column) counts non-NULL values only
Aggregate functions ignore NULL values in calculations
Use DISTINCT inside aggregate functions for unique values
HAVING can use aggregate functions, WHERE cannot

08. 🎭 Subqueries & CTEs

Subqueries (nested queries) are queries within queries. CTEs (Common Table Expressions) are temporary named result sets that improve query readability.

Subquery Types

Scalar Subquery: Returns single value
Row Subquery: Returns single row
Table Subquery: Returns multiple rows/columns
Correlated Subquery: References outer query

Subquery Positions

SELECT clause: As a column expression
FROM clause: As a derived table
WHERE clause: With operators (IN, ANY, ALL)
HAVING clause: Filter aggregated results
WITH clause: CTE definition

Key Concept: CTEs make complex queries more readable and maintainable. They can be recursive (self-referencing) which is powerful for hierarchical data.

Example: Subqueries & CTEs

-- Scalar subquery in SELECT
SELECT 
    emp_name,
    salary,
    (SELECT AVG(salary) FROM employees) AS company_avg,
    salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;

-- Subquery in WHERE clause
SELECT emp_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

SELECT emp_name, dept_id
FROM employees
WHERE dept_id IN (SELECT dept_id FROM departments WHERE location = 'New York');

-- Correlated subquery
SELECT 
    e.emp_name,
    e.salary,
    e.dept_id
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.dept_id = e.dept_id
);

-- Subquery in FROM clause (derived table)
SELECT 
    dept_name,
    avg_salary,
    employee_count
FROM (
    SELECT 
        d.dept_name,
        AVG(e.salary) AS avg_salary,
        COUNT(e.emp_id) AS employee_count
    FROM departments d
    LEFT JOIN employees e ON d.dept_id = e.dept_id
    GROUP BY d.dept_name
) AS dept_stats
WHERE avg_salary > 60000 OR avg_salary IS NULL;

-- EXISTS subquery
SELECT d.dept_name
FROM departments d
WHERE EXISTS (
    SELECT 1
    FROM employees e
    WHERE e.dept_id = d.dept_id
    AND e.salary > 70000
);

-- NOT EXISTS subquery
SELECT d.dept_name
FROM departments d
WHERE NOT EXISTS (
    SELECT 1
    FROM employees e
    WHERE e.dept_id = d.dept_id
);

-- ANY/ALL operators with subqueries
SELECT emp_name, salary
FROM employees
WHERE salary > ANY (
    SELECT salary
    FROM employees
    WHERE dept_id = 1
);

SELECT emp_name, salary
FROM employees
WHERE salary > ALL (
    SELECT salary
    FROM employees
    WHERE dept_id = 2
);

-- Basic CTE (Common Table Expression)
WITH department_stats AS (
    SELECT 
        dept_id,
        COUNT(*) AS employee_count,
        AVG(salary) AS avg_salary
    FROM employees
    WHERE dept_id IS NOT NULL
    GROUP BY dept_id
),
high_performing_depts AS (
    SELECT dept_id
    FROM department_stats
    WHERE avg_salary > 65000
)
SELECT 
    d.dept_name,
    ds.employee_count,
    ds.avg_salary
FROM departments d
INNER JOIN department_stats ds ON d.dept_id = ds.dept_id
WHERE d.dept_id IN (SELECT dept_id FROM high_performing_depts);

-- Multiple CTEs
WITH 
employee_projects_count AS (
    SELECT 
        emp_id,
        COUNT(*) AS project_count
    FROM employee_projects
    GROUP BY emp_id
),
department_summary AS (
    SELECT 
        e.dept_id,
        COUNT(DISTINCT e.emp_id) AS total_employees,
        AVG(e.salary) AS avg_salary,
        SUM(epc.project_count) AS total_projects
    FROM employees e
    LEFT JOIN employee_projects_count epc ON e.emp_id = epc.emp_id
    GROUP BY e.dept_id
)
SELECT 
    d.dept_name,
    ds.total_employees,
    ds.avg_salary,
    ds.total_projects,
    ROUND(ds.total_projects * 1.0 / NULLIF(ds.total_employees, 0), 2) AS projects_per_employee
FROM departments d
LEFT JOIN department_summary ds ON d.dept_id = ds.dept_id
ORDER BY ds.avg_salary DESC;

-- Recursive CTE (for hierarchical data)
WITH RECURSIVE employee_hierarchy AS (
    -- Anchor member: top-level employees (no manager)
    SELECT 
        emp_id,
        emp_name,
        manager_id,
        1 AS level,
        CAST(emp_name AS CHAR(200)) AS hierarchy_path
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive member: employees with managers
    SELECT 
        e.emp_id,
        e.emp_name,
        e.manager_id,
        eh.level + 1 AS level,
        CAST(CONCAT(eh.hierarchy_path, ' → ', e.emp_name) AS CHAR(200)) AS hierarchy_path
    FROM employees e
    INNER JOIN employee_hierarchy eh ON e.manager_id = eh.emp_id
)
SELECT 
    emp_name,
    level,
    hierarchy_path
FROM employee_hierarchy
ORDER BY level, emp_name;

-- CTE for data transformation
WITH monthly_sales AS (
    SELECT 
        DATE_TRUNC('month', order_date) AS month,
        SUM(total_amount) AS monthly_total,
        COUNT(DISTINCT customer_id) AS unique_customers
    FROM orders
    GROUP BY DATE_TRUNC('month', order_date)
),
monthly_growth AS (
    SELECT 
        month,
        monthly_total,
        unique_customers,
        LAG(monthly_total) OVER (ORDER BY month) AS previous_month_total,
        LAG(unique_customers) OVER (ORDER BY month) AS previous_month_customers
    FROM monthly_sales
)
SELECT 
    TO_CHAR(month, 'Mon YYYY') AS month_display,
    monthly_total,
    unique_customers,
    ROUND(
        (monthly_total - previous_month_total) * 100.0 / NULLIF(previous_month_total, 0), 
        2
    ) AS revenue_growth_percent,
    unique_customers - previous_month_customers AS customer_growth
FROM monthly_growth
ORDER BY month;

-- Subquery vs JOIN comparison
-- Using subquery
SELECT emp_name
FROM employees
WHERE dept_id IN (
    SELECT dept_id
    FROM departments
    WHERE location = 'New York'
);

-- Equivalent JOIN
SELECT DISTINCT e.emp_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
WHERE d.location = 'New York';

CTE Structure Visualization

CTE Execution Flow:

WITH cte1 AS (
    -- First CTE definition
    SELECT ... FROM table1
),
cte2 AS (
    -- Second CTE can reference cte1
    SELECT ... FROM cte1 JOIN table2
),
cte3 AS (
    -- Third CTE can reference previous CTEs
    SELECT ... FROM cte2 WHERE ...
)
-- Main query using all CTEs
SELECT * FROM cte3 ORDER BY ...;

Key Points:
1. CTEs are defined in the order they're written
2. Later CTEs can reference earlier ones
3. Main query can reference any CTE
4. CTEs exist only for the duration of the query

Query Type	Best For	Performance
Scalar Subquery	Single value comparisons	Good for small datasets
Correlated Subquery	Row-by-row processing	Can be slow for large data
CTE	Complex multi-step queries	Optimized by database
Derived Table	Simple temp results	Similar to CTE

09. ✏️ Data Modification Operations

Data modification statements (INSERT, UPDATE, DELETE) change the data in your database. These operations should be used carefully, especially in production environments.

ACID Properties

Database transactions should maintain:
Atomicity: All or nothing execution
Consistency: Valid state before and after
Isolation: Concurrent transactions don't interfere
Durability: Committed changes persist

Data Modification Operations

INSERT: Add new rows to tables
UPDATE: Modify existing rows
DELETE: Remove rows from tables
MERGE/UPSERT: Insert or update based on condition
TRUNCATE: Remove all rows quickly

Key Concept: Always use transactions (BEGIN/COMMIT/ROLLBACK) when modifying data. Test with SELECT first, use WHERE clauses carefully, and backup before major changes.

Example: Data Modification Statements

-- Basic INSERT
INSERT INTO employees (emp_id, emp_name, salary, dept_id)
VALUES (107, 'Grace Lee', 72000, 1);

-- Insert multiple rows
INSERT INTO employees (emp_id, emp_name, salary, dept_id)
VALUES 
    (108, 'Henry Ford', 68000, 2),
    (109, 'Irene Adler', 81000, 1),
    (110, 'Jack Ryan', 59000, 3);

-- INSERT with SELECT (copy data)
INSERT INTO employees_backup (emp_id, emp_name, salary, dept_id, backup_date)
SELECT 
    emp_id,
    emp_name,
    salary,
    dept_id,
    CURRENT_DATE
FROM employees
WHERE hire_date < '2022-01-01';

-- Basic UPDATE
UPDATE employees
SET salary = 75000
WHERE emp_id = 101;

-- UPDATE with calculation
UPDATE employees
SET salary = salary * 1.05  -- 5% raise
WHERE dept_id = 1;

-- UPDATE with subquery
UPDATE employees e
SET salary = (
    SELECT AVG(salary)
    FROM employees e2
    WHERE e2.dept_id = e.dept_id
) * 1.1
WHERE dept_id IS NOT NULL;

-- UPDATE multiple columns
UPDATE employees
SET 
    salary = salary * 1.03,
    last_raise_date = CURRENT_DATE,
    updated_at = CURRENT_TIMESTAMP
WHERE performance_rating >= 4;

-- Basic DELETE
DELETE FROM employees
WHERE emp_id = 110;

-- DELETE with condition
DELETE FROM employees
WHERE dept_id IS NULL
   OR salary < 50000;

-- DELETE using subquery
DELETE FROM employees
WHERE dept_id IN (
    SELECT dept_id
    FROM departments
    WHERE location = 'Chicago'
);

-- Safe DELETE with SELECT first
-- Always run SELECT first to see what will be deleted
SELECT * FROM employees WHERE dept_id = 4;
-- Then if correct, run:
-- DELETE FROM employees WHERE dept_id = 4;

-- TRUNCATE (removes all rows, cannot be rolled back)
TRUNCATE TABLE audit_log;
-- Faster than DELETE FROM audit_log;

-- MERGE (UPSERT) - Insert or Update
-- MySQL uses INSERT ... ON DUPLICATE KEY UPDATE
INSERT INTO employees (emp_id, emp_name, salary, dept_id)
VALUES (111, 'Karen White', 65000, 2)
ON DUPLICATE KEY UPDATE
    emp_name = VALUES(emp_name),
    salary = VALUES(salary),
    dept_id = VALUES(dept_id),
    updated_at = CURRENT_TIMESTAMP;

-- PostgreSQL uses INSERT ... ON CONFLICT
INSERT INTO employees (emp_id, emp_name, salary, dept_id)
VALUES (111, 'Karen White', 65000, 2)
ON CONFLICT (emp_id) DO UPDATE SET
    emp_name = EXCLUDED.emp_name,
    salary = EXCLUDED.salary,
    dept_id = EXCLUDED.dept_id,
    updated_at = CURRENT_TIMESTAMP;

-- SQL Server uses MERGE
MERGE INTO employees AS target
USING (VALUES (111, 'Karen White', 65000, 2)) AS source (emp_id, emp_name, salary, dept_id)
ON target.emp_id = source.emp_id
WHEN MATCHED THEN
    UPDATE SET 
        emp_name = source.emp_name,
        salary = source.salary,
        dept_id = source.dept_id,
        updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (emp_id, emp_name, salary, dept_id)
    VALUES (source.emp_id, source.emp_name, source.salary, source.dept_id);

-- Transactions
BEGIN TRANSACTION;  -- or START TRANSACTION in MySQL

INSERT INTO orders (order_id, customer_id, total_amount)
VALUES (1001, 501, 299.99);

INSERT INTO order_items (order_id, product_id, quantity, unit_price)
VALUES 
    (1001, 2001, 2, 99.99),
    (1001, 2002, 1, 100.01);

UPDATE products
SET stock_quantity = stock_quantity - 2
WHERE product_id = 2001;

UPDATE products
SET stock_quantity = stock_quantity - 1
WHERE product_id = 2002;

-- Check if everything is ok
SELECT * FROM products WHERE product_id IN (2001, 2002);

-- If everything is correct
COMMIT;

-- If something went wrong
-- ROLLBACK;

-- Savepoints for partial rollback
BEGIN TRANSACTION;

INSERT INTO employees (emp_id, emp_name, salary) VALUES (112, 'Larry King', 55000);

SAVEPOINT sp1;

UPDATE employees SET salary = salary * 1.1 WHERE dept_id = 1;

-- Oops, wrong department! Rollback to savepoint
ROLLBACK TO SAVEPOINT sp1;

-- Correct update
UPDATE employees SET salary = salary * 1.1 WHERE dept_id = 2;

COMMIT;

-- INSERT with RETURNING (PostgreSQL) / OUTPUT (SQL Server)
INSERT INTO employees (emp_name, salary, dept_id)
VALUES ('Mona Lisa', 88000, 1)
RETURNING emp_id, emp_name, salary;  -- PostgreSQL

INSERT INTO employees (emp_name, salary, dept_id)
OUTPUT INSERTED.emp_id, INSERTED.emp_name, INSERTED.salary
VALUES ('Mona Lisa', 88000, 1);  -- SQL Server

-- Bulk operations
-- Create temp table for bulk insert
CREATE TEMPORARY TABLE temp_employees (
    emp_name VARCHAR(100),
    salary DECIMAL(10,2),
    dept_id INT
);

-- Load data into temp table (from file or application)
INSERT INTO temp_employees VALUES
('Nancy Drew', 62000, 2),
('Oliver Twist', 58000, 3),
('Peter Pan', 71000, 1);

-- Insert from temp table with transformation
INSERT INTO employees (emp_name, salary, dept_id, hire_date)
SELECT 
    emp_name,
    salary,
    dept_id,
    CURRENT_DATE
FROM temp_employees
WHERE NOT EXISTS (
    SELECT 1 FROM employees e 
    WHERE e.emp_name = temp_employees.emp_name
);

-- Clean up
DROP TEMPORARY TABLE temp_employees;

Data Modification Best Practices

⚠️ DANGER ZONE - Proceed with Caution!

These operations can cause data loss if used incorrectly:

DELETE FROM table; - Deletes ALL rows!
UPDATE table SET column = value; - Updates ALL rows!
DROP TABLE table; - Deletes table structure and data!
TRUNCATE TABLE table; - Cannot be rolled back!

Always: Backup first, use WHERE clauses, test with SELECT, use transactions.

Operation	Use Case	Transaction Safe?
INSERT	Adding new records	Yes
UPDATE	Modifying existing records	Yes
DELETE	Removing specific records	Yes
TRUNCATE	Removing all records quickly	No (auto-commit)
DROP	Removing table entirely	No (auto-commit)

10. 🚀 Advanced SQL Techniques

Advanced SQL features provide powerful capabilities for complex data analysis, optimization, and specialized operations.

Window Functions

Window functions perform calculations across a set of table rows that are somehow related to the current row. Unlike aggregate functions, they don't group rows into a single output row.

Advanced SQL Features

Window Functions: ROW_NUMBER, RANK, LAG, LEAD
Indexes: Performance optimization
Views: Virtual tables for security/simplicity
Stored Procedures: Reusable code blocks
Triggers: Automatic actions on events
Full-Text Search: Advanced text searching
JSON Support: Native JSON operations

Example: Advanced SQL Features

-- Window Functions
SELECT 
    emp_name,
    salary,
    dept_id,
    -- Ranking functions
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS salary_rank,
    RANK() OVER (ORDER BY salary DESC) AS salary_rank_with_ties,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_salary_rank,
    
    -- Department rankings
    ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS dept_salary_rank,
    
    -- Navigation functions
    LAG(salary) OVER (ORDER BY salary) AS previous_salary,
    LEAD(salary) OVER (ORDER BY salary) AS next_salary,
    FIRST_VALUE(salary) OVER (PARTITION BY dept_id ORDER BY salary DESC) AS dept_highest_salary,
    LAST_VALUE(salary) OVER (PARTITION BY dept_id ORDER BY salary DESC 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS dept_lowest_salary,
    
    -- Aggregate window functions
    SUM(salary) OVER (PARTITION BY dept_id) AS dept_total_salary,
    AVG(salary) OVER (PARTITION BY dept_id) AS dept_avg_salary,
    MAX(salary) OVER (PARTITION BY dept_id) AS dept_max_salary,
    MIN(salary) OVER (PARTITION BY dept_id) AS dept_min_salary,
    
    -- Statistical functions
    PERCENT_RANK() OVER (ORDER BY salary) AS salary_percentile,
    CUME_DIST() OVER (ORDER BY salary) AS salary_cumulative_dist,
    NTILE(4) OVER (ORDER BY salary) AS salary_quartile
FROM employees
WHERE dept_id IS NOT NULL
ORDER BY dept_id, salary DESC;

-- Window function with frame clause
SELECT 
    order_date,
    total_amount,
    SUM(total_amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS seven_day_running_total,
    AVG(total_amount) OVER (
        ORDER BY order_date
        RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW
    ) AS seven_day_moving_avg
FROM orders
ORDER BY order_date;

-- Indexes for performance
-- Create indexes
CREATE INDEX idx_employees_dept ON employees(dept_id);
CREATE INDEX idx_employees_salary ON employees(salary DESC);
CREATE INDEX idx_employees_name ON employees(emp_name);
CREATE UNIQUE INDEX idx_employees_email ON employees(email);

-- Composite index
CREATE INDEX idx_employees_dept_salary ON employees(dept_id, salary DESC);

-- Full-text search index (PostgreSQL)
CREATE INDEX idx_products_description_fts ON products 
USING GIN(to_tsvector('english', description));

-- Search with full-text
SELECT product_name, description
FROM products
WHERE to_tsvector('english', description) 
    @@ to_tsquery('english', 'organic & sustainable');

-- Views
CREATE VIEW employee_summary AS
SELECT 
    e.emp_id,
    e.emp_name,
    e.salary,
    d.dept_name,
    d.location,
    (SELECT COUNT(*) FROM employee_projects ep WHERE ep.emp_id = e.emp_id) AS project_count
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;

-- Use the view like a table
SELECT * FROM employee_summary WHERE salary > 70000;

-- Materialized view (cached results)
CREATE MATERIALIZED VIEW department_stats AS
SELECT 
    d.dept_name,
    COUNT(e.emp_id) AS employee_count,
    AVG(e.salary) AS avg_salary,
    SUM(e.salary) AS total_salary
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
GROUP BY d.dept_name;

-- Refresh materialized view
REFRESH MATERIALIZED VIEW department_stats;

-- Stored Procedure (MySQL)
DELIMITER //
CREATE PROCEDURE give_raise(
    IN dept_id_param INT,
    IN raise_percent DECIMAL(5,2)
)
BEGIN
    DECLARE exit handler for sqlexception
    BEGIN
        ROLLBACK;
        RESIGNAL;
    END;
    
    START TRANSACTION;
    
    UPDATE employees
    SET salary = salary * (1 + raise_percent / 100)
    WHERE dept_id = dept_id_param;
    
    INSERT INTO salary_changes_log (dept_id, raise_percent, changed_by, change_date)
    VALUES (dept_id_param, raise_percent, CURRENT_USER(), NOW());
    
    COMMIT;
END //
DELIMITER ;

-- Execute procedure
CALL give_raise(1, 5.0);  -- Give 5% raise to department 1

-- Trigger (automated action)
CREATE TRIGGER before_employee_update
BEFORE UPDATE ON employees
FOR EACH ROW
BEGIN
    IF NEW.salary < OLD.salary THEN
        SIGNAL SQLSTATE '45000'
        SET MESSAGE_TEXT = 'Salary cannot be decreased';
    END IF;
    
    SET NEW.updated_at = NOW();
END;

-- JSON operations (MySQL 5.7+/PostgreSQL)
CREATE TABLE products_json (
    product_id INT PRIMARY KEY,
    product_data JSON
);

INSERT INTO products_json VALUES (1, '{
    "name": "Laptop",
    "price": 999.99,
    "specs": {"ram": "16GB", "storage": "512GB"},
    "tags": ["electronics", "computers"]
}');

-- Query JSON data
SELECT 
    product_id,
    JSON_EXTRACT(product_data, '$.name') AS product_name,
    JSON_EXTRACT(product_data, '$.price') AS price,
    JSON_EXTRACT(product_data, '$.specs.ram') AS ram,
    JSON_EXTRACT(product_data, '$.tags[0]') AS primary_tag
FROM products_json
WHERE JSON_EXTRACT(product_data, '$.price') > 500;

-- JSON functions
SELECT 
    JSON_OBJECT('id', emp_id, 'name', emp_name, 'salary', salary) AS employee_json
FROM employees
LIMIT 3;

-- Recursive query for organizational hierarchy
WITH RECURSIVE org_chart AS (
    -- CEO (no manager)
    SELECT 
        emp_id,
        emp_name,
        manager_id,
        1 AS level,
        CAST(emp_name AS CHAR(200)) AS path
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Employees reporting to managers
    SELECT 
        e.emp_id,
        e.emp_name,
        e.manager_id,
        oc.level + 1 AS level,
        CAST(CONCAT(oc.path, ' > ', e.emp_name) AS CHAR(200)) AS path
    FROM employees e
    INNER JOIN org_chart oc ON e.manager_id = oc.emp_id
)
SELECT 
    emp_name,
    level,
    path,
    REPEAT('  ', level - 1) || emp_name AS indented_name
FROM org_chart
ORDER BY path;

-- PIVOT operations (transform rows to columns)
-- MySQL doesn't have PIVOT, but can simulate with CASE
SELECT 
    dept_id,
    SUM(CASE WHEN salary < 50000 THEN 1 ELSE 0 END) AS under_50k,
    SUM(CASE WHEN salary BETWEEN 50000 AND 70000 THEN 1 ELSE 0 END) AS between_50k_70k,
    SUM(CASE WHEN salary > 70000 THEN 1 ELSE 0 END) AS over_70k,
    COUNT(*) AS total_employees
FROM employees
WHERE dept_id IS NOT NULL
GROUP BY dept_id;

-- Dynamic SQL (execute constructed queries)
-- Note: This is typically done in stored procedures
SET @table_name = 'employees';
SET @column_name = 'salary';
SET @sql = CONCAT('SELECT AVG(', @column_name, ') FROM ', @table_name);

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

-- Temporary tables for complex operations
CREATE TEMPORARY TABLE temp_dept_stats AS
SELECT 
    dept_id,
    COUNT(*) AS emp_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY dept_id;

-- Use temp table in queries
SELECT 
    d.dept_name,
    tds.emp_count,
    tds.avg_salary
FROM departments d
LEFT JOIN temp_dept_stats tds ON d.dept_id = tds.dept_id
ORDER BY tds.avg_salary DESC;

-- Clean up
DROP TEMPORARY TABLE temp_dept_stats;

Window Functions Results

emp_name	dept_id	salary	dept_rank	dept_avg	diff_from_avg
Charlie Brown	1	80000	1	77500	+2500
Alice Johnson	1	75000	2	77500	-2500
Bob Smith	2	65000	1	62500	+2500
Frank Wilson	2	60000	2	62500	-2500
Diana Prince	3	55000	1	55000	0

Performance Tips:

Use indexes on frequently queried columns
Avoid SELECT * - specify only needed columns
Use EXPLAIN to analyze query execution plans
Limit use of correlated subqueries on large datasets
Consider materialized views for complex aggregations
Use appropriate data types to reduce storage
Regularly update table statistics for query optimizer