Codcups
Progress: 0%

01. 🗃️ SQL Fundamentals & Introduction

SQL (Structured Query Language) is the standard language for managing and manipulating relational databases. It allows you to create, read, update, and delete data efficiently.

What is SQL?

SQL is a domain-specific language used for managing data in relational database management systems (RDBMS). It's used by popular databases like MySQL, PostgreSQL, SQL Server, Oracle, and SQLite.

SQL Language Components

  • DDL (Data Definition Language): CREATE, ALTER, DROP
  • DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE
  • DCL (Data Control Language): GRANT, REVOKE
  • TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT
Key Concept: SQL is declarative - you describe what data you want, not how to get it. The database engine determines the most efficient way to execute your query.

Example: Basic SQL Syntax

-- Single line comment
/* Multi-line
   comment */

-- Create a database
CREATE DATABASE company;

-- Use a database
USE company;

-- Create a table
CREATE TABLE employees (
    id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    hire_date DATE,
    salary DECIMAL(10,2),
    department_id INT
);

-- Basic SELECT query
SELECT * FROM employees;

-- Insert data
INSERT INTO employees (first_name, last_name, email, hire_date, salary)
VALUES ('John', 'Doe', 'john.doe@company.com', '2023-01-15', 50000.00);

-- Update data
UPDATE employees 
SET salary = 55000.00 
WHERE id = 1;

-- Delete data
DELETE FROM employees 
WHERE id = 1;

Note: SQL keywords are not case-sensitive, but it's common practice to write them in uppercase for readability. Always end statements with a semicolon.

02. 🏗️ Databases & Tables Design

A well-designed database structure is crucial for efficient data management. Understanding tables, relationships, and data types is essential.

Database Schema Example

Employees
PK employee_id INT
first_name VARCHAR(50)
last_name VARCHAR(50)
email VARCHAR(100)
hire_date DATE
FK department_id INT
Departments
PK department_id INT
department_name VARCHAR(100)
location VARCHAR(100)
budget DECIMAL(15,2)
Projects
PK project_id INT
project_name VARCHAR(100)
start_date DATE
end_date DATE
budget DECIMAL(15,2)

Common Data Types

  • INT, BIGINT: Integer numbers
  • DECIMAL(p,s), FLOAT: Decimal numbers
  • VARCHAR(n), TEXT: Variable-length strings
  • CHAR(n): Fixed-length strings
  • DATE, TIME, DATETIME: Date and time
  • BOOLEAN: True/False values
  • BLOB: Binary large objects

Constraints

Constraints enforce rules on data:
PRIMARY KEY: Unique identifier
FOREIGN KEY: References another table
NOT NULL: Column cannot be empty
UNIQUE: All values must be unique
CHECK: Validates values against condition
DEFAULT: Sets default value

Example: Table Creation with Constraints

-- Create a comprehensive database schema
CREATE DATABASE ecommerce;
USE ecommerce;

-- Customers table
CREATE TABLE customers (
    customer_id INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    birth_date DATE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    is_active BOOLEAN DEFAULT TRUE,
    CONSTRAINT chk_email CHECK (email LIKE '%@%.%')
);

-- Products table
CREATE TABLE products (
    product_id INT PRIMARY KEY AUTO_INCREMENT,
    product_name VARCHAR(200) NOT NULL,
    description TEXT,
    price DECIMAL(10,2) NOT NULL CHECK (price > 0),
    stock_quantity INT NOT NULL DEFAULT 0 CHECK (stock_quantity >= 0),
    category_id INT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_category (category_id),
    INDEX idx_price (price)
);

-- Categories table
CREATE TABLE categories (
    category_id INT PRIMARY KEY AUTO_INCREMENT,
    category_name VARCHAR(100) NOT NULL UNIQUE,
    parent_category_id INT NULL,
    FOREIGN KEY (parent_category_id) REFERENCES categories(category_id)
);

-- Orders table
CREATE TABLE orders (
    order_id INT PRIMARY KEY AUTO_INCREMENT,
    customer_id INT NOT NULL,
    order_date DATETIME DEFAULT CURRENT_TIMESTAMP,
    total_amount DECIMAL(12,2) CHECK (total_amount >= 0),
    status ENUM('pending', 'processing', 'shipped', 'delivered', 'cancelled') DEFAULT 'pending',
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ON DELETE CASCADE,
    INDEX idx_order_date (order_date),
    INDEX idx_status (status)
);

-- Order items table (junction table)
CREATE TABLE order_items (
    order_item_id INT PRIMARY KEY AUTO_INCREMENT,
    order_id INT NOT NULL,
    product_id INT NOT NULL,
    quantity INT NOT NULL CHECK (quantity > 0),
    unit_price DECIMAL(10,2) NOT NULL CHECK (unit_price >= 0),
    subtotal DECIMAL(12,2) GENERATED ALWAYS AS (quantity * unit_price) STORED,
    FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE,
    FOREIGN KEY (product_id) REFERENCES products(product_id),
    UNIQUE KEY unique_order_product (order_id, product_id)
);

-- Alter table examples
ALTER TABLE products 
ADD COLUMN weight_kg DECIMAL(5,2) AFTER price;

ALTER TABLE customers 
MODIFY COLUMN email VARCHAR(150) NOT NULL;

ALTER TABLE products 
ADD CONSTRAINT fk_category 
FOREIGN KEY (category_id) REFERENCES categories(category_id);

-- Drop table (be careful!)
-- DROP TABLE IF EXISTS old_table_name;
Database Schema Visualization
customers
PK customer_id
username
email
first_name
last_name
orders
PK order_id
FK customer_id
order_date
total_amount
status
products
PK product_id
product_name
price
stock_quantity
FK category_id

Relationships: customers 1 → ∞ orders, orders ∞ → ∞ products (via order_items), products N → 1 categories

03. 🔍 Basic SELECT Queries

The SELECT statement is the most frequently used SQL command. It retrieves data from one or more tables.

SELECT Statement Syntax

The basic structure: SELECT columns FROM table WHERE conditions ORDER BY columns LIMIT count

SELECT Clause Components

  • SELECT: Specifies columns to retrieve
  • FROM: Specifies table(s) to query
  • WHERE: Filters rows based on conditions
  • ORDER BY: Sorts the result set
  • LIMIT/OFFSET: Controls result pagination
  • DISTINCT: Removes duplicate rows
Key Concept: Use SELECT * cautiously in production. Always specify only the columns you need for better performance and security.

Example: Basic SELECT Queries

-- Sample data for examples
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10,2),
    hire_date DATE,
    manager_id INT
);

INSERT INTO employees VALUES
(1, 'John', 'Doe', 'Sales', 55000, '2020-03-15', NULL),
(2, 'Jane', 'Smith', 'Marketing', 60000, '2019-07-22', 1),
(3, 'Bob', 'Johnson', 'Sales', 52000, '2021-01-10', 1),
(4, 'Alice', 'Williams', 'IT', 75000, '2018-11-05', NULL),
(5, 'Charlie', 'Brown', 'Marketing', 58000, '2022-05-30', 2),
(6, 'Diana', 'Miller', 'Sales', 56000, '2020-08-12', 1),
(7, 'Eve', 'Davis', 'HR', 48000, '2021-03-25', NULL),
(8, 'Frank', 'Wilson', 'IT', 80000, '2017-09-18', 4);

-- Select all columns from table
SELECT * FROM employees;

-- Select specific columns
SELECT first_name, last_name, department 
FROM employees;

-- Select with column aliases
SELECT 
    first_name AS "First Name",
    last_name AS "Last Name",
    salary AS "Annual Salary"
FROM employees;

-- Select with expressions
SELECT 
    first_name,
    last_name,
    salary,
    salary * 1.1 AS "Proposed Salary",
    salary / 12 AS "Monthly Salary"
FROM employees;

-- Select distinct values
SELECT DISTINCT department 
FROM employees;

-- Select with calculated columns
SELECT 
    CONCAT(first_name, ' ', last_name) AS "Full Name",
    department,
    YEAR(hire_date) AS "Hire Year"
FROM employees;

-- Select from multiple tables (preview of joins)
SELECT 
    e.first_name,
    e.last_name,
    d.department_name,
    d.location
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;

-- Select with system functions
SELECT 
    CURRENT_DATE AS "Today",
    CURRENT_TIME AS "Now",
    DATABASE() AS "Database Name",
    USER() AS "Current User";
Query Results Demo
employee_id first_name last_name department salary hire_date
1 John Doe Sales 55000.00 2020-03-15
2 Jane Smith Marketing 60000.00 2019-07-22
3 Bob Johnson Sales 52000.00 2021-01-10
4 Alice Williams IT 75000.00 2018-11-05
5 Charlie Brown Marketing 58000.00 2022-05-30
Example Queries on This Data:
  • SELECT DISTINCT department FROM employees; → Sales, Marketing, IT
  • SELECT first_name, salary FROM employees WHERE department = 'Sales'; → John 55000, Bob 52000
  • SELECT AVG(salary) FROM employees; → 60000 (average)

04. 🎯 Filtering, Sorting & Operators

Filtering allows you to retrieve only the rows that meet specific criteria. Sorting organizes the result set in a meaningful order.

WHERE Clause Operators

SQL provides various operators for filtering:
Comparison: =, <>, <, >, <=, >=
Logical: AND, OR, NOT
Pattern Matching: LIKE, ILIKE
Range: BETWEEN, IN

Filtering Techniques

  • WHERE: Basic condition filtering
  • LIKE/ILIKE: Pattern matching with wildcards
  • IN: Match against list of values
  • BETWEEN: Range filtering
  • IS NULL/IS NOT NULL: Null value checking
  • ORDER BY: Single/multi-column sorting
  • LIMIT/OFFSET: Result pagination

Example: Advanced Filtering & Sorting

-- Using WHERE with comparison operators
SELECT * FROM employees 
WHERE salary > 60000;

SELECT * FROM employees 
WHERE department = 'Sales' AND salary > 50000;

-- Using OR operator
SELECT * FROM employees 
WHERE department = 'Sales' OR department = 'Marketing';

-- Using NOT operator
SELECT * FROM employees 
WHERE NOT department = 'HR';

-- Using IN operator
SELECT * FROM employees 
WHERE department IN ('Sales', 'Marketing', 'IT');

SELECT * FROM employees 
WHERE salary IN (50000, 55000, 60000);

-- Using BETWEEN operator
SELECT * FROM employees 
WHERE salary BETWEEN 50000 AND 70000;

SELECT * FROM employees 
WHERE hire_date BETWEEN '2020-01-01' AND '2021-12-31';

-- Using LIKE for pattern matching
SELECT * FROM employees 
WHERE first_name LIKE 'J%';  -- Starts with J

SELECT * FROM employees 
WHERE last_name LIKE '%son';  -- Ends with son

SELECT * FROM employees 
WHERE first_name LIKE '_a%';  -- Second letter is 'a'

SELECT * FROM employees 
WHERE email LIKE '%@company.com';

-- Using ILIKE (case-insensitive, PostgreSQL)
SELECT * FROM employees 
WHERE first_name ILIKE 'john%';

-- Using IS NULL and IS NOT NULL
SELECT * FROM employees 
WHERE manager_id IS NULL;

SELECT * FROM employees 
WHERE manager_id IS NOT NULL;

-- Sorting with ORDER BY
SELECT * FROM employees 
ORDER BY salary DESC;  -- Highest salary first

SELECT * FROM employees 
ORDER BY department ASC, salary DESC;

SELECT * FROM employees 
ORDER BY hire_date DESC;  -- Most recent hires first

-- Using LIMIT and OFFSET for pagination
SELECT * FROM employees 
ORDER BY salary DESC 
LIMIT 5;  -- Top 5 highest paid

SELECT * FROM employees 
ORDER BY hire_date 
LIMIT 10 OFFSET 20;  -- Page 3 with 10 records per page

-- Complex filtering combinations
SELECT 
    first_name,
    last_name,
    department,
    salary,
    hire_date
FROM employees 
WHERE (department = 'Sales' OR department = 'Marketing')
    AND salary >= 55000
    AND hire_date >= '2020-01-01'
    AND manager_id IS NOT NULL
ORDER BY department ASC, salary DESC
LIMIT 10;

-- Using CASE in ORDER BY for custom sorting
SELECT 
    first_name,
    last_name,
    department,
    salary
FROM employees 
ORDER BY 
    CASE department
        WHEN 'IT' THEN 1
        WHEN 'Sales' THEN 2
        WHEN 'Marketing' THEN 3
        ELSE 4
    END,
    salary DESC;
Filtering Examples
Query Result Count Description
WHERE salary > 60000 3 rows High earners
WHERE department IN ('Sales', 'Marketing') 5 rows Sales & Marketing staff
WHERE first_name LIKE 'J%' 2 rows Names starting with J
WHERE hire_date BETWEEN '2020-01-01' AND '2021-12-31' 4 rows Hired in 2020-2021
ORDER BY salary DESC LIMIT 3 3 rows Top 3 salaries

Wildcard Patterns

  • % - Matches any sequence of characters
  • _ - Matches any single character
  • [abc] - Matches a, b, or c
  • [a-z] - Matches any lowercase letter
  • [^abc] - Matches any character except a, b, or c

05. ⚙️ SQL Functions & Operations

SQL functions perform operations on data, modify individual data items, and return results. They're categorized into different types based on their purpose.

Function Categories

Aggregate Functions: Operate on multiple rows (SUM, AVG, COUNT)
Scalar Functions: Operate on single values (UPPER, LOWER, ROUND)
Date Functions: Handle date/time operations
String Functions: Manipulate text data
Conversion Functions: Convert between data types

Common SQL Functions

  • String: CONCAT, SUBSTRING, LENGTH, REPLACE, TRIM
  • Numeric: ROUND, CEIL, FLOOR, ABS, MOD
  • Date: NOW, DATE_ADD, DATEDIFF, EXTRACT, DATE_FORMAT
  • Aggregate: COUNT, SUM, AVG, MIN, MAX
  • Conditional: CASE, COALESCE, NULLIF, IFNULL
Key Concept: Aggregate functions ignore NULL values by default. Use COUNT(*) to count all rows, but COUNT(column) counts non-null values only.

Example: SQL Functions in Action

-- String Functions
SELECT 
    first_name,
    last_name,
    CONCAT(first_name, ' ', last_name) AS full_name,
    UPPER(first_name) AS upper_first,
    LOWER(last_name) AS lower_last,
    LENGTH(CONCAT(first_name, last_name)) AS name_length,
    SUBSTRING(first_name, 1, 3) AS first_three,
    REPLACE(department, 'Sales', 'Business Development') AS new_dept_name,
    TRIM('  Hello World  ') AS trimmed_text
FROM employees;

-- Numeric Functions
SELECT 
    salary,
    ROUND(salary, -3) AS rounded_salary,  -- Nearest thousand
    CEIL(salary / 1000) * 1000 AS ceiling_salary,
    FLOOR(salary / 1000) * 1000 AS floor_salary,
    ABS(salary - 60000) AS diff_from_60k,
    MOD(salary, 1000) AS remainder,
    salary * 0.1 AS bonus,
    ROUND(salary * 1.07, 2) AS salary_with_raise
FROM employees;

-- Date Functions
SELECT 
    hire_date,
    CURRENT_DATE AS today,
    DATEDIFF(CURRENT_DATE, hire_date) AS days_employed,
    DATEDIFF(CURRENT_DATE, hire_date) / 365.25 AS years_employed,
    DATE_ADD(hire_date, INTERVAL 1 YEAR) AS first_anniversary,
    DATE_ADD(hire_date, INTERVAL 90 DAY) AS probation_end,
    EXTRACT(YEAR FROM hire_date) AS hire_year,
    EXTRACT(MONTH FROM hire_date) AS hire_month,
    DAYNAME(hire_date) AS hire_day_name,
    DATE_FORMAT(hire_date, '%M %d, %Y') AS formatted_date
FROM employees;

-- Conditional Functions
SELECT 
    first_name,
    last_name,
    salary,
    CASE 
        WHEN salary < 50000 THEN 'Entry Level'
        WHEN salary BETWEEN 50000 AND 70000 THEN 'Mid Level'
        WHEN salary > 70000 THEN 'Senior Level'
        ELSE 'Not Specified'
    END AS salary_level,
    CASE department
        WHEN 'IT' THEN 'Technology'
        WHEN 'Sales' THEN 'Revenue'
        WHEN 'Marketing' THEN 'Growth'
        ELSE 'Other'
    END AS department_category,
    COALESCE(manager_id, 0) AS manager_or_zero,
    IFNULL(department, 'No Department') AS safe_department
FROM employees;

-- Aggregate Functions (preview)
SELECT 
    COUNT(*) AS total_employees,
    COUNT(DISTINCT department) AS unique_departments,
    AVG(salary) AS average_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary,
    SUM(salary) AS total_salary_budget
FROM employees;

-- Combining multiple functions
SELECT 
    CONCAT(
        UPPER(SUBSTRING(first_name, 1, 1)),
        LOWER(SUBSTRING(first_name, 2))
    ) AS proper_first_name,
    CONCAT(
        UPPER(SUBSTRING(last_name, 1, 1)),
        LOWER(SUBSTRING(last_name, 2))
    ) AS proper_last_name,
    ROUND(salary * 1.05, 0) AS new_salary,
    CONCAT(
        'Hired on ',
        DATE_FORMAT(hire_date, '%M %d, %Y'),
        ' (',
        DATEDIFF(CURRENT_DATE, hire_date),
        ' days ago)'
    ) AS employment_details
FROM employees;
Function Results Demo
Function Example Result
CONCAT CONCAT('Hello', ' ', 'World') Hello World
UPPER/LOWER UPPER('sql'), LOWER('DATABASE') SQL, database
LENGTH LENGTH('Database') 8
SUBSTRING SUBSTRING('Programming', 1, 4) Prog
ROUND ROUND(123.4567, 2) 123.46
CEIL/FLOOR CEIL(4.3), FLOOR(4.7) 5, 4
NOW() SELECT NOW(); Current date/time
DATEDIFF DATEDIFF('2024-01-01', '2023-01-01') 365
COALESCE COALESCE(NULL, 'default') default

Important Notes:

  • Function names and syntax may vary between database systems
  • Always test functions with your specific database
  • Use database documentation for exact syntax
  • Consider performance when using functions in WHERE clauses

06. 🔗 SQL Joins & Relationships

JOIN operations combine rows from two or more tables based on related columns. Understanding joins is crucial for working with relational databases.

Visualizing JOIN Types

Table A
INNER JOIN
Table B
A
B
FULL OUTER JOIN

Types of JOINs

  • INNER JOIN: Returns matching rows from both tables
  • LEFT JOIN: All rows from left table, matching from right
  • RIGHT JOIN: All rows from right table, matching from left
  • FULL OUTER JOIN: All rows from both tables
  • CROSS JOIN: Cartesian product of both tables
  • SELF JOIN: Join a table with itself
Key Concept: Always use explicit JOIN syntax instead of implicit joins (comma-separated tables in FROM clause). It's clearer and less error-prone.

Example: SQL JOIN Operations

-- Sample data for JOIN examples
CREATE TABLE departments (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(50),
    location VARCHAR(50)
);

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    emp_name VARCHAR(100),
    salary DECIMAL(10,2),
    dept_id INT,
    FOREIGN KEY (dept_id) REFERENCES departments(dept_id)
);

CREATE TABLE projects (
    project_id INT PRIMARY KEY,
    project_name VARCHAR(100),
    budget DECIMAL(12,2)
);

CREATE TABLE employee_projects (
    emp_id INT,
    project_id INT,
    role VARCHAR(50),
    PRIMARY KEY (emp_id, project_id),
    FOREIGN KEY (emp_id) REFERENCES employees(emp_id),
    FOREIGN KEY (project_id) REFERENCES projects(project_id)
);

INSERT INTO departments VALUES
(1, 'Engineering', 'New York'),
(2, 'Sales', 'Chicago'),
(3, 'Marketing', 'Los Angeles'),
(4, 'HR', 'Boston');

INSERT INTO employees VALUES
(101, 'Alice Johnson', 75000, 1),
(102, 'Bob Smith', 65000, 2),
(103, 'Charlie Brown', 80000, 1),
(104, 'Diana Prince', 55000, 3),
(105, 'Eve Adams', 70000, NULL),
(106, 'Frank Wilson', 60000, 2);

INSERT INTO projects VALUES
(201, 'Website Redesign', 50000),
(202, 'Mobile App', 75000),
(203, 'Marketing Campaign', 30000),
(204, 'Sales Training', 20000);

INSERT INTO employee_projects VALUES
(101, 201, 'Lead Developer'),
(101, 202, 'Consultant'),
(102, 203, 'Project Manager'),
(103, 201, 'Developer'),
(103, 202, 'Lead Developer'),
(104, 203, 'Designer'),
(106, 204, 'Trainer');

-- INNER JOIN (most common)
SELECT 
    e.emp_name,
    e.salary,
    d.dept_name,
    d.location
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

-- LEFT JOIN (all employees, even without department)
SELECT 
    e.emp_name,
    e.salary,
    COALESCE(d.dept_name, 'No Department') AS department,
    COALESCE(d.location, 'N/A') AS location
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;

-- RIGHT JOIN (all departments, even without employees)
SELECT 
    COALESCE(e.emp_name, 'No Employee') AS employee,
    d.dept_name,
    d.location
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id;

-- FULL OUTER JOIN (MySQL doesn't have FULL JOIN, emulate with UNION)
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id
UNION
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id
WHERE e.dept_id IS NULL;

-- Multiple JOINs (employees with their projects)
SELECT 
    e.emp_name,
    d.dept_name,
    p.project_name,
    ep.role,
    p.budget
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
INNER JOIN employee_projects ep ON e.emp_id = ep.emp_id
INNER JOIN projects p ON ep.project_id = p.project_id;

-- SELF JOIN (employees and their managers - assuming manager_id column)
ALTER TABLE employees ADD COLUMN manager_id INT REFERENCES employees(emp_id);

UPDATE employees SET manager_id = 103 WHERE emp_id = 101;
UPDATE employees SET manager_id = 103 WHERE emp_id = 102;

SELECT 
    e.emp_name AS employee,
    m.emp_name AS manager,
    e.salary AS employee_salary,
    m.salary AS manager_salary
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.emp_id;

-- CROSS JOIN (cartesian product)
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
CROSS JOIN departments d
WHERE e.dept_id IS NULL OR d.dept_id != e.dept_id;

-- JOIN with filtering
SELECT 
    e.emp_name,
    d.dept_name,
    e.salary
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
WHERE d.location = 'New York'
    AND e.salary > 70000;

-- JOIN with aggregation
SELECT 
    d.dept_name,
    COUNT(e.emp_id) AS employee_count,
    AVG(e.salary) AS avg_salary,
    SUM(e.salary) AS total_salary
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
GROUP BY d.dept_name
ORDER BY employee_count DESC;

-- JOIN using USING clause (when column names match)
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
INNER JOIN departments d USING (dept_id);

-- NATURAL JOIN (automatic join on matching column names - use with caution)
SELECT 
    e.emp_name,
    d.dept_name
FROM employees e
NATURAL JOIN departments d;
JOIN Visualization

Employees

emp_name
dept_id
Alice Johnson
1
Bob Smith
2
Charlie Brown
1
Diana Prince
3
Eve Adams
NULL

Departments

dept_name
dept_id
Engineering
1
Sales
2
Marketing
3
HR
4

INNER JOIN Results

emp_name
dept_name
Alice Johnson
Engineering
Bob Smith
Sales
Charlie Brown
Engineering
Diana Prince
Marketing

Note: Eve Adams (dept_id NULL) is excluded from INNER JOIN

07. 📊 Aggregation & Grouping

Aggregate functions perform calculations on sets of rows and return single values. GROUP BY groups rows that have the same values in specified columns.

Aggregate Function Behavior

Aggregate functions ignore NULL values except COUNT(*). Use GROUP BY to create summary rows for each unique combination of grouping columns. HAVING filters groups after aggregation.

Key Aggregation Concepts

  • GROUP BY: Groups rows sharing a property
  • HAVING: Filters groups (like WHERE for groups)
  • Aggregate Functions: COUNT, SUM, AVG, MIN, MAX
  • ROLLUP: Creates subtotals and grand totals
  • CUBE: All possible grouping combinations
  • GROUPING SETS: Specific grouping combinations
Key Concept: The order of operations in SQL: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. HAVING comes after GROUP BY, while WHERE comes before.

Example: Aggregation & Grouping Queries

-- Basic aggregate functions
SELECT 
    COUNT(*) AS total_employees,
    COUNT(DISTINCT dept_id) AS departments_with_employees,
    AVG(salary) AS average_salary,
    MIN(salary) AS minimum_salary,
    MAX(salary) AS maximum_salary,
    SUM(salary) AS total_salary_budget,
    STDDEV(salary) AS salary_std_dev,
    VARIANCE(salary) AS salary_variance
FROM employees;

-- GROUP BY with single column
SELECT 
    dept_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary,
    SUM(salary) AS dept_salary_budget
FROM employees
WHERE dept_id IS NOT NULL
GROUP BY dept_id
ORDER BY avg_salary DESC;

-- GROUP BY with multiple columns
SELECT 
    d.dept_name,
    d.location,
    COUNT(e.emp_id) AS employee_count,
    AVG(e.salary) AS avg_salary
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
GROUP BY d.dept_name, d.location
ORDER BY employee_count DESC;

-- HAVING clause (filter groups)
SELECT 
    dept_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
WHERE dept_id IS NOT NULL
GROUP BY dept_id
HAVING COUNT(*) > 1 
    AND AVG(salary) > 60000;

-- Complex aggregation with JOIN
SELECT 
    d.dept_name,
    COUNT(DISTINCT e.emp_id) AS total_employees,
    COUNT(DISTINCT ep.project_id) AS projects_involved,
    AVG(e.salary) AS average_salary,
    SUM(p.budget) AS total_project_budget
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
LEFT JOIN employee_projects ep ON e.emp_id = ep.emp_id
LEFT JOIN projects p ON ep.project_id = p.project_id
GROUP BY d.dept_name
HAVING COUNT(DISTINCT e.emp_id) > 0
ORDER BY total_project_budget DESC NULLS LAST;

-- Using ROLLUP for subtotals (MySQL)
SELECT 
    COALESCE(d.dept_name, 'All Departments') AS department,
    COALESCE(e.emp_name, 'Department Total') AS employee,
    COUNT(*) AS project_count,
    SUM(p.budget) AS total_budget
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
LEFT JOIN employee_projects ep ON e.emp_id = ep.emp_id
LEFT JOIN projects p ON ep.project_id = p.project_id
GROUP BY d.dept_name, e.emp_name WITH ROLLUP
HAVING d.dept_name IS NOT NULL OR e.emp_name IS NULL;

-- Conditional aggregation with CASE
SELECT 
    dept_id,
    COUNT(*) AS total_employees,
    COUNT(CASE WHEN salary > 70000 THEN 1 END) AS high_earners,
    COUNT(CASE WHEN salary BETWEEN 50000 AND 70000 THEN 1 END) AS mid_earners,
    COUNT(CASE WHEN salary < 50000 THEN 1 END) AS low_earners,
    SUM(CASE WHEN salary > 70000 THEN salary ELSE 0 END) AS high_earners_salary,
    ROUND(AVG(CASE WHEN salary > 60000 THEN salary END), 2) AS avg_high_salary
FROM employees
GROUP BY dept_id;

-- Percentages and ratios
SELECT 
    dept_id,
    COUNT(*) AS employee_count,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) AS percentage_of_total,
    AVG(salary) AS avg_salary,
    ROUND(AVG(salary) * 100.0 / AVG(AVG(salary)) OVER (), 2) AS salary_percentage_of_avg
FROM employees
WHERE dept_id IS NOT NULL
GROUP BY dept_id;

-- Window functions with aggregation
SELECT 
    emp_name,
    salary,
    dept_id,
    AVG(salary) OVER (PARTITION BY dept_id) AS dept_avg_salary,
    salary - AVG(salary) OVER (PARTITION BY dept_id) AS diff_from_dept_avg,
    RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS salary_rank_in_dept,
    salary * 100.0 / SUM(salary) OVER (PARTITION BY dept_id) AS salary_percentage_of_dept
FROM employees
WHERE dept_id IS NOT NULL
ORDER BY dept_id, salary DESC;

-- Multiple aggregation levels
SELECT 
    dept_id,
    COUNT(*) AS total,
    AVG(salary) AS avg_salary,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary,
    PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY salary) AS q1_salary,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY salary) AS q3_salary,
    MAX(salary) - MIN(salary) AS salary_range
FROM employees
GROUP BY dept_id
HAVING COUNT(*) >= 2;

-- Aggregation with date functions
SELECT 
    YEAR(hire_date) AS hire_year,
    MONTH(hire_date) AS hire_month,
    COUNT(*) AS hires_count,
    AVG(salary) AS avg_starting_salary
FROM employees
WHERE hire_date IS NOT NULL
GROUP BY YEAR(hire_date), MONTH(hire_date)
ORDER BY hire_year DESC, hire_month DESC;
Aggregation Results
dept_name employee_count avg_salary total_salary
Engineering 2 77500.00 155000.00
Sales 2 62500.00 125000.00
Marketing 1 55000.00 55000.00
HR 0 NULL NULL
TOTAL 5 65000.00 325000.00

Aggregate Function Notes:

  • COUNT(*) counts all rows including NULLs
  • COUNT(column) counts non-NULL values only
  • Aggregate functions ignore NULL values in calculations
  • Use DISTINCT inside aggregate functions for unique values
  • HAVING can use aggregate functions, WHERE cannot

08. 🎭 Subqueries & CTEs

Subqueries (nested queries) are queries within queries. CTEs (Common Table Expressions) are temporary named result sets that improve query readability.

Subquery Types

Scalar Subquery: Returns single value
Row Subquery: Returns single row
Table Subquery: Returns multiple rows/columns
Correlated Subquery: References outer query

Subquery Positions

  • SELECT clause: As a column expression
  • FROM clause: As a derived table
  • WHERE clause: With operators (IN, ANY, ALL)
  • HAVING clause: Filter aggregated results
  • WITH clause: CTE definition
Key Concept: CTEs make complex queries more readable and maintainable. They can be recursive (self-referencing) which is powerful for hierarchical data.

Example: Subqueries & CTEs

-- Scalar subquery in SELECT
SELECT 
    emp_name,
    salary,
    (SELECT AVG(salary) FROM employees) AS company_avg,
    salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;

-- Subquery in WHERE clause
SELECT emp_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

SELECT emp_name, dept_id
FROM employees
WHERE dept_id IN (SELECT dept_id FROM departments WHERE location = 'New York');

-- Correlated subquery
SELECT 
    e.emp_name,
    e.salary,
    e.dept_id
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.dept_id = e.dept_id
);

-- Subquery in FROM clause (derived table)
SELECT 
    dept_name,
    avg_salary,
    employee_count
FROM (
    SELECT 
        d.dept_name,
        AVG(e.salary) AS avg_salary,
        COUNT(e.emp_id) AS employee_count
    FROM departments d
    LEFT JOIN employees e ON d.dept_id = e.dept_id
    GROUP BY d.dept_name
) AS dept_stats
WHERE avg_salary > 60000 OR avg_salary IS NULL;

-- EXISTS subquery
SELECT d.dept_name
FROM departments d
WHERE EXISTS (
    SELECT 1
    FROM employees e
    WHERE e.dept_id = d.dept_id
    AND e.salary > 70000
);

-- NOT EXISTS subquery
SELECT d.dept_name
FROM departments d
WHERE NOT EXISTS (
    SELECT 1
    FROM employees e
    WHERE e.dept_id = d.dept_id
);

-- ANY/ALL operators with subqueries
SELECT emp_name, salary
FROM employees
WHERE salary > ANY (
    SELECT salary
    FROM employees
    WHERE dept_id = 1
);

SELECT emp_name, salary
FROM employees
WHERE salary > ALL (
    SELECT salary
    FROM employees
    WHERE dept_id = 2
);

-- Basic CTE (Common Table Expression)
WITH department_stats AS (
    SELECT 
        dept_id,
        COUNT(*) AS employee_count,
        AVG(salary) AS avg_salary
    FROM employees
    WHERE dept_id IS NOT NULL
    GROUP BY dept_id
),
high_performing_depts AS (
    SELECT dept_id
    FROM department_stats
    WHERE avg_salary > 65000
)
SELECT 
    d.dept_name,
    ds.employee_count,
    ds.avg_salary
FROM departments d
INNER JOIN department_stats ds ON d.dept_id = ds.dept_id
WHERE d.dept_id IN (SELECT dept_id FROM high_performing_depts);

-- Multiple CTEs
WITH 
employee_projects_count AS (
    SELECT 
        emp_id,
        COUNT(*) AS project_count
    FROM employee_projects
    GROUP BY emp_id
),
department_summary AS (
    SELECT 
        e.dept_id,
        COUNT(DISTINCT e.emp_id) AS total_employees,
        AVG(e.salary) AS avg_salary,
        SUM(epc.project_count) AS total_projects
    FROM employees e
    LEFT JOIN employee_projects_count epc ON e.emp_id = epc.emp_id
    GROUP BY e.dept_id
)
SELECT 
    d.dept_name,
    ds.total_employees,
    ds.avg_salary,
    ds.total_projects,
    ROUND(ds.total_projects * 1.0 / NULLIF(ds.total_employees, 0), 2) AS projects_per_employee
FROM departments d
LEFT JOIN department_summary ds ON d.dept_id = ds.dept_id
ORDER BY ds.avg_salary DESC;

-- Recursive CTE (for hierarchical data)
WITH RECURSIVE employee_hierarchy AS (
    -- Anchor member: top-level employees (no manager)
    SELECT 
        emp_id,
        emp_name,
        manager_id,
        1 AS level,
        CAST(emp_name AS CHAR(200)) AS hierarchy_path
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive member: employees with managers
    SELECT 
        e.emp_id,
        e.emp_name,
        e.manager_id,
        eh.level + 1 AS level,
        CAST(CONCAT(eh.hierarchy_path, ' → ', e.emp_name) AS CHAR(200)) AS hierarchy_path
    FROM employees e
    INNER JOIN employee_hierarchy eh ON e.manager_id = eh.emp_id
)
SELECT 
    emp_name,
    level,
    hierarchy_path
FROM employee_hierarchy
ORDER BY level, emp_name;

-- CTE for data transformation
WITH monthly_sales AS (
    SELECT 
        DATE_TRUNC('month', order_date) AS month,
        SUM(total_amount) AS monthly_total,
        COUNT(DISTINCT customer_id) AS unique_customers
    FROM orders
    GROUP BY DATE_TRUNC('month', order_date)
),
monthly_growth AS (
    SELECT 
        month,
        monthly_total,
        unique_customers,
        LAG(monthly_total) OVER (ORDER BY month) AS previous_month_total,
        LAG(unique_customers) OVER (ORDER BY month) AS previous_month_customers
    FROM monthly_sales
)
SELECT 
    TO_CHAR(month, 'Mon YYYY') AS month_display,
    monthly_total,
    unique_customers,
    ROUND(
        (monthly_total - previous_month_total) * 100.0 / NULLIF(previous_month_total, 0), 
        2
    ) AS revenue_growth_percent,
    unique_customers - previous_month_customers AS customer_growth
FROM monthly_growth
ORDER BY month;

-- Subquery vs JOIN comparison
-- Using subquery
SELECT emp_name
FROM employees
WHERE dept_id IN (
    SELECT dept_id
    FROM departments
    WHERE location = 'New York'
);

-- Equivalent JOIN
SELECT DISTINCT e.emp_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
WHERE d.location = 'New York';
CTE Structure Visualization

CTE Execution Flow:

WITH cte1 AS (
    -- First CTE definition
    SELECT ... FROM table1
),
cte2 AS (
    -- Second CTE can reference cte1
    SELECT ... FROM cte1 JOIN table2
),
cte3 AS (
    -- Third CTE can reference previous CTEs
    SELECT ... FROM cte2 WHERE ...
)
-- Main query using all CTEs
SELECT * FROM cte3 ORDER BY ...;

Key Points:
1. CTEs are defined in the order they're written
2. Later CTEs can reference earlier ones
3. Main query can reference any CTE
4. CTEs exist only for the duration of the query

Query Type Best For Performance
Scalar Subquery Single value comparisons Good for small datasets
Correlated Subquery Row-by-row processing Can be slow for large data
CTE Complex multi-step queries Optimized by database
Derived Table Simple temp results Similar to CTE

09. ✏️ Data Modification Operations

Data modification statements (INSERT, UPDATE, DELETE) change the data in your database. These operations should be used carefully, especially in production environments.

ACID Properties

Database transactions should maintain:
Atomicity: All or nothing execution
Consistency: Valid state before and after
Isolation: Concurrent transactions don't interfere
Durability: Committed changes persist

Data Modification Operations

  • INSERT: Add new rows to tables
  • UPDATE: Modify existing rows
  • DELETE: Remove rows from tables
  • MERGE/UPSERT: Insert or update based on condition
  • TRUNCATE: Remove all rows quickly
Key Concept: Always use transactions (BEGIN/COMMIT/ROLLBACK) when modifying data. Test with SELECT first, use WHERE clauses carefully, and backup before major changes.

Example: Data Modification Statements

-- Basic INSERT
INSERT INTO employees (emp_id, emp_name, salary, dept_id)
VALUES (107, 'Grace Lee', 72000, 1);

-- Insert multiple rows
INSERT INTO employees (emp_id, emp_name, salary, dept_id)
VALUES 
    (108, 'Henry Ford', 68000, 2),
    (109, 'Irene Adler', 81000, 1),
    (110, 'Jack Ryan', 59000, 3);

-- INSERT with SELECT (copy data)
INSERT INTO employees_backup (emp_id, emp_name, salary, dept_id, backup_date)
SELECT 
    emp_id,
    emp_name,
    salary,
    dept_id,
    CURRENT_DATE
FROM employees
WHERE hire_date < '2022-01-01';

-- Basic UPDATE
UPDATE employees
SET salary = 75000
WHERE emp_id = 101;

-- UPDATE with calculation
UPDATE employees
SET salary = salary * 1.05  -- 5% raise
WHERE dept_id = 1;

-- UPDATE with subquery
UPDATE employees e
SET salary = (
    SELECT AVG(salary)
    FROM employees e2
    WHERE e2.dept_id = e.dept_id
) * 1.1
WHERE dept_id IS NOT NULL;

-- UPDATE multiple columns
UPDATE employees
SET 
    salary = salary * 1.03,
    last_raise_date = CURRENT_DATE,
    updated_at = CURRENT_TIMESTAMP
WHERE performance_rating >= 4;

-- Basic DELETE
DELETE FROM employees
WHERE emp_id = 110;

-- DELETE with condition
DELETE FROM employees
WHERE dept_id IS NULL
   OR salary < 50000;

-- DELETE using subquery
DELETE FROM employees
WHERE dept_id IN (
    SELECT dept_id
    FROM departments
    WHERE location = 'Chicago'
);

-- Safe DELETE with SELECT first
-- Always run SELECT first to see what will be deleted
SELECT * FROM employees WHERE dept_id = 4;
-- Then if correct, run:
-- DELETE FROM employees WHERE dept_id = 4;

-- TRUNCATE (removes all rows, cannot be rolled back)
TRUNCATE TABLE audit_log;
-- Faster than DELETE FROM audit_log;

-- MERGE (UPSERT) - Insert or Update
-- MySQL uses INSERT ... ON DUPLICATE KEY UPDATE
INSERT INTO employees (emp_id, emp_name, salary, dept_id)
VALUES (111, 'Karen White', 65000, 2)
ON DUPLICATE KEY UPDATE
    emp_name = VALUES(emp_name),
    salary = VALUES(salary),
    dept_id = VALUES(dept_id),
    updated_at = CURRENT_TIMESTAMP;

-- PostgreSQL uses INSERT ... ON CONFLICT
INSERT INTO employees (emp_id, emp_name, salary, dept_id)
VALUES (111, 'Karen White', 65000, 2)
ON CONFLICT (emp_id) DO UPDATE SET
    emp_name = EXCLUDED.emp_name,
    salary = EXCLUDED.salary,
    dept_id = EXCLUDED.dept_id,
    updated_at = CURRENT_TIMESTAMP;

-- SQL Server uses MERGE
MERGE INTO employees AS target
USING (VALUES (111, 'Karen White', 65000, 2)) AS source (emp_id, emp_name, salary, dept_id)
ON target.emp_id = source.emp_id
WHEN MATCHED THEN
    UPDATE SET 
        emp_name = source.emp_name,
        salary = source.salary,
        dept_id = source.dept_id,
        updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (emp_id, emp_name, salary, dept_id)
    VALUES (source.emp_id, source.emp_name, source.salary, source.dept_id);

-- Transactions
BEGIN TRANSACTION;  -- or START TRANSACTION in MySQL

INSERT INTO orders (order_id, customer_id, total_amount)
VALUES (1001, 501, 299.99);

INSERT INTO order_items (order_id, product_id, quantity, unit_price)
VALUES 
    (1001, 2001, 2, 99.99),
    (1001, 2002, 1, 100.01);

UPDATE products
SET stock_quantity = stock_quantity - 2
WHERE product_id = 2001;

UPDATE products
SET stock_quantity = stock_quantity - 1
WHERE product_id = 2002;

-- Check if everything is ok
SELECT * FROM products WHERE product_id IN (2001, 2002);

-- If everything is correct
COMMIT;

-- If something went wrong
-- ROLLBACK;

-- Savepoints for partial rollback
BEGIN TRANSACTION;

INSERT INTO employees (emp_id, emp_name, salary) VALUES (112, 'Larry King', 55000);

SAVEPOINT sp1;

UPDATE employees SET salary = salary * 1.1 WHERE dept_id = 1;

-- Oops, wrong department! Rollback to savepoint
ROLLBACK TO SAVEPOINT sp1;

-- Correct update
UPDATE employees SET salary = salary * 1.1 WHERE dept_id = 2;

COMMIT;

-- INSERT with RETURNING (PostgreSQL) / OUTPUT (SQL Server)
INSERT INTO employees (emp_name, salary, dept_id)
VALUES ('Mona Lisa', 88000, 1)
RETURNING emp_id, emp_name, salary;  -- PostgreSQL

INSERT INTO employees (emp_name, salary, dept_id)
OUTPUT INSERTED.emp_id, INSERTED.emp_name, INSERTED.salary
VALUES ('Mona Lisa', 88000, 1);  -- SQL Server

-- Bulk operations
-- Create temp table for bulk insert
CREATE TEMPORARY TABLE temp_employees (
    emp_name VARCHAR(100),
    salary DECIMAL(10,2),
    dept_id INT
);

-- Load data into temp table (from file or application)
INSERT INTO temp_employees VALUES
('Nancy Drew', 62000, 2),
('Oliver Twist', 58000, 3),
('Peter Pan', 71000, 1);

-- Insert from temp table with transformation
INSERT INTO employees (emp_name, salary, dept_id, hire_date)
SELECT 
    emp_name,
    salary,
    dept_id,
    CURRENT_DATE
FROM temp_employees
WHERE NOT EXISTS (
    SELECT 1 FROM employees e 
    WHERE e.emp_name = temp_employees.emp_name
);

-- Clean up
DROP TEMPORARY TABLE temp_employees;
Data Modification Best Practices

⚠️ DANGER ZONE - Proceed with Caution!

These operations can cause data loss if used incorrectly:

  • DELETE FROM table; - Deletes ALL rows!
  • UPDATE table SET column = value; - Updates ALL rows!
  • DROP TABLE table; - Deletes table structure and data!
  • TRUNCATE TABLE table; - Cannot be rolled back!

Always: Backup first, use WHERE clauses, test with SELECT, use transactions.

Operation Use Case Transaction Safe?
INSERT Adding new records Yes
UPDATE Modifying existing records Yes
DELETE Removing specific records Yes
TRUNCATE Removing all records quickly No (auto-commit)
DROP Removing table entirely No (auto-commit)

10. 🚀 Advanced SQL Techniques

Advanced SQL features provide powerful capabilities for complex data analysis, optimization, and specialized operations.

Window Functions

Window functions perform calculations across a set of table rows that are somehow related to the current row. Unlike aggregate functions, they don't group rows into a single output row.

Advanced SQL Features

  • Window Functions: ROW_NUMBER, RANK, LAG, LEAD
  • Indexes: Performance optimization
  • Views: Virtual tables for security/simplicity
  • Stored Procedures: Reusable code blocks
  • Triggers: Automatic actions on events
  • Full-Text Search: Advanced text searching
  • JSON Support: Native JSON operations

Example: Advanced SQL Features

-- Window Functions
SELECT 
    emp_name,
    salary,
    dept_id,
    -- Ranking functions
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS salary_rank,
    RANK() OVER (ORDER BY salary DESC) AS salary_rank_with_ties,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_salary_rank,
    
    -- Department rankings
    ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS dept_salary_rank,
    
    -- Navigation functions
    LAG(salary) OVER (ORDER BY salary) AS previous_salary,
    LEAD(salary) OVER (ORDER BY salary) AS next_salary,
    FIRST_VALUE(salary) OVER (PARTITION BY dept_id ORDER BY salary DESC) AS dept_highest_salary,
    LAST_VALUE(salary) OVER (PARTITION BY dept_id ORDER BY salary DESC 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS dept_lowest_salary,
    
    -- Aggregate window functions
    SUM(salary) OVER (PARTITION BY dept_id) AS dept_total_salary,
    AVG(salary) OVER (PARTITION BY dept_id) AS dept_avg_salary,
    MAX(salary) OVER (PARTITION BY dept_id) AS dept_max_salary,
    MIN(salary) OVER (PARTITION BY dept_id) AS dept_min_salary,
    
    -- Statistical functions
    PERCENT_RANK() OVER (ORDER BY salary) AS salary_percentile,
    CUME_DIST() OVER (ORDER BY salary) AS salary_cumulative_dist,
    NTILE(4) OVER (ORDER BY salary) AS salary_quartile
FROM employees
WHERE dept_id IS NOT NULL
ORDER BY dept_id, salary DESC;

-- Window function with frame clause
SELECT 
    order_date,
    total_amount,
    SUM(total_amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS seven_day_running_total,
    AVG(total_amount) OVER (
        ORDER BY order_date
        RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW
    ) AS seven_day_moving_avg
FROM orders
ORDER BY order_date;

-- Indexes for performance
-- Create indexes
CREATE INDEX idx_employees_dept ON employees(dept_id);
CREATE INDEX idx_employees_salary ON employees(salary DESC);
CREATE INDEX idx_employees_name ON employees(emp_name);
CREATE UNIQUE INDEX idx_employees_email ON employees(email);

-- Composite index
CREATE INDEX idx_employees_dept_salary ON employees(dept_id, salary DESC);

-- Full-text search index (PostgreSQL)
CREATE INDEX idx_products_description_fts ON products 
USING GIN(to_tsvector('english', description));

-- Search with full-text
SELECT product_name, description
FROM products
WHERE to_tsvector('english', description) 
    @@ to_tsquery('english', 'organic & sustainable');

-- Views
CREATE VIEW employee_summary AS
SELECT 
    e.emp_id,
    e.emp_name,
    e.salary,
    d.dept_name,
    d.location,
    (SELECT COUNT(*) FROM employee_projects ep WHERE ep.emp_id = e.emp_id) AS project_count
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;

-- Use the view like a table
SELECT * FROM employee_summary WHERE salary > 70000;

-- Materialized view (cached results)
CREATE MATERIALIZED VIEW department_stats AS
SELECT 
    d.dept_name,
    COUNT(e.emp_id) AS employee_count,
    AVG(e.salary) AS avg_salary,
    SUM(e.salary) AS total_salary
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
GROUP BY d.dept_name;

-- Refresh materialized view
REFRESH MATERIALIZED VIEW department_stats;

-- Stored Procedure (MySQL)
DELIMITER //
CREATE PROCEDURE give_raise(
    IN dept_id_param INT,
    IN raise_percent DECIMAL(5,2)
)
BEGIN
    DECLARE exit handler for sqlexception
    BEGIN
        ROLLBACK;
        RESIGNAL;
    END;
    
    START TRANSACTION;
    
    UPDATE employees
    SET salary = salary * (1 + raise_percent / 100)
    WHERE dept_id = dept_id_param;
    
    INSERT INTO salary_changes_log (dept_id, raise_percent, changed_by, change_date)
    VALUES (dept_id_param, raise_percent, CURRENT_USER(), NOW());
    
    COMMIT;
END //
DELIMITER ;

-- Execute procedure
CALL give_raise(1, 5.0);  -- Give 5% raise to department 1

-- Trigger (automated action)
CREATE TRIGGER before_employee_update
BEFORE UPDATE ON employees
FOR EACH ROW
BEGIN
    IF NEW.salary < OLD.salary THEN
        SIGNAL SQLSTATE '45000'
        SET MESSAGE_TEXT = 'Salary cannot be decreased';
    END IF;
    
    SET NEW.updated_at = NOW();
END;

-- JSON operations (MySQL 5.7+/PostgreSQL)
CREATE TABLE products_json (
    product_id INT PRIMARY KEY,
    product_data JSON
);

INSERT INTO products_json VALUES (1, '{
    "name": "Laptop",
    "price": 999.99,
    "specs": {"ram": "16GB", "storage": "512GB"},
    "tags": ["electronics", "computers"]
}');

-- Query JSON data
SELECT 
    product_id,
    JSON_EXTRACT(product_data, '$.name') AS product_name,
    JSON_EXTRACT(product_data, '$.price') AS price,
    JSON_EXTRACT(product_data, '$.specs.ram') AS ram,
    JSON_EXTRACT(product_data, '$.tags[0]') AS primary_tag
FROM products_json
WHERE JSON_EXTRACT(product_data, '$.price') > 500;

-- JSON functions
SELECT 
    JSON_OBJECT('id', emp_id, 'name', emp_name, 'salary', salary) AS employee_json
FROM employees
LIMIT 3;

-- Recursive query for organizational hierarchy
WITH RECURSIVE org_chart AS (
    -- CEO (no manager)
    SELECT 
        emp_id,
        emp_name,
        manager_id,
        1 AS level,
        CAST(emp_name AS CHAR(200)) AS path
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Employees reporting to managers
    SELECT 
        e.emp_id,
        e.emp_name,
        e.manager_id,
        oc.level + 1 AS level,
        CAST(CONCAT(oc.path, ' > ', e.emp_name) AS CHAR(200)) AS path
    FROM employees e
    INNER JOIN org_chart oc ON e.manager_id = oc.emp_id
)
SELECT 
    emp_name,
    level,
    path,
    REPEAT('  ', level - 1) || emp_name AS indented_name
FROM org_chart
ORDER BY path;

-- PIVOT operations (transform rows to columns)
-- MySQL doesn't have PIVOT, but can simulate with CASE
SELECT 
    dept_id,
    SUM(CASE WHEN salary < 50000 THEN 1 ELSE 0 END) AS under_50k,
    SUM(CASE WHEN salary BETWEEN 50000 AND 70000 THEN 1 ELSE 0 END) AS between_50k_70k,
    SUM(CASE WHEN salary > 70000 THEN 1 ELSE 0 END) AS over_70k,
    COUNT(*) AS total_employees
FROM employees
WHERE dept_id IS NOT NULL
GROUP BY dept_id;

-- Dynamic SQL (execute constructed queries)
-- Note: This is typically done in stored procedures
SET @table_name = 'employees';
SET @column_name = 'salary';
SET @sql = CONCAT('SELECT AVG(', @column_name, ') FROM ', @table_name);

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

-- Temporary tables for complex operations
CREATE TEMPORARY TABLE temp_dept_stats AS
SELECT 
    dept_id,
    COUNT(*) AS emp_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY dept_id;

-- Use temp table in queries
SELECT 
    d.dept_name,
    tds.emp_count,
    tds.avg_salary
FROM departments d
LEFT JOIN temp_dept_stats tds ON d.dept_id = tds.dept_id
ORDER BY tds.avg_salary DESC;

-- Clean up
DROP TEMPORARY TABLE temp_dept_stats;
Window Functions Results
emp_name dept_id salary dept_rank dept_avg diff_from_avg
Charlie Brown 1 80000 1 77500 +2500
Alice Johnson 1 75000 2 77500 -2500
Bob Smith 2 65000 1 62500 +2500
Frank Wilson 2 60000 2 62500 -2500
Diana Prince 3 55000 1 55000 0

Performance Tips:

  • Use indexes on frequently queried columns
  • Avoid SELECT * - specify only needed columns
  • Use EXPLAIN to analyze query execution plans
  • Limit use of correlated subqueries on large datasets
  • Consider materialized views for complex aggregations
  • Use appropriate data types to reduce storage
  • Regularly update table statistics for query optimizer