Master SQL Interview Questions: From Basic Queries to Advanced Analytics with Complete Solutions

You're sitting across from a hiring manager, and they've just asked: "Can you write a query to find our second-highest paid employee?" Your mind races. You know SQL, you use it daily, but something about the interview context makes even familiar concepts feel harder. The silence stretches as you wonder if there's a trick you're missing.

SQL interviews aren't just about knowing the syntax—they're about demonstrating problem-solving skills, understanding performance implications, and communicating your thought process clearly. Whether you're aiming for your first data analyst role or stepping up to senior data engineer, mastering SQL interview questions is crucial for landing the position you want.

This lesson will prepare you for the full spectrum of SQL questions you'll encounter, from basic syntax to complex analytical queries. We'll work through real interview scenarios, explore multiple solution approaches, and discuss what interviewers are really looking for in your answers.

What you'll learn:

How to approach SQL interview questions systematically and communicate your thinking
Solutions to common basic, intermediate, and advanced SQL interview problems
Multiple approaches for complex queries and when to use each
How to optimize queries and discuss performance considerations during interviews
Strategies for handling edge cases and demonstrating robust thinking
What interviewers evaluate beyond just correct syntax

Prerequisites

You should be comfortable with basic SQL operations (SELECT, WHERE, GROUP BY, JOINs) and understand fundamental database concepts like primary keys and relationships. Familiarity with window functions and CTEs will help with advanced sections, but we'll explain these concepts as we encounter them.

The Interview Mindset: Think Out Loud

Before diving into specific questions, let's establish the right approach. SQL interviews aren't just about getting the right answer—they're about demonstrating how you think through problems.

When faced with a SQL question, follow this framework:

Clarify the requirements: Ask about edge cases, expected output format, and any constraints
Break down the problem: Identify what tables you need and what transformations are required
Start simple: Write a basic version first, then add complexity
Explain your reasoning: Walk through your logic as you write
Consider alternatives: Mention different approaches and trade-offs

Let's see this in action with actual interview questions.

Basic SQL Questions: Building Confidence

These questions test fundamental SQL knowledge and are common in entry-level interviews. Even if you're experienced, you might encounter these as warm-up questions.

Question 1: Employee Salary Analysis

Scenario: You have an employees table with columns: employee_id, name, department, salary, hire_date. Find all employees earning more than the average salary.

The thought process: This requires a subquery to calculate the average, then filter against it.

-- Initial approach - straightforward but clear
SELECT employee_id, name, department, salary
FROM employees
WHERE salary > (
    SELECT AVG(salary) 
    FROM employees
);

Why this works: The subquery calculates the average salary across all employees, and the outer query filters for those above this average.

Alternative approach using window functions:

-- More advanced approach showing additional skills
SELECT employee_id, name, department, salary
FROM (
    SELECT employee_id, name, department, salary,
           AVG(salary) OVER () as avg_salary
    FROM employees
) t
WHERE salary > avg_salary;

Interview tip: Start with the simpler subquery approach, then mention the window function alternative to show you know multiple methods.

Question 2: Department Employee Count

Scenario: Count the number of employees in each department, including departments with zero employees.

This tests understanding of JOINs and handling NULL values.

-- Assuming we have a departments table
SELECT d.department_name, 
       COUNT(e.employee_id) as employee_count
FROM departments d
LEFT JOIN employees e ON d.department_id = e.department_id
GROUP BY d.department_name, d.department_id
ORDER BY employee_count DESC;

Key points to mention:

LEFT JOIN ensures all departments appear, even with zero employees
COUNT(e.employee_id) counts actual employees, not NULL values
GROUP BY includes department_id to handle potential duplicate names

Question 3: Recent Hires

Scenario: Find employees hired in the last 90 days, ordered by hire date.

SELECT name, department, hire_date
FROM employees
WHERE hire_date >= CURRENT_DATE - INTERVAL 90 DAY
ORDER BY hire_date DESC;

Discussion points:

Date arithmetic varies by database system (MySQL shown above)
Consider timezone implications in production systems
Alternative: WHERE hire_date >= DATE_SUB(CURRENT_DATE, INTERVAL 90 DAY)

Intermediate SQL Questions: Demonstrating Depth

These questions require combining multiple SQL concepts and show practical problem-solving skills.

Question 4: Top N per Group

Scenario: Find the highest-paid employee in each department.

This is a classic interview question with several valid approaches.

Approach 1: Window Functions (Recommended)

SELECT department, name, salary
FROM (
    SELECT department, name, salary,
           ROW_NUMBER() OVER (
               PARTITION BY department 
               ORDER BY salary DESC
           ) as rn
    FROM employees
) ranked
WHERE rn = 1;

Approach 2: Correlated Subquery

SELECT e1.department, e1.name, e1.salary
FROM employees e1
WHERE e1.salary = (
    SELECT MAX(e2.salary)
    FROM employees e2
    WHERE e2.department = e1.department
);

Which to choose? Discuss both approaches:

Window function is cleaner and typically faster
Correlated subquery handles ties differently (includes all tied employees)
Performance implications depend on data size and indexes

Question 5: Running Totals

Scenario: Calculate a running total of sales by date.

-- Sample sales table: sale_date, amount
SELECT sale_date,
       amount,
       SUM(amount) OVER (
           ORDER BY sale_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) as running_total
FROM sales
ORDER BY sale_date;

Advanced variation: Running total by category:

SELECT sale_date, category, amount,
       SUM(amount) OVER (
           PARTITION BY category
           ORDER BY sale_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) as running_total_by_category
FROM sales
ORDER BY category, sale_date;

Performance note: Window functions can be memory-intensive on large datasets. Mention this consideration in interviews.

Question 6: Finding Gaps in Data

Scenario: Find missing dates in a daily sales report.

This tests understanding of date generation and set operations.

-- Generate date series and find missing dates
WITH date_series AS (
    SELECT DATE_ADD('2024-01-01', INTERVAL n DAY) as report_date
    FROM (
        SELECT a.n + b.n * 10 + c.n * 100 as n
        FROM (SELECT 0 as n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) a
        CROSS JOIN (SELECT 0 as n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) b
        CROSS JOIN (SELECT 0 as n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) c
    ) numbers
    WHERE DATE_ADD('2024-01-01', INTERVAL n DAY) <= '2024-12-31'
)
SELECT ds.report_date
FROM date_series ds
LEFT JOIN sales s ON ds.report_date = s.sale_date
WHERE s.sale_date IS NULL
ORDER BY ds.report_date;

Alternative using recursive CTE (PostgreSQL/SQL Server):

WITH RECURSIVE date_series AS (
    SELECT DATE '2024-01-01' as report_date
    UNION ALL
    SELECT report_date + INTERVAL '1 day'
    FROM date_series
    WHERE report_date < '2024-12-31'
)
SELECT ds.report_date
FROM date_series ds
LEFT JOIN sales s ON ds.report_date = s.sale_date
WHERE s.sale_date IS NULL;

Advanced SQL Questions: Showcasing Expertise

These questions test deep SQL knowledge and complex analytical thinking.

Question 7: Customer Retention Analysis

Scenario: Calculate monthly customer retention rates. A customer is "retained" if they made a purchase in consecutive months.

WITH monthly_customers AS (
    -- Get unique customers per month
    SELECT 
        DATE_FORMAT(purchase_date, '%Y-%m') as purchase_month,
        customer_id
    FROM purchases
    GROUP BY DATE_FORMAT(purchase_date, '%Y-%m'), customer_id
),
customer_months AS (
    -- Add previous month for each customer
    SELECT 
        purchase_month,
        customer_id,
        LAG(purchase_month) OVER (
            PARTITION BY customer_id 
            ORDER BY purchase_month
        ) as previous_month
    FROM monthly_customers
),
retention_data AS (
    -- Identify retained customers
    SELECT 
        purchase_month,
        customer_id,
        CASE 
            WHEN DATE_ADD(STR_TO_DATE(CONCAT(previous_month, '-01'), '%Y-%m-%d'), INTERVAL 1 MONTH) = 
                 STR_TO_DATE(CONCAT(purchase_month, '-01'), '%Y-%m-%d')
            THEN 1 
            ELSE 0 
        END as is_retained
    FROM customer_months
    WHERE previous_month IS NOT NULL
)
SELECT 
    purchase_month,
    COUNT(*) as total_customers,
    SUM(is_retained) as retained_customers,
    ROUND(SUM(is_retained) * 100.0 / COUNT(*), 2) as retention_rate
FROM retention_data
GROUP BY purchase_month
ORDER BY purchase_month;

What this demonstrates:

Complex multi-step analysis using CTEs
Window function usage (LAG)
Date manipulation and comparison
Business metric calculation

Question 8: Hierarchical Data Queries

Scenario: You have an employee table with manager_id references. Find all employees who report to a specific manager, directly or indirectly.

-- Recursive CTE to traverse hierarchy
WITH RECURSIVE employee_hierarchy AS (
    -- Base case: direct reports
    SELECT employee_id, name, manager_id, 1 as level
    FROM employees
    WHERE manager_id = 123  -- Specific manager ID
    
    UNION ALL
    
    -- Recursive case: reports of reports
    SELECT e.employee_id, e.name, e.manager_id, eh.level + 1
    FROM employees e
    INNER JOIN employee_hierarchy eh ON e.manager_id = eh.employee_id
)
SELECT employee_id, name, level
FROM employee_hierarchy
ORDER BY level, name;

Alternative approach for systems without recursive CTEs:

-- Self-joins for limited depth (up to 3 levels shown)
SELECT DISTINCT
    COALESCE(e1.employee_id, e2.employee_id, e3.employee_id) as employee_id,
    COALESCE(e1.name, e2.name, e3.name) as name,
    CASE 
        WHEN e1.employee_id IS NOT NULL THEN 1
        WHEN e2.employee_id IS NOT NULL THEN 2
        WHEN e3.employee_id IS NOT NULL THEN 3
    END as level
FROM employees e1
LEFT JOIN employees e2 ON e1.employee_id = e2.manager_id
LEFT JOIN employees e3 ON e2.employee_id = e3.manager_id
WHERE e1.manager_id = 123
   OR e2.manager_id = 123
   OR e3.manager_id = 123;

Question 9: Advanced Analytical Query

Scenario: Calculate the percentage of total sales each product contributed each month, and identify products that increased their share month-over-month.

WITH monthly_sales AS (
    SELECT 
        DATE_FORMAT(sale_date, '%Y-%m') as sale_month,
        product_id,
        SUM(amount) as product_sales
    FROM sales
    GROUP BY DATE_FORMAT(sale_date, '%Y-%m'), product_id
),
monthly_totals AS (
    SELECT 
        sale_month,
        SUM(product_sales) as total_sales
    FROM monthly_sales
    GROUP BY sale_month
),
sales_with_percentage AS (
    SELECT 
        ms.sale_month,
        ms.product_id,
        ms.product_sales,
        mt.total_sales,
        ROUND(ms.product_sales * 100.0 / mt.total_sales, 2) as sales_percentage
    FROM monthly_sales ms
    INNER JOIN monthly_totals mt ON ms.sale_month = mt.sale_month
),
sales_with_previous AS (
    SELECT 
        *,
        LAG(sales_percentage) OVER (
            PARTITION BY product_id 
            ORDER BY sale_month
        ) as previous_percentage
    FROM sales_with_percentage
)
SELECT 
    sale_month,
    product_id,
    product_sales,
    sales_percentage,
    previous_percentage,
    ROUND(sales_percentage - COALESCE(previous_percentage, 0), 2) as percentage_change,
    CASE 
        WHEN sales_percentage > COALESCE(previous_percentage, 0) THEN 'Increased'
        WHEN sales_percentage < COALESCE(previous_percentage, 0) THEN 'Decreased'
        ELSE 'Unchanged'
    END as trend
FROM sales_with_previous
WHERE previous_percentage IS NOT NULL
    AND sales_percentage > previous_percentage
ORDER BY sale_month, percentage_change DESC;

Performance and Optimization Questions

Interviewers often ask about query performance, especially for senior roles.

Question 10: Query Optimization

Scenario: "How would you optimize this slow-running query?"

-- Original slow query
SELECT c.customer_name, 
       COUNT(o.order_id) as order_count,
       AVG(o.total_amount) as avg_order_value
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE c.registration_date >= '2023-01-01'
GROUP BY c.customer_id, c.customer_name
HAVING COUNT(o.order_id) > 5;

Optimization strategies to discuss:

Index considerations:

-- Suggested indexes
CREATE INDEX idx_customers_reg_date ON customers(registration_date);
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_orders_customer_total ON orders(customer_id, total_amount);

Query restructuring:

-- Pre-filter orders to reduce join size
WITH active_customers AS (
    SELECT customer_id, customer_name
    FROM customers 
    WHERE registration_date >= '2023-01-01'
),
customer_orders AS (
    SELECT customer_id, 
           COUNT(*) as order_count,
           AVG(total_amount) as avg_order_value
    FROM orders o
    WHERE EXISTS (SELECT 1 FROM active_customers ac WHERE ac.customer_id = o.customer_id)
    GROUP BY customer_id
    HAVING COUNT(*) > 5
)
SELECT ac.customer_name, co.order_count, co.avg_order_value
FROM customer_orders co
INNER JOIN active_customers ac ON co.customer_id = ac.customer_id;

Key optimization topics to discuss:

Index usage and design
JOIN order and type selection
WHERE vs HAVING clause placement
Subquery vs JOIN performance
Partitioning for large tables

Hands-On Exercise

Let's work through a comprehensive scenario that combines multiple concepts. You're analyzing an e-commerce database with these tables:

-- Table structures
customers: customer_id, name, email, registration_date, city
products: product_id, name, category, price
orders: order_id, customer_id, order_date, status
order_items: order_item_id, order_id, product_id, quantity, unit_price

Challenge: Write a query to find the top 3 customers by total purchase amount in each city, but only include customers who have made purchases in at least 2 different product categories.

Solution approach:

WITH customer_purchases AS (
    -- Calculate total purchases and category count per customer
    SELECT 
        c.customer_id,
        c.name,
        c.city,
        SUM(oi.quantity * oi.unit_price) as total_purchase_amount,
        COUNT(DISTINCT p.category) as category_count
    FROM customers c
    INNER JOIN orders o ON c.customer_id = o.customer_id
    INNER JOIN order_items oi ON o.order_id = oi.order_id
    INNER JOIN products p ON oi.product_id = p.product_id
    WHERE o.status = 'completed'
    GROUP BY c.customer_id, c.name, c.city
    HAVING COUNT(DISTINCT p.category) >= 2
),
ranked_customers AS (
    -- Rank customers within each city
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY city 
            ORDER BY total_purchase_amount DESC
        ) as city_rank
    FROM customer_purchases
)
SELECT 
    city,
    name,
    total_purchase_amount,
    category_count,
    city_rank
FROM ranked_customers
WHERE city_rank <= 3
ORDER BY city, city_rank;

Work through this step-by-step:

Join all necessary tables to get purchase data
Filter for completed orders only
Group by customer and calculate totals
Use HAVING to filter for multi-category customers
Rank within each city
Filter for top 3 per city

Common Mistakes & Troubleshooting

Mistake 1: Forgetting NULL Handling

-- Problematic
SELECT AVG(salary) FROM employees WHERE department = 'Sales';

-- Better
SELECT AVG(COALESCE(salary, 0)) FROM employees WHERE department = 'Sales';
-- Or exclude NULLs explicitly
SELECT AVG(salary) FROM employees WHERE department = 'Sales' AND salary IS NOT NULL;

Mistake 2: Incorrect GROUP BY with Aggregates

-- Wrong - will cause errors in most databases
SELECT department, name, AVG(salary)
FROM employees
GROUP BY department;

-- Correct
SELECT department, AVG(salary)
FROM employees
GROUP BY department;

Mistake 3: Cartesian Products in JOINs

-- Dangerous - can create massive result sets
SELECT *
FROM customers c, orders o, order_items oi
WHERE c.customer_id = o.customer_id;  -- Missing join to order_items

-- Correct
SELECT *
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id;

Mistake 4: Performance Anti-patterns

-- Slow - function on column prevents index usage
SELECT * FROM orders WHERE YEAR(order_date) = 2024;

-- Fast - allows index usage
SELECT * FROM orders WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';

Debugging Strategies

Start simple: Build queries incrementally
Use LIMIT: Test with small result sets first
Check row counts: Verify each step produces expected results
Explain plans: Use EXPLAIN to understand query execution
Test edge cases: Empty tables, NULL values, duplicates

Interview Communication Tips

Do:

Think out loud as you write
Start with a simple approach, then optimize
Ask clarifying questions
Explain your assumptions
Discuss alternative approaches
Mention performance considerations

Don't:

Jump straight to complex solutions
Ignore edge cases
Write code without explaining
Panic if you make syntax errors
Claim your first solution is always best

Sample dialogue:

"I need to find the second-highest salary. Let me think about this step by step. First, I could use a window function with ROW_NUMBER to rank salaries, or I could use a subquery to exclude the maximum. Let me start with the window function approach since it handles ties more predictably..."

Summary & Next Steps

SQL interviews test both technical knowledge and problem-solving ability. Success comes from:

Systematic approach: Break problems down, start simple, build complexity
Multiple solutions: Know different ways to solve common problems
Communication: Explain your thinking clearly
Performance awareness: Understand when queries might be slow
Edge case consideration: Think about NULLs, duplicates, and empty results

Practice strategies:

Work through problems on paper before coding
Time yourself on common question types
Practice explaining solutions out loud
Review different approaches for the same problem
Build a personal repository of go-to query patterns

Next learning areas:

Database-specific functions and syntax
Query optimization and execution plans
Data modeling and normalization principles
NoSQL query languages (MongoDB, etc.)
Advanced analytics functions and statistical queries

The key to SQL interview success isn't memorizing every possible question—it's developing a systematic approach to problem-solving and clearly communicating your technical thinking. With the patterns and strategies covered here, you'll be prepared to tackle SQL challenges confidently in any data role interview.