In MySQL (or any relational database), data is often stored in multiple tables. To get meaningful information, you often need to combine data from these tables. Joins allow you to do exactly that. A join is simply a query that fetches data from two or more tables based on a related column.
Think of joins as the glue that binds tables together, creating a cohesive story from fragmented data.
In this article, we will dive into the details of MySQL joins and explore the four most common join types:
- Inner Join
- Left Join
- Right Join
- Full Outer Join
INNER JOIN
An INNER JOIN in SQL is used to retrieve records from two or more tables where there is a matching value in the columns being joined. This join type focuses solely on the rows that meet the specified condition, effectively filtering out unmatched rows.
Let's open Stash and create two tables as shown below:
CREATE TABLE `my_database`.`customers` (
`customer_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(100) NOT NULL,
PRIMARY KEY (`customer_id`)
) ENGINE = InnoDB;
CREATE TABLE `my_database`.`orders` (
`order_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`customer_id` INT UNSIGNED NOT NULL,
`order_date` DATE NOT NULL,
PRIMARY KEY (`order_id`)
) ENGINE = InnoDB;
Then, add some example data:
INSERT INTO `my_database`.`customers` (`customer_id`, `name`) VALUES
(1, 'Alice'),
(2, 'John'),
(3, 'Charlie');
INSERT INTO `my_database`.`orders` (`order_id`, `customer_id`, `order_date`) VALUES
(NULL, 1, '2024-12-01'),
(NULL, 1, '2024-12-09'),
(NULL, 1, '2024-12-14'),
(NULL, 2, '2024-10-09'),
(NULL, 2, '2024-10-13'),
(NULL, 6, '2024-12-22'),
(NULL, 6, '2024-12-23');
Suppose we want to fetch a list of customers and their orders. Here’s how an INNER JOIN can help:
SELECT customers.name, orders.order_id, orders.order_date
FROM customers
INNER JOIN orders
ON customers.customer_id = orders.customer_id;
This query retrieves:
- The
name
from thecustomers
table. - The
order_id
andorder_date
from theorders
table.
When you run the query above, you will notice that Charlie’s details are missing from the result because there’s no matching customer_id
for Charlie in the orders
table. The INNER JOIN excludes non-matching rows by design.
LEFT JOIN
A LEFT JOIN in SQL retrieves all records from the left table (the first table in the query) and the matching records from the right table (the second table in the query). If no match is found in the right table, NULL values are returned for those columns.
LEFT JOIN is useful in scenarios such as:
- Including Unmatched Records: Ensure you don’t exclude data from the primary table, even if no match exists in the secondary table.
- Analyzing Gaps: Identify missing relationships, such as customers without orders or products without sales.
- Building Reports: Create comprehensive reports that include all primary data, even if secondary data is incomplete.
Let's write the same query we used in the previous example, but this time with a LEFT JOIN.
SELECT customers.name, orders.order_id, orders.order_date
FROM customers
LEFT JOIN orders
ON customers.customer_id = orders.customer_id;
As you can see, Charlie is included in the result, even though he hasn’t placed any orders. This behavior is what sets a LEFT JOIN apart from an INNER JOIN.
RIGHT JOIN
A RIGHT JOIN in SQL retrieves all records from the right table (the second table in the query) and the matching records from the left table (the first table in the query). If there’s no match in the left table, NULL values are returned for its columns.
If a LEFT JOIN ensures that the left table’s data is always present in the result, the RIGHT JOIN flips the script, ensuring that all rows from the right table are included.
Suppose you want to retrieve all orders, including those with no matching customer. Here’s how you’d write a RIGHT JOIN query:
SELECT customers.name, orders.order_id, orders.order_date
FROM customers
RIGHT JOIN orders
ON customers.customer_id = orders.customer_id;
In this result, the last two rows highlight an order with no corresponding customer in the customers
table. The RIGHT JOIN ensures that all orders are included, even if no matching customer exists.
FULL OUTER JOIN
A FULL OUTER JOIN retrieves all records from both tables, combining matching rows while including unmatched rows from each table. If a match doesn’t exist in one table, the result set fills the missing side with NULL values.
In databases that natively support FULL OUTER JOIN, the syntax looks like this:
SELECT column_list
FROM table1
FULL OUTER JOIN table2
ON table1.column_name = table2.column_name;
However, MySQL lacks direct support for FULL OUTER JOIN. Instead, you can simulate it by combining LEFT JOIN and RIGHT JOIN with a UNION.
SELECT customers.name, orders.order_id, orders.order_date
FROM customers
LEFT JOIN orders
ON customers.customer_id = orders.customer_id
UNION
SELECT customers.name, orders.order_id, orders.order_date
FROM customers
RIGHT JOIN orders
ON customers.customer_id = orders.customer_id;
This result includes:
- Matched rows (Alice and John with their orders).
- Unmatched rows from
customers
(Charlie). - Unmatched rows from
orders
(order_id = 6
andorder_id = 7
).
Conclusion
Joins are a fundamental aspect of working with relational databases, and mastering them can unlock powerful data analysis capabilities. With this guide, you should have a solid understanding of how each type of join works and when to use them.