This tutorial is adapted from the Web-Age course Introduction to SQL.
A SQL Join statement is used to combine data or rows from two or more tables based on a common field between them. A subquery is a query that is nested inside a SELECT
, INSERT
, UPDATE
, or DELETE
statement, or inside another subquery. Joins and subqueries are both used to combine data from different tables into a single result.
A primary key is a column in a table that’s used to uniquely identify a row in that table.
A foreign key is used to form a relationship between two tables. For example, let’s say you have a one-to-many relationship between customers in a CUSTOMER table and orders in an ORDER table. To relate the two tables, you would define an ORDER_ID column in the CUSTOMER table that corresponds with the ORDER_ID column in the ORDER table.
Note: The name of the foreign key column does not need to match the name of the primary key column.
The values defined in a foreign key column MUST match the values defined in the primary key column. This is referred to as referential integrity and is enforced by the RDBMS. If the primary key for a table is a composite key, the foreign key must also be a composite key.
Inner joins are used to combine related information from multiple tables. An inner join retrieves matching rows between two tables. Matches are normally made based on the primary key/foreign key relationships that exist. If there’s a match, a row is included in the results. Otherwise, it’s not.
Syntax:
SELECT <columns>
FROM <table1>
JOIN <table2>
ON <table1>.<columnA> = <table2>.<columnB>
Technically a foreign key does not need to exist in the related table in order to do an inner join. However, to guarantee referential integrity and for performance reasons, it’s recommended to have one.
If the primary key and foreign key are composite keys, then you would AND together any additional columns when specifying the join. For example, if you had a two-column composite key, then the syntax would be:
SELECT <columns>
FROM <table1>
JOIN <table2>
ON <table1>.<columnA> = <table2>.<columnB>
AND <table1>.<columnC> = <table2>.<columnD>
A database schema represents the organization and structure of a database. It contains the tables and other database objects (e.g,. views, indexes, stored procedures). A database can have more than one schema, which is the case here. The AdventureWorks database actually has 6 schemas (dbo, HumanResources, Person, Production, Purchasing, and Sales). In SQL Server, dbo is the default schema. On the other hand, Northwind only has the dbo schema.
In the example above, the HumanResources.Employee identifier signifies that HumanResources is the schema name and Employee is the table name. The Person. Person identifier signifies that Person is the schema name and Person is the table name.
Joining together 2 tables:
SELECT
Person.FirstName, Person.LastName,
Employee.BirthDate, Employee.HireDate
FROM HumanResources.Employee
JOIN Person.Person
ON Employee.BusinessEntityID = Person.BusinessEntityID
Joining together N tables:
SELECT
Person.FirstName, Person.LastName, Employee.BirthDate, Employee.HireDate, EmailAddress.EmailAddress
FROM HumanResources.Employee
JOIN Person.Person
ON Employee.BusinessEntityID = Person.BusinessEntityID
JOIN Person.EmailAddress
ON Person.BusinessEntityID = EmailAddress.BusinessEntityID
It’s normally good practice to prefix the column names with the table names so it’s clear which table the columns are coming from. However, if there’s no ambiguity, meaning the same columns don’t exist in both tables, you can omit them.
Example:
SELECT
FirstName, LastName,
BirthDate, HireDate
FROM HumanResources.Employee
JOIN Person.Person
ON Employee.BusinessEntityID = Person.BusinessEntityID
To make column names shorter and more readable when joining together two tables, we can use table aliases. Table aliases are similar to column aliases. Instead of abbreviating a column name, we abbreviate a table name. Often times table aliases are one or two characters in length. Table aliases can be used even if you’re not joining together tables.
Syntax:
SELECT <columns>
FROM <table1> <alias1>,
<table2> <alias2>
JOIN <table2>
ON <alias1>.<columnA> = <alias2>.<columnB>
Example:
SELECT
p.FirstName, p.LastName,
e.BirthDate, e.HireDate
FROM HumanResources.Employee e
JOIN Person.Person p
ON e.BusinessEntityID = p.BusinessEntityID
An alternate syntax exists for doing an inner join, which uses the WHERE clause for specifying the join criteria. You may see this style of join used in legacy SQL code.
Syntax:
SELECT <columns>
FROM <table1>, <table2>
WHERE <table1>.<columnA> = <table2>.<columnB>
Example:
SELECT
Person.FirstName, Person.LastName,
Employee.BirthDate, Employee.HireDate
FROM
HumanResources.Employee,
Person.Person
WHERE Employee.BusinessEntityID = Person.BusinessEntityID
To specify outer joins, you need to use a vendor-specific syntax. For example, in Oracle, you need to use the (+) operator and in SQL Server you need to use the *= and =* operators.
What happens if you’re doing an inner join and one of the records in a table doesn’t have a matching record in the other table? Then no row appears in the result set. However, you may still wish to see the row in your report. For example, you may have a customer, but that customer hasn’t placed any orders yet. You still want to list the customer. That’s where outer joins come to the rescue. There are several types of outer joins that we’ll discuss next.
A left outer join returns all records from the table on the left and the records that match from the table on the right. If there is no matching record, then NULL is returned for any columns selected from the table on the right. It is the equivalent of an inner join plus the unmatched rows from the table on the left.
Syntax:
SELECT <columns>
FROM <table1>
LEFT JOIN <table2>
ON <table1>.<columnA> = <table2>.<columnB>
Retrieve back all employee ids (i.e., business entity ids) and any matching purchase order ids.
SELECT
e.BusinessEntityID,
p.PurchaseOrderID
FROM HumanResources.Employee e
LEFT JOIN Purchasing.PurchaseOrderHeader p
ON e.BusinessEntityID = p.EmployeeID

Retrieve back all employee ids (i.e., business entity ids) and any matching job candidate ids and resumes
SELECT e.BusinessEntityID, j.JobCandidateID, j.Resume
FROM HumanResources.Employee e
LEFT JOIN HumanResources.JobCandidate j
ON e.BusinessEntityID = j.BusinessEntityID
ORDER BY j.JobCandidateID DESC

A right outer join is the opposite of a left outer join. A right outer join returns all records from the table on the right and the records that match from the table on the left. If there is no matching record, then NULL is returned for any columns selected from the table on the left. It is the equivalent of an inner join plus the unmatched rows from the table on the right.
Syntax:
SELECT <columns>
FROM <table1>
RIGHT JOIN <table2>
ON <table1>.<columnA> = <table2>.<columnB>
Example:
Retrieve back all matching job candidate ids and resumes and any matching employee ids (i.e., business entity ids)
SELECT e.BusinessEntityID, j.JobCandidateID, j.Resume
FROM HumanResources.Employee e
RIGHT JOIN HumanResources.JobCandidate j
ON e.BusinessEntityID = j.BusinessEntityID
ORDER BY e.BusinessEntityID DESC
A full outer join returns all matching records between the table on the left and the table on the right, as well as all non-matching records on both sides. It is the equivalent of an inner join plus the unmatched rows from the table on the left and the unmatched rows from the table on the right.
Syntax:
SELECT <columns>
FROM <table1>
FULL JOIN <table2>
ON <table1>.<columnA> = <table2>.<columnB>
Example:
Retrieve back all employee ids (i.e., business entity ids) and all job candidate ids and resumes. Match together any records possible.
SELECT e.BusinessEntityID, j.JobCandidateID, j.Resume
FROM HumanResources.Employee e
FULL JOIN HumanResources.JobCandidate j
ON e.BusinessEntityID = j.BusinessEntityID
ORDER BY e.BusinessEntityID DESC
A self-join is a join where the table is joined with itself. It is typically used where there is a hierarchical relationship between the entities (e.g., employee-manager), or you wish to compare rows within the same table. It uses either the inner join or left outer join syntax. Table aliases are used to assign different names to the same table within the query.
Syntax:
SELECT <columns>
FROM <table> AS <alias1>
JOIN <table> AS <alias2>
ON <alias1>.<columnA> = <alias2>.<columnB>
Example:
To match addresses from the same city, use the following query:
SELECT a1.AddressID, a2.AddressID, a1.City
FROM Person.Address a1
JOIN Person.Address a2
ON a1.AddressID > a2.AddressID
WHERE a1.City = a2.City
A subquery is a nested query (inner query) that’s used to filter the results of the outer query. Subqueries can be used as an alternative to joins. A subquery is typically nested inside the WHERE clause.
Syntax:
SELECT <columns>
FROM <table>
WHERE <column> <operator>
(SELECT <columns>
FROM <table>
)
Subqueries must always be enclosed within parentheses.
The operator can correspond with one of the following values:
IN, =, <>, <, >, >=, <=
If the subquery returns more than one result, then you can only use the IN operator
The table that’s specified in the subquery is typically different than the one in the outer query, but it can be the same.
SELECT Name, ProductNumber
FROM Production.Product
WHERE ProductID IN
(SELECT ProductID
FROM Purchasing.PurchaseOrderDetail
WHERE OrderQty > 5)

SELECT FirstName, LastName
FROM Person.Person p
WHERE BusinessEntityID =
(SELECT BusinessEntityID
FROM HumanResources.Employee
WHERE NationalIDNumber = 295847284)
The type of subquery we have studied so far, which we’ll refer to as a regular subquery, is independent of the outer query and gets executed once. Its results are used by the outer query. On the other hand, a correlated subquery depends on the outer query and gets executed once for each row returned by the outer query. It’s also referred to as a repeating subquery. Because correlated subqueries get executed multiple times, they can be slow.
Example:
To retrieve the salespeople whose commission percentage is 1%, we can use the following correlated subquery. The outer query retrieves each employee. The inner query evaluates each row to see if their commission percentage is 1%.
SELECT e.BusinessEntityID, p.FirstName, p.LastName
FROM Person.Person p
JOIN HumanResources.Employee e
ON e.BusinessEntityID = p.BusinessEntityID
WHERE 0.01 IN
(SELECT s.CommissionPct
FROM Sales.SalesPerson s
WHERE e.BusinessEntityID = s.BusinessEntityID)
In this tutorial, we explored how to:
- Use inner, outer, and full joins to query more than one table
- Use self joins to join a table to itself
- Use subqueries to filter the results of an outer query