Difference between revisions of "The JOIN operation"
m (Added some <code> tags) |
m (Grammatical change.) |
||
(27 intermediate revisions by 11 users not shown) | |||
Line 1: | Line 1: | ||
+ | [[File:footballERD.png]] | ||
<div class="ref_section"> | <div class="ref_section"> | ||
<table class='db_ref'> | <table class='db_ref'> | ||
Line 69: | Line 70: | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td> | + | <td>1002</td> |
<td>RUS</td> | <td>RUS</td> | ||
<td>Roman Pavlyuchenko</td> | <td>Roman Pavlyuchenko</td> | ||
Line 116: | Line 117: | ||
<p>This tutorial introduces <code>JOIN</code> which allows you to use data from two or more tables. The tables contain all matches and goals from UEFA EURO 2012 Football Championship in Poland and Ukraine.</p> | <p>This tutorial introduces <code>JOIN</code> which allows you to use data from two or more tables. The tables contain all matches and goals from UEFA EURO 2012 Football Championship in Poland and Ukraine.</p> | ||
+ | <p>The data is available (mysql format) at http://sqlzoo.net/euro2012.sql</p> | ||
<div class="progress_panel"><div> | <div class="progress_panel"><div> | ||
<div class="summary">Summary</div> | <div class="summary">Summary</div> | ||
Line 126: | Line 128: | ||
<div class='qu'> | <div class='qu'> | ||
− | The first example shows the goal scored by 'Bender'. | + | The first example shows the goal scored by a player with the last name 'Bender'. The <code>*</code> says to list all the columns in the table - a shorter way of saying <code>matchid, teamid, player, gtime</code> |
− | <p class='imper'> | + | <p class='imper'>Modify it to show the ''matchid'' and ''player'' name for all goals scored by Germany. To identify German players, check for: |
<code>teamid = 'GER'</code></p> | <code>teamid = 'GER'</code></p> | ||
Line 144: | Line 146: | ||
<div class='qu'> | <div class='qu'> | ||
− | From the previous query you can see that Lars Bender's goal | + | <p>From the previous query you can see that Lars Bender's scored a goal in game 1012. Now we want to know what teams were playing in that match.</P> |
− | Notice that the column <code>matchid</code> in the <code>goal</code> table corresponds to the <code>id</code> column in the <code>game</code> table. | + | <p>Notice in the that the column <code>matchid</code> in the <code>goal</code> table corresponds to the <code>id</code> column in the <code>game</code> table. We can look up information about game 1012 by finding that row in the '''game''' table.</p> |
− | <p class='imper'>Show id, stadium, team1, team2 for game 1012</p> | + | <p class='imper'>Show id, stadium, team1, team2 for just game 1012</p> |
<source lang='sql' class='def'> | <source lang='sql' class='def'> | ||
SELECT id,stadium,team1,team2 | SELECT id,stadium,team1,team2 | ||
FROM game | FROM game | ||
− | |||
</source> | </source> | ||
Line 161: | Line 162: | ||
<div class='qu'> | <div class='qu'> | ||
− | You can combine the two steps into a single query with a <code>JOIN</code>. | + | You can combine the two steps into a single query with a <code>JOIN</code>. |
SELECT * | SELECT * | ||
FROM game JOIN goal ON (id=matchid) | FROM game JOIN goal ON (id=matchid) | ||
− | <p class='imper'> | + | <p>The '''FROM''' clause says to merge data from the goal table with that from the game table. The '''ON''' says how to figure out which rows in '''game''' go with which rows in '''goal''' - the '''id''' from '''goal''' must match '''matchid''' from '''game'''. (If we wanted to be more clear/specific we could say <br/><code>ON (game.id=goal.matchid)</code></p> |
+ | <p>The code below shows the player (from the goal) and stadium name (from the game table) for every goal scored.</p> | ||
+ | <p class='imper'>Modify it to show the player, teamid, stadium and mdate for every German goal.</p> | ||
<source lang='sql' class='def'> | <source lang='sql' class='def'> | ||
Line 172: | Line 175: | ||
<source lang='sql' class='ans'> | <source lang='sql' class='ans'> | ||
− | SELECT player,teamid,mdate | + | SELECT player,teamid,stadium,mdate |
FROM game JOIN goal ON (id=matchid) | FROM game JOIN goal ON (id=matchid) | ||
WHERE teamid='GER' | WHERE teamid='GER' | ||
Line 213: | Line 216: | ||
Notice that because <code>id</code> is a column name in both <code>game</code> and <code>eteam</code> you must specify <code>eteam.id</code> instead of just <code>id</code> | Notice that because <code>id</code> is a column name in both <code>game</code> and <code>eteam</code> you must specify <code>eteam.id</code> instead of just <code>id</code> | ||
− | <p class='imper'>List the dates of the matches in which 'Fernando Santos' was the team1 coach.</p> | + | <p class='imper'>List the the dates of the matches and the name of the team in which 'Fernando Santos' was the team1 coach.</p> |
<source lang='sql' class='def'> | <source lang='sql' class='def'> | ||
Line 227: | Line 230: | ||
<div class='qu'> | <div class='qu'> | ||
− | <p class='imper'>List the player for every goal scored in a game where the | + | <p class='imper'>List the player for every goal scored in a game where the stadium was 'National Stadium, Warsaw'</p> |
<source lang='sql' class='def'> | <source lang='sql' class='def'> | ||
Line 244: | Line 247: | ||
<h2>More difficult questions</h2> | <h2>More difficult questions</h2> | ||
<div class='qu'> | <div class='qu'> | ||
− | <div> | + | <div>The example query shows all goals scored in the Germany-Greece quarterfinal.</div> |
− | <p class='imper'> | + | <p class='imper'>Instead show the '''name''' of all players who scored a goal against Germany.</p> |
<div class="hint" title="HINT"> | <div class="hint" title="HINT"> | ||
− | Select goals scored by non-German players in matches where GER was the id of either '''team1''' or '''team2'''. | + | Select goals scored only by non-German players in matches where GER was the id of either '''team1''' or '''team2'''. |
You can use <code>teamid!='GER'</code> to prevent listing German players. | You can use <code>teamid!='GER'</code> to prevent listing German players. | ||
− | You can use DISTINCT to stop players being listed twice. | + | You can use <code>DISTINCT</code> to stop players being listed twice. |
</div> | </div> | ||
Line 302: | Line 305: | ||
<div class='qu'> | <div class='qu'> | ||
− | <div class='imper'>For every match involving 'POL', show the matchid date and the number of goals scored.</div> | + | <div class='imper'>For every match involving 'POL', show the matchid, date and the number of goals scored.</div> |
<source lang='sql' class='def'> | <source lang='sql' class='def'> | ||
SELECT matchid,mdate, team1, team2,teamid | SELECT matchid,mdate, team1, team2,teamid | ||
Line 319: | Line 322: | ||
<div class='qu'> | <div class='qu'> | ||
− | <div class='imper'>For every match where 'GER' scored, show the number of goals scored by 'GER'</div> | + | <div class='imper'>For every match where 'GER' scored, show matchid, match date and the number of goals scored by 'GER'</div> |
<source lang='sql' class='def'> | <source lang='sql' class='def'> | ||
</source> | </source> | ||
Line 333: | Line 336: | ||
<div class='qu'> | <div class='qu'> | ||
− | <div class='imper'>List every match with the goals scored by each team as shown.</div> | + | <div class='imper'>List every match with the goals scored by each team as shown. This will use "[[CASE|CASE WHEN]]" which has not been explained in any previous exercises.</div> |
<table class="sqlmine"> | <table class="sqlmine"> | ||
<tr><th>mdate</th><th>team1</th><th>score1</th><th>team2</th><th>score2</th></tr> | <tr><th>mdate</th><th>team1</th><th>score1</th><th>team2</th><th>score2</th></tr> | ||
Line 341: | Line 344: | ||
<tr><td colspan=5>...</td></tr> | <tr><td colspan=5>...</td></tr> | ||
</table> | </table> | ||
− | Notice in the query given every goal is listed. If it was a team1 goal then a 1 appears in score1, otherwise there is a 0. You could SUM this column to get a count of the goals scored by team1. | + | Notice in the query given every goal is listed. If it was a team1 goal then a 1 appears in score1, otherwise there is a 0. You could SUM this column to get a count of the goals scored by team1. '''Sort your result by mdate, matchid, team1 and team2.''' |
<source lang='sql' class='def'> | <source lang='sql' class='def'> | ||
SELECT mdate, | SELECT mdate, | ||
Line 355: | Line 358: | ||
team2, | team2, | ||
SUM(CASE WHEN teamid=team2 THEN 1 ELSE 0 END) score2 | SUM(CASE WHEN teamid=team2 THEN 1 ELSE 0 END) score2 | ||
− | FROM game JOIN goal ON matchid = id | + | FROM game LEFT JOIN goal ON matchid = id |
GROUP BY mdate,matchid,team1,team2 | GROUP BY mdate,matchid,team1,team2 | ||
</source> | </source> |
Revision as of 09:46, 27 September 2017
game
id | mdate | stadium | team1 | team2 |
---|---|---|---|---|
1001 | 8 June 2012 | National Stadium, Warsaw | POL | GRE |
1002 | 8 June 2012 | Stadion Miejski (Wroclaw) | RUS | CZE |
1003 | 12 June 2012 | Stadion Miejski (Wroclaw) | GRE | CZE |
1004 | 12 June 2012 | National Stadium, Warsaw | POL | RUS |
... |

goal
matchid | teamid | player | gtime |
---|---|---|---|
1001 | POL | Robert Lewandowski | 17 |
1001 | GRE | Dimitris Salpingidis | 51 |
1002 | RUS | Alan Dzagoev | 15 |
1002 | RUS | Roman Pavlyuchenko | 82 |
... |

eteam
id | teamname | coach |
---|---|---|
POL | Poland | Franciszek Smuda |
RUS | Russia | Dick Advocaat |
CZE | Czech Republic | Michal Bilek |
GRE | Greece | Fernando Santos |
... |
JOIN and UEFA EURO 2012
This tutorial introduces JOIN, which allows you to use data from two or more tables. The tables contain all matches and goals from the UEFA EURO 2012 Football Championship in Poland and Ukraine.
The data is available (mysql format) at http://sqlzoo.net/euro2012.sql
The first example shows the goal scored by a player with the last name 'Bender'. The * says to list all the columns in the table, a shorter way of saying matchid, teamid, player, gtime.
Modify it to show the matchid and player name for all goals scored by Germany. To identify German players, check for: teamid = 'GER'
SELECT * FROM goal
WHERE player LIKE '%Bender'
SELECT matchid, player
FROM goal
WHERE teamid = 'GER'
From the previous query you can see that Lars Bender scored a goal in game 1012. Now we want to know which teams were playing in that match.
Notice that the column matchid in the goal table corresponds to the id column in the game table. We can look up information about game 1012 by finding that row in the game table.
Show id, stadium, team1, team2 for just game 1012
SELECT id,stadium,team1,team2
FROM game
SELECT id,stadium,team1,team2
FROM game
WHERE id=1012
You can combine the two steps into a single query with a JOIN.
SELECT * FROM game JOIN goal ON (id=matchid)
The FROM clause says to merge data from the goal table with that from the game table. The ON says how to figure out which rows in game go with which rows in goal: the id from game must match the matchid from goal. (If we wanted to be more specific we could say ON (game.id=goal.matchid).)
The code below shows the player (from the goal table) and stadium name (from the game table) for every goal scored.
Modify it to show the player, teamid, stadium and mdate for every German goal.
SELECT player,stadium
FROM game JOIN goal ON (id=matchid)
SELECT player,teamid,stadium,mdate
FROM game JOIN goal ON (id=matchid)
WHERE teamid='GER'
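The ON clause described above can be tried on a small scale. The following is a minimal sketch, assuming an empty scratch database and using only the sample rows shown at the top of this page (not the full euro2012 data); the column types are assumptions, since the page does not state them. It uses the fully qualified form game.id = goal.matchid so that it is clear which table each column comes from.

```sql
-- Scratch copies of the two tables; column types are assumed.
CREATE TABLE game (
  id      INTEGER PRIMARY KEY,
  mdate   VARCHAR(20),
  stadium VARCHAR(50),
  team1   VARCHAR(3),
  team2   VARCHAR(3)
);
CREATE TABLE goal (
  matchid INTEGER,
  teamid  VARCHAR(3),
  player  VARCHAR(50),
  gtime   INTEGER
);

-- Sample rows copied from the excerpt on this page.
INSERT INTO game VALUES (1001, '8 June 2012', 'National Stadium, Warsaw', 'POL', 'GRE');
INSERT INTO game VALUES (1002, '8 June 2012', 'Stadion Miejski (Wroclaw)', 'RUS', 'CZE');
INSERT INTO goal VALUES (1001, 'POL', 'Robert Lewandowski', 17);
INSERT INTO goal VALUES (1001, 'GRE', 'Dimitris Salpingidis', 51);
INSERT INTO goal VALUES (1002, 'RUS', 'Alan Dzagoev', 15);
INSERT INTO goal VALUES (1002, 'RUS', 'Roman Pavlyuchenko', 82);

-- Each goal row is paired with the one game row whose id equals its matchid.
SELECT goal.player, goal.teamid, game.stadium, game.mdate
FROM game JOIN goal ON (game.id = goal.matchid)
WHERE goal.teamid = 'RUS'
ORDER BY goal.gtime;
-- Alan Dzagoev       | RUS | Stadion Miejski (Wroclaw) | 8 June 2012
-- Roman Pavlyuchenko | RUS | Stadion Miejski (Wroclaw) | 8 June 2012
```

Changing 'RUS' to another teamid that appears in the sample rows is a quick way to check that the stadium and date always come from the matching game row.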
Use the same JOIN as in the previous question.
Show the team1, team2 and player for every goal scored by a player called Mario: player LIKE 'Mario%'
SELECT team1, team2, player
FROM game JOIN goal ON (id=matchid)
WHERE player LIKE 'Mario%'
The table eteam gives details of every national team including the coach. You can JOIN goal to eteam using the phrase goal JOIN eteam ON teamid=id
Show player, teamid, coach, gtime for all goals scored in the first 10 minutes: gtime<=10
SELECT player, teamid, gtime
FROM goal
WHERE gtime<=10
SELECT player, teamid, coach, gtime
FROM goal JOIN eteam ON (teamid=id)
WHERE gtime<=10
To JOIN game with eteam you could use either
game JOIN eteam ON (team1=eteam.id)
or
game JOIN eteam ON (team2=eteam.id)
Notice that because id is a column name in both game and eteam, you must specify eteam.id instead of just id.
List the dates of the matches and the name of the team for which 'Fernando Santos' was the team1 coach.
SELECT mdate,teamname
FROM game JOIN eteam ON (team1=eteam.id)
WHERE coach='Fernando Santos'
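The note about writing eteam.id instead of id can be checked with the sample rows from the top of this page. This is a minimal sketch, assuming an empty scratch database and assumed column types. The second query goes one step beyond the text: it joins eteam twice, once for each side of the match, using the table aliases g, t1 and t2, which are names chosen here for illustration only.

```sql
-- Scratch copies of the two tables; column types are assumed.
CREATE TABLE game (
  id      INTEGER PRIMARY KEY,
  mdate   VARCHAR(20),
  stadium VARCHAR(50),
  team1   VARCHAR(3),
  team2   VARCHAR(3)
);
CREATE TABLE eteam (
  id       VARCHAR(3) PRIMARY KEY,
  teamname VARCHAR(30),
  coach    VARCHAR(40)
);

-- Sample rows copied from the excerpt on this page.
INSERT INTO game VALUES (1001, '8 June 2012',  'National Stadium, Warsaw',  'POL', 'GRE');
INSERT INTO game VALUES (1002, '8 June 2012',  'Stadion Miejski (Wroclaw)', 'RUS', 'CZE');
INSERT INTO game VALUES (1003, '12 June 2012', 'Stadion Miejski (Wroclaw)', 'GRE', 'CZE');
INSERT INTO game VALUES (1004, '12 June 2012', 'National Stadium, Warsaw',  'POL', 'RUS');
INSERT INTO eteam VALUES ('POL', 'Poland',         'Franciszek Smuda');
INSERT INTO eteam VALUES ('RUS', 'Russia',         'Dick Advocaat');
INSERT INTO eteam VALUES ('CZE', 'Czech Republic', 'Michal Bilek');
INSERT INTO eteam VALUES ('GRE', 'Greece',         'Fernando Santos');

-- Both tables have a column called id, so the column is qualified as eteam.id.
SELECT mdate, teamname
FROM game JOIN eteam ON (team1 = eteam.id)
WHERE coach = 'Franciszek Smuda'
ORDER BY game.id;
-- 8 June 2012  | Poland
-- 12 June 2012 | Poland

-- Joining eteam twice (aliases t1 and t2) gives the names of both teams.
SELECT g.id, t1.teamname AS team1name, t2.teamname AS team2name
FROM game g
JOIN eteam t1 ON (g.team1 = t1.id)
JOIN eteam t2 ON (g.team2 = t2.id)
ORDER BY g.id;
-- 1001 | Poland | Greece
-- 1002 | Russia | Czech Republic
-- 1003 | Greece | Czech Republic
-- 1004 | Poland | Russia
```

Once a table is joined more than once, every column taken from it has to be qualified with an alias, which is the same rule as eteam.id applied more widely.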
List the player for every goal scored in a game where the stadium was 'National Stadium, Warsaw'
SELECT player
FROM goal JOIN game ON (id=matchid)
WHERE stadium = 'National Stadium, Warsaw'
More difficult questions
The example query shows all goals scored in the Germany-Greece quarterfinal.
Instead show the name of all players who scored a goal against Germany.
Select goals scored only by non-German players in matches where GER was the id of either team1 or team2.
You can use teamid!='GER' to prevent listing German players.
You can use DISTINCT to stop players being listed twice.
SELECT player, gtime
FROM game JOIN goal ON matchid = id
WHERE (team1='GER' AND team2='GRE')
SELECT DISTINCT player
FROM game JOIN goal ON matchid = id
WHERE (team1 = 'GER' OR team2 = 'GER')
AND teamid!='GER'
You should COUNT(*) in the SELECT line and GROUP BY teamname
SELECT teamname, player
FROM eteam JOIN goal ON id=teamid
ORDER BY teamname
SELECT teamname,COUNT(teamid)
FROM eteam JOIN goal ON id=teamid
GROUP BY teamname
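The hint about COUNT(*) and GROUP BY teamname can be tried on the sample rows from the top of this page. This is a minimal sketch, assuming an empty scratch database and assumed column types; because only four goal rows are loaded, the counts describe this excerpt and not the real tournament totals.

```sql
-- Scratch copies of the two tables; column types are assumed.
CREATE TABLE eteam (
  id       VARCHAR(3) PRIMARY KEY,
  teamname VARCHAR(30),
  coach    VARCHAR(40)
);
CREATE TABLE goal (
  matchid INTEGER,
  teamid  VARCHAR(3),
  player  VARCHAR(50),
  gtime   INTEGER
);

-- Sample rows copied from the excerpt on this page.
INSERT INTO eteam VALUES ('POL', 'Poland',         'Franciszek Smuda');
INSERT INTO eteam VALUES ('RUS', 'Russia',         'Dick Advocaat');
INSERT INTO eteam VALUES ('CZE', 'Czech Republic', 'Michal Bilek');
INSERT INTO eteam VALUES ('GRE', 'Greece',         'Fernando Santos');
INSERT INTO goal VALUES (1001, 'POL', 'Robert Lewandowski', 17);
INSERT INTO goal VALUES (1001, 'GRE', 'Dimitris Salpingidis', 51);
INSERT INTO goal VALUES (1002, 'RUS', 'Alan Dzagoev', 15);
INSERT INTO goal VALUES (1002, 'RUS', 'Roman Pavlyuchenko', 82);

-- One output row per teamname; COUNT(*) counts the joined goal rows in each group.
SELECT teamname, COUNT(*) AS goals
FROM eteam JOIN goal ON (eteam.id = goal.teamid)
GROUP BY teamname
ORDER BY teamname;
-- Greece | 1
-- Poland | 1
-- Russia | 2
```

Czech Republic does not appear in the result because it has no goal rows in this excerpt, and a plain JOIN only keeps rows that have a match on both sides.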
SELECT stadium,COUNT(1)
FROM goal JOIN game ON id=matchid
GROUP BY stadium
For every match involving 'POL', show the matchid, date and the number of goals scored.
SELECT matchid,mdate, team1, team2,teamid
FROM game JOIN goal ON matchid = id
WHERE (team1 = 'POL' OR team2 = 'POL')
SELECT matchid,mdate,COUNT(teamid)
FROM game JOIN goal ON matchid = id
WHERE (team1 = 'POL' OR team2 = 'POL')
GROUP BY matchid,mdate
For every match where 'GER' scored, show matchid, match date and the number of goals scored by 'GER'
SELECT matchid,mdate,COUNT(teamid)
FROM game JOIN goal ON matchid = id
WHERE (teamid='GER')
GROUP BY matchid,mdate
List every match with the goals scored by each team as shown. This will use "CASE WHEN", which has not been explained in any previous exercises.
mdate | team1 | score1 | team2 | score2 |
---|---|---|---|---|
1 July 2012 | ESP | 4 | ITA | 0 |
10 June 2012 | ESP | 1 | ITA | 1 |
10 June 2012 | IRL | 1 | CRO | 3 |
... |
Notice that the query given lists every goal. If it was a team1 goal then a 1 appears in score1, otherwise there is a 0. You could SUM this column to get a count of the goals scored by team1. Sort your result by mdate, matchid, team1 and team2.
SELECT mdate,
team1,
CASE WHEN teamid=team1 THEN 1 ELSE 0 END score1
FROM game JOIN goal ON matchid = id
SELECT mdate,
team1,
SUM(CASE WHEN teamid=team1 THEN 1 ELSE 0 END) score1,
team2,
SUM(CASE WHEN teamid=team2 THEN 1 ELSE 0 END) score2
FROM game LEFT JOIN goal ON matchid = id
GROUP BY mdate,matchid,team1,team2
ORDER BY mdate,matchid,team1,team2
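The answer above uses LEFT JOIN rather than JOIN, and a small experiment suggests why. This is a minimal sketch, assuming an empty scratch database and assumed column types. Game 1001 and its two goals come from the sample rows on this page; game 9999 with teams 'AAA' and 'BBB' is a hypothetical goalless match invented purely for this illustration.

```sql
-- Scratch copies of the two tables; column types are assumed.
CREATE TABLE game (
  id    INTEGER PRIMARY KEY,
  mdate VARCHAR(20),
  team1 VARCHAR(3),
  team2 VARCHAR(3)
);
CREATE TABLE goal (
  matchid INTEGER,
  teamid  VARCHAR(3),
  player  VARCHAR(50),
  gtime   INTEGER
);

INSERT INTO game VALUES (1001, '8 June 2012', 'POL', 'GRE');
INSERT INTO game VALUES (9999, '9 June 2012', 'AAA', 'BBB');  -- hypothetical 0-0 match
INSERT INTO goal VALUES (1001, 'POL', 'Robert Lewandowski', 17);
INSERT INTO goal VALUES (1001, 'GRE', 'Dimitris Salpingidis', 51);

-- Plain JOIN: a game with no goal rows has nothing to pair with, so it disappears.
SELECT id, team1,
       SUM(CASE WHEN teamid = team1 THEN 1 ELSE 0 END) AS score1,
       team2,
       SUM(CASE WHEN teamid = team2 THEN 1 ELSE 0 END) AS score2
FROM game JOIN goal ON (matchid = id)
GROUP BY id, team1, team2
ORDER BY id;
-- 1001 | POL | 1 | GRE | 1

-- LEFT JOIN: every game is kept; for the goalless game teamid is NULL,
-- so both CASE expressions fall through to ELSE 0.
SELECT id, team1,
       SUM(CASE WHEN teamid = team1 THEN 1 ELSE 0 END) AS score1,
       team2,
       SUM(CASE WHEN teamid = team2 THEN 1 ELSE 0 END) AS score2
FROM game LEFT JOIN goal ON (matchid = id)
GROUP BY id, team1, team2
ORDER BY id;
-- 1001 | POL | 1 | GRE | 1
-- 9999 | AAA | 0 | BBB | 0
```

The ELSE 0 branch matters here: without it the CASE would return NULL for the goalless game and the SUM would be NULL instead of 0.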
The next tutorial about the Movie database involves some slightly more complicated joins.