Difference between revisions of "Incorrect GROUP BY"

From SQLZOO
Jump to: navigation, search
Line 19: Line 19:
 
<p>In a group by line each region shows up only once - however in a typical region such as Africa there are several different name values. WHich one should SQL pick?</p>  
 
<p>In a group by line each region shows up only once - however in a typical region such as Africa there are several different name values. WHich one should SQL pick?</p>  
 
<h2>Solutions</h2>
 
<h2>Solutions</h2>
<ul>
+
*Remove the offending field from the SELECT line
<li>Remove the offending field from the SELECT line</li>
+
  SELECT region, MAX(population)
  <li>Add the field to the GROUP BY clause</li>
+
  FROM bbc
  <li>Aggregate the offending field</li>
+
  GROUP BY region
</ul>
+
*Add the field to the GROUP BY clause
 +
  SELECT name, region, MAX(population)
 +
  FROM bbc
 +
  GROUP BY name,region
 +
*Aggregate the offending field
 +
SELECT MAX(name), region, MAX(population)
 +
  FROM bbc
 +
  GROUP BY region
 
</div>
 
</div>
  

Revision as of 12:48, 9 August 2012

schema:gisq

Problem

When using a GROUP BY clause every field in the SELECT list must be either:

  • One of the GROUP BY terms - in this case region
  • An aggregate function - for example SUM or COUNT
  • An expression based on the above

In the example the field name may not be used on the SELECT line.

In a group by line each region shows up only once - however in a typical region such as Africa there are several different name values. WHich one should SQL pick?

Solutions

  • Remove the offending field from the SELECT line
SELECT region, MAX(population)
 FROM bbc
 GROUP BY region
  • Add the field to the GROUP BY clause
SELECT name, region, MAX(population)
 FROM bbc
 GROUP BY name,region
  • Aggregate the offending field
SELECT MAX(name), region, MAX(population)
 FROM bbc
 GROUP BY region
 
 
SELECT name, region, MAX(population)
  FROM bbc
  GROUP BY region