Group By and Aggregate SQL Functions
Posted by: Craig Hatmaker
At this point, we have everything we need for about 95% of user reporting and analytical requests. On rare occasions we need to summarize the data prior to putting it into our Excel spreadsheets. This happens when the raw detail for an expected time frame exceeds Excel’s normal 65,536 row limit and is more detailed than required for proper analysis. An example might be a request to review orders for an entire year by customer and product. In moderately sized companies, such requests could easily exceed 65,536 records.
We can work around this limitation by aggregating records so that several database records are represented in Excel as one row. We do this with SQL’s aggregate functions which include:
COUNT(*), SUM(field), AVG(field), MAX(field), MIN(field), FIRST(field), LAST(field)
If you use these functions you must add a GROUP BY clause. I touched on this last month in the post Adding SQL Formulas to the Table. The basic rule for the GROUP BY clause is:
All fields in the SELECT clause that are NOT one of the SQL Aggregate Functions, MUST be included in the GROUP BY clause.
GROUP BY clauses are tedious and often a source of errors for the SQL novice. If you omit or mis-transcribe any SELECT field, your SQL will fail. That’s why SQL novices typically fail on their first attempts with GROUP BY clauses. Since the GROUP BY clause is nearly a replica of the SELECT clause, creating them is tedious. So of course we need to automate that to make our work easier, increase our productivity, and improve the quality of our products. Here is a routine that does just that (with a Fields Table to use test your results).
Function Build_SQL_Group_By(sFields As String, sTable As String) As String
' Build_SQL_Group_By: Build SQL "Group By" field list
' NOTE: SQL Aggregate Functions include:
' Avg(),Count(),First(),Last(),Max(),Min(),Sum()
' Parameters:
' sFields: Field definition range name. It must contain:
'
' Field: Field/Column names
' Alias: File/Table Alias (ex: L for the table LOADS)
' Table: Field's Table name
' SQL Func.: SQL function/expression
' sTable: Table being grouped. "*" indicates all fields from
' all tables will be grouped
' Example:
' sSQL = sSQL & "Group By " & _
' Build_SQL_Group_By("Fields", "*")
' Date Init Modification
' 01/01/01 CWH Initial Programming
On Error GoTo ErrHandler Build_SQL_Group_By = "" 'Assume Something went wrong
Dim lRow As Long Dim sSQL As String Dim sSQLFunc As String 'Value in the SQL Func. Dim sFld As String 'Field with Alias Dim lColTbl As Long lColTbl = FieldColumn("Table", sFields) Dim lColFld As Long lColFld = FieldColumn("Field", sFields) Dim lColSQL As Long lColSQL = FieldColumn("SQL Func.", sFields) Dim lColAls As Long lColAls = FieldColumn("Alias", sFields) If Trim(sFields) = "" Or Trim(sTable) = "" Then Exit Function sSQL = "" With Range(sFields) For lRow = 2 To .Rows.Count 'Include fields for indicated table If (sTable = .Cells(lRow, lColFld) Or sTable = "*") Then sSQLFunc = Trim(UCase(.Cells(lRow, lColSQL))) 'Include fields that are not an aggregate function If sSQLFunc = "" Or _ InStr(1, "AVG,COU,FIR,LAS,MAX,MIN,SUM", _ Left(sSQLFunc, 3)) = 0 Then 'Combine Field with Alias sFld = IIf(Trim(.Cells(lRow, lColAls)) > "", _ Trim(.Cells(lRow, lColAls)) & ".", "") & _ Replace(.Cells(lRow, lColFld), """", "'") 'If SQL Function has a "?", replace with Field If InStr(1, sSQLFunc, "?") > 0 Then _ sSQLFunc = _ Replace(sSQLFunc, "?", sFld) 'If SQL Function, use it instead of Field If sSQLFunc > "" Then sFld = sSQLFunc sSQL = sSQL & IIf(sSQL > "", ", ", "") & sFld End If End If Next lRow End With Build_SQL_Group_By = sSQL
ErrHandler: If Err.Number <> 0 Then MsgBox _ "Build_SQL_Group_By - Error#" & Err.Number & vbCrLf & _ Err.Description, vbCritical, "Error", Err.HelpFile, Err.HelpContext On Error GoTo 0
End Function
Below is the change to our Macro1() highlighted in red. Pivot_Template has also been removed because this example has no “State” column.
If .pOK Then 'OK button used to exit
sSQL = "SELECT " & Build_SQL_Select_Fields("Fields") & vbCr & _
"FROM Customers C, Orders O, " & vbCr & _
" `Order Details` D, Products P " & vbCr & _
"WHERE O.`Customer ID` = C.ID " & vbCr & _
" AND O.`Order ID` = D.`Order ID` " & vbCr & _
" AND D.`Product ID` = P.ID " & vbCr & _
" AND O.`Order Date` Between #" & _
Format(.pFrom, "mm/dd/yyyy") & "# And #" & _
Format(.pTo, "mm/dd/yyyy") & "# " & vbCr & _
Build_SQL_ID("O.`Customer ID`", Trim(.pID1), False) & vbCr & _
Build_SQL_ID("P.`Product Code`", Trim(.pID2), True) & vbCr & _
"GROUP BY " & Build_SQL_Group_By("Fields", "*")
SQLLoad sSQL, sConnect, "A4", "Data", "Data"
If NameExists("Data") Then
If Range("Data").Rows.Count > 1 Then
Add_XLFormula "Data", "Fields"
Freeze_Pane "Data", "Fields"
Sort_Data "Data", "Fields"
Format_Results "Data", "Fields"
End If
End If
End If





