by Garry Robinson
This article will introduce some basic concepts for exploring data stored in data tables. This will cover the SQL aggregate query, the where clause and the MS Graph form object.
I will demonstrate how the Orders Table in the Northwind database can be setup in a form for undertaking a drilldown. This will include adding the fieldnames for the aggregate functions (sum, count, avg) and the drilldown group by fields. Also demonstrated will be a simple way to setup a where clause for the query that copes with the vagaries of Date, String and Numerical data types.
Once the components for the aggregate query are established, we then have to make a SQL statement that will aggregate our data. The processes here must allow for the user selecting ANY allowable field and ANY suitable "where clause". Eg the following SQL shows the total freight costs by Region once the user has selected USA as a country to drilldown on.
SELECT Orders.ShipRegion, Sum(Orders.Freight) AS SumFreight
GROUP BY Orders.ShipRegion;
The fun bit.
Once you have established the SQL, you have to feed it to the Graph object. This actually is a very simple end to the process because whenever you change the SQL property for the graph, the graph object will change to reflect the changed data.
Tucked away in the bowels of Access is a powerful graphing tool called MS Graph. A lot of Office users will be familiar with the graphing capabilities of Excel but very few people seem to have successfully implemented graphs of much use inside Access. Both products use similar core elements from the Office charting engine. The difference between the two is that with Excel, you derive your graph data by pointing the graph to cells in a spreadsheet. This allows people to manually move the cells around until they end up with a graph that they are happy with. In Access, you have to manipulate queries until you end up the graph that you want. This is a lot more difficult but the end result is far more versatile as the Access graph will work irrespective of the number of rows of data that you introduce via the query. The other important feature of Access that you will deploy for data mining is the direct support of Structured Query Language in Visual Basic code. This allows you to easily change the exploration queries on the fly.
Customer Order Date Shipped Date Freight City Postal Code Country
VINET 04-Aug-94 16-Aug-94 $32.38 Reims 51100 France
TOMSP 05-Aug-94 10-Aug-94 $11.61 Münster 44087 Germany
HANAR 08-Aug-94 12-Aug-94 $65.83 Rio 05454 Brazil
Table 1: NORTHWIND ORDERS TABLE
Applying The SQL To MS Graph
To make a graph in Access, you need to build a particular SQL statement called a consolidation query and then apply that to an MS Graph object in an Access form. The best way to read about the different filtering techniques in Access is to start up the Office Assistant (97) or Answer Wizard (95) and enter "Group By". Then choose "Group By Clause" and read the information plus review See Also.
If we were to setup a consolidation query on the nwOrders table, i.e. to show the sum of freight costs by country after a certain date, we would have the following SQL query and sample result as shown in Figure 2. This consolidation query is broken up into the following parts.
SELECT shipCountry, Group By Columns
Sum(Freight) AS totalFreight Aggregated values
FROM nwOrders Table/Query source
WHERE orderDate > #01-Jan-1996# Where Filter
GROUP BY shipCountry Group By Columns
ORDER BY shipCountry; Order By Clause
Table 2: CONSOLIDATION QUERY -- A Consolidation Query shown into its constituent parts plus a sample listing derived from this query. Refer to this figure for a reference on terms used in building these types of queries.
Introduction To Data Mining
Following are some exploratory data mining on the nwOrders table to illustrate the technique. In this exercise, I was interested in finding out where the top Freight costs were in the Orders table. So I started off with the query in Figure 2 which produced the following graph in an Access form
Figure 3: TOTAL FREIGHT COSTS BY COUNTRY -- A Microsoft Access 3D graph showing
the Sum of Freight Costs by the country where those cost occurred
Now to review the freight costs in USA (which were the highest), I needed to drilldown on the country USA. This produces the following graph
Figure 4: USA FREIGHT COSTS BY CITY - Drilldown on freight orders for the USA and show
the total freight costs by the city where those costs occurred
Now as the largest freight costs are in Boise, Idaho, we will look at the freight costs by month to see when the main costs occurred. This produces figure 5 which gives a very clear indication that the highest costs occurred during May in 1995. We have now pinpointed the time and location of the problem and Accounts can now look into the detail.
Figure 5: MONTHLY COSTS FOR BOISE, IDAHO -- Total Freight Costs For Boise, Idaho
shown by consolidating the data by month.
Your Data Mining Software
Now we have seen how we can use consolidation queries and graphs, how can you explore your data with the tools in YourDrilldown+FX.mdb. When you open the YourDrilldown+FX.mdb database, you will see the form in Figure 6 to which I have added some colorful numbers to reference different controls on the form and the software and properties behind those controls.
Figure 6: THE DATA MINING FORM -- The data mining form in quickGraf.mdb setup to produce the graf of freight costs by cities in the USA (see figure 4).
Whenever you add a new table or attached table to the database, you will need to update the data mining system tables. The onclick event under this button runs a module called makeDataMiningTables which refreshes the z_sysTables and z_sysColumns tables (see figure 7) that you will need in your database. The source for this routine can be found module FX_DrillDownCode.
tabname colpos colname coltype groupby sumby
NWOrders 1 CustomerID Text Yes No
NWOrders 2 OrderDate Date Yes No
NWOrders 3 ShippedDate Date Yes No
NWOrders 4 ShipVia Number Yes Yes
NWOrders 5 Freight Number Yes Yes
Table 7 Shows the z_sysColumns table that is used in the data mining form to provide information on columns in the currently selected table.
This tableReq combo box will show all the tables in the database that you can drilldown on. The important properties for this control are
rowSourceType = Table/Query
RowSource = z_SysTables
The one exit event event will reset the other combo box controls to look at a newly selected table as follows
Me!field1.RowSource = Me!field1.RowSource
The Field1 controls that allow you to setup the aggregate equation in your consolidation query do not have any VB code but have the following important properties that draw on the z_SysColumns table show in figure 7. As aggregate equations generally apply to numerical fields, only those fields in z_SysColumns where the SumBy flag is set to true are selected as rows in this combo box.
Me!Field1.rowsource = "SELECT DISTINCTROW z_SysColumns.colname, z_SysColumns.coltype FROM z_SysColumns
WHERE (((z_SysColumns.sumby)=True) AND ((z_SysColumns.tabname)=[Forms]![FX_YourDrilldown]![tableReq]));"
The second field1 combo box called Aggreg1 displays a list of the valid aggregate equations such as Sum, Average, Count that can be used in the consolidation query. The Name1 text box allows you to give the aggregate equation a useful name when it is displayed in a graph. You will change this manually.
The GroupBy field is used to select which field or function in the table that you wish to consolidate on. In figure 6, we are showing the total results by the city where the freight was shipped from. The important property in this field is the RowSource and in this case we limit the fields in the table to those that have the groupBy flag set to true. This will basically preclude any fields such as memo fields and insert objects.
GroupBy.RowSource = SELECT DISTINCTROW z_SysColumns.colname, z_SysColumns.coltype FROM z_SysColumns
WHERE (((z_SysColumns.groupby)=True) AND ((z_SysColumns.tabname)=[Forms]![FX_YourDrilldown]![tableReq]));
A very powerful feature of Access queries (that may not be supported by SQL backend databases) is the ability to consolidate data using a function. In figure 5, I showed data by month using the format function Format([ShippedDate],"yyyy-mm"). Try inserting this and other Access Functions such as Year([shippedDate]) to group your data into segments not actually supported in the data itself.
To allow you to drilldown on your data, the DrillFld combo box will display a list of fields like the GroupBy combo box. Choose one of these and then in the DrillVal combo choose a unique value that will be displayed for that field. There are 2 important visual basic events under this control to firstly build the unique values list for the combo box record source and then when you exit the combo box to generate and display a where filter that will be used for the drilldowns.
Private Sub drillVal_Enter()
' Build a unique list of values from the drill
' field that has been selected
Dim drillsql As String
If Not IsNull(Me!tableReq) And Not IsNull(Me!groupBy) Then
' Make a unique values RowSource SQL statement
drillsql = "SELECT " & Me!drillFld & " from " & Me!tableReq _
& " where " & Me!drillFld & " is not null " & " group by " & Me!drillFld & ";"
' Now set the rowSource to this new SQL
Me!DrillVal.RowSource = drillsql
Private Sub drillVal_Exit(Cancel As Integer)
' Prepare a suitable where filter by
' enclosing the data within suitable characters
Dim txtEnclose As String
Select Case Me!drillFld.Column(1)
txtEnclose = "'"
txtEnclose = "#"
Case Else ' Number fields
txtEnclose = ""
Me![txtFilter] = "(" & Me!drillFld & " = " _
& txtEnclose & Me!DrillVal & txtEnclose & ")"
The txtFilter field serves as a place where the valid drilldown where filter is placed. If you have any understanding of SQL where statements, you can manual add any valid filter into this field and it will be immediately applied to the graph. An example of this would be to look at the results from both USA and Canada using the following manually entered where filter
(ShipCountry = 'USA') and (ShipCountry = 'Canada')
The cmdOpenGraf button is where all these different fields are brought together to make a valid SQL consolidation query. This is then applied to the record source of the Form with the graph object to display the data. The code that sits behind this button is
Private Sub cmdOpenGraf_Click()
' Build a consolidation query based on the
' selections in the quikGraf form
' and then pass that query to the graph
' object in the Access form so that we
' can view the results.
Dim grafsql As String, sqlOK As Integer
Dim grafName As String
If Not IsNull(Me!tableReq) And _
Not IsNull(Me!field1) And _
Not IsNull(Me!groupBy) Then
' Consolidate By and
' Aggregrate equation for field 1
grafsql = "SELECT " & Me!groupBy & ", " _
& Me!aggreg1 & "(" & field1 & ") as " & name1
If Not IsNull(Field2) And Not IsNull(aggreg2) Then
' Aggregrate equation for 2
grafsql = grafsql & ", " & Me!aggreg2 & _
"(" & Me!Field2 & ") as " & Me!Name2
' Table To Be Viewed
grafsql = grafsql & " from " & Me!tableReq
' Where filter for drilldown
If Not IsNull(Me!txtFilter) Then
grafsql = grafsql & " where " & Me!txtFilter
' Consolidate By
grafsql = grafsql & " group by " & Me!groupBy & ";"
sqlOK = MsgBox(grafsql, vbYesNo, _
"Send This SQL to the Graph")
If sqlOK = vbYes Then
' Send it the sql to both the graph object
' and the header field in the form
grafName = "frmGraf3D"
Forms(grafName)!Graph1.RowSource = grafsql
Forms(grafName)!Main_Title.Caption = grafsql
MsgBox "Fill In Table, First, Consolidate Fields", _
vbInformation, "Problem with The Graphing"
This event code will produce a valid sql statement that is displayed in a message box before you apply it to the graph object. The SQL that made the graph in figure 5 is as follows
SELECT format([shippedDate], "yyyy-mm"), Sum(Freight) as totalFreight from NWOrders where (ShipCity = 'Boise') group by format([shippedDate], "yyyy-mm");
Microsoft Access has a very powerful query engine that can be used to analyse data through consolidation queries. If you add simple where filters to these queries and then switch the fields that you consolidate the data on, you can start to gain a good understanding of your valuable data. All of this can be done manually using consolidation queries or even by driving the Microsoft graph wizard. Expand on this approach by automating the creation of the consolidation queries using visual basic and then apply this SQL to your graphs so that you can quickly discover trends in your data that you did not know exist. This really only scratches the surface of data mining but if it offers you an insight into your information that can make or save your organization money, then you may be encouraged to invest more resources into this rapidly expanding field of computing.
The Software Database
The supporting Access accdb database called quikGraf.accdb includes the following objects
NwOrders The orders table from the Access 97 demonstration database (see Fig 1)
ZWorldDemo A trial table of 300 sales and budget results for practicing
z_sysTables A table that will hold the data mining definitions of all tables in your database
z_sysColumns A table that will hold the data mining definitions of all the field columns in the database.
FX_YourDrilldown A form that allows you to data mine your data using filters. This form includes code behind the form illustrated in this help file
FrmGraf3D A form that displays the data mining results in graphical form
FX_DrillDownCode A module that contains functions to fill the data mining system tables
The file name in the downloads from SWReg is called quikGraf.accdb
Click here to find out how to purchase all the downloads