Exploring Your Data Visually

<< Click to Display Table of Contents >>

Navigation:  Smart Access 1996-2006 > Aug-1998 >

Exploring Your Data Visually

Garry Robinson            
 
Graf-FX is a shareware application written entirely in Access (and available in all current versions). It's designed to provide tools to explore your databases with graphs and queries using a technique called data mining. It's also a quick way to generate/prototype Access Graphs without running the Wizards.

 

Here is the full magazine in PDF format - This is Garry's first every article

 

 
Microsoft Access provides a good foundation for the database technique called data mining. To be exact, Access's strengths in this area are:

interfaces to most database formats

industrial-strength SQL queries

the very flexible Microsoft Graph.

 
In this article, I'm going to introduce you to the field of data mining. You can explore with me by using Graf-FX. Graf-FX is a fully functional shareware data mining application, and three versions are available for Download -- Access 2.0, Access 95, and Access 97. I'm going to use the Access 97 version of the Graf-FX (GRAF8-FX.MDE) and the data in the Northwind database that ships with Access 97.

199808_gr1
Figure 1 - The Graf-FX Interface

 

199808_gr2
Figure 2

 

199808_gr3
Figure 3

 

199808_gr4
Figure 4

 
 
Data mining essentially allows you (or your users) to explore your data. This means providing your users with a list of the data sets available to be explored and allowing them to review that data. Since the data is normally too detailed to be useful, data mining usually assumes that the user will want to summarize the data in a variety of ways. Unfortunately, summarizing conceals information in the detail data that the user wants to get at.
 
To handle the problems created by summarization, data mining tools provide the ability to use drilldowns. A drilldown allows the user to see the various categories within the summarized data. A set of sales data might have several different categories: salesperson, geographical area, customer, and so forth. A data mining user should be able to select a category and see a summarized data item broken out by the various items in the category. For instance, by selecting geographical area, a data mining user would be able to see the single sales figure broken down into the sales for the North, South, East, and West regions. This drilldown process should be able to be repeated until no more summarization is possible. In this sales example, for instance, it should be possible to keep drilling down until you reach a single sale to a single customer.
 
While data mining tools are almost synonymous with graphing, that shouldn't prevent the user from looking at the actual numbers. Data mining is most useful when the user can see a graph that visually shows the trends over a period of time, or the distribution over sales areas. However, the user should always have the option to abandon the visual display and see the actual numbers behind it.
 
Enough theory! The best way to see what data mining is like is to do it. The following tutorial will let you see what a data mining session is like. I'll also lift up the hood to show you how you Graf-FX makes it all happen.
 
Set-up
Graf-FX has been configured in a front-end software database rather than as an Access add-in. To use Graf-FX to start experimenting with data mining your data, you'll first need to link/attach your database to the Graf-FX database (you'll also need to import any queries that you want to mine for data). If you build your own data mining tool, my recommendation is to create a separate software database for it. Data mining will produce a lot of additional query and form objects that would clutter up an existing Access application.
 
After opening up GRAF8-FX.MDE and experimenting with the trial data, change to the database container and link to all the tables in NORTHWIND.MDB. You should now have links to the tables shown in column 1 of Table 1.
 
Table 1. Linked tables and imported queries from the Northwind.mdb demo database.

Table name

Query name

[Order Details]

Invoices

Categories

[Order Details Extended]

Customers

[Order Subtotals]

Employees

[Product Sales for 1995]

Orders

[Quarterly Orders by Product]

Products

[Quarterly Orders]

Shippers

[Sales by Category]

Suppliers

 

 
 
Now change to the query container and import just the queries shown in column 2 of Table 1. You can import all the queries, but the queries in Table 1 seemed more useful for data mining than some of the others. Test the queries first to make sure that the links are working. Graf-FX will support blank spaces in query and table names, but it's not a practice that should be encouraged.
 
Drilling down
The following example leads to an analysis of the performance of two sales representatives in Anchorage. To start drilling down, open Graf-FX's Launchpad form. The numbered discussion points refer to the highlighted numbers in Figure 1.
 
1. Select the Invoices query in the field to the right of the Select ? prompt. This combo will give you a list of all the tables and queries that exist.
 
2. Select the Quantity field from the Invoice query field list as a summary field. You can show up to eight different summary fields at once (only numerical fields can be selected).
 
3. Select the Sum aggregate function to be used to summarize the field. You can choose from Sum, Avg, Count, Max, Min, StDev, Variance, First, Last, and Weighted Average.
 
4. Select the Country field for the group by consolidation of Invoices data. Normally, you'll be doing the "group by" queries in data mining that use string fields rather than numerical fields.
 
5. Now the software generates the query:
 

Select Country,  Sum(Quantity) 

From Invoices  

Group By Country; 

 
This query is used to extract the data that will create your graph. Now, you'll probably want to review the graph. To enable you to do that, Graf-FX opens an Access form that has a predefined MS Graph object with a title text box. The software changes the row source of the graph object to the SQL query that it just generated and then updates the title text box with a text string that reflects the selected information.
 
6. Now you decide to drill down on the USA invoice information. In this case, the combo box will show you a list of all the unique entries for Country from the Invoice query.
 
7. You now select a new grouping for the level 2 query: City. This allows you to look at the Quantity totals for all the Cities in the USA using the graph button on level 2. Graf-FX generates this SQL statement:
 

Select City, Sum(Quantity)  

From Invoices  

Where  Country = 'USA'  

Group By City;

 
 
You'll notice that this is very similar to the SQL for the first level, except that there's now a Where clause. Essentially, this Where clause is the drilldown component. This is very similar to using filter by example in an Access table or form. In Figure 2, this new query becomes the row source for an MS Graph object in a form.
 
8. Back to Figure 1 -- rather than generate a graph, you might want to have a look at the figures that the graph is based on. The query results are shown in Table 2.
 
Table 2. Drilldown query shown using a few rows in Datasheet view.

City

Sum quantity

Albuquerque

1383

Anchorage

603

Boise

4958

Butte

59

Elgin

122

Eugene

345

 
 
You can continue filling out the drilldowns to the fourth and fifth levels and experiment with the graphs and queries as you go. On the fourth level, you'll see that the drilldown now allows a combination of >= and <=. On the fifth level, the constraint is a pair of equal signs (levels 4 and 5 are only available when you register the software). A fifth-level drilldown query looks like this:
 

Select ShippedDate, Sum(Quantity) 

From Invoices  

Where  Country = 'USA'  

  And City = 'Anchorage'  

  And (ProductName >= 'Filo Mix' 

        And ProductName <= 'Gula Malacca') 

  And (Salesperson = 'Andrew Fuller' 

       Or Salesperson = 'Janet Leverling') 

Group By ShippedDate "

 
 
Exploring your data using 3D queries
 
Access has a powerful extension to the normal select query called crosstab queries (Excel users call them pivot tables).
 
Graf-FX allows you to analyze data using a crosstab query in conjunction with your data mining. A crosstab query displays summarized values (sums, counts, and averages) from one field in a table. The columns of the crosstab query are generated on the fly from the data in another column. As a result, a crosstab query groups data by one set of facts listed down the left side of the datasheet and another set of facts listed across the top of the datasheet. These queries become quite powerful when they're combined with the 3D graphs that are part of Microsoft Graph. Figure 3 shows a 3D histogram that's been fed with a query from the Invoice table
 
To start drilling down in 3D, use the same initial settings as before by following Steps 1, 2, 3, and 4 in Figure 4. The following numbered discussion points refer to the highlighted numbering in Figure 4.
 
1. Select the Z_Graf3Dhisto graph form from a list of usable forms with graph objects. Graf-FX allows you to first save, then customize the special graph forms to suit your special requirements. Things you might change would be the size of the form, adding corporate logos, or else the orientation of the 3D graph.
 
2. Select the field OrderDate from the Invoice query field list. You'll see that the field displayed in this combo box is actually a function, CalYYYYQ_FX(OrderDate). This is a VBA function that returns a four-digit year number concatenated with a number that represents the quarter the date falls in. The graph in Figure 4 shows these converted dates. To build this special function in Graf-FX, you'd right-click in the field where you added OrderDate and choose the FlashBuilder.
 
3. The Sum aggregate function is still used to summarize the computed crosstab fields.
 
4. Select the Quantity field for the 3D consolidation of Invoices data. As crosstab queries only have one field to store the results, you only select one consolidation field.
 
5. Now you can produce the graph shown in Figure 3 by clicking on the first-level 3D graph button.
 
6. To drill down in 3D, choose USA as your drilldown item. Then select Postal Code as your next 3D sub-grouping field.
 
7. Clicking on the crosstab button on the second level of drilldowns will produce the following SQL and output (shown in Table 3).
 
Table 3. Some columns of a crosstab query that shows quantities for Postal Codes by quarters for USA invoice results.

Postal Code

1994-3

1994-4

1995-1

1995-2

1995-3

59801

 

 

 

 

26

82520

24

74

126

 

 

83720

 

241

649

72

742

87110

174

231

168

148

171

94117

 

 

 

 

31

97201

 

15

 

 

 

97219

 

33

 

14

 

97403

 

 

 

20

93

 

Transform Sum(Quantity) As sumQuantity

Select PostalCode  

From Invoices  

Where  Country = 'USA'  

Group By PostalCode PIVOT CalYYYYQ_FX(OrderDate);

 
 
Graf-FX always shows the field selected in Step 2 as one axis of the 3D view, and the other axis is the grouping on the level that you select the graph or the query. The Where clause of the query exactly corresponds with the drilldowns in 2D, so you can actually use both techniques together to explore your data visually.
 
Conclusion
The technology that's deployed in Graf-FX makes use of the fact that Microsoft Graph can be fed with a valid select query. The select queries that are deployed are generally consolidation queries using the Group By clause. The drilldown technology involves taking the original base consolidation query and then filtering the data using a series of cascading filters, implemented as Where clauses. Graf-FX enhances this drilldown process by allowing users to also view the data through 3D graphs with crosstab queries.
 
Since Graf-FX is a shareware product, you can always download a trial copy of the software from http://www.gr-fx.com/graf-fx.htm. Most of the software featured in this article is available for use in the shareware mode. We welcome any useful user feedback on the product.
 
 
 
Garry Robinson  is the founder of GR-FX Pty Limited, a company based in Sydney, Australia. Garry specializes in database software for the mining industry (geology, metallurgy, environmental). The interest in data mining stemmed from a desire to finally couple 15 years of database experience with work undertaken for a master's degree in 3D graphics completed in 1982.

 

You May Also Wish To Read

Analyze Your Data in Space

Creating Charts/Graphs in an Access 2007 Database