Lookup Tables – Improve data quality in your database

<< Click to Display Table of Contents >>

Navigation:  The Toolbox >

Lookup Tables – Improve data quality in your database

Author: Doug Thatcher from GR-FX, February 2002

Eliminating data errors at the point of data entry can pay big dividends in the future.

This article describes all the steps involved in changing an existing text box with no data entry validation into a watertight combo box using a lookup table, as well as several options that can be used to help reduce incorrect data entry.

The sample database is similar to many that I have worked on where, at the original time of the database design, a text box was deemed to be adequate for the data that was to be entered. Over several years of data entry into this field by different personnel with no form of control as to what is entered, always results in data that is wide and varied. Circumstances often change over time and the need to now use this data in management reporting and business analysis can't be achieved with the data the way it is, as it can't be selected, sorted or grouped in any meaningful manner.

The sample database uses a basic data entry form 'frmTimeLog' (see Figure 1) and shows the text box field 'JobName' that will be used throughout the article.

lu1

Figure 1. The TimeLog data entry form.  

A combo box that uses it's own data as the Row Source

This option can often be used as a first step in improving data entry. It involves changing the 'JobName' text box to a combo box and then displaying all the previously entered job name records excluding duplicates.

With the 'frmTimeLog' form in design view, right click on the 'JobName' text box and use the 'Change To' & 'Combo Box' options, then in the Row Source property click on the ''¦' build button and create an SQL Statement query (see Figure 2).

lu2

  Figure 2. Row Source SQL Query Builder.

In our sample database this option has reduced the number of possible records displayed from 24 to 16, but in a real database situation it could have reduced many hundreds of records to 16. Check that the 'Limit To List' property is set to 'No' to allow for new job names to be entered.

Fixing the junk data by using update queries

Our combo box is now showing 16 job names but as you can see there is still a lot of junk data e.g. there are 3 variations of ABCTV2 but we can improve this by fixing the existing data. At this point in time a policy would need to be established for the format of all new and pre-existing job names, I would suggest to the users of the database that:

1) all job names have a number as the last digit and if there is the possibility of more than 9 in any sequence (and less the 100) that the first job name be ''¦..01' and not ''¦..1' for sorting reasons.

2) all job names contain no spaces.

3) the maximum length be 18 characters.

The steps involved in fixing the junk are:

Create a select query 'qryCheckJobNames' (see Figure 3) which has 3 important uses: it can be used to determine all the records that need to be fixed with a count of each, it can be used to Cut/Paste the records that need to be changed and it can be run after each update query has been run to check the results.

lu3

Figure 3. Select query 'qryCheckJobNames'

Create another select query 'qryFixJobNames' (which will later be changed to an update query) and Cut/Paste from 'qryCheckJobNames' into the Criteria field each variation of job name that needs to be fixed. (see Figure 4)

lu4

 Figure 4. Select query 'qryFixJobNames'

Change the 'qryFixJobNames' Select query to an Update query and enter 'ABCTV02' into the 'Update To:' field and run the query using the ! button. Confirm the number of rows to be updated is as expected (i.e. 4 in this case) by clicking the Yes button.

To check the results just run the 'qryCheckJobNames', you should now see ABCTV02 with a count = 4.

Modify the 'qryFixJobNames' back to select query and repeat the previous process for each job name.

Warning: Setting up and running many update queries requires full concentration and should only be run after a full backup of the database has been done and with no users on.

The final run of the 'qryCheckJobNames query should show that we have now reduced the job names down from 16 to 10.

Time to get serious with a lookup table

Our new 'JobName' combo box now shows 10 Job Names but another problem has now been highlighted i.e. that we are displaying current job names as well as some job names that have become redundant. This will only cause confusion at the time of data entry. To fix this problem we will use a lookup table that has a Yes/No field to determine which job names are displayed.

The steps involved to create the lookup table are:

Create a new select query 'qryCreateJobNamesLUtable' (see Figure 5)

lu5

Figure 5. The select query 'qryCreateJobNamesLUtable'

Change the select query to a Make Table query, name the new lookup table 'tlkp_JobNames' and then Run the query. At the prompt confirm that 10 rows are to be pasted into the new table.

Open the lookup table to check it is OK, then in design view make JobName the primary key, change the Field Size to 18 and change the Required property to Yes. Then add a second field called 'CurrentJob' as a Yes/No data type, set the Default Value to Yes and also change the Required property to Yes. Change to datasheet view and click the CurrentJob check box for all the current jobs. (see Figure 6)

lu6

Figure 6. The tlkp_JobNames lookup table.

Modify the main 'tblTimeLog' table to use the new lookup table. Open the 'tblTimeLog' in design view and click on the Lookup tab of the JobName field and change the Display Control to Combo Box then click on the Row Source build ''¦' button and set up the SQL query as per (Figure 7). The Field Size must also be changed to 18, the same as the lookup table; if they are different then problems will occur.

lu7

 Figure 7. Row Source SQL Query Builder.

Add the new lookup table to the database relationships using the Tools/Relationships options and creating a One-To-Many relationship and set 'Enforce Referential Integrity' and 'Cascade Update Related Fields' to true. (see Figure 8).

lu8

Figure 8. New database Relationships.

Modify the data entry form 'frmTimeLog' to use the new lookup table. The best way to do this is to open the form in design view and delete the current JobName control, and then open the Field List and drag/drop the JobName control onto the form in the same position. This will ensure that all the modifications to the table design and relationships that we have just done will be implemented. The results of all our efforts can be seen in Figure 9, the number of job names is now only 6!

lu9

Figure 9. Job Name is down from 24 to 6.

Fix the Tab Order

Something that is often overlooked when adding and deleting form controls is the Tab order. Any new control that is added to a form is allotted the form's next highest Tab order number, so it will always be last. After all our good work it's not much use to a data entry person if the cursor is jumping all over the screen. To test the Tab order of any form just open it in normal view and hit the Tab key and make note of where it goes. In our case we can use the Auto Order option to correctly re-adjust the Tab order. With the form in design view just right click and select Tab Order and then use the Auto Order option, which sorts the controls in a logical top to bottom and left to right order. You must be careful when using the Auto Order option as some forms can be very complex having controls for data entry, display only, hidden, command buttons etc and by using the Auto Order option the form's original Tab order can be totally messed up.

One little trick that I would use when just deleting and then adding a control in the same position, as in our example, is to take note of the Tab Index property of the control immediately above 'JobName', then just change JobName's Tab Index to this number plus one. Access is very clever in that it will also add one to all the controls above this number, thus maintaining the correct Tab order.

Adding a new Job Name

Now that we have fixed all the old Job Name data and have set up a combo box that only displays the 6 current job names we must look to the future and decide how new job names will be entered into the database. This again is a policy issue that should be decided by the users of the database. There are 2 obvious options i.e. data entry personnel or someone in a supervisory capacity. In both cases the 'Limit To List' property must be set to 'Yes', this will shut the gate on junk data getting into the database.

Option (1) - If the Supervisor option is adopted then only those staff with security access to the 'tlkp_JobNames' lookup table will be able to add new job names (database security is a another whole topic which is too complex to discuss here). If data entry personnel attempt to enter a new job name they will receive the standard Access error message 'œThe text you entered isn't an item in the list.'

Option (2) '“ If the User option is adopted then the following code should be added to the form's 'On Not In List' event.

Private Sub JobName_NotInList(NewData As String, Response As Integer)

Dim D As DAO.Database

Dim R As DAO.Recordset

Dim pos As Integer

Dim char As String

Dim maxseqno As Integer

Dim NewJobalpha As String

Dim NewJobseqno As Integer

' Check the JobName length is OK

If Len(NewData) > 18 Or Len(NewData) < 3 Then

  GoTo Error_Formatting:

End If

' Check first part of JobName is only alpha characters

For pos = 1 To Len(NewData) - 2

  char = (Mid(NewData, pos, 1))

  If Not (char >= "A" And char <= "z") Then

    GoTo Error_Formatting:

  End If

Next

' Check last 2 characters are numeric

If Not IsNumeric(Right(NewData, 2)) Then

  GoTo Error_Formatting:

End If

' Check the JobName is the next in the sequence

NewJobalpha = Left(NewData, (Len(NewData) - 2))

NewJobseqno = Right(NewData, 2)

If NewJobseqno <> "01" Then

  Set D = CurrentDb

  Set R = D.OpenRecordset("tlkp_JobNames")

  Do Until R.EOF

    If Left(R!JobName, (Len(R!JobName) - 2)) = NewJobalpha Then

      If Right(R!JobName, 2) > maxseqno Then

        maxseqno = Right(R!JobName, 2)

      End If

    End If

    R.MoveNext

  Loop

  Set R = Nothing

  If NewJobseqno <> maxseqno + 1 Then

    MsgBox "The new Job Name you have entered is not the next" &  _

    " number in the sequence." & vbCrLf & "The next sequence" & _

    " number for: '" & NewJobalpha & "' should be '" & _

    Format((maxseqno + 1), "00") & "'", vbExclamation, _

    "Sequence Number   Error"

    Response = acDataErrContinue

    Exit Sub

  End If

End If

' Prompt the user to add the new Job Name.

If MsgBox("'" & NewData & "' is not in the list." & vbCrLf & _

 "Would you like to add it?", vbYesNo + vbQuestion, _

 "New Job Name") = vbNo Then

  Response = acDataErrContinue

  Me!JobName.Undo

Else

  Set D = CurrentDb

  Set R = D.OpenRecordset("tlkp_JobNames")

  R.AddNew

  R("JobName") = NewData

  R.Update

  Response = acDataErrAdded

End If

Set R = Nothing

Exit Sub

Error_Formatting:

MsgBox "The new Job Name format you have entered is incorrect," & vbCrLf & _ "It should be alpha characters + 2 numeric characters eg 'Car01'", _ vbExclamation, "Format Error"

Response = acDataErrContinue

Exit Sub

End Sub

This code tests the new job name for all possible format errors and also checks the correct sequence number against pre-existing jobs.

The code will also force the user to confirm that the new job name is correct before being added and also gives them the opportunity to reject any junk data. Another benefit of using lookup tables and setting up the correct Relationships i.e. 'Enforce Referential Integrity' and 'Cascade Update Related Fields' set to true, is that if a new job name is inadvertently misspelt and is not realised until after it has been used several times by data entry personnel, it can easily be corrected. The incorrect job name only needs to be changed once in the lookup table and all the incorrect fields in the 'tblTimeLog' table will also be automatically corrected.

Summary

1  The initial database design is so important '“ any inkling that a text box field with unrestricted data entry could possibly be used in the future for a more meaningful purpose should be set up as a list box or combo box.

2  Depending on the database application there is an obvious time versus accuracy trade off between very fast keyboard data entry personnel that can key in anything into a text box as opposed to selecting a pre-existing item in a combo/list box.

key Your Sample Database Is Called   "lookups.zip"

 

Click here to find out how to purchase all the downloads