Fuzzy Dates

<< Click to Display Table of Contents >>

Navigation:  Dates and Time >

Fuzzy Dates

Rebecca M. Riordan          
 

This month, Rebecca Riordan strays from the subject of building interfaces in Access to look at the subject of handling dates. She begins by examining the DateTime data type and the joys of VBA date functions, but the core of this article is how to store values that represent periods in time. As Rebecca points out, much date-like data can't (or at least shouldn't) be stored in traditional DateTime fields.

Microsoft Jet, SQL Server, and Access VBA all support a DateTime data type (see the sidebar, "Counting Time," for details of the supported types and the values they can contain). Of course, Jet and SQL Server engines handle dates differently–anything else would be too easy–but by and large the DateTime data types work well for storing and manipulating dates in the vast majority of business applications.

Access VBA supports a wide range of functions for manipulating dates and times. The majority of these are straightforward, but you must always be careful about manipulating the time component of a DateTime or a VBA date variant.

For example, the Date function returns the current system date, while Now returns both the date and the time, and you have to be careful when you mix two functions. If, for example, you use Date as the default value of a field on one form, and Now on another, comparisons may return unexpected results. This expression will always return False because of the addition of the time value in Now:

Date = Now

Unless you really do care about the time component, always use Date rather than Now when creating DateTime values, and for safety, use the DateValue function for comparisons. The expression will return True, because DateValue will strip off the time component:

Date = DateValue(Now)

DateValue is one of two functions that can be used to convert an arbitrary value to a date variant. It operates on any single value that can be converted to a date. Most often, it's used with a string value:

Dim myDate as Date

myDate = DateValue("1 January, 1990")

The other date conversion function is DateSerial, which accepts the day, month, and year components as separate integer arguments and then returns a Date value:

Dim myDate as Date

myDate = DateSerial(1990, 1, 1)

VBA (as opposed to Access VBA) provides equivalent functions, TimeValue and TimeSerial for creating time values, but surprisingly, no function for creating a value that combines the two in one step. In fact, VBA's behavior in this area is somewhat odd. If you pass a time value to DateSerial, for example, it won't be included in the value returned, although the function will generate an error if the time passed is invalid. TimeSerial behaves the same way: It will produce an error if passed an invalid date, but otherwise ignores the value.

 

Adding dates and times

If you do need to create a value that contains both date and time components, you must add the two values:

Dim myDateTime as Date

myDateTime = #1/1/1990# + #1:23 PM#

In this example, I've used literal values for the date and time (date and time literal values are always surrounded by hash marks in the same way that string values are always surrounded by quotes). However, you can also use functions or expressions:

Dim myDate as Date

myDate = Date + TimeValue(Now)

Notice that I've explicitly extracted the time component from the value returned by the Now function. If you fail to do this (or the equivalent for your particular expression), VBA will convert both values to their floating point representation, add them, and then convert them back to a date. The results are, well, odd. For example, this function:

Sub CreateDateTime()

  Debug.Print Date

  Debug.Print Now

  Debug.Print Date + TimeValue(Now)

  Debug.Print Date + Now

End Sub

returned the following values when I ran it just now:

12/4/2004

12/4/2004 2:38:43 PM

12/4/2004 2:38:43 PM

11/9/2109 2:38:43 PM

Now, this is a slightly pathological example since Now returns both the date and time components, so there's no point in adding it to the value returned by Date, but it does demonstrate the problem: When you use the + operator between a date and a time, you get a DateTime; when you use the + operator between two DateTimes, you get strange results (this is known in the trade as a "trap for young players").

If you do need to add date or time values, VBA provides the extremely useful DateAdd function, which will work for both date and time components. The trick is that it only allows you to add or subtract a single component at a time. So, if you need to add a year and a day to a date, you must do so in two steps:

Dim myDate as Date

Dim newDate as Date

myDate = Date

newDate = DateAdd("yyyy", 1, myDate)

newDate = DateAdd("d", 1, myDate)

This can be tedious but, in reality, you just don't need to do this very often. Despite its name, DateAdd will also allow you to subtract a value simply by providing a negative number as the second argument:

newDate = DateAdd("m", -1, myDate)

The function allows you to specify a year, quarter, month, day of year, day, weekday, week, hour, minute, or second to be added or subtracted. The result is always a valid date. Given the idiosyncrasies of a calendar that attempts to accommodate both the lunar and solar cycles, this saves an enormous amount of work. I've been at this long enough to remember having to translate "30 days hath September..." into an algorithm. Not my idea of fun.

The other powerhouse VBA date function is DateDiff, which returns an integer specifying the number of intervals between two dates. The intervals are the same as those that can be supplied to the DateAdd function–year, quarter, month, day of year, day, weekday, week, hour, minute, or second–and the same constraint applies: Only one interval can be specified at a time.

In all of these functions I've used the VBA variant data type, which corresponds to the Jet DateTime data type. Since the range of values for a SQL Server DateTime is different from the Jet DateTime, you'll need to be careful of your ranges when working with an Access Project. Again, in real life, this isn't often a problem, but if you're dabbling around the edges of the year 1753 (the limit for SQL Server), it can bite you.

Storing dates

Of course, not all of the applications we build with Microsoft Access are business applications, and even in a business application, DateTime isn't always the best choice for holding "date-like" data. The most obvious example of this, of course, is when you're dealing with calendar dates that fall outside the range that can be handled by your data engine. Dates earlier than 100, the earliest date supported by Microsoft Jet, can easily occur whenever you're dealing with artifacts of any type or historical events, and dates earlier than the 1753 supported by SQL Server are even more common.

When presented with this situation, you have a couple of choices. You can manipulate the intrinsic date type. I've seen developers use the time component of a DateTime field to determine the offset from the base date, for example. Personally, I think this is really, really ugly, and it requires low-level manipulation of the data similar to bitmasking, which is at serious odds with relational theory.

A better solution is to split the components of the date into multiple fields. A Byte field will easily handle days and months, while an Integer or Long Integer will handle years. For those of you who worry about such things, two Bytes and a Long take up six bytes, which is two bytes less storage space than a single 8-byte DateTime. Which isn't to say, of course, that you should routinely avoid the DateTime data type. When you're working with complete dates or times within the valid range, DateTime is the correct choice, and saves manipulation down the line.

In practice, your storage space will often take even less room, since it's unlikely that you'll need all three fields. As I'm sure you remember, Rembrandt's "The Night Watch" was completed in 1642 and Machiavelli's The Prince was first printed in 1532, but you can't assign a day or month to either date (it's not your memory that's fading; no one else knows, either). This situation can occur even within the range of dates valid for DateTime fields. So, for example, the date of the Spring 2005 collection should be stored in two fields, not in a single DateTime field. Storing the data in a single DateTime field would force you to assign an arbitrary month and day. Two fields (called, perhaps, Season and Year) would let you store all the information that you have. That may sound obvious, but it can be surprisingly difficult to avoid the knee-jerk "this is a date, stick it in a DateTime field" reaction.

In situations where dates are incomplete, they're often also fuzzy. This is particularly true when dealing with artifacts, but the situation can arise even in business situations. Scheduling systems, in particular, can require that projects be scheduled "sometime in the Spring... March or April, perhaps."

Sometimes in these situations, four DateTime fields or four sets of Integer fields will work. This is simple enough to implement. Slightly more complicated are multiple requirements: "We're committed to start this project on January 15, but that project will be sometime in August" or "'The Night Watch' was painted in 1642, Caravaggio's 'Judith and Holofernes' was done circa 1599, and Rubens' 'The Love Garden' was painted between 1632 and 1634. Which of these paintings were available in 1601?" To answer this question, you have to define what "circa" means so that you can begin to think about how you will store it.

The obvious solution to this situation is to use multiple fields, one for each possible way of specifying the date(s). Unfortunately, this makes querying complex because you have, in essence, created a repeating group. If you needed to find all the paintings painted between 1600 and 1635, for example, you'd need a query like the following:

SELECT * FROM PAINTINGS

WHERE ([Year] Between 1600 And 1635)

OR ([StartYear] >= 1600)

OR ([EndYear] <= 1635)

And, of course, the more options, the more fields you need to interrogate. Obviously, this can get ugly. You could use subclassing to store the dates in separate tables. Unfortunately, subclassing doesn't help very much in this situation. In fact, it makes it more complicated, because you're not performing conditional joins on multiple tables.

A slightly awkward, but workable, solution is to add a couple more fields (more fields?) to the table for start and end dates, and then massage the dates entered. This requires extra processing during data entry, and some negotiation about how certain values should be interpreted–does circa xxxx mean plus or minus five years, or one year, or can it even be generalized?–but it does greatly simplify certain types of queries. If users will be issuing ad hoc queries, this is pretty much a requirement.

For instance, you could use a BaseYear field and a DateAdjust field. With this design:

• "The Night Watch" would be stored with a Year of 1642 and a DateAdjust of 0.

• "Judith and Holofernes" would be stored with a Year of 1599 and a DateAdjust of -3 (deciding that "circa" means that the painting was certainly done by 1599 and must have been painted within the previous three years).

• "The Love Garden" would be stored with a Year of 1632 and a DateAdjust of 2.

The SQL statement to find all the paintings between 1600 and 1635 would look like this:

SELECT * FROM PAINTINGS

  WHERE (BaseYear < 1600 And

           BaseYear + DateAdjust > 1635)

  Or BaseYear Between 1600 And 1635

  Or BaseYear + DateAdjust Between 1600 and 1635

The real solution here is to carefully consider both the structure of the data and the way in which it will be used. This should come as no surprise; it's the solution to every database problem. Dates and times have their own particular set of foibles, as we've seen, but it always comes down to careful analysis and design.

 

Sidebar: Counting Time

SQL Server, Jet, and VBA all provide intrinsic support for date and time values. SQL Server provides two data types, DateTime and SmallDateTime. The DateTime data type can track dates from 1 January 1753 to 31 December 9999. It stores time to the millisecond. The value is stored as two 4-byte integers. The first integer represents the number of days before or after 1 January 1900, and the second integer represents the number of milliseconds after midnight.

The SmallDateTime data type can track dates from 1 January 1900 to 6 June 2079. It stores time to the second, but is only accurate to the minute. A SmallDateTime is stored as two 2-byte integers. The first represents the number of days after 1 January 1900, while the second byte is the number of minutes since midnight.

With either type, if you store a time with no date, the date is set to 1 January 1900; a date with no time is stored as Midnight. These are, of course, zero-offset values.

The Microsoft Jet DateTime data type and the VBA date variant can track dates from 1 January 100 to 31 December 9999. The value is stored as a single 8-byte floating-point number, with the integer portion representing the number of days before or after 30 December 1899, and the decimal portion representing the percentage of the day since midnight.

If you store only a time in a Microsoft Jet DateTime field or a VBA variant, the date value will be set to 30 December 1899. Conversely, if you store only a date, the time will be set to Midnight. Again, these are zero-offset values.

 

Your download file is called 502RIORDAN.ZIP in the file SA2005-02down.zip

This can be purchased with all the other downloads on this page

 

Other Pages That You Might Like to Read
 

Soundex: "Close" Only Counts in Horseshoes

How to Convert Case and Calculate Total Time

Access Developer Needs a Date

Playing Tag with Validation