Access Answers: These Are a Few of My Favorite Things

<< Click to Display Table of Contents >>

Navigation:  Smart Access 1996-2006 > Aug-2005 >

Access Answers: These Are a Few of My Favorite Things

Doug Steele          

In the spirit that "everything is data," Doug Steele looks at how to build a database of favorites (bookmarks) from data extracted from several different Web browsers. Along the way, he provides a very efficient routine for processing files and a number of helper functions for processing files (including INI files).

 

Is it possible to import my Favorites from Internet Explorer and store them in a table?

For those of you who haven't looked, Internet Explorer stores each entry in your Favorites list as a separate file (with an extension of .url). For example, the favorite I have saved for the Smart Access Web site is named Pinnacle Publishing Smart Access.url, and its content is:

[DEFAULT]

BASEURL=http://www.smartaccessnewsletter.com/ME2/

Audiences/Default.asp

[InternetShortcut]

URL=http://www.smartaccessnewsletter.com/ME2/Audiences/

Default.asp

Modified=B04FC6954E6BC501D9

That means that your question can be viewed as having three separate parts:

1. Find all of the favorites files.

2. Read the contents of each file.

3. Store the information.

 

How do I find all files meeting a specific criterion?

There are three common approaches to running through files on a hard disk:

• Use the Dir function.

• Use FileSystemObject (FSO) from the Scripting library.

• Use the FindFirstFile, FindNextFile, and FindClose APIs.

One basis for making the decision is which gives the best performance. Tests I did yielded the results shown in Table 1. These results reflect how long it took to find the 5491 files on my hard drive. As you can see, using the API functions is the fastest (no surprise, since the other methods are probably using those functions themselves).

Table 1. Relative speeds of different file enumeration methods.

Method

Time (seconds)

Using Dir

1.222

Using FSO

10.285

Using APIs

0.191

In the interest of space, I'm only going to talk about using the Dir function, since it combines efficiency with ease of understanding. (If you want information on the other methods, I wrote an article about these functions for the May 2005 issue of Access Advisor.)

The VBA Dir function "returns a String representing the name of a file, directory, or folder that matches a specified pattern or file attribute, or the volume label of a drive" (to quote the VBA Help file). The first time you call the Dir function, you must provide a string expression that specifies a file name (which may include folder and drive, as well as wildcards). Dir returns the first file name that matches the pattern supplied. To get any additional file names that match the pattern, you call Dir again with no arguments.

The Dir function works fine to list all of the files in a single folder or to get a list of all subfolders in a given folder. The problem, though, is that you can't nest Dir calls, making it difficult to create a recursive function to handle files in subfolders.

One solution to processing subfolders is to store subfolders in a collection or array as the Dir function finds them. When you finish processing a directory, you'll have a list of all of the subfolders in the directory. You can then use the Dir function to find the files in each subfolder. This sounds like a useful routine and, to make it generic, all that's necessary is to have the function accept a starting directory and a file pattern as parameters. My version of the routine uses a collection to hold the list of subfolders and is designed to run recursively. In actual fact, I use two separate collections in the routine: the "main" collection that holds all of the files found and that gets passed between routines, and the "local" collection that's local to the routine that holds the folders found currently processed directory. Here's the routine:

Public Sub FindFiles( _

  StartDir As String, _

  FilePattern As String, _

  FileList As Collection _

)

Dim strFile As String

Dim strFolder As String

Dim strSubFolder As String

Dim varFolder As Variant

Dim colSubfolders As Collection

Assuming that a starting directory was passed, I first ensure that the folder name ends in a slash. I've created a helper function called QualifyFolderPath to do this (see the code after this code listing for what the QualifyFolderPath function looks like):

  If Len(StartDir) > 0 Then

    strFolder = QualifyFolderPath(StartDir)

I now add the name of each file in the folder to the "main" collection:

    strFile = Dir$(strFolder & FilePattern)

    Do While Len(strFile) > 0

      FileList.Add strFolder & strFile

      strFile = Dir$

    Loop

I then build a list of subfolders, adding each subfolder's name to the "local" collection:

    Set colSubfolders = New Collection

    strSubFolder = Dir$(strFolder, vbDirectory)

    Do While Len(strSubFolder) > 0

      If strSubFolder <> "." And _

        strSubFolder <> ".." Then

        If (GetAttr(strFolder & strSubFolder) _

          And vbDirectory) = vbDirectory Then

          strSubFolder = strFolder & strSubFolder

          colSubfolders.Add strSubFolder

        End If

      End If

      strSubFolder = Dir$

    Loop

Finally, I recursively process each of the subfolders found above. Because I'm enumerating the elements of a collection that contains strings rather than objects, I have to use a variant to enumerate the loop. However, the function is expecting a string, so I have to convert the variant back to a string after pulling it from the collection:

    For Each varFolder In colSubfolders

      Call FindFiles(CStr(varFolder), _

        FilePattern, FileList)

    Next varFolder

  End If

End Sub

As mentioned earlier, here's QualifyFolderPath, which ensures that each folder ends with a slash:

Private Function QualifyFolderPath( _

  PathName As String _

) As String

  If Len(PathName) > 0 Then

    If Right$(PathName, 1) <> "\" Then

      QualifyFolderPath = PathName & "\"

    Else

      QualifyFolderPath = PathName

    End If

  Else

    QualifyFolderPath = ""

  End If

End Function

In order to use this routine to determine all of the favorites stored on the machine, you have to know where to look. The folder in which Internet Favorites are stored is a so-called "Special Folder," and there are API calls that can be used to determine the location of any Special Folder. The following code uses the SHGetSpecialFolderLocation and SHGetPathFromIDList API calls to return the location of the user Favorites folder (which, in Windows XP, is usually going to be C:\Documents and Settings\<user id>\Favorites):

Private Const CSIDL_FAVORITES = &H6

Private Const MAX_PATH = 260

Private Const S_OK = 0

Private Declare Function SHGetPathFromIDList _

  Lib "shell32" Alias "SHGetPathFromIDListA" ( _

  ByVal pidl As Long, _

  ByVal pszPath As String _

) As Long

Private Declare Function _

SHGetSpecialFolderLocation Lib "shell32" ( _

  ByVal hwndOwner As Long, _

  ByVal nFolder As Long, _

  pidl As Long _

) As Long

Private Declare Sub CoTaskMemFree Lib "ole32" _

  (ByVal pv As Long)

Function GetFavoritesFolder() As String

Dim strPath As String

Dim lngPIDL As Long

  If SHGetSpecialFolderLocation(0, _

    CSIDL_FAVORITES, lngPIDL) = S_OK Then

    strPath = Space$(MAX_PATH)

    If SHGetPathFromIDList( _

      ByVal lngPIDL, ByVal strPath) Then

      GetFavoritesFolder = _

        Left(strPath, InStr(strPath, Chr(0)) - 1)

    End If

    Call CoTaskMemFree(lngPIDL)

  End If

End Function

Now that you have a way of determining where to start looking for the favorites, you can use the FindFiles routine using code like this:

Public Sub TestFind()

Dim colFiles As Collection

Dim intLoop As Integer

Dim strStartDir As String

Dim strMessage As String

Dim varFile As Variant

  strStartDir = GetFavoritesFolder()

  Set colFileList = New Collection

  Call FindFiles(strStartDir, "*.url", colFiles)

  If colFiles.Count = 1 Then

    strMessage = "There is 1 file under "

  Else

    strMessage = "There are " & _

      colFileList.Count & " files under "

  End If

  strMessage = strMessage & strStartDir

  Debug.Print strMessage

  For Each varFile In colFileList

    Debug.Print varFile

  Next varFile

  Set colFiles = Nothing

End Sub

 

How can I read the contents of a Favorites file?

Take another look at the contents of Pinnacle Publishing Smart Access.url that I showed earlier. It turns out that the file can contain a lot of different pieces of information in it, but the relevant part for our exercise (the actual URL associated with the favorite) is the following:

[InternetShortcut]

URL=http://www.smartaccessnewsletter.com/ME2/Audiences/

Default.asp

You may notice that that this file's format is identical to an INI file, a fact that we can use to our advantage, since Windows' GetPrivateProfileString API is designed for reading INI files.

The parts of an INI file are:

[section]

key=value

The declaration for the GetPrivateProfileString is:

Private Declare Function _

  GetPrivateProfileString Lib "kernel32" _

  Alias "GetPrivateProfileStringA" ( _

    ByVal lpSectionName As String, _

    ByVal lpKeyName As Any, _

    ByVal lpDefault As String, _

    ByVal lpReturnedString As String, _

    ByVal nSize As Long, _

    ByVal lpFileName As String _

) As Long

You'll need to supply values for all but lpReturnedString:

lpSectionName–The section from the INI file schematic shown earlier ("InternetShortcut" in the case of the Favorite file).

lpKeyName–The key from the INI file schematic shown earlier ("URL" in the case of the Favorite file).

lpDefault–What value to use if the section/key combination isn't found file.

lpReturnedString–A buffer that's been initialized to a fixed number of spaces that will receive the value of the key in the INI file.

nsize–The number of spaces to which the buffer has been initialized.

lpFileName–The name of the INI file (in our case, the name of the Favorite file).

If the function call succeeds, the function returns the number of characters copied to the buffer, not including the terminating null character.

So, to read the value of the URL key in the InternetShortcut section of a Favorites file, you'd use code similar to this:

Function GetURL(FileName As String) As String

Dim lngSize As Long

Dim lngSuccess As Long

Dim strReturn As String

  strReturn = Space$(2048)

  lngSize = Len(strReturn)

  lngSuccess = GetPrivateProfileString( _

    "InternetShortcut", "URL", _

    "", strReturn, lngSize, FileName)

  If lngSuccess > 0 Then

    GetURL = Left$(strReturn, lngSuccess)

  End If

End Function

 

How can I store the Favorites information?

What I did was design a table that could hold the site information. Much to my surprise, I discovered that some of the URLs associated with favorites were in excess of 255 characters, so I was forced to use a Memo field to hold the URL. The fields I used in my table are listed in Table 2.

Table 2. My "Favorites" table.

Field name

Field type

FavoriteFolderNm

Text(255)

SiteNm

Text(255)

URLTx

Memo

I decided that I wouldn't store the entire name of the Favorites files. Instead, I'd store only that part of the path after whatever was returned by GetFavoritesFolder. In other words, if my favorites are stored under D:\Documents and Settings\DJSteele\Favorites and I have a specific Favorite stored in D:\Documents and Settings\DJSteele\Favorites\Microsoft\MSDN Advanced Search.url, I'll store Microsoft as the FavoriteFolderNm, and MSDN Advanced Search as the SiteNm.

Having said all that, here's code that can find all of the favorites and store them in the database. The routine is passed the starting directory to search for the Favorites file:

Public Sub FindFavorites( _

  Optional StartDir As String = "" _

)

Dim colFavorites As Collection

Dim lngStartFolderLength As Long

Dim strFile As String

Dim strFolder As String

Dim strStartFolder As String

Dim strSQL As String

Dim varFile As Variant

I then make sure that the program knows where to look for the favorites files (and that the starting folder ends with a back slash):

  If Len(StartDir) = 0 Then

    strStartFolder = _

      QualifyFolder(GetFavoritesFolder())

  Else

    strStartFolder = QualifyFolder(StartDir)

  End If

  lngStartFolderLength = Len(strStartFolder) + 1

With the initialization just about done, I then create a new collection to hold the file names, and call FindFiles, passing it the appropriate values to return the list of favorites files:

  Set colFavorites = New Collection

  Call FindFiles(strStartFolder, "*.url", _

    colFavorites)

If I do find some favorites files, I loop through them. For each entry in the collection of Favorite files, I separate the full path into the Folder and File names, stripping off the "unique" part of the folder, and removing the .url from the end of the file name. Now that I have the file name, I retrieve the URL from within the file using the GetURL function shown earlier, and use a SQL Insert Into statement to store the values in my table:

  If colFavorites.Count > 0 Then

    For Each varFile In colFavorites

      strFile = Dir$(varFile)

      strFolder = Mid$(Left$(varFile, _

        Len(varFile) - Len(strFile)), _

        lngStartFolderLength)

      strFile = Left$(strFile, Len(strFile) - 4)

      If Right$(strFolder, 1) = "\" Then

        strFolder = Left$(strFolder, _

          Len(strFolder) - 1)

      End If

      strSQL = "INSERT INTO Favorites ( _

        FavoriteFolderNm, SiteNm, URLTx) " & _

        "VALUES(" & FixText(strFolder) & ", " & _

        FixText(strFile) & ", " & _

        FixText(GetURL(CStr(varFile))) & ")"

      CurrentDb.Execute strSQL, dbFailOnError

    Next varFile

  End If

End Sub

Here again, I use another helper function–FixText, which ensures that the quotes in the SQL statement are appropriate:

Private Function FixText( _

  InputText As String, _

  Optional Delimiter As String = "'" _

) As String

Dim strTemp As String

  strTemp = Delimiter

  strTemp = strTemp & Replace(InputText, _

    Delimiter, _

    Delimiter & Delimiter)

  strTemp = strTemp & Delimiter

  FixText = strTemp

End Function

 

What about Favorites (or Bookmarks) for browsers other than Internet Explorer?

Firefox and Netscape essentially store their bookmarks as a Web page. Here's an extract of a Firefox bookmark file (yes, even though it says !DOCTYPE NETSCAPE-Bookmark-file-1, it is Firefox!):

<!DOCTYPE NETSCAPE-Bookmark-file-1>

<!-- This is an automatically generated file.

     It will be read and overwritten.

     DO NOT EDIT! -->

<META HTTP-EQUIV="Content-Type"

  CONTENT="text/html; charset=UTF-8">

<TITLE>Bookmarks</TITLE>

<H1 LAST_MODIFIED="1117918695">Bookmarks</H1>

<DL>

    <p>

    <DT>

        <H3 ADD_DATE="1102425622"

          ID="rdf:#$j.0eT1">Dell</H3>

      <DL>

        <p>

        <DT>

         <A HREF="http://www.dell.com/"

             ADD_DATE="1102425622"

             ID="rdf:#$l.0eT1">Dell</A>

        <DT>

          <A HREF="http://support.dell.com/"

             ADD_DATE="1102425622"

             ID="rdf:#$n.0eT1">Support.Dell.com</A>

      </DL>

      <p>

    <DT>

        <A HREF="http://communities.microsoft.com/

         newsgroups/default.asp?icp=whidbey&amp;

         slcid=us"

         ADD_DATE="1102425622" LAST_VISIT="1115580162"

         LAST_CHARSET="ISO-8859-1"

         ID="rdf:#$v.0eT1">Microsoft Newsgroups</A>

    <DT>

        <A HREF="http://www.microsoft.com/downloads/

         details.aspx?FamilyId=FE118952-3547-420A-A412

         -00A2662442D9&displaylang=en"

         ADD_DATE="1102425622"

         ID="rdf:#$z.0eT1">Office 2003 XML Schemas</A>

</DL>

<p>

Each bookmark (or favorite) has its information stored in an HTML Anchor element, with the HREF element containing the URL, and what's between the <A> and </A> being the name of the bookmark. While a number of different elements may be included in the <A> tag, I'm only concerned with the value of the HREF element.

As before, that means that the problem can be viewed as having three separate parts:

1. Find all of the Anchor elements in the file.

2. Determine the contents of each Anchor element.

3. Store the information.

Reading the contents of a file into a variable is straightforward: Just determine the size of the file (using the FileLen function) and initialize a buffer to that size. You can then use the VBA Open statement to open the file for input, the VBA Get statement to read the contents of the file into a variable, and the VBA Close statement to close the file when complete, as this code does:

Function GetContentsOfFile( _

  WhatFile As String _

) As String

Dim intFile As Integer

Dim lngFileSize As Long

Dim strContents As String

  If Len(Dir(WhatFile)) > 0 Then

    lngFileSize = FileLen(WhatFile)

    strContents = Space(lngFileSize)

    intFile = FreeFile()

    Open WhatFile For Binary As intFile

    Get #intFile, , strHTML

    Close #intFile

  End If

  GetContentsOfFile = strContents

End Function

Once the contents of the file have been stored in a variable, the Split function can be used to divide the HTML into the various Anchor elements by splitting on every occurrence of <A. What's in each element of the resultant array will have to be parsed again into the URI (Uniform Resource Identifier, the official name for what's stored as the HREF element) and the Anchor value (the text between <A ...> and </A>).

To pull out the data, I created two more "helper functions." The first is GetHREF, which extracts the URI. It begins by declaring some variables:

Private Function GetHREF( _

  Anchor As String _

) As String

Dim lngStart As Long

Dim lngEnd As Long

Dim lngTotalLength As Long

Dim strDelimiter As String

Dim strHREF As String

I then check to see whether the expression HREF= occurs in Anchor text:

  lngTotalLength = Len(Anchor)

  lngStart = InStr(1, Anchor, _

    "HREF=", vbTextCompare)

The URI must be delimited after HREF=. That delimiter might be either a single quote (', or Chr$(39)) or a double quote (", or Chr$(34)). However, there can also be white space after HREF=, so I need to handle that possibility. I start by looking at the character after the = sign, which is five positions after lngStart, and then keep looking until the first ' or " is found:

  If lngStart > 0 Then

    lngStart = lngStart + 5

    strDelimiter = Mid(Anchor, lngStart, 1)

    Do While strDelimiter <> Chr$(39) And _

      strDelimiter <> Chr$(34)

      lngStart = lngStart + 1

      If lngStart > lngTotalLength Then

        strDelimiter = vbNullString

        Exit Do

      End If

      strDelimiter = Mid(Anchor, lngStart, 1)

    Loop

Assuming a delimiter was found, I then look for the next occurrence of the delimiter that I found at the start. Whatever is between is the URI:

    If Len(strDelimiter) > 0 Then

      lngEnd = InStr(lngStart + 1, _

        Anchor, strDelimiter, vbTextCompare)

      If lngEnd > 0 Then

        strHREF = Mid(Anchor, lngStart + 1, _

          lngEnd - lngStart - 1)

      End If

    End If

  End If

  GetHREF = strHREF

End Function

My second helper routine is GetLinkName, and it finds the name of the favorite. Determining the actual name of the link in the Anchor element is fairly straightforward. I find where the first > character occurs in the Anchor element, and then find where the closing </A> occurs. What's between those two positions is the name of the link:

Private Function GetLinkName( _

  Anchor As String _

) As String

Dim lngStart As Long

Dim lngEnd As Long

Dim strDelimiter As String

Dim strLink As String

  lngStart = InStr(1, Anchor, ">")

  If lngStart > 0 Then

    lngEnd = InStr(lngStart + 1, HTML, "</A>")

    If lngEnd > 0 Then

      strLink = Trim$(Mid(HTML, lngStart + 1, _

        lngEnd - lngStart - 1))

    End If

  End If

  GetLinkName = strLink

End Function

Using these various building blocks, it's possible to read the bookmark file and store its content in my table:

Sub ReadBookmarkFile(BookmarkFile As String)

Dim lngLoop As Long

Dim strCurrElement as String

Dim strHREF As String

Dim strHTML As String

Dim strName As String

Dim strOutput As String

Dim strSQL As String

Dim varElements As Variant

I'm a suspicious guy, so I first make sure that a valid Bookmark file was actually passed. I have to check for both the length of the value and the length returned by the Dir function. If an empty string is passed, then Dir("") will give you a file in the current directory, which isn't what I want:

  If Len(BookmarkFile) > 0 Then

    If Len(Dir(BookmarkFile)) > 0 Then

Once I get the contents of the file, using one of my helper functions, I split it into the contents into individual Anchor elements:

      strHTML = GetFileContent(BookmarkFile)

      varElements = MySplit(strHTML, "<A", _

        -1, vbTextCompare)

For each Anchor element found, I use my last two helper functions to get the favorite's URL and the name of that URL. I then write that information to my table:

      For lngLoop = LBound(varElements) To _

        UBound(varElements)

        strCurrElement = varElements(lngLoop)

        If InStr(1, varElements(lngLoop), _

          "</A>", vbTextCompare) > 0 Then

          If InStr(1, strCurrElement, _

            "HREF", vbTextCompare) > 0 Then

            strHREF = _

              GetHREF(strCurrElement)

            strName = _

              GetLinkName(strCurrElement)

            strSQL = "INSERT INTO Favorites " & _

              "(FavoriteFolderNm, SiteNm, " & _

              "URLTx) " & _

              "VALUES(" & _

              FixText(BookmarkFile) & ", " & _

              FixText(strName) & ", " & _

              FixText(strHREF) & ")"

            CurrentDb.Execute strSQL, _

              dbFailOnError

          End If

        End If

      Next lngLoop

    End If

  End If

End Sub

And I'm not done yet–but I've run out of space. Next month, I'll address another Web browser (Opera) and show how to read links from Internet Web pages that have collections of URLs.

Downloads

 

Your download file is called 508STEELE.ZIP in the file SA2005-08down.zip

This is found in theon this page