Chapter 13. Using threads in conjunction with the BDE, Exceptions and DLLs.
DLL's and Multiprocess
Dynamic link libraries, or DLL's, allow a programmer to share executable
code between several processes. They are commonly used to provide shared
library code for several programs. Writing code for DLL's is in most respects
similar to writing code for executables. Despite this, the shared nature
of DLL's means that programmers familiar with multithreading often use
them to provide system wide services: that is, code which affects
several processes that have the DLL loaded. In this chapter, we will look
at how to write code for DLL's that operates across more than one process.
Process scope. A single threaded DLL.
Global variables in DLL's have process wide scope. This means that if two
separate processes have a DLL loaded, all the global variables in the DLL
are local to each process. This is not limited to variables in the user's
code: it also includes all global variables in the Borland run time libraries,
and any units used by code in the DLL. This has the advantage that novice
DLL programmers can treat DLL programming in the same way as executable
programming: if a DLL contains a global variable, then each process has
its own copy. Furthermore, this also means that if a DLL is invoked by
processes which contain only one thread, then no special techniques are
required: the DLL need not be thread safe, since all the processes have
completely isolated incarnations of the DLL.
We can demonstrate this with a simple DLL which does nothing but store an
integer. It exports a couple of functions that enable an application to
read and write the value of that integer. We can then write a simple test
application which uses this DLL. If several copies of the application are
executed, one notes that each application uses its own integer, and no
interference exists between any of the applications.
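A DLL of the kind just described might be sketched as follows. This is only a minimal illustration: the library name IntStore and the exported names GetStoredValue and SetStoredValue are hypothetical, and are not taken from the original example code.

```pascal
library IntStore; { hypothetical library name }

var
  { Global variable: every process that loads the DLL gets its own copy. }
  StoredValue: Integer;

{ Returns the integer held by this process's incarnation of the DLL. }
function GetStoredValue: Integer; stdcall;
begin
  Result := StoredValue;
end;

{ Overwrites the integer held by this process's incarnation of the DLL. }
procedure SetStoredValue(NewValue: Integer); stdcall;
begin
  StoredValue := NewValue;
end;

exports
  GetStoredValue,
  SetStoredValue;

begin
  { Main body of the DLL: nothing to set up in this sketch. }
end.
```

Since StoredValue is an ordinary global variable, a value written by one copy of a test application is never seen by another copy.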
Writing a multithreaded DLL.
Writing a multithreaded DLL is mostly the same as writing multithreaded
code in an application. The behaviour of multiple threads inside the DLL
is the same as the behaviour of multiple threads in a particular application.
As always, there are a couple of pitfalls for the unwary:
The main pitfall one can fall into is the behaviour of the Delphi memory
manager. By default, the Delphi memory manager is not thread safe. This
is for efficiency reasons: if a program only ever contains one thread,
then it is pure wasted overhead to include synchronization in the memory
manager. The Delphi memory manager can be made thread safe by setting the
IsMultiThread variable to true. This is done automatically for a given
module if a descendant class of TThread is created.
The problem is that an executable and the DLL consist of two separate
modules, each with their own copy of the Delphi memory manager. Thus, if
an executable creates several threads, its memory manager is multithreaded.
However, if those threads call a DLL loaded by the executable, the
DLL memory manager is not aware of the fact that it is being called by
multiple threads. This can be solved by setting the IsMultiThread variable
in the DLL. It is best to set this by using the DLL entry point function,
covered later.
The second pitfall occurs as a result of the same problem: that of having
two separate memory managers. Memory managed by the Delphi memory manager
that is passed between the DLL and the executable cannot be allocated in
one and disposed of in the other. This occurs most often with long strings,
but can also occur with memory allocated using New or GetMem, and disposed
using Dispose or FreeMem. The solution in this case is to include ShareMem,
a unit which keeps the two memory managers in step using techniques discussed
later.
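As a rough sketch of the ShareMem arrangement, the library below passes a long string back to its caller. The names StringDll and MakeGreeting are hypothetical. ShareMem has to be the first unit in the uses clause of the library, and likewise the first unit in the uses clause of the project file of any executable that uses it.

```pascal
library StringDll; { hypothetical library name }

uses
  ShareMem, { must come first, here and in the calling project }
  SysUtils;

{ Returns a long string allocated inside the DLL. The caller will
  eventually free it, which is only safe when both modules share a
  memory manager through ShareMem. }
function MakeGreeting(const Name: string): string;
begin
  Result := 'Hello, ' + Name;
end;

exports
  MakeGreeting;

begin
end.
```

If long strings and other memory manager allocations never cross the DLL boundary (for instance, if only PChar buffers owned by the caller are passed), ShareMem is not required.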
DLL Set-up and Tear down.
Mindful of the fact that DLL programmers often need to be aware of how
many threads and processes are active in a DLL at any given time, the Win32
system architects provide a method for DLL programmers to keep track of
thread and process counts in a DLL. This method is known as the DLL Entry
Point Function.
In an executable, the entry point (as specified in the module header)
indicates where program execution should start. In a DLL, it points to
a function that is executed whenever an executable loads or unloads the
DLL, or whenever an executable that is currently using the DLL creates
or destroys a thread. The function takes a single integer argument which
can be one of the following values:
DLL_PROCESS_ATTACH: A process has attached itself to the DLL. If this is
the first process, then the DLL has just been loaded.
DLL_PROCESS_DETACH: A process has detached from the DLL. If this is the
only process using the DLL, then the DLL will be unloaded.
DLL_THREAD_ATTACH: A thread in the process has attached to the DLL. This
will happen once when the process loads the DLL, and subsequently whenever
the process creates a new thread.
DLL_THREAD_DETACH: A thread has detached from the DLL. This will happen
whenever the process destroys a thread, and finally when the process unloads
the DLL.
As it turns out, DLL entry points have two characteristics which can lead
to misunderstandings and problems when writing entry point code. The first
characteristic occurs as a result of the Delphi encapsulation of the entry
point function, and is relatively simple to work around. The second occurs
as a result of thread context, and will be discussed later on.
1: The Delphi encapsulation of the Entry Point Function.
Delphi uses the DLL entry point function to manage initialization and finalization
of units within a DLL as well as execution of the main body of DLL code.
The DLL writer can put a hook into the Delphi handling by assigning an
appropriate function to the variable DLLProc. The default Delphi handling
works as follows:
The DLL is loaded, which results in the entry point function being called.
Delphi uses this to call the initialization of all the units in the DLL,
followed by the main body of the DLL code.
The DLL is unloaded, resulting in two calls to the entry point function,
with the arguments DLL_PROCESS_DETACH.
Now, the application writer only gets code to execute in response to the
entry point function when the DLLProc variable points to a function. The
correct point to set this up is in the main body of the DLL. However, this
is in response to the second call to the entry point function. In short,
what this means is that when using the entry point function in the DLL,
the Delphi programmer will never see the first process attachment to the
DLL. As it turns out, this isn't such a huge problem: one can simply assume
that the main body of the DLL is called in response to a process loading
the DLL, and hence the process and thread count is 1 at that point. Since
the DLLProc variable is replicated on a per process basis, even if more
processes attach themselves later, the same argument applies, since each
incarnation of the DLL has separate global variables.
In case the reader is still confused, I'll present an example. Here is a
modified DLL that contains a unit with a function that displays a message.
As you can see, the main body, unit initialization and DLL entry point
hooks all contain "ShowMessage" calls which enable one to trace what is
going on. In order to test this DLL, here is a test application. It consists
of a form with a button on it. When the button is clicked, a thread is
created, which calls the procedure in the DLL, and then destroys itself.
So, what happens when we run the program?
The DLL reports units initialization
The DLL reports main DLL body execution
Every time the button is clicked the DLL reports:
Entry point: thread attach
Entry point: thread detach
Note that if we spawn more than one thread from the application, whilst
leaving existing threads blocked on the Unit Procedure message box, the
total thread attachment count can increase beyond one.
When the program is closed, the DLL reports entry point: process detach,
followed by unit finalization.
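A hook of the kind described in this section might be sketched as follows. This is an illustration only, not the original example: the library name TraceDll, the procedure EntryPointHook and the counter ThreadAttachCount are hypothetical, and a counter replaces the ShowMessage calls.

```pascal
library TraceDll; { hypothetical library name }

uses
  Windows;

var
  { Hypothetical counter. Like all DLL globals it is per process. }
  ThreadAttachCount: Integer;

{ Called by the Delphi run time library for every entry point call
  made after DLLProc has been assigned. }
procedure EntryPointHook(Reason: Integer);
begin
  case Reason of
    DLL_THREAD_ATTACH:
      begin
        { More than one thread may now call the DLL, so make this
          module's memory manager thread safe. }
        IsMultiThread := True;
        InterlockedIncrement(ThreadAttachCount);
      end;
    DLL_THREAD_DETACH:
      InterlockedDecrement(ThreadAttachCount);
    DLL_PROCESS_DETACH:
      ; { last chance to release per process resources }
  end;
end;

begin
  { The main body runs in response to the process attaching, so the
    first attachment is accounted for here rather than in the hook. }
  ThreadAttachCount := 1;
  { Install the hook as the very last step of initialization. }
  DLLProc := @EntryPointHook;
end.
```

The first process attachment never reaches EntryPointHook, which is why the counter starts at 1 in the main body.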
Writing a multiprocess DLL.
Armed with a knowledge of how to use the entry point function, we will
now write a multiprocess DLL. This DLL will store some information on a
system wide basis using memory shared between processes. It is worth remembering
that when code accesses data shared between processes, the programmer must
provide appropriate synchronization. Just as multiple threads in a single
process are not inherently synchronized, so the main threads in different
processes are also not synchronized. We will also look at some subtleties
which occur when trying to use the entry point function to keep track of
global thread counts.
This DLL will share a single integer between processes, as well as keeping
a count of the number of processes and threads in the DLL at any one time.
It consists of a header file shared between the DLL and applications that
use the DLL, and a project file. Before we look more closely at the code,
it's worth reviewing some Win32 behaviour.
some Win32 behaviour.
Global named objects.
The Win32 API allows the programmer to create various objects. For some
of these objects, they may be created either anonymously, or with a certain
name. Objects created anonymously are, on the whole, limited to use by
a single process, the exception being that they may be inherited by child
processes. Objects created with a name can be shared between processes.
Typically, one process will create the object, specifying a name for that
object, and other processes will open a handle to that object by specifying
the same name.
The delightful thing about named objects is that handles to these objects
are reference counted throughout the system. That is, several processes
can acquire handles to an object, and when all the handles to that object
are closed, the object itself is destroyed, and not before. This includes
the situation where an application crashes: typically Windows does a good
job of cleaning up unused handles after a crash.
The DLL in detail.
Our DLL uses this property to maintain a memory mapped file. Normally,
memory mapped files are used to create an area of memory which is a mirror
image of a file on disk. This has many useful applications, not least "on
demand" paging in of executable images from disk. For this DLL however,
a special case is used whereby a memory mapped file is created with no
corresponding disk image. This allows the programmer to allocate a section
of memory which is shared between several processes. This is surprisingly
efficient: once the mapping is set up, no memory copying is done between
processes. Once the memory mapped file has been set up, a global named
mutex is used to synchronize access to that portion of memory.
Initialization consists of four main stages:
Creation of synchronization objects (global and otherwise).
Creation of shared data.
Initial increment of thread and process counts.
Hooking the DLL entry point function.
In the first stage, two synchronization objects are created, a global mutex,
and a critical section. Little needs to be said about the critical section.
The global mutex is created via the CreateMutex API call. This call has
the beneficial feature that if the mutex is named, and the named object
already exists, then a handle to the existing named object is returned.
This occurs atomically. Were this not the case, then a whole range of unpleasant
race conditions could potentially occur. Determining the precise range
of possible problems and potential solutions (mainly involving optimistic
concurrency control) is left as an exercise to the reader. Suffice to say
that if operations on handles to global shared objects were not atomic,
the application level Win32 programmer would be staring into an abyss...
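The first stage might be sketched along the following lines. The unit name SyncSetup, the mutex name and the variable names are all hypothetical, and error handling is kept to a minimum.

```pascal
unit SyncSetup; { hypothetical unit name }

interface

uses
  Windows, SysUtils;

var
  GlobalMutex: THandle;           { shared by every process using the DLL }
  LocalLock: TRTLCriticalSection; { private to this process }

procedure CreateSyncObjects;

implementation

const
  { Assumed name. Every process must use exactly the same string. }
  MutexName = 'HypotheticalSharedDataMutex';

procedure CreateSyncObjects;
begin
  InitializeCriticalSection(LocalLock);
  { Either creates the named mutex, or returns a handle to the existing
    one. The system performs this check and create step atomically. }
  GlobalMutex := CreateMutex(nil, False, MutexName);
  if GlobalMutex = 0 then
    raise Exception.Create(SysErrorMessage(GetLastError));
end;

end.
```

The second argument is False, so the mutex is not initially owned by the creating thread, and every process acquires it in the same way with WaitForSingleObject.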
In the second stage the area of shared memory is set up. Since we have
already set up the global mutex, it is used when setting up the file mapping.
A view of the "file" is mapped, which maps the (virtual) file into the
address space of the calling process. We also check whether we happened
to be the process that originally created the file mapping, and if this
is the case, then we zero out the data in our mapped view. This is why
the procedure is wrapped in a mutex: CreateFileMapping has the same nice
atomicity properties as CreateMutex, ensuring that race conditions on handles
will never occur. In the general case, however, the same is not necessarily
true for the data in the mapping. If the mapping had a backing file, then
we might be able to assume validity of the shared data at start-up. For
virtual mappings this is not assured. In this case we need to initialize
the data in the mapping atomically with setting up a handle to the mapping,
hence the mutex.
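As a sketch of this second stage, the procedure below creates or opens a mapping with no disk file behind it, maps a view, and zeroes the data only if this process created the mapping. The unit, record and mapping names are hypothetical, and the record layout is an assumption made for illustration.

```pascal
unit SharedData; { hypothetical unit name }

interface

uses
  Windows, SysUtils;

type
  PSharedInfo = ^TSharedInfo;
  TSharedInfo = record { assumed layout of the shared data }
    SharedInteger: Integer;
    ProcessCount: Integer;
    ThreadCount: Integer;
  end;

var
  MappingHandle: THandle;
  Shared: PSharedInfo;

procedure SetupSharedData(GlobalMutex: THandle);

implementation

const
  MappingName = 'HypotheticalSharedDataMapping'; { assumed name }

procedure SetupSharedData(GlobalMutex: THandle);
var
  CreateError: DWORD;
begin
  { Hold the global mutex so that creating the mapping and
    initializing its contents appear as one atomic step. }
  WaitForSingleObject(GlobalMutex, INFINITE);
  try
    { INVALID_HANDLE_VALUE requests a mapping with no file on disk. }
    MappingHandle := CreateFileMapping(INVALID_HANDLE_VALUE, nil,
      PAGE_READWRITE, 0, SizeOf(TSharedInfo), MappingName);
    CreateError := GetLastError; { read before any other API call }
    if MappingHandle = 0 then
      raise Exception.Create(SysErrorMessage(CreateError));
    Shared := MapViewOfFile(MappingHandle, FILE_MAP_ALL_ACCESS, 0, 0, 0);
    if Shared = nil then
      raise Exception.Create(SysErrorMessage(GetLastError));
    { Only the process that actually created the mapping clears it. }
    if CreateError <> ERROR_ALREADY_EXISTS then
      FillChar(Shared^, SizeOf(TSharedInfo), 0);
  finally
    ReleaseMutex(GlobalMutex);
  end;
end;

end.
```

GetLastError is read immediately after CreateFileMapping because a successful call still reports ERROR_ALREADY_EXISTS when the named mapping was already present.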
In the third stage, we perform our first manipulation on the globally
shared data, by incrementing the process and thread counts, since the execution
of the main body of the DLL is consistent with the addition of another
thread and process to those using the DLL. Note that the AtomicIncThreadCount
procedure increments both the local and global threads counts whilst both
the global mutex and process local critical section have been acquired.
This ensures that multiple threads from the same process see a fully consistent
view of both counts.
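A procedure with this behaviour might look roughly like the sketch below. Only the name AtomicIncThreadCount comes from the text. The unit name, the record layout and the variables are assumptions, and the synchronization objects and mapped view are assumed to have been set up in the earlier stages.

```pascal
unit ThreadCounts; { hypothetical unit name }

interface

uses
  Windows;

type
  PSharedInfo = ^TSharedInfo;
  TSharedInfo = record { assumed layout of the shared data }
    ProcessCount: Integer;
    ThreadCount: Integer;
  end;

var
  GlobalMutex: THandle;           { assumed created in the first stage }
  LocalLock: TRTLCriticalSection; { assumed initialized in the first stage }
  Shared: PSharedInfo;            { assumed mapped in the second stage }
  LocalThreadCount: Integer;      { threads in this process only }

procedure AtomicIncThreadCount;

implementation

procedure AtomicIncThreadCount;
begin
  { Always take the local lock first and the global mutex second, so
    that all callers acquire the two locks in the same order. }
  EnterCriticalSection(LocalLock);
  try
    WaitForSingleObject(GlobalMutex, INFINITE);
    try
      Inc(LocalThreadCount);
      Inc(Shared^.ThreadCount);
    finally
      ReleaseMutex(GlobalMutex);
    end;
  finally
    LeaveCriticalSection(LocalLock);
  end;
end;

end.
```

Because both counts change while both locks are held, no thread in this process can observe one count updated and the other not.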
In the final stage, the DLLProc is hooked, thus ensuring that the creation
and destruction of other threads in the process is monitored, and
the final exit of the process is also registered.
An application using the DLL.
A simple application that uses the DLL is presented here. It consists of
the shared unit, a unit containing the main form, and a subsidiary unit
containing a simple thread. Five buttons exist on the form, allowing
the user to read the data contained in the DLL, increment, decrement and
set the shared integer, and create one or more threads within the application,
just to verify that local thread counts work. As expected, the thread counts
increment whenever a new copy of the application is executed, or one of
the applications creates a thread. Note that the thread need not directly
use the DLL in order for the DLL to be informed of its presence.
Thread context in Entry Point Functions.
Instead of using a simple application, let's try one that does something
more advanced. In this situation, the DLL is loaded manually by the application
programmer, instead of being automatically loaded. This is possible by
replacing the previous form unit with this one. An extra button is added
which loads the DLL, and sets up the procedure addresses manually. Try
running the program, creating several
threads and then loading the DLL. You should find that the DLL no longer
correctly keeps track of the number of threads in the various processes
that use it. Why is this? The Win32 help file states that when using the
entry point function with the arguments DLL_THREAD_ATTACH and DLL_THREAD_DETACH:
"DLL_THREAD_ATTACH Indicates that the current process
is creating a new thread. When this occurs, the system calls the entry-point
function of all DLLs currently attached to the process. The call is made
in the context of the new thread. DLLs can use this opportunity to initialize
a TLS slot for the thread. A thread calling the DLL entry-point function
with the DLL_PROCESS_ATTACH value does not call the DLL entry-point function
with the DLL_THREAD_ATTACH value.
Note that a DLL's entry-point function is called with
this value only by threads created after the DLL is attached to the process.
When a DLL is attached by LoadLibrary, existing threads do not call the
entry-point function of the newly loaded DLL."
It drives the point home by also stating:
"DLL_THREAD_DETACH Indicates that a thread is exiting
cleanly. If the DLL has stored a pointer to allocated memory in a TLS slot,
it uses this opportunity to free the memory. The operating system calls
the entry-point function of all currently loaded DLLs with this value.
The call is made in the context of the exiting thread. There are cases
in which the entry-point function is called for a terminating thread even
if the DLL never attached to the thread.
The thread was the initial thread in the process, so the
system called the entry-point function with the DLL_PROCESS_ATTACH value.
The thread was already running when a call to the LoadLibrary
function was made, so the system never called the entry-point function
for it."
This behaviour has two potentially unpleasant side effects.
It is not possible, in the general case, to keep track of how many threads
are in the DLL on a global basis unless one can guarantee that an application
loads the DLL before creating any child threads. One might mistakenly assume
that an application loading a DLL would have the DLL_THREAD_ATTACH entry
point called for already existing threads. This is not the case because,
having guaranteed that thread attachments and detachments are notified
to the DLL in the context of the thread attaching or detaching, it is impossible
to call the DLL entry point in the correct context of threads that are
already running.
Since the DLL entry point can be called by several different threads, race
conditions can occur between the entry point function and DLL initialization.
If a thread is created at about the same time as the DLL is loaded by an
application, then it is possible that the DLL entry point might be called
for the thread attachment whilst the DLL main body is still being executed.
This is why it is always a good idea to set up the entry point function
as the very last action in DLL initialization.
Readers would benefit from noting that both these side effects have repercussions
when deciding when to set the IsMultiThread variable.
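Manual loading of the kind discussed in this section can be sketched as below. The DLL file name and the exported function GetThreadCount are hypothetical. The point of the sketch is ordering: the DLL is loaded before the application creates any child threads, so that every later thread produces a DLL_THREAD_ATTACH notification.

```pascal
program LoadFirst; { hypothetical program name }

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

type
  { Assumed signature of a function exported by the DLL. }
  TGetThreadCount = function: Integer; stdcall;

var
  Lib: HMODULE;
  GetThreadCount: TGetThreadCount;
begin
  { Load the DLL first, while the process still has a single thread. }
  Lib := LoadLibrary('HypotheticalCounts.dll');
  if Lib = 0 then
  begin
    Writeln('Load failed: ', SysErrorMessage(GetLastError));
    Halt(1);
  end;
  try
    { Set up the procedure address manually. }
    @GetThreadCount := GetProcAddress(Lib, 'GetThreadCount');
    if Assigned(GetThreadCount) then
      Writeln('Threads known to the DLL: ', GetThreadCount)
    else
      Writeln('GetThreadCount is not exported by this DLL');
    { Child threads would be created only from this point onwards. }
  finally
    FreeLibrary(Lib);
  end;
end.
```

An application that cannot guarantee this ordering should not rely on the DLL's thread counts, and the DLL should then set IsMultiThread unconditionally rather than waiting for a thread attachment.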
When writing robust applications, the programmer should always be prepared
for things to go wrong. The same is true for multithreaded programming.
Most of the examples presented in this tutorial have been relatively simple,
and exception handling has mostly been omitted for clarity. In real world
applications, this is likely to be unacceptable.
Recall that threads have their own call stack. This means that an exception
in a thread does not fall through the standard VCL exception handling mechanisms.
Instead of raising a user-friendly dialog box, an unhandled exception
in a thread will terminate the application. As a result of this, the execute
method of a thread is one of the few places where it can be useful to create
an exception handler that catches all exceptions. Once an exception has
been caught in a thread, dealing with it is also slightly different from
ordinary VCL handling. It may not always be appropriate to show a dialog
box. Quite often, a valid tactic is to let the thread communicate the fact
that a failure has occurred to the main VCL thread, using whatever communication
mechanisms are in place, and then let the VCL thread decide what to do.
This is particularly useful if the VCL thread has created the child thread
to perform a particular operation.
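One way to arrange this is sketched below. The class TWorkerThread, its DoWork method and its ErrorMessage property are hypothetical. The thread catches every exception in Execute and records a description, which the VCL thread can inspect later, for instance in the thread's OnTerminate handler, which runs in the context of the main VCL thread.

```pascal
unit WorkerThread; { hypothetical unit name }

interface

uses
  Classes, SysUtils;

type
  TWorkerThread = class(TThread)
  private
    FErrorMessage: string;
  protected
    procedure Execute; override;
    procedure DoWork; virtual; { stands in for the real operation }
  public
    { Empty if the operation succeeded. Read it once the thread has
      finished, for example from OnTerminate. }
    property ErrorMessage: string read FErrorMessage;
  end;

implementation

procedure TWorkerThread.DoWork;
begin
  { Simulated failure, for illustration only. }
  raise Exception.Create('Simulated failure in worker thread');
end;

procedure TWorkerThread.Execute;
begin
  try
    DoWork;
  except
    { Nothing may escape from Execute, so catch everything and
      record what happened instead of showing a dialog box. }
    on E: Exception do
      FErrorMessage := E.Message;
    else
      FErrorMessage := 'Unknown exception in worker thread';
  end;
end;

end.
```

The VCL thread then decides whether to report the failure, retry the operation, or ignore it.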
Despite this, there are some situations in threads where dealing with
error cases can be particularly difficult. Most of these situations occur
when using threads to perform continuous background operations. Recalling
chapter 10, the BAB has a couple of threads that forward read and write
operations from the VCL thread to a blocking buffer. If an error occurs
in either of these threads, the error may show no clear causal relationship
with any particular operation in the VCL thread, and it may be difficult
to communicate failure instantly back to the VCL thread. Not only this,
but an exception in either of these threads is likely to break them out
of the read or write loop that they are in, raising the difficult question
of whether these threads can be usefully restarted. About the best that
can be done is to set some state indicating that all future operations
should be failed, forcing the main thread to destroy and re-initialize
the buffer.
The best solution is to include the possibility of such problems in
the original application design, and to determine best effort recovery
attempts that may be made.
In Chapter 7, I indicated that one potential solution to locking problems
is to put shared data in a database, and use the BDE to perform concurrency
control. The programmer should note that each thread must maintain a separate
database connection for this to work properly. Hence, each thread should
use a separate TSession object to manage its connection to the database.
Each application has a TSessionList component called Sessions to enable
this to be done easily. Detailed explanation of multiple sessions is beyond
the scope of this document.
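For orientation only, the sketch below shows one way a thread might obtain its own session through the global Sessions list. The class name, session naming scheme, database alias and SQL text are all hypothetical, and it is assumed that OpenSession creates the named session if it does not already exist.

```pascal
unit QueryThread; { hypothetical unit name }

interface

uses
  Classes, SysUtils, DBTables;

type
  TQueryThread = class(TThread)
  protected
    procedure Execute; override;
  end;

implementation

procedure TQueryThread.Execute;
var
  ThreadSession: TSession;
  Query: TQuery;
begin
  try
    { One session per thread, given a name unique to this thread. }
    ThreadSession := Sessions.OpenSession('Worker' + IntToStr(ThreadID));
    Query := TQuery.Create(nil);
    try
      { Attach the query to this thread's session, not the default. }
      Query.SessionName := ThreadSession.SessionName;
      Query.DatabaseName := 'HypotheticalAlias';   { assumed BDE alias }
      Query.SQL.Text := 'SELECT * FROM SomeTable'; { assumed table }
      Query.Open;
      { ... work with the result set here ... }
      Query.Close;
    finally
      Query.Free;
    end;
  except
    { Exceptions must not escape from Execute. A real application
      would report the failure back to the VCL thread. }
  end;
end;

end.
```

Datasets belonging to one thread's session should not be touched by any other thread.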
© Martin Harvey