Let's make use of Thread Local Storage!

Note: this tutorial needs revision. As stated in the comments: "Note one should use initialize_after() and terminate_before() instead of _init() and _fini(), as the shared object crt code actually defines them already in BeOS, to do some dirty tricks, and call the initialize/terminate functions when present."

Last week, I came across an interesting problem which I had not dealt with before. Some people on IRC were discussing the testing of the new netkit, and had problems with the unimplemented libnet call _h_errnop(), which is supposed to return the address of the errno variable used by the network functions. In fact, I think it calls _h_errno(), defined somewhere else, but that's the idea.

Since we want this stuff to be thread-safe, errno shouldn't be clobbered by calls made by other threads. In the headers (search in /boot/header for netdb.h), this function is then used in a macro to virtualise the UNIX network equivalent of errno, with this line:

#define h_errno (*__h_errno())

But the point is: how can threads, which all live in the same address space, get a different variable out of the same expression?

The answer is Thread Local Storage (TLS), an API that provides a 32-bit slot whose content can be different for each thread in a team. Every thread uses the same slot index, but reads and writes its own copy of the value. It's a valuable thing to help in the porting of a single-threaded UNIX application. And it is also used in the system libraries, as we will see below. There is even a page of the BeBook that describes this API (check also the header file, TLS.h).
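To make the macro trick concrete, here is a minimal sketch that runs outside BeOS. It assumes a C11 compiler and POSIX threads, and uses the `_Thread_local` storage class where BeOS code would use a TLS slot. The names `my_h_errnop`, `my_h_errno` and `my_h_errno_is_per_thread` are made up for this illustration and are not part of any BeOS header.

```c
#include <pthread.h>
#include <stddef.h>

/* One copy of this variable exists per thread (C11 thread storage). */
static _Thread_local int my_h_errno_value;

/* Hypothetical analogue of _h_errnop(): address of the caller's own copy. */
int *my_h_errnop(void)
{
    return &my_h_errno_value;
}

/* Same trick as the netdb.h macro: it reads like a plain variable. */
#define my_h_errno (*my_h_errnop())

/* Runs in a second thread: reports what it sees, then writes its own copy. */
static void *worker(void *arg)
{
    int *first_seen = arg;

    *first_seen = my_h_errno;   /* a fresh thread starts with its own zero */
    my_h_errno = 42;            /* only changes this thread's copy */
    return NULL;
}

/* Returns 1 if the two threads really had independent variables. */
int my_h_errno_is_per_thread(void)
{
    pthread_t thread;
    int worker_first_seen = -1;

    my_h_errno = 7;
    if (pthread_create(&thread, NULL, worker, &worker_first_seen) != 0)
        return 0;
    pthread_join(thread, NULL);

    /* The worker never saw our 7, and its 42 never reached us. */
    return worker_first_seen == 0 && my_h_errno == 7;
}
```

Both threads evaluate the very same expression, `my_h_errno`, yet each one lands on its own storage. That is what the h_errno macro relies on.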

While I was searching for more information about this, Google kindly pointed me to this page, where you can find the source code of the BeOS implementation of TLS on R4. It seems to have been integrated into libroot by now, maybe with a bit of help from the kernel, since the Intel implementation uses inline asm that goes through the fs segment register. You can also find out there that the API comes from MS-Windows (*grin*)...

Another trick

For the new libnet, we need to make use of this TLS API. However, there is another issue: how can you make sure the calls to _h_errnop() will succeed, since the TLS slot needs to be allocated before anyone makes use of it?

Well, there may be more than one way of ensuring this. One of them would be to use atomic_add() like this:

// the TLS slot id, shared by all threads
static int32 h_errno_tls;

status_t *_h_errnop(void)
{
    static int32 initdone = 0;
    // volatile, so that the wait loop below re-reads memory instead of
    // testing a cached register that another thread could never update
    static volatile int32 initcomplete = 0;

    if (!atomic_add(&initdone, 1)) {
        // atomic_add() returned the old value 0: we are the first caller
        h_errno_tls = tls_allocate();
        atomic_add(&initcomplete, 1);
    } else {
        atomic_add(&initdone, -1);
        // busy wait until the thread that picked up the init is done
        while (!initcomplete)
            ;
    }
    return (status_t *)tls_address(h_errno_tls);
}

This is a bit of overkill for such a tiny thing.
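For readers who want to try the "first caller initializes, the others wait" pattern outside BeOS, here is a rough sketch using C11 atomics and POSIX threads. It is not the libnet code: `fake_tls_allocate()` is a hypothetical stand-in for tls_allocate(), the other names are invented too, and it claims the job with a single atomic exchange instead of the add/subtract pair used above.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int init_claimed;   /* becomes 1 when a thread takes the job */
static atomic_int init_complete;  /* becomes 1 once slot_id is valid */
static atomic_int alloc_calls;    /* counts allocations, for checking only */
static int slot_id = -1;

/* Hypothetical stand-in for tls_allocate(); always hands out slot 3. */
static int fake_tls_allocate(void)
{
    atomic_fetch_add(&alloc_calls, 1);
    return 3;
}

/* Returns the slot id, allocating it on the very first call only. */
int get_slot_id(void)
{
    if (atomic_exchange(&init_claimed, 1) == 0) {
        /* Old value was 0: this thread is the first one in. */
        slot_id = fake_tls_allocate();
        atomic_store(&init_complete, 1);
    } else {
        /* Someone else is initializing: spin until it has finished. */
        while (atomic_load(&init_complete) == 0)
            ;
    }
    return slot_id;
}

/* Thread body: stores the slot id it obtained into the int passed in. */
static void *racer(void *arg)
{
    *(int *)arg = get_slot_id();
    return NULL;
}

/* Starts several racing threads; returns how many allocations happened,
 * or -1 if any thread got a wrong slot id. */
int count_allocations(void)
{
    enum { RACERS = 8 };
    pthread_t threads[RACERS];
    int results[RACERS];
    int started = 0;
    int i;

    for (i = 0; i < RACERS; i++) {
        if (pthread_create(&threads[i], NULL, racer, &results[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        if (results[i] != 3)
            return -1;
    }
    return atomic_load(&alloc_calls);
}
```

However many threads race into `get_slot_id()`, only one of them allocates. It works, but every call still pays for an atomic operation, which is part of why it feels like overkill.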

There is a nicer solution, one that the current libroot also uses. It involves hooking into the library loading mechanism, using standard ELF library features. Google told me this was described here (dang it, finally a nice tutorial about shared libraries!).

The trick here is that we provide the _init() function to the compiler (to the linker, really), so it won't include the default one when creating the shared object. This special function gets called right after the library has been loaded, before anything else in it is called. As you will see below, it makes everything much simpler :-)
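The same "run this when the image is loaded" idea can be tried on other ELF systems. The sketch below assumes a GCC or Clang toolchain and uses their `constructor` and `destructor` function attributes, which are compiler extensions and not part of the BeOS API. All the function and variable names are hypothetical.

```c
/* Set by the load hook; stands in for an allocated TLS slot id. */
static int slot_ready;

/* Runs when the image is loaded, before any other code in it is called. */
__attribute__((constructor))
static void library_loaded(void)
{
    slot_ready = 1;   /* this is where tls_allocate() would go */
}

/* Runs when the image is unloaded or the program exits. */
__attribute__((destructor))
static void library_unloading(void)
{
    slot_ready = 0;
}

/* Ordinary entry point: can rely on the hook having run already. */
int library_is_ready(void)
{
    return slot_ready;
}
```

Since the hook has already run by the time any caller gets in, `library_is_ready()` needs no locking and no "has init happened yet?" test of its own.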

The code

Here is what I proposed to the people on IRC to add to the libnet code. Quite simple, but still helpful.

#include <SupportDefs.h>
#include <TLS.h>

#ifdef __cplusplus
extern "C" {
#endif

// the TLS slot id
static int32 h_errno_tls;

// This hook gets called upon dynamic library load
// (but see the revision note at the top about initialize_after())
void _init(void)
{
    h_errno_tls = tls_allocate();
    // however it doesn't check for error...
}

// this one is called before the library gets unloaded
// (fini means ended in French)
void _fini(void)
{
}

// returns the pointer to the calling thread's _h_errno status variable
status_t *_h_errnop(void)
{
    return (status_t *)tls_address(h_errno_tls);
}

#ifdef __cplusplus
}
#endif