Be Newsletters - Volume 4: 1999

Issue 4-40, October 6, 1999

Be Engineering Insights: A Tutorial Introduction to CVS

By Fred Fish

CVS stands for Concurrent Versions System. CVS stores different versions of files, both individually and collectively, and allows you access to previous versions.

The system is "concurrent" because it allows multiple developers to work concurrently on the same set of files with minimal conflicts, and handles most merging issues automatically. We won't explore merging in this introduction; instead we'll concentrate on how a single developer can effectively use CVS to manage a software project's source code and related files.

The CVS documentation in PostScript and HTML forms, the CVS source code, and executables for x86 and PPC BeOS 4.5.2 are available at

<ftp://ftp.be.com/pub/experimental/tools/cvs-1.10.7-doc.zip>
<ftp://ftp.be.com/pub/experimental/tools/cvs-1.10.7-src.zip>
<ftp://ftp.be.com/pub/experimental/tools/cvs-1.10.7-x86.zip>
<ftp://ftp.be.com/pub/experimental/tools/cvs-1.10.7-ppc.zip>

To install, unzip the desired binary archive and move the CVS binary to /boot/home/config/bin/cvs or some other suitable location in your search path.

Before you can start using CVS for managing the source to your project, you have to decide where to store the CVS repository on your system, set up the CVSROOT environment variable that tells CVS where the repository is, and initialize the repository.

You can put the repository anywhere on your system; it's useful to make a symbolic link to the location you choose, and then always refer to the repository via that link. This lets you move the repository around without changing the path you use to access it. If you move it, just update the symbolic link. To pick a root directory for the repository and set up the symbolic link:

$ mkdir -p /spare/junk/repository
$ ln -s /spare/junk/repository /boot/home/repository

When you run CVS, it needs to know how to find the repository. There are several ways to tell it, including passing the repository path with the -d option, but it's probably easiest to just set the CVSROOT environment variable:

$ export CVSROOT=/boot/home/repository
(hint: put this in your /boot/home/.profile)

Before CVS can store anything in the repository, you have to initialize it with the cvs init command:

$ cvs init
$ ls -lL /boot/home/repository
total 2
drwxrwxr-x  1 fnf  users   2048 Oct  2 11:00 CVSROOT/

The init command has created a directory called CVSROOT in the repository, which contains a number of administration files. One of the most useful is the "modules" file, which we won't use in this introduction, but which you may want to read about in the CVS documentation. It's particularly useful when you have a number of projects to maintain.

Now you can start using the repository. For this example, we'll use the Magnify app from the sample code directory on the BeOS R4.5 CD, which you copied to your hard drive if you installed the optional items:

(Note: long line broken up using backslash-escaped newlines)

$ cd /r4.5/optional/sample-code/application_kit/Magnify
$ cvs import -ko -I\! \
      -m "Baseline version from BeOS 4.5 CD" \
      Magnify be Magnify-4_5-1
N Magnify/LICENSE
...
N Magnify/makefile
No conflicts created by this import

The import command causes CVS to add this project to the repository. You can peek at the repository to check this:

$ ls -lL /boot/home/repository
total 4
drwxrwxr-x   1 fnf  users    2048 Oct  2 11:00 CVSROOT/
drwxrwxr-x   1 fnf  users    2048 Oct  3 11:28 Magnify/

Let's examine the import command arguments in greater detail. The -ko option tells CVS that you don't want any keyword expansions on the file contents. Consult the CVS docs for more detail on keywords.

The -I\! option tells CVS to add every file in the project, regardless of whether or not it might be one that CVS would normally ignore, such as object files or backup files.

The -m option provides a log message. If you don't supply a log message via -m, CVS will fire up whatever editor you specified with your EDITOR environment variable, and let you enter something longer than what can be conveniently typed on the command line.

The Magnify arg gives the subdirectory name in the repository where the files will be stored. The be arg is the tag used for the branch where the files are imported. The Magnify-4_5-1 arg is the symbolic tag for the specific set of files that correspond to this import. Note that the '.' character is not legal in tag names, so you have to use another character, like '_'.

Now that the files are in the repository, you can check out a working set in which you can make changes. You can check out as many copies as you wish; in fact it's often useful to have several different copies of the sources checked out at the same time. You might have one set of sources where you're doing active development, another set where you're fixing bugs reported by users, etc. Every working set of sources is independent and the changes you make will not appear in the repository or any other copy until you commit them to the repository and update your other copies.

Let's take an example that shows how to check out multiple copies of the sources, make changes in each copy, commit changes back to the repository, and automatically merge those changes into all the copies you've checked out. First, check out two complete sets of the sources, one for bug fixing and one for development:

$ mkdir -p /boot/home/bugfixing /boot/home/development
$ cd /boot/home/bugfixing
$ cvs checkout Magnify
cvs checkout: Updating Magnify
U Magnify/LICENSE
...
U Magnify/makefile
$ cd /boot/home/development
$ cvs checkout Magnify
cvs checkout: Updating Magnify
U Magnify/LICENSE
...
U Magnify/makefile

Let's now suppose that a user complains that you don't use "rgb" consistently in the help message; i.e., sometimes it's "RGB" and sometimes it's "rgb." He thinks it should always be in caps since it's an abbreviation, and you agree. Go to your checked out copy of the sources for bug fixing, edit the file, make the change, rebuild the app, test it, and then commit this change to the repository:

(Note: example lines chopped short for newsletter)

$ cd /boot/home/bugfixing/Magnify
$ emacs main.cpp
$ make
gcc -c main.cpp  -I./ -I-   -O3  -Wall -Wno-multichar ...
gcc -o obj.x86/Magnify  obj.x86/main.o  -Xlinker ...
xres -o obj.x86/Magnify Magnify.rsrc
mimeset -f obj.x86/Magnify
$ obj.x86/Magnify
$ cvs diff main.cpp
Index: main.cpp
==========================================================
RCS file: /boot/home/repository/Magnify/main.cpp,v
retrieving revision 1.1.1.1
diff -r1.1.1.1 main.cpp
756c756
<       text->Insert("      which pixel's rgb values will
---
>       text->Insert("      which pixel's RGB values will
776c776
<       text->Insert("  arrow keys - move the current sele
---
>       text->Insert("  arrow keys - move the current sele
$ cvs commit -m "Fix inconsistent use of RGB in help"
cvs commit: Examining .
Checking in main.cpp;
/boot/home/repository/Magnify/main.cpp,v  <--  main.cpp
new revision: 1.2; previous revision: 1.1
done

Prior to checking the changes into the repository you used the CVS diff command to print the differences between what's currently in the repository and what's in your working sources, just to double check what you were about to commit to the repository.

In order to make it easy to check out a copy of your sources (the latest released version), plus this bug fix, give this new set of sources the symbolic tag "Magnify-4_5-2", to signify that it is the second revision of the Magnify sources from 4.5:

$ cvs tag Magnify-4_5-2
cvs tag: Tagging .
T LICENSE
...
T makefile

At any point in the future, you can use that symbolic tag with the CVS checkout command to recover the sources to this revision of the project:

$ cd /tmp
$ cvs checkout -r Magnify-4_5-2 Magnify
cvs checkout: Updating Magnify
U Magnify/LICENSE
...
U Magnify/makefile
$ rm -rf Magnify

The bug fix is now in your checked out working set for fixing bugs and the copy in the repository, but NOT in the copy where you're doing active development. Let's pretend that this is a critical bug fix that you also need in your ongoing development sources. Without a source management system like CVS, you'd have to make the same change in your development sources and any other copies you were maintaining manually. With CVS, it's trivial to bring all other copies up to date with no manual work. You just use the CVS "update" command:

$ cd /boot/home/development/Magnify
$ cvs update
cvs update: Updating .
U main.cpp

If you make a change in one working set that conflicts with a change made in another working set, when you update, instead of getting a line like

U main.cpp

you'll get

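C main.cpp

This means that you had a "merge conflict." (An "M" here would mean only that CVS merged the repository changes into your locally modified file without a conflict.) The main.cpp file now contains fragments that look something like this: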

<<<<<<<
extern int MyFunc (int a, long b);
=======
extern int MyFunc (unsigned int a, unsigned long b);
>>>>>>>

You need to examine these fragments, decide how to resolve the conflict, and edit the file appropriately. When you check it in, if what you kept was different from what was in the repository, the repository copy will be updated.

This is a good time to mention the CVS log command, which will print a summary of previous revisions of your files:

$ cvs log main.cpp

RCS file: /boot/home/repository/Magnify/main.cpp,v
Working file: main.cpp
head: 1.2
branch:
locks: strict
access list:
symbolic names:
        Magnify-4_5-2: 1.2
        Magnify-4_5-1: 1.1.1.1
        be: 1.1.1
keyword substitution: o
total revisions: 3;     selected revisions: 3
description:
----------------------------
revision 1.2
date: 1999/10/03 17:53:26;  author: fnf;  state: Exp;
lines: +2 -2
Fix inconsistent use of RGB in help
----------------------------
revision 1.1
date: 1999/10/02 18:18:07;  author: fnf;  state: Exp;
branches:  1.1.1;
Initial revision
----------------------------
revision 1.1.1.1
date: 1999/10/02 18:18:07;  author: fnf;  state: Exp;
lines: +0 -0
Baseline version from BeOS 4.5 CD
==========================================================

This output gives you a huge amount of useful information about the file and its history. From it, you know that the current version of the file (as numbered by CVS) is 1.2. You know that there are some symbolic names like Magnify-4_5-1 and Magnify-4_5-2 that you can use to specify specific revisions of the file, and you see what file revisions correspond to particular changes to the file, such as that revision 1.2 fixed the inconsistent use of RGB in the help text.

If you're curious about seeing the differences between revisions, you can use the "-r" option to the CVS "diff" command to view the differences between two revisions:

(Note: example lines chopped at 60 chars for newsletter)

$ cvs diff -r 1.1 -r 1.2 main.cpp
Index: main.cpp
==========================================================
RCS file: /boot/home/repository/Magnify/main.cpp,v
retrieving revision 1.1
retrieving revision 1.2
diff -r1.1 -r1.2
756c756
<       text->Insert("      which pixel's rgb values will
---
>       text->Insert("      which pixel's RGB values will
776c776
<       text->Insert("  arrow keys - move the current sele
---
>       text->Insert("  arrow keys - move the current sele

Other options are also useful in the CVS diff command, like "-c" to get context diffs or "-p" to get the name of the function included in the diffs.

Let's now go back to your development sources, and make a change that you want to show up in future versions. Since this is only an example, we'll use a simple change that doesn't really change the program behavior:

$ cd /boot/home/development/Magnify
$ emacs main.cpp
$ cvs diff main.cpp
Index: main.cpp
==========================================================
RCS file: /boot/home/repository/Magnify/main.cpp,v
retrieving revision 1.2
diff -r1.2 main.cpp
3,6c3,4
< /*
<       Copyright 1999, Be Incorporated.   All Rights ...
<       This file may be used under the terms of the Be ..
< */
---
> //  Copyright 1999, Be Incorporated.   All Rights ...
> //  This file may be used under the terms of the Be ...
$ cvs commit -m "Use C++ style comments" main.cpp
Checking in main.cpp;
/boot/home/repository/Magnify/main.cpp,v  <--  main.cpp
new revision: 1.3; previous revision: 1.2
done

The problem you now have is that you want to start a new release cycle for your product, based on the current development sources, and only make changes to the source for that release that are related to fixing bugs found by beta testers. On the other hand, you have some ideas for fairly extensive changes that will improve the product, but may destabilize the sources for several weeks.

Up to this point, you have a single thread of development in the sources. You started off with an initial revision, made a change to fix a bug, made another change that normally would be enough to warrant releasing a beta test copy, and presumably are going to make some additional changes for ongoing development.

CVS handles this problem by letting you create a "branch" for the release. If you think of the sequence of revisions of a file as a tree, you might get something like the following tree (knocked over by a hurricane):

                       (beta 2.0)  (rel 2.0)   (rel 2.1)
                   O-------O-----------O-----------O
(Release 2 branch)/1.3.1  1.3.2      1.3.3       1.3.4
                 /
root            /     1.4        1.5       1.6
 O----- O------O-------O----------O---------O----> trunk
1.1    1.2    1.3       \
                          O------>
                         1.4.1
                         (Release 3 branch)

By creating a branch, you can continue development on the main trunk, without destabilizing the sources on the branch. Ultimately the branch will terminate in a release, perhaps followed by a bug fix release or two, and then stop growing.

When creating a branch, it's useful to first mark the branch point with a symbolic tag, so that you always have an easy reference to the point in the sources where the branch was made. You do this with the CVS "tag" command:

$ cd /boot/home/development/Magnify
$ cvs update
cvs update: Updating .
$ cvs tag release2-branchpoint
cvs tag: Tagging .
T LICENSE
...
T makefile

To create the actual branch in the repository, you use the tag command again, but this time with the -b option. You give the branch the symbolic tag "release2-branch", and from now on you can refer to the branch itself using that symbolic name.

$ cvs tag -b release2-branch
cvs tag: Tagging .
T LICENSE
...
T makefile

Note that the branchpoint tag (release2-branchpoint) refers to a very specific set of sources that will never change, while in most situations the branch tag (release2-branch) refers to whatever set of sources are the current head of the branch.

Now you need to check out a working set of sources for the branch development, build a release, and send it out to beta testers:

$ mkdir -p /boot/home/releases
$ cd /boot/home/releases
$ cvs -q co -r release2-branch Magnify
U Magnify/LICENSE
...
U Magnify/makefile
$ mv Magnify Magnify-release2
$ cd Magnify-release2
$ make
 ...
(ship copy to beta testers)

Now you return to your ongoing development sources. You make a bunch of changes that result in your development sources failing to build, and while you're trying to figure the problem out, you get your first bug report from the beta testers (they're pretty d**n quick):

$ cd /boot/home/development/Magnify
$ make
gcc -c main.cpp  -I./ -I-   -O3  -Wall -Wno-multichar ...
/boot/home/development/Magnify/main.cpp:52: syntax error
$ (you have mail)

You switch back to the release sources, poke around in them, and quickly spot the problem. You install a fix, commit it to the repository, tag the new sources as "release2-beta2", create a new beta release and send it out:

$ cd /boot/home/releases/Magnify-release2
$ emacs main.cpp
$ cvs commit -m "Fix problem reported by beta1 testers"
cvs commit: Examining .
Checking in main.cpp;
/boot/home/repository/Magnify/main.cpp,v  <--  main.cpp
new revision: 1.3.2.1; previous revision: 1.3
done
$ cvs tag release2-beta2
cvs tag: Tagging .
T LICENSE
...
T makefile
$ make
...
(ship beta2 copy to beta testers)

Since branches are independent threads of development, if you want the fix made on the release 2 branch to migrate back to the trunk, which you almost certainly do in most cases, you have to do what is known as a CVS "join." This is where the branchpoint tag comes in handy. You know that you tagged the sources with the branchpoint tag, fixed a bug, and tagged them again with "release2-beta2". So you can migrate the patch back to the trunk like this:

$ cd /boot/home/development/Magnify
$ cvs -q update -j release2-branchpoint -j release2-beta2
M main.cpp
RCS file: /boot/home/repository/Magnify/main.cpp,v
retrieving revision 1.3
retrieving revision 1.3.2.1
Merging differences between 1.3 and 1.3.2.1 into main.cpp

The main.cpp file in your working set for the development trunk now contains the patch you made on the release branch, as well as all the other changes you made that are not yet checked in. When you check those in, the release patch is checked into the trunk as well. The only danger is that if you blow away the changes you're working on, you will have also blown away the bug fix you made in the release branch, and it won't make it back to the trunk.

A better way to handle the migration of the patch from the release branch to the trunk is to create a temporary working set that is the latest set checked into the trunk of the repository, run the join command to migrate the patch to the trunk sources, and commit the patch back to the trunk:

$ cd /tmp
$ cvs -q checkout Magnify
U Magnify/LICENSE
...
U Magnify/makefile
$ cvs -q update -j release2-branchpoint -j release2-beta2
RCS file: /boot/home/repository/Magnify/main.cpp,v
retrieving revision 1.3
retrieving revision 1.3.2.1
Merging differences between 1.3 and 1.3.2.1 into main.cpp
$ cvs commit -m "Merge bugfix from release 2 to trunk"
cvs commit: Examining Magnify
Checking in Magnify/main.cpp;
/boot/home/repository/Magnify/main.cpp,v  <--  main.cpp
new revision: 1.4; previous revision: 1.3
done
$ rm -rf Magnify

Note how easy it is to create a temporary sandbox to make a quick change and then blow it away when you're done with it. This is partly because CVS does not maintain any state information in the repository about what files you have checked out or where they live.

Now, if you switch back to your development sources and do a CVS update, CVS notices that the patch from the release branch is already in your working sources (as a result of running the join command earlier) and updates you to the latest revision of the file plus your uncommitted local changes:

$ cd /boot/home/development/Magnify
$ cvs -q update
RCS file: /boot/home/repository/Magnify/main.cpp,v
retrieving revision 1.3
retrieving revision 1.4
Merging differences between 1.3 and 1.4 into main.cpp
main.cpp already contains the diffs between 1.3 and 1.4

Of course, local development modifications that you haven't yet checked in are unaffected, as you see when you run make again:

$ make
gcc -c main.cpp  -I./ -I-   -O3  -Wall -Wno-multichar ...
/boot/home/development/Magnify/main.cpp:53: syntax error
...

Some other useful commands are

cvs add       - add a new file to the repository
cvs remove    - remove a file from the repository
cvs tag name  - give current sources symbolic tag "name"
cvs rdiff     - examine differences between versions

Some additional hints. Use your repository to checkpoint your work occasionally, at points where you've reached some sort of milestone. By checking the latest changes into the repository, you ensure that you don't lose your work, and that you can revert to a previous checkpoint if some development path turns out to be a dead end.

Make liberal use of symbolic tags. Think of a tag as a handle for grabbing a specific set of sources regardless of their individual version numbers. Give the tags meaningful names.

Back up your repository frequently. If you lose it, you lose a lot more than just your current development sources!

DISCLAIMER: CVS is unsupported software. There is no warranty that it is suitable for any particular purpose. Neither the author nor Be Inc. is liable for any loss of data that may occur as a result of using this software.

Developers' Workshop: Condition Variables, Part 2

By Christopher Tate

Last week I presented a condition variable (or "CV") implementation for BeOS. If you've looked over the code, you may have found it rather convoluted. I agree, but the complexity is not without reason. CVs are low-level atomic scheduling primitives, and expressing them in terms of a different primitive, the semaphore, is decidedly nontrivial. The complexity is increased by the restriction that the CV operations be implemented as a set of user-level functions, with no changes to the kernel. In this article I'll explain why the CV implementation is so involved, by illustrating a bit of the design process that went into its creation.

First things first. The source is available on the Be FTP site at this URL:

<ftp://ftp.be.com/pub/samples/portability/condvar.zip>

Fundamentally, CVs have two operations, waiting and signalling. Waiting is a blocking operation, and in BeOS the only blocking primitive is the semaphore, so a CV "wait" needs to be implemented as a semaphore acquisition. That, in turn, dictates that the "signal" operation be a semaphore release. Add to this the standard behavior that waiting on a CV unlocks then relocks an external mutex and we see that these two operations will look something like this (in pseudo-C):

cond_wait(condvar_t* cv, mutex_t* mutex)
{
   unlock(mutex);
   acquire_sem(cv->semaphore);
   lock(mutex);
}

cond_signal(condvar_t* cv)
{
   release_sem(cv->semaphore);
}

Ideally, this would be sufficient. Unfortunately, there are a couple of problems. First, signalling releases the underlying semaphore even when there are no waiters. This means that threads attempting to wait later on will experience immediate, spurious wakeups until they exhaust the semaphore's accumulated "extra" signals. This is unfortunate, but it's technically allowed by the POSIX condition variable standard: spurious wakeups are deemed an acceptable price for efficient CVs.
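The accumulation of "extra" signals can be seen in isolation, without any threads at all. What follows is a minimal sketch, not the code from the sample archive: it assumes POSIX unnamed semaphores (sem_init, sem_post, sem_trywait) as stand-ins for the BeOS semaphore calls, and the names naive_cv_t, naive_signal, naive_try_wait, and count_stale_wakeups are hypothetical, invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <semaphore.h>

/* Naive condition variable: nothing but a counting semaphore.
   (POSIX sem_t stands in for a BeOS semaphore; an assumption.) */
typedef struct { sem_t semaphore; } naive_cv_t;

/* The naive signal: always releases, even with no waiter present. */
static void naive_signal(naive_cv_t *cv)
{
    sem_post(&cv->semaphore);
}

/* Non-blocking acquire: returns 1 if a wait would have returned at
   once (and consumes that unit), 0 if it would have blocked. */
static int naive_try_wait(naive_cv_t *cv)
{
    return sem_trywait(&cv->semaphore) == 0;
}

/* Signal n times with nobody waiting, then count how many later
   waits come back immediately. */
int count_stale_wakeups(int n)
{
    naive_cv_t cv;
    int i, woke = 0;
    int rc = sem_init(&cv.semaphore, 0, 0);

    assert(rc == 0);
    (void)rc;

    for (i = 0; i < n; i++)
        naive_signal(&cv);
    while (naive_try_wait(&cv))
        woke++;

    sem_destroy(&cv.semaphore);
    return woke;
}
```

Every signal sent while nobody was waiting comes back later as one immediate wakeup, which is exactly the "exhaust the accumulated signals" behavior described above.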

There's a worse problem, however: the mutex unlock and semaphore acquisition are not atomic. The waiting thread could be rescheduled between those two operations, and this can lead to incorrect behavior in some situations. Here's one example: imagine two threads running the following loops endlessly:

mutex_t* mutex = MUTEX_INITIALIZER;
condvar_t* cv = COND_INITIALIZER;
volatile int mode = 0;

thread A:
{
   lock(mutex);
   for ( ; ; mode = 0, cond_signal(cv) )
   {
      while (mode == 0) {
         cond_wait(cv, mutex);   // line A
      }
   }
}

thread B:
while (true) {
   lock(mutex);                  // line B
   if (mode == 0) {
      mode = 1;
      cond_signal(cv);           // line C
   }
   while (mode != 0) {
      cond_wait(cv, mutex);      // line D
   }
   unlock(mutex);
}

These two threads ping-pong back and forth, using the CV as a signal to do so. Now, imagine that thread A is calling cond_wait() in line "A". The mutex is unlocked inside the cond_wait() implementation, but let's assume that thread A is preempted after the unlock but before it acquires the semaphore, and thread B begins running. Thread B acquires the now-available lock in line "B", sees that mode == 0, sets mode = 1, and calls cond_signal() in line "C". This releases the cv semaphore, making it available. Thread B then calls cond_wait() in line "D", which releases the lock and acquires the underlying semaphore—successfully! This is a spurious wakeup, which is allowed, so thread B has to re-test the condition that it's waiting on. "mode" is still non-zero, so thread B repeats the call to cond_wait() at line "D", this time blocking on the cv semaphore. Now thread A finally gets another chance to run, picking up where it left off in the middle of the cond_wait() implementation: it also attempts to acquire the CV's semaphore, and blocks. Deadlock: both threads are blocked on the same semaphore.
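Because the deadlock depends on one particular preemption, it is hard to reproduce on demand with real threads. The sketch below instead replays the interleaving just described, step by step, on a single thread. It is an illustration under stated assumptions: a POSIX sem_t plays the CV's semaphore, the mutex is left out because only one "thread" runs at a time in a replay, and try_acquire and replay_naive_deadlock are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <semaphore.h>

/* Non-blocking stand-in for acquire_sem(): 1 = acquired, 0 = the real
   call would have blocked at this point. */
static int try_acquire(sem_t *s)
{
    return sem_trywait(s) == 0;
}

/* Replays the interleaving from the text. Returns how many threads
   end up blocked on the CV semaphore (2 means deadlock). */
int replay_naive_deadlock(void)
{
    sem_t cv_sem;
    int mode = 0, blocked = 0;
    int rc = sem_init(&cv_sem, 0, 0);

    assert(rc == 0);
    (void)rc;

    /* Thread A, line A: inside cond_wait() it has unlocked the mutex
       and is preempted before reaching acquire_sem(). */

    /* Thread B, lines B and C: takes the lock, flips mode, signals. */
    if (mode == 0) {
        mode = 1;
        sem_post(&cv_sem);                 /* cond_signal() */
    }

    /* Thread B, line D: its own wait swallows the signal meant for A. */
    if (mode != 0 && try_acquire(&cv_sem)) {
        /* Spurious wakeup: mode is still 1, so B waits again. */
        if (mode != 0 && !try_acquire(&cv_sem))
            blocked++;                     /* B is now blocked */
    }

    /* Thread A resumes and finally reaches acquire_sem(). */
    if (!try_acquire(&cv_sem))
        blocked++;                         /* A is blocked too */

    sem_destroy(&cv_sem);
    return blocked;
}
```

The function returns 2: both threads are parked on the same semaphore and nobody is left to release it.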

This example isn't contrived. This exact deadlock occurred in real code (known to work properly on other platforms) when the developers used a BeOS condition variable implementation that turned out to be overly simplistic.

So, we need a mechanism to make the unlock-and-block *look* atomic. More precisely, we need to prevent a signal-then-wait sequence from racing ahead of some other thread which has begun the wait process but not yet blocked on the CV's semaphore. To do this, we'll need some mechanism that forces signallers to defer to waiters, even if the waiters haven't yet blocked on the semaphore. The mechanism I chose was to use an additional semaphore for "handshaking." The signaller waits for the in-progress waiter by blocking on the handshake semaphore, which the waiter releases upon awakening, i.e., receiving the signal.

This introduces a new complexity, however: the signaller has to know, when releasing the main semaphore, whether or not there's a waiter to answer the handshake. Testing the CV semaphore's count isn't sufficient; recall that the very problem we're trying to solve involves a waiter that hasn't yet acquired that semaphore. So, we need more bookkeeping: the waiting thread has to inform the signaller, somehow, of its presence.

To accomplish this, and properly account for the cases when multiple threads are trying to wait simultaneously, we add a count to the condvar_t structure, called nw for "number of waiters." Threads increment the count in cond_wait(), before they unlock the mutex, then decrement it again once they awaken, after they handshake with their signaller. The signaller uses this count to determine whether a handshake is necessary. Of course, the count manipulations need to be atomic, otherwise simultaneous waiters will corrupt the count.

The implementation now looks like this:

cond_wait(condvar_t* cv, mutex_t* mutex)
{
   atomic_add(&cv->nw, 1);
   unlock(mutex);
   acquire_sem(cv->semaphore);   // line E
   atomic_add(&cv->nw, -1);      // line F
   release_sem(cv->handshake);
   lock(mutex);
}

cond_signal(condvar_t* cv)
{
   int count = cv->nw;
   if (count > 0)
   {
      release_sem(cv->semaphore);
      acquire_sem(cv->handshake);   // defer to waiter
   }
}

This is better in two ways. First, it avoids the race condition illustrated above; the signaller defers to the awakened thread for a handshake, at which point both the wait and the signal have "completed," and the CV is back to a neutral state. Second, this new implementation doesn't release the primary semaphore unless there's actually a waiter present, which avoids the spurious wakeups of the initial approach.

Unfortunately, this implementation is still insufficient. There is another dangerous race condition: calls to cond_signal() might occur between lines "E" and "F" above; that is, after the waiter awakens but before the count is adjusted. These post-wakeup cond_signal() invocations would still see the waiter count as non-zero, so they would still release the main semaphore and try to handshake with the (nonexistent) waiter, and hang in cond_signal().
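The same replay technique makes this second race concrete. This is again only a sketch under assumptions: POSIX semaphores stand in for the BeOS ones, the atomic counter is a plain int because the replay is single-threaded, and replay_stuck_signaller and its stale_units output parameter are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <semaphore.h>

/* Non-blocking stand-in for acquire_sem(): 1 = acquired, 0 = the real
   call would have blocked at this point. */
static int try_acquire(sem_t *s)
{
    return sem_trywait(s) == 0;
}

/* Replays two signallers racing one waiter. Returns the number of
   signallers left hanging; *stale_units receives the count left on
   the main semaphore afterwards. */
int replay_stuck_signaller(int *stale_units)
{
    sem_t semaphore, handshake;
    int nw = 0, s1_blocked, s2_blocked, woke;

    sem_init(&semaphore, 0, 0);
    sem_init(&handshake, 0, 0);

    /* Waiter: counts itself, unlocks the mutex, blocks at line E. */
    nw += 1;

    /* Signaller 1: sees nw > 0, releases, blocks on the handshake. */
    if (nw > 0)
        sem_post(&semaphore);
    s1_blocked = !try_acquire(&handshake);

    /* Waiter: wakes at line E but is preempted before line F. */
    woke = try_acquire(&semaphore);
    assert(woke);
    (void)woke;

    /* Signaller 2: nw is still 1, so it releases and blocks as well. */
    if (nw > 0)
        sem_post(&semaphore);
    s2_blocked = !try_acquire(&handshake);

    /* Waiter: line F, followed by its one and only handshake. */
    nw -= 1;
    sem_post(&handshake);

    /* Only one of the blocked signallers can take that handshake. */
    if (s1_blocked && try_acquire(&handshake))
        s1_blocked = 0;
    if (s2_blocked && try_acquire(&handshake))
        s2_blocked = 0;

    sem_getvalue(&semaphore, stale_units);
    sem_destroy(&semaphore);
    sem_destroy(&handshake);
    return s1_blocked + s2_blocked;
}
```

One signaller is left hanging in cond_signal(), and the main semaphore keeps one stale unit that would later wake some unrelated waiter.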

There's also a situation that arises because cond_signal() is not really the only way that a waiter can be awakened. It's possible that some other thread posted an interrupt to the waiting thread via kill() or a similar function; that would interrupt the waiter's attempt to acquire the main semaphore. We'd like to behave properly in such a case, with the cond_wait() returning B_INTERRUPTED but without attempting to handshake with a signaller. Similarly, the POSIX standard also mandates a function called cond_timedwait(), which allows a thread to wait until a specified absolute time for the CV to be signalled, at which point the wait times out and returns a suitable error code. In both of these cases, the awakened thread must be able to discern whether there are any signallers with which to handshake.

The multiple-signaller race issue is addressed by adding another lock to the condvar_t structure in order to serialize the cond_signal() operation, forcing the racing signallers to wait patiently for ongoing signal-and-handshake sequences to complete. The aborted-wait issue, in turn, requires that waiters have some knowledge of whether there are signallers in progress in order to handshake when expected to. This is accomplished by adding a signals-in-progress count to the condvar_t structure. We'll call the new lock "signalLock," and the new count "ns" for "number of signals." Here's the final implementation:

status_t cond_wait(condvar_t* condvar, mutex_t* mutex)
{
   status_t err;

   lock(condvar->signalLock);
   condvar->nw += 1;
   unlock(condvar->signalLock);

   unlock(mutex);
   err = acquire_sem(condvar->semaphore);

   lock(condvar->signalLock);
   if (condvar->ns > 0)
   {
      release_sem(condvar->handshakeSem);
      condvar->ns -= 1;
   }
   condvar->nw -= 1;
   unlock(condvar->signalLock);

   lock(mutex);
   return err;
}

status_t cond_signal(condvar_t* condvar)
{
   status_t err = B_OK;

   lock(condvar->signalLock);

   if (condvar->nw > condvar->ns)
   {
      condvar->ns += 1;
      release_sem(condvar->semaphore);
      unlock(condvar->signalLock);
      acquire_sem(condvar->handshakeSem);
   }
   else   // no waiters, so the signal is a no-op
   {
      unlock(condvar->signalLock);
   }
   return err;
}

Because a wait can be interrupted at any instant, including while a signaller believes itself to be waking up the waiting thread, sometimes handshakes are necessary even when threads time out. This implies that the decision to handshake should be based solely on the signal count, not on whether the wait timed out.
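To make the timeout case concrete, here is one way a timed wait could follow that rule. It is a sketch, not the timedwait from the sample archive: it assumes POSIX sem_timedwait and pthread mutexes in place of the BeOS primitives, returns errno values rather than BeOS status codes, and the sem_trywait call that absorbs a late release is my own assumption about how to keep the semaphore count balanced. cond_init and timedwait_demo are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

typedef struct {
    sem_t semaphore;             /* waiters block here */
    sem_t handshakeSem;          /* signallers block here */
    pthread_mutex_t signalLock;  /* guards nw and ns */
    int nw;                      /* number of waiters */
    int ns;                      /* number of signals in progress */
} condvar_t;

static void cond_init(condvar_t *cv)
{
    sem_init(&cv->semaphore, 0, 0);
    sem_init(&cv->handshakeSem, 0, 0);
    pthread_mutex_init(&cv->signalLock, NULL);
    cv->nw = cv->ns = 0;
}

/* Waits until the absolute CLOCK_REALTIME deadline. Returns 0 when
   signalled, or an errno value such as ETIMEDOUT or EINTR. */
int cond_timedwait(condvar_t *cv, pthread_mutex_t *mutex,
                   const struct timespec *deadline)
{
    int err = 0;

    pthread_mutex_lock(&cv->signalLock);
    cv->nw += 1;
    pthread_mutex_unlock(&cv->signalLock);

    pthread_mutex_unlock(mutex);
    if (sem_timedwait(&cv->semaphore, deadline) != 0)
        err = errno;

    pthread_mutex_lock(&cv->signalLock);
    if (cv->ns > 0) {            /* decided by ns alone, never by err */
        if (err != 0)
            sem_trywait(&cv->semaphore);  /* assumption: absorb late release */
        sem_post(&cv->handshakeSem);
        cv->ns -= 1;
    }
    cv->nw -= 1;
    pthread_mutex_unlock(&cv->signalLock);

    pthread_mutex_lock(mutex);
    return err;
}

/* Waits 50 ms with no signaller around; returns 0 if the wait timed
   out and left both counts at zero. */
int timedwait_demo(void)
{
    condvar_t cv;
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    struct timespec deadline;
    int err;

    cond_init(&cv);
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += 50L * 1000L * 1000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&mutex);
    err = cond_timedwait(&cv, &mutex, &deadline);
    pthread_mutex_unlock(&mutex);

    return (err == ETIMEDOUT && cv.nw == 0 && cv.ns == 0) ? 0 : -1;
}
```

Note that the handshake test never looks at err: a waiter that timed out just as a signaller committed to waking it still answers the handshake, so the signaller cannot be stranded.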

Access to both the signal and waiter counts is serialized through the signalLock because both waiters and signallers use those counts to decide whether to handshake. Conceptually, that lock allows only one thread to formally enter a waiting or signalling state at a time, preventing the races described earlier in this article. Because the lock provides serialization, atomic arithmetic is unnecessary.
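For readers who want to experiment with the final design away from BeOS, the sketch below transcribes it onto POSIX threads and semaphores. The mapping is an assumption on my part (sem_post for release_sem, sem_wait for acquire_sem, a pthread mutex for signalLock), errno values replace BeOS status codes, and cond_init, waiter, and run_handshake_demo are hypothetical names that exist only for this demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

typedef struct {
    sem_t semaphore;             /* waiters block here */
    sem_t handshakeSem;          /* signallers block here */
    pthread_mutex_t signalLock;  /* serializes access to nw and ns */
    int nw;                      /* number of waiters */
    int ns;                      /* number of signals in progress */
} condvar_t;

static void cond_init(condvar_t *cv)
{
    sem_init(&cv->semaphore, 0, 0);
    sem_init(&cv->handshakeSem, 0, 0);
    pthread_mutex_init(&cv->signalLock, NULL);
    cv->nw = cv->ns = 0;
}

static int cond_wait(condvar_t *cv, pthread_mutex_t *mutex)
{
    int err = 0;

    pthread_mutex_lock(&cv->signalLock);
    cv->nw += 1;
    pthread_mutex_unlock(&cv->signalLock);

    pthread_mutex_unlock(mutex);
    if (sem_wait(&cv->semaphore) != 0)
        err = errno;             /* e.g. EINTR */

    pthread_mutex_lock(&cv->signalLock);
    if (cv->ns > 0) {            /* a signaller expects a handshake */
        sem_post(&cv->handshakeSem);
        cv->ns -= 1;
    }
    cv->nw -= 1;
    pthread_mutex_unlock(&cv->signalLock);

    pthread_mutex_lock(mutex);
    return err;
}

static void cond_signal(condvar_t *cv)
{
    pthread_mutex_lock(&cv->signalLock);
    if (cv->nw > cv->ns) {
        cv->ns += 1;
        sem_post(&cv->semaphore);
        pthread_mutex_unlock(&cv->signalLock);
        while (sem_wait(&cv->handshakeSem) != 0 && errno == EINTR)
            ;                    /* retry if interrupted */
    } else {                     /* no waiters: the signal is a no-op */
        pthread_mutex_unlock(&cv->signalLock);
    }
}

static condvar_t cv;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int ready = 0;

/* Waits, under the mutex, until the main thread sets ready. */
static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    while (!ready)
        cond_wait(&cv, &mutex);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Starts one waiter, signals it, and returns 0 if both counts are
   back to zero once the waiter has finished. */
int run_handshake_demo(void)
{
    pthread_t thread;

    cond_init(&cv);
    ready = 0;
    if (pthread_create(&thread, NULL, waiter, NULL) != 0)
        return -1;

    pthread_mutex_lock(&mutex);
    ready = 1;
    cond_signal(&cv);
    pthread_mutex_unlock(&mutex);

    pthread_join(thread, NULL);
    return (cv.nw == 0 && cv.ns == 0) ? 0 : -1;
}
```

The demo finishes whichever thread wins the race for the mutex. If the signaller goes first, the signal is a no-op and the waiter sees ready already set; if the waiter goes first, the signaller blocks on the handshake until the waiter has really woken up.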

The source code for the condition variable implementation is more complete than I've presented here; it deals with interrupts coherently, and handles the CV "broadcast" and "timedwait" operations. The code is commented so that you can tell what it's doing; those two operations are simple generalizations of the basic "signal" and "wait" cases. The biggest drawback to this implementation is its overhead: it requires an extra pair of context switches per awakened waiter, plus it imposes fairly strict serialization on nearly simultaneous signal and wait operations. This is unfortunate, but it's the price we pay for having a correct CV implementation that does not rely on any kernel support other than semaphores.

From Socialism to Entrepreneurial Capitalism

By Jean-Louis Gassée

No, this isn't about the respective merits of Old World and New World cultures. Or about The Fatal Conceit, which sounds like a reference to e-stock market caps but, in fact, is the name of a book by Friedrich von Hayek, a Nobel apostle of the free market. And that brings me to our topic: broadband, earlier misunderstandings, and ISDN.

Today, we learned that Paul Allen invested 1.5 billion dollars in RCN. Paul was a Microsoft co-founder and now is a billionaire investor. RCN is a DSL supplier bent on becoming a dominant player in the broadband age. The news delights me, because for about ten years, I've been an ISDN bigot, frustrated to see such promising technology fail to get traction in the real world.

The demos were terrific, especially in the days of 2400 bps modems. The call was set up in 250ms with no 25-second (when successful) modem mating chant. The speed was incredible; if you could combine two B channels, it was 20 to 400 times as fast as an ordinary phone line. Any change by more than one order of magnitude, by more than a factor of ten, is a revolution, not an evolution (which sounds like a consultant mating song).

I remember bridging AppleTalk networks through an international ISDN call, drag and drop heaven. But I was wrong. I was taken in by the demo. It wasn't reality—ISDN was socialism. By that I mean we were at the mercy of phone company apparatchiks; we could get a line attribution when the state monopoly bureaucrats got around to processing the paperwork. Actual installation was the fiefdom of another set of rulers. But it often worked. Not always, though often enough to tease us with visions of online bliss. But we were never admitted to heaven.

Now, we have entrepreneurial capitalism driving broadband into the marketplace. By entrepreneurial capitalism, I mean large sharks on Sand Hill Road as well as legions of smaller piranhas, all fighting for a piece of the broadband market, for a share of the Evernet, as John Doerr calls the next generation of the Internet. I was naive in the past, so am I naive again about broadband? This time, I think not, because there is competition, because we're no longer at the mercy of established phone companies, because the Web has whetted our appetite for instant-on, megabit-per-second connections, because we see new forms of Web applications combining information, entertainment, and transactions.

Competition is organized in variations of cable modems, DSL, and wireless cable. The last is a charming neologism, which refers to wireless two-way connections to offices and homes offering the bandwidth of cable, or more. All three classes have their problems. Cable modems suffer from poor infrastructure and, some say, the cable companies' reputation for poor service. Wireless cable isn't broadly deployed outside of high-end applications and needs precious real estate for antennas. DSL uses the local loop, the phone wires between my house and the central office. But these wires aren't always up to the task—although it depends who you ask.

In what the world perceives as the mecca of high-tech, in the heart of Silicon Valley, downtown Palo Alto, not far from one of the largest Internet nodes, the phone company says my house can't get a DSL connection. Fortunately, new regulations force the phone company to rent wires to competitors. One of them, recently acquired by RCN, says it could get a DSL connection to my house. We'll see. And there are other similar DSL stories.

So, yes, it's messy. But that's the good news. The glacial Old Order wasn't much fun. We like broadband; it creates opportunities for BeOS on both sides of the pipe. We'd rather have the messy, animated frontier scene we see today.