Issue 4-39, September 29, 1999

Be Engineering Insights: Rule #1

By Howard Berkey

When we decided to move networking into the kernel for the networking rewrite, some kernel team gangstas dragged me down a dark alley and roughed me up a little. Realizing that I was becoming an honorary member of the kernel team, they had to jump me into the gang and lay down the law.

"Rule #1," said Ficus, "is NO C++ IN THE KERNEL. When you find yourself hankering to overload operators in kernel code, ABSTAIN, DUMMY!"

It may come as a surprise to some BeOS developers that in our wonderful, buzzword-enabled, OO-powered OS, there is not a single line of C++ code in the kernel. There are many reasons for this, none of which will appear in this article.

Many experienced C programmers have been writing object oriented code in C for years. This article is geared towards C++ programmers who haven't coded much in pure C since college but are interested in doing drivers or kernel modules, and at C programmers who just hadn't thought of this before.

The moral of the story is, "There's no reason to let lack of language syntactic sugar keep you from writing good, solid object oriented code."

The best way to illustrate OO programming in C is to jump right in to code. The sample code for this article is available at <ftp://ftp.be.com/pub/samples/languages/ObjectOrientedC.zip>

Consider a base class, and a subclass that inherits from the base class and includes both public and private data:

/* from class.h in the sample code */

/*
 * base class definition
 */
typedef struct myBaseClass
{
    char               *name;
    int                 id;
    struct myBaseClass *next;
} myBaseClass;

/*
 * inherited class definition.  Note inheritance through
 * aggregation of the base class as the first member
 * of the struct.
 */
typedef struct myInheritedClass
{
    myBaseClass base;
    uint32      public_data;
    void        *private_data;
    void        (*public_action)(struct myInheritedClass *);
    void        (*private_action)(struct myInheritedClass *);
} myInheritedClass;

These are declared in a public header file, which we'll call class.h. For this example, class.h contains the class declarations and construction/destruction function declarations; class.c contains the class implementation; and main.c contains the code which uses the classes.

Several interesting things are going on here. Classes are obviously represented in C as structs. The classes can contain either data members, or member functions.

Inheritance is accomplished by simply aggregating the base class struct as the first member of the subclass.

Public data encapsulation is accomplished via simple aggregation of the data type or a reference to it. Private data encapsulation is accomplished by hiding the actual private data by referencing it via a void*, and providing member functions to access the actual data.

Since C has no scope resolution operator to allow direct implementation of class member functions, the member functions for the class (public action and private action) would need to be declared and defined as standard C functions (presumably static to the module they are in) and assigned to the member function pointers in the class.

Note that these class declarations describe abstract classes. Concrete subclasses are accessed through the interface specified by the abstract class declaration. This gives us polymorphism, the third part of the sacred OO trinity along with inheritance and encapsulation.

Since C has no constructors and destructors, an allocator function must be provided for each concrete subclass type. For example, if you wanted to make three concrete subclasses from the myInheritedClass abstract interface called "one", "two", and "three", the following functions would be defined:

/*
 * allocator and deallocator prototypes
 */
extern myInheritedClass *new_one(int);
extern myInheritedClass *new_two(int);
extern myInheritedClass *new_three(int);

Each of these functions would allocate the memory for the classes, initialize all the data members of the classes (and their base classes), allocate the private data members for each class and assign them to the private data pointer, and assign function pointers to the member functions. The new* functions take an int argument here and assign the value to a member variable, as we'll see, much like a C++ constructor might take an int argument to initialize a data member if desired.

The corresponding delete functions

extern void             delete_one(myInheritedClass *);
extern void             delete_two(myInheritedClass *);
extern void             delete_three(myInheritedClass *);

would deallocate the space used by the classes and their private data.

For implementation, any functions relating to the internal state of any given class will be colocated in the .c file implementing that class. Any code which is a consumer of the class will only be able to use the class through its interface as specified in a header file. This helps insure encapsulation.

The new * and delete * functions would (of course) be implemented in the same file as the specific subclass is implemented. One way to write one of the new * functions would be

/*
 * definition of the private data structures used by the
 * "one", "two", and "three" subclasses
 */
struct one_private_data
{
    int data;
};

[...]

/*
 * implementation of the member functions used to access the
 * class' data
 */
static void public_action(struct myInheritedClass *mc)
{
    printf("class %s, instance id %d\npublic data is %ld\n",
mc->base.name, mc->base.id,
            mc->public_data);
}

static void one_private_action(struct myInheritedClass *mc)
{
    struct one_private_data *pd = (struct one_private_data *)
mc->private_data;

    printf("private data is %d\n", pd->data);
}

[...]

/*
 * templates used to simplify creation of instances of each
 * of the subclasses. alternatively, once could simply assign
 * values to each of the struct membersin the new * functions.
 */
static myInheritedClass one = {
    {
        "one",
        0,
        0
    },
    1,
    0,
    public_action,
    one_private_action
};

[...]

/*
 * class instance allocators and deallocators
 */
myInheritedClass *new_one(int id)
{
    myInheritedClass *rc = 0;

    rc = (myInheritedClass *)
        malloc(sizeof(myInheritedClass));

    if(rc == 0)
        goto error;

    memcpy(rc, &one, sizeof(myInheritedClass));

    rc->base.id = id;

    rc->private_data = malloc(sizeof(struct
        one_private_data));

    if(rc->private_data == 0)
        goto error1;

    ((struct one_private_data *) rc->private_data)->data =
        rand();

    return rc;

error1:
    free(rc);
error:
    return 0;
}

Two interesting things here. First, a static global instance of the specific subclass is created with most of the fields initialized, particularly the member functions. This is then copied to create specific instances of the class in the new function, and instance-specific data members are then initialized there.

Next, the private encapsulation through data hiding via (void *) is illustrated by the allocation of the one private data structure and its assignment to the opaque (void *) private data pointer.

To wrap it up, remember the important point that these implementations are in a file not directly referenced by code that uses these classes; the only interface visible to them is the declarations of the class structures (and the class' member functions), and the new * and delete * functions. The sample code illustrates this. Here is how the classes are used, in a separate source file:

/* from main.c */

#include "class.h"

static myInheritedClass *make_class(int id);

/*
 * this function returns a random inherited class instance
 */
static myInheritedClass *make_class(int id)
{
    myInheritedClass *rc = 0;

    switch(rand() % 3)
    {
        case 0:
            rc = new_one(id);
        break;

        case 1:
            rc = new_two(id);
        break;

        case 2:
            rc = new_three(id);
        break;
    }
    return rc;
}

int main(int argc, char **argv)
{
    myInheritedClass *head = 0, *tmp = 0;
    int i;

    /*
     * make a list of 5 random subclass instances.
     */
    for(i=0;i<5;i++)
    {
        tmp = make_class(i);
        if(tmp != 0)
        {
            tmp->base.next = (myBaseClass *) head;
            head = tmp;
        }
    }

    /*
     * iterate through the list and call the member
     * function to print out each subclasses public
     * and private data. Whee, polymorphism!
     */
    for(tmp = head; tmp != 0; tmp = (myInheritedClass *)
        tmp->base.next)
    {
        printf("{\n");
        tmp->public_action(tmp);
        tmp->private_action(tmp);
        printf("}\n\n");
    }

    return 0;
}

There you have it, object inheritance, encapsulation, and polymorphism in C. In this example, inheritance is used by aggregating myBaseClass as the first member of myInheritedClass; encapsulation is provided by having both public and hidden data members, with the only way to get at the private data types being member functions that know how to access them; and polymorphism, which is provided by the struct definitions for the classes acting as an abstract class interface from which concrete classes are created.

Volatility

One thing I've noticed lately in other people's code is the use, or rather the lack of use, of the volatile keyword.

Essentially, one should declare something volatile if its value will be changed asynchronously by code executing in parallel to other threads using it, especially if the value is tested inside of loops and such. The reason for this is that the compiler has no way of knowing that a variable may be modified by something outside the codeblock it's currently compiling and may do optimizations that involve code motion, such as hoisting assignments or comparisons out of loops, or locally caching the value of a variable once instead of checking the value stored in its real memory location on each loop iteration. Declaring the variable volatile lets the compiler know that it isn't free to make these kinds of optimizations with respect to that variable.

The classic example is for variables that are updated at interrupt time, such as driver registers. But the same is true of variables that may be changed in another thread. If your execution path depends on testing the value of a variable in a loop, and the value may be changed asynchronously by another thread, then you need to declare that variable volatile.


Developers' Workshop: Condition Variables, Part 1

By Christopher Tate

In a multithreaded system like BeOS, the most critical issue is the synchronization of simultaneous tasks. The BeOS kernel provides one synchronization tool, the semaphore, but not all synchronization problems are best addressed this way. In particular, software designed for other operating systems where semaphores are not the primary synchronization tool can be difficult to rewrite using other primitives.

In the Unix world, for example, the basic synchronization primitive is the "condition variable," or "CV" for short. Superficially, CVs behave similarly to semaphores. When a thread waits on the CV, it blocks until some other thread signals that CV. The first thread then awakens and continues executing.

Signalling a condition variable is analogous to releasing a semaphore. The difference between them is in what happens when they're signalled (or released) when no threads are waiting. A semaphore "remembers" the signal; a thread that later tries to acquire the semaphore will succeed immediately because of the earlier signal. In contrast, condition variables don't remember whether they were signalled: if no threads are waiting when the CV is signalled, the signal is "lost," and has no effect.

In this week's article I'll present a condition variable implementation for BeOS. The interface is similar to the POSIX threads API for familiarity and ease of porting. Building a CV implementation using only semaphores turns out to be quite complex, so I'll just explain the code's behavior here and leave the detailed analysis of its workings until next week. You can download the code and an example program that uses it at this URL:

<ftp://ftp.be.com/pub/samples/portability/condvar.zip>

Unfortunately, I don't have enough space here to discuss condition variable semantics in detail. I recommend a good POSIX threads book for this. My personal favorite is David Butenhof's Programming With POSIX Threads , but the O'Reilly Pthreads book is also good. I'm also trying to avoid typing too much because of a hand injury, so for now, I'll just summarize the interface and the example program.

Condition variables are tightly associated with a locking primitive called a "mutex," so the library provides a C interface to both. Here are the APIs for both:

/* mutex operations */
status_t be_mutex_init(be_mutex_t* mutex,
                       be_mutexattr_t* attr);
status_t be_mutex_destroy(be_mutex_t* mutex);
status_t be_mutex_lock(be_mutex_t* mutex);
status_t be_mutex_unlock(be_mutex_t* mutex);


/* condition variable operations */
status_t be_cond_init(be_cond_t* condvar,
                      be_condattr_t* attr);
status_t be_cond_destroy(be_cond_t* condvar);
status_t be_cond_wait(be_cond_t* condvar,
                      be_mutex_t* mutex);
status_t be_cond_timedwait(be_cond_t* condvar,
                           be_mutex_t* mutex,
                           bigtime_t absolute_timeout);
status_t be_cond_signal(be_cond_t* condvar);
status_t be_cond_broadcast(be_cond_t* condvar);

Mutex operations are a lot like BLocker operations; I won't discuss them much here, except to say that, unlike a semaphore, a mutex can only be unlocked by the same thread that locked it. This provides a stronger locking guarantee.

be_cond_wait() and be_cond_timedwait() are the operations for blocking on the condition variable. They're identical except for the timeout semantic: be_cond_wait() blocks forever, and be_cond_timedwait() blocks until the specified time. That time is *absolute*, not relative, and in this implementation it uses the same time base as the kernel's system_time() function. If you want to block until 10 seconds into the future, you pass (system_time() + 10000000) as the timeout value. Note that there is a mutex argument to these functions: that mutex *must* be locked when the wait function is called, and the mutex is guaranteed to be locked when the function returns, even if the function returns an error.

be_cond_signal() awakens one waiting thread, and be_cond_broadcast() awakens *all* waiting threads. It's not necessary for any mutex to be locked when either of these two signalling functions is called.

It's important that your code test the condition you're interested in before blocking on the condition variable, and then test it again after awakening. For example, in the sample Alarm app, the condition is "the alarm queue is not empty." When waiting on this condition, the code looks like this:

while (alarm_list == NULL)
{
     be_cond_wait(&alarm_cond, &alarm_mutex);
}

This before-and-after testing is necessary because in certain rare circumstances the waiting thread may awaken without the condition variable being signalled. This is called a "spurious wakeup." The POSIX spec explicitly permits such behavior because preventing it can make CV implementations far less efficient.

The Alarm sample is a direct rewrite of Example 3.3.4 in the Butenhof book cited above. It repeatedly asks the user for an alarm time, expressed as a number of seconds into the future, and a message associated with that time. It then issues the message at that time. The code is heavily commented to explain how the condition variable is being used, and illustrates both regular and timed waiting. If you're new to condition variables, I recommend playing with the sample application, reading through the code, and thinking about how the implementation would be different if based on semaphores.

That's it for this article. Next week I'll look at the CV implementation itself, and explain why its unusual approach was necessary to guarantee the correct semantics.


Palm and Other Objects of Conventional Wisdom

By Jean-Louis Gassée

After the fact, everything is much clearer and (almost) everyone is much wiser. The latest episode in the Palm saga is (almost) no exception to that rule and... rather than pile on more retroactive wisdom, I'd like to focus on little fractures, on amusing ironies below the surface of the story. Who knows, the tale might contain forward wisdom for us to use, if we act decisively, before it turns retroactive.

A caveat before I proceed: I'm a director of 3Com. As a result, an even larger grain of salt should be required when considering my opinion on Palm matters. All I have is an opinion—I don't deal in Truth.

Let's remember how Palm started. They built PDA software and licensed it to Casio and Sharp—following the approved business model. For the founders, Jeff Hawkins and Donna Dubinsky, this became a frustrating experience in the midst of The PDA Part I, First Blood. Go and Eo came out, as did Microsoft's hommage to the idea, PenWindows. Then we had General Magic, the Newton, and—I was going to forget—Momenta. Let's say about half a billion dollars "invested" in "refining the concept."

In that context, Donna and Jeff had a hard time dealing with their Japanese licensees, who were too cautious, too slow, and too unwilling to invest in new versions of the hardware platform. So Palm's founders went to their Board and committed blasphemy: "Let's take control of our future. We know where we want to go, so let's build hardware that implements our vision".

The answer was a very polite "Let's find a strategic partner." Or more plainly: "These guys are crazy, let's sell the company." And, indeed, US Robotics successfully adopted Palm. The hardware was manufactured and well distributed in a retail channel that US Robotics knew how to feed. The founders, though, still felt uncomfortable, misunderstood, and somewhat disrespected by the strong US Robotics culture—a feeling that appeared to be mutual.

When 3Com acquired US Robotics, Palm wasn't very visible yet and not much was made of that part of the acquisition at the time. Donna and Jeff had great hopes for the relationship with their new adoptive parent. Sales kept growing. The Palm Pilot was now quite visible and had become market leader.

Meanwhile, CE appeared and went through several revisions. Many felt that Palm had to avoid being "Appled," that is, left isolated on a proprietary island. Palm's management resisted the change, the move to a new business model, even at a time when addiction to hardware margins hadn't set in yet—mainly by "virtue" of their absence.

In the early days, Palm sales grew rapidly, but the product wasn't making money. This encouraged some to push for a bloodless transition. Palm's founders found themselves frustrated again and lobbied for their freedom, as a spin-off. They felt misunderstood and stifled by the larger company. In 1998, a spin-off would have destroyed the favorable tax treatment of the US Robotics acquisition, coming too soon after its June 1997 closing. Donna, Jeff, and Ed Colligan left—with a Palm OS license—and started Handspring, a PDA company.

A year later, 3Com feels it can spin off Palm without unfavorable tax consequences and can realize value for its shareholders. Handspring announces a Pilot clone with an interesting approach to hardware modularity. It looks exciting—I've registered on their Web site to be notified of the product's availability.

So, what do we have? A soon-to-be-independent company, prepped for an IPO, just as the founders wanted—but without them. A strong, independent, creative Pilot clone maker, something the founders didn't want—but from them. Was the Pilot successful because Palm took control of both hardware and software, or in spite of that move? Likewise, are Apple's current fortunes exemplary or nonsensical? Is the PDA ecosystem better off because the founders left to start a muscular venture-backed (Kleiner, Benchmark et al.) clone? Or would the ecosystem be better off with the founders running a spun-off Palm, but without Handspring? In sum, is the Palm saga exemplary or totally accidental?

Creative Commons License
Legal Notice
This work is licensed under a Creative Commons Attribution-Non commercial-No Derivative Works 3.0 License.