Binary Compatibility in 3 Easy Steps!

In the early days of the Haiku project, a debate raged concerning one of the project's primary goals: maintaining binary compatibility with BeOS R5. The idea was that the only way an effort to rewrite BeOS would be successful was if folks could continue running the apps they already had.

Certainly, a lot of software available for BeOS is open source or actively maintained -- these apps could just be recompiled if necessary. Others -- PostMaster, Gobe's Productive suite and a few other crucial apps -- weren't likely to get rebuilt, either because the original author had stopped maintenance without being kind enough to release the source, or because it just wouldn't be commercially feasible.

Some said that we were crazy; that it couldn't be done. Thankfully, cooler heads prevailed and we're well on our way to a binary compatible clone of R5.

"But wait!" you cry. "How did the cooler heads which prevailed know that the holy grail of binary compatibility was achievable?" I'm so glad you asked! Keep reading and be enlightened, Grasshopper.

There are three basic issues that have to be addressed to ensure binary compatibility:

Names must be identical
This includes class and structure names as well as public, protected and global function and variable names.
Object sizes must be identical
Classes must contain the same number of bytes of data; global variables must be the same size. Maybe BGlobalVar should've been an int32 instead of an int16, but we're stuck with it now.
Virtual function table layout must be identical
The most cryptic and confusing aspect of maintaining binary compatibility. The issue essentially boils down to this: for any given class, there must be the same number of virtual functions, declared in the same order as the original.

The Nitty Gritty

"Good grief!" you say. "How on earth do I keep all this stuff straight?" Ah, Grasshopper, it is easier than you might imagine. Just follow these steps, and you should be binary compatible in no time!

Make a copy of the appropriate Be header file
This is now your header file. You may need to change a thing or two, but what you can (or will need to) change is quite limited, and discussed below.
Implement public, protected and virtual functions
In the course of doing this, you may discover that there are some private non-virtual function declarations that you just don't use. Feel free to axe them! Since they're private, nobody using the class will miss them, and because they're not virtual, they don't affect the vtable layout. Conversely, if you find a need to add some private, non-virtual functions, go right ahead (for the very same reasons).
Make sure you don't change the number of bytes used for data
There are two situations that can make this seem difficult. First, there may be data members that you don't use in your reimplementation. You can just leave them (safe, but a little messy) or you can add the extra members' bytes to the class's "unused" data array. An example will make this clear.
Let's say we have a class BFoo:
```
    class BFoo {
        public:
            BFoo();
            void SomeFunc();
        private:
            int32 fBar;
            int32 fQux;
            char  fZig;
            int32 fUnused[2];
    };
        
```
The Be engineers that originally wrote this BFoo purposely added some data padding in the form of an array of 2 int32s (they did this with most classes in the API). Now let's suppose in your implementation, you really didn't need fQux. You can add fQux's bytes into fUnused:
```
    class BFoo
    {
        ...
        private:
            int32 fBar;
            char  fZig;
            int32 fUnused[3];
    };
        
```
Nothing could be easier!
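If you want the compiler to confirm that the size really is unchanged, a check along these lines can help. This is only a sketch: the `int32` typedef stands in for the BeOS one, the two namespaces are made up so that both layouts can live in one file, and the members are public purely for brevity.

```cpp
#include <cassert>
#include <cstdint>

typedef int32_t int32;  // stand-in for the BeOS int32 typedef (assumption)

// Data layout copied from the original header.
namespace original {
struct BFoo {
    int32 fBar;
    int32 fQux;
    char  fZig;
    int32 fUnused[2];
};
}

// Data layout of the reimplementation: fQux folded into fUnused.
namespace rewritten {
struct BFoo {
    int32 fBar;
    char  fZig;
    int32 fUnused[3];
};
}

// Fails the build if the reimplementation grows or shrinks the class.
static_assert(sizeof(rewritten::BFoo) == sizeof(original::BFoo),
    "BFoo changed size; binary compatibility would break");
```

Note that the check covers the total size only. fZig ends up at a different offset, which is harmless as long as nothing outside your implementation touches the private members.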
"But what if I don't need fZig, either?" you wonder. "It's only one byte, not four like an int32!" Have no fear! Just rename it "fUnusedChar" and be done with it.
The second situation that can make preserving object size tricky is if there aren't enough bytes of data available. Building on our cheesy BFoo example, let's suppose that rather than getting rid of fQux and fZig, you actually needed to add another 4 int32s worth of data: fNewData1 through fNewData4. The original implementation of BFoo has two extra int32s which we can use, but that leaves us two int32s short. What to do? The easiest thing to do is create a data structure to hold your new data and convert one of the fUnused items into a pointer to that structure:
```
    // Foo.h
    struct _BFooData_;
    class BFoo
    {
        public:
            BFoo();
            ~BFoo();
            void SomeFunc();
        private:
            int32 fBar;
            int32 fQux;
            char  fZig;
            _BFooData_* fNewData;
            int32 fUnused[1];
    };
    // Foo.cpp
    struct _BFooData_
    {
        int32 fNewData1;
        int32 fNewData2;
        int32 fNewData3;
        int32 fNewData4;
    };
    BFoo::BFoo()
    {
        fNewData = new _BFooData_;
    }
    BFoo::~BFoo()
    {
        delete fNewData;
    }
        
```
Voila! More data without making the class bigger. Notice the added destructor; make sure you're cleaning up your new (dynamically allocated) data.
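One caveat is worth spelling out: swapping an int32 for a pointer only preserves the size when a pointer occupies exactly four bytes, as it does on 32-bit x86. The sketch below makes that assumption explicit. The names `BFooData`, `original`, `extended` and `PointerFitsInReservedSlot` are hypothetical, and the layout assumes fQux is kept, as the text describes.

```cpp
#include <cassert>
#include <cstdint>

typedef int32_t int32;  // stand-in for the BeOS int32 typedef (assumption)

struct BFooData;        // hypothetical name for the out-of-line data

namespace original {
struct BFoo {
    int32 fBar;
    int32 fQux;
    char  fZig;
    int32 fUnused[2];
};
}

namespace extended {
struct BFoo {
    int32 fBar;
    int32 fQux;
    char  fZig;
    BFooData* fNewData;  // takes over one of the reserved int32 slots
    int32 fUnused[1];
};
}

// True when a pointer fits in exactly one reserved int32 slot.
inline bool PointerFitsInReservedSlot()
{
    return sizeof(void*) == sizeof(int32);
}

// Only enforced where pointers are 32 bits wide; on a 64-bit build the
// condition is skipped, because the trick does not apply there as-is.
static_assert(sizeof(void*) != sizeof(int32)
        || sizeof(extended::BFoo) == sizeof(original::BFoo),
    "BFoo changed size; binary compatibility would break");
```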

More about vtables

A vtable (or virtual function table) is created for each class with virtual methods. It allows the program to find the right version of a method to call at run time, no matter how much subclassing and overriding of methods took place.

As with the object contents, the vtable is part of the ABI, so its layout (size, and order of the contents) must not change.

If that principle were followed strictly, it would never be possible to add a virtual method to a class or remove one from it (because that would change the vtable size). Over the years, this became quite a problem in Haiku, as we needed to extend several classes with new functionality.

Fortunately, the Be engineers had planned for a way to achieve that. Just like for the object data, they added some reserved slots to each class. Usually it looks something like this:

	class BFoo {
		public:
			virtual void Foo();

		private:
			virtual void Reserved1();
			virtual void Reserved2();
	};

When you need to add a new virtual method to such a class, you can do just this:

	class BFoo {
		public:
			virtual void Foo();
			virtual void NewFancyStuff();

		private:
			virtual void Reserved2();
	};

Then, the vtable layout is not changed, and ABI is preserved!

… Or is it?

Well, actually, there is more to it. Let's see how the vtable looks for a subclass of BFoo declared this way:

	class MyFoo : public BFoo {
		public:
			void Foo() override;
	};

This class overrides the Foo method from BFoo, but leaves the two reserved methods untouched. Hence its vtable will have 3 entries:

MyFoo::Foo

BFoo::Reserved1

BFoo::Reserved2

Unfortunately, as you can see, the vtable references the reserved methods by name. When Reserved1 is removed from BFoo, we suddenly get an undefined reference, because the vtable of the already compiled subclass still points at a method which is now gone. So, we need to make this work somehow.

We cannot re-add the method to BFoo, because if we make it virtual, it would change the vtable size, and if we don't, it wouldn't have the same mangled name ("mangled" names are a conversion of C++ method names to a longer and somewhat cryptic string, used so that the linker can differentiate methods with the same name but different implementations, for example taking different parameters or being from different classes).

So, we have to manually provide a symbol which looks like the now gone reserved method. Fortunately, the C++ language is flexible enough to let us do this (well, by bending it a little further than usual). First, we need to know the mangled name of the method. There does not seem to be an easy tool for that, so you will need to use readelf or objdump to list library symbols to find it. Then we declare this symbol as an extern "C" function:

	extern "C" void Reserved1__4BFoo()
	{
	}

The extern "C" tells the compiler to not mangle the name further (it uses C mangling rules, which are essentially "just use the function name as is"). Note that some extra care is needed when using different versions of gcc, as they do not all use the same mangling rules for C++ (this is the main reason Haiku is still required to use gcc2 to provide BeOS compatibility).

So, we fixed our undefined reference, and our program is running again. But there is an additional problem. MyFoo is still referencing BFoo::Reserved1 in its vtable. So, if someone writes:

	MyFoo foo;
	foo.NewFancyStuff();

It will not work, because the Reserved1 method will be called instead. Since we cannot fix the vtable, we have to live with that extra indirection and make our replacement for Reserved1 do the right thing:

	extern "C" void Reserved1__4BFoo(BFoo* object)
	{
		object->BFoo::NewFancyStuff();
	}

There are several things happening there:

Reserved1 now takes a pointer to a BFoo object as its first argument. This is where the "this" pointer is placed when calling a method in C++, and that is how the compiler thinks Reserved1 should be called.
We use that pointer to call the new method NewFancyStuff.
We take care of calling it with the BFoo:: prefix, which forces the compiler to skip any subclass vtable and look for the implementation in BFoo directly.
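Because the real trick depends on gcc2 name mangling, it is hard to try out in isolation. The following is a portable model of the same idea, not the actual mechanism: vtables are imitated with plain arrays of function pointers, and every name in it (`Object`, `Method`, `kOldMyFooVtable`, `CallNewFancyStuff` and the free functions) is made up for illustration.

```cpp
#include <cassert>
#include <string>

// A hand-rolled "vtable": an array of function pointers indexed by slot.
struct Object;
typedef std::string (*Method)(Object*);

struct Object {
    const Method* vtable;
};

// Methods provided by the new version of the library.
std::string BFoo_Foo(Object*)           { return "BFoo::Foo"; }
std::string BFoo_NewFancyStuff(Object*) { return "BFoo::NewFancyStuff"; }
std::string BFoo_Reserved2(Object*)     { return "BFoo::Reserved2"; }

// Compatibility shim: keeps the old symbol alive and forwards the call,
// using the object pointer it receives as the "this" pointer.
std::string BFoo_Reserved1(Object* self)
{
    return BFoo_NewFancyStuff(self);
}

// Override supplied by the old application.
std::string MyFoo_Foo(Object*) { return "MyFoo::Foo"; }

// Table baked into the old application when it was built against the old
// header: slot 1 still names Reserved1.
const Method kOldMyFooVtable[3] = { MyFoo_Foo, BFoo_Reserved1, BFoo_Reserved2 };

// What new code does when it calls NewFancyStuff(): a jump through slot 1.
std::string CallNewFancyStuff(Object* object)
{
    return object->vtable[1](object);
}
```

An object built with the old table still reaches the new method, at the cost of one extra hop through the shim.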

And that should be it! We are now ABI compatible!

Except when more subclasses are involved!

Ah yes, right. There is an additional problem we need to solve. Consider a base class BView and a derived class BButton, which both live in libbe, as well as a third-party class FooButton derived from BButton. When reusing e.g. the virtual slot _ReservedView4() in BView to implement PreferredSize(), we create a similar situation, with an additional snag: initially the vtables of BView, BButton and FooButton all referred to BView::_ReservedView4(). After introducing BView::PreferredSize() and overriding it in BButton, FooButton's vtable still refers to BView::_ReservedView4(), i.e. invoking PreferredSize() on a FooButton calls the reserved symbol function we provide for compatibility. Having it call BView::PreferredSize(), as in the previous example, is not quite what we want, however, since we actually want BButton::PreferredSize() to be called.

The solution the software engineers at Be Inc. came up with is the following: the root class of the class hierarchy, BArchivable, defines a virtual method Perform() which every class in the hierarchy overrides (or should override). The method provides a mechanism to emulate a virtual function call.

In the PreferredSize() example, the compatibility function _ReservedView4__5BView() doesn't call BView::PreferredSize() directly, but instead it calls Perform(PERFORM_CODE_PREFERRED_SIZE,...). All derived classes that override PreferredSize() also need to handle PERFORM_CODE_PREFERRED_SIZE in their Perform() implementation and call their respective implementation of the PreferredSize() method, i.e. for BButton that's BButton::PreferredSize().
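The routing through Perform() can be modelled in a few lines. The sketch below is not the libbe code: the classes `View`, `Button` and `ThirdPartyButton`, the code `kPerformPreferredSize`, the `PreferredSizeData` struct and `ReservedView4Shim()` are simplified, hypothetical stand-ins, and the methods return strings so that it is easy to see which implementation ran.

```cpp
#include <cassert>
#include <string>

enum { kPerformPreferredSize = 1 };  // hypothetical perform code

struct PreferredSizeData {           // hypothetical in/out argument block
    std::string result;
};

class View {
public:
    virtual ~View() {}
    virtual std::string PreferredSize() { return "View::PreferredSize"; }

    // Emulates a virtual call: each class handles the codes it knows
    // and passes everything else up to its base class.
    virtual int Perform(int code, void* data)
    {
        if (code == kPerformPreferredSize) {
            static_cast<PreferredSizeData*>(data)->result
                = View::PreferredSize();
            return 0;
        }
        return -1;  // unknown code
    }
};

class Button : public View {
public:
    std::string PreferredSize() override { return "Button::PreferredSize"; }

    int Perform(int code, void* data) override
    {
        if (code == kPerformPreferredSize) {
            static_cast<PreferredSizeData*>(data)->result
                = Button::PreferredSize();
            return 0;
        }
        return View::Perform(code, data);
    }
};

// Stands in for an old third-party class that knows nothing about
// PreferredSize() but inherits Perform() from Button.
class ThirdPartyButton : public Button {
};

// Stands in for the compatibility function behind the reserved slot.
std::string ReservedView4Shim(View* view)
{
    PreferredSizeData data;
    view->Perform(kPerformPreferredSize, &data);
    return data.result;
}
```

Since Perform() already had its own vtable slot when the old application was built, the call reaches Button's handler even for the third-party class.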

And there you have it: Binary Compatibility in 3 (not-so-)Easy Steps!