OpenBeOS Project - Displaying Newsletter

Displaying Newsletter

Newsletter Archive

Issue 2, 07 Oct 2001

In This Issue:

Dumping variables Part 1: basics, byte order, and structure padding by Daniel Reinhold
So, you want to build a kit? by Michael Phipps
Why all languages suck by Michael Phipps


Dumping variables Part 1: basics, byte order, and structure padding	by Daniel Reinhold

This is a sort of back-to-basics article, so all you coding wizards out there can go back to sleep.

Being able to see the exact memory layout of data structures is very useful, even vital in some instances, so most programmers will have a handy function for doing this laying around. Here's mine:

typedef unsigned char byte;
void
show_bytes (void *arg, int size)
    {
    byte *b = (byte *) arg;

    printf ("\n{ ");
    while (size--)
        {
        printf ("%02x ", *b++);
        }
    printf ("}\n");
    }

This allows me to dump the binary contents of any variable like this:

int x = 3476;
show_bytes (&x, sizeof x);

On my machine this will display:

{ 94 0d 00 00 }

The hex value of 3476 is 0xd94, so you can clearly see that my machine, an Intel PC, uses little-endian byte order to store data in memory. I like to think of it as "psychotic byte order" because you surely have a need to mess with people's heads to store data in a screwed-up, reverse order like that!

By the way, if you want to quickly see the hex value of an integral value, here's a tip: go to a command line and type "pc 3476" (or whatever number) to see it displayed in several formats. Pc stands for "programmer's calculator" and was written by Dominic Giampaolo of BFS fame. It is integer only, but still very useful. It's included with every copy of BeOS (and you probably didn't even know it!)

If the variable above had been stored as { 00 00 0d 94 } this would be big-endian byte order. Different machines store data in either byte order, so it was decided (long ago) that all data sent across networks would be put in big-endian byte order (i.e. the "sane byte order") as a standard format to avoid confusion and scrambling of data. Therefore, this format is usually called network byte order or just NBO. This is important for network programming (using sockets, etc.) because you must make sure that any bytes sent across the wire are first put into NBO for delivery and that any bytes received must be converted back to the byte order used by the receiving computer -- usually called host byte order.

Now on to structures.

The function above works fine for structures too. For example:

typedef struct
    {
    char  c;
    short h;
    int   n;
    char  s[11];
    }
foo;
. . .
    foo f;
    f.c = 'A';
    f.h = 4533;
    f.n = -777;
    strcpy (f.s, "hello!");

    show_bytes (&f, sizeof f);

On my machine, this displays:

{ 41 00 b5 11 f7 fc ff ff 68 65 6c 6c 6f 21 00 00 00 00 00 00 }

This is alright, but it would be more useful to see the offsets of the different field members marked in some way. This would make it easier to see the binary contents of each field as well as the padding used to fill out the structure. So I have another function that just dumps structures.

Note that we need to pass a little bit more info here. Every struct can have a different number of fields and field sizes, so we need to pass in the offsets to let the dump function know where the boundaries are. Here's my version:

void
show_struct (void *arg, int size, int offs[])
    {
    int i;
    int k = 0;
    byte *b = (byte *) arg;

    printf ("\n{ ");
    for (i = 0; i < size; ++i)
        {
        if (i == offs[k])
            {
            if (i > 0) printf ("| ");
            ++k;
            }
        printf ("%02x ", *b++);
        }
    printf ("}\n");
    }

The call for this would look about the same as above, except for adding an array to hold field offsets. The simplest way to do this is to initialize an array using the 'offsetof' macro. This is a standard C library macro included in <stddef.h>. Here's how to use it for my example:

    foo f;
    f.c = 'A';
    f.h = 4533;
    f.n = -777;
    strcpy (f.s, "hello!");

    {
    int offs[] =
        {
        offsetof (foo, c),
        offsetof (foo, h),
        offsetof (foo, n),
        offsetof (foo, s)
        };

    show_struct (&f, sizeof f, offs);
    }

On my machine, this displays:

{ 41 00 | b5 11 | f7 fc ff ff | 68 65 6c 6c 6f 21 00 00 00 00 00 00 }

Here we can see how the first field member, a character, has nonetheless been stored in a two-byte slot. Also, the character buffer, defined as 11 bytes is padded to 12 bytes. This gives the entire structure a 20 byte size, so we can infer that the compiler wants to pack structures so that both individual members and the entire structure are memory aligned on two-byte boundaries.

Now to make the output even just a bit more friendly, I'll throw in another version that displays bytes as characters when possible. This is basically just a clone of 'show_struct' modified slightly:

void
show_struct_chars (void *arg, int size, int offs[])
    {
    char c;
    int  i;
    int  k = 0;
    byte *b = (byte *) arg;

    printf ("\n{ ");
    for (i = 0; i < size; ++i)
        {
        if (i == offs[k])
            {
            if (i > 0) printf ("| ");
            ++k;
            }
        c = *b++;
        printf ("%02c ", isprint (c) ? c : '.');
        }
    printf ("}\n");
    }

Combining a call to this function with the previous version gives:

{ 41 00 | b5 11 | f7 fc ff ff | 68 65 6c 6c 6f 21 00 00 00 00 00 00 }
{  A  . |  .  . |  .  .  .  . |  h  e  l  l  o  !  .  .  .  .  .  . }

Gosh, ain't it perty?

Notice that the string "hello!" is in the proper left-to-right order. String data is composed only of single bytes, so byte order doesn't come into play. As such, string data (i.e. a contiguous sequence of characters) can always be sent from one machine to another regardless of local byte order conventions and still be interpreted correctly on the other end. It's only multi-byte data types -- ints, floats, structs, etc. -- that must be packed carefully in NBO in order to be correctly interpreted on the receiving machine.

Unfortunately, there's a big limitation to the technique used in my struct dumping routines. They only handle "flat" structures -- i.e. structures whose fields are basic data types. What about structs that contain other structs? Well, the dump function would need to be a bit more intelligent, but it's not too hard to implement. However, I'll save that for Part 2 in the next newsletter.


So, you want to build a kit?	by Michael Phipps

In my recent work with ScreenSaver, I had the opportunity to build a kit from the ground up. I wanted to take some of the mystery out of it for those who find it terrifying. Some of this material appeared in one form or another previously on this list. Special thanks to Jason Gerstenberger, who did some of the initial "proof of concept" work.

A kit consists of (at least) three parts. Screensaver has a fourth:

A Preferences app which manages the configuration. We all know and love these.
A C++ library (.so) file that provides an API
A server that implements most of the "real" work.
Screen Saver also has an input_filter.

Tying the pieces together:

The preferences app writes flattened BMessages to disk and (sometimes) sends BMessages to the server. The client API (C++ library) sends BMessages to the server for any non-trivial work. Trivial work consists of things like accessors. The server sends replies to the client API.

Making the client API:

Create a new shared library project
Copy the ScreenSaver.h file into your project directory and add to your project
Run stubgen -n -q -g -a ScreenSaver.h to create the cpp
Read through objdump for clues (for this lib you don't need many at all!)
Build your lib & compare objdumps for your .so. with the factory .so.
So far as I could tell, my disassem was identical

This provides you with a stubbed library. Making the C++ API is not too tough. It is just a matter of creating methods that send BMessages to a server. Maybe some parameter checking, some exception throwing, but that is the whole of it.

I advise switching to the server, at this point: Making the server is sort of the guts of it. I started with HelloWorld (believe it or not). I removed the Window code, just keeping the Application piece. I implemented my own DispatchMessages method. Thinking of what a screen saver must do, I made a simple API:

blank
unblank
set the blank time
reset the timer
update the preference

For each of these, I check to see if the incoming 'what' matches the API function and, if so, call it. These functions were, originally, just printf("In function Blank\n");. This allowed me to develop incrementally and test. I proved that my server could receive messages and call functions. Yeehaaa!!!!

Next, I implemented the timer (what is a screen saver without a timer). The simplest way was to use pulse. So I set the pulse setting to 1 second intervals and on pulse, noted that another second had gone by. That allowed me to implement Reset the Timer. Next was making a BDirectScreen to "blank" the screen. That allowed me to implement Blank and Unblank (show and hide). Reset the Timer was actually needed for Unblank, so that came along for the ride.

Screen saver is a bit different from most of the other kits. Most kits have many functions that you can call. Screen saver has only a very few. Most of screensaver's functionality comes from overriding the BScreensaver class and adding functionality to it to draw on the screen. As an example, the point remains, though. The client lib sends messages like "set the blank time". The server accepts that message and deals with it.

There are pieces that are not described here, like preferences and preference loading. This, too, is fairly straightforward stuff. A preferences app makes a BMessage - very straightforward, as is sending it or writing it to disk. Likewise, receiving and/or reading those BMessages is very simple as well.

Here is the link for stubgen: http://www.radwin.org/michael/projects/stubgen/


Why all languages suck	by Michael Phipps

The computer science field is an interesting beast. On one side of life, we have the commercial aspect. This side is driven by having something new. The old is already sold, or so the expression goes. Windows is an obvious example of this. Every few years a new version of Windows comes out - otherwise, Microsoft would have nothing to sell. On the other side of the equation, we have the academic perspective. Not the business side of academia, but the research portion. New ideas and concepts are tested by peers. There is no drive to perfect a "product". Little impetus toward polish. Just writing enough code to prove that you are as smart as you claim.

Both of these groups need each other. If academia were alone, computers would be unusable by the average consumer. There is no drive to the academic mind to perfect things for those who aren't reading professional journals. This is not an indictment of the academic system; it is a description of their focus. The commercial groups, however, could not succeed without the pure research of academia. Most of the critical concepts and ideas in computer science have come from academia. The problem with the commercial world is that there is little drive to do fundamentally better. Adding features and bells and whistles creates a new release, just the same as recreating the product in a better way.

So what does this have to do with languages? Simply this - academia has produced hundreds, if not thousands of "new" programming languages. Rarely, though, are they made accessible. How useful is a language wherein a developer can do terminal IO only - no other interaction with the outside world? Not very, in today's world, but that is sufficient creation for most people in academia. They can point to the language, state that it is Turing complete and consider it finished. To a "real world" developer, however, this language is useless without the bindings and call mechanisms that allow interaction with the operating system in a more complete manner.

The commercial world, on the other hand, tends to do the finishing work. Take (almost) any compiler system in the marketplace. The issues are really the same for C++, C#, perl, and Visual Basic,. All of these languages are evolutionary adaptations of a basis founded 20+ years prior. I do not mean to denigrate authors of the original languages. Indeed - many of their ideas came from yet others. James Gosling and Bill Joy certainly would admit that Java comes from C/C++. Brian Kernighan and Dennis Ritchie would certainly admit that the Algol 60 report and Pascal impacted their thinking on the issues of C.

Isn't it time, though, for something different? Does anyone out there really think that there are no better ways of doing things? There are certainly flaws in each of these languages. A cursory examination of comp.lang.* will make abundantly clear that these languages are far from perfect. Yet it seems that every time we hear of a new language, it is a slightly different spin on something that already exists. That certainly allows for infrastructure reuse. GCC's intermediate language is a prime example of the basic similarity of different languages.

Additionally, every radical paradigm shift in programming has failed to live up to its hype. Structured programming rid us of goto's, but not bad programming. Objects have indeed changed the way that we think about code, but reuse is not as common as it was promised. Component based methodology is supposedly the cure for that. The silver bullet to make us all super-coders. Able to build web browsers in a single bound. Sure. I'll believe that when I see it.