Issue 20, 17 Jun 2002


  In This Issue:
 
Engineers, artists, and coding style by Daniel Reinhold 
We often have people ask about coding guidelines for the project. The standard answer is that we are using the OpenTracker guidelines. We then refer them to the proper link and the question is neatly taken care of, for the moment. Those who glance at those guidelines will realize, however, that they still leave a lot of room for variation. Some specifics, such as indentation and spacing, are treated in very explicit detail, while others are skipped over, left to the programmer's judgement.

Just how committed to tight coding styles are we in this project? It will depend on who you ask. It has been suggested that we develop our own set of coding guidelines that outline all the details in full. The argument is that it wastes developers' time if they have to spend effort trying to decipher someone else's coding style whenever they switch from one component to another. It's a worthy goal -- a consistent interface across all the files in a source tree. Many software houses require all their programmers to adhere to a strict coding style (generally written out by the original programmer at the company). It's a little more difficult for us to crack the whip on this project, as we are not paying salaries and can't threaten anyone with anything too heavy.

But this is also a very practical project. We want to get results. We depend, very heavily, on those who can make a real contribution. If Joe Schmoe is writing a ton of code for the project, how much incentive do we have to jump on Joe's case because he uses his own quirky coding style that many others don't care for? So long as the stuff he writes works correctly, it's difficult to bite the hand that produces.

This sort of dilemma is actually very common. I knew a fellow once who wrote some of the most awful code you have ever seen. It broke almost every good practice that you can think of: short, weird variable names that often appeared duplicative (a1, a_1, a11); long complex functions that were thousands of lines long and riddled with goto jumps to mysteriously named labels; tedious and complex numerical calculations filled with magic numbers without a hint of a comment to their purpose,... and on and on. And yet this same fellow was unbelievably meticulous about checking everything. Amazingly, his code worked beautifully -- it was absolutely rock solid. Of course, great pity on whoever might end up with the task of maintaining that code.

The truth is, computers don't know a damned thing about C or C++ or Java or Pascal or Lisp or Scheme or whatever language strikes your fancy. Computers run on machine code, pure and simple. I sometimes have to remind others or even myself of this basic fact from time to time. As programmers, we spend so much time with the compiler it becomes easy to forget that the compiler is only a dutiful messenger who takes our textual flights of fancy and renders something that our hardware can actually use.

That's why reading assembly listings or even examining the object code directly is a valuable exercise for most programmers. It's humbling and enlightening. You get a true glimpse of how your elegantly worded and designed source code is turned into the long flat sequences of primitive operations on bits, bytes, and registers that your computer actually understands.

On the other hand, we don't want to write in assembly. We need the abstractions of our high level languages to collect and organize our thoughts. We also need to explain and communicate our ideas to others. Writing software is really a community affair -- we all base our own work on the previous good works of others.

So when we create software groups, such as this project, we are creating a little community. And within that community will be a variety of types, ranging from the wise old men to the village idiots. We have to be able to accommodate the eccentricities and quirks of each individual and still promote an ethic of cooperation and willingness to sacrifice a little for the common good.

The members of this community are diverse and cover a large range of personalities. However, there are a few stereotypical cases that always crop up. Let's take a quick look at two common types of programming animals: engineers and artists.

Engineers are the meticulous type, attentive to every detail they consider important, inattentive to everything else. They often look the part too -- consider an image of Dilbert with cropped hair, thick glasses, and a pocket protector full of pens. Think of the kind of old school nerds that worked for NASA or big companies such as IBM. They like math. They hate soft subjects like psychology. They want definitive answers. They believe anything worth knowing is quantifiable, and they spend a lot of time quantifying things. They keep careful logs of their work. Their source code is complex, hard-to-read, poorly commented, and works very, very well. Generally it works so well because they analyze it piece by piece and add to it incrementally, retesting as they go. They mean exactly what they say, literally (not that they say much). If you misunderstood, it was your mistake -- they'll happily provide you with the definitions of words that they use. In general, they annoy the hell out of everyone else, but then again, everyone listens when they speak -- because they are almost always right.

Artists are dreamers, planners, thinkers on a higher abstract plane. You can usually identify them by their sloppy, casual clothes and unkempt long, long hair. Think of the stereotypical new-wave geek so in fashion during the dot com boom, coding at his desk all night long listening to blaring heavy metal music. They like art and music. They hate conformity. They believe in a kind of perfect expression which they are always reaching towards. Their source code is well designed, well commented, but always incomplete. Coding is done in bursts -- short intervals where a massive amount of code is banged out at once, followed by days or weeks of inactivity while they ponder their next move. Quite often, the same source code is completely re-written from scratch several times. It's got to be perfect, but it never is. They are funny, bright, quick-witted, sarcastic, and very prone to bad moods and depression. They are well liked, for the most part, because of their brilliance and wit and irreverent humor, but are feared too, because of their mood swings and occasional outbursts of caustic anger.

Can you recognize the descriptions above? Do they sound frighteningly close to someone that you know (or perhaps, even yourself)? Of course, these are the most extreme stereotypes. Most of us in the programming camp don't really fit neatly into one type or the other. We have a mix of the traits described above. But then again, you could probably take a group of any programmers and, within a short time, divide them up generally into one of those two categories and not be far off the mark.

So how do we, the members of this project, accommodate different individuals with such clashing styles and outlooks? Simple. We don't push too hard. We take what we can get. We respect results, but we also respect ideas. We let each carve out his own space and walk lightly around that space. We learn to live with different perspectives. We try to foster listening and communication, not shouting and lecturing.

Oh, and when all that fails, we turn to Michael Phipps and force him to make the final, hard decisions. It's lonely at the top (*grin*).

 
Ezprof: A home brewed profiler courtesy of gcc options and attributes by Daniel Reinhold 
A few years ago, the folks at the Free Software Foundation quietly changed the meaning of gcc. Whereas it originally meant "GNU C Compiler", they now say that it stands for "GNU Compiler Collection" since it has become a front end for many different compilers. However, I have my own theory and have decided that it really stands for "GNU Command Collection". This is in honor of the more than 10,000 command line options that one can pass to gcc.

Alright, maybe not 10,000... perhaps I exaggerate a wee bit. But gcc is certainly *rich* with options -- more than I will ever use in my lifetime, I suspect. Nevertheless, I recently discovered, among the dense forest of choices, a gcc option that is very cool and that I can't resist playing with. This option allows you to trace all function calls in the source code. Simply include:

-finstrument-functions

on the compile command line, and gcc will insert calls to special hook functions at the entry and exit of every function in the source code. To enable this, you must define a pair of functions with these signatures:

void __cyg_profile_func_enter (void *this_fn, void *call_site);
void __cyg_profile_func_exit  (void *this_fn, void *call_site);

Not surprisingly, __cyg_profile_func_enter() is called at the entry of every function. The address of the function being entered is the first parameter; the address of the calling function is the second parameter. Likewise, __cyg_profile_func_exit() is called at the exit point of every function, using the same two parameters.

Ah, system hooks! (eyes widen with excitement). What fun you can have when you are provided with hooks into the system. With entry and exit hooks such as these, the obvious exploits are to implement tracing (to observe the call tree) and profiling (to obtain statistics on relative function usage and timing). You could probably even add in enough backing code to support a kind of primitive debugger, but the first two are good starters.

Having seen the availability of these hooks, there was no way I could resist trying to write some code to take advantage of them. So what kind of service should I implement? Tracing or profiling? Hmmm... how about both? It's not too hard to do a little of each, and, without being too greedy, generate scads of useful info for a small amount of work.


Tracing the call tree

Functions call other functions (which, in turn, call other functions). This creates a run time structure that is like a tree. For instance, function A calls function B which calls function C:

enter A
    enter B
        enter C
            ...
        exit C
        ...
    exit B
    ...
exit A
Tracing function calls allows you to view the tree-like traversal of the execution environment. Of course, you can do your own tracing by hand, as in:
void
foo ()
    {
    printf ("entering foo...\n");
    . . .
    // function code
    . . .
    printf ("...exiting foo\n");
    }
Probably every programmer has written code like that at least a few times, usually in the midst of a bug hunt. But imagine doing that for every single function in the source code. It's too tedious to even contemplate. That's the beauty of the gcc profile hooks: they'll generate the entry and exit calls for you, automatically, and for every single function. All you have to do is flesh out the hook routines with something useful.

There is one little problem with this scenario. If the compiler generates entry and exit hooks for every function in the source code, won't it generate those for the hooks themselves (after all, they are functions too)? Well, yes, it will. Unless you instruct gcc not to generate the hooks. There is a special function attribute that will stop the compiler from trying to trace the tracing functions themselves.

A function attribute? What is that? Well, gcc has a set of extensions called attributes that allow you to control some specific aspects of the compiler's behavior. Mostly these attributes concern low-level, architecture specific details. Look at the document Declaring Attributes of Functions for more information about these directives. They're a pretty fascinating topic in and of themselves. Of course, since they are extensions, they are specific to gcc, so any source code that makes use of attributes will be non-portable to other compilers.

The attribute that prevents gcc from generating the tracing hooks is:

__attribute__((__no_instrument_function__))

Yes, the double parentheses are required. Just place this tag immediately before a function name in the definition, and gcc will not emit the extra calls. All other functions without this special attribute will have the tracing hooks generated. To make this easier to use, I define a macro called NO_TRACE that expands to this rather awkward text.
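
With that macro in hand, a bare-bones tracer that does nothing but print each entry and exit (no server involved yet) might look like the following sketch. This isn't the real trace.c -- that comes below -- but it shows the minimum needed to start playing with the hooks:

#include <stdio.h>

#define NO_TRACE __attribute__((__no_instrument_function__))

void
NO_TRACE
__cyg_profile_func_enter (void *this_fn, void *call_site)
    {
    // the hook itself carries NO_TRACE, otherwise gcc would instrument it too
    printf ("enter %p (called from %p)\n", this_fn, call_site);
    }

void
NO_TRACE
__cyg_profile_func_exit (void *this_fn, void *call_site)
    {
    printf ("exit  %p\n", this_fn);
    }

Compile this file together with any program you like, add -finstrument-functions to the command line, and every function call in that program will be echoed to stdout.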


Ezprof

Ezprof is the cheesy name that I chose for my tracer/profiler (cuz it's so EZ.. see?). Since tracing is a rather dynamic thing, I thought it would be fun to write ezprof as a server. That is, the running program would send messages (via IPC) to ezprof, which would then display the call tree as each function was invoked. The overhead for sending the messages is small, so there is little (if any) effect on the program execution. When the client program finishes, the server then computes a few profiling statistics and writes them to stdout.

So how is the client/server messaging set up? A standard message port is used to deliver the function addresses from the client app to the profile server. The client calls write_port() each time a hook is invoked, and the server calls read_port() to receive the messages. There is a file called trace.c that contains the needed tracing hooks. You need only include this file with any source program that you wish to profile. Another file, ezprof.c, implements the server.
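
The symbol handling and statistics are all in ezprof.c (see the download below), but the heart of the server is nothing more exotic than a create_port()/read_port() loop. Here's a stripped-down sketch of that loop -- no symbol table, no stats, just echoing the messages -- to show the shape of it:

#include <stdio.h>
#include <OS.h>

int
main ()
    {
    int32   code;
    uint32  addr;
    port_id port = create_port (100, "ezprof_port");

    if (port < 0)
        {
        fprintf (stderr, "unable to create ezprof_port\n");
        return 1;
        }

    // pull one function address per message until something goes wrong
    while (read_port (port, &code, &addr, sizeof addr) >= 0)
        {
        if (code == 'entr')
            printf ("enter 0x%08lx\n", (unsigned long) addr);
        else if (code == 'exit')
            printf ("exit  0x%08lx\n", (unsigned long) addr);
        }

    delete_port (port);
    return 0;
    }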

To set this up, first compile the server and run it. Then compile your source code with the -finstrument-functions compiler option and the trace.c file included. Then run the client program -- it will send messages to the server at the entry and exit of each function call. The server counts how many times each function is called, and stores the cumulative time spent in each function. When the client exits, the server displays the final statistics and stops.

Here's the file trace.c containing the needed client interface:

#include <stdio.h>
#include <stdlib.h>    // for exit()
#include <OS.h>
#define NO_TRACE __attribute__((__no_instrument_function__))
void NO_TRACE __ezprof_trace (int, void *);
void
NO_TRACE
__cyg_profile_func_enter (void *this_fn, void *call_site)
    {
    // called upon function entry
    __ezprof_trace ('entr', this_fn);
    }
void
NO_TRACE
__cyg_profile_func_exit (void *this_fn, void *call_site)
    {
    // called at function exit
    __ezprof_trace ('exit', this_fn);
    }
	
void
NO_TRACE
__ezprof_trace (int mode, void *this_fn)
    {
    // sends the function address to the ezprof server
    //
    // portability warning:
    //   this routine assumes the function address is 4 bytes
    //   (true for IA-32 machines -- Intel PCs, etc.)
	
    static bool    initialized = false;
    static port_id port;
    static uint32  addr;
	
    if (!initialized)
        {
        port = find_port ("ezprof_port");
        if (port < 0)
            {
            fprintf (stderr, "unable to connect to ezprof server\n");
            exit (1);
            }
        initialized = true;
        }
    addr = (uint32) this_fn;
    write_port (port, mode, &addr, sizeof addr);
    }

Not much to it, really. The compiler hooks are just wrappers for the one tracing routine which sends the given function address along with the message code 'entr' or 'exit' to the ezprof server.

The 'addr' variable isn't strictly required, but it's useful for two reasons. First, we need to pass the address of a buffer to write_port(), and using a static variable avoids passing the address of a value on the stack. Secondly, it balances the write_port() and read_port() calls by allowing both to use a 4-byte integer for the message buffer.

Of course, for this to be valid, it has to be assumed that the function address is, in fact, 4 bytes. This will be true on any IA-32 machine (Intel Architecture-32 bit) and probably most other 32-bit computers as well. Function addresses are not the same size as integers on every machine architecture, but they are on most, and none of the exceptions are of any importance to us in the BeOS world.
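
If you want a cheap safety net for that assumption, a one-time check at startup will do. Here is a sketch (not part of the actual trace.c; uint32 comes from OS.h):

#include <assert.h>
#include <OS.h>

static void
check_pointer_size (void)
    {
    // trace.c stuffs a function pointer into a uint32, so refuse to run
    // on any machine where that would lose information
    assert (sizeof (void *) == sizeof (uint32));
    }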


Gathering symbols

As convenient as the gcc tracing hooks are, there is one obvious problem that has to be dealt with: only the function addresses are passed, not the symbolic names. This limits the usefulness of the tracing and profiling output considerably. It's not very convenient to report that function 0x800009f0 was invoked 23 times. What we really want to report is that function my_function() was invoked 23 times. Thus we need to load the symbolic names that are associated with each address.

Ordinarily the symbols are located within the image file itself. For some compilers, you have to specify that the symbols be kept, otherwise they are stripped out during the compilation. Gcc, however, keeps all symbols in the image file -- at least by default. Because of this, you should not have to worry about anything, just compile as usual. Obviously, you don't want to use any command options that will remove the symbols.

How to go about gathering the symbols? Well, a couple of options come to mind. The first idea I had was to shell to 'nm', a binutils program that returns a listing of all the symbols found in an image file, and just read from its output. This program may or may not be included with your version of the BeOS. If it's not already present, you can download it as part of the BeOS R5 Developer Tools found here.

But then, I remembered that you can load the image symbols directly by using some image functions in the Kernel Kit. It's pretty simple really. Here's a test program I wrote to do just that.

// ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
//  symdump.c
//
//  displays the symbols found in a given BeOS image file
//
//      Copyright (c) Daniel Reinhold
//      written May 2002
//
// ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <image.h>
#include <Errors.h>
#include <OS.h>
char *
symboltype (int32 type)
    {
    switch (type)
        {
        case B_SYMBOL_TYPE_DATA: return "DATA";
        case B_SYMBOL_TYPE_TEXT: return "TEXT";
        case B_SYMBOL_TYPE_ANY:  return "ANY";
        }
    return "?";    // unknown symbol type (also keeps the compiler happy)
    }
void
set_absolute_path (char *path, int len, char *arg)
    {
    char *p = strchr (arg, '/');
    if (p)
        strcpy (path, arg);
    else
        {
        getcwd (path, len);
        strcat (path, "/");
        strcat (path, arg);
        }
    }
int
main (int argc, char *argv[])
    {
    if (argc == 2)
        {		
        image_id img;
        char     imgfile[256];
		
        set_absolute_path (imgfile, 256, argv[1]);
		
        img = load_add_on (imgfile);
        if (img < 0)
            printf ("unable to load image: '%s'\n", strerror (img));
        else
            {
            char  buf[600];
            int32 len;
            int32 type;
            void (*addr)(void);
            int   i, n = 0;
			
            for (i = 0; ; ++i) 
                {
                len = sizeof buf;
                n = get_nth_image_symbol (img, i, buf, &len, &type, (void **) &addr);
                if (n == B_BAD_INDEX)
                    break;
                printf ("[%4d]:%4s\t %p\t '%s'\n", i, symboltype (type), addr, buf);
                }
            }
        }
    else
        puts ("usage: symdump imagefile");
	
    putchar ('\n');
    return 0;
    }

This is nice, but there's still a problem... C++ name mangling. C++ programs generate mangled names for their symbols, and the simple routine above can't decode them.

For example, a function with the signature "int doit (char *, int, int)" becomes "doit__FPcii" in the image file. There is no demangling function available in the Kernel Kit, which is too bad (perhaps we need to add one). I don't know the algorithm gcc is using either. I tried looking it up, found a few references, but nothing definitive. You can somewhat guess the algorithm used by comparing mangled versions to their original form, but writing a general function to do this correctly for all cases is very non-trivial.

So I went back to my original idea. You see, 'nm' has an option, --demangle, that will decode the mangled names back to their readable versions in the source code. Thus, ezprof executes the call "nm --demangle" at startup time and receives the program output via a pipe. The output is parsed and the symbols loaded into a hash table.
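
The plumbing for that lives in ezprof.c, but the idea is just popen(). Here's a rough sketch of the symbol loading -- the parsing is simplified, and hash_insert() is a hypothetical stand-in for the real table code:

#include <stdio.h>

// hypothetical stand-in for the real hash table insertion in ezprof.c
extern void hash_insert (unsigned long addr, const char *name);

void
load_symbols (const char *imgfile)
    {
    char  cmd[512];
    char  line[1024];
    FILE *p;

    // let 'nm' do the demangling and read its output thru a pipe
    sprintf (cmd, "nm --demangle %s", imgfile);

    p = popen (cmd, "r");
    if (p == NULL)
        return;

    while (fgets (line, sizeof line, p))
        {
        unsigned long addr;
        char          type;
        char          name[1024];

        // a typical nm line looks like:  "80000c80 T main"
        if (sscanf (line, "%lx %c %[^\n]", &addr, &type, name) != 3)
            continue;

        if (type == 'T' || type == 't')     // text (code) symbols only
            hash_insert (addr, name);
        }

    pclose (p);
    }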


Keeping statistics

Given that the server can load up the symbolic names and addresses for functions in an image file, what kind of statistics should be kept? Well, your imagination is the limit here, but I only do the obvious stuff.

Naturally, we want to know how many times a function is called, so we keep a count. We also want to know how long each function invocation took, so we keep some timing info too. In the case of ezprof, this is stored in two ways: one variable marks the system time when the function is invoked, another stores the cumulative (total) time that all the function calls have used. Here's the node structure used by the hash table:

typedef struct node node;
struct node
    {
    char     *name;        // function name
    u4        addr;        // function address
    u4        calls;       // number of times the function has been called
    bigtime_t start_time;  // time the current function call began
    double    cum_time;    // cumulative time spent in function
	
    node *next;
    };
This is pretty much the basic info that any profiler would want. You could do more if you care to add in the extra processing.
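
Updating those fields is then just a matter of poking the right node whenever a message arrives. A sketch of that bookkeeping -- it assumes the node struct above (and its u4 typedef), uses system_time() from OS.h, and treats find_node() as a hypothetical stand-in for the real hash table lookup:

extern node *find_node (u4 addr);    // hypothetical: hash table lookup

void
update_stats (int32 code, u4 addr)
    {
    node *n = find_node (addr);
    if (n == NULL)
        return;                          // not a function we are tracking

    if (code == 'entr')
        {
        n->calls += 1;
        n->start_time = system_time ();  // mark the moment of entry
        }
    else    // 'exit'
        {
        // add this call's elapsed time to the running total
        n->cum_time += (double) (system_time () - n->start_time);
        }
    }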


Fixing up addresses

When the server starts up, it loads the symbolic names and addresses of each function as found by the 'nm' utility. However, the addresses aren't quite right. They need to be "fixed up" before being added to the hash table. Fortunately, the fixup is very, very simple. But the reason for the fixup is a little more complicated. I can't really explain it all without delving a bit into the topic of linking and loading.

The process of compiling/executing programs is naturally divided into two phases usually called "compile time" and "run time". Compilers themselves don't execute code. They merely translate the source code (text files) into a binary format and save it to disk. The binary file is available for execution later. To execute the binary file, another program, called a loader, loads the binary file into memory and then runs it.

To be more precise, the compiler proper creates the object (*.o) files, and another program, called a linker, stitches the object files together to create the image file, which is saved to disk. For BeOS Intel, the binary format is ELF, so I'll just refer to the image files as ELF files for simplicity.

Now there is a mix of terms used to describe these utilities that can be very confusing. The linker is also sometimes called a "linking loader" or even just a "loader". This is reflected in the name of gcc's linker which is called "ld". The program that loads existing image files is called a "program loader", "run time loader", or just "loader" or, sometimes, it's even referred to as the "run time linker". Not very helpful. For my part, I will always refer to the compile time utility that creates ELF files as the "linker" and the run time utility that loads existing ELF files as the "loader".

Obviously, for the compilation/execution process to succeed, the linker and loader have to work together. Originally, there was something of a linking/loading mismatch in ezprof -- at least, without the function address fixup. I was quite disappointed when my first sample runs wouldn't produce a function trace, but instead only gave "address not found" errors when doing table lookups. An inspection of the addresses made it clear very quickly what was going wrong: if function foo() had address 0x80000c80 sent by the client app, it had address 0x00000c80 stored in the hash table. There was an additional offset of 0x80000000 in the client addresses that wasn't present in the addresses stored by the server.

Of course, the difference lies in the source of the information. The hash table was loaded from the output of the 'nm' utility which read the stored ELF file. In other words, it was spitting back the addresses exactly as they had been written by the linker. But the client app was actually running, which meant it had been processed by the loader. The loader is a sub-system of the kernel and does any address fixups needed to make the freshly created process fit the memory model that the kernel uses. For the BeOS, the memory map looks like this:

0x00000000 - kernel 
0x80000000 - application code 
heap follows code 
0xea000000 - addons 
0xec000000 - libraries 
0xfc000000 - stack 
Notice that addresses 0x0 up to 0x7fffffff are in the kernel. Userland starts at address 0x80000000. But the linker doesn't know this. And doesn't need to know. It just assumes that all addresses are relative to address 0x0. It lets the loader do any address fixups needed to create a running process. The linker is part of gcc, written by members of the Free Software Foundation. It is not part of the kernel and knows nothing about the BeOS memory map. The loader, in turn, was written by Be programmers and is naturally tied to the specifics of the BeOS kernel. Of course, the two can work together because they both understand and use the ELF format.

Well, knowing what the problem was, the fix was easy enough: just add 0x80000000 to any function address before storing it in the server's hash table. That is, if it's a userland function. So I defined a macro called USER_MODE that causes the fixup to occur if the macro is true. So, does this mean kernel code can be traced/profiled with ezprof as well? Um... actually, I don't know. I think it should work, but honestly, I haven't even tried it yet (*shrug*).
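
In code, the fixup is about as small as fixes come -- something like this, sketched from the description above (uint32 comes from OS.h):

#define USER_MODE 1    // profiling a userland app (set to 0 for kernel code)

// adjust an address as read from the ELF file (via 'nm') so that it
// matches the address the running, loaded program will actually report
static uint32
fixup_address (uint32 addr)
    {
#if USER_MODE
    return addr + 0x80000000;    // userland code is loaded at 0x80000000
#else
    return addr;                 // kernel addresses need no adjustment
#endif
    }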


Closing the connection

When the client program finishes, the server can compute the final profiling statistics and display them. How does it know when the client is finished? Continually poll the client with ping messages? Look up the client program in the roster and try to detect when it is no longer there? Nah, I didn't want to fiddle with any of that mess. Instead, ezprof has two different ways of closing the connection.

The first method (the nice one) is when the client's main() function exits. To detect this, the server makes note of the address of the main function at startup. Every time a function-exit message is received, the address is checked to see if it's the main function that is ending. If so, the server loop is broken, and the stats can be generated.

If, however, the client program does not exit normally, it's up to the user, watching the command line, to kill the server manually. A handler catches the SIGINT signal (triggered by typing Ctrl-C or Alt-C) and causes the server to display the final stats and exit. Not all that graceful, but it works.
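
In code terms, this escape hatch is just an ordinary signal handler. A sketch, with dump_stats() standing in for the real report code:

#include <signal.h>
#include <stdlib.h>

extern void dump_stats (void);    // hypothetical: prints the final statistics

static void
on_sigint (int sig)
    {
    // the user hit Ctrl-C: report whatever has been gathered, then quit
    dump_stats ();
    exit (0);
    }

// installed once, at server startup
void
install_interrupt_handler (void)
    {
    signal (SIGINT, on_sigint);
    }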


Excluded functions

Ezprof does not actually store info on every function -- only the functions that are available in the source code. Library functions used by the client program are not profiled. This is for the simple reason that the profile hooks can only be generated for functions being compiled.

This may seem like a limitation that will weaken the value of the profiling numbers, but it isn't really. If you have the source for a needed library, you can compile it with tracing too -- i.e. make a bigger client program. If you don't have the source, then profiling info isn't very useful anyway -- you won't be able to do a damned thing about a bottleneck in a library function if you don't have access to the code (I guess you could complain to the vendor).

Also, the profile hooks themselves don't need to be profiled. Clearly, they are going to be called every time a local function is called, so counting their invocations isn't useful. Hopefully, the timing penalty is minimal, but it doesn't matter anyway. The purpose of profiling is to determine what parts of the source code could be re-written to improve their speed or efficiency. The hooks are only present when debugging, so their effect is ultimately not an issue.

Additionally, the standard startup and closedown routines found in all C programs are not profiled, again because the hooks cannot be generated for them. All C programs have a run time environment that is initialized before the main program starts, and then closed down after main exits to release any resources. That is, when the loader runs a C program, it invokes a run time sequence that looks like this:

init_C_runtime();
main();
end_C_runtime();
Actually, the C runtime routines have names like _init, _init_one, and _end. Additionally, C++ programs will run initialization code that runs constructors for global objects and then the destructors for these objects at closedown. The same idea, basically. Since they are run before and after the actual code in the client program, profiling them doesn't make sense anyway.

What all these functions share in common is that their names all begin with an underscore, which is the usual convention for naming internal system functions (user programs should never call them directly). Therefore, when ezprof is loading the symbols into the hash table, it simply skips all the functions whose names begin with an underscore. If you've named a local function in your own source code to start with an underscore, then it will be skipped. But shame on you for doing that! That's a big no-no!
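
The check itself is about as simple as it sounds -- roughly this, applied to each name as the 'nm' output is parsed:

// decide whether a symbol deserves a node in the hash table
static int
keep_symbol (const char *name)
    {
    if (name[0] == '_')    // runtime/internal functions: skip them
        return 0;
    return 1;
    }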

Finally, the main() function itself is not profiled. Since it represents the sum total of all activity in the client program, including it would only cause all the other numbers to be skewed downward since the entire program run (except for the init/close above) occurs inside main(). It is added to the hash table, however, because the call tree would look odd if the main (root) function wasn't there.


Conclusion

There is a GNU profiler that is included with many POSIX systems called 'gprof'. It is included with BeOS R5 as well (or at least with the developer tools mentioned above), but frankly, it doesn't work. However, Be did provide their own profiler called "profile". This used to be on their ftp server. You should be able to find it on one of the Be mirrors; here's one that was good at the time of this writing: ProfileTools_x86.zip. Run "profile imagefile" to examine a particular program. Calls to library functions are recorded, although, as I mentioned before, there is not much you can do about this information.

Given this, is there any reason to look at ezprof? Sure. As a learning tool, if nothing else. Obviously, ezprof is not a high powered, production quality profiler. Still, it does provide some insights into how profilers work and how we might build our own tools within the project to examine and critique the quality of the source code.


Source Code:
ezprof.zip

 
Thinking about distributions by Michael Phipps 
We are finishing up plans for our (VERY) long awaited source control change. We started this months ago. Many problems have come up, including staffing changes. But the issue has finally bubbled to the top as the most pressing issue. Partially due to the whole "kernel thing". Source control is one of those nasty things that no two people would ever do the same way, and no one likes anyone else's design. I would like to commend the team leaders on their very professional composure during this trying time.

What, you may ask, does this have to do with distribution? One of the things that we have discussed is a place to put third party pieces in our source control. I am of the opinion that we should *not* have third party pieces in our source control if at all possible. One of the reasons for this is that we (OBOS) don't intend to make distributions. Never have.

Think for a moment about the level of work required to put together a distro. It needs a theme (most apps, best apps, smallest, fastest, whatever). Next, you have to decide on apps to include. This leads us to ask what licenses are acceptable for our distribution. Furthermore, if we decided to include (or not include) certain applications that OBOS members wrote, there is something of a conflict of interest -- even with free software. On top of all that, we would need to put the distro together (build an installer, etc), test it on dozens of boxes, write user documentation (surely you don't intend to ship software without decent documentation) and make an image.

That is just putting it together. Now we have to decide what to do with it. Shall we sell it? Give it away? Both? How much? CD or DVD? Should we sell support and/or services? How many CDs to burn? Would we like to get on store shelves? Which distributor should we use? What other distribution methods do we use? Should we encourage or even allow other distributors? What should the boxes look like? Ugh. And all of this in addition to building the OS, testing, bug fixing and don't forget designing R2.

Distribution, though, is a critical piece. It doesn't matter if you have the world's sweetest system if no one can successfully install it. Linux found this out the hard way in the early 90's. And Be found out that the sweetest system out there doesn't succeed if it doesn't get into users' hands. I don't expect that R1 will be a *huge* smash in the marketplace. R1 is about getting something into users' hands that is useful and proves us to be a reasonable force as OS builders. R2 and R3 will be the releases, I think, that make a difference -- the releases where we will be able to really shine.

I think that focus is what OBOS really needs. We are only about 30 active coders and no business people. Couldn't we use the money that one could hypothetically make from such an endeavor? Sure. I would *love* to work on OBOS full time. So would most of us. Might that not justify such a level of effort? It might. If there wasn't a better option. So far, three different groups have been interested in making distros. BeUnited is the only group that has announced their support. Simon and Deej and I talk regularly; we have very similar ideas on how to go about this. All three, though, have promised to "give back" a portion of the proceeds, since they "know which side their bread is buttered on". This leaves OBOS free to write code and make a better product, with a revenue stream -- a way to buy hardware, software and maybe even create some job opportunities and/or code bounties.