Issue 22, 14 Jul 2002


  In This Issue:
 
Hello kernel? You have a syscall from userland! by Daniel Reinhold 
One of the features of modern operating systems is the ability to separate application code from the critical code that implements the core of the system. Regular applications run in user mode (often referred to as userland) which means that they cannot directly manipulate the vital system data structures. This makes everything much more stable -- buggy apps may crash and burn themselves, but they can't bring down the rest of the system.

The flipside to this protection is that userland code is walled off from the kernel code. This means, for example, that your application cannot directly call a kernel function. But the kernel implements many useful services that most apps would like to take advantage of. Indeed, that is one of the main purposes of the kernel -- to abstract all those icky underlying hardware details and provide a clean, consistent interface for applications. So how does all this useful interface ever get called and used?

Well, I'm glad that you asked (ok, you didn't... I'll just pretend that you did), since that's the topic of this article. There is a mechanism that the kernel provides so that user apps may tap into the system's coffers. This is the system call interface -- aka "syscalls" (surprise! I guess the title of the article gave it away).

Syscalls are the mechanism by which requests from userland code are converted into function calls within the kernel. This involves two privilege-mode switches: first from user to kernel mode in order to run the system service, then from kernel back to user mode to return to the caller. Additionally, any data passed must be copied in both directions (from user to kernel and back again). This means that syscalls are not exactly cheap -- they incur far more overhead than a simple procedure call. But the cost of the service is balanced by the safety of maintaining system integrity.


Caveats

This article will delve into the details of how syscalls are implemented in the OpenBeOS kernel, but there are a couple of caveats that I must lay out from the start:

  1. Only the Intel x86 architecture is covered here. While the overall design of the syscalls mechanism is the same on any platform, the specific details of how it is implemented are highly dependent on the machine architecture.
  2. As I write this, the kernel source code is a moving target. It has already deviated somewhat from the original version forked from NewOS. And it will continue to evolve in the months ahead. I don't think that the basic mechanism for handling syscalls will change (altho you never know), but some of the specific file names and/or code snippets referenced below may become obsolete over time.
  3. I am still learning and studying the syscalls mechanism myself. I believe that all the information presented here is correct, but, in the end, the ultimate reference is the kernel source code itself. Believe what it says first, and take what I've said here as supplemental.


The system interface

Consider your average C program with calls to fopen(), fread(), fclose(), etc. These functions are part of the standard C library and are platform independent -- i.e. they provide the same functionality regardless of what operating system is being run. But how are those calls actually implemented? As system calls in the native OS, of course:

  • sys_open()
  • sys_read()
  • sys_write()
  • sys_lseek()
  • sys_close()
  • . . .
These file operations are so common and so fundamental that it makes sense to offer them as system services. But file operations are not the only services that the kernel provides. There are also operations available for manipulating threads, semaphores, ports, and other low-level goodies. Here's a partial list of some other syscalls defined in OpenBeOS:
  • sys_system_time()
  • sys_snooze()
  • kern_create_sem()
  • kern_delete_sem()
  • kern_acquire_sem()
  • kern_release_sem()
  • kern_release_sem_etc()
  • kern_spawn_thread()
  • kern_kill_thread()
  • kern_suspend_thread()
  • kern_resume_thread()
  • sys_port_create()
  • sys_port_close()
  • sys_port_delete()
  • sys_port_find()
  • sys_port_get_info()
  • sys_exit()
  • . . .
These syscalls provide a good representation of what the kernel is capable of doing and how its operation can be controlled. It is the kernel equivalent of an API, only it's really an SPI (System Programming Interface). So far, as of this writing, there are 78 syscalls defined for the kernel. This number is very likely to increase over time. As a point of reference, this Linux syscalls index lists a total of 237 syscalls currently defined for that platform.

How many syscalls should an OS have then? Well, as many as it needs, I guess. It's a judgement call: more system services mean more power for userland (and possibly finer grained control), but too many complicate the interface to the kernel. The best motto would be "keep it as simple as possible, but no simpler".


Peeking with strace

In order to get a better appreciation of the role of syscalls within user applications, you can run a program called strace. This is one of the standard /bin apps included with the BeOS. This very useful command will run a user program while printing out all syscalls as they are invoked. As an example, consider the following command:

strace ls /boot/beos
This will run the command 'ls /boot/beos' while displaying the syscalls encountered during execution. Here is a sample of the output:
user_create_sem(0x0, 0xec09cd6e "addon lock") = 0x10537  (42 us)
_user_get_next_image_info(0x0, 0xfd001788, 0xfd00178c, 0x434) = 0x0  (145 us)
_user_get_next_image_info(0x0, 0xfd000330, 0xfd000334, 0x434) = 0x0  (146 us)
. . .
area_for(0xfd00028c) = 0x2b36  (51 us)
_user_get_area_info(0x2b36, 0xfd00028c, 0x48) = 0x0  (61 us)
user_find_thread(0x0) = 0x908  (16 us)
user_create_sem(0x0, 0xec09cd3d "gen_malloc") = 0x1053a  (58 us)
. . .
This is only a fraction of the output... run it yourself to see the full glory (heck, it's even color-coded!) Each line in the output is formatted as:

syscall_function(arg1, arg2, ... argN) = return_code

If the argument is a string literal, its string value is displayed immediately following the address that was passed. If the return code is a standard error code, its textual tag will be displayed immediately following its integral value.

For example, from the first line of output, we can surmise that something like the following was present in the 'ls' source code:

sem = user_create_sem(0, "addon lock");
// at run time:
//    sem was set to 0x10537
//    user_create_sem() took 42 microseconds to execute
Running strace is a wonderful way to get a handle on how syscalls are being used by applications. You might even find it useful to run your own programs with strace to see how your application interfaces with the kernel.


Connecting thru software interrupts

Alright, the kernel offers all these wonderful services as syscalls. But how do user apps actually invoke the syscalls? Thru software interrupts.

Most of you are probably familiar with the concept of hardware interrupts. For example, you press a key and the keyboard generates a hardware interrupt, which, in turn, notifies the keyboard driver to process the input. However, interrupts are just as commonly generated by software events.

The mechanism for generating software interrupts is the INT instruction. This is an Intel x86 opcode that interrupts the current program execution, saves the system registers, and then jumps to a specific interrupt handler. After the handler has finished, the system registers are restored and execution of the calling program resumes (well, usually).

The INT instruction thus acts as (sort of) an alternative calling technique. Unlike ordinary procedure calls, which pass their args on the stack, interrupts store any needed args in registers. For example, a normal function call such as:

foo(a, b, c);

would be translated by the compiler into something like:

push c
push b
push a
call foo
An interrupt however, must have any needed arguments loaded into general registers first. The register assignments for the syscall handlers are as follows:
  • eax -- syscall #
  • ecx -- number of args (0-16)
  • edx -- pointer to buffer containing args from first to last
After these registers have been set, interrupt 99 is called. What is the significance of the value 99? None really -- this is simply the interrupt number selected by the kernel for handling syscalls. More on this later.


Mapping the syscalls

Each syscall has an entry point defined by a small assembly language function. Therefore, the syscall interface is an assembly file (called syscalls.S) containing a long list of functions, one for each syscall that has been defined. The file looks something like this:

.globl sys_null
.type sys_null,@function
.align 8
sys_null:
	movl  $0, %eax        # syscall #0
	movl  $0, %ecx        # no args
	lea   4(%esp), %edx   # pointer to arg list
	int   $99             # invoke syscall handler
	ret                   # return
.globl sys_mount
.type sys_mount,@function
.align 8
sys_mount:
	movl  $1, %eax        # syscall #1
	movl  $4, %ecx        # mount takes 4 args
	lea   4(%esp), %edx   # pointer to arg list
	int   $99             # invoke syscall handler
	ret                   # return
. . .
Or rather, it would look like the listing above, except that the code is so boilerplate that, in fact, the syscall functions appear in the source code as a collection of #define macros.

The assignment of system services to syscall numbers is arbitrary. That is, it doesn't really matter which function is syscall #0, syscall #1, syscall #2, etc. so long as everyone is in agreement about the mapping. This mapping is defined in the syscalls.S assembly listing above, and must be matched item-for-item in the C interface header file. For our kernel, the C header is ksyscalls.h which uses an enum to define tags for each syscall:

enum {
    SYSCALL_NULL = 0,
    SYSCALL_MOUNT,
    SYSCALL_UNMOUNT,
    SYSCALL_SYNC,
    SYSCALL_OPEN,
    . . .
};


Interrupt Descriptor Table (IDT)

The code above sets us up for the interrupt call. But what happens when the int $99 instruction is invoked? Quite literally, the exception handler whose address is stored at IDT[99] is called.

The software interrupts rely on the presence of a system structure called the Interrupt Descriptor Table (IDT). This is a memory area allocated and initialized at boot time that holds a table of exception handlers. The table contains exactly 256 entries.

There is an internal x86 register, IDTR, that holds the address of this table. You cannot use this register directly -- it can only be accessed thru instructions such as the lidt (load IDT) instruction. During the stage2 bootstrap, the kernel calls lidt and sets it to the virtual address of the idt descriptor. This descriptor points to a memory area that is initialized with a vector (array) of exception handlers, one for each interrupt number (0 thru 255).

The kernel has some leeway in assigning these handlers. However, certain interrupt numbers have standard, predesignated purposes or are reserved. The table below lists the interrupt numbers and their associated actions that should be implemented by the handlers:

Intel x86 interrupt numbers:


Number   Description                                      Type
0        Divide-by-zero                                   fault
1        Debug exception                                  trap or fault
2        Non-Maskable Interrupt (NMI)                     trap
3        Breakpoint (INT 3)                               trap
4        Overflow (INTO with EFlags[OF] set)              trap
5        Bound exception (an out-of-bounds access)        trap
6        Invalid Opcode                                   trap
7        FPU not available                                trap
8*       Double Fault                                     abort
9        Coprocessor Segment Overrun                      abort
10*      Invalid TSS                                      fault
11*      Segment not present                              fault
12*      Stack exception                                  fault
13*      General Protection                               fault or trap
14*      Page fault                                       fault
15       Reserved                                         . . .
16       Floating-point error                             fault
17       Alignment Check                                  fault
18       Machine Check                                    abort
19-31    Reserved By Intel                                . . .
32-255   Available for software and hardware interrupts   . . .

*These exceptions have an associated error code.

Exception Types:

  • fault - the return address points to the instruction that caused the exception. The exception handler may fix the problem and then restart the program, making it look like nothing has happened.
  • trap - the return address points to the instruction after the one that has just completed.
  • abort - the return address is not always reliably supplied. A program which causes an abort is never meant to be continued.


The exception handlers

The 256 exception handlers that are loaded into the IDT are almost identical. After pushing the specific interrupt number, they all implement the same code sequence:

  1. save all registers (including system registers)
  2. call  i386_handle_trap
  3. restore all registers previously saved
  4. return
Because of this, the assembly file that defines these handlers, arch_interrupts.S, is also written largely as a collection of #define macros.

The function i386_handle_trap() serves as the master exception handler. As such, it handles all system interrupts, not just syscalls. However, we're interested specifically in the section that deals with interrupt 99, the syscalls handler.

Here's a snippet of the i386_handle_trap() source code:

void
i386_handle_trap(struct int_frame frame)
{
    int ret = INT_NO_RESCHEDULE;
    switch(frame.vector) {
    case 8:
        ret = i386_double_fault(frame.error_code);
        break;
    case 13:
        ret = i386_general_protection_fault(frame.error_code);
        break;
    . . .
    case 99: {
        uint64 retcode;
        unsigned int args[MAX_ARGS];
        int rc;
        thread_atkernel_entry();
        if(frame.ecx <= MAX_ARGS) {
            if((addr)frame.edx >= KERNEL_BASE &&
               (addr)frame.edx <= KERNEL_TOP) {
                retcode =  ERR_VM_BAD_USER_MEMORY;
            } else {
                rc = user_memcpy(args,
                                (void *)frame.edx,
                                frame.ecx * sizeof(unsigned int));
                if(rc < 0)
                    retcode = ERR_VM_BAD_USER_MEMORY;
                else
                    ret = syscall_dispatcher(frame.eax,
                                             (void *)args,
                                             &retcode);
            }
        }
        frame.eax = retcode & 0xffffffff;
        frame.edx = retcode >> 32;
        break;
        }
    . . .
    }
    if(frame.cs == USER_CODE_SEG || frame.vector == 99) {
        thread_atkernel_exit();
    }
}
The syscalls are handled in the case of the interrupt number 99. Again, there's no particular significance to the number 99. The Intel documentation allows for interrupt numbers 32-255 to be used freely by the OS for whatever purpose. Travis Geiselbrecht, the original author of this interrupt handling technique, probably decided that 99 was easy to remember.

The highlights of the code are:

  1. thread_atkernel_entry() is called upon entering kernel mode
  2. the number of args (in ecx) and the argv address (in edx) are checked for validity (e.g. a kernel address is bad since syscalls are only intended for user apps)
  3. user_memcpy is called to copy the args from the user stack to kernel memory
  4. if all went well, the syscall dispatcher is called, passing the syscall # (stored in eax)
  5. a 64-bit return code is returned in the [eax,edx] pair
  6. thread_atkernel_exit() is called as kernel mode is exited


The dispatcher

The routine syscall_dispatcher() is a core kernel function that finally binds the syscall numbers to their corresponding internal implementations. Here is a snippet of the syscalls.c file that contains the dispatcher:

int
syscall_dispatcher(unsigned long call_num, void *arg_buffer, uint64 *call_ret)
{
    /* arg0 .. arg3 are macros that index into arg_buffer */
    switch(call_num) {
        case SYSCALL_NULL:
            *call_ret = 0;
            break;
        case SYSCALL_MOUNT:
            *call_ret = user_mount((const char *)arg0, (const char *)arg1,
                                   (const char *)arg2, (void *)arg3);
            break;
        case SYSCALL_UNMOUNT:
            *call_ret = user_unmount((const char *)arg0);
            break;
        case SYSCALL_SYNC:
            *call_ret = user_sync();
            break;
		
        . . .
		
    }
    return INT_RESCHEDULE;
}


Naming conventions

The "user_" prefix on the dispatched functions is not a requirement, but a common convention in the kernel code. These functions do not generally contain the main implementation, but perform any fixups needed and call the true workhorse routine. There are often analogous "sys_" prefixed functions that do the same thing -- i.e. provide a wrapper for the real implementation.

For example, the user_mount() function is found in the vfs.c file since the mount service is part of the virtual filesystem layer. This function, in turn, calls vfs_mount() which actually performs the mount. Likewise, there is a corresponding sys_mount() function in vfs.c that also calls vfs_mount(). This sys_mount() is a kernel mode version of the (userland) sys_mount assembly function found in syscalls.S.

Altho this could be a point of confusion, the idea behind it is reasonable: whether the calling code is user or kernel mode, the same style of interface is used. The userland mount() function will invoke the syscall and eventually result in the user_mount() dispatch function being called. Kernel mode programs (drivers, addons, etc.) call the sys_mount() function directly and don't use the syscall mechanism. Either point of entry results in the underlying vfs_mount() function being called.


Example run-thru

Ok, we've covered a lot of ground in this article. The mechanism for generating and acting upon syscalls is anything but straightforward. But it can be followed and understood. Let's take a look at an example call.

You have a (userland) program with the following line:

int fd = open ("/some/file", O_RDONLY);

This will get translated into a syscall and the 'open' performed within a kernel mode function.
Here are the steps:

  • The definition of open() is found within the libc library. In the sources for libc, you will find the open.c file (in the unistd folder) that translates this into a call to sys_open(). Basically, the call has now become:

    fd = sys_open ("/some/file", O_RDONLY);

  • The sys_open() function is defined as the assembly routine within the syscalls.S file. Thus, your app needs to be linked against libc.so to resolve this symbol. This will be true for any syscall, regardless of whether the functionality is part of "standard C library" or not. It may seem strange to be linking to libc in order to resolve a call to kern_create_sem(), for example, but the syscalls interface has to be accessible from some library, and libc, for historical reasons, makes as much sense as any other.
  • The sys_open() assembly routine loads 4 into eax (the syscall # for sys_open), loads 2 into ecx (the number of args), loads the address of the first arg on the user stack into edx, then invokes the instruction int $99
  • The exception handler that receives the interrupt pushes the value 99 on the stack, pushes the contents of all the registers on the stack, and then calls i386_handle_trap().
  • Inside i386_handle_trap(), the args are copied to a local (kernel space) buffer and then passed to syscall_dispatcher().
  • The dispatcher has a large switch statement that farms the requests out to different kernel functions based on the syscall #. In this case, the syscall # is 4, which results in a call to user_open(). The original call has now become:

    *retcode = user_open ("/some/file", O_RDONLY);

    The retcode is a pointer to a 64-bit value back in the i386_handle_trap() function that is used to hold error values.

  • The user_open() function is compiled into the kernel (called kernel.x86 for the Intel build). The source is found within the vfs.c file since 'open' is a file operation handled by the virtual file system layer. The user_open() function creates a local copy of the file path arg and passes this on to vfs_open(), which is also defined in vfs.c.
  • The vfs_open() function finally performs the open command on the file... or does it? Actually, the VFS layer acts as an abstraction for handling file operations across all filesystems. So, in truth, vfs_open() simply calls the open function within the filesystem driver for the filesystem that "/some/file" is mounted on. But that process is a whole other topic...
  • Assuming that the file exists, resides on a valid, mounted volume, and there are no other problems, the file may then be actually opened. Now ain't that something!
Well, as you can see, the entire process of executing a syscall is not exactly simplicity itself. It is definitely a layered, delegated process. But the layers are there for a reason -- to provide memory protection and to abstract the system services. Hey, if kernel programming was easy, your grandma would be doing it!

Hopefully this article has cleared up the process somewhat. Go back and peruse the sources and see if it all makes more sense now.

 
Optimization can be your friend by Michael Phipps 
Optimization is one of those topics that everyone thinks that they understand a little about, but is often surrounded by platitudes, mystery and rumour. I want to talk a little about when to optimize, why one should optimize and some of the more successful techniques of doing so.


Keep it simple

The first rule of optimization is, "Don't". As Extreme Programming says, "Do the simplest thing that will work". If you are working in an OO language, this is often fairly easy because of information hiding. Design your classes well and hide what you can. Put something together with only a little thought about performance. Test it and see that it works. Then consider your expected use. Many times, an inefficient program *MAY* be acceptable.

One of my first C++ programs was a word scrambler that iterated over all of the different orders of letters in a word to find all of the combinations - to solve word jumbles. The first pass at it was a quick hack with some nested loops. I had plans to search for duplicate letters, to reduce the number of combinations, etc. I found, though, that with "normal" words (< 15 characters), it ran in less than a second or two. Piping it to sort -u removed the duplicates and still it ran in very reasonable time.

In a previous job, I spent months trying to optimize a piece of (badly written, hard to understand) code. Finally, I went to management and convinced them to spend less than one month of my salary on a faster machine. The process (which was a backend process only done by one machine ever) went from 15 minutes to 20 seconds.


Measure results

The second rule is to test. Everything and often. When you make a class and test it for correctness, throw in a test or two for performance. More tests can be used if the class is important. I often write my tests to say "fail if this function can't be run between X and Y times per second". Future implementations are then automatically tested to ensure that you don't lose performance.

Profiling is the other huge tool for performance. There are a few useful indicators to look at in performance - number of times a function is executed and percentage of time that it takes up. In one application that I was writing, I found that a string constructor was called millions of times. I found that there was an errant copy constructor being used. Once fixed, performance was much better. Percentage of time is very useful, as well.

Often times, the performance of an application, as a whole, is not as relevant as one particular part. In an application that I was working on recently, there is a two minute startup time. This is acceptable, as the application can run for hours. When I worked on optimizing that application, I focused on the "processing" part and ignored the startup time. The profiler was very helpful to me in this case (it was on Solaris), as I could choose particular areas to look at. The caveat here is that sometimes instrumentation can skew results.


Keep an open mind

The third rule is to "assume nothing". This can be more difficult than it sounds - we all have preconceived notions. Some common ones are: "virtual functions are slow", "unrolling loops makes your code faster", "some algorithms are better than others". These are often either not true or are only true in some circumstances. For example:

  • Virtual functions are, at worst, one memory access more than non-virtual functions, and are often "free".
  • Unrolling loops is rarely a good idea anymore. First, compilers have learned this trick and secondly, staying inside the memory cache is a far bigger issue than it used to be - 100,000 iterations over a loop may well be faster than 1000 iterations over 100 slightly different pieces of code.
  • Sometimes linked lists may be more efficient than hash functions and bubble sort may work better than quicksort, depending on the data involved. Of course, this is not usually the case, but sometimes. In that same Solaris app, recently, I switched from a hash function to a binary search on a sorted array and improved performance by a factor of 5.


Try again

The fourth rule is to rethink what you are doing. Many times, stepping back from the problem, you can see how to avoid calling a particular function at all, or to call it less often. Another example from real life - I found that a particular libc function in Solaris was somewhat slow. I was able to write a small class that wrapped that function, caching the previous value, and drastically improved the speed of the code.

One of the key tradeoffs in programming is time vs space. A classic example in gaming is sin/cos/tan functions. You can precompute sine values from 0-45 degrees and compile that "table" into your code. Then trig functions become a look up or two and maybe a division. Did you really need 10 decimals of precision, anyway? This caching can also be applied at a smaller level, called "collecting common subexpressions". Code like this:

for (int i = 0; i < strlen(str); i++)
    if (str[i] == 'X') {
        foundIt = i;
        break;
    }
would be much better off as:
int len = strlen(str);
for (int i = 0; i < len; i++)
    if (str[i] == 'X') {
        foundIt = i;
        break;
    }


Use the tools

The fifth rule is to use the optimizer. Often, especially in OO code, it will help you by eliminating unneeded function calls and performing inlining where it makes sense. Of course, every rule needs a caveat, and this rule is no exception - don't use more than -O2 with gcc 2.95.


Allocators

A final topic needs brief discussion - memory allocation/deallocation. I don't think that any other issue has been more written about or troublesome in computer science. There is no perfect memory allocator. The fast ones aren't space efficient. Some work well for small sizes. Others help more with debugging. Others yet are better for low memory situations. If you can *prove* that your memory allocation is hurting you, either in speed or space, maybe a custom allocator is for you.

One example of this - malloc on Solaris seems to allocate 8 bytes more than you asked for. As great as this is for debugging, I was working on an app that was allocating more than 3 million separate strings using malloc. Since the wasted space wasn't on a separate page, even the vm system was no help. I wrote a custom allocator that allocated space a meg at a time and doled it out to the strings as required. Since the usage pattern of this app was that the strings would not be destroyed until the app closed down, I didn't provide a way to free strings (just the destructor for the pool). This worked very well for that usage.


Words of wisdom...

A final few pieces of advice in the form of some quotes. Knuth said "Premature optimization is the root of all evil" and "less than 4 per cent of a program generally accounts for more than half of its running time". Finally, Amdahl's Law tells us that the overall speedup is limited by the fraction of the running time the improved code actually accounts for. In other words, a piece of the code that takes 10% of the time can only improve the overall performance by at most 10%, no matter how much you speed it up.

 
Rethinking marketing by Michael Phipps 
In the first (and latest, as of this writing) IRC Q&A session that I did, the question was asked : What plans have been made to succeed in the areas Be failed in, the marketing, the lack of drivers, and apps? Without these we could be in for a repeat....

and I answered:

Few, honestly. We are an OSS project. Marketing is not our job.

I expected a little bit of a response from some people in the Be community, but what I did not expect was the response on Slashdot, when the link to the Q&A session was posted. The general consensus seemed to be that without a whole team of MBAs in suits, we were all doomed.

So I thought that I would talk a little bit about marketing, OBOS's role in the community and where to go from here.

OBOS is a piece of the Be community. I think that I can say that without assuming too much. But we are not the *whole* of the Be community. We are replacing BeOS, not Be, Inc. And that is an important distinction. Be, Inc was a company. They consumed money, paid their employees and sold a product. We consume *NO* money, pay *no* employees and give away the store. Be, Inc was partially (mostly?) a company of engineers. We (OBOS) are *totally* engineers. Be's focus shifted because they were out of money. We were out of money from the beginning. Be went broke. We never will.

Is that to say that we hate money or capitalism? No. We have talked about accepting donations for a long time. It just hasn't been worth the work to figure out how to, legally, and what to do with them. Hire staff? Buy books? PCs? Buy specifications (POSIX, etc)? The people who have contacted me about building a distro have promised that they would support us, financially. The same goes for those companies interested in using OBOS in proprietary products. While I am very much looking forward to seeing them successful, I wonder and fear, a little, how having money involved would change OBOS.

But, you want to hear about marketing. Marketing could be defined as the process or technique of promoting, selling, and distributing a product or service (m-w.com). Since we are not making a distro meant for individuals, and we aren't selling anything, that leaves us pretty much with promotion. Which is really what we mean when we talk about marketing.

I would like to split this up a little - there is free marketing, low cost marketing and traditional marketing. Free marketing includes things like this newsletter. Ways that you can contact people without spending money. Radio interviews, posts on friendly websites (Slashdot, OSNews, etc). All of those things we are doing as is appropriate. Low cost marketing includes techniques that seem free but have invisible costs. Spam mail and exchanging ad banners are two good examples. We aren't going there.

Finally, there is "traditional" marketing. TV and Radio commercials, newspaper and magazine ads, sponsoring events, sending out AOL type CDs, etc. First of all, these things require money - piles of it. Not only do we not have it, but I don't see how it is likely that we would. Even if there were 3,000 people out there willing to pay us $100 (US) for a product that they could download for free, that would barely cover one ad in a major magazine, by the time you take into account producing it and placing it. For one month. So we have blown everything on that. No money for support, no money for anything else. Not even a toll free number to order. It seems unlikely that thousands more people would send in their $100 as a result of the ad.

Furthermore, we would have to be prepared for a tidal wave of interest. That would mean dozens of people available to answer questions, hold users hands, help newbie users install their first OS, etc. Oh - and that assumes that it will even work for everyone. A general purpose OS that doesn't support every video card and sound card out there is not necessarily something that you want to try to push on people in Time magazine.

I think that "pull" is better than "push". Letting the users hear about OBOS from other sources (think evangelists) is more likely to be successful. Doing an article for Dr Dobbs journal, for example, on the kernel of OBOS is far more exposure than an ad. It runs more pages, it has more detail and grabs people's attention, and you get paid for it instead of spending big bucks. Be's demo tours were very interesting - I would bet that there is or will be an OBOS user within easy access range of every college out there. How hard would it be to stop in one day, talk to the computer science department and convince them to help you set up a presentation? I am pretty sure that we (OBOS) could write up a presentation for you to give, once R1 is out.

Finally, I think that our distributors can and will do a good job of marketing. Dane Scott is a prime example. I wonder how many copies of BeOS he packaged with Tune Tracker. That is a *wonderful* niche product. I think of it as a mini-tractor. Maybe a Lawn Tractor app. A couple dozen apps like that could go a long way toward making OBOS more used. And there are dozens of apps like this waiting to be written. How about a doctor's office suite (scheduling, medical notes, etc)? Or tools for a lawyer's office? How about an app that connects to a phone switch in an office and collects call information? An inventory management app? I remember that someone wrote a cash register app some time ago. That is really another form of marketing. And those are markets that an ad in Time would never crack open.