Be Newsletters - Volume 4: 1999

Issue 4-14, April 7, 1999

Be Engineering Insights: And Now, Live From Be...

By Steve Sakoman

My last Newsletter article, "Be Engineering Insights: While You're Waiting For R4..." seems to have spawned one of the more peculiar diversions in the Be culture. I'm referring to the CodyCam application, and its namesake web page: http://www.sakoman.com/codycam/

For those who are new to BeOS, a brief explanation. CodyCam is a simple BeOS webcam application; the CodyCam web page demonstrates its capabilities by uploading pictures of the Be lunchroom at 5-minute intervals. The program is named for my 14-year-old son, Cody, who pestered me mercilessly to help him write shell scripts to implement the initial version of the webcam.

Our lunchroom webcam runs on a funky old Pentium 100 machine. The case cover disappeared long ago, and the monitor's focus is so bad that it's practically unusable. We installed a Hauppauge video capture card and video camera on this humble machine and it faithfully records our lunchroom comings and goings.

It seems that these simple images have attracted quite an audience. One viewer produced an MPEG movie of a single 24-hour period. When the 5-minute interval images are played back at 30 frames per second, the frenetic pace of activity actually resembles what it feels like to work at Be.

In addition to providing glimpses of our lunch gatherings, engineering staff meetings, and the occasional napping engineer, the webcam has also turned out to be a means of "communication" to the outside world. Regular viewers became well-acquainted with our infamous $10 couches, and were also treated to various topical "scenes in miniature" when engineers needed an outlet of expression other than writing code. They worked with the materials at hand: Legos, bits of paper, Christmas decorations, bunny people...

About a month ago we decided to upgrade the lunchroom carpeting and couches. (To see why, take a look at http://www.sakoman.com/images/couches.jpg.) A few of our engineers had some fun replacing the usual CodyCam images with a series of specially constructed images hinting at some sort of conspiracy. We enjoyed reading the conspiracy theories during the days that the work was being done -- and we were amazed at the number of offers we had from people who wanted to buy the old couches and carpeting!

This week there's a new version of CodyCam. You can get the application and its source code at http://www.sakoman.com/codycam.html

This version of the application uses the BeOS Media Kit. There's far too much code to discuss in detail here, but the basic structure is quite simple. After building a traditional UI window containing a menu bar and settings widgets for configuring the image capture and ftp parameters, the app builds a Media Kit network consisting of two nodes:

A video capture producer node (we use the system default video input).
An app-instantiated consumer node. This node displays the incoming video stream to a BView and at regular intervals saves a frame to disk using the Translation Kit. The saved image is then uploaded to a web server using ftp protocols.
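
The "save a frame at regular intervals" part of the consumer node can be sketched in portable C++. This is a minimal illustration, not CodyCam's actual code: the CaptureTimer class and its method names are hypothetical, and times are plain microsecond counts rather than Media Kit performance times.

```cpp
#include <cstdint>

// Hypothetical helper (not part of CodyCam or the Media Kit): decides
// which incoming video frames should be saved so that one frame is kept
// per capture interval. All times are in microseconds.
class CaptureTimer {
public:
    explicit CaptureTimer(int64_t intervalUsecs)
        : fInterval(intervalUsecs), fNextCapture(0) {}

    // Returns true if the frame arriving at time `now` should be saved,
    // and schedules the next capture one interval later.
    bool ShouldCapture(int64_t now)
    {
        if (now < fNextCapture)
            return false;
        fNextCapture = now + fInterval;
        return true;
    }

private:
    int64_t fInterval;     // time between saved frames
    int64_t fNextCapture;  // earliest time the next frame may be saved
};
```

With a 5-minute interval (300,000,000 microseconds), the first frame is saved, every frame for the next five minutes is only displayed, and the first frame at or after the five-minute mark is saved again.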

As is often the case, much of the code in CodyCam is "leveraged" from the work of others. The video consumer node uses a more recent version of the BTimedEventQueue class introduced in Stephen B.'s Newsletter article a couple of weeks ago, "Developers' Workshop: Simplifying Media Nodes."

The ftp functions are performed using Howard Berkey's wonderful network libraries. App preferences are saved and restored using Pavel Cisler's settings classes.

Be Engineering Insights: Tips and Tricks for Writing High Performance OpenGL Programs

By R. Jason Sams

Whenever someone asks me, "How do I make this program faster?" I respond by asking two more questions: "Are you using hardware acceleration?" and "What is your bottleneck?" If the answer to the first question is "no," the usual reply is that you need to spend $50 for a hardware accelerator. There's no way to achieve acceptable frame rates of 30 fps with software rendering if you're texturing or using a window larger than about 320x200 pixels. With accelerators this cheap, the effort it takes to get usable performance out of software rendering is no longer justified.

If you're using hardware rendering you have a foundation for fast rendering. Three primary factors affect rendering performance. The first and most common in today's applications is the fill requirement; the second is the geometry requirement; and the third is the number of state changes the rendering engine must undergo.

The number of pixels that the rendering engine must draw determines fill requirements. A single large quad that covers the entire screen at 640x480 pixels would require processing 307,200 fragments. Notice the use of the word "fragment" and not "pixel." A fragment is internal data that corresponds to a pixel on the screen. It can be modified several times or completely discarded before being written to the screen as a pixel.

Some hardware can process fragments faster if they are not written to the screen. This can happen if a depth test, a stencil test, or an alpha test fails, among other reasons. Smaller primitives require less processing. A 10x10 quad would require processing only 100 fragments. Modern hardware can typically process 90 to 180 million fragments per second. By contrast, software rendering is good for around 1 to 2 million fragments per second.
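
To put the numbers above side by side, here is a small sketch of the fill-rate arithmetic. The function names and the overdraw parameter (the average number of times each pixel is drawn per frame) are assumptions made for illustration; the rates plugged in are the rough figures quoted above.

```cpp
// Fragments that must be processed per second for a window of the given
// size. `overdraw` is the average number of times each pixel is drawn
// in one frame (1.0 means every pixel is drawn exactly once).
inline double FragmentsPerSecond(int width, int height,
                                 double overdraw, double framesPerSecond)
{
    return static_cast<double>(width) * height * overdraw * framesPerSecond;
}

// True if the required fragment rate is within a renderer's fill rate.
inline bool FitsFillBudget(double requiredRate, double fillRate)
{
    return requiredRate <= fillRate;
}
```

A 640x480 window drawn once per frame at 30 fps needs 9,216,000 fragments per second, well beyond the 2 million or so that software rendering manages but comfortably inside a 90 million fragment hardware budget. A 320x200 window at the same rate needs 1,920,000, which is about where software rendering tops out.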

The number of vertices and primitives that make up the scene being drawn determines the geometry requirements. Each primitive that is drawn requires several processing steps. A typical triangle would go through transformation, clipping, projection, assembly, culling, and on to the rasterizer. If advanced modes such as texture-coordinate generation or lighting are enabled, more steps are needed. The per-vertex calculations normally take the most processing time, with data transfer to the rasterizer coming in second.

Let's look at some of the requirements for different primitive types. The table below shows the processing required to draw 10 triangles:

Type	Vertices needed	Culling operations
Triangles	30	10
Quads	20	5
Triangle Fan	12	10
Triangle Strip	12	10
Quad Strip	12	5

It's clear from the above that you'd want to use Quad Strips wherever possible. In general, the culling operation is cheap enough that you don't see much difference between Quad Strips and Triangle Strips or Triangle Fans. You can expect to achieve around 1.5 million triangles drawn per second on a fast PII using Strips or Fans. Other primitives will be slower. If some of the primitives are being clipped or culled, that value should increase.
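
The figures in the table follow from simple counting rules, sketched below for any triangle count. The Primitive enum and the function names are made up for this example, and the rules assume at least one triangle, with an even count for the quad-based types (each quad is drawn as two triangles).

```cpp
// Illustrative names only; these are not OpenGL identifiers.
enum class Primitive { Triangles, Quads, TriangleFan, TriangleStrip, QuadStrip };

// Vertices submitted to draw `triangles` triangles with one primitive type.
int VerticesNeeded(Primitive type, int triangles)
{
    switch (type) {
        case Primitive::Triangles:     return 3 * triangles;        // 3 per triangle
        case Primitive::Quads:         return 4 * (triangles / 2);  // 4 per quad
        case Primitive::TriangleFan:
        case Primitive::TriangleStrip: return triangles + 2;        // 1 new vertex per triangle
        case Primitive::QuadStrip:     return 2 * (triangles / 2) + 2;  // 2 new vertices per quad
    }
    return 0;
}

// Culling tests performed: one per quad for quad-based types,
// otherwise one per triangle.
int CullingOperations(Primitive type, int triangles)
{
    switch (type) {
        case Primitive::Quads:
        case Primitive::QuadStrip: return triangles / 2;
        default:                   return triangles;
    }
}
```

For 10 triangles these functions reproduce each row of the table. For 1,000 triangles a strip needs 1,002 vertices where independent triangles need 3,000.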

Note: Do not expect to achieve anywhere near the manufacturer's claim for your video card. While the chip could theoretically process between 5 and 9 million triangles per second, the CPU can't process this many. Even if it could, the PCI or AGP bus couldn't get that many to the video card.

State changes are the hardest to quantify. The best advice is to avoid them. Say that you have a scene with 500 objects that use 50 different textures. In most cases it will be faster to sort the objects by texture and then draw them than to draw them in an arbitrary order and change textures as needed. This applies to most state change requirements. The cost of changing state, relative to the cost of drawing, will grow in the future. So even if you see little benefit now, you can expect to see gains in the future.
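
Here is a rough sketch of the sort-by-texture idea using the 500-object, 50-texture scene described above. The DrawItem struct and both function names are hypothetical, and texture IDs are assumed to be non-negative integers. A real renderer would bind a texture wherever this code counts a change.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical scene entry: which object to draw and which texture it uses.
struct DrawItem {
    int objectId;
    int textureId;  // assumed non-negative
};

// Counts how many texture changes are needed to draw the items in order.
int CountTextureChanges(const std::vector<DrawItem>& items)
{
    int changes = 0;
    int current = -1;  // no texture bound yet
    for (const DrawItem& item : items) {
        if (item.textureId != current) {
            ++changes;
            current = item.textureId;
        }
    }
    return changes;
}

// Groups items that share a texture, keeping their relative order.
void SortByTexture(std::vector<DrawItem>& items)
{
    std::stable_sort(items.begin(), items.end(),
        [](const DrawItem& a, const DrawItem& b) {
            return a.textureId < b.textureId;
        });
}
```

If the 500 objects happen to cycle through the 50 textures in turn, drawing them as-is takes 500 texture changes. After SortByTexture() the same scene takes 50.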

To find out which of the above is causing you problems, here are a few things to try:

Resize the window. If the frame rate changes, you're at least partially fill-rate bound. Look for places in your code where you can reduce overdraw. (Overdraw is the act of drawing the same pixel more than once.) Generally, do your best to eliminate occluded objects (objects that are in the field of view but not seen because they're behind another opaque object).
Try reducing the number of state changes. One way to do this is to temporarily comment out state changes that only affect the appearance of the scene. Start with the texture change commands. This will make your scene look very strange, but will let you know if that's the problem.
Try replacing some of the more geometrically complex objects with simpler ones. If reducing the triangle count while keeping the objects the same size improves performance, you are geometry bound. You may want to look at multiple levels of detail for your objects. This is the process of drawing faraway objects with fewer triangles than objects that are close.
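
The level-of-detail idea in the last item can be sketched as a simple distance lookup. This is one possible approach under stated assumptions, not a prescribed method: the LodLevel struct and SelectLod() are made-up names, and the distance thresholds are whatever suits your scene.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one version of an object: use it while the
// object is no farther away than maxDistance.
struct LodLevel {
    float maxDistance;
    int   triangleCount;
};

// Picks the index of the level to draw. `levels` must be ordered from
// nearest (most triangles) to farthest (fewest triangles). Objects beyond
// the last threshold use the coarsest level.
std::size_t SelectLod(const std::vector<LodLevel>& levels, float distance)
{
    for (std::size_t i = 0; i < levels.size(); ++i) {
        if (distance <= levels[i].maxDistance)
            return i;
    }
    return levels.empty() ? 0 : levels.size() - 1;
}
```

With levels of 2,000, 500, and 100 triangles at thresholds of 10, 50, and 200 units, an object 30 units away is drawn with the 500-triangle version.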

I hope this has been helpful. We look forward to seeing those killer 3D apps.

Developers' Workshop: The Quest for the Fountain of Audio

By Douglas Wright

The new Media Kit has been discussed several times in the last few issues of the Newsletter, in articles covering topics that range from timing to playing sounds. This information is intended to help developers understand the Media Kit and how to use it in applications. As with any new API, it takes a while to figure out how to use the Media Kit effectively, and how to document the tips and tricks you learn so that everyone can use them.

The Media Kit's breadth, depth, and flexibility make it powerful and subtle. Its primary component—the BMediaNode—is the embodiment of its power and flexibility. These same qualities, however, can make node writing a more than trivial activity. That's why we've been refining the nodes that ship with the system, such as the mixer and the sound card node. And now it's time to introduce some of those refinements to our developers.

I've prepared a totally bare Audio Producer node as sample code. It's designed to synthesize data; it doesn't implement the file interface, but you can add that as needed. I've also included a sample implementation that generates a square wave; you can start with the empty implementation. It uses the new BTimedEventQueue and EndPoint classes to keep track of events, buffers, and connections. You'll find the sample code archive here:

ftp://ftp.be.com/pub/samples/media_kit/audioproducernode.zip
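
For readers who want a feel for the synthesis step before opening the archive, here is a portable sketch of square wave generation. It is not the code from the sample: the function name, the mono float format, and the parameters are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Produces `frames` mono float samples of a square wave. `startFrame` is
// the number of frames generated so far, so that consecutive buffers
// continue the waveform without a glitch.
std::vector<float> SquareWave(std::size_t frames, double frequency,
                              double frameRate, float amplitude,
                              std::size_t startFrame = 0)
{
    std::vector<float> out(frames);
    const double period = frameRate / frequency;  // frames per cycle
    for (std::size_t i = 0; i < frames; ++i) {
        // Position within the current cycle, in frames.
        const double pos =
            std::fmod(static_cast<double>(startFrame + i), period);
        // High for the first half of each cycle, low for the second.
        out[i] = (pos < period / 2.0) ? amplitude : -amplitude;
    }
    return out;
}
```

A producer would call something like this once per buffer, passing the running frame total as startFrame.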

The code for a node is too large to present here in its entirety, but I'll give you the most important parts: the Run() loop and the HandleEvent() function. Together they are in complete control of how the node handles its responsibilities at the correct time.

They demonstrate the preferred use of the BTimedEventQueue in a producer, so even a slacker can be on time. The node generates a buffer of data in SetupBuffer() and sends it when a HANDLE_BUFFER event is popped off the queue. The rest of the code is fully commented and ready for human (or alien) consumption.

How a node should work using these new helper classes will be discussed in detail this weekend at the BeDC, so I'll see you there!

void
BEmptyAudio::Run()
{
  status_t err = B_OK;
  //worst-case delay before this thread actually gets scheduled
  bigtime_t schedulingLatency = estimate_max_scheduling_latency();
  bigtime_t latency = 0;
  bigtime_t wait_until = B_INFINITE_TIMEOUT;

  while(!mTimeToQuit)
  {
    //setup time: wake early enough to cover the downstream and
    //scheduling latency before the next event is due
    latency = mOutput.DownStreamLatency() + schedulingLatency;
    if (mEvents.HasEvents())
    {
      wait_until =
        TimeSource()->RealTimeFor(mEvents.NextEventTime(),
          latency);
    }
    else
    {
      wait_until = B_INFINITE_TIMEOUT;
    }

    //wait for a control message or for the next event to come due
    err = WaitForMessages(wait_until);

    //handle something
    if (err == B_TIMED_OUT)
    {
      HandleEvent();
    }
    else if (err != B_OK && err != B_INTERRUPTED)
    {
      TERROR("Unexpected error:\n %s exiting...",
        strerror(err));
      return;
    }
  }

}

status_t
BEmptyAudio::WaitForMessages(bigtime_t wait_until)
{
  status_t err = B_OK;
  int32 code = 0;
  char message[B_MEDIA_MESSAGE_SIZE];

  //only block on the port if the deadline is far enough away
  if (system_time() < wait_until - ENOUGH_TIME)
  {
    err = read_port_etc(mControlPort, &code, &message,
            B_MEDIA_MESSAGE_SIZE, B_ABSOLUTE_TIMEOUT,
            wait_until);
  }
  else
    err = B_TIMED_OUT;

  //a negative result is an error code (including B_TIMED_OUT)
  if (err < 0)
    return err;

  //otherwise err is the number of bytes read from the port
  HandleMessage(code, &message, err);
  return B_OK;
}

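void
BEmptyAudio::HandleEvent()
{
  if (!mEvents.HasEvents())
    return;

  bigtime_t time = 0;
  int32 what = BTimedEventQueue::B_NO_EVENT;
  void *pointer = NULL;
  uint32 flags = BTimedEventQueue::B_NO_CLEANUP;
  int64 data = 0;

  status_t err =
    mEvents.PopEvent(&time, &what, &pointer, &flags, &data);
  if (err < B_OK)
    return;

  switch (what)
  {
    //the cases for the remaining event types are in the sample
    //code archive
    case BTimedEventQueue::B_HANDLE_BUFFER:
      //mRunning is a stand-in name: the test that decides whether
      //the buffer is recycled or sent did not survive in this listing
      if (!mRunning)
        ((BBuffer *)pointer)->Recycle();
      else
      {
        //send the buffer, then work out when the next one is due
        SendBuffer((BBuffer *)pointer, mOutput.Destination());
        mNextTime = mStartTime +
          (bigtime_t)floor((mFrameTotal + mMixFrameCount) *
          1000000.0 /
          (double)mOutput.Format().u.raw_audio.frame_rate);
        SetupBuffer(&mOutput);
      }
      break;
    case BTimedEventQueue::B_DATA_STATUS:
      /* nothing */
      break;
    case BTimedEventQueue::B_NO_EVENT:
    default:
      return;
  }
}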
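
The timing arithmetic in the listing can be tried out away from the Media Kit. The sketch below mirrors the mNextTime calculation and the "wake up early by the latency" idea behind wait_until. The function names are made up for this example, and it assumes that performance time and real time run at the same rate with no offset, which a real time source does not guarantee.

```cpp
#include <cmath>
#include <cstdint>

// Time (in microseconds) at which the buffer that begins at frame
// `framesSent` is due, counted from the time the node was started.
int64_t BufferStartTime(int64_t startTime, int64_t framesSent,
                        double frameRate)
{
    return startTime + static_cast<int64_t>(
        std::floor(framesSent * 1000000.0 / frameRate));
}

// Time at which the node should wake up to handle an event that is due
// at `eventTime`, given the total latency it has to cover.
int64_t WakeTime(int64_t eventTime, int64_t latency)
{
    return eventTime - latency;
}
```

At 44,100 frames per second, the buffer that starts at frame 44,100 is due exactly one second (1,000,000 microseconds) after the start time. With 5,000 microseconds of combined downstream and scheduling latency, the node should be awake at 995,000.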

The 10,000 Mark

By Jean-Louis Gassée

The title refers, of course, to this week's announcement that we reached 10,000 registered BeOS developers worldwide, not to the "other" 10,000 mark you may have heard about concerning the Dow Jones average. As we prepare for the 1999 Be Developer Conference at the end of this week, we're happy to reach this milestone in achieving a "network effect" around the BeOS.

But as milestones go, how material is this one in the rational world? Many have asked the same question regarding the Dow. In the physical world, in the numbers sphere, there isn't a special role for a specific number; it doesn't matter if the DJIA is at 9,999 or 10,001. Well, yes and no. For an investment portfolio, the difference is hardly material. But one assumption is wrong. That is, this isn't a mechanical world, but a human world of emotions, expectations, and self-fulfilling prophecies.

A symbolic milestone is just that—a symbol of something else. In the case of the stock market, it adds to existing investor confidence—or dizziness. For us in the Be community, having 10,000 BeOS developers worldwide doesn't symbolize reaching dizzying heights, but the number does add to our confidence that we're joined in the adventure by creative programmers who share our vision of the future of an area of computing. We appreciate the vote of confidence—and we're committed to doing our best to be deserving of it.

When discussing numbers of BeOS developers, we're often asked which developers are really important—a legitimate but loaded question. It assumes that some developers are more important than others. This is true, but we only know which ones after the fact. Reading the history of a platform engenders the great "It Depends" insight. In some cases, already successful developers know how to move to a new platform unencumbered by their past achievements; in other cases new entrants misread the field and either don't see what the new platform does best or misunderstand what users really want.

Our view is not to second-guess developers, but to help each and every one in the best way our knowledge and means allow. This isn't to say that we shouldn't have opinions, just that we ought to remember what happens to them and, as much as humanly possible, maintain a balance between having faith in ourselves and faith in others.