Be Newsletters - Volume 2: 1997

Issue 3-39, October 1, 1997

Be Engineering Insights: Video Basics

By Steve Sakoman

As you step off the elevator into Be's reception area, one of the first things you see is a large poster proclaiming "RESISTANCE IS NOT FUTILE". Being quite stubborn by nature, I've enthusiastically taken this philosophy to heart and have managed to resist being drafted into dreaded "newsletter article" duty.

Until now.

Valerie has triumphed in the battle between immovable object and irresistible force. And so, to atone for past sins, this is the first in a series of articles dealing with capturing, displaying, storing, and manipulating still and moving video images with the BeOS.

I always like to start a new project with a blissfully simple first cut implementation. There's something comforting in taking some time to become familiar with the fundamental issues before layering on complexity. When designing the first Be motherboard back in 1990, I began with a prototype that consisted of nothing more than a single processor, memory, and a serial port. The next version added slots and bus arbitration, the one after that additional processors, then various flavors of I/O. Ah, the good old days!

I'll use a similar approach in these articles. We'll start out nice and simple, using the infamous BeOS HelloWorld sample program as the basis for our first effort: Capturing a single frame of video from channel 5.

This program requires special hardware. Any of the following PCI video capture cards will do (retail prices are in the $89 to $199 range):

Hauppauge Models 400, 401, 405, 406 Wincast/TV cards
IXMICRO TurboTV (formerly IMS TurboTV)
miroMEDIA PCTV
Intel Smart Video Recorder III (no tuner included)

We'll be making use of the Bt848Source class to hide some of the complexities of dealing directly with the video capture driver for the above cards.

The first decision we need to make is the size image to capture. In making this decision it helps to know a bit about the character of the video signal that we'll be sampling. For a variety of historical reasons, video standards throughout the world use a technique called interlacing to transmit the image. With this technique, the odd scan lines are displayed during the first refresh interval, the even lines during the next interval. These sets of even and odd scan lines are called fields.

Interestingly, line numbers start at one, not zero—this standard was established long before computers influenced us to start counting at zero!

While interlacing is a reasonable approach for viewing moving images on TV sets, it presents some challenges for high quality image captures. These problems are most evident during image sequences that contain rapidly moving objects.

Because the odd and even fields are sampled at different points in time in the video camera, the edge of a moving object is not at the same position in the odd and even fields. If we simply combine the odd and even fields by interleaving them into a single frame, we will discover that moving objects have a severe case of the "jaggies."

There are many sophisticated techniques for dealing with this problem. We're going to take the easy way out, though. Recognizing that the issue only arises if you combine fields, we'll neatly side step it by capturing just a single field.

In the US we use a video standard known as NTSC-M. This standard has 525.5 lines of resolution, the active video portion uses about 480 of these lines (the rest contain synchronization information and optionally some interesting digitally encoded data—more on that in a future article).

We'll capture just the 240 odd lines. Each line of video contains about 640 pixels. Since we're throwing out half of the vertical information by capturing the odd field, we need to scale the horizontal samples by 1/2 to maintain the original aspect ratio. This gives us a desired image size of 320 x 240 square pixels—a format called Common Interchange Format or CIF.

With this decision out of the way, it's time to start editing our HelloWorld sample code! Only one change is required in HelloWorld.cpp to set the window size to 320 x 240. HelloWindow.cpp is unchanged.

main()
{
  HelloApplication *myApplication;
  myApplication = new HelloApplication();
  myApplication->Run();
  delete(myApplication);
  return(0);
}

HelloApplication::HelloApplication()
    : BApplication("application/x-vnd.Be-Video-sample1")
{
 HelloWindow *aWindow;
 HelloView *aView;
 BRect aRect;

 //Window size is 320 x 240
 aRect.Set(50, 50, 50 + 320 - 1, 50 + 240 - 1);
 aWindow = new HelloWindow(aRect);
 aRect.OffsetTo(B_ORIGIN);
 aView = new HelloView(aRect, "HelloView");
 aWindow->AddChild(aView);
 aWindow->Show();
}

HelloWindow::HelloWindow(BRect frame)
        : BWindow(frame, "Video", B_TITLED_WINDOW,
                  B_NOT_RESIZABLE)
{
}

bool HelloWindow::QuitRequested()
{
  be_app->PostMessage(B_QUIT_REQUESTED);
  return TRUE;
}

The major changes to the program are in our implementation of HelloView.cpp. Let's start by taking a look at the view constructor:

HelloView::HelloView(BRect rect, char *name)
    : BView(rect, name, B_FOLLOW_ALL, B_WILL_DRAW)
{
 bt848 = new Bt848Source("/dev/video/bt848/0");

 bt848->SetVideoFormat(BT848_NTSC_M);
 bt848->SelectVideoSource(BT848_TUNER);
 bt848->SetTunerLocale(BT848_US_NTSC_CABLE_HRC);
 bt848->SelectChannel(5);
 bt848->SelectAudioSource(BT848_MUTE_AUDIO);
 bt848->SetDecimation(0);
 bt848->SetBrightness(0);
 bt848->SetContrast(0);
 bt848->SetHue(0);
 bt848->SetSaturation(0);
 bt848->SetColorFormat(BT848_RGB32);
 bt848->SetCaptureMode(BT848_SINGLE);

 capture.buffer_size = (((320 * 240 * 4) +
                        B_PAGE_SIZE)/B_PAGE_SIZE) *
                        B_PAGE_SIZE;
 if (bt848->AllocateCaptureBuffer(&capture) < B_NO_ERROR)
 {
   printf("error allocating frame buffer\n");
   exit(0);
 }
 capture_buffer = (unsigned char *)capture.buffer_address;
 bm = new BBitmap(BRect(0,0,320-1,240-1),B_RGB_32_BIT,
                  false,false);
}

The first line in the constructor instantiates a Bt848Source object for the first video capture card in the machine (/dev/video/bt848/0).

We then set up the capture card to expect US style video input (BT848_NTSC_M). Developers outside the US will want to change this line to reflect their area of the world. Other options include: BT848_PAL_BDGHI, BT848_NTSC_JAPAN, BT848_PAL_M, BT848_PAL_N, and BT848_SECAM.

Since we'd like to capture an image from a TV broadcast, our next step is to configure the tuner appropriately. First we select the output of the tuner as our capture source (BT848_TUNER). If you want to capture from a video camera, you could change this to BT848_COMPOSITE or BT848_SVIDEO.

Here in Menlo Park we get our cable TV service from Cable Coop, who uses a somewhat unusual set of cable channel frequency allocations (BT848_US_NTSC_CABLE_HRC) which are offset from the more common cable frequencies (BT848_US_NTSC_CABLE_IRC) by 1.25Mhz. Those who are using an antenna should choose BT848_US_NTSC_AIR. Other supported choices include BT848_PAL_BG_CABLE and BT848_PAL_BG_AIR.

We then set the channel we'd like to capture from, make sure that the tuner audio is disabled during our capture, and set the brightness, contrast, hue and saturation controls to their center positions (zero -- the range on all of the controls is from -100 to +100).

For this capture we're going to use the highest quality color format for our pixels: BT848_RGB32. This format stores each pixel in a 32 bit word with 8 bits allocated for R, G, and B (commonly referred to as ARGB, A is always set to 0). Other options include BT848_RGB16, BT848_RGB15, BT848_RGB8, and BT848_Y8—see bt848_driver.h for a description of how these formats are stored in memory. We then set up the driver to do a single frame capture (BT848_SINGLE) rather than repetitive captures (BT848_CONTINUOUS).

We finish up the view constructor by allocating storage for a capture frame buffer (320 x 240 x 4 bytes per pixel, rounded up to the next page boundry) and creating a bitmap to use for displaying the captured image.

We also add a destructor to deallocate the resources reserved in the constructor:

HelloView::~HelloView()
{
 bt848->FreeCaptureBuffer(&capture);
 delete bt848;
 delete bm;
}

And now the five line meat of the program:

void HelloView::AttachedToWindow()
{
 bt848->ConfigureCapture(320,240,capture_buffer,(void *)0);
 bt848->StartCapture();
 bt848->SyncWithTimeout(250000);
 memcpy_nc2c(bm->Bits(), capture_buffer, 320 * 240 * 4);
}

void HelloView::Draw(BRect)
{
 DrawBitmap(bm);
}

We start by specifying that we desire a 320 x 240 capture into our capture buffer, start the capture, and then patiently block till the capture is complete (well, we're patient for a quarter second, anyway). We copy the capture buffer to the bitmap and the view's draw function draws it. There you have it—one field of channel five in a window!

If you're still awake at this point you might be wondering:

Can't we just capture directly into the bitmap?
Why on earth are we using the Bt848Source object to allocate the capture buffer?

The answer to the first question is a qualified yes. If we were to create the bitmap with the needsContiguousMemory flag set to true, we could indeed capture directly into the bitmap. This works well on BeBoxes, Intel motherboards, and some Macs. Unfortunately Macs based on the Tanzania architecture have a cache coherency problem that is tickled by PCI bus masters DMAing into main memory. We work around this problem by using the driver to allocate the buffer in non-cachable, locked, contiguous memory. This also explains why we use the special version of memcpy (memcpy_nc2c).

Full source to this and all of the sample programs in this series will be made available on our ftp site at:

ftp://ftp.be.com/pub/samples/preview/video/video_sample_1.zip

Next time we'll look at getting our captured images moving and resizable. Till then, some exercises for the reader:

Add user interface to:

do repeated captures via a button
set and store capture preferences
save the captured images using Datatypes
??? (surprise us!)

Send me your efforts and I'll share the best of them: sakoman@be.com.

Be Engineering Insights: Chelsea 'n' Me

By George Hoffman

Last weekend I went through transfer orientation at Stanford University. I believe there were some other students who went through orientation too, although it was difficult to find them among the roaming clumps of frightfully earnest media and literally hundreds of mafioso-like Secret Service agents.

It was kind of neat to have that extra symbol next to particular events in the orientation handbook that meant you shouldn't bring anything that could conceal a firearm. I amused myself by playing Spot the Undercover Bodyguards and offering to various People's Interest Correspondents to answer any questions they might have about the BeOS, but they seemed preoccupied with the First Babe. Oh, well, their loss.

Don't worry, I won't pull a William Adams and try to relate all this in some way to software development on the BeOS.

But speaking of the BeOS...

They threw a party for new students, and of course I, being a certified grade A par-tay animal, attended. The freak was flowing, the funk was being gotten out quite effectively, and Chelsea 'n' me were gettin' our boogie down to one of the infinite variations on The Techno Song, when it suddenly occurred to me:

"Golly! I bet people are curious about HFS, and writing external file systems for the BeOS."

Yes, as a member of the Be graphics team, one of my duties has been to write and debug the external HFS file system.

The code uses the core of Robert Leslie's hfsutils package to do most of the work involved in actually mucking with data on the disk. As this package is GNU copylefted, the code for the file system will be available soon after Preview Release 2 is in the mail for the general public to critique. hfsutils was very useful in this endeavor and Robert was helpful in confirming and fixing the few bugs that I found. Thanks Robert! Your Preview Release 2 CD will go out with everyone else's.

Although much of the work was handled by hfsutils, there was also a good amount of voodoo involved in getting HFS running reliably using our new File System Independent Layer (which, by the way, rocks steady). Some of this will be difficult (or downright impossible) to understand unless you've at least glanced over the file system protocol. Just a warning.

Parlez-vous filez-systemp d'independent-o?

(I started French I on Wednesday. How am I doin' so far? I'm so excited...I'll finally be hip to the Be secret lingo.)

The basis of the FSIL abstraction is the "vnode", a virtual inode, if you will. Vnodes are identified by a vnode_id, which is a 64-bit identifier, scoped by volume. The way the FSIL interacts with the file system is to request a vnode with a vnode_id and perform operations on this vnode (open, close, read, write, stat, etc.) A vnode can also be "walked," which refers to finding a vnode_id given a directory vnode and a filename.

This abstraction of having each vnode identified by a 64-bit vnode_id is nice, and vnode_ids can often be mapped directly to block numbers or even disk addresses, making things very fast and easy to manage (although requiring a bit more error checking).

Under HFS, because the equivalents of inodes are actually stored as leaf nodes within one large B*Tree, blocks or disk addresses were out. Fortunately, each HFS node has a unique identifier, called a CNID (catalog node ID). Unfortunately, the catalog B*Tree is not actually keyed by these IDs, but rather by a combination of the parent directory's CNID and the name of the file.

In order to efficiently respond to requests from the FSIL in the form they are given, CNIDs are used as vnode_ids and a hash table is used to map a CNID to a parent CNID and filename. In this way, given only the CNID, a file can first be looked up in the hash table and then the parent CNID and filename can be used to find the actual node in the catalog B*Tree. This ends up being fairly efficient because the first time a file is referenced is usually during a read of its parent directory; at this time, the file is cached in the hash table.

Slowing down the system as much as possible

Much of the weirdness centered around getting HFS, which is designed to be absolutely single-threaded, to work with a FSIL like ours, which is designed to take full advantage of a multi-threaded file system like BFS.

Understand that when I refer to a file system as "multi-threaded" or "single-threaded" I'm simply referring to the layout on disk and the kind of access to which that layout lends itself.

HFS stores all meta-data about files (i.e. all data other than the actual file contents themselves) in two honking-big B*Trees, the catalog and extents-overflow files. It would be difficult, and not highly productive, to try to allow much concurrent read-write access to such a beast. BFS, on the other hand, has an independent B+Tree for each directory, a structure which encourages concurrent access.

In any case, because the FSIL is designed to allow concurrent access of this kind, there are some issues that you need to worry about when copping out (as I did) and serializing access to the file system.

Because there are parallel structures in the FSIL and the file system itself for each vnode, and because the FSIL makes sure that only one copy of a particular vnode is ever kept in memory, any time the file system needs to access a particular vnode, it must call into the kernel with get_vnode(), which in turn may call the file system's own read_vnode function, if the vnode is not already in memory.

Even though read_vnode is passed a flag which tells it whether it is being reentered in this way (to allow a more efficient locking strategy), a serialized file system _must_ unlock the volume before calling get_vnode(). Not doing so can cause a deadlock. The reason for this is somewhat esoteric; basically, another thread can attempt to read this vnode non-reenterantly (is that a word?) after the volume has been locked and so block on the volume lock. In the meantime, out in the FSIL, the vnode has been allocated but marked busy. Now you call get_vnode (with the volume locked, because you didn't listen to me) and the FSIL waits forever for the vnode in question to become unbusy.

Now, it's easy enough for me to say to unlock the volume before calling get_vnode, but there are some cases when you absolutely need to know if a file exists and be sure that this knowledge is consistent with the state of the disk; furthermore, if the file does exist, you need its vnode.

Take the example of renaming a file. If the location to which you are attempting to rename the target file is occupied, you need to delete the old file first. Simply deleting it on disk won't work, because the FSIL might already have it in memory. But if we unlock the volume, call get_vnode() and then re-lock, we run the risk of the file having been created between the attempted read of the vnode and re-locking the volume.

The answer is to enter a tight loop in which you unlock the volume, call get_vnode(), and re-lock. If you retrieved a vnode, then great! The FSIL will make sure it stays around until you call put_vnode() to release it; you are guaranteed a consistent state. However, if no vnode is returned, the file may have been created since you checked but before you locked again. Therefore, you must do another lookup, while locked, and ensure that there is in fact no such file on disk. If there is, your state is inconsistent, so repeat the loop. In the vast majority of cases, this loop will never be executed more than once. Purists may cringe at the lack of guaranteed forward progress. It's not pretty, but it's the price of a clean, multithreading-friendly abstraction.

Pretending you're sophisticated

One of the fun parts about writing a BeOS external file system is fooling the system into thinking that you're cool. The file system protocol recognizes attributes (which are cool). Most file systems do not support attributes, but many have other extensive meta-data associated with files which can be mapped into attributes for many cool and even downright useful effects. HFS, for example, has a quaint node layout which hard-codes information about file placement, window size and position, and file type and handler information right alongside file length and block run data.

With the BeOS FSP, even cute little file systems like HFS can act bad-ass. The resource fork, of course, maps directly to an attribute. The file system also does some limited translations between well-known MIME types and their Mac OS 4-character file type equivalents, presenting the translations in the standard MIME type attribute, "BEOS:TYPE", and represents unknown types as a MIME type which can easily be translated back (a file of Mac OS type code "RULZ", for instance, will have a BeOS MIME type of "application/x-MacOS-RULZ"). The file creator and other interesting flags and meta-data are also represented as attributes.

In the BeOS, every file and directory can have a custom icon, presented in the dual attributes "BEOS:M:STD_ICON" and "BEOS:L:STD_ICON", which contain, respectively, the custom mini and large icons. Although the format of the desktop database of an HFS volume is private and subject to change, and thus no attempt is made to read from it, applications and other files with custom icons within their resource fork will have the icons extracted and these attributes "spoofed" for them as BeOS icons. Icon placement within directory windows and size and placement of those windows are also translated into Tracker-friendly attributes, the format of which will be "documented" in the HFS code.

Another advantage of this approach is that copying an HFS file from an HFS volume to another HFS volume, or even from HFS to BFS and back to a different HFS volume, should preserve the resource fork, type, creator, flags, and other meta-data associated with the file (given some implicit limitations associated with the desktop database not being modified).

Pretending I'm sophisticated

So anyway, have a blast writing those file systems! I've gotta go register for classes. Aside from French I, I'm thinking maybe I'll take a programming course and finally learn C. It'll be a big project to finally convert the app_server over from Logo, but the rewards should be great. Party on!

News From The Front

By William Adams

[Webmaster's Note: William published a follow-up to this article in Be Newsletter 2-#40. In the follow-up he points out the flaws of the implementation in the article below. This follow-up should be read by anyone considering implementing what he's suggested below.]

So, since my bottom line has been spreading, I've recently taken up running again. Now one thing about running is that there is a lot of time to do mental exercises. What to do while you're pounding the pavement at 6am? Think about programming the BeOS of course!

So you're sitting around doing multi-threaded programming and you are anxious to squeeze every little last ounce of performance out of your app. Of course your algorithms are perfect, so you look toward speeding up various system services. Your first victim, semaphores. Don't they go into kernel space and eat up a context switch and waste all sorts of evil time? No, not really, but if you are so inclined, you might seek a more light-weight locking mechanism.

Frequent readers of the Be Newsletter no doubt sat with rapt attention last year as the Benaphore was laid out in all its detail. The Benaphore is a faster semaphore in that it does a quick check to see if the heavy weight semaphore is actually needed, if it isn't, then skips the calls into the kernel, if it is, then you're no worse off than before.

A benaphore is pretty good, and it ultimately rests on a semaphore. There is another locking mechanism commonly used in kernel device driver code. That is, the spinlock. What's a spinlock? Well, it's basically a locking mechanism utilized in situations where you really aren't expecting to spend much time in a critical section of code. Typically, you are just changing the state of one or two variables and that's it. The lock blocks the locking process by wasting CPU cycles.

I created a little spinlock class in my bebits library because I like having this mechanism in higher level code than just the device driver. First I created a generic abstract base class that has the locking protocol than any general locking class can implement.

class PBinaryLock
{
  public:
    virtual int Lock() = 0;
    virtual int Unlock() = 0;
};

Then I created a concrete class

class ASpinLock : public PBinaryLock
{
  public:
    ASpinLock();
    int Lock();
    int Unlock();

  private:
    volatile long fLockStatus;
};

And here's the actual code.

long test_and_set(volatile long *addr)
{
  long old_value;
  old_value = write_32_swap (addr, 1);
  if (old_value == 0)
    return 0;
  return 1;
}

The test_and_set() function is the crux of all locking mechanisms of this variety. It ideally implements an atomic operation which is necessary in any multithreaded, multiprocessor system. At the very least, you need to be able to check the value of a variable and set it in one fell swoop. If you can't guarantee this, then you would check the variable, and some other process may come in and check and change it before you're done. That write_32_swap() function is the most important function in the BeOS kernel.

ASpinLock::ASpinLock()
  : fLockStatus(0)
{}

Here in the Lock call, we test the fLocalStatus variable. As long as it's 1, indicating the lock is already held, we will continue looping doing nothing. As soon as whoever holds the lock calls Unlock(), the variable will be set to 0, and we will grab it.

int ASpinLock::Lock()
{
  while (test_and_set(&fLockStatus) == 1)
    ;
  return 0;
}

int ASpinLock::Unlock()
{
  // Release the lock by setting its value to
  // 0 atomically. Assuming this does a flush
  // of any caches so that other processors get
  // the word.
  // fLockStatus = 0;

  write_32_swap (&fLockStatus, 0);
  return 0;
}

There you have it. A very simple spinlock class. The usage is simple. If you wanted some more automatic usage, you could implement a destructor that Unlocks automatically:

somefunc()
{
  ASpinLock lock;

  // critical section.
  lock.Lock()

  // do critical stuff

  lock.Unlock();
}

I personally use these simple locks to lock access to things like queues, lists, and other container structures. They're small, lightweight, and get the job done with a minimal amount of fuss, or system resource utilization. And in a machine with multiple CPUs, you don't mind wasting a few cycles here and there do you?

So, is it your algorithm, or is it the system services that need to be optimized? Well, here's one tool which will help you figure that out. Now you'll have something to think about when you're out on your run around the block.

Software On/Off the Web

By Jean-Louis Gassée

From time to time, presenting to prospective investors, new hires, developers, reporters or even market research analysts, I get the frown, the Web marketing and distribution frown. Using poetic license, I sometimes describe the Web as a "superconductor" for the marketing message, the purchasing transaction, and the product itself. Others refer to a "friction-free" business model. The frown says: "Do you really believe BeOS developers can build a viable business using the Web as their sole conduit to customers?"

Of course I do. But rather than offer yet another profession of my faith, I'd like to turn to what others have been doing for a while—meaning they found a way to put food on the boardroom table on an iterative basis.

Before the Internet begot the Web, hardy BBS operators and Compuserve customers could download shareware, freeware or honorware. Try it, send me money if you really want to keep using it. From there, Compuserve evolved a more organized registration and payment structure, and companies such as Kagi Software collected after-the-fact shareware payments for loved and respected Macintosh utilities such as Peter Lewis' Anarchie.

Initially, with the Internet's reputation for being a hacker's nest, the fear of seeing one's credit card number fall into unscrupulous hands got in the way of commerce. There was more fear of the unknown than rational basis to such fear. After all, credit card issuers shoulder most if not all of the loss when a credit card number is stolen. Isn't a restaurant or the dumpster at the mall a better place to get paper receipts bearing the precious numbers? This isn't to say fraud, or fear of fraud, don't exist on the Web, but initial reticence has abated.

My own experience in buying software off the Web has been mostly positive. I'm still puzzled when people insist on sending me a floppy for a 31K utility (an alarm clock for a Palm Pilot) instead of letting me download it after I submit my Visa number, but this was done very quickly and courteously.

In retrospect, Eudora's transition from its heroic shareware origins has been very smooth. You can still download a very nice free version of Eudora and, now, you can buy and get Web delivery of the feature-expanded Eudora Pro version.

Qualcomm, being a bigger and better financed company, manages to get shelf space at Fry's, distributor sales, as well as sales from their own Web site. They charge full retail, letting retailers and distributors compete on price, leaving the final choice to the customer. This methodology also avoids the sometimes irritating feeling one gets when a product is advertised and one can't buy it. Regis McKenna has a wonderfully descriptive but unprintable name for the feeling, it has to do with spheres, electrons and one of the colors of the Be logo.

The most frequent mode of operation is ordering the product, downloading it on the spot and getting a password by e-mail minutes to hours after the purchase, once the credit-card number has been verified. The password is used to unlock the compressed archive, or during the installation process.

Speaking of frequency, a site such as http://www.windows95.com/ features mountains of Windows 95 products delivered electronically. And for the Palm Pilot, there is a ring of Web sites featuring software and accessories (start at http://www.usr.com/palm/) from which I purchased and downloaded a nice e-mail client and an Excel-compatible spreadsheet. I'd rather play spreadsheet than MineHunt. There is even a "lending library" from which you can download literary classics for reading during long flights, or meetings...

Seriously, selling software on the Net is no longer reserved for gearheads and obscure utilities. BeOS application developers don't have to be pioneers in all facets of their business, others have already blazed the Web commerce trail.