SMP

Blog post by axeld on Thu, 2005-10-20 10:05

I’m done implementing sub-transactions for now - I haven’t yet tested detaching sub-transactions, but everything seems to work fine. Time will tell :-)
A complete Tracker build dropped from 13.5 minutes to 5.4 minutes - that’s great, but BeOS R5 does the same job on this machine in around 2.5 minutes, so while this is an improvement, we still have a long road ahead of us. I can only guess where we lose those 3 minutes for now, but I am sure we’ll find out well before R1. One likely culprit is the caching system, as it still only looks up single blocks/pages, instead of doing bigger reads and read-ahead.
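
To illustrate what I mean by read-ahead: instead of fetching just the one block that missed the cache, the cache could fetch a whole aligned run of blocks in a single device read. A minimal sketch of that calculation (the names are made up for illustration, not actual Haiku code):

// Hypothetical sketch - none of these names exist in the Haiku tree.
#include <cstdint>
#include <cstdio>

static const uint64_t kReadAheadBlocks = 32;

// Compute the aligned run of blocks to fetch in one device read when
// "missingBlock" is not in the cache, instead of reading it alone.
static void
ComputeReadAhead(uint64_t missingBlock, uint64_t volumeBlocks,
	uint64_t &firstBlock, uint64_t &count)
{
	firstBlock = missingBlock - missingBlock % kReadAheadBlocks;
	uint64_t end = firstBlock + kReadAheadBlocks;
	if (end > volumeBlocks)
		end = volumeBlocks;
	count = end - firstBlock;
}

int
main()
{
	uint64_t first, count;
	ComputeReadAhead(70, 4096, first, count);
	printf("fetch blocks %llu-%llu in one request\n",
		(unsigned long long)first, (unsigned long long)(first + count - 1));
	return 0;
}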

Anyway, since Adi is still working on the app_server, my next assignment is getting Haiku to work again on SMP machines. While it may seem like a luxury right now, having an SMP machine to test a multi-threaded system on is almost mandatory. Let’s see how many related bugs have sneaked into the system - I only know of one particular piece of code that won’t work well on those machines (and I am to blame for that one, out of pure laziness).

The machine I am testing on is a dual PIII (with Intel BX chipset) that was generously donated (or lent :-)) by Ingo, one of the best developers we have on the team.

Sub-Transactions

Blog post by axeld on Wed, 2005-10-19 15:38

A small update on the BFS incompatibility: I’ve now ported the original logging structure to the R5 version of BFS as well, so that tools like bfs_shell can now successfully mount “dirty” volumes, too. I also found another bug in Be’s implementation, and had to cut down the log entry array by one to make it work with larger transactions.

Now I am working on implementing sub-transactions. If you have tried out Haiku and compiled some stuff or just redirected some shell output to a file, you are undoubtedly aware that this takes ages on the current system.
The reason is that BFS starts a new transaction for every write that enlarges a file - and that’s a very common case indeed. Since writing back a transaction also includes flushing the drive caches, this isn’t a cheap operation - it slows down BFS a lot.

The original approach taken by Be Inc. was to combine several smaller transactions into one bigger transaction - problem solved. The downside of this approach is that you lose the ability to undo a transaction: if you need to undo some actions, you have to manually revert the changes in the big transaction that would have belonged to the small one.
That works, but it also complicates the code a lot, and is an open invitation for all kinds of bugs (which is one more reason why file systems take ages to become mature).

In Haiku, we introduce the concept of a sub-transaction: you can start a transaction in the context of the current transaction, and then abort only the sub-transaction instead of the whole thing. As soon as the sub-transaction is acknowledged, its changes are merged with the parent transaction - from that point on, you cannot revert its changes anymore; you can only revert the whole transaction.
The only downside of this approach is that it uses more memory, as it has to store the changes of the sub-transaction alongside those of the parent. The largest transaction possible on a standard BFS volume currently consists of 4096 blocks - so even the worst case should be acceptable.
If a sub-transaction grows too much, it can be detached from its parent - since the parent transaction itself is already done, it can safely be written back to disk.
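
To make the semantics a bit more concrete, here is a small sketch of the merge-on-acknowledge behaviour - purely illustrative, with made-up names rather than the real block cache API:

#include <map>
#include <vector>
#include <stdint.h>

typedef int64_t block_num;
typedef std::vector<uint8_t> BlockData;

struct Transaction {
	std::map<block_num, BlockData> changes;		// belongs to the parent
	std::map<block_num, BlockData> subChanges;	// pending sub-transaction
};

// Abort only the sub-transaction - the parent's changes stay intact.
void
AbortSubTransaction(Transaction &transaction)
{
	transaction.subChanges.clear();
}

// Acknowledge the sub-transaction: merge its blocks into the parent.
// From this point on, only the whole transaction can still be reverted.
void
AcknowledgeSubTransaction(Transaction &transaction)
{
	std::map<block_num, BlockData>::iterator iterator
		= transaction.subChanges.begin();
	for (; iterator != transaction.subChanges.end(); ++iterator)
		transaction.changes[iterator->first] = iterator->second;

	transaction.subChanges.clear();
}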

I hope to finish implementing sub-transactions and have BFS use them by some time tomorrow. Depending on the number of bugs I add to the code, it might also go faster, though :-)

Another BFS surprise

Blog post by axeld on Tue, 2005-10-18 23:25

It turns out the BFS logging code is not that intelligent: it uses block_runs in the log area, but it doesn’t actually make use of them. In other words, it only accepts block_runs with a length of 1 - which effectively kills the whole idea of using them. The log is now just as space-consuming as the single block number arrays I had before, but without the binary search capability we had earlier.

While our code could now use block_runs as they are meant to be used, I have disabled joining separate block_runs to keep our BFS fully compatible with Be’s in this regard. If we someday leave compatibility with the current BFS behind, we can of course enable it again.
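
For illustration, joining two adjacent runs would look roughly like this (the block_run structure is described in “Analyze This” below; the helper itself is made up):

#include <stdint.h>

// The on-disk block_run structure (see "Analyze This" below).
struct block_run {
	uint32_t allocation_group;
	uint16_t start;
	uint16_t length;
};

// Hypothetical helper: merge "next" into "run" if it continues the
// same sequence of blocks. This is the optimization we had to disable
// for compatibility - Be's log code only accepts runs of length 1.
static bool
TryJoin(block_run &run, const block_run &next)
{
	if (next.allocation_group != run.allocation_group)
		return false;
	if (run.start + run.length != next.start)
		return false;
	// Watch out for the 16-bit length field overflowing.
	if (uint32_t(run.length) + next.length > 65535)
		return false;

	run.length += next.length;
	return true;
}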

While this is probably just a fault in the implementation of the original BFS, it’s not the first time we have to live with a sub-optimal solution in order to retain compatibility. The good thing is, since we should now be 100% compatible with BFS, it should also be the last of these surprises.

Analyze This

Blog post by axeld on Tue, 2005-10-18 10:00

This morning, I went through analyzing the structure of the BFS log area. It turns out it’s very different from what I did for our BFS.
Our current log structure looks like this:

block 1 - n:
	uint64   number of blocks
	off_t[]  array of block numbers
block n+1 - m:
	real block data

While the one from BFS looks like this:

block 1:
	uint32       number of runs
	uint32       max. number of runs
	block_run[]  array of block runs
block 2 - m:
	real block data

BFS has only one header block, so it can only store a certain number of blocks per log entry. On the other hand, it uses block runs instead of single block numbers, which potentially compacts the block array, but also makes lookups a lot more expensive.
Like a block number, a block run is 8 bytes wide; it is a compound data type that looks like this:

	uint32  allocation_group;
	uint16  start;
	uint16  length;

BFS divides the whole volume into allocation groups, each of which can span up to 65536 blocks. This way, a single block_run can also represent a number of consecutive blocks. This structure is used a lot throughout BFS, so it’s not surprising to find it in the log area as well.
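
To show what this means for lookups, here is a hedged sketch: converting a run to plain block numbers is cheap, but checking whether a given block belongs to a log entry means scanning its runs linearly. The helper names are made up, and the allocation group size would come from the super block:

#include <stdint.h>

struct block_run {
	uint32_t allocation_group;
	uint16_t start;
	uint16_t length;
};

// A run expands to "length" consecutive blocks, starting here.
static uint64_t
RunToBlock(const block_run &run, uint32_t blocksPerAllocationGroup)
{
	return (uint64_t)run.allocation_group * blocksPerAllocationGroup
		+ run.start;
}

// Unlike a sorted array of single block numbers, an array of runs
// offers no binary search - each run has to be checked in turn.
static bool
RunContainsBlock(const block_run &run, uint32_t blocksPerAllocationGroup,
	uint64_t block)
{
	uint64_t first = RunToBlock(run, blocksPerAllocationGroup);
	return block >= first && block < first + run.length;
}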

So I will now convert our BFS to use that same format, so that you can safely mount uncleanly unmounted volumes with both operating systems, BeOS and Haiku, in either direction.

BFS incompatibilities

Blog post by axeld on Mon, 2005-10-17 14:05

First of all, I successfully booted Haiku from CD-ROM on several machines today. It took a bit longer than I thought, as no emulator I have access to seems to support multi-session CDs, and not every BIOS I have works by the book. The boot device selection is still very simplistic, so it might not boot completely from CD if you just inserted the disc and didn’t choose “Boot from CD-ROM” in the boot loader - you’ll have to bear with that for now. I’ll probably fix it tomorrow.
Anyway, you can build your own bootable CD image with the “makehaikufloppy” script that’s now in our top-level directory (it’s still rough, and you have to build the whole system manually or via “makehdimage” beforehand). You just need “mkisofs”, and can use the resulting boot image like this:

$ mkisofs -b boot.image -c boot.catalog -R -o <output-ISO-image> <path-to-directory-with-boot-image-and-other-stuff-that-should-go-into-the-boot-session>


As those of you who attended the Haiku presentation at BeGeistert are aware, we initially had some problems getting Haiku to run.
The reason behind these problems was an incompatibility between our version of BFS and Be’s: the log area is currently written differently in the two implementations. As soon as you mount a dirty volume with the wrong operating system, chances are that blocks are written to your hard drive in random order, thus corrupting the file system - and as Haiku currently crashes rather often, you get into this situation faster than you’d like.
Since we expect early adopters to dual boot into BeOS (and for good reason), we should definitely get rid of this annoying risk of losing data. It will also be much easier to track down the real remaining problems in BFS once all these simple traps are removed.
Therefore, this is the next thing for me to work on. I will start by understanding Be’s logging format, and then I’ll make ours compatible. Thanks to the wonders of “fsh” (no reboot necessary to uncleanly unmount a disk), this shouldn’t take too long - I expect to get it done sometime tomorrow.

CD boot update

Blog post by axeld on Fri, 2005-10-14 15:10

Everything is in place now, and the boot loader even passes all the information the kernel needs to boot from CD. It’s not working yet, though, as the VFS only evaluates the partition offset of the boot volume, and nothing more.

There’s probably only a tiny bit left to do, so I’ll try to finish it tomorrow - in my spare time, as I usually don’t work during the weekend :-)

We have also agreed not to make demo CDs (images only, of course) available before the whole system runs a bit more stably. Compared to a hard disk image, a CD image is likely to be tested by a lot more people - and therefore, the first impression shouldn’t be too bad.

CD boot

Blog post by axeld on Thu, 2005-10-13 10:53

Since Ingo and I started working on CD booting at BeGeistert, we have (or rather, he has) written a TAR file system for the boot loader.
When your IBM-compatible computer boots from CD-ROM, the BIOS emulates a boot floppy instead of giving you direct access to the disc. In order to access the whole disc, we need a CD-ROM driver - and therefore, we also need the kernel to execute that driver.

Be’s solution, and ours, is to write the kernel and all modules needed for booting from CD-ROM (or any other device unsupported by the BIOS) behind the boot loader on the boot floppy (i.e. the boot session of the CD). As the on-disk structure, we use standard gzipped TAR files that contain all the needed files.
The boot loader will start the kernel from the TAR file, and the running kernel will then detect the CD-ROM and try booting from there - at least that’s the theory. Right now, we have the TAR file system working in the userland boot loader test environment.
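
Part of TAR’s appeal here is how little code it takes to walk an archive: it’s just a sequence of 512-byte headers, each followed by the file data padded to the next 512-byte boundary. A rough sketch of the uncompressed case (made-up names; the real loader also has to inflate the gzip layer first):

#include <cstddef>
#include <cstdint>
#include <cstdio>

static const size_t kTarBlockSize = 512;

// The size field lives at offset 124 of the 512-byte header and is
// stored as an octal ASCII string.
static uint64_t
ParseOctal(const char *field, size_t length)
{
	uint64_t value = 0;
	for (size_t i = 0; i < length && field[i] >= '0' && field[i] <= '7'; i++)
		value = value * 8 + (field[i] - '0');
	return value;
}

// Walk an uncompressed TAR archive in memory: each 512-byte header is
// followed by the file data, padded up to the next 512-byte boundary;
// a zeroed header marks the end of the archive.
static void
WalkTar(const uint8_t *archive, size_t archiveSize)
{
	size_t offset = 0;
	while (offset + kTarBlockSize <= archiveSize) {
		const uint8_t *header = archive + offset;
		if (header[0] == '\0')
			break;

		const char *name = (const char *)header;	// offset 0, 100 bytes
		uint64_t size = ParseOctal((const char *)header + 124, 12);
		printf("found \"%s\", %llu bytes\n", name,
			(unsigned long long)size);

		// Skip the header plus the data, rounded up to a full block.
		offset += kTarBlockSize
			+ (size + kTarBlockSize - 1) / kTarBlockSize * kTarBlockSize;
	}
}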

Getting Haiku to boot from CD up to the usual Terminal window is my first assignment as a Haiku Inc. employee. If no unforeseen problems arise, I hope to get it done today or tomorrow.

Haiku's First Employee

Blog post by axeld on Thu, 2005-10-13 08:34

This blog is supposed to accompany my Haiku development efforts while I am employed by the non-profit organisation behind Haiku, Haiku Inc.

Thanks to the donations you made to Haiku Inc., I will work full time until the end of November - that means 8 hours a day, 5 days a week. I’m not getting rich by doing this, but it should be enough to pay my bills. I wouldn’t even get more money if you donated more - it would just make such an event more likely to happen again. Thanks for making this happen, anyway.

I intend to update this blog regularly during the next few weeks to give you an overview of what I am currently working on, and how I am progressing.