GeneralMaximus's blog

ZFS Port: Three-Quarter Term Report

Blog post by GeneralMaximus on Sat, 2011-08-06 04:46

Briefly, my goals for the three-quarter term were: port libzfs, port the command-line tools zfs and zpool, and write a kernel module to communicate with userland tools via ioctl() calls on /dev/zfs. Another goal was to make sure our port of ZFS passes all tests in ztest.

With the exception of a few missing routines, libzfs builds fine on Haiku. So does zpool. zfs requires some love, but nothing major remains to be done. In fact, with the exception of a few routines that I need to implement in libsolcompat (our Solaris compatibility library), the port builds almost perfectly on Haiku. But getting it to build is only half the battle ;)

The issue that's holding me back at the moment is that our port fails ztest, and the multithreaded nature of ZFS makes bugs extremely hard to track down and fix. For example, about a week ago ztest would fail when trying to write to disk. That turned out to be a fairly trivial issue -- wrong flags passed to an open() call because my definition of a constant was wrong -- but it took me four days to track down. Now I'm facing an issue where all the threads in ztest deadlock after a while and the program sits there forever, doing nothing. Since the ZFS code spawns so many threads, it's very hard to figure out where the problem originates.
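To illustrate the kind of open() bug described above: Solaris and the host OS need not agree on the numeric values of the O_* flags, so a value copied from one header can mean something else on the other side. The post does not say which constant was wrong, so the following is only a minimal sketch of one way to guard against this class of bug, by translating flags bit by bit. The SOL_O_* names and sol_to_host_open_flags() are hypothetical, and the Solaris-side values are assumptions that should be checked against the real header.

```c
#include <assert.h>
#include <fcntl.h>

/* Assumed Solaris-side flag values (hypothetical SOL_ prefix);
 * verify against the real Solaris <sys/fcntl.h> before relying on them. */
#define SOL_O_RDONLY  0x000
#define SOL_O_WRONLY  0x001
#define SOL_O_RDWR    0x002
#define SOL_O_ACCMODE 0x003  /* mask for the three basic access modes */
#define SOL_O_APPEND  0x008
#define SOL_O_CREAT   0x100
#define SOL_O_TRUNC   0x200
#define SOL_O_EXCL    0x400
#define SOL_O_KNOWN   (SOL_O_ACCMODE | SOL_O_APPEND | SOL_O_CREAT | \
                       SOL_O_TRUNC | SOL_O_EXCL)

/* Translate Solaris-style open() flags into the host's own O_* values. */
int sol_to_host_open_flags(int sol_flags)
{
    int host;

    /* Fail loudly in debug builds if a flag we do not translate shows up. */
    assert((sol_flags & ~SOL_O_KNOWN) == 0);

    switch (sol_flags & SOL_O_ACCMODE) {
    case SOL_O_WRONLY: host = O_WRONLY; break;
    case SOL_O_RDWR:   host = O_RDWR;   break;
    default:           host = O_RDONLY; break;
    }
    if (sol_flags & SOL_O_APPEND) host |= O_APPEND;
    if (sol_flags & SOL_O_CREAT)  host |= O_CREAT;
    if (sol_flags & SOL_O_TRUNC)  host |= O_TRUNC;
    if (sol_flags & SOL_O_EXCL)   host |= O_EXCL;
    return host;
}
```

A compatibility wrapper around open() could call this on every flags argument coming from ported code, so that a mismatch trips the assertion immediately instead of surfacing later as a failed write.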

I wanted to get ztest under control this week, but I failed. Since then, I've been studying how Solaris expects threads and synchronization primitives to behave, using the excellent Solaris Internals book. I will hopefully be able to fix ztest before the end of next week and wrap up the missing routines in libsolcompat. Passing ztest would mean that the code we have ported so far works correctly.

My goal for the final stretch of the GSoC was to implement the ZFS POSIX Layer. That would allow us to actually perform read/write operations on ZFS partitions. Sadly, I might not have time to do this before the coding period ends, but I'll give it my best shot.

I hope to get back here with good news soon.

ZFS Port: Midterm Report

Blog post by GeneralMaximus on Tue, 2011-07-19 09:44

My midterm goal was porting libzpool -- which contains most of the ZFS code -- to Haiku. Another midterm goal was to get ztest -- the ZFS testing tool -- to run on Haiku. Being able to run ztest in a loop for an entire day means that about 80% of the ported code is working fine (though the remaining 20% is the most difficult part of the entire porting process). ztest is a userland test, so actual file system modules or disks are not involved in the testing procedure -- ztest creates block files in a temporary directory and treats them as disks.
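The idea of treating an ordinary file as a disk can be sketched in a few lines of C. This is not ztest's code; the filedisk_* names and the 512-byte block size are assumptions made for illustration. The file is sized with ftruncate() and then read and written in fixed-size blocks at computed offsets.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define FILEDISK_BLOCK_SIZE 512  /* assumed block size for this sketch */

/* Create (or truncate) a backing file of the given size; returns an fd or -1. */
int filedisk_create(const char *path, off_t size_bytes)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, size_bytes) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Write one whole block at block number blkno; returns 0 on success, -1 on failure. */
int filedisk_write_block(int fd, uint64_t blkno, const void *buf)
{
    assert(buf != NULL);
    ssize_t n = pwrite(fd, buf, FILEDISK_BLOCK_SIZE,
                       (off_t)(blkno * FILEDISK_BLOCK_SIZE));
    return n == FILEDISK_BLOCK_SIZE ? 0 : -1;
}

/* Read one whole block at block number blkno; returns 0 on success, -1 on failure. */
int filedisk_read_block(int fd, uint64_t blkno, void *buf)
{
    assert(buf != NULL);
    ssize_t n = pread(fd, buf, FILEDISK_BLOCK_SIZE,
                      (off_t)(blkno * FILEDISK_BLOCK_SIZE));
    return n == FILEDISK_BLOCK_SIZE ? 0 : -1;
}
```

Because everything goes through a regular file descriptor, a test harness built this way needs no kernel module and no real hardware, which is what makes a userland test like ztest practical early in a port.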

It took me days of fighting with the linker and the compiler, but I'm happy to report that both ztest and libzpool build on Haiku without errors! Does that mean I can run ztest for a day without problems? Sadly, that is not the case. ztest is unable to create ZFS storage pools and fails within a second of starting up. I am currently trying to investigate and fix this crash. Fixing this one crash will reveal more crashes, and running ztest with several threads will reveal subtle threading issues. This means I have my work cut out for me ;)

Meanwhile, I have also started porting libzfs, which is the library used by the zfs and zpool administration tools to communicate with ZFS code in the kernel. This communication occurs as ioctl() calls on /dev/zfs. My goals for the three-quarter term are getting libzfs, along with zpool and zfs, to build on Haiku. Of course, an additional goal is to get ztest to run without crashing.
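For readers unfamiliar with the pattern, here is a rough sketch of what "ioctl() calls on /dev/zfs" looks like from the userland side. The request number, the example_cmd struct and example_zfs_request() are all hypothetical; the real ZFS command structure and ioctl numbers are not reproduced here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical request number, for illustration only. */
#define EXAMPLE_IOC_POOL_STATS 0x5a01UL

/* Hypothetical argument block shared between the tool and the driver. */
struct example_cmd {
    char name[256];  /* pool or dataset the request refers to */
    int  result;     /* would be filled in by the kernel side */
};

/* Open the device node, send one request, close it again.
 * Returns 0 on success and -1 on any failure. */
int example_zfs_request(const char *device, struct example_cmd *cmd)
{
    assert(cmd != NULL);

    int fd = open(device, O_RDWR);
    if (fd < 0)
        return -1;  /* device node missing, or no permission */

    int rc = ioctl(fd, EXAMPLE_IOC_POOL_STATS, cmd);
    close(fd);
    return rc < 0 ? -1 : 0;
}
```

The library's job is essentially to fill in such a command block, issue the call, and turn the result into something the command-line tools can print. Until a kernel module provides the device node, every call of this kind simply fails at open().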

You can follow the project at http://github.com/GeneralMaximus/zfs-haiku. Building ztest is as easy as cloning the repository, changing into the zfs-haiku directory, and typing "jam ztest". The ztest executable is generated in the debug.X86 directory.

ZFS Port: Quarter Term Report

Blog post by GeneralMaximus on Fri, 2011-06-17 03:32

My quarter term goals for the ZFS port included porting all the libzpool dependencies to Haiku. Of the four major dependencies (libavl, libnvpair, libuutil and libumem), two already build on Haiku: libavl and libnvpair. libumem and libuutil will take another few days, which puts me at least a week behind my original schedule.

I'm currently working on porting libuutil, which is presenting a few roadblocks but nothing that can't be fixed in one day's work.

When I have some free time, I want to take a break from working on the port and do some cleanup. So far I've indiscriminately copied all the Solaris headers I need into my own repository. This is bad. I eventually want to use as many Haiku headers as possible, only importing those definitions from Solaris that are missing from the Haiku headers.

After all the libraries are working, the next step is porting libzpool itself.

My source repository is located at http://github.com/GeneralMaximus/zfs-haiku.

ZFS Port: Community Bonding Report

Blog post by GeneralMaximus on Mon, 2011-05-30 19:39

I was busy with finals throughout the Community Bonding period, which left me with little time to work on GSoC-related tasks. I still have 3 exams left with the last one being on June 7. That's when the fun starts. For now I'm merely playing with ZFS on FreeBSD on a virtual machine. I still need to make my way through at least the ZFS On-Disk Specification. Even though the information contained in this document is not strictly required for porting ZFS to Haiku, it's a useful read nonetheless. It also makes me look like a rockstar when I open it in coffee shops.

According to my proposal, this is what comes next:

  • Finish porting dependencies (libavl, libnvpair, libuutil, libumem).
  • Begin writing an OpenSolaris compatibility layer. This involves studying how threads, mutexes, condition variables etc. work on Solaris and then writing wrappers that behave in a similar manner around the Haiku API.

This means approximately one week spent on porting and testing dependencies and a fair bit of time spent on learning about low-level Solaris interfaces. One week is an optimistic estimate for a port of any kind, but some of the dependencies are merely libraries that implement useful data structures. This means they might build on Haiku without major changes. I can't say anything further until I've taken a better look at them myself. The end result of the quarter term should be an OptionalPackage that contains all the external ZFS dependencies.
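To give an idea of what a wrapper in such a compatibility layer might look like, here is a minimal sketch of a Solaris-style mutex built on POSIX threads, which Haiku provides. Everything named compat_* is hypothetical and only loosely modelled on the Solaris mutex_enter()/mutex_exit() interface. The wrapper records the owning thread so that a thread re-entering a mutex it already holds trips an assertion instead of silently hanging.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical Solaris-like mutex type layered over a pthread mutex. */
typedef struct compat_kmutex {
    pthread_mutex_t m_lock;
    pthread_t       m_owner;  /* only meaningful while m_held is true */
    bool            m_held;
} compat_kmutex_t;

void compat_mutex_init(compat_kmutex_t *mp)
{
    int rc = pthread_mutex_init(&mp->m_lock, NULL);
    assert(rc == 0);
    (void)rc;
    mp->m_held = false;
}

/* True if the calling thread currently holds the mutex.
 * The unlocked read is only reliable for the "is it me?" question. */
bool compat_mutex_held(compat_kmutex_t *mp)
{
    return mp->m_held && pthread_equal(mp->m_owner, pthread_self());
}

void compat_mutex_enter(compat_kmutex_t *mp)
{
    /* These mutexes are not recursive: re-entry is a bug, so fail loudly. */
    assert(!compat_mutex_held(mp));
    int rc = pthread_mutex_lock(&mp->m_lock);
    assert(rc == 0);
    (void)rc;
    mp->m_owner = pthread_self();
    mp->m_held = true;
}

void compat_mutex_exit(compat_kmutex_t *mp)
{
    /* Only the owner may release the mutex. */
    assert(compat_mutex_held(mp));
    mp->m_held = false;
    int rc = pthread_mutex_unlock(&mp->m_lock);
    assert(rc == 0);
    (void)rc;
}

void compat_mutex_destroy(compat_kmutex_t *mp)
{
    assert(!mp->m_held);
    int rc = pthread_mutex_destroy(&mp->m_lock);
    assert(rc == 0);
    (void)rc;
}
```

Tracking the owner costs a few bytes per mutex, but it turns one whole class of threading mistakes into an immediate, debuggable failure. Condition variables and thread creation would need similar wrappers.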

June 7 is not far :)

GSoC Introduction: ZFS Port

Blog post by GeneralMaximus on Sat, 2011-04-30 18:08

I'm Ankur Sethi, a 20-year-old hacker from New Delhi, India. I mostly program in Python and Objective-C (on Mac OS X/iOS). This summer, I will work on porting ZFS to Haiku as part of Google Summer of Code 2011. My proposal lives here.

ZFS is a combined file system and logical volume manager built by Sun Microsystems (now Oracle) for OpenSolaris. Besides having a 'Z' in the name -- which automatically grants it +100 awesome points -- ZFS sports a feature set that will enable developers to build some incredibly neat applications on top of Haiku. For example, ZFS supports files and volumes up to 16 exabytes in size. It is designed from the ground up with a focus on protecting data from silent corruption (bit rot, cosmic radiation, etc.). Thanks to its copy-on-write nature, creating snapshots on ZFS is quick, easy and cheap, which takes the pain out of creating backups. This Wikipedia article does a better job than I ever could of describing why ZFS is, as Oracle's marketing department will be happy to let you know, the last word in filesystems.

I will be spending the Community Bonding period studying how the FreeBSD port of ZFS and zfs-fuse work. I will also be reading some of the available literature on ZFS. Once the coding period starts, I will use this blog to keep everyone updated on what I'm doing.

Looking forward to a great summer :)

Full Text Indexing: Search UI

Blog post by GeneralMaximus on Fri, 2009-07-31 15:34

So far, I have been working on the indexing part of Beacon, which is nearly complete. In the coming weeks, I will be running beacond (now index_server) as a service in the background so that I can find and squash whatever bugs remain in the code. For now, index_server is blazing fast, but that might be because the indexes I test against are just a few megabytes in size. From what I hear, though, CLucene can easily handle indexes which are several gigabytes in size without blinking an eye, so speed might not be an issue as the indexes grow. Anyway, any potential performance bottlenecks will only show themselves once people start using index_server regularly.

And now for the good part: I have a basic search UI for Beacon up and running. For now, it can only perform simple keyword searches. In the future, I would love to integrate full text search into the Tracker "Find" UI, but for now I'm concentrating on improving this simple search tool.

Here is a screenshot:

[Image: screenshot of the Beacon search UI]

/Files/pg/ is the directory where I'm keeping about 600MB of Project Gutenberg texts for stressing out index_server.

The next step is, of course, writing DataTranslators for a few file formats (as I've said before, PDF is top priority) and writing a simple preferences UI. I hope to post here soon with more good news :)

PS: Anybody interested in learning the CLucene query syntax can take a look at this page (although, in the future, Beacon will take a BeOS query and convert it into a CLucene query transparently).

Full Text Indexing: Status Update

Blog post by GeneralMaximus on Tue, 2009-06-30 13:18

After more than a week of thinking, "Today is the day I'll write that blog post", here I am with a status update on my HCD2009 project. I have only a few more points to add to what Matt has already posted here.

First of all, the previously unnamed full text indexing and search tool now has a name: Beacon. The indexing daemon currently in the works is called beacond. This is what beacond can do right now:

  • Monitor files for changes and add new/modified files to the index. Only plain text files are supported for now.
  • Handle mounting/unmounting of BFS volumes. Start watching volumes when they are mounted, and stop watching them when they are unmounted.
  • Selectively exclude certain folders from being indexed.
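The folder exclusion in the last point boils down to a path-prefix check that respects directory boundaries. The sketch below is not Beacon's code; path_is_under() and path_is_excluded() are hypothetical names, and the example paths are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if path is folder itself or lies somewhere beneath it. */
bool path_is_under(const char *path, const char *folder)
{
    assert(path != NULL && folder != NULL);

    size_t n = strlen(folder);
    /* Ignore trailing slashes on the folder, but keep a lone "/". */
    while (n > 1 && folder[n - 1] == '/')
        n--;

    /* The root folder contains every absolute path. */
    if (n == 1 && folder[0] == '/')
        return path[0] == '/';

    if (strncmp(path, folder, n) != 0)
        return false;

    /* The match must end on a component boundary, so that
     * "/a/private2" is not treated as being under "/a/private". */
    return path[n] == '\0' || path[n] == '/';
}

/* True if path falls under any of the excluded folders. */
bool path_is_excluded(const char *path,
                      const char *const *excluded, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (path_is_under(path, excluded[i]))
            return true;
    }
    return false;
}
```

A daemon would run such a check on every file it is notified about, before handing the file to the indexer. For a long exclusion list a sorted structure would be faster, but a linear scan is fine for a handful of folders.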

Right now, I'm mostly concerned with polishing beacond. A few short term goals are:

  • Reduce memory usage. Currently, beacond eats up about 60MB of memory, which is way too much for what it does.
  • Perform the actual indexing operation in a separate thread. This is required so that the daemon does not become unresponsive during long indexing operations.
  • Write a small tool which can search the index created by beacond (for demonstration and testing purposes only).
  • Several minor tweaks (properly saving/loading settings, better build system etc.).
  • Write a few DataTranslators so that beacond can be tested with different kinds of files. PDF is top priority.

In the long run, my major goals will be (1) seamlessly integrating Beacon with the existing Find tool in Haiku and (2) supporting more file types. But for now, the focus is on getting the daemon right.

If anybody wishes to check Beacon out, here is the project homepage (hosted on Google Code).
