Full Text Indexing: Status Update

Blog post by general_maximus on Tue, 2009-06-30 13:18

After more than a week of thinking, "Today is the day I'll write that blog post", here I am with a status update on my HCD2009 project. I have only a few more points to add to what Matt has already posted here.

First of all, the previously unnamed full text indexing and search tool now has a name: Beacon. The indexing daemon currently in the works is called beacond. This is what beacond can do right now:

  • Monitor files for changes and add new/modified files to the index. Only plain text files are supported for now.
  • Handle mounting/unmounting of BFS volumes. Start watching volumes when they are mounted, and stop watching them when they are unmounted.
  • Selectively exclude certain folders from being indexed.

Right now, I'm mostly concerned with polishing beacond. A few short term goals are:

  • Reduce memory usage. Currently, beacond eats up about 60MB of memory, which is way too much for what it does.
  • Perform the actual indexing operation in a separate thread. This is required so that the daemon does not become unresponsive during long indexing operations.
  • Write a small tool which can search the index created by beacond (for demonstration and testing purposes only).
  • Several minor tweaks (properly saving/loading settings, better build system etc.).
  • Write a few DataTranslators so that beacond can be tested with different kinds of files. PDF is top priority.

In the long run, my major goals will be (1) seamlessly integrating Beacon with the existing Find tool in Haiku and (2) supporting more file types. But for now, the focus is on getting the daemon right.

If anybody wishes to check Beacon out, here is the project homepage (hosted on Google Code).