Working on WebPositive

Blog post by PulkoMandy on Fri, 2013-08-30 00:56

As you may know, I'm going to spend some time again as a full-time Haiku developer. This time, I'll be working on improving WebPositive and the WebKit port to bring a better web browsing experience to Haiku users.

During the past weeks I've managed to spare some free time to get up to speed on the various pieces of code involved and how to work with them. This first blog post summarizes the current state of affairs and sets some goals (with your help) for the next months.

Architecture overview
A web browser is not as simple as it looks. There are several different things working together to bring your web pages to the screen:

  • A rendering engine: we use WebKit, the same engine used by Apple Safari. Google Chrome used to share this engine as well, but they have now forked it into something called "Blink".
  • A network backend: the backend manages the HTTP protocol, including cookies, and also the HTTPS stuff. The default network backend in WebKit uses the curl HTTP library. This backend is designed for testing purposes, but it is what we currently use for the Haiku port of WebKit. It has big performance problems and we had to disable any form of caching because of some bugs in it.
  • A browser shell: the shell is just a user interface to the other components. We have two of them: HaikuLauncher is a very simple one used only for testing purposes, and WebPositive is a more complete one you all know and use.

The main performance bottleneck in the current implementation is the network backend. There has been some work, started in 2010 as part of Google Summer of Code by my then-schoolmate Christophe Huriaux (aka Shisui), to build a better backend API. This is known as the Services Kit.

The Services Kit currently handles HTTP and HTTPS requests in a way similar to what curl does. It is part of the APIs Haiku applications can use. The advantages of the Services Kit when compared to curl are:

  • A better cookie jar: curl was not designed to power a complete web browser. Its performance decreases a lot when there are many cookies to manage. While it works well enough for simple automated tasks (downloading content from a web page after logging in, for example), after some time surfing the web you will have lots and lots of cookies and parsing them makes any http request very slow. The Services Kit cookie jar uses an algorithm that scales better and finds cookies much faster.
  • Extensibility to other protocols: curl works only with http and https. A web browser will need support for FTP, and possibly other protocols such as gopher. In the future, more protocols may be needed as well. The Services Kit allows protocol add-ons to extend its functionality.
  • Be-style API: the Services Kit can notify a BHandler of page events using BMessages. This makes it very easy to plug it into the typical heavily multi-threaded Haiku application, and lets the network activity happen in an internal worker thread. As a result, the UI is more responsive and any busy-looping is removed.
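To illustrate the cookie jar point: one common way to make cookie lookup scale is to index cookies by domain, so that a request only touches the cookies stored for its host and parent domains instead of scanning the whole jar. The sketch below is only an illustration of that idea, not the actual Services Kit algorithm; the `CookieJar` class and its method names are hypothetical, and the path and domain matching is deliberately simplified.

```python
from collections import defaultdict


class CookieJar:
    """Hypothetical cookie jar indexed by domain (illustration only)."""

    def __init__(self):
        # domain -> {(path, name): value}
        self._by_domain = defaultdict(dict)

    def set(self, domain, path, name, value):
        key = domain.lower().lstrip(".")
        self._by_domain[key][(path, name)] = value

    def cookies_for(self, host, path="/"):
        """Collect cookies for host by probing only its candidate domains."""
        labels = host.lower().split(".")
        matches = {}
        # Walk from the least specific suffix to the full host name, so a
        # cookie set on a more specific domain overrides a more general one.
        # A real jar would also refuse public suffixes such as "com".
        for i in range(len(labels) - 1, -1, -1):
            domain = ".".join(labels[i:])
            for (cookie_path, name), value in self._by_domain.get(domain, {}).items():
                # Simplified prefix match; real path matching is stricter.
                if path.startswith(cookie_path):
                    matches[name] = value
        return matches


jar = CookieJar()
jar.set("example.com", "/", "sid", "1")
jar.set("www.example.com", "/app", "theme", "dark")
jar.set("other.org", "/", "x", "y")
print(jar.cookies_for("www.example.com", "/app/page"))
```

With this layout, the cost of a lookup depends on the number of labels in the host name and the cookies stored for those domains, not on the total size of the jar.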

So, this all sounds amazing, but the Services Kit is not finished yet, and the integration with WebKit was started, but didn't go far enough to be made available to everyone.

Work already done

To get started with all this I have already done the following:

  • Services Kit and WebKit integration: I have worked with some help from Hamish Morrison (GSoC 2012 student who ported OpenJDK to Haiku) to bring the experimental WebKit/Services Kit integration back in working order. Hamish made some changes and improvements to the Services Kit, and the WebKit side had not been updated to match. With this fixed, I now have a WebKit version that does not use curl anymore. This build feels faster when loading web pages; unfortunately, it also has some easy-to-trigger crashing bugs (when middle-clicking a link, for example) and missing features (such as file:// URL support and, more importantly, cookies). So, no testing binary for you just yet. I started fixing some issues, such as POST data not being sent in the proper format.
  • Update for fRiSS: as the current maintainer of the fRiSS RSS feed reader, I have also started updating it to use the Services Kit. The goal was to have a very simple testing application for basic Services Kit functionality, as debugging something as big as the web browser makes things rather complex. Smaller applications like fRiSS allow easier testing of some parts of the Services Kit.

The work also included a lot of boring tasks, such as locating the latest version of our WebKit fork (currently on aldeck's GitHub account), getting it to build (this takes several hours on my current computer; I ordered an SSD for it to make this more workable), and updating some readmes and other documents that did not reflect the current state of things.

Work to come

With this ground work behind me, I plan to start with the following:

  • Document the Services Kit: it is currently not documented in the Haiku Book. I will add it there to make it easier to understand it and help others start using it.
  • Debug the WebKit integration: make sure all websites are still working fine with the new backend.
  • Improvements to WebPositive: search engine customization, and others
  • WebKit upgrade: our WebKit port is not in sync with their trunk development. Updating it usually requires some work because they are changing stuff at a rate we can't follow.
  • Set up a WebKit build bot: previous attempts to upstream our changes to WebKit were cancelled for lack of a build bot to test the port. Getting the build bot running should be easier as the work on package management brings us (as a side effect) a cross compiler we can use for this.

Comments

Re: Working on WebPositive

This sounds great! Will your work be in a public branch we can track? I'm really excited to see the Services Kit actually documented too. There's been some work I've wanted to do, but found the lack of documentation on the Services Kit quite a hindrance :(

Re: Working on WebPositive

Thanks for the update.

I understand that cURL may not be the best choice for a browser backend, but:
"curl works only with http and https"
is not true. cURL supports multiple protocols, as can be read on its homepage: http://curl.haxx.se/libcurl/
You can manage the cookies yourself too - there's no requirement to let cURL handle them.

Regards,
ahwayakchih

Re: Working on WebPositive

Yay! Awesome.

Re: Working on WebPositive

Your post is full of false information. I really wonder how you obtain your information and why you are confident enough about it to exclude any wording that might hint at something being your unvalidated assumption versus something you really checked up on.

  • The cURL backend in WebCore is not designed for testing purposes. It was actually used by ports, and when I was working on WebKit, it was still used by one version of the Windows port (CE?). Most of the prominent WebKit ports however use a bridge to their own native network services implementation. It means the cURL backend received little attention and there was talk about removing it from WebCore, if memory serves. It doesn't mean however that the cURL backend was bad, or that it couldn't be extended, or that the problems we see are not within our own code that integrates this backend with our port (!). [Notice how I include some wording that makes it clear what I know as fact and what I recall from my untrustworthy memory. :-)]
  • As far as protocols go, you got it all wrong. cURL is the library that supports various protocols, whereas our immature Services Kit supports not even HTTPS.
  • No, we did not have to disable caching due to some bugs (just where have you got that from?!?), I just never got around to implementing any caching of network resources. This is actually one of the biggest bottlenecks in our port. And switching to our Services Kit will not help the least bit.
  • What you write about the Services Kit being based on BHandler notifications and how it can help make the UI more responsive is clueless. Everything WebCore related has to run on the main thread, this being the BApplication thread in WebPositive, even rendering web pages. How network (or any other) notifications arrive on this thread is completely irrelevant to UI responsiveness. There is no busy-looping in the cURL backend that I am aware of. Notifications work via timers.
  • It is correct that a growing cookie jar becomes a serious performance problem. If the Services Kit indeed has a better scaling implementation, that's great. However, when I checked in the GSoC work on the Services Kit into the Haiku repository, I recall that the cookie implementation was very incomplete. This will be a big problem of its own.
  • Setting up a WebKit build bot is the last thing you need to worry about. The Haiku port has been removed from the official repository. Even if you had a build bot, you would have to make the test suite work. That is a major piece of work. The work is not in making every test pass, but in marking the expected test results for the Haiku port, so that new test results can be compared to the status quo and breakage can be detected.
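
The idea of comparing new test results against recorded expectations can be sketched roughly as follows. This is only an illustration of the principle: the `find_regressions` function, the result strings, and the test names are made up, and this is not the format WebKit's test tooling actually uses.

```python
def find_regressions(expected, actual):
    """Return tests whose result differs from the recorded expectation.

    expected: test name -> result recorded for the port (the status quo).
    actual:   test name -> result of the current run.
    Tests without a recorded expectation are assumed to pass.
    """
    changed = {}
    for test, result in actual.items():
        baseline = expected.get(test, "PASS")
        if result != baseline:
            changed[test] = (baseline, result)
    return changed


# Hypothetical data: one known failure, one new breakage.
expected = {"fast/css/a.html": "FAIL"}
actual = {
    "fast/css/a.html": "FAIL",   # known failure, matches the status quo
    "fast/dom/b.html": "FAIL",   # newly broken
    "fast/dom/c.html": "PASS",
}
print(find_regressions(expected, actual))
```

A known failure that keeps failing is not reported; only a change relative to the recorded state is.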

I welcome your intentions to work on the Services Kit integration. However, you will have to work on Services Kit itself. A lot. It makes absolutely no sense (to me) to start with the documentation, as long as the Kit is still in this incomplete state.

As far as problems in WebPositive go:

The cURL network backend works right now. The cookie jar storage is not good. It could be sped up with caching (independently of switching to Services Kit). There are likely still bugs somewhere along the path, as for example the chat feature fails to work in GMail. And there must be something wrong in the wiring code that makes it perform badly compared to the Services Kit prototype. I welcome your plan to switch to Services Kit, but you need to be aware that it is utterly incomplete compared to cURL or other backends. You make it sound like your work will be making the switch. No, it will be improving Services Kit to the point where you can even make the switch without major regressions.

The other big problem area in WebPositive is the rendering. The Haiku drawing APIs simply are too limited. We would need affine transformations, clipping paths, alpha masks, more blending modes, other vector features like line dashing. Missing shadow support makes a lot of pages render wrongly.

The text rendering API in Haiku is insufficient for complex text layout, preventing web pages in languages that need complex layout to render properly.

One of the biggest problems is dealing with SSL certification errors. Users need a way to bypass these errors and temporarily or permanently accept untrusted certificates.

Re: Working on WebPositive

Hi,
Thanks for the information. I'll take it into account :)

You're right, I didn't work much on WebKit and WebPositive in the past and I'm still learning about the issues and challenges. I tried to gather as much information as I could from previous posts on the website and some of the mailing list archives, but sometimes it's hard to find up to date and accurate information. I'll try to be clearer on this next time.

I definitely have HTTPS connections running with a Services Kit build of WebPositive. nielx started the work on this and we're now able to load a page over HTTPS. This isn't a complete implementation yet: as far as I can tell, there is no checking of certificates and no way to access them from the API. This is the missing part for a complete HTTPS implementation, but since our current solution doesn't allow overriding the failure for an untrusted certificate, I'd say both backends are equally unfinished in this regard.

I didn't write that the Services Kit already handles all protocols, only that it can be extended using add-ons to do so. We will need to handle things like file:// URLs, for example; this currently doesn't work in Services Kit and does work with curl. I'm not sure where the problem comes from, then, but the integration of curl in WebKit/WebPositive/Haiku makes FTP transfers unusable. You're right, curl does already support gopher. But Web+ does open gopher URLs in BeZilla right now. So there's some missing integration somewhere for this as well. So the problem actually lies somewhere in the integration.
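
To make the add-on idea more concrete, here is a rough sketch of a registry that maps URL schemes to handlers, so that supporting a new protocol means registering one more handler. It is only an analogy: `ProtocolRegistry` and the handlers are hypothetical names, and this is not how the Services Kit add-on mechanism is actually implemented.

```python
from urllib.parse import urlsplit


class ProtocolRegistry:
    """Hypothetical scheme -> handler table (illustration only)."""

    def __init__(self):
        self._handlers = {}

    def register(self, scheme, handler):
        self._handlers[scheme.lower()] = handler

    def fetch(self, url):
        scheme = urlsplit(url).scheme.lower()
        handler = self._handlers.get(scheme)
        if handler is None:
            raise ValueError(f"no handler registered for scheme {scheme!r}")
        return handler(url)


registry = ProtocolRegistry()
# Stand-in handlers; real ones would perform the actual transfer.
registry.register("http", lambda url: f"HTTP fetch of {url}")
registry.register("file", lambda url: f"local read of {url}")
print(registry.fetch("file:///boot/home/index.html"))
```

A URL with an unregistered scheme (gopher, for instance) fails cleanly until someone registers a handler for it.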

I'm well aware of the problems with the current Services Kit code. However, I think it is a good addition to the Haiku API, and currently, nothing is using it. As a result, trying to get WebKit running on it is a good stress test for both sides and has already exposed some issues. My argument about the BHandler-based notification also goes that way: it makes it easier to work on native Haiku applications and avoid locking issues when working with both the network and some UI code. Even if that doesn't apply to WebKit, it's still the main justification for writing the Services Kit with a Be-style API.
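
The general pattern I have in mind can be sketched like this: a worker thread does the network work and posts messages to a queue, and the receiving thread handles them one at a time, so the two sides never share state directly. This is a plain Python analogy, not Haiku code; the queue stands in for message delivery to a handler, and all function and message names are made up for the example.

```python
import queue
import threading


def fetch_in_background(url, inbox):
    """Simulate a request on a worker thread, posting messages to inbox."""
    def worker():
        inbox.put(("started", url))
        body = f"<html>content of {url}</html>"  # stand-in for real network I/O
        inbox.put(("data", body))
        inbox.put(("completed", url))

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def run_message_loop(inbox):
    """Handle messages on the calling thread until the request completes."""
    seen = []
    while True:
        kind, payload = inbox.get()  # blocks until a message arrives, no polling
        seen.append(kind)
        if kind == "completed":
            return seen


inbox = queue.Queue()
worker_thread = fetch_in_background("http://example.org/", inbox)
print(run_message_loop(inbox))
worker_thread.join()
```

Because the receiving side only ever sees complete messages, it needs no locks of its own around the request state.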

As for writing documentation for the Services Kit, even in the current state, here are the things it can bring:

  • More people using it, bringing more test exposure and more bug reports.
  • Getting an easier entry point to the way the kit works. I think it is now clear for everyone I need to do more homework on this. Writing the documentation is a good learning experience as well.
  • Even if the implementation is not up to the task, the API may not need to change that much. Or maybe it will, but once I spot some things that have to change, I will not waste time documenting them. I can work on this in parallel with (or after) the redesign tasks.

You have noticed that the WebKit buildbot comes very last in my TODO list. I will of course not start with this, and work on it only after things are working and up to date again.

On the graphic rendering side, I didn't start researching anything yet. I think there is more than enough work on the networking backend, and the slowness of Web+ is what worries me more for now.

Re: Working on WebPositive

Regarding shorter build cycle, did you look at using gyp (meta builder) + ninja (builder) + gold (fast linker) ?

Here are some hints for a Linux host, but I guess some can easily be productive under Haiku too.
https://code.google.com/p/chromium/wiki/LinuxFasterBuilds.

Another alternative is cross-building instead of native building Web+.

Re: Working on WebPositive

Porting gold to Haiku is not as easy as it sounds. There are some tricks for getting a working ELF binary that are not easily solved.

As for gyp/ninja, this would need a rewrite of the Haiku build which currently uses Jamfiles.
As stated on the page, the first hint is the most efficient one, so maybe I'll try setting up a distcc farm.

Anyway, I'll be working mostly on Services Kit and the Haiku-specific code in WebKit. This does not require recompiling very big parts and has a very acceptable code/build/test cycle time. Things could get harder when working on updating WebKit itself, but I'll look into that only a bit later.

Re: Working on WebPositive

The other big problem area in WebPositive is the rendering. The Haiku drawing APIs simply are too limited. We would need affine transformations, clipping paths, alpha masks, more blending modes, other vector features like line dashing. Missing shadow support makes a lot of pages render wrongly.

The text rendering API in Haiku is insufficient for complex text layout, preventing web pages in languages that need complex layout to render properly.

Would you be interested in contract work to work on updating Haiku's drawing and text rendering APIs so that Web+ would be able to render more webpages correctly stippi?

Re: Working on WebPositive

jscipione wrote:

Would you be interested in contract work to work on updating Haiku's drawing and text rendering APIs so that Web+ would be able to render more webpages correctly stippi?

I am quite busy with my work project. In the current situation, I won't be able to do full-time contract work for Haiku, Inc. I would certainly be interested to work on this problem area, if the situation changed. It's fun and interesting work and I know the concerned code-base very well.

Re: Working on WebPositive

If your current machine takes hours to build WebKit you really should consider getting a new one. I would think the build is mostly CPU bound (compiling lots of C++), so an SSD probably won't help a lot in this respect (though it certainly doesn't harm). On my aging Core i7-860 (i.e. first generation) with a regular HD the haikuwebkit packages build in about 25 minutes, which already includes several minutes for building the source package. So with a more recent CPU you should get rather decent compile times.

BTW the WebKit build system could need some love. Automatically installing libraries in home during the build process is not OK (and doesn't work with PM). Please also cf. the patch for building the packages.

Re: Working on WebPositive

Yes, this computer is rather old and it's no surprise that compiling takes some time. It's an Intel Core 2 P8600, so I expected some problems in that area. The SSD can easily be moved to my next machine when I decide to change anyway, so that's not a waste of money in any case.

Thanks for the patch. I noticed this strange idea of removing the only working browser on my computer to replace it with the experimental version. I'll apply it.

Re: Working on WebPositive

Nice to see some work on WebPositive, I hope the extensive work needed doesn't frustrate you too much, remember our little Haiku community is always here to support you.

Re: Working on WebPositive

Awesome job. I was actually wondering what happened to Services Kit -- when I couldn't find any doco, I thought it had been abandoned. I look forward to your progress :)

Re: Working on WebPositive

Hi PulkoMandy,

Although I recognise that you see value in bringing the Services Kit up to a usable state, I wonder if it makes sense to use it for WebPositive at this early stage? Perhaps a POSIX portable backend is better (or just fixing up the holes in the current curl-based one). From a quick google perhaps the Chromium Network Stack is a good fit - http://www.chromium.org/developers/design-documents/network-stack - they do have cookie support, websocket support, etc, and obviously the bindings to webkit are already done.

I appreciate that porting isn't as much fun as designing and writing new code, and it's your time and your project, so feel free just to do a Services Kit based approach. It just feels to me like adding more Haiku-specific bindings to WebKit that no one else will help us maintain will make the already difficult task of keeping up with changes in WebKit even harder.

Simon

Re: Working on WebPositive

The Services Kit is not anywhere near a WebKit-only development. It will also be useful for any application dealing with HTTP/HTTPS (in the current state) and other network protocols (in the future). This is very useful to ease the development of other network-oriented applications on Haiku. The fRiSS RSS reader is an example of this, and I'm fairly sure there are others.

WebKit and WebPositive happen to be a very good test case and stress test for this library, and a good way to find bugs and design errors in it.

Also, the code is already written and mostly working. It needs some fixes here and there and I'm likely to encounter areas where the design can be improved, but I've been making some progress even while working a few hours a week (I will start working full time only in October). I now have basic support for cookies in place, for example.

The network backend from Chromium is not thread-safe, as they designed it to work in their "one page, one process" system. We are not working this way in WebPositive (there is only one instance of the app running) and it looks like it may not fit our needs that well. It may be fitted into WebKit, which is single-threaded as well, but then there would be no way for other apps to make use of it.

The WebKit API for plugging in a network backend is fairly clear and not too complicated, and each port (or at least most of them) comes with its own code for handling network communications. We have the advantage that our Services Kit system library is designed with this in mind, so it is quite easy to interface.

I don't think it is worth dropping this existing implementation and starting over again. We have something that works (not completely, but for the most part), and a design that makes it reusable for many applications. I'm not going to replace this with an application-specific solution.

Re: Working on WebPositive

@PulkoMandy

I am going to donate in the next couple of weeks as my budget permits. I just want to know, from the end user perspective, what can I expect to see from this? What will my internet experience be like when you are done as compared to the way it is now?

David

Edit: I realized this post sounded a bit snarky and I don't mean it that way. I just want to know what this will do to the user experience, or is it only 'back end' kind of stuff.

Re: Working on WebPositive

The first part of the work will happen mostly in the backend, but there should be visible page loading speed and download speed improvements in WebPositive.

Once that's done, I'll work in other areas to make Web+ work better. This includes support for downloadable fonts, some other advanced rendering primitives as stippi mentioned earlier, a working page and data cache to make Web+ even faster, and some other issues I'll discover while testing various websites.

The next step is an update to our WebKit port. We're quite far from the main development branch again, and this update will bring us more HTML5 features.

And then, there is some work to do in Web+ itself such as search engine customization, and some user interfaces for the improvements above. We'll need a "clear cache" button, a cookie manager, and so on.

All of this adds up to quite a lot of work, and could keep me busy for several months.
I'll do another blog post next week or maybe a bit later with more information.

Re: Working on WebPositive

Thanks for the reply and clarification. :)