WebKit

WebKit weekly report #8

Blog post by PulkoMandy on Fri, 2013-11-22 08:19

Hello there!

So, this wasn't a fun week. Last week I finished merging all commits from the main WebKit repo. I got Web+ to run fairly well with these changes and I wanted to merge this into Haiku. However, while this works for the gcc2-hybrid version, I was not able to build a package for gcc4 and gcc4h. As a result, there were no nightlies published (the script wants all the architectures to work for a given revision before it gets published).

When trying to build the gcc4 package, I ran into several different problems: haikuporter triggering a KDL when running webkit "make install", the CMake script I used to build the gcc2h packages not working as it should on gcc4, and a few more. Since I don't usually run a gcc4 install, I had to do all this from a temporary install on an SD card, which is very slow at compiling, or, actually, at doing anything. Even a simple rm -r can take a minute or so to complete.

Meanwhile, I'm making some (slow) progress on getting the testsuite to run. For this to run I have to fix a tool called DumpRenderTree. This is a small executable wrapped around WebKit that renders a web page, and dumps to the console either a tree structure of it, or a plain-text dump of the page contents. It should also generate PNGs of the pages in some cases. The testsuite runs DumpRenderTree on a set of pages, compares the results with "expected" files, and makes sure everything matches.
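The run-and-compare step can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual WebKit test harness: the function name, the file name and the fake tool (a Python one-liner standing in for DumpRenderTree) are all made up for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_compare(command, expected_path):
    """Run `command`, capture its stdout, and compare it with the expected file."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    expected = Path(expected_path).read_text()
    return result.stdout == expected


# Stand-in for the real tool: a Python one-liner that prints a fixed "dump".
FAKE_TOOL = [sys.executable, "-c", "print('layer at (0,0) size 800x600')"]

with tempfile.TemporaryDirectory() as tmp:
    expected_file = Path(tmp) / "test-expected.txt"

    # Matching "expected" file: the test passes.
    expected_file.write_text("layer at (0,0) size 800x600\n")
    PASSED = run_and_compare(FAKE_TOOL, expected_file)

    # Different "expected" file: the test fails.
    expected_file.write_text("something else\n")
    FAILED = not run_and_compare(FAKE_TOOL, expected_file)

print(PASSED, FAILED)
```

A real harness also has to deal with timeouts, crashes and the PNG results, which this sketch leaves out.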

The problem I have is that DumpRenderTree always dumps the render tree instead of the plain-text dump the "expected" files have. To switch between the modes, DumpRenderTree expects the page to call JavaScript code to configure the output. The JavaScript interface can also do some other useful things, like clearing cookies, or otherwise interacting with the tool.

Since the WebKit API is specific to each port, the DumpRenderTree tool must be rewritten to use each of these APIs. I updated our version, modelling it on the EFL one. I now have something that should work, but somehow the JS callbacks aren't working.

I tried building WebKit with all assertions enabled to make sure everything was right. I had to fix some minor harmless things to get HaikuLauncher to run, but DumpRenderTree would hit asserts over and over again. I tried disabling one, only to hit the next. Something looked strange, with methods as basic as isMainThread() failing (this is just comparing a TLS value with a global). I added some traces, and noticed the TLS was ok, but the global was changing value. I spent some time trying to look for something overwriting it, but then I noticed it was also changing address. This shouldn't be possible. Well, that pointed me to the actual cause of the problem: some files are getting linked twice into DumpRenderTree. What happens is we first build a big "libWebKit" library. This exports only our WebKit API, and inside it, there is also all the core code (WTF, JavaScriptCore and WebCore), the cross-platform parts of WebKit. The idea is we want the library to only export our official API, and not let apps mess with the internals.

However, DumpRenderTree does need to mess with the internals. It adds custom JS APIs for the app to access, reads internal data to extract frame content as text, and takes some shortcuts calling WebCore directly to do things in a way that works on all platforms. So, DumpRenderTree links not only our libWebKit, but also the WTF, JSC and WebCore libs again. Since we build these as static libs, we end up with code both in DRT itself and in libWebKit. Things then start breaking apart because these can't see each other. We get duplicated static variables, initializations called on one side that set variables the other side can't see, etc.
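The effect is easy to reproduce outside of C++. The following Python sketch is only an analogy: it loads the same source twice as two independent modules, which plays the role of the same static library being linked into both libWebKit and DRT. All names are made up for the illustration, and the "thread id" is a plain integer rather than a real TLS value.

```python
import types

# Hypothetical stand-in for a core library holding one piece of global state.
CORE_SOURCE = """
main_thread_id = None

def initialize(thread_id):
    global main_thread_id
    main_thread_id = thread_id

def is_main_thread(thread_id):
    return thread_id == main_thread_id
"""


def load_copy(name):
    """Load an independent copy of the 'library', like linking it a second time."""
    module = types.ModuleType(name)
    exec(CORE_SOURCE, module.__dict__)
    return module


copy_in_libwebkit = load_copy("core_inside_libwebkit")
copy_in_drt = load_copy("core_inside_drt")

# Initialization happens through one copy only...
copy_in_libwebkit.initialize(thread_id=1)

# ...so the other copy still sees its own, uninitialized global.
print(copy_in_libwebkit.is_main_thread(1))  # True
print(copy_in_drt.is_main_thread(1))        # False
```

Both copies run exactly the same code, but each one has its own global, which is why the value (and the address) seemed to change depending on who was asking.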

So, I moved back to the shared mode. This is what we used to do when WebKit was built with Jam. In this mode, WTF, JSC and WebCore are built as shared libraries, and libWebKit only links against them instead of embedding them. This allows things to exist only once in the running executable and DRT to access all the internals as it should.

But... that didn't work. I hit a command-line length limitation.

In the Jam build, we used to build multiple small static libraries (webcore1.a, webcore2.a, etc) and then link them together in a big shared library. This allows us to send a smaller set of files to ar to create the static libs and stay under the maximal command-line length. CMake uses a different trick to keep command lines short: it uses response files. These were added to gcc and the binutils to solve the very same problem on Windows. The idea is to put the command-line args into a file, and give it to the tool on the command line using the @file syntax. This works well with ar (when building the static libs), but our gcc will unfold the thing before calling some of the internal tools, and we hit the problem again when invoking collect2 (this is an internal executable used by gcc) or ld. It seems this is because gcc doesn't know that the linker it is calling is compatible with response files, so it tries unfolding the command line to be more compatible.
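To make the @file idea concrete, here is a small Python sketch of what "unfolding" a response file means. It is not how gcc or the binutils implement it: the function name is made up, the object file names are placeholders, and shlex is used as an approximation of the real quoting rules.

```python
import shlex
import tempfile
from pathlib import Path


def expand_response_files(args):
    """Replace each '@file' argument with the whitespace-separated options in file."""
    expanded = []
    for arg in args:
        path = Path(arg[1:]) if arg.startswith("@") else None
        if path is not None and path.is_file():
            expanded.extend(shlex.split(path.read_text()))
        else:
            # Not a readable response file: keep the argument as it is.
            expanded.append(arg)
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    rsp = Path(tmp) / "objects.rsp"
    # Placeholder object files; a real WebCore link has thousands of them.
    rsp.write_text("a.o b.o\nc.o\n")

    SHORT_COMMAND = ["ar", "rcs", "libwebcore.a", "@" + str(rsp)]
    LONG_COMMAND = expand_response_files(SHORT_COMMAND)

print(LONG_COMMAND)  # ['ar', 'rcs', 'libwebcore.a', 'a.o', 'b.o', 'c.o']
```

The short form stays the same length no matter how many objects are listed in the file. The unfolded form grows with every object, and that is the one that runs into the command-line length limit.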

I'm now rebuilding gcc with the --with-gnu-ld configure option. This should force it into using response files when invoking the linker, and finally get our libraries built. Note that this will change the ABI again and the haikuwebkit packages for both gcc2hybrid and gcc4 will have to be rebuilt. I'm not sure yet if we should do this only for the tests, or if we should also do it for the distributed libraries. It makes sense to distribute the ones we have tested, but the single-library solution should be faster (less symbol lookup) and use less memory at runtime.

If the gcc rebuild doesn't work, I'll have to use the same trick we did with Jam, making invasive changes to the cmake build. This would make it more difficult to merge with the official WebKit sources in the future.

Also, some sidenotes:

  • My patches to CD, IM, and IUP have been upstreamed. We now have a portable UI toolkit that uses native widgets
  • Kallisti5 is working on providing a server that runs Haiku. We will use this once he is done setting up to run the CMake testsuite/nightly builds, and run the WebKit tests. This is a step towards getting parts of our port merged upstream again.

WebKit weekly report #7

Blog post by PulkoMandy on Fri, 2013-11-15 08:39

Hello world!

This week I reached a major milestone in my work, as I'm done merging all WebKit commits all the way to November 2013. HaikuLauncher is still running fine, and I will make the small required changes in WebPositive so it works fine again. Expect an updated WebPositive in trunk very soon now.

This week's merges were as boring as the previous week's: addition of the NiX port, removal of the Qt one (the next version of Qt will go with Blink), replacement of a lot of never-null pointers with references, renaming of the KURL class to URL, and some API changes. WebKit has started using some C++11 features to help with cleaner and faster code.

I've also started looking at the test system. WebKit uses a tool called DumpRenderTree that will produce a text representation of the render tree. This allows running tests that are not dependent on any platform. There are also 'pixel tests', where WebKit renders a web page, and compares the results with a PNG file. And finally, there are some JavaScript-only tests for JavaScriptCore.

I've updated our version of DumpRenderTree (since WebKit has no common API on different platforms, each platform must provide an implementation for it). I've got it to run in the mode where you give it a single test on the command line, which is good for manual testing, but the testsuite instead feeds test names on the standard input, and this somehow crashes our version currently. I'll debug that and run the 33000+ tests to get a better idea of where we are.
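The second mode boils down to a loop that reads one test name per line until the input ends. Here is a rough Python sketch of that loop. The function, the test names and the fake runner are invented for the example, and the real protocol between the testsuite and DumpRenderTree is not described here.

```python
import io


def run_tests_from_stream(stream, run_one_test):
    """Read one test name per line and run each; stop at end of input."""
    results = {}
    for line in stream:
        test_name = line.strip()
        if not test_name:
            continue  # ignore blank lines
        results[test_name] = run_one_test(test_name)
    return results


# Made-up test names and a fake runner, for illustration only.
fake_stdin = io.StringIO("fast/example-a.html\n\nfast/example-b.html\n")
RESULTS = run_tests_from_stream(fake_stdin, lambda name: f"dump of {name}")

print(RESULTS)
```

In an actual tool the stream would be sys.stdin, and the single-test mode is then just the same runner called once with the name taken from the command line.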

There will be some more work to do before we can run the complete testsuite, for example tests involving HTTP require an Apache web server port (does anyone have a recipe for this?), and many methods in the test frameworks aren't implemented yet. These allow the tests to access some browser settings from JavaScript, for example they can disable cookies, change cache policies, and so on. This allows testing of WebKit with all possible settings.

I want to get the testsuite running to help with future work on adding new features and the merges we'll have to do. This will help getting a better idea of the status of our port, and make sure we don't break things when merging new versions of WebKit in. Also, the tests are usually simple and the failing ones will point to the features we are missing, making the port much easier to fix and improve.

WebKit weekly report #6

Blog post by PulkoMandy on Fri, 2013-11-08 07:32

Hello world!

Sorry for no report last week, I was not in front of the computer on Friday.

Anyway, I got the HTTP authentication working last week. This was the last missing feature in the Services Kit version of WebKit when compared to the current cURL one. The next step is to fix the new rendering bugs.

The rendering side of things is mostly built into WebKit, so I didn't want to fix it on the old version we are still running. So, I have started merging WebKit changes all the way to the current revision. Unfortunately, our WebKit repository wasn't created the right way, and the commit hashes for WebKit commits didn't match the ones for the official WebKit repo. I had to create a new repository, and manually match the commits and play with git rebase, merge and cherry-pick to rewrite all our work against the official WebKit commits. This took some time, as there are 120000 WebKit commits in our repository, plus our own changes.

I have removed the old repository and uploaded a new one. My work happens in the 'rebased' branch, which is the default one.

With the repository rebased onto the official WebKit, it is now possible to use git merge to merge commits directly from the official WebKit mirror. Our port was about 2 years behind, in WebKit world that's more than 30000 commits. I didn't merge those all at once, instead I'm working with ranges of about 2000 to 3000 commits. While one range is merging and compiling, I can review the commits for the following one, and take note of the important changes. Knowing what happened helps a lot when I need to do a merge or fix the build.
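As a quick back-of-the-envelope check of what that means in number of merges, here is a tiny Python sketch. The chunk size of 2500 is just the middle of the 2000 to 3000 range, and the function is made up for the illustration (the real ranges were picked by hand).

```python
def split_into_ranges(total_commits, chunk_size):
    """Return (start, end) index pairs covering total_commits in chunk_size steps."""
    return [
        (start, min(start + chunk_size, total_commits))
        for start in range(0, total_commits, chunk_size)
    ]


RANGES = split_into_ranges(30000, 2500)

print(len(RANGES))  # 12
print(RANGES[0])    # (0, 2500)
print(RANGES[-1])   # (27500, 30000)
```

So catching up on 30000 commits this way is on the order of a dozen merge-and-build rounds.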

I have merged commits up to April 2013 so far. This is the point where the Chromium port gets removed, and later on the wxWidgets port. The GTK one switches to CMake as a build system (the same we use now). These are the changes that create the most conflicts with us, so I'm going to merge this part in smaller chunks (I tried a big one and got a build error I didn't know how to solve).

Following the removal of the Chromium port, WebKit entered a phase of cleanup and simplification. This includes removing Chromium-specific stuff, making the code simpler in some places, and also using some C++11 features that help with detecting errors at compile time.

One important change is the introduction of "platform strategies". Before that, porting WebKit involved implementing many classes where the header file was shared across platforms, but not the C++ sources. This led to a lot of #ifdef-guarded platform-specific code all over the place. The platformStrategy class keeps all the platform-specific code in one place, while the shared code goes elsewhere. Our port will be much easier to keep working this way.

So, which new features are brought in by this huge merge? Well, none so far. Most of the features in WebKit are optional, and to keep things easier, we disabled most of them. As features are sometimes removed when no port uses them, I'm first going to merge all the changes, then see what can easily be enabled. Some of the features involve only multi-platform code, but others require some porting.

Anyway, the up-to-date WebKit is faster, smaller, and has far fewer bugs, which are already very good things. When I'm done with the merge I will have a look at running the WebKit tests, as I'm pretty sure this will help catch some bugs and at least give us a better idea of the work left to do. Running a testsuite is now much easier thanks to the use of CMake, where the guys working on the EFL port made some changes to the build system to help with setting this up. I will also test all the websites I have listed on my TODO/check list, to make sure things that used to work are still working.

I hope to have something in shape for a merge next week or the one after that. The remaining part of the month will be dedicated to enabling some of the extra features, but with only 2 or 3 weeks left I will not be able to work on the most complex ones. Haiku, Inc currently can't fund another month of development, so it looks like I'll have to stop there, unless there are more people donating money this month.

WebKit weekly report #5

Blog post by PulkoMandy on Fri, 2013-10-25 06:50

Hello there !

Well, not so much progress on WebKit this week. I spent most of the time working on CMake code to get it to generate hpkg files. I got something that works well enough to link Web+ to it, so I can test things with the actual browser instead of HaikuLauncher.
Today I added the cookie jar persistence, so Web+ remembers all the cookies when you exit and relaunch it. I also started working on HTTP authentication. These are the two features I couldn't test with HaikuLauncher, as it lacks some code for them (saving cookies on exit, and showing the HTTP authentication password prompt window).

I also implemented the protocol handler for file: URLs, so now I can browse the Be Book and Web+ default home page works as well.

With the HTTP authentication fixed, I think I will be on par with the current code, so I could merge this into Haiku now for you all to use. However, there are some new rendering glitches that maybe I should fix first. What do you think?

WebKit weekly report #4

Blog post by PulkoMandy on Fri, 2013-10-18 07:07

Time for a report again !

So, over last week-end and Monday morning I finally got Ninja working. I already said some words about it, Ninja is meant to replace Make for building projects. It has fewer features, because it is designed to run files that are generated, rather than hand-written. In my case, CMake generates the Ninja files. I had problems building Ninja that turned out to be a bug in our Python port. I didn't fix it, but I found a way to work around it.

The use for all this buildsystem work? Well, we're now using one of WebKit's standard build systems (CMake), and using it in a way that's fast! CMake can also generate makefiles, but running these would take about a minute just to figure out which files to recompile. With Ninja, this is done in a matter of seconds, because the simpler rules are faster to parse. Ninja was actually written just for that purpose. One minute may not seem much when the full build of WebKit takes about an hour, but when I'm working on just one or two files to track down a bug, the edit/compile/test cycle is now much faster.

So, with this out of the way I was able to more easily try out things and fix the issues as I found them. I mentioned the Opera cookie testsuite in last week's report: most of the tests are now passing, and we even get better results than the old Curl backend in some places. The remaining issues are rather minor ones, but I can fix them if you can find a website where they actually matter.

In one of the blog post comments someone mentioned that Windows Live Mail (or Outlook, as it's now called) didn't allow displaying messages. I tried to do that with HaikuLauncher and the new code and noticed I couldn't even log in to the website. Getting this to work was a bit tricky, leading to fixes in 3 different places: setting cookies from JavaScript code, proper handling of URL redirections, and telling WebKit that "aspx" files served without a mimetype are web pages.

I already told you about the cookies, so here are some more details about the two other issues: the URL redirection problem happened when an HTTP server gave us a 302 answer (that is, a redirection) with a relative URL. The HTTP specification says a URL should be used, most likely starting with "http://". But the URL specification was later replaced with the URI one, which allows URIs without that part, for example "/path/to/page.aspx?param=foo". Our BUrl class wasn't able to understand such things and was desperately looking for the "//" part, leading to an endless loop. I replaced the homegrown parser with a regular expression-based one, using the regex that is kindly provided in the RFC for URIs. To do so, I had to fix our RegExp class in the shared kit (this is a collection of internal classes that are not, yet, part of the official API), as it didn't handle optional capture groups in the regular expression properly. This includes things like: the URI may, or may not, start with "scheme:", then, maybe there will be an authority part, of the form "//authority", then a path, which is anything following the authority and up to the first ? or # or the end of the URI, and then there is the query and fragment, and they are also optional. Regular expressions allow writing this in a more compact and computer-understandable format and getting the parts extracted to separate strings.
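For the curious, here is what this looks like in practice. The sketch below is in Python rather than with our RegExp class, and it assumes the regex in question is the one given in appendix B of RFC 3986. The split_uri helper and the example URLs are only there for the illustration.

```python
import re

# Regular expression from RFC 3986, appendix B. Every part is optional,
# which is exactly the "optional capture groups" case described above.
URI_PATTERN = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri):
    """Split a URI reference into its five components (None when absent)."""
    match = URI_PATTERN.match(uri)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


ABSOLUTE = split_uri("http://login.live.com/path/to/page.aspx?param=foo#top")
RELATIVE = split_uri("/path/to/page.aspx?param=foo")

print(ABSOLUTE)
print(RELATIVE)
```

For the relative form, the scheme and authority groups simply don't match and come back as None, while the path and query are still extracted. A regex engine that mishandles unmatched optional groups will get this case wrong.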

As for the "aspx" file handling, the HTTP protocol says that the server should give a "content-type" header with the MIME type of the file it's going to send. This fits well with the use of MIME types in Haiku and would be very nice. However, Microsoft servers for login.live.com sometimes don't include the header. Web+ used to assume the file was to be displayed on screen in that case. You may already have seen it trying to render a video file as plain text, as a huge page full of strange characters. Well, that issue was fixed, and if in doubt, it will now download the file to disk instead. A list of well-known file extensions is used to attempt a guess when the "content-type" is missing. I added aspx to that list and mapped it to text/html, so, the URLs ending in aspx will now be shown on screen.
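The decision can be summed up in a few lines. This Python sketch is not the actual Web+ or WebKit code: the function name is made up, and only the aspx entry of the table comes from the fix described above (the other entries are just examples).

```python
import posixpath
from urllib.parse import urlparse

# Only the aspx entry comes from the text; the others are illustrative.
EXTENSION_TO_MIME = {
    ".aspx": "text/html",
    ".html": "text/html",
    ".png": "image/png",
}


def resolve_mime_type(content_type, url):
    """Return the MIME type to use, or None when the file should be downloaded."""
    if content_type:
        # Trust the server; strip parameters such as "; charset=utf-8".
        return content_type.split(";")[0].strip().lower()
    # No header: fall back to a guess based on the extension in the URL path.
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return EXTENSION_TO_MIME.get(extension)


print(resolve_mime_type(None, "https://login.live.com/login.aspx?x=1"))  # text/html
print(resolve_mime_type(None, "http://example.com/video.bin"))           # None
```

The important part is the order: the header wins when it is there, the extension table is only a fallback, and an unknown extension means "save to disk" rather than "show it as text".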

I had a run of testing after all these fixes and things are getting much better. It helped fix some more issues in github and gist, and I could also browse some documents in Google Drive. I think we still have some problems with file upload, however, I'll have to look into that. Uploading video to youtube, as someone pointed out in the forums, also fails.

I caught a crash when opening a new window in HaikuLauncher, but I'm not sure if that would be a problem for Web+. Also, I wanted to work on HTTP authentication (the login/password prompt) and this is not available in HaikuLauncher, so a working Web+ will be needed. To get there I wanted to build a package for WebKit. To do this, I first tried writing a haikuporter recipe. I didn't get very good results with this. First, it means working separately from my development git checkout, so it takes twice the space on disk (and we're talking multiple gigabytes) and it takes forever to checkout. Second, there's no easy way to use distcc with haikuporter, so things are keeping my CPU very busy for a long time. And since this isn't my development tree, there are a lot of things to rebuild.

So, I decided to try a different solution: using CMake built-in support for generating packages. I tried that yesterday and found out that it wasn't working. I tried on a very small test project, and with help from the CMake IRC channel I found the issue: our CMake port is lacking some features that are available in the Linux version, and not needed on Windows. This confuses the Ninja generator which isn't ready for that case. This is a rather technical issue related to the way executables and libraries are linked together. This is different in the tests (the lib and executable sit together in a directory) and in the installed version (they are installed somewhere on the filesystem and the OS, more precisely the runtime_loader, must bind them together). On Linux, CMake makes up for that by rewriting part of the executables when installing or packaging them. This isn't enabled in the Haiku version, so it will instead try to link executables again when installing them. However, the Ninja generator is skipping the link step (it was never tested in that case) and tries to install or package non-existing files. This shouldn't be too hard to get working just like on Linux, more news next week!

There is another missing step to make this complete: CPack, the CMake component that generates packages, knows how to make tar archives (with various compression schemes), zip archives, and deb and rpm packages. This is not very useful for us, so I started adding support for hpkg files there. This will make it super-easy to get a package built from any project using cmake: "make package", or "ninja package", depending on the generator you decided to use.

Anyway, I've uploaded an unsupported build in a manually crafted zip file so you can try things for yourself: http://pulkomandy.tk/drop/HaikuWebKit_20131018.zip
I'm waiting for your reports on how well it's working. Note you need a fairly recent build of Haiku (a nightly from today or yesterday should do) as I made some changes to the BHttpRequest class again, to fix some API design issues that led to memory leaks. That class was also added to the Haiku Book.

That's it for this week!

WebKit weekly report #3

Blog post by PulkoMandy on Fri, 2013-10-11 06:53

Hello again, it's time for another report !

I made pretty good progress this week.

The issues I had last week with POST data are fixed. I had removed a non-working piece of code but replaced it with another that was broken in a different way. The problem was the way POST data was added to the HTTP request. Fixing this properly required some changes to the Services Kit API. I removed some classes to make things simpler and introduced a stub for the central BUrlProtocolHandler class, which takes a URL as a parameter and builds a request for it using the appropriate protocol. The BUrlProtocolHttp class was renamed to BHttpRequest, and the API was tweaked to use multiple methods to configure it, instead of a single SetOption(option_name, value) method. This allows setting options with multiple parameters, and is more type-safe.

I started writing some documentation in the Haiku Book, for both the Services Kit and the new Network Kit. BeOS already had a Network Kit, and the same API is available in Haiku, but we also have a newer, more flexible and more powerful API. Unfortunately it is undocumented (and unfinished), so there are not a lot of users for it. I hope the documentation will help change that. I'm far from done, however, with just 3 classes available in the Haiku Book (http://api.haiku-os.org).

I finally uploaded WebKit dependencies to the package manager. So, with a recent nightly build, it will be even easier to get started with building WebKit. As I had them around, I also added packages for vim and Caya.

With all this set, I can log in to gmail/google mail again. This means things are working rather well. I even got the web chat to show my online contacts, something that isn't working in the current versions of Web+. I'm also able to log into github, and I found another set of non-working things there (most of them were already broken in older Web+).

I'm now trying to run the tests from http://testsuites.opera.com/cookies. These will help me find and fix some more bugs with cookie management. I already found that cookies set to expire before 1970 would stay forever, because of a bug in the BDateTime class from Support Kit. There are some other issues detected by this test, and it does not make many requests, which makes it easier to debug than an actual production website.
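The expected behaviour for that pre-1970 case fits in a short Python sketch. It says nothing about what exactly was wrong in BDateTime; it only shows the check a cookie jar needs, with a made-up is_expired helper, and the point that a date before the epoch is a perfectly valid (and long gone) expiry date.

```python
from datetime import datetime, timezone


def is_expired(expires, now=None):
    """A cookie is expired when its expiry date is not in the future."""
    if now is None:
        now = datetime.now(timezone.utc)
    return expires <= now


# Just before the Unix epoch: as a timestamp this is a negative number.
BEFORE_EPOCH = datetime(1969, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
FAR_FUTURE = datetime(2999, 1, 1, tzinfo=timezone.utc)

print(is_expired(BEFORE_EPOCH))  # True: must be dropped, not kept forever
print(is_expired(FAR_FUTURE))    # False
```

Setting an expiry date far in the past is a common way for websites to delete a cookie, so getting this wrong means cookies that can never be removed.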

No preview build this week yet, but I'll try to update the WebKit recipe at haikuports so I can cook a package for you to try.

Oh, I forgot to mention our patches to CMake were upstreamed. Version 2.8.13 will have all of them and should build out of the box on Haiku. Also, augiedoggie provided me with a most package, so I can cross that off the TODO list (which is here: https://gist.github.com/pulkomandy/6685664#file-bnetapi-webkit-bugs-md)

WebKit weekly report #2

Blog post by PulkoMandy on Fri, 2013-10-04 06:26

It's Friday again !

So, in my last blog post I told you I was converting our WebKit build files to CMake. This week I managed to get a working HaikuLauncher (the test browser that comes in the WebKit tree) and surf the web a bit with it.
