Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

Blog post by mmlr on Sun, 2009-02-01 02:38

Out of no real particular motivation I wanted to build a native GCC4 for Haiku. We've had a GCC 4.1.2 cross-compiler for a pretty long time now, but since there were some issues with GCC4 built Haiku installations and especially since there never was a native toolchain for GCC4 based Haiku, it has always been a second class citizen. You could experiment around with it and we've had hybrid builds able to use software for both GCC2 and GCC4 Haiku on the same install, but since you had to use the cross-compiler to build GCC4 Haiku apps it's always been a bit less convenient than just building for GCC2 Haiku. But there's a lot of software around that simply can't be built using GCC2 anymore, because it relies on coding conventions or simply features that weren't available in GCC2. So a native GCC4, meaning a GCC4 running inside Haiku and building GCC4 Haiku apps, is a really important thing to have for porting and building many future applications.

Since I briefly looked into updating the GCC4 cross-compiler in summer 2008 already (without getting very far because of lack of time), I already knew pretty much what to get and where to get it. I also already built a diff between the vendor and trunk version of the 4.1.2 cross-compiler in the buildtools repository. So I had a rough idea of what I needed to touch to get things going.

I was about to build a native GCC4. This means that I was going to build it on Haiku itself. Of course you could actually cross-compile a native compiler in a two step process on another platform as well. But since I use Haiku as my main and only operating system here, this was no real option. So this meant that a GCC2 Haiku with a GCC2 compiler would be the host for all the fun. Considering that GCC is a huge project, consisting of many subprojects and thousands of source files, this would be a pretty tough stress test for Haiku. I wasn't even sure if the GCC 2.95.3 we are using was up to the task, but it turned out that this didn't pose any real problem.

One of the less convenient things about the GCC 4.3 series is that they require two external libraries: GMP and MPFR. Both are multiple precision math libraries. I won't pretend to have any idea on what they do or what they're used for, so I just accepted the fact that those were two dependencies to get first. Luckily both libraries are pretty much self contained and easy to port. The only thing I needed to do was to update the config.guess and config.sub scripts to a version that knew about Haiku as a target platform. The rest was the usual "configure; make; make check; make install". I configured them with just "--prefix=/boot/common" and they built and installed fine.

Encouraged by that smooth process I went on to GCC. The first questions come up right away when configuring. GCC has lots of configurable parameters and I wasn't really sure what to put there. Luckily I could just peek at how the cross-compiler scripts in the repository configured GCC 4.1.2. So I came up with "CFLAGS="-O2" CXXFLAGS="-O2" configure --prefix=/boot/develop/tools/gcc-4.3.2-haiku-090121 --target=i586-pc-haiku --disable-nls --disable-shared --enable-languages=c,c++" as my configure line. Yes, that's a configure invocation directly in the GCC source dir. Not really having any experience with this type of build system, and of course only reading the (very good) configuration instructions after the fact, I did the configure directly in the root source dir. The funny thing is that this little error later on would uncover a bug in libiberty that is now reported in the GCC bug tracker with a patch attached.

Since we are on platform "i586-pc-haiku" and are compiling a GCC for target "i586-pc-haiku" this means we're doing a native build. This also means that bootstrapping will be enabled resulting in a three step process: First a pseudo GCC is built that is a subset of a complete GCC just capable enough to actually build the full GCC. Through that process you don't actually compile GCC4 directly with your host compiler. This means that our GCC 2.95.3 as host, and more exotic compilers on other platforms, just have to be able to build a working subset of GCC. This lessens the dependency on the host compiler and the exposure to any potential issues in it.

The second so called "stage" is then to build a full GCC4 using the compiler built in stage 1. When creating a cross-compiler the process would end here. But since we're bootstrapping, there is a stage 3. Stage 3 essentially rebuilds the whole GCC4 again, this time using the compiler built in stage 2. This is to verify that the compiler that was built is actually able to build working programs. And this also explains why the process can't have a stage 3 for cross-compilers - if the compiler created in stage 2 is a cross-compiler creating binaries for other platforms you obviously can't use it to build a stage 3 compiler running on the host again. When the stage 3 compiler is built, the build system will verify that both compilers are valid. It does that by simply comparing the object files created by both stages. As both stage 2 and stage 3 compiler are built essentially by the same code, the resulting object files should be identical. If this verification passes, optional parts of the GCC4 package like the libstdc++ are built and the process is done - that's our goal.
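The comparison step can be pictured with a short sketch. It is written in Python purely for illustration and is not how the GCC build system implements the check; the two-directory layout and the function name `differing_objects` are assumptions made for this example.

```python
import filecmp
from pathlib import Path


def differing_objects(stage2_dir, stage3_dir):
    """Return relative paths of .o files that differ between two stage trees.

    A file also counts as differing when it exists in only one of the trees.
    """
    stage2 = Path(stage2_dir)
    stage3 = Path(stage3_dir)

    # Collect the object files of both trees, keyed by their relative path.
    names = {p.relative_to(stage2) for p in stage2.rglob("*.o")}
    names |= {p.relative_to(stage3) for p in stage3.rglob("*.o")}

    mismatches = []
    for name in sorted(names):
        a, b = stage2 / name, stage3 / name
        # shallow=False forces a byte-for-byte comparison of the contents.
        if not (a.is_file() and b.is_file() and filecmp.cmp(a, b, shallow=False)):
            mismatches.append(name.as_posix())
    return mismatches
```

An empty result corresponds to the verification passing; any entry in the list would mean the stage 2 and stage 3 compilers did not generate the same code for that file.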

I started out with a plain unzipped tree of GCC 4.3.2 (yep .2) sources. So before going anywhere I would need to actually port that GCC of course. Using the diff I had produced of GCC 4.1.2 this was pretty straightforward though. Some things have moved, but you could easily track them down.

Now I could start the whole process and it would actually get pretty far already. Some minor tweaks here and there and it would build the stage 1 compiler. This was the easy part, because now things started to go strange. Since the GCC built in stage 1 uses the config that you set up, you will notice misconfiguration when building the stage 2 compiler with it. And it seemed pretty misconfigured. First, things were found that weren't actually there and I started patching and changing ifdefs in some of the files. This was about when I really needed to go to bed and leave it for that day. Thinking about it more and more while in bed and in the morning I came to the conclusion that this can't be it, that this would never create a proper patch and that there must be a root cause why things were failing this way. So instead of any further patching I started looking at the config.log to find out why configure would detect stuff that wasn't there. Now there are basically two things that are commonly done: 1. checking if a certain header/function/variable/define is present and 2. checking whether or not a certain symbol (function entry point in a library) is available. The checks that were looking for headers and prototypes did work. Configure does the checks by spitting out small source files and trying to compile them. So if you are looking for a header that isn't there, compilation of the test program will just fail. The checks for available symbols however did not fail, the test programs simply compiled and linked.

That was when it became pretty obvious. Knowing from prior experience that there was the option to allow or disallow missing symbols I figured that the configuration I set up would simply allow missing symbols. This meant that the test programs would compile and link, but wouldn't be able to run, because the stuff they refer to simply isn't in any library the system provides. This does however go unnoticed, because configure doesn't try to run the binaries, it simply checks if they compile and link. I then took a look at the specs and the theory turned out to be true, there was no -no-undefined present. BeOS did partially allow undefined symbols, that is they allowed undefined symbols in libraries but not in executables. This was done because they were linking their libraries with symbols not available at compile time. They were later runtime patched to be available. Some might remember the libroot.so.patch which patched the memset/memcpy to an optimized version depending on the CPU found. So -no-undefined was conditionally added for executables in the BeOS specs. Since we don't really want undefined symbols for Haiku, I've just added -no-undefined to the specs unconditionally. Should this later pose a problem, this can easily be changed in the sources or overridden in a custom specs file.
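Why the permissive linker settings fooled configure can be shown with a toy model. The sketch below is Python and purely illustrative: it does not invoke a real compiler or linker, and the names `link_succeeds` and `configure_detects`, as well as the example symbol set, are made up for this illustration.

```python
def link_succeeds(referenced, provided, allow_undefined):
    """Model the linker: with undefined symbols allowed, linking never fails."""
    missing = set(referenced) - set(provided)
    return allow_undefined or not missing


def configure_detects(symbol, provided, allow_undefined):
    """Model a configure symbol check.

    The symbol counts as found if a test program referencing it links.
    The test program is never run, so a missing symbol goes unnoticed
    whenever the linker tolerates undefined references.
    """
    return link_succeeds([symbol], provided, allow_undefined)


if __name__ == "__main__":
    system_symbols = {"memcpy", "strlen"}  # hypothetical set of exported symbols
    # Permissive linking: a function that does not exist is still "detected".
    print(configure_detects("no_such_function", system_symbols, allow_undefined=True))
    # Strict linking, as with -no-undefined: the check fails as it should.
    print(configure_detects("no_such_function", system_symbols, allow_undefined=False))
```

In this model the header checks are unaffected, which matches what the config.log showed: only the checks that depend on the link step gave false positives.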

So with the updated builtin-specs I restarted the process, rebuilding much of the pseudo-gcc as well as starting the stage 2 compilation again. Now the configure looked much better, it only found what was there and I could remove all the hacks I've put in the different files. But of course this wasn't the only problem. Now it started with undefined stuff like "NAME_MAX" and "PATH_MAX". Anyone more or less familiar with the matter should now be able to tell: your limits.h is missing or messed up. And it was missing of course. I checked the diff and saw that there was an overridden variable for the test to find limits.h. For some reason I don't clearly remember I removed that line and replaced it with NATIVE_SYSTEM_HEADER_DIR instead. I think it was because the GCC build system tried to find the system headers in the default "/usr/include", which is of course not correct for Haiku. Anyway I got it working by moving some stuff around. Since I did all of that mostly during the timeframe I should have slept instead you have to bear with the fact that I wasn't always fully awake and don't remember all the details. After all they are pretty boring anyway.

So after the basic config was corrected I actually got the stage 2 compiler built. And since the stage 3 is essentially the same as stage 2, this one ran through equally well. This meant I finally had a native GCC 4.3.2 for Haiku. We're done, start the party! Ehm not quite of course. Having GCC 4.3.2 was of course a big step, but the real goal was to have a GCC4 Haiku running the GCC4 compiler natively. And I was still on GCC2 Haiku and since this wasn't even a hybrid build, the new GCC4 couldn't even compile anything runnable that used C++.

Instead of a party the goal now changed to get Haiku compiled with that new GCC. Since, as mentioned, we're on a non-hybrid GCC2 Haiku, the compiler can't be just dropped in as a replacement for GCC2. You still have to use GCC2 for compiling the host tools required by Haiku. This means setting up the Haiku build using GCC2 as the host compiler and just using the native GCC4 as if it was a cross-compiler. Since the Haiku build system supports such a setup, this was not very hard to do. But of course this is the latest point you could possibly notice that you forgot something. I didn't forget it, but mostly didn't care so far. Having GCC is by far not everything you need to build binaries. The only thing GCC does is produce assembler output for the target architecture. It does that through the GCC frontend and its language backends and a lot of optimization tricks and maybe a bit of black magic, but in the end it only produces assembler output. The rest of the process, meaning creating object files from that, archiving them into a static library or linking them into a binary is done by the binutils. The binutils are run as a separate project and provide the ld, as, ar, nm, objdump, ranlib and strip. It's the binutils that actually understand the ELF format for example. The good thing about the binutils being separate is that you can update them individually. Since I knew that we had more or less up to date binutils already (2.17) I was just lazy and instead of building the current version (2.19) I just copied the old ones into place so they could be used with the GCC4 I had set up.

So off you go with the Haiku build. But I didn't get very far. Of course Haiku was buildable with GCC 4.1.2, but we are updating two minor versions here. Many things have changed and especially many things have become more strict than in previous versions of GCC. So this meant that I wouldn't only have to update GCC and eventually the binutils, but that I would have to fix all the issues to make the Haiku tree compilable with the GCC 4.3 series as well. Luckily I found a porting guide explaining most of the issues I ran into and offering solutions. You can see the process in Haiku r28981 through r28995.

Success, I now got a bootable GCC 4.3.2 built Haiku. I've set it up to produce a hybrid build and then I was shutting down to go to the "other side" booting into the GCC4 Haiku. Just to add some emphasis to this, I did that whole process up to now in two sessions. I used Haiku to build a monster project, did update Haiku sources and checked them in while always reading webmail in Firefox, doing research to find solutions to the upcoming problems, chatting on #haiku and listening to music. All under Haiku, a still pre-alpha operating system. We're talking heavy workloads and uptimes of up to 20 hours here after all.

Surprisingly booting into that GCC4 installation worked out of the box. There have always been some stability issues with GCC4 built Haiku installations, so I didn't expect too much stability. But it turned out that besides three unexpected sudden reboots, things would go smoothly. Again we're talking uptimes of up to 20 hours. In principle I wouldn't have had to rebuild GCC4 again, because this thing is pure C code. So even though it was compiled on and for a GCC2 host, it wouldn't matter at all. Call me paranoid, but I still wanted to rebuild GCC4 on GCC4 just to make extra sure that we really have a fully native toolchain in the end. So I created a fresh unzipped tree and applied the patch I made for my final GCC 4.3.2 sources. I expected to just start the process again and end up with a proper fully native GCC4. After all this should be pretty much exactly the situation of the stage 2 in the previous build, so why should it fail?

But of course it failed miserably. It failed even before compiling anything, because it claimed that the C compiler wasn't able to build binaries. Huh? OK, going through config.log it seemed as if the compiler toolchain wouldn't work. The C compiler backend "cc1" wasn't found. I tried to reproduce that test case, but wasn't able to. It worked perfectly fine, with exactly that file. So, running a pre-alpha OS I started suspecting a problem with Haiku. Of course it was strange that this wouldn't turn up when compiling Haiku with exactly that compiler. So I was digging into the sources and added debug output here and there. Chatting with Rene Gollent we took apart some of the stuff that was going on. GCC claimed that it tried to execlv (I think) "cc1" but it didn't work. So we were looking through the whole path such a call takes. We didn't spot anything obvious. After a bit more testing it turned out that exactly this test case would only fail when run inside the root source dir of GCC. Going through it systematically I copied over the files from that dir into a testbed and checked if it would still compile. It did. So I went on creating subdirectories analogous to the ones in the source tree. And there it was: as soon as there was a "gcc" directory, things would fail. Easy, the binary is called "gcc" and there is a subdir called "gcc", it must be that Haiku tries to execute "gcc" and thinks that directory is executable, which it is of course because the executable bit on directories means "traverse ok" and not "execute ok". But looking at the sources it was exactly this case that was handled! So the bug simply didn't seem to be in Haiku.

Then you have to work from the other side. I looked up what was actually called on the GCC side. And it really was "cc1" and exec. And it really couldn't find it, because it obviously wasn't in the source dir. After a bit of searching through the sources and trying to make sense of the things I saw I came to the conclusion that the prefix that was computed must simply be wrong. And tracking the prefix creation down I arrived at a pretty familiar block of code. The code looked almost exactly like the code in Haiku's exec implementation. That's because it resolved the binary path using the PATH environment variable. But the difference was that exactly the directory case, where X_OK is set but it's not a regular file at all, wasn't handled. So this was a bug in libiberty which caused it to fail to resolve the path to its own gcc binary, making it unable to resolve the relative path to the "cc1" binary as well. Funnily I was only able to find that by doing the configure from the directory where you shouldn't do it at all. So I patched that, registered at the GCC bug tracker and submitted a bug with a diff. You can see it here. It turns out another bug was reported that exhibited the same issue, but they hadn't tracked it down to that problem yet.
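The difference between the two lookups can be sketched in a few lines. This is a Python illustration of the idea only, not the libiberty code or the actual patch; the function name `find_in_path` and the `require_regular_file` switch are assumptions introduced for this example.

```python
import os


def find_in_path(name, path_dirs, require_regular_file=True):
    """Return the first executable match for name in path_dirs, or None.

    With require_regular_file=False a directory called `name` can match,
    because the X bit on a directory means "may be traversed".
    """
    for directory in path_dirs:
        candidate = os.path.join(directory, name)
        if not os.access(candidate, os.X_OK):
            continue  # missing or not executable: try the next directory
        if require_regular_file and not os.path.isfile(candidate):
            continue  # X_OK is set, but this is not a regular file: skip it
        return candidate
    return None
```

If a searched directory contains a subdirectory called "gcc" and comes before the directory holding the real binary, the lookup without the regular-file check returns the subdirectory. Any path computed relative to that result, such as the one for "cc1", then points to the wrong place.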

Having this issue resolved, I was able to rebuild the whole GCC4 and build a fully native one. By that time the GCC folks released GCC 4.3.3, so I made a new diff, dumped my GCC 4.3.2 tree and started over again. Since Haiku hasn't really been optimized yet, we're talking build times of 3-4 hours here. Anyway I got it working straight from that patch again. This time I also wanted to get the current binutils, so I made a diff of the 2.17 binutils we had in the buildtools trunk and applied that (by hand, I like doing that because I can then more clearly see what's going on and if something nearby may need attention as well or if something can actually be left out). Binutils were friendly and built pretty much right away. So that really was a success. A native GCC 4.3.3 plus current binutils 2.19. I packaged it up, uploaded it and added it as an optional package to Haiku.

The rest of the story was updating binutils and GCC in the Haiku buildtools repository through my totally crappy internet connection, a hundred attempts to fix configuration issues for the cross-compiler setup and finding issues that other platforms exhibited when building those tools. You can find that in the commits and in the bug tracker, so I'll refrain from prolonging this story.

Now we have a native, up to date GCC and the current binutils for Haiku. This should open up the door for a lot of exciting stuff. Ports of software that weren't feasible before because of using GCC2 and optimizations in Haiku that weren't available back then. Have fun!

Comments

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

Wonderful achievement my friend! I love to see progress in small (mind the term) projects such as haiku. I've been watching the haiku project for some time now and hopefully this event will spark many more significant developments.

cheers,

JP (the_ringmaster)

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

I have also created a front-page news article for this great news:

http://www.haiku-os.org/news/2009-01-31/haiku_finally_gets_a_native_gcc4...

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

Great to hear! I can't wait to test it out.

Developers are holding off porting their applications to Haiku because it still uses gcc 2.95.3.

The switch to gcc 4 is required.

In the past I supported Axel saying sticking to gcc 2.95.3 was right but with more thought and time it becomes apparent that Haiku will have to switch to gcc 4 soon OR at very least make it easy for developers to create and use gcc 4 development environment.

You have Firefox 3, Webkit, Boost Library ( Gnash ), etc. which all require gcc 4. Many other programs are also calling for newer gcc too. If Haiku was using gcc 3 then things would be fine and could get by easy but gcc 2.95.3 has neared its end.

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

What scares me is that you make it sound easy :)

Great achievement.

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

This is just a really awesome achievement, Michael! I am happily using a GCC4 Haiku since a couple days, and I am very happy that the full tool chain is there. I ported some of my software and am developing for Haiku on this installation. I have not had any trouble yet. It's great fun! Thanks a lot for doing this hard work!

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

Great work Michael!
Did you find any bugs with GCC 4.3.3 vs 4.1.2? IIRC there were some issues with tracker and radeon driver in 4.1.2.

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

Any thoughts about LLVM? From what I understand it will have to be ported in order for the next versions of mesa/opengl to build with gallium3d which might possibly improve software opengl speed a tad

also would compiling haiku with LLVM be possible in theory? perhaps for an embedded systems version?

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

LLVM is not hard at all to port to Haiku ... it almost builds out of the box, the only problem is a name conflict with PI ... from math.h, and they use PI as a variable name inside LLVM everywhere :P I've got a working copy on my Haiku box, though I have not run the tests on it; it's running LDC (LLVM D Compiler) just fine with it also :) But no luck with llvm-gcc

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

I have downloaded the buildtools repository from SVN, which contains the new gcc and the new binutils.
Am I right in understanding that these have been patched and configured to a point that they will build under haiku without modification? I have been trying over and over with different ./configure options with no luck at all. I was initially getting errors relating to ld not being able to find crtbegin.o during a certain part of the build. After a few hours of googling I overcame that error by specifically setting the path to the compiler using "CC=/home/develop/tools/gcc-2.95.3-haiku-081024/bin/gcc" on the configure line.
I am now getting the error "undefined reference to __popcountsi2" while building bitmap.o in the gcc/bitmap.c file. I have googled this one to death and can't find a solution. I initially thought it had something to do with the GMP and MPFR libraries that mmlr mentions in his blog, but upon further investigation I think it is unrelated.

I have also tried compiling the new binutils, but I am getting the error "noreturn function does return." in the bfd/bfd.c file. From what I could determine from some extensive google searching on this topic, it seems that there is a function (exit()?) which is being linked in from haiku which is not meant to return but does.

Anyway, any help with my problems would be greatly appreciated, this has been driving me nuts. I am really keen to do a native gcc4 build of haiku.

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

I've been building revisions of Haiku in Ubuntu 8.10 for awhile now, using GCC 2.95. Does anyone have (or could write) a step-by-step "How To" on building a GCC4 Haiku from within a GCC2-built Haiku (and then, afterwards, a GCC4 Haiku from within a GCC4 Haiku)? I'd like to give it a whirl on my Acer Aspire X1200.

Since I can download all the tools, configure/build them, download the Haiku source tree and JAM Haiku *all* in under 1 hour in Ubuntu 8.10 (the Acer Aspire has an Athlon64 X2 5000+, 4Gb RAM, SATA 3.0Gbit HD, DVD-RW, etc. It's a smokin' lil' computer!), I'm hoping I can get 'er done in two hours, at most, from within Haiku, itself.

Always looking forward to a new Haiku adventure!

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

It's not necessary to first go through a GCC2 Haiku to get a GCC4 Haiku. It's a bit of a waste of time indeed. The way to go is to just directly cross-compile a GCC4 Haiku and add at least the Development optional package. With that you will have the native GCC4 available and can build a GCC4 Haiku within the GCC4 install.

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

Ok, I did that. The GCC4 tools built and GCC4 Haiku was JAM'd. But... it gave me an offset of 0! If I install that (using "jam -q"), it won't boot. JAM used to create the proper offset, but now it just puts it at 0. Any idea why? How can I make JAM create the proper offset, like it used to, even just two days ago?

Why doesn't it create the proper offset anymore?

Re: Native GCC 4.3.3 for Haiku - Tales of updating the GCC4 port

when first compile gcc4.3 from gcc2 with

../gcc/configure --prefix=/boot/common
make

get

/buildtools/gccbuild/intl/../../gcc/intl/relocatable.c:148: undefined reference to `libiconv_set_relocation_prefix'
collect2: ld returned 1 exit status
make[2]: *** [makedepend] Error 1
make[2]: Leaving directory `/