Contract weekly report #56 - Media fixes and more!

Blog post by PulkoMandy on Fri, 2014-12-12 09:18

Hello world!

With the fixes done this week, we now have fewer than 2500 open tickets left before R1. We had already crossed this bar last week, but not for long, as new tickets sometimes come in faster than we can close old ones. I think we are now under that bar in a more durable way.

News from Google Code-In

After the very busy first week of Google Code-In, things are getting quieter again as the students move on to more complex and longer tasks. You can see the stats at http://gci.puckipedia.com/org/haiku (the hosting for last week’s leaderboard was a bit problematic; this one is more stable). We currently have more than 200 tasks completed. We continued last year’s idea of having the students write recipes for various BeOS and ported software, and since some returning students are getting bored with that, we also added some pure coding tasks: improving existing apps, fixing bugs in Haiku, or writing new applications. Here are some highlights:

  • Dozens of recipes written or fixed and small bugs fixed in several applications: BurnItNow (CD burning), Celestia (sky map), ArtPaint (drawing), VirtualBeLive (video editing), and much, much more
  • Some new screensavers such as the beautiful Substrate (ported from XScreensaver): https://github.com/atalax/haiku-substrate
  • A TODO list application with synchronization with the Google Tasks API: https://github.com/AdrianArroyoCalle/haiku-todo
  • A port of the xmoto game to Haiku
  • A recipe to build a BeZilla package

There are many more tasks open, and we will see more software and documentation coming from the students as the contest continues. They are also learning how to code for Haiku, and some of them already plan to try going for Google Summer of Code with us in a few years.

And fixes on my side

The students are getting up to speed with Haiku and writing recipes, so after the hectic first week I had some time again to concentrate on fixing more involved bugs. So, let’s have a look at the commit log on my side for this week.

  • hrev48445: Magnify layout fixed for bigger font sizes.
  • hrev48448: Improve Workspaces Zoom() implementation to adjust the aspect ratio from the current window size, instead of resizing the window to a fixed size.
  • hrev48456: Some fixes to the Mouse preferences layout (with most of the investigation and fixes done by Laurent Chea, new patch contributor for Haiku).
  • hrev48457: Moved the VL-Gothic font to a package so it is easier to update. The font was previously shipped directly in the Haiku package. It provides support for the Japanese language and is also our "fallback" font, the one we search for glyphs that are missing from other fonts.
  • hrev48447: Fixed a bug with BGradients where they would draw random colors when the color stops didn't extend over the full gradient. For performance reasons, agg (the library used as a backend for app_server drawing) does not perform any bounds checks when getting colors from a gradient. When the gradient stops do not cover the full gradient range, this would result in semi-random colors being drawn. I fixed this, not by adding bounds checking (that would make drawing slower), but by inserting "first" and "last" color stops into the gradient when it is transferred from the application to the app_server (see the sketch after this list). This way the change has a much smaller performance impact and still results in correct drawing.
  • hrev48464: Fix the layout of DataTranslations to avoid the list view changing size.
  • hrev48465: Use the preferences-set tab color when highlighting tabs for stack and tile.
  • hrev48466: Fix the layout of the FileTypes window where a button would be truncated in some locales and font sizes (again with help from Laurent Chea).
  • hrev48470: Fixed BHttpForm copy constructor so it is possible to use SetPostFields for posting data to an HTTP request. Before this only AdoptPostFields worked. Thanks to Adriàn Arroyo Calle, one of our returning GCI students, for finding and reporting this problem.
  • hrev48477: Changed BColorControl to show the resulting color in the sliders rather than just RGB ramps. Based on an old patch from stpere.
  • hrev48478, hrev48483, hrev48485: Fixed a difference in behavior from BeOS in BStringItem, where we would not allow subclasses to easily change the drawing color for custom BStringItems.
  • hrev48479: Fixed a drawing glitch in BTabView when font hinting is disabled. The non-integer string width led to rounding errors in the drawing code, making parts of the tab shift by one pixel.
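
To illustrate the gradient fix mentioned above, here is a minimal sketch of the idea behind hrev48447, using a hypothetical ColorStop type rather than the real BGradient and AGG classes: if the application's stops do not cover the whole range, the first and last colors are simply duplicated at the ends, so the rasterizer never reads outside the defined stops.

```cpp
// Hypothetical types for illustration only; the real code lives in
// app_server's gradient handling and uses Haiku's BGradient/AGG classes.
#include <vector>

struct ColorStop {
	float offset;              // 0.0 .. 1.0 along the gradient
	unsigned char r, g, b, a;
};

// Ensure the stop list covers the whole [0, 1] range so the rasterizer
// never reads past the defined stops (no per-pixel bounds check needed).
static void
PadGradientStops(std::vector<ColorStop>& stops)
{
	if (stops.empty())
		return;

	if (stops.front().offset > 0.0f) {
		ColorStop first = stops.front();
		first.offset = 0.0f;           // repeat the first color at offset 0
		stops.insert(stops.begin(), first);
	}

	if (stops.back().offset < 1.0f) {
		ColorStop last = stops.back();
		last.offset = 1.0f;            // repeat the last color at offset 1
		stops.push_back(last);
	}
}
```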

Media fixes: linear interpolating resampler

Well, the tasks above are still rather small ones. The meat of the changes this week is in the media and game kits, where a few long-standing issues were fixed.

hrev48458, hrev48459, hrev48460: fix the linear interpolation resampler. I wrote this resampler in 2010 in an attempt to improve the audio quality. The resampler is used to convert the sample rate of any media kit node to the sampling rate used by the sound card. The BeOS and Haiku media kits allow media nodes (anything producing sound) to work at any sample rate they wish. This can be the usual 44100Hz (the sample rate used by audio CDs) or 48000Hz (used by DATs and some other standards), but it can also be any arbitrary sample rate. For example, when writing an emulator you can get much better sound quality if the sample rate matches the one used by the emulated system (in the case of http://ace.cpcscene.net, which I’m porting to Haiku in my free time, a 125kHz sample rate is used, because the sound chip of the Amstrad CPC works at this frequency).
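
As an illustration of that flexibility, here is a hedged sketch (not the actual code from the Ace port) of asking a BSoundPlayer for an arbitrary frame rate; the system mixer then resamples its output to whatever the sound card uses:

```cpp
#include <string.h>

#include <MediaDefs.h>
#include <SoundPlayer.h>

// Called by the Media Kit whenever it needs more audio data.
static void
FillBuffer(void* cookie, void* buffer, size_t size,
	const media_raw_audio_format& format)
{
	// Produce `size` bytes of audio at format.frame_rate here.
	memset(buffer, 0, size);   // silence, as a placeholder
}

int
main()
{
	media_raw_audio_format format = media_raw_audio_format::wildcard;
	format.frame_rate = 125000;   // the emulator's native rate, not the card's
	format.channel_count = 1;
	format.format = media_raw_audio_format::B_AUDIO_SHORT;
	format.byte_order = B_MEDIA_HOST_ENDIAN;
	format.buffer_size = 4096;

	BSoundPlayer player(&format, "emulator sound", FillBuffer);
	player.Start();
	player.SetHasData(true);

	// ... run the emulator; the mixer converts to the card's sample rate ...

	player.Stop();
	return 0;
}
```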

There is a problem with this, however: modern sound cards are not able to handle this at all. What they do is output a single sound stream to the speakers, at a fixed frequency. How do we manage this then? We use the system mixer. This is a special media node which will get the output of all the other nodes, convert them to the format needed by the sound card output, then mix them together into a single output stream.

Before 2010, the sample rate conversion was done in a very simple way. If, for example, your sound card used 48000Hz output and you played a sound file at a rate of 44100Hz, we would convert 44100 source samples into 48000 destination samples by repeating some samples until the right number of samples was reached. Or, to take a simple example, converting a stream of 3 samples [A, B, C] to a new stream of 4 samples would give [A, A, B, C]. The mixer is also able to perform the reverse operation by dropping samples: converting the 4 samples [A, B, C, D] to 3 samples would result in [A, B, D].
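
A minimal sketch of this drop/repeat approach (not the mixer's actual code; the exact index rounding in the real implementation decides which samples end up repeated or dropped):

```cpp
#include <stddef.h>

// Old drop/repeat approach: each destination sample just copies the nearest
// source sample, repeating or dropping samples as needed.
static void
ResampleDropRepeat(const float* src, size_t srcCount,
	float* dest, size_t destCount)
{
	if (srcCount == 0)
		return;

	for (size_t i = 0; i < destCount; i++)
		dest[i] = src[i * srcCount / destCount];   // truncating index map
}
```

Resampling [A, B, C] to 4 samples with this gives [A, A, B, C], as in the example above.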

This method is very simple, and does not use much CPU, but it sounds bad. When the sample rates are very different, things get much worse. Since 192kHz is now a common output rate for sound cards, you can get things like [A, A, A, A, A, A, B, B, B, B, B] generated as the output. Can we do better? Yes, with linear interpolation. The idea is fairly simple. Instead of repeating the same sample, we can instead do a weighted average of the “previous” and “next” sample to compute a more accurate resampling of the sound. For example, resampling 3 samples to 6, we can get [A, B, C] converted to [A, (A+B)/2, B, (B+C)/2, C, (C+D)/2]. And the problems start with this last sample. As you can see, it is an average of the last sample from the source buffer, C, and… the first sample from the next buffer. Which we don’t know yet.

My 2010 implementation of linear interpolation did some weird things to try to avoid this problem, and it didn’t work. It sounded somewhat better than the drop/repeat method, but not as good as I expected. I didn’t find the solution at the time and moved on to other things, calling it “good enough”. But it wasn’t, and while looking at our old tickets this week I was reminded of this, including a sample file that made the problem very obvious. So it was time to look at this piece of code and finish the work.

So, how can we handle this sample that comes from the future? We can’t. But we can shift things around so that we instead need a sample from the past. So, let’s say our vector is [B, C, D] and we want to resample it to 6 samples. The result will be [(A+B)/2, B, (B+C)/2, C, (C+D)/2, D], with A being the last sample from the previous buffer. This works, with a small drawback: it adds a delay of one sample. However, a delay of 1 sample at the sample rates we are working with is not noticeable: it is 1/44100th of a second, or about 0.023ms. Nothing to worry about.
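
Here is a sketch of that idea (hypothetical, simplified code, not the actual mixer implementation, which is templated over sample formats and also applies a gain): the resampler keeps the last sample of each buffer around, so the interpolation only ever needs samples from the past.

```cpp
#include <stddef.h>

// Simplified linear interpolation resampler with a one-sample delay.
struct LinearResampler {
	float previous;   // last sample of the previous buffer (the "past" sample)

	LinearResampler() : previous(0.0f) {}

	void Resample(const float* src, size_t srcCount,
		float* dest, size_t destCount)
	{
		if (srcCount == 0 || destCount == 0)
			return;

		for (size_t i = 0; i < destCount; i++) {
			// Source position, shifted back by one sample (the delay).
			float position = (float)(i + 1) * srcCount / destCount;
			size_t index = (size_t)position;
			float fraction = position - index;

			float left = (index == 0) ? previous : src[index - 1];
			float right = (index >= srcCount) ? src[srcCount - 1] : src[index];
			dest[i] = left + fraction * (right - left);
		}

		previous = src[srcCount - 1];   // carried over to the next call
	}
};
```

With the buffer [B, C, D] and A carried over from the previous call, this produces exactly the [(A+B)/2, B, (B+C)/2, C, (C+D)/2, D] sequence described above, without ever reading past the end of the source buffer.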

Before implementing this, I did two other things. I rewrote the existing resampling code to use a template, making it type-generic. Our “resampler” actually does more than just the resampling: it also converts between different sample types and bit widths, and applies a gain. So you can resample an 8-bit signal (signed or unsigned), a 16-bit signal, a 32-bit signal, or a float signal, and the same goes for the output. Well, in the current implementation either the input or the output must be in floating point format; to perform int-to-int conversions we put two of these resamplers back to back (with one doing just the format conversion). Still, the resampler code was repeated for each format, so we had 9 different versions of the drop/repeat resampler and 9 different versions of the interpolator, with very similar code. I rewrote this to use a template function instead, which means I write the code once and the C++ compiler generates all the variants for me.
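
A rough sketch of what such a template looks like (hypothetical names, not Haiku's actual resampler classes): one conversion helper per sample type, and a single resampling loop that the compiler instantiates for each of them.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical helper: convert one sample of any supported type to a float
// in roughly [-1, 1].
template<typename T> inline float SampleToFloat(T sample);

template<> inline float SampleToFloat<float>(float s)     { return s; }
template<> inline float SampleToFloat<int32_t>(int32_t s) { return s / 2147483648.0f; }
template<> inline float SampleToFloat<int16_t>(int16_t s) { return s / 32768.0f; }
template<> inline float SampleToFloat<int8_t>(int8_t s)   { return s / 128.0f; }
template<> inline float SampleToFloat<uint8_t>(uint8_t s) { return (s - 128) / 128.0f; }

// One drop/repeat loop for every input format, applying a gain and producing
// float output; the compiler generates a variant per sample type.
template<typename T>
void
ResampleToFloat(const T* src, size_t srcCount, float* dest, size_t destCount,
	float gain)
{
	if (srcCount == 0)
		return;

	for (size_t i = 0; i < destCount; i++)
		dest[i] = SampleToFloat(src[i * srcCount / destCount]) * gain;
}
```

ResampleToFloat can then be instantiated as ResampleToFloat<int16_t>, ResampleToFloat<uint8_t>, and so on, all from the same source.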

With only one function to fix, I could now write a small test application to help me check my work. While you can easily hear the difference between a good and a bad resampler with some specifically crafted source files, it is quite hard to tell exactly what is wrong that way. So the MixerToy app instead draws a diagram of the sound wave and its resampled version on screen. It took me about half a day to write the app, but then seeing and fixing the problems took only a few minutes. And the effort put into the app is not lost, as it can be used to write more complex resamplers (better quality, but more CPU hungry: these could be used for offline processing, for example in MediaConverter), and it could also be extended to test arbitrary media nodes (filters, for example).

Media fixes: ffmpeg decoder improvements

I had noticed several tickets about videos not playing properly in MediaPlayer. I downloaded all the videos from the tickets which had samples, and sat down to track down the issues and fix them.

Some videos would leave the window black, or draw garbage on a small part of the window at the top. I tracked this down to missing colorspace conversions. One of the techniques used for video compression is to encode the video in different color spaces. Instead of the usual RGB32 format, several variations of YUV or YCbCr are used. The idea of these formats is to store the luminance (Y, telling whether the pixel is dark or bright) separately from the color information (Cb and Cr, the “blue complement” and “red complement”, telling how the blue and red channels differ from the luminance; the green channel can be computed from this). The reasoning behind this is that the eye has high resolution for luminosity, but not so much for colors. Videos take advantage of this in two ways. First, the information about luminosity and color is stored separately. This makes it compress better, because the relation between adjacent pixels is stronger (whether a pixel is light or dark depends mostly on the light source, while its color depends on the object being captured; separating these almost unrelated pieces of information makes it easier to compress each of them). Second, since the eye is not very sensitive to colors, the color information can be stored at a lower resolution. Video formats have 1 color “pixel” for every 2 to 16 luminance “pixels”. While ffmpeg handles the decompression of the video, it leaves the resulting picture in the native encoding of the video format.
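
For reference, this is roughly what the per-pixel math looks like when converting back to RGB (full-range BT.601 coefficients shown here; the exact coefficients and value ranges depend on the standard, and the real converters work on whole rows rather than single pixels):

```cpp
#include <stdint.h>

static inline uint8_t
Clamp(int value)
{
	if (value < 0) return 0;
	if (value > 255) return 255;
	return (uint8_t)value;
}

// Convert one YCbCr pixel to RGB (full-range BT.601 coefficients).
static void
YCbCrToRGB(uint8_t y, uint8_t cb, uint8_t cr,
	uint8_t& r, uint8_t& g, uint8_t& b)
{
	int luma = y;
	int d = cb - 128;   // "blue complement", centered on 0
	int e = cr - 128;   // "red complement", centered on 0

	r = Clamp((int)(luma + 1.402f * e));
	g = Clamp((int)(luma - 0.344f * d - 0.714f * e));
	b = Clamp((int)(luma + 1.772f * d));
}
```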

When video drivers support it, we can send this data directly to the video card, whose overlay hardware can decode such formats and display them properly (not all formats are always supported, and sometimes we convert from one YCbCr format to another for this). When there is no overlay support in the driver, we must instead convert the picture to the current screen color space so it can be drawn as “standard” pixels. We had implementations of the most common formats, but some were missing. Now we have support for two more: YUV410, used in the first generation QuickTime format (with 1 color for a grid of 4x4 luminance pixels - hrev48467); and YUV420P10LE, used in the H264 “Hi10” profile (1 color for a grid of 2x2 pixels, with components stored on 10 bits instead of 8 - hrev48484).
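
To give an idea of what these layouts mean in practice, here is a small hypothetical sketch (not the actual converter code) of how a planar, subsampled frame is addressed, and how a 10-bit component can be reduced to 8 bits:

```cpp
#include <stdint.h>

// Hypothetical description of a planar frame; the real converters get this
// information from ffmpeg's frame structures.
struct PlanarFrame {
	const uint8_t* y;        // full-resolution luma plane
	const uint8_t* cb;       // subsampled chroma planes
	const uint8_t* cr;
	int yStride, chromaStride;
};

// Fetch the three components for pixel (x, y) of a YUV410 frame: one chroma
// sample covers a 4x4 block of luma pixels.
static void
GetYUV410Pixel(const PlanarFrame& f, int x, int y,
	uint8_t& outY, uint8_t& outCb, uint8_t& outCr)
{
	outY = f.y[y * f.yStride + x];
	outCb = f.cb[(y / 4) * f.chromaStride + x / 4];
	outCr = f.cr[(y / 4) * f.chromaStride + x / 4];
}

// For YUV420P10LE, each component is a little-endian 16-bit word holding a
// 10-bit value; dropping the two least significant bits gives 8-bit data.
static inline uint8_t
Convert10To8(uint16_t component)
{
	return (uint8_t)(component >> 2);
}
```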

Another problem was some videos playing too fast, or even showing a few frames at the start and then stopping rendering altogether. This was tracked down to an incomplete implementation of framerate change support in Colin’s work on DVB-T. Unfortunately I could not fix it, so I have disabled that particular change until a solution is found (hrev48469).

I also commented out an assert that triggered a bit too often (hrev48475). This avoids a crash of MediaPlayer on some videos, but they are still not played back correctly (they show garbage or play out of order for now). The assert was right in pointing at something being done wrong, but corrupted video is better than an application crash.

Return to Monkey Island Audio

Another bug report was about APE files not playing. APE (Monkey’s Audio) is another attempt to replace and improve on MP3. It was somewhat popular in the 90s and raced against Musepack and a few others. In the end, Ogg/Vorbis mostly won that battle.

Still, there are some APE files around (not that many; actually the hardest part of this ticket was locating sample files), and Haiku was not able to play them. It turns out our old version of ffmpeg has an unfinished version of the Monkey’s Audio support, which can only decode the 8-bit variant of the format. Both sample files I could find were in the 16-bit format. I tried backporting the changes from a later ffmpeg version to fix this, but for some reason it didn’t work.

While researching this, however, I noticed that we also had a separate plugin for reading APE files, contributed to Haiku by Japanese developer SHINTA. After some fixes to our nasm build rule to make include directives work again (hrev48472), I could get it to compile and run. It worked great with the command line tools (playfile, playsound), but not at all with MediaPlayer, where it would use a lot of CPU and not decode much. It turns out the media framework MediaPlayer relies on (originally from Clockwerk) supports far more features than we currently use. It can, for example, play videos backwards, which is something the BeOS MediaPlayer could do. Useless, but nice for that extra “wow” in demo shows. To implement this, the framework needs to cache chunks of the video (as the decoding can only be done forwards) and then replay them in the reverse direction. The implementation of this cache sends a lot of “seek” requests to the decoder plugin. When playing forward, these are in-place seeks which don’t actually move anything. But the monkey was not too smart here and re-decoded everything from the last keyframe, which is a bad thing, since APE has very few keyframes, usually one every 7 to 15 seconds.

I modified the seek code to check for seeks within the chunk of audio already in memory; these are now implemented by just setting a pointer to the right position, which is much faster than decoding several seconds of audio. And that’s it: crystal-clear playback of APE files in MediaPlayer with low CPU use. The ape_reader is available in Haiku images starting at hrev48474. Another nice thing about this is that it allows testing media apps (MediaPlayer, MediaConverter) with more than one decoder, which is a good way to make sure we don’t hardwire things too much to the specific behavior of the ffmpeg plugin.
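
The idea behind that fix looks roughly like this (hypothetical names and structure, not the plugin's actual code): a seek that lands inside the block of audio already decoded in memory just moves a read offset, and only seeks outside of it fall back to decoding from the previous keyframe.

```cpp
#include <stddef.h>
#include <SupportDefs.h>

// Hypothetical decoder state; the real plugin keeps similar bookkeeping.
struct DecodedBlock {
	int64 startFrame;     // first frame held in the decoded buffer
	int64 frameCount;     // number of decoded frames in memory
	size_t frameSize;     // bytes per frame
	size_t readPosition;  // current read offset into the buffer
};

// Returns true when the seek could be satisfied without re-decoding.
static bool
SeekWithinBlock(DecodedBlock& block, int64 frame)
{
	if (frame < block.startFrame
			|| frame >= block.startFrame + block.frameCount)
		return false;   // caller must decode from the previous keyframe

	block.readPosition = (size_t)(frame - block.startFrame) * block.frameSize;
	return true;
}
```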

Worms Armageddon

A bit of back story on this one. You probably know the Worms Armageddon game. A little-known fact about it is that when it was released, there were plans for a BeOS port by Wildcard Design (who also ported some other great games to BeOS). Unfortunately, the work on this one was never finished, and all we have is an almost complete “leaked beta”.

The leaked beta doesn’t run on Haiku. It has a bug at startup where it allocates a really big array on the stack, and on Haiku this crashes (it worked by chance on BeOS). The final version of the code probably would not have had this issue, but we have to work with what we have. Fortunately, someone already took the time to investigate this problem, and we could fix it with a patch to the binary that properly allocates the required stack space. This got the game running. Well, almost.

The menu would work great, but when you started to play a game, the sound would be garbage and it would often crash. The workaround for that was to kill the media_server: the game would then run stably, but without sound. Our Worms hacker was not able to track the issue further and stopped his work there. Things sat in that state for a couple of years, but I had kept it on my mental TODO list. So I took the time to dig into that issue.

My first step was to have a look at Cortex and the Media preferences while the game was running. There I noticed that it was using BGameSound for all its sound output purposes. This was also visible in the backtrace for the crashes (remember that I’m working without the source for the game itself here). So I recompiled the game kit in debug mode to be able to debug things further, and also added some tracing to see which code paths were used.

After some investigation I found a first issue: the Play method was getting called after a sound had finished playing, which led to “pure virtual method called” crashes. I added an extra check to intercept this before any damage is done, so the Play method now aborts before it hits more problems. That fix is probably not completely correct and I should get back to it, but it allowed me to debug things further.

Well, not much further: I soon hit another problem where a buffer would be accessed out of bounds. My helpful tracing showed that Worms Armageddon was using mono audio samples, but the Game Kit was assuming stereo (and buffers twice as big). This couldn’t work, so I short-circuited that part of the code and added a simpler code path for mono audio. This is however not completely correct: since the Game Kit supports panning a mono sample to the right or left, we should make it always use a stereo output and “expand” the mono channel to stereo, taking the panning into account. I opened a ticket about this so I don’t forget to get back to it (hrev48486).
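
For what it’s worth, the “proper” fix described above could look something like this (a hypothetical sketch, not committed code): expand the mono buffer to stereo while applying the pan setting, so the rest of the Game Kit only ever deals with stereo buffers.

```cpp
#include <stddef.h>

// Expand a mono buffer to interleaved stereo, applying a pan setting.
// pan is in [-1, 1]: -1 = full left, 0 = center, +1 = full right.
static void
MonoToStereoWithPan(const float* mono, float* stereo, size_t frames,
	float pan)
{
	float leftGain = (1.0f - pan) * 0.5f;
	float rightGain = (1.0f + pan) * 0.5f;

	for (size_t i = 0; i < frames; i++) {
		stereo[2 * i] = mono[i] * leftGain;
		stereo[2 * i + 1] = mono[i] * rightGain;
	}
}
```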

So, this fixed the crashes. But it didn’t fix the sound coming out as garbage. I tried various things, simplified some code to make it more readable so I could see what was going on, and finally I found it. Worms Armageddon didn’t set an endianness for the samples it was playing. Media formats are described by a frame rate, a number of channels, a sample type, and an endianness. The endianness can be either B_MEDIA_LITTLE_ENDIAN (1) or B_MEDIA_BIG_ENDIAN (2). And Worms was sending us… 0. In Media Kit conventions, this would be a wildcard, meaning “do whatever is most suitable, I’ll figure it out on my side”. At first glance that doesn’t make much sense for the endianness, since the data is in a fixed endianness, so this case was never checked for, and the code defaulted to big endian. I added an extra check for this case to convert the 0 to B_MEDIA_HOST_ENDIAN, which in the case of x86 machines means little endian. I think that’s as close to a wildcard for this as it gets. Worms Armageddon is happy with that, and we now get some working in-game sounds!
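
The check itself is tiny; something along these lines (a sketch, not the exact committed code):

```cpp
#include <MediaDefs.h>

// Treat a byte_order of 0 (which the Media Kit would normally read as a
// wildcard) as meaning "host endian".
static void
FixupByteOrder(media_raw_audio_format& format)
{
	if (format.byte_order == 0)
		format.byte_order = B_MEDIA_HOST_ENDIAN;
}
```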

Weekend hacks

These are tasks done during my weekend, and not billed to Haiku, Inc.

  • hrev48451: Fixed the arm-none-eabi toolchain for proper multilib support. While writing software for a Cortex-M4 and trying to use the FPU, I found that the toolchain didn't work right for that. Now it's fixed, and ARM embedded software development is possible on Haiku, which is one more case where I don't need to run Linux or Windows anymore.