Haiku Activity & Contract Report, June 2023
As is the usual way of things, the monthly Activity Report is hereby combined with my Contract Report.
This report covers hrev57061 through hrev57126.
“user mutexes” refactor
The biggest changes last month were a series of commits by waddlesplash, all related to the
user_mutex API and the consumers of it. This API is the kernel portion of the implementation of basically anything related to mutexes or locks in userland, including
pthread_barrier, unnamed semaphores (via
sem_open), rwlocks, and more. It bears some resemblance in concept to Linux’s
futex API, but is very different in both design and implementation.
The commit series started out with some basic refactors, cleanups, and bug-fixes of the existing implementation, some of which had only been noticed recently. For instance, the implementation dealt exclusively with physical memory addresses, but used
addr_t (which is 32 bits on 32-bit systems) and not
phys_addr_t (which is 64 bits on 32-bit systems with PAE.) That was corrected first, along with some cleanup and consolidation of atomic and userland memory access routines.
Next came a major overhaul of the
pthread_barrier implementation. The existing one was known to be so broken as to be basically unusable; its usage was disabled entirely in Mesa and other software ports, and these problems were encountered while writing other tests and benchmarks for
mutex_lock routines. waddlesplash altered an existing
pthread_barrier test-case to reproduce the problems (deadlocks/hangs), and then completely rewrote the implementation, fixing all the known issues with it and improving performance at the same time.
After that, waddlesplash added more assertions and sanity checks to certain mutex locking routines, including one designed to guard against double-locks in
pthread_mutex_lock. This one was, under certain high-contention scenarios, actually possible to trigger: jessicah encountered the problem it guards against in GHC; without such an assertion, that problem took a long while to track down properly. (This was the bug that triggered all this work in the first place.)
Understanding how that assertion could possibly triggered required a lot of thought and careful reading of both userland and kernel source code. The fix, in hrev57076 has a commit message almost longer than the change itself, and changes both kernel and userland behavior (including a syscall rename) to prevent the problem from occurring.
The next series of changes focused on cleanup and fixing a few more (potential) problems throughout the code: a race in the “switch lock” syscall (which could not have caused any problems previously, however, as the API’s one consumer,
pthread_cond, would not use it in such a way as to trigger the problem), a race leading to a deadlock in the “unblock all threads” features (which was a regression from some of the prior changes), an incorrect usage of flags in
pthread_cond’s wait routine, and finally code to handle another corner case in
pthread_barrier (trying to destroy the barrier while not all threads have returned from the “critical section”.)
Following this, waddlesplash performed major overhauls of the kernel implementation of the APIs. First, he modified the code (which was already using
ConditionVariables) to take more advantage of their features, allowing for the deletion of a lot of manual queuing/dequeuing logic, and reducing lock contention on the single system-wide “user mutex entries” table. He next granularized locking: the global lock for the “user mutex entries” table was turned into a read-write lock, and each “entry” (corresponding to one mutex) also gained its own read-write lock, allowing the time each thread spent holding the global lock to be greatly reduced.
(This last set of changes mostly served to pave the way for the following ones; while in some artificial benchmarks they improved performance greatly, in other real-world benchmarks they decreased performance. However, as will be seen shortly, all the lost performance was regained and then lots more by the end.)
The next step was to break up the global “user mutex entries” table, and create ones for each team (process). (The global one must still be used for mutexes in shared memory, but those are very rarely used, and in practice the per-team ones are used almost always.) After the refactor, alterations were made to use virtual addresses instead of physical ones for all per-team user mutexes, which means that memory pages no longer have to be locked in RAM before waiting on them. This change almost doubled GLTeapot’s FPS in a contention-heavy virtual machine (~320 before to ~600 FPS after; before beginning on any of these changes, the best it could do was ~340).
Finally, waddlesplash added a
user_mutex KDL command which allows for easy inspection of user mutex objects that threads are waiting on (in addition to some other KDL changes that, while helpful here, are not user-mutex-specific; those are detailed below.)
So far there have not been any more regressions reported following these changes, though there are some bugs in applications (or in Haiku’s
libroot) that the assertions uncovered which have yet to be fixed. Some users have reported that menu navigation seems “snappier” now, to them, following these changes (though this is, admittedly, very unscientific.)
zdykstra fixed a minor oversight in Terminal that could sometimes prevent the (new) “Theme” window from appearing at all.
KENZ fixed a corner-case in DWARF file parsing in Debugger encountered while trying to debug some objects produced by Mozilla build systems.
Zardshard (a GSoC ‘23 student) implemented “references images” in Icon-O-Matic. This allows arbitrary (bitmap) images to be displayed within Icon-O-Matic, useful for artists who are working from existing image when drawing icons.
humdinger made ShowImage properly prevent the screensaver from triggering while in full-screen mode.
twang12-cpu implemented sound alerts for “low battery” and “finished charging” in PowerStatus.
humdinger fixed the layout of the “Set status” window in Mail, and added localization.
Command line tools
trungnt2910 added support for the
sockaddr structure type to
strace, as well as the arguments passed to the
load_image syscalls and flags for the
user_mutex syscalls. He also added some utility functions to
strace’s internals for reading values more efficiently (and in at least one case, more correctly), and used this to clean up the code.
humdinger added an “info” command to
pkgman, which will display the “Summary” and “Description” of any package (including ones not installed.)
korli fixed a regression in
ifconfig that prevented standard commands (
down, etc.) from working.
Freaxed contributed a fix to
BAboutWindow, resolving a problem involving invalid string pointers.
waddlesplash added missing getters and setters for
waddlesplash fixed a double-lock kernel assertion panic in the
usb_disk driver (a regression from the previous month’s refactors.)
PulkoMandy added support for reporting EDID monitor data under EFI framebuffers.
X512 fixed a typo in the XHCI (USB3) driver specifying an incorrect size for a root hub descriptor.
phcoder fixed PS/2 keyboard probing on Google Chromebooks, which do not report expected results for the “reset” command.
korli (after an initial attempt by waddlesplash) adjusted the
emuxki audio driver to work with userland buffers properly, fixing some KDLs.
phcoder fixed crashes in the
i2c_hid driver (not yet added to builds by default) on bogus or invalid HID devices.
waddlesplash imported some fixes from OpenBSD to the
libroot & kernel
trungnt2910 fixed a double-lock kernel assertion panic in the VM subsystem that could be triggered upon certain
mmap invocations when there were existing mappings.
trungnt2910 fixed the behavior of
pthread_setschedparam to work more in line with
sched_get_priority_min, fixing a problem where threads could be unwittingly promoted into “real-time” priorities.
waddlesplash modified the
threads KDL command to print the first 4 characters of condition variable names, instead of just “cvar”, when threads are blocked on condition variables. This allows condition-variable types to be easily distinguished when reading through a long thread list: e.g. port read/write are now shown as
cvar:port, while a user-mutex wait would be
trungnt2910 fixed cutting areas (which happens when
mmap‘ing over existing areas) when
mprotect has been called on them. Previously the protections set by
mprotect were not properly handled.
realloc_etc, to complement the existing
malloc_etc, and made use of it in the new page-protections code.
waddlesplash de-duplified and cleaned up some code in the kernel
rw_lock implementation, and then completely rewrote the implementation of the
RW_LOCK_DEBUG mode. Previously, this mode just turned
rw_locks into another sort of
recursive_locks, which allowed the
ASSERT_READ_LOCKED macro to work, but also meant that it would not differentiate between a read lock and a write lock, and also that concurrent readers and writers would behave very differently. The new implementation preserves the reader-writer distinction and instead tracks what threads have a read lock in thread-local storage, which is much less efficient but allows for more checks to be done. (This debugging option is disabled even in nightly builds, custom builds have to be made with it enabled in order to use it.)
waddlesplash fixed a “missing lock” assertion in the kernel VFS mount routine. (This problem was discovered by the new
rw_lock debugging mode.)
nielx made a variety of changes to the build system, but mostly lots of different small areas in the source tree, to get Haiku (at least partially) buildable under GCC 13, removing unused includes, fixing new
-Werrors, and lots more.
waddlesplash removed some last remnants of the
zip tool, which was deleted from the tree years ago (we still use
unzip during builds, however.)
Zardshard added a UML class diagram for Debugger’s internals, so that developers can become acquainted with its architecture more rapidly.
nielx fixed some errors and warnings given by Doxygen during generation of the Haiku Book.
ARM & RISC-V
davidkaroly fixed handling of the “Accessed” flag and implemented the “Modified” flag for the ARM MMU. He also enabled “TEX remap”, also as part of memory management initialization. He then removed some unused and unneeded code, and added support for some more UART FDT options in the bootloader.
kallisti5 adjusted some test scripts for booting RISC-V images in QEMU to use emulated hardware with more compatibility.
X512 and kallisti5 updated the packages set used for RISC-V builds, added more build profiles, and added Wi-Fi drivers to the build.
X512 exposed FDT data to userland, which can be useful for debugging purposes or examining what decisions the kernel and drivers made during startup.
That’s all, folks!
Thanks again to all who contribute to Haiku, and especially those donors who make my contract possible!
- Haiku Activity & Contract Report, August 2023
- Haiku Activity & Contract Report, July 2023
- Haiku Activity & Contract Report, June 2023
- Haiku Activity & Contract Report, May 2023
- Haiku Activity & Contract Report, April 2023
- Haiku's (Kernel) Condition Variables API: Design & Implementation
- Haiku Activity & Contract Report, March 2023
- Haiku Activity & Contract Report, February 2023
- Haiku Activity & Contract Report, January 2023
- Haiku Activity & Contract Report, December 2022