December 4, 2013

Testers wanted: Aura in Chromium Dev channel (33.x)

If you're using hard masked www-client/chromium dev channel packages (currently at version 33.x) you're probably used to testing things and encountering breakages from time to time.

If you feel bored or adventurous, or both, or would just like to try the new Aura UI of Chromium (which will become default at some point), I encourage you to enable USE="aura" for www-client/chromium. Upstream still keeps it disabled by default, and that includes Google Chrome. In Gentoo it's easy to make a choice about this.

Aura is a new UI architecture which is GPU accelerated. You can read about the technical details in its documentation. Especially because of the use of hardware acceleration, it'd be useful to get many people to test it. If these changes break in your configuration, you can find out now (while it's still early in the development process), report a bug (consider posting the bug link in the comments so I can ensure it gets proper attention), and get back to the working configuration.

A possible workaround for GPU problems is the --disable-gpu command-line flag. Note that this still means there is a bug that is unlikely to get fixed if you don't report it. It's also useful to include the contents of the about:gpu page in your bug report.

Finally, see the screenshots below for an idea of what it looks like:

[Screenshots: with aura, without aura]

You can start compiling now, chromium-33.0.1726.0 has just been added to the tree.

November 26, 2013

Third party libraries in Chromium sources

One of the most common obstacles to including Chromium in Linux distro repositories is bundled libraries. The last attempt to blog about it that I know of is Evan Martin's "Forking upstream software" post. I decided to take another look.

It is important to know that even if something appears in a third_party directory in the Chromium codebase, it is not necessarily a bundled library. Third party code - yes, but not necessarily a bundled library. What's the difference? Well, even Fedora in its excellent No Bundled Libraries article lists e.g. copylibs as a possible exception. What about code that was never intended to be used as a shared library, is part of a larger codebase, but is still useful? This will come up in some examples below.

Here is my list of third_party code still present in Gentoo's Chromium packages as of version 33.0.1711.3 (dev channel). This means that the libraries that have been successfully unbundled are not included. Similarly, code that is not used on Linux is not included. This takes into account the intended audience - mostly Linux users, some of them packagers. A star (*) means that the code was already there in 2009.
  1. base/third_party/dmg_fp (*) - David M. Gay's floating point routines (dtoa, g_fmt). I don't think there is a shared library for these. There is also crbug.com/95729 about using V8's routines.
  2. base/third_party/dynamic_annotations - single .c file with corresponding header containing annotations for dynamic tools like valgrind, tsan. Doesn't seem to be worth extracting, which could likely add unwanted dependencies on these tools.
  3. base/third_party/icu (*) - this could be extracted to use system icu; supporting nacl may be a challenge there (base is most likely also compiled by the nacl toolchain, and it's not obvious to me how shared libraries would work there - if at all, or whether it would make sense).
  4. base/third_party/nspr (*) - it may be possible to remove it now that Gentoo dropped nacl support for other reasons (crbug.com/269560).
  5. base/third_party/symbolize - part of another Google project, google-glog. Technically should be possible to extract, and glog even made a release with a tarball.
  6. base/third_party/valgrind (*) - bundled to avoid depending on valgrind just for build... IMHO fine.
  7. base/third_party/xdg_mime (*) - looks like the code was not intended to be used as a library, but maybe the intention was to avoid forking a process. Probably worth a closer look.
  8. base/third_party/xdg_user_dirs (*) - see this comment in the source code:
    /*
      This file is not licenced under the GPL like the rest of the code.
      Its is under the MIT license, to encourage reuse by cut-and-paste.
    
      Copyright (c) 2007 Red Hat, inc  ...*/
  9. breakpad/src/third_party/curl - great candidate for unbundling (or just disabling breakpad for Chromium builds in a way that doesn't try to touch curl even when disabled).
  10. chrome/third_party/mozilla_security_manager - parts of Mozilla code; doesn't seem to be designed as a shared library; has local modifications.
  11. crypto/third_party/nss - selected files extracted from NSS; there are some modifications, but with enough effort it may be possible to unbundle.
  12. net/third_party/mozilla_security_manager - parts of Mozilla code, different from the chrome bits above.
  13. net/third_party/nss (*) - parts of Mozilla's NSS (libssl) with experimental patches. Note that NSS developer is working there, so this can be seen as even more bleeding-edge than NSS trunk.
  14. third_party/WebKit (*) - now Blink, developed as part of the same Chromium project, but a fork of third party code. Not designed to be used as a shared library.
  15. third_party/angle_dx11 - developed by Google/Chromium developers; doesn't seem to be designed to be used as a shared library, but with enough effort it should be possible.
  16. third_party/cacheinvalidation - same as above.
  17. third_party/cld (*) - developed by Google/Chromium developers, probably to be replaced with cld2, which will hopefully be closer to a shared library design.
  18. third_party/cros_system_api - related to ChromeOS, not really bundled but rather just part of the project.
  19. third_party/ffmpeg (*) - Chrome uses very recent ffmpeg; I think the local modifications status has improved greatly since 2009: looks like patches make it upstream pretty quickly.
  20. third_party/flot - JS library, AFAIK there isn't really a concept of having system JS libraries. It could actually be useful to have one, but it's not obvious.
  21. third_party/hunspell (*) - modified to support running under sandbox and loading dictionaries in a different format; maintainers do respond but are very busy. This is doable but requires a fair amount of effort to figure out what to do with the different dictionary format.
  22. third_party/iccjpeg - taken out of lcms library, and the maintainers don't want to expose it.
  23. third_party/jstemplate (*) - Google's JS templating library.
  24. third_party/khronos - GL headers, unfortunately with local modifications.
  25. third_party/leveldatabase - needs a redesign to allow applying Chromium-specific behavior (env_chromium.cc) at run-time instead of at compile time. I've seen a Debian package for leveldb, looks like there is some interest in using it as a library.
  26. third_party/libjingle (*) - used to have semi-inactive upstream, now seems to become a part of WebRTC. When things stabilize more, worth another look.
  27. third_party/libphonenumber - upstream seems to be more focused on Java version of it, which actually has releases; C++ version doesn't seem to be designed to be used as a shared library.
  28. third_party/libsrtp - used to be inactive but now has a new home at https://github.com/cisco/libsrtp and there are Googlers helping out with it. Worth taking another look when things stabilize. Note that even if it compiles it doesn't mean it works, see bug #459932.
  29. third_party/libusb - locally made incompatible change needs to be upstreamed (crbug.com/266149).
  30. third_party/libvpx - waiting for upstream release supporting vp9, see bug #487926.
  31. third_party/libwebp - waiting for upstream release supporting APIs Chromium depends on, see http://crbug.com/288019.
  32. third_party/libxml/chromium - this is ugly: code is actually part of Chromium codebase; at least it's not really bundled.
  33. third_party/libXNVCtrl - part of nvidia-settings. Not sure if it's intended to be used as a shared library, but it seems totally possible technically, and I even remember some success reports with it.
  34. third_party/libyuv - Google/Chromium project. Should be possible to use as a shared library, but doesn't seem to make releases.
  35. third_party/lss - Linux Syscall Support; a header based on Linux kernel headers.
  36. third_party/lzma_sdk (*) - lzma library from 7-zip.org; it would be great to replace it with xz-utils which distros package.
  37. third_party/mesa - I think only headers are used, but it's complicated.
  38. third_party/modp_b64 (*) - README.chromium points to https://code.google.com/p/stringencoders/. Doesn't seem to be designed to be used as a shared library, but it seems possible.
  39. third_party/mt19937ar - not designed as a shared library, rather small; can be removed after move to C++11 (looks like <random> would support needed functionality).
  40. third_party/npapi (*) - NPAPI headers with modifications.
  41. third_party/ots (*) - OpenType sanitizer, may be possible to package as a shared library, although it doesn't seem to have releases.
  42. third_party/polymer - JS library by Google, see polymer-project.org.
  43. third_party/pywebsocket (*) - Python WebSocket server used for testing. Should be possible to package it separately.
  44. third_party/qcms - color management library. Last upstream commits seem to be over a year ago, but the bundled copy continued to receive various updates, at least for more recent toolchain support.
  45. third_party/sfntly - font-related library; doesn't seem to have releases, doesn't seem to be designed to be a shared library.
  46. third_party/skia (*) - graphics library, changes very often.
  47. third_party/smhasher - hash function library - doesn't seem to have releases or be designed to be a shared library.
  48. third_party/sqlite (*) - available as a package, the biggest obstacle is lack of a good API to use it in a multi-process sandboxed context and also test it. See http://crbug.com/22208. That obstacle would disappear however when Chromium drops support for abandoned webdatabase spec.
  49. third_party/tcmalloc (*) - although theoretically available separately, the Chromium copy is heavily modified, and that includes hardening changes important for security.
  50. third_party/tlslite (*) - Python crypto library, only used for testing but appears to be modified in a non-compatible way.
  51. third_party/trace-viewer - not obvious what it really is, and it contains several more bundled libraries inside.
  52. third_party/undoview - code extracted from gtksourceview.
  53. third_party/usrsctp - user-space SCTP implementation with local changes.
  54. third_party/webdriver - mostly some minified JS embedded in C++ code.
  55. third_party/webrtc - Real-Time Communications library - doesn't seem to have releases, and seems to be moving pretty fast.
  56. third_party/widevine - stubs for proprietary content distribution module.
  57. third_party/x86inc - asm code extracted from x264 with local modifications; I don't really see a good way to provide that as a system package.
  58. third_party/zlib/google - this is ugly: code is actually part of Chromium codebase; at least it's not really bundled.
  59. url/third_party/mozilla - parts of Mozilla code; doesn't seem to be designed as a shared library; has local modifications.
  60. v8 (*) - although the path doesn't contain third_party, I consider it bundled code. See "When the libraries you use are moving too fast" for the reasons it's there. While technically not part of Evan's 2009 list, it was obviously there since the beginning.
60 entries look like a lot. I would like that number to be smaller. On the other hand, note that many of these codebases were not designed to be used as shared libraries, some were developed as part of Chromium project, and that the project is very careful to put code it borrows from outside in third_party directories, whereas it's not uncommon for open source projects in general to incorporate such code directly into their codebases. In Chromium it's just much more visible.

Also note that while 23 of these items have been there since 2009, for some other entries from 2009 we're now using system libraries, at least in Gentoo. Just to give you a few examples (the list is not necessarily complete - a star means it's present on the 2009 list):
  1. flac
  2. harfbuzz (*)
  3. icu (*)
  4. jsoncpp
  5. libevent (*)
  6. libjpeg (*)
  7. libpng (*)
  8. libxml (*)
  9. libxslt (*)
  10. minizip
  11. nspr (*)
  12. openssl
  13. opus
  14. protobuf (*)
  15. re2
  16. snappy
  17. speex
  18. xdg-utils (*)
  19. yasm (*)
  20. zlib (*)
I'm interested in your opinions, so feel free to add your comment below. If you liked this post, you may also like State of Chromium Open Source packages.

November 18, 2013

When the libraries you use are moving too fast

Recently I masked dev-lang/v8 on Gentoo with the following message:

# Pawel Hajdan jr (13 Nov 2013)
# Masked for removal in 30 days. Does not have stable
# API resulting in compile breakages in reverse
# dependencies. Combined with short release cycle
# (6 weeks) this makes it pretty much unusable as
# a shared library. See bug #417879, bug #420995,
# bug #471582, bug #477300, bug #484786, bug #490214.
# Also, the following discussions: 
# - http://thread.gmane.org/gmane.linux.gentoo.devel/88222
# - http://thread.gmane.org/gmane.linux.gentoo.devel/88811
dev-lang/v8

All packages depending on shared v8 library are now either bundling it or are masked.

This is obviously quite a controversial change. People are opposed to bundling libraries for good reasons. I'd like to make it clear that I'm also strongly in favor of using system libraries when possible. I'm also pragmatic though: in the case of v8 this resulted in multiple bugs experienced by users - just see the links above. With the API of v8 changing every 6 weeks and security fixes being pushed every now and then, these other packages depending on v8 just don't keep up.

Now the v8 team has made some nice improvements, as you can see on https://code.google.com/p/v8/wiki/Source:
V8 public API (basically the files under include/ directory) may change over time. New types/methods may be added without breaking existing functionality. When we decide that want to drop some existing class/methods, we first mark it with V8_DEPRECATED macro which will cause compile time warnings when the deprecated methods are called by the embedder. We keep deprecated method for one branch and then remove it. E.g. if v8::CpuProfiler::FindCpuProfile was plain non deprecated in 3.17 branch, marked as V8_DEPRECATED in 3.18, it may well be removed in 3.19 branch.
Indeed I see V8_DEPRECATED being used in new v8 changes instead of removing APIs immediately. I heard sometimes they need to remove things immediately anyway, when some APIs are inherently buggy and lead to memory errors like leaks or double-frees.

Ideally I'd like to see a longer grace period for removal of APIs (not just one release, since it's only 6 weeks). Then maybe ABI stability in the scope of one release could become a consideration. The consistent usage of V8_DEPRECATED will for sure lead to more data about how much more effort is now needed to maintain the V8 API. If it turns out to be manageable, I hope to see these further improvements being tried.

By the way, with ffmpeg/libav we're pretty much in a very similar situation: Chromium uses bleeding-edge ffmpeg code, and other packages just can't keep up with API updates. Sometimes the APIs Chromium depends on are not part of any ffmpeg/libav release: Chromium developers actively contribute to upstream ffmpeg codebase and it's reasonable to iterate quickly instead of waiting for a release. I think even Fedora has exceptions for bundling libraries in such circumstances.

I don't expect this post to appease most people from the packaging community. This is just stating where we currently are. Maybe having some more stable layer that could be added on top of the Chromium codebase in a manner similar to Ozone-Wayland would be one way. Still, that'd be a considerable engineering effort, mostly to keep old things working rather than developing the future. I think it has some value; the main question is just what to do with the limited time developers have.

I'm experimentally enabling comments for this post - feel free to share your thoughts.

April 15, 2013

Best articles about Blink rendering engine according to me

It is now over a week since the announcement of Blink, a rendering engine for the Chromium project.

I hope it could be useful to provide links to the best articles about it, which have good, technical contents.

"Thoughts on Blink" from HTML5 Test is a good summary of the history of Chrome and WebKit, and puts this recent announcement in context. For even more context (nothing about Blink) you can read Paul Irish's excellent "WebKit for Developers" post.

Peter-Paul Koch (probably best known for quirksmode.org) has good articles about Blink: Blink and Blinkbait.

I also found it interesting to read Krzysztof Kowalczyk's Thoughts on Blink.

Highly recommended Google+ posts by Chromium developers:
If you're interested in the technical details or want to participate in the discussions, why not follow blink-dev, the mailing list of the project?

February 9, 2013

Bumpy upgrades: udev-171 -> udev-197, iptables-1.4.13 -> iptables-1.4.16.3

I guess many people may hit similar problems, so here is my experience of the upgrades. Generally it was pretty smooth, but required paying attention to the details and some documentation/forums lookups.

udev-171 -> udev-197 upgrade

  1. Make sure you have CONFIG_DEVTMPFS=y in kernel .config, otherwise the system becomes unbootable for sure (I think the error message during boot mentions that config option, which is good).
  2. The ebuild also asks for CONFIG_BLK_DEV_BSG=y, not sure if that's strictly needed but I'm including it here for completeness.
  3. Things work fine for me without DEVTMPFS_MOUNT. I haven't tried with it enabled, I guess it's optional.
  4. I do not have a split /usr. YMMV then if you do.
  5. Make sure to run "rc-update del udev-postmount".
  6. Expect network device names to change (I guess this is a non-issue for systems with a single network card). This can really mess up things in quite surprising ways. It seems /etc/udev/rules.d/70-persistent-net.rules no longer works (bug #453494). Note that the "new way" to do the same thing (http://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames) is disabled by default in Gentoo (see /etc/udev/rules.d/80-net-name-slot.rules). For now I've adjusted my firewall and other configs, but I think I'll need to figure out the new persistent net naming system.
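
The kernel options from steps 1 and 2 can be checked before rebooting. Here is a minimal sketch in Python; the helper name missing_kernel_options and the sample config are made up for illustration, and it only recognizes options built in with =y.

```python
import re

# Options the list above says to enable before upgrading udev.
REQUIRED = ("CONFIG_DEVTMPFS", "CONFIG_BLK_DEV_BSG")


def missing_kernel_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_FOO is not set" and do not match.
        match = re.match(r"^(CONFIG_\w+)=y$", line.strip())
        if match:
            enabled.add(match.group(1))
    return [opt for opt in required if opt not in enabled]


# Hypothetical excerpt of a kernel .config, for illustration only.
sample = """\
CONFIG_DEVTMPFS=y
# CONFIG_DEVTMPFS_MOUNT is not set
# CONFIG_BLK_DEV_BSG is not set
"""

print(missing_kernel_options(sample))  # ['CONFIG_BLK_DEV_BSG']
```

To check a real kernel, pass in the contents of your .config (often /usr/src/linux/.config on Gentoo, but that path is an assumption about your setup).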

iptables-1.4.13 -> iptables-1.4.16.3

* Loading iptables state and starting firewall ...
WARNING: The state match is obsolete. Use conntrack instead.
iptables-restore v1.4.16.3: state: option "--state" must be specified

It can be really non-obvious what to do with this one. Change your rules from e.g. "-m state --state RELATED" to "-m conntrack --ctstate RELATED". See http://forums.gentoo.org/viewtopic-t-940302.html for more info.
Also note that iptables-restore doesn't really provide good error messages, e.g. "iptables-restore: line 48 failed". I didn't find a way to make it say what exactly was wrong (the line in question was just a COMMIT line, so it didn't identify the real offending line). These mysterious errors are usually caused by missing kernel support for some firewall features/targets.
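
If you have many rules, the "-m state --state" to "-m conntrack --ctstate" change can be done mechanically. This is a rough sketch in Python, not a complete migration tool: the function name convert_state_rules is made up, and it only handles the plain form shown above (negated matches such as "! --state" are left untouched and need manual review).

```python
import re

# Matches "-m state --state <value>" with any whitespace between the tokens.
_STATE_MATCH = re.compile(r"-m\s+state\s+--state\s+(\S+)")


def convert_state_rules(rules_text):
    """Rewrite plain state-match rules to the conntrack equivalent."""
    return _STATE_MATCH.sub(r"-m conntrack --ctstate \1", rules_text)


old_rule = "-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT"
print(convert_state_rules(old_rule))
# -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
```

Review the result before loading it with iptables-restore, since the sketch knows nothing about the rest of your ruleset.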

two upgrades together

Actually what adds to the confusion is having these two upgrades done simultaneously. This makes it harder to identify which upgrade is responsible for which breakage. For an even smoother ride, I'd recommend upgrading iptables first, making sure the updated rules work, and then proceeding with udev.

January 28, 2013

State of Chromium Open Source packages

Let me present an informal and unofficial state of Chromium Open Source packages as I see it. Note a possible bias: I'm a Chromium developer (and this post represents my views, not the project's), and a Gentoo Linux developer (and Chromium package maintenance lead - this is a team effort, and the entire team deserves credit, especially for keeping stable and beta ebuilds up to date).

  1. Gentoo Linux - ships stable, beta and dev channels. Security updates are promptly pushed to stable. NaCl (NativeClient) is enabled, although pNaCl (Portable NaCl) is disabled. Up to 23 use_system_... gyp switches are enabled (depending on USE flags).
  2. Arch Linux - ships stable channel, promptly reacts to security updates. NaCl is enabled, following Gentoo closely - I consider that good, and I'm glad people find that code useful. :) 5 use_system_... gyp switches are enabled. A notable thing is that the PKGBUILD is one of the shortest and simplest among Chromium packages - this seems to follow from The Arch Way. There is also chromium-dev on AUR - it is more heavily based on the Gentoo package, and tracks the upstream dev channel. Uses 19 use_system_... gyp switches.
  3. FreeBSD / OpenBSD - ship stable channel, and are doing pretty well, especially when taking amount of BSD-specific patching into account. NaCl is disabled.
  4. ALT Linux - ships stable channel. NaCl seems to be disabled by default, I'm not sure what's actually shipped in compiled package. Uses 11 use_system_... gyp switches.
  5. Debian - ancient 6.x version in Squeeze, 22.x in sid at the time of this writing. This is two major milestones behind, and is missing security updates. Not recommended at this moment. :( If you are on Debian, my advice is to use Google Chrome, since official debs should work, and monitor state of the open source Chromium package. You can always return to it when it gets updated.
  6. Fedora - not in official repositories, but Tom "spot" Callaway has an unofficial repo. Note: currently the version in that repo is 23.x, one major version behind on stable. Tom wrote an article in 2009 called Chromium: Why it isn't in Fedora yet as a proper package, so there is definitely an interest to get it packaged for Fedora, which I appreciate. Many of the issues he wrote about are now fixed, and I hope to work on getting the remaining ones fixed. Please stay tuned!
This is not intended to be an exhaustive list. I'm aware of openSUSE packages, there seems to be something happening for Ubuntu, and I've heard of Slackware, Pardus, PCLinuxOS and CentOS packaging. I do not follow these closely enough though to provide a meaningful "review".

Some conclusions: different distros package Chromium differently. Pay attention to the packaging lag: with an upstream release cycle of about 6 weeks and each major update being a security one, this matters. Support for NativeClient is another point. There are extensions and Web Store apps that use it, and as more and more sites start to use it, this will become increasingly important. It is also interesting that on some distros some bundled libraries are used even though upstream provides an option to use a system library that is known to work on other distros.

Finally, I like how different maintainers look at each other's packages, and how patches and bugs are frequently being sent upstream.

January 4, 2013

Signal handler safety, re-entering malloc

This is a story from real-world development. From signal(7):


   Async-signal-safe functions
       A  signal  handler  function must be very careful,
       since processing elsewhere may be interrupted at some
       arbitrary point in the execution of the program.
       POSIX has the concept of "safe function".  If a signal
       interrupts the execution of an  unsafe  function,
       and handler calls an unsafe function, then the behavior
       of the program is undefined.

After that a list of safe functions follows, and one notable thing is that malloc and free are async-signal-unsafe!

I hit this issue while enabling tcmalloc's debugallocation for Chromium Debug builds. We have a StackDumpSignalHandler for tests, which prints a stack trace on various crashing signals for easier debugging. It's very useful, and worked fine for a pretty long while (which means that "but it works!" is not a valid argument for doing unsafe things).

Now when I enabled debugallocation, I noticed hangs triggered by the stack trace display. In one example, this stack trace:

@0  0x00000000019c6c85 in tcmalloc::Abort () at third_party/tcmalloc/chromium/src/base/abort.cc:15
@1  0x00000000019b39c1 in LogPrintf (severity=-4,
    pat=0x32aeb18 "memory allocation/deallocation mismatch at %p: allocated with %s being deallocated with %s", ap=0x7fff52c379e8)
    at third_party/tcmalloc/chromium/src/base/logging.h:210
@2  0x00000000019b3a8b in RAW_LOG (lvl=-4,
    pat=0x32aeb18 "memory allocation/deallocation mismatch at %p: allocated with %s being deallocated with %s")
    at third_party/tcmalloc/chromium/src/base/logging.h:230
@3  0x00000000019c3fb1 in MallocBlock::CheckLocked (this=0x7fd18f143400, type=-21308287)
    at ./third_party/tcmalloc/chromium/src/debugallocation.cc:461
@4  0x00000000019c3c42 in MallocBlock::CheckAndClear (this=0x7fd18f143400, type=-21308287)
    at ./third_party/tcmalloc/chromium/src/debugallocation.cc:401
@5  0x00000000019c436a in MallocBlock::Deallocate (this=0x7fd18f143400, type=-21308287)
    at ./third_party/tcmalloc/chromium/src/debugallocation.cc:557
@6  0x00000000019c1929 in DebugDeallocate (ptr=0x7fd18f143420, type=-21308287)
    at ./third_party/tcmalloc/chromium/src/debugallocation.cc:998
@7  0x00000000028d1482 in tc_delete (p=0x7fd18f143420) at ./third_party/tcmalloc/chromium/src/debugallocation.cc:1232
@8  0x000000000097dc04 in cc::ResourceProvider::deleteResourceInternal (this=0x7fd191827da0, it=...) at cc/resource_provider.cc:242
@9  0x000000000097daaf in cc::ResourceProvider::deleteResource (this=0x7fd191827da0, id=1) at cc/resource_provider.cc:230
@10 0x00000000006f9824 in (anonymous namespace)::ResourceProviderTest_Basic_Test::TestBody (this=0x7fd18dc5abf0)
    at cc/resource_provider_unittest.cc:328
@11 0x00000000008ec801 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x7fd18dc5abf0,
    method=&virtual testing::Test::TestBody(), location=0x29463ab "the test body") at testing/gtest/src/gtest.cc:2071
@12 0x00000000008e9665 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x7fd18dc5abf0,
    method=&virtual testing::Test::TestBody(), location=0x29463ab "the test body") at testing/gtest/src/gtest.cc:2123
@13 0x00000000008dee0d in testing::Test::Run (this=0x7fd18dc5abf0) at testing/gtest/src/gtest.cc:2143
@14 0x00000000008df3ea in testing::TestInfo::Run (this=0x7fd191823020) at testing/gtest/src/gtest.cc:2319
@15 0x00000000008df8dc in testing::TestCase::Run (this=0x7fd19181f0d0) at testing/gtest/src/gtest.cc:2426
@16 0x00000000008e3eea in testing::internal::UnitTestImpl::RunAllTests (this=0x7fd19829dd60) at testing/gtest/src/gtest.cc:4249

generates SIGSEGV (tcmalloc::Abort). This is just debugallocation having stricter checks about usage of dynamically allocated memory. Now the StackDumpSignalHandler kicks in, and internally calls malloc. But we're already inside malloc code, as you can see in the above stack trace (see frame @7, tc_delete), and re-entering it tries to take locks that are already held, resulting in a hang.
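
The hang can be modelled in a few lines. The sketch below is only an analogy in Python, not the tcmalloc code: alloc_lock stands in for the allocator's internal lock, handler for the signal handler that calls malloc, and the timeout exists only so the demo terminates instead of hanging like the real thing.

```python
import threading

# Stand-in for the allocator's internal, non-reentrant lock (an assumption
# for illustration; tcmalloc's real locking is more involved).
alloc_lock = threading.Lock()


def handler():
    """Plays the signal handler that tries to allocate memory."""
    # The real handler would block forever here. The timeout lets the demo
    # report the failure instead of hanging.
    return alloc_lock.acquire(timeout=0.1)


def allocate(interrupt):
    """Plays malloc/free: the 'signal' arrives while the lock is held."""
    with alloc_lock:
        return interrupt()


print(allocate(handler))  # False: the lock was never acquired
```

The handler runs on the same thread that already holds the lock, so nobody can ever release it; that is why the only real fix is to keep the handler away from malloc entirely.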

The fix required several changes:
  • no dynamic memory, and that includes std::string and std::vector, which use it internally
  • no buffered stdio or iostreams, they are not async-signal-safe (that includes fflush)
  • custom code for number-to-string conversion that doesn't need dynamically allocated memory (snprintf is not on the list of safe functions as of POSIX.1-2008; it seems to work on a glibc-2.15-based system, but as said before this is not a good assumption to make); in this code I've named it itoa_r, and it supports both base-10 and base-16 conversions, and also negative numbers for base-10
  • warming up backtrace(3): now this is really tricky, and backtrace(3) itself is not whitelisted for being safe; in fact, on the very first call it does some memory allocations; for now I've just added a call to backtrace() from a context that is safe and happens before the signal handler may be executed; implementing backtrace(3) in a known-safe way would be another fun thing to do
Note that for the above, I've also added a unit test that triggers the deadlock scenario. This will hopefully catch cases where calling backtrace(3) leads to trouble.
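
To give an idea of what the number-to-string conversion involves, here is a sketch of the digit loop. It is written in Python only to show the algorithm; the actual itoa_r is C++ in Chromium, its signature may differ from this one, and Python itself gives no async-signal-safety guarantees. The point is that the caller supplies a fixed-size buffer and nothing is allocated for the result.

```python
DIGITS = b"0123456789abcdef"


def itoa_r(value, buf, base=10):
    """Write value as ASCII into the preallocated bytearray buf.

    Returns the number of bytes written, or 0 if the result does not fit or
    the arguments are unsupported (only base 10 and 16, negatives only in
    base 10, as described in the list above).
    """
    if base not in (10, 16) or (value < 0 and base != 10):
        return 0
    n = 0
    if value < 0:
        if len(buf) < 1:
            return 0
        buf[0] = ord("-")
        n = 1
        # In C this negation needs care for the most negative value;
        # Python integers do not overflow.
        value = -value
    start = n
    while True:
        if n >= len(buf):
            return 0  # out of space
        buf[n] = DIGITS[value % base]  # least significant digit first
        n += 1
        value //= base
        if value == 0:
            break
    # Digits were produced in reverse order; swap them in place.
    i, j = start, n - 1
    while i < j:
        buf[i], buf[j] = buf[j], buf[i]
        i += 1
        j -= 1
    return n


buf = bytearray(16)
n = itoa_r(-1234, buf)
print(bytes(buf[:n]))  # b'-1234'
```

Returning 0 on overflow instead of raising keeps the function usable from a context where error handling must not allocate either.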

For more info, feel free to read the articles below: